In [None]:
import torch
import torchvision
import torchvision.transforms as transforms
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable
import torch.backends.cudnn as cudnn
from torch.optim.lr_scheduler import MultiStepLR
from torchvision.utils import make_grid
from torchvision import datasets, transforms
import cv2


import math
import numpy as np
import csv
from PIL import Image
import matplotlib.pyplot as plt
import pdb
import argparse
from tqdm import tqdm
import os


In [None]:
# note: This notebook has been developed and tested for pytorch 
print(torch.__version__)

# Cutout data augmentation

In this notebook, we will reproduce the results of the paper

> DeVries, T. and Taylor, G.W., 2017. Improved regularization of convolutional neural networks with Cutout. arXiv preprint [arXiv:1708.04552](https://arxiv.org/abs/1708.04552).

We will use the authors’ implementation of their technique, from <https://github.com/uoguelph-mlrg/Cutout>, which is licensed under an Educational Community License version 2.0.

## 1. Learning outcomes

After working through this notebook, you should be able to:

-   Describe how Cutout works as a regularization technique,
-   Enumerate specific claims (quantitative claims, qualitative claims, and claims about the underlying mechanism behind a result) from the Cutout paper,
-   Execute experiments (following the given procedure) to try to validate each claim about Cutout data augmentation,
-   Evaluate whether your own result matches quantitative claims in the Cutout paper (i.e. whether it is within the confidence intervals for each reported numeric result),
-   Evaluate whether your own result validates qualitative claims in the Cutout paper,
-   Evaluate whether your own results support the authors’ claim about the underlying mechanism behind the result.

In the sections that follow, we will identify and evaluate claims from the original Cutout paper:

1.  Cutout improves the robustness and overall performance of convolutional neural networks.
2.  Cutout can be used in conjunction with existing forms of data augmentation and other regularizers to further improve model performance.
3.  Cutout aims to remove maximally activated features in order to encourage the network to consider less prominent features.

## 2. Cutout as a regularization technique

This Jupyter notebook is designed to illustrate the implementation and usage of the Cutout data augmentation technique in deep learning, specifically in the context of Convolutional Neural Networks (CNNs).

Cutout is a regularization and data augmentation technique for convolutional neural networks (CNNs). It involves randomly masking out square regions of input during training. This helps to improve the robustness and overall performance of CNNs by encouraging the network to better utilize the full context of the image, rather than relying on the presence of a small set of specific visual features.

Cutout is computationally efficient as it can be applied during data loading in parallel with the main training task. It can be used in conjunction with existing forms of data augmentation and other regularizers to further improve model performance.

The technique has been evaluated with state-of-the-art architectures on popular image recognition datasets such as CIFAR-10, CIFAR-100, and SVHN, often achieving state-of-the-art or near state-of-the-art results.

### Implementation of Cutout

In [None]:
# Import necessary libraries
from torchvision.transforms import RandomHorizontalFlip, RandomCrop, ColorJitter
import numpy as np
import torch
from PIL import Image
from torchvision import transforms
import matplotlib.pyplot as plt

In the following cells, we will see how Cutout works when applied to a sample image.

<!-- To do: explain the code with reference to section 3.2. Implementation Details -->

The code cell below defines a Python class named `Cutout`, which applies the Cutout data augmentation technique to an image. Here is an explanation of the class and its methods:

-   The Cutout class is initialized with two parameters:

    -   `n_holes`: the number of patches to cut out of each image.
    -   `length`: the length (in pixels) of each square patch.

-   The `__call__` method implements the Cutout technique. This method takes as input a tensor `img` representing an image, and returns the same image with `n_holes` number of patches of dimension `length` x `length` cut out of it.

Here’s a step-by-step explanation of what’s happening inside the `__call__` method:

1.  The method first retrieves the height `h` and width `w` of the input image.

2.  A mask is then initialized as a 2D NumPy array of ones with the same height and width as the input image.

3.  The method then enters a loop which runs for `n_holes` iterations. In each iteration:

    -   A pair of coordinates `y` and `x` is randomly selected within the height and width of the image. This point is the center of the patch.

    -   The method then calculates the corner coordinates of a square patch centered on `(y, x)` with a side of `length` pixels. `np.clip` limits these coordinates to the image bounds, so a patch whose center lies near the border is only partially inside the image and masks a smaller region.

    -   The corresponding area in the mask is set to zero.

4.  The mask is then converted to a PyTorch tensor and expanded to the same number of channels as the input image.

5.  Finally, the method multiplies the input image by the mask, setting the pixels in the masked regions to zero, and returns the result.
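
The clipping in step 3 can be checked on a tiny example. The sketch below is not part of the original repository: `cutout_mask` is a hypothetical helper that repeats the same arithmetic as the `__call__` method for a single hole with a fixed (not random) center, so that the result is easy to inspect.

```python
import numpy as np

def cutout_mask(h, w, y, x, length):
    """Return an (h, w) mask of ones with one clipped square of zeros centered at (y, x)."""
    mask = np.ones((h, w), np.float32)
    # Same corner arithmetic as in the Cutout class: clip to the image bounds.
    y1 = np.clip(y - length // 2, 0, h)
    y2 = np.clip(y + length // 2, 0, h)
    x1 = np.clip(x - length // 2, 0, w)
    x2 = np.clip(x + length // 2, 0, w)
    mask[y1:y2, x1:x2] = 0.0
    return mask

# A 6x6 image with a patch of side 4.
center_mask = cutout_mask(6, 6, y=3, x=3, length=4)  # patch fully inside the image
corner_mask = cutout_mask(6, 6, y=0, x=0, length=4)  # patch centered on a corner

print(int((center_mask == 0).sum()))  # 16 pixels masked (full 4x4 patch)
print(int((corner_mask == 0).sum()))  # 4 pixels masked (clipped to 2x2)
```

When the center lands on a corner, only a quarter of the patch remains inside the image.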

Remember to import necessary libraries like numpy (np) and PyTorch (torch) before running this class definition. The class Cutout can then be used as part of your data augmentation pipeline when training your models.

The Cutout code we are using comes from this specific file in the original GitHub repository: <https://github.com/uoguelph-mlrg/Cutout/blob/master/util/cutout.py>.

In [None]:
# Source: https://github.com/uoguelph-mlrg/Cutout/blob/master/util/cutout.py
class Cutout(object):
    """Randomly mask out one or more patches from an image.

    Args:
        n_holes (int): Number of patches to cut out of each image.
        length (int): The length (in pixels) of each square patch.
    """
    def __init__(self, n_holes, length):
        self.n_holes = n_holes
        self.length = length

    def __call__(self, img):
        """
        Args:
            img (Tensor): Tensor image of size (C, H, W).
        Returns:
            Tensor: Image with n_holes of dimension length x length cut out of it.
        """
        h = img.size(1)
        w = img.size(2)

        mask = np.ones((h, w), np.float32)

        for n in range(self.n_holes):
            y = np.random.randint(h)
            x = np.random.randint(w)

            y1 = np.clip(y - self.length // 2, 0, h)
            y2 = np.clip(y + self.length // 2, 0, h)
            x1 = np.clip(x - self.length // 2, 0, w)
            x2 = np.clip(x + self.length // 2, 0, w)

            mask[y1: y2, x1: x2] = 0.

        mask = torch.from_numpy(mask)
        mask = mask.expand_as(img)
        img = img * mask

        return img

To see how Cutout works, upload an image of your choice to this workspace and apply Cutout to it. Because we will later resize the image to a square format (100x100 pixels), use an image that is approximately square to avoid distortion.

If you are using Google Colab, follow these steps to upload an image:

1.  Click on the folder icon in the left sidebar to open the ‘Files’ tab.
2.  Click the ‘Upload to session storage’ button (the icon looks like a file with an up arrow).
3.  Select the image file from your local machine that you want to upload.
4.  Wait for the upload to finish. The uploaded file should now appear in the ‘Files’ tab.

If you are using Chameleon, follow these steps:

1.  Click on the upload icon in the left sidebar.
2.  Select the image file from your local machine that you want to upload.
3.  Wait for the upload to finish. The uploaded file should now appear in the ‘Files’ tab.

After the image is uploaded, we can use Python code to load it into our notebook and apply the Cutout augmentation to the image.

In [None]:
# TODO: Replace 'sample.png' with the filename of your own image. 
# If your image is inside a directory, include the directory's name in the path.
img = Image.open('./sample.png')

# Resize the image to 100x100
img = img.resize((100, 100))

Then, the following cell will display your image directly, without any data augmentation:

In [None]:
# Convert the image to a PyTorch tensor
img_tensor = transforms.ToTensor()(img)

# Display the original image
plt.figure(figsize=(6,6))
plt.imshow(img_tensor.permute(1, 2, 0))
plt.show()

The next cell will display your image with Cutout applied:

In [None]:
# Create a Cutout object
cutout_obj = Cutout(n_holes=1, length=50)

# Apply Cutout to the image
img_tensor_Cutout = cutout_obj(img_tensor)

# Convert the tensor back to an image for visualization
img_Cutout = transforms.ToPILImage()(img_tensor_Cutout)

# Display the image with Cutout applied
plt.figure(figsize=(6,6))
plt.imshow(img_tensor_Cutout.permute(1, 2, 0))
plt.show()

Things to try:

-   You can re-run the cell above several times to see how the occlusion is randomly placed in a different position each time.
-   You can try changing the `length` parameter in the cell above, and re-running, to see how the size of the occlusion can change.
-   You can try changing the `n_holes` parameter in the cell above, and re-running, to see how the number of occlusions can change.
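
When changing `length`, keep in mind that the masked area is usually smaller than `length` x `length`, because patches near the border are clipped. As a rough illustration, the sketch below computes the exact average masked fraction for a single hole on a square image by enumerating every possible center. The function name `mean_masked_fraction` is hypothetical, and the calculation assumes centers are drawn uniformly, as `np.random.randint` does in the class above.

```python
import numpy as np

def mean_masked_fraction(size, length):
    """Average fraction of a size x size image zeroed by one hole of side `length`."""
    centers = np.arange(size)
    # Clipped patch extent along one axis for every possible center coordinate.
    lo = np.clip(centers - length // 2, 0, size)
    hi = np.clip(centers + length // 2, 0, size)
    mean_extent = (hi - lo).mean()
    # y and x are drawn independently, so the mean area is the product of the mean extents.
    return (mean_extent ** 2) / (size ** 2)

nominal = (16 * 16) / (32 * 32)          # fraction if the patch were never clipped
actual = mean_masked_fraction(32, 16)    # fraction once clipping is taken into account

print(f"nominal: {nominal:.3f}, average after clipping: {actual:.3f}")
```

For a 32x32 image and `length=16`, the nominal fraction is 0.25, while the average after clipping is about 0.19.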

In [None]:
# TODO: Set the number of patches ("holes") to cut out of the image.
n_holes = 

# TODO: Set the size (length of a side) of each patch.
length = 


# Create a Cutout object (named cutout_obj so that it does not shadow the Cutout class)
cutout_obj = Cutout(n_holes, length)

# Apply Cutout to the image
img_tensor_Cutout = cutout_obj(img_tensor)

# Convert the tensor back to an image for visualization
img_Cutout = transforms.ToPILImage()(img_tensor_Cutout)

# Display the image with Cutout applied
plt.figure(figsize=(6,6))
plt.imshow(img_tensor_Cutout.permute(1, 2, 0))
plt.show()

Cutout was introduced as an alternative to two closely related techniques:

-   Data Augmentation for Images: Data augmentation is a strategy used to increase the diversity of the data available for training models, without actually collecting new data. For image data, this could include operations like rotation, scaling, cropping, flipping, and adding noise. The goal is to make the model more robust by allowing it to see more variations of the data.

-   Dropout in Convolutional Neural Networks: Dropout is a regularization technique for reducing overfitting in neural networks. During training, some of a layer’s outputs are randomly ignored or “dropped out”. As a result, on each training step the layer behaves like a layer with a different number of nodes and different connectivity to the prior layer. In effect, dropout simulates ensembling a large number of neural networks with different architectures, which makes the model more robust.
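
To make the dropout description concrete, here is a minimal sketch using PyTorch's `nn.Dropout` on a tensor of ones. It is only an illustration of the general technique, not code from the Cutout repository; the drop probability of 0.5 and the tensor size are arbitrary choices.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.5)
x = torch.ones(1, 1000)

# Training mode: each element is zeroed with probability p,
# and the surviving elements are scaled by 1 / (1 - p).
drop.train()
y_train = drop(x)

# Evaluation mode: dropout is disabled and the input passes through unchanged.
drop.eval()
y_eval = drop(x)

print(y_train.unique())       # values present in training mode
print(torch.equal(y_eval, x))  # whether eval mode left the input untouched
```

Dropout zeroes individual activations independently, whereas Cutout zeroes one contiguous square region of the input image.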

<!-- to do - expand on these -->

In the following code snippet, we demonstrate some “standard” data augmentation techniques commonly used in image preprocessing. These techniques include random horizontal flipping, random cropping, and color jittering (random variation in brightness, contrast, saturation, and hue). The augmented image is then displayed alongside the original image for comparison.

In [None]:
# Show the same image with the "standard" data augmentation techniques
# discussed in the related work section of the paper

# Define standard data augmentation techniques
transforms_data_augmentation = transforms.Compose([
    RandomHorizontalFlip(),
    RandomCrop(size=(100, 100), padding=4),  # assuming input image is size 100x100
    ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.1),
])

# Apply transformations to the image
augmented_img = transforms_data_augmentation(img)

# Display the original and augmented image
fig, ax = plt.subplots(1, 2, figsize=(10, 5))
ax[0].imshow(img)
ax[0].set_title('Original Image')
ax[1].imshow(augmented_img)
ax[1].set_title('Augmented Image')
plt.show()


# 02. ResNet

Note: for faster training, use Runtime \> Change Runtime Type to run this notebook on a GPU.

In the Cutout paper, the authors claim that:

1.  Cutout improves the robustness and overall performance of convolutional neural networks.
2.  Cutout can be used in conjunction with existing forms of data augmentation and other regularizers to further improve model performance.

In this section, we will evaluate these claims using a ResNet model. For the ResNet model, the specific quantitative claims are given in the following table:

Test error (%) with mean/std normalization, averaged over 5 runs. “+” indicates standard data augmentation (mirror + crop).

| **Network**       | **CIFAR-10** | **CIFAR-10+** | **CIFAR-100** | **CIFAR-100+** |
|-------------|--------------|----------------|--------------|----------------|
| ResNet18          | 10.63        | 4.72          | 36.68         | 22.46          |
| ResNet18 + cutout | 9.31         | 3.99          | 34.98         | 21.96          |

The table shows results on the CIFAR-10 and CIFAR-100 datasets using the ResNet18 architecture, with and without standard data augmentation and Cutout. The “CIFAR-10+” and “CIFAR-100+” columns indicate the use of standard data augmentation, which involves mirror and crop techniques.

With standard data augmentation on CIFAR-10, the ResNet18 model’s test error drops from 10.63% to 4.72%. Adding Cutout on top of standard augmentation brings the error down further, to 3.99%. Cutout also helps without standard augmentation, reducing the error from 10.63% to 9.31%. A similar pattern is observed on CIFAR-100, where standard augmentation reduces the ResNet18 model’s test error from 36.68% to 22.46%, and adding Cutout gives a smaller further reduction to 21.96%.

These results indicate that both standard and Cutout data augmentation reduce test error on CIFAR-10 and CIFAR-100. They also show that the size of the improvement varies with the dataset: the reduction due to Cutout is larger on CIFAR-10 than on CIFAR-100 when standard augmentation is also used.
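
To compare the size of the Cutout effect across columns, it can help to compute the absolute and relative error reductions directly from the table. The sketch below uses only the four pairs of numbers reported above; the dictionary names are ours.

```python
# Reported test error (%) from the table above: (ResNet18, ResNet18 + cutout).
reported = {
    "CIFAR-10": (10.63, 9.31),
    "CIFAR-10+": (4.72, 3.99),
    "CIFAR-100": (36.68, 34.98),
    "CIFAR-100+": (22.46, 21.96),
}

reductions = {}
for name, (baseline, with_cutout) in reported.items():
    absolute = baseline - with_cutout       # percentage points
    relative = 100 * absolute / baseline    # percent of the baseline error
    reductions[name] = (absolute, relative)
    print(f"{name:11s} absolute: {absolute:.2f} points, relative: {relative:.1f}%")
```

You can use the same calculation later to compare the reduction you measure in your own runs with the reduction reported in the paper.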

## Import Library

In [None]:
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable
import torch.backends.cudnn as cudnn
from torch.optim.lr_scheduler import MultiStepLR
from torchvision import datasets, transforms
import numpy as np
import os
from tqdm import tqdm

Check CUDA GPU availability and set the random seed

In [None]:
cuda = torch.cuda.is_available()
print(cuda)
cudnn.benchmark = True  # Should make training go faster for large models

seed = 1
torch.manual_seed(seed)
np.random.seed(seed)

If you are using Google Colab, follow these steps to connect to your Google Drive:

1.  On the left sidebar of the Colab notebook interface, you will see a folder icon with the Google Drive logo. Click on this folder icon to open the file explorer.

2.  If you haven’t connected your Google Drive to Colab yet, it will prompt you to do so. Click the “Mount Drive” button to connect your Google Drive to Colab.

3.  Once your Google Drive is mounted, you can use the file explorer to navigate to the file you want to open. Click on the folders to explore the contents of your Google Drive.

4.  When you find the file you want to open, click the three dots next to the name of the file in the file explorer. From the options that appear, choose “Copy path.” This action will copy the full path of the file to your clipboard. Paste the copied path into the `current_path` variable below.

In [None]:
current_path ="./"

This code block is used for creating a directory named ‘checkpoints’. This directory will be used to store the weights of our models, which are crucial for both preserving our progress during model training and for future use of the trained models.

Creating such a directory and regularly saving model weights is a good practice in machine learning, as it ensures that you can resume your work from where you left off, should the training process be interrupted.

In [None]:
# Create a directory named 'checkpoints' to save the weights of the models
if not os.path.exists(current_path + 'checkpoints'):
    os.makedirs(current_path + 'checkpoints')
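
The training cells in this notebook save only the final model weights with `torch.save(model.state_dict(), ...)`. If you want to be able to resume an interrupted run, one common approach is to also store the optimizer state and the epoch counter. The sketch below illustrates this idea under our own assumptions: the dictionary keys, the file name `example_checkpoint.pt`, the temporary directory, and the toy linear model are all hypothetical and are not part of the original code.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Toy model and optimizer standing in for ResNet-18 and its SGD optimizer.
model = nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# Save everything needed to resume: epoch counter, weights, optimizer state.
ckpt_path = os.path.join(tempfile.mkdtemp(), "example_checkpoint.pt")
torch.save(
    {
        "epoch": 5,
        "model_state": model.state_dict(),
        "optimizer_state": optimizer.state_dict(),
    },
    ckpt_path,
)

# Later (for example after a restart): rebuild the objects and load the saved state.
restored_model = nn.Linear(4, 2)
restored_optimizer = torch.optim.SGD(restored_model.parameters(), lr=0.1, momentum=0.9)
checkpoint = torch.load(ckpt_path, map_location="cpu")
restored_model.load_state_dict(checkpoint["model_state"])
restored_optimizer.load_state_dict(checkpoint["optimizer_state"])
start_epoch = checkpoint["epoch"] + 1  # continue with the next epoch

print("resuming from epoch", start_epoch)
```

If a learning rate scheduler is used, its `state_dict()` would need to be saved and restored in the same way.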

## 2.1 Implementation Code

### 2.1.1 ResNet Code

In [None]:
# ResNet
# From https://github.com/uoguelph-mlrg/Cutout/blob/master/model/resnet.py

def conv3x3(in_planes, out_planes, stride=1):
    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride, padding=1, bias=False)


class BasicBlock(nn.Module):
    expansion = 1

    def __init__(self, in_planes, planes, stride=1):
        super(BasicBlock, self).__init__()
        self.conv1 = conv3x3(in_planes, planes, stride)
        self.bn1 = nn.BatchNorm2d(planes)
        self.conv2 = conv3x3(planes, planes)
        self.bn2 = nn.BatchNorm2d(planes)

        self.shortcut = nn.Sequential()
        if stride != 1 or in_planes != self.expansion*planes:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_planes, self.expansion*planes, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(self.expansion*planes)
            )

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out += self.shortcut(x)
        out = F.relu(out)
        return out


class Bottleneck(nn.Module):
    expansion = 4

    def __init__(self, in_planes, planes, stride=1):
        super(Bottleneck, self).__init__()
        self.conv1 = nn.Conv2d(in_planes, planes, kernel_size=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes)
        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)
        self.conv3 = nn.Conv2d(planes, self.expansion*planes, kernel_size=1, bias=False)
        self.bn3 = nn.BatchNorm2d(self.expansion*planes)

        self.shortcut = nn.Sequential()
        if stride != 1 or in_planes != self.expansion*planes:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_planes, self.expansion*planes, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(self.expansion*planes)
            )

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = F.relu(self.bn2(self.conv2(out)))
        out = self.bn3(self.conv3(out))
        out += self.shortcut(x)
        out = F.relu(out)
        return out


class ResNet(nn.Module):
    def __init__(self, block, num_blocks, num_classes=10):
        super(ResNet, self).__init__()
        self.in_planes = 64

        self.conv1 = conv3x3(3,64)
        self.bn1 = nn.BatchNorm2d(64)
        self.layer1 = self._make_layer(block, 64, num_blocks[0], stride=1)
        self.layer2 = self._make_layer(block, 128, num_blocks[1], stride=2)
        self.layer3 = self._make_layer(block, 256, num_blocks[2], stride=2)
        self.layer4 = self._make_layer(block, 512, num_blocks[3], stride=2)
        self.linear = nn.Linear(512*block.expansion, num_classes)

    def _make_layer(self, block, planes, num_blocks, stride):
        strides = [stride] + [1]*(num_blocks-1)
        layers = []
        for stride in strides:
            layers.append(block(self.in_planes, planes, stride))
            self.in_planes = planes * block.expansion
        return nn.Sequential(*layers)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.layer1(out)
        out = self.layer2(out)
        out = self.layer3(out)
        out = self.layer4(out)
        out = F.avg_pool2d(out, 4)
        out = out.view(out.size(0), -1)
        out = self.linear(out)
        return out


def ResNet18(num_classes=10):
    return ResNet(BasicBlock, [2,2,2,2], num_classes)

### 2.1.2. Model Evaluate Test Code

This function evaluates the performance of the model on a given data loader (`loader`). It sets the model to evaluation mode (`eval`), computes the accuracy over the dataset, switches the model back to training mode (`train`), and returns the accuracy.

In [None]:
def test(loader, cnn):
    cnn.eval()    # Change model to 'eval' mode (BN uses moving mean/var).
    correct = 0.
    total = 0.
    for images, labels in loader:
        images = images.cuda()
        labels = labels.cuda()

        with torch.no_grad():
            pred = cnn(images)

        pred = torch.max(pred.data, 1)[1]
        total += labels.size(0)
        correct += (pred == labels).sum().item()

    val_acc = correct / total
    cnn.train()
    return val_acc
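
The `test` function above calls `.cuda()` directly, so it requires a GPU. As a hedged alternative, the sketch below shows a device-agnostic variant that also runs on a CPU. The name `evaluate`, the `device` argument, and the toy model and data are our own additions for illustration; the accuracy computation is the same.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def evaluate(loader, model, device):
    """Return classification accuracy of `model` on `loader`, then restore train mode."""
    model.eval()  # BatchNorm uses running statistics, dropout is disabled
    correct, total = 0, 0
    with torch.no_grad():
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            pred = model(images).argmax(dim=1)  # index of the largest logit
            total += labels.size(0)
            correct += (pred == labels).sum().item()
    model.train()
    return correct / total

# Toy check with random data and an untrained linear classifier.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
torch.manual_seed(0)
toy_model = nn.Linear(8, 3).to(device)
toy_data = TensorDataset(torch.randn(32, 8), torch.randint(0, 3, (32,)))
toy_acc = evaluate(DataLoader(toy_data, batch_size=16), toy_model, device)
print(f"toy accuracy: {toy_acc:.3f}")
```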

## 2.2 Training ResNet-18 on CIFAR-10

### 2.2.1. Training ResNet-18 on CIFAR-10 without Cutout

Image Processing for CIFAR-10

In [None]:
# Image Preprocessing

normalize_image_cifar10 = transforms.Normalize(mean=[x / 255.0 for x in [125.3, 123.0, 113.9]], std=[x / 255.0 for x in [63.0, 62.1, 66.7]])

train_transform_cifar10 = transforms.Compose([])

train_transform_cifar10.transforms.append(transforms.ToTensor())
train_transform_cifar10.transforms.append(normalize_image_cifar10)



test_transform_cifar10 = transforms.Compose([
    transforms.ToTensor(),
    normalize_image_cifar10])
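
The `mean` and `std` values passed to `transforms.Normalize` are given on a 0 to 255 scale and divided by 255, because `ToTensor` rescales pixel values to the range 0 to 1. We assume here that they are per-channel statistics of the training images. The sketch below shows how such statistics could be computed; it uses a synthetic random array (`fake_images`, a hypothetical stand-in) rather than the real CIFAR-10 data, so the numbers it prints are not the CIFAR-10 values.

```python
import numpy as np

# Synthetic stand-in for a training set: N images of 32x32 pixels, 3 channels, uint8.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(1000, 32, 32, 3), dtype=np.uint8)

# Per-channel statistics over all images and all pixel positions (0-255 scale).
channel_mean = fake_images.mean(axis=(0, 1, 2))
channel_std = fake_images.std(axis=(0, 1, 2))

# Rescale to the 0-1 range expected after transforms.ToTensor().
mean_01 = channel_mean / 255.0
std_01 = channel_std / 255.0

print("mean (0-255):", channel_mean.round(1))
print("std  (0-255):", channel_std.round(1))
```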

Import the dataset of CIFAR-10

In [None]:
train_dataset_cifar10 = datasets.CIFAR10(root=current_path + 'data/',
                                     train=True,
                                     transform=train_transform_cifar10,
                                     download=True)

test_dataset_cifar10 = datasets.CIFAR10(root=current_path + 'data/',
                                    train=False,
                                    transform=test_transform_cifar10,
                                    download=True)

Wrap the datasets in data loaders

In [None]:
# Data Loader (Input Pipeline)
batch_size_cifar10 = 128
train_loader_cifar10 = torch.utils.data.DataLoader(dataset=train_dataset_cifar10,
                                           batch_size=batch_size_cifar10,
                                           shuffle=True,
                                           pin_memory=True,
                                           num_workers=2)

test_loader_cifar10 = torch.utils.data.DataLoader(dataset=test_dataset_cifar10,
                                          batch_size=batch_size_cifar10,
                                          shuffle=False,
                                          pin_memory=True,
                                          num_workers=2)

Define the model

This code block sets up the machine learning model, loss function, optimizer, and learning rate scheduler.

In [None]:
# file_name is used to name the saved model weights file
file_name_resnet18_cifar10 = "resnet18_cifar10"

num_classes_cifar10 = 10
resnet18_cifar10 = ResNet18(num_classes=num_classes_cifar10)


resnet18_cifar10 = resnet18_cifar10.cuda()
learning_rate_resnet18_cifar10 = 0.1
criterion_resnet18_cifar10 = nn.CrossEntropyLoss().cuda()
cnn_optimizer_resnet18_cifar10 = torch.optim.SGD(resnet18_cifar10.parameters(), lr=learning_rate_resnet18_cifar10,
                                momentum=0.9, nesterov=True, weight_decay=5e-4)
scheduler_resnet18_cifar10 = MultiStepLR(cnn_optimizer_resnet18_cifar10, milestones=[60, 120, 160], gamma=0.2)
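
To see what the `MultiStepLR` scheduler does with `milestones=[60, 120, 160]` and `gamma=0.2`, the sketch below steps a scheduler through 200 epochs using a dummy parameter in place of the real model and records the learning rate at the start of each epoch. The dummy parameter and the list `lr_per_epoch` are only for illustration.

```python
import torch
from torch.optim.lr_scheduler import MultiStepLR

# A single dummy parameter stands in for the model's parameters.
param = torch.nn.Parameter(torch.zeros(1))
optimizer = torch.optim.SGD([param], lr=0.1)
scheduler = MultiStepLR(optimizer, milestones=[60, 120, 160], gamma=0.2)

lr_per_epoch = []
for epoch in range(200):
    lr_per_epoch.append(optimizer.param_groups[0]["lr"])
    optimizer.step()   # no gradients here, so this is a no-op
    scheduler.step()   # called once per epoch, as in the training loop

for epoch in [0, 59, 60, 120, 160, 199]:
    print(f"epoch {epoch:3d}: lr = {lr_per_epoch[epoch]:.4f}")
```

The learning rate starts at 0.1 and is multiplied by 0.2 at epochs 60, 120, and 160, giving 0.02, 0.004, and 0.0008.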

Training ResNet-18 without Cutout

This code runs the training loop for the model over a specified number of epochs. Each epoch involves a forward pass, loss computation, backpropagation, and parameter updates for every batch. The progress bar displays the running training accuracy and cross-entropy loss. At the end of each epoch, the model is evaluated on the test set and the test accuracy is printed. After the last epoch, the model weights are saved to the `checkpoints` directory and the final test error is printed.

In [None]:
epochs = 200
for epoch in range(epochs):

    xentropy_loss_avg = 0.
    correct = 0.
    total = 0.

    progress_bar = tqdm(train_loader_cifar10)
    for i, (images, labels) in enumerate(progress_bar):
        progress_bar.set_description('Epoch ' + str(epoch))

        images = images.cuda()
        labels = labels.cuda()

        resnet18_cifar10.zero_grad()
        pred = resnet18_cifar10(images)

        xentropy_loss = criterion_resnet18_cifar10(pred, labels)
        xentropy_loss.backward()
        cnn_optimizer_resnet18_cifar10.step()

        xentropy_loss_avg += xentropy_loss.item()

        # Calculate running average of accuracy
        pred = torch.max(pred.data, 1)[1]
        total += labels.size(0)
        correct += (pred == labels.data).sum().item()
        accuracy = correct / total

        progress_bar.set_postfix(
            xentropy='%.3f' % (xentropy_loss_avg / (i + 1)),
            acc='%.3f' % accuracy)

    test_accr_resnet18_cifar10 = test(test_loader_cifar10, resnet18_cifar10)
    tqdm.write('test_acc: %.3f' % (test_accr_resnet18_cifar10))

    scheduler_resnet18_cifar10.step()    

    
torch.save(resnet18_cifar10.state_dict(), current_path + 'checkpoints/' + file_name_resnet18_cifar10 + '.pt')


# Convert the final test accuracy to test error (%) for comparison with the paper's table
final_test_acc_resnet18_cifar10 = (1 - test(test_loader_cifar10, resnet18_cifar10))*100
print('Final test error (%%) of ResNet-18 without Cutout on CIFAR-10: %.3f' % (final_test_acc_resnet18_cifar10))

### 2.2.2. Training ResNet-18 on CIFAR-10 with Cutout

Cutout Code

In [None]:
# Source: https://github.com/uoguelph-mlrg/Cutout/blob/master/util/cutout.py
class Cutout(object):
    """Randomly mask out one or more patches from an image.

    Args:
        n_holes (int): Number of patches to cut out of each image.
        length (int): The length (in pixels) of each square patch.
    """
    def __init__(self, n_holes, length):
        self.n_holes = n_holes
        self.length = length

    def __call__(self, img):
        """
        Args:
            img (Tensor): Tensor image of size (C, H, W).
        Returns:
            Tensor: Image with n_holes of dimension length x length cut out of it.
        """
        h = img.size(1)
        w = img.size(2)

        mask = np.ones((h, w), np.float32)

        for n in range(self.n_holes):
            y = np.random.randint(h)
            x = np.random.randint(w)

            y1 = np.clip(y - self.length // 2, 0, h)
            y2 = np.clip(y + self.length // 2, 0, h)
            x1 = np.clip(x - self.length // 2, 0, w)
            x2 = np.clip(x + self.length // 2, 0, w)

            mask[y1: y2, x1: x2] = 0.

        mask = torch.from_numpy(mask)
        mask = mask.expand_as(img)
        img = img * mask

        return img

Image Processing for CIFAR-10

In [None]:
# Image Preprocessing

normalize_image_cifar10 = transforms.Normalize(mean=[x / 255.0 for x in [125.3, 123.0, 113.9]], std=[x / 255.0 for x in [63.0, 62.1, 66.7]])

train_transform_cifar10_cutout = transforms.Compose([])

train_transform_cifar10_cutout.transforms.append(transforms.ToTensor())
train_transform_cifar10_cutout.transforms.append(normalize_image_cifar10)

# Add Cutout to the image transform pipeline
n_holes_cifar10 = 1
length_cifar10 = 16
train_transform_cifar10_cutout.transforms.append(Cutout(n_holes=n_holes_cifar10, length=length_cifar10))


test_transform_cifar10 = transforms.Compose([
    transforms.ToTensor(),
    normalize_image_cifar10])

Import the dataset of CIFAR-10

In [None]:
train_dataset_cifar10_cutout = datasets.CIFAR10(root=current_path + 'data/',
                                     train=True,
                                     transform=train_transform_cifar10_cutout,
                                     download=True)

test_dataset_cifar10 = datasets.CIFAR10(root=current_path + 'data/',
                                    train=False,
                                    transform=test_transform_cifar10,
                                    download=True)

Wrap the datasets in data loaders

In [None]:
# Data Loader (Input Pipeline)
batch_size_cifar10_cutout = 128
train_loader_cifar10_cutout = torch.utils.data.DataLoader(dataset=train_dataset_cifar10_cutout,
                                           batch_size=batch_size_cifar10_cutout,
                                           shuffle=True,
                                           pin_memory=True,
                                           num_workers=2)

test_loader_cifar10 = torch.utils.data.DataLoader(dataset=test_dataset_cifar10,
                                          batch_size=batch_size_cifar10_cutout,
                                          shuffle=False,
                                          pin_memory=True,
                                          num_workers=2)

Define the model

This code block sets up the machine learning model, loss function, optimizer, and learning rate scheduler.

In [None]:
# file_name is used to name the saved model weights file
file_name_resnet18_cifar10_cutout = "resnet18_cifar10_cutout"

num_classes_cifar10 = 10
resnet18_cifar10_cutout = ResNet18(num_classes=num_classes_cifar10)


resnet18_cifar10_cutout = resnet18_cifar10_cutout.cuda()
learning_rate_resnet18_cifar10_cutout = 0.1
criterion_resnet18_cifar10_cutout = nn.CrossEntropyLoss().cuda()
cnn_optimizer_resnet18_cifar10_cutout = torch.optim.SGD(resnet18_cifar10_cutout.parameters(), lr=learning_rate_resnet18_cifar10_cutout,
                                momentum=0.9, nesterov=True, weight_decay=5e-4)
scheduler_resnet18_cifar10_cutout = MultiStepLR(cnn_optimizer_resnet18_cifar10_cutout, milestones=[60, 120, 160], gamma=0.2)

Training ResNet-18 with Cutout

This code runs the training loop for the model over a specified number of epochs. Each epoch involves a forward pass, loss computation, backpropagation, and parameter updates for every batch. The progress bar displays the running training accuracy and cross-entropy loss. At the end of each epoch, the model is evaluated on the test set and the test accuracy is printed. After the last epoch, the model weights are saved to the `checkpoints` directory and the final test error is printed.

In [None]:
epochs = 200
for epoch in range(epochs):

    xentropy_loss_avg = 0.
    correct = 0.
    total = 0.

    progress_bar = tqdm(train_loader_cifar10_cutout)
    for i, (images, labels) in enumerate(progress_bar):
        progress_bar.set_description('Epoch ' + str(epoch))

        images = images.cuda()
        labels = labels.cuda()

        resnet18_cifar10_cutout.zero_grad()
        pred = resnet18_cifar10_cutout(images)

        xentropy_loss = criterion_resnet18_cifar10_cutout(pred, labels)
        xentropy_loss.backward()
        cnn_optimizer_resnet18_cifar10_cutout.step()

        xentropy_loss_avg += xentropy_loss.item()

        # Calculate running average of accuracy
        pred = torch.max(pred.data, 1)[1]
        total += labels.size(0)
        correct += (pred == labels.data).sum().item()
        accuracy = correct / total

        progress_bar.set_postfix(
            xentropy='%.3f' % (xentropy_loss_avg / (i + 1)),
            acc='%.3f' % accuracy)

    test_acc_cifar10 = test(test_loader_cifar10,resnet18_cifar10_cutout)
    tqdm.write('test_acc: %.3f' % (test_acc_cifar10))
    scheduler_resnet18_cifar10_cutout.step()     
torch.save(resnet18_cifar10_cutout.state_dict(), current_path + 'checkpoints/' + file_name_resnet18_cifar10_cutout + '.pt')


# (1 - accuracy) * 100 gives the test error in percent
final_test_acc_resnet18_cifar10_cutout = (1 - test(test_loader_cifar10, resnet18_cifar10_cutout))*100
print('Final test error (%%) for ResNet-18 using Cutout on the CIFAR-10 test set: %.3f' % (final_test_acc_resnet18_cifar10_cutout))

### 2.2.3. Training ResNet-18 in CF10 with Data Augmentation

Image Processing for CIFAR-10

In [None]:
# Image Preprocessing

normalize_image_cifar10 = transforms.Normalize(mean=[x / 255.0 for x in [125.3, 123.0, 113.9]], std=[x / 255.0 for x in [63.0, 62.1, 66.7]])

train_transform_cifar10_da = transforms.Compose([])
train_transform_cifar10_da.transforms.append(transforms.RandomCrop(32, padding=4))
train_transform_cifar10_da.transforms.append(transforms.RandomHorizontalFlip())
train_transform_cifar10_da.transforms.append(transforms.ToTensor())
train_transform_cifar10_da.transforms.append(normalize_image_cifar10)


test_transform_cifar10 = transforms.Compose([
    transforms.ToTensor(),
    normalize_image_cifar10])
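
To make the normalization step concrete, here is a minimal sketch in plain Python of what `ToTensor` followed by `Normalize` does to a single pixel. The helper name `normalize_pixel` is hypothetical and only for illustration; the statistics are the ones used in the cell above.

```python
# Per-channel statistics from the notebook, rescaled from [0, 255] to [0, 1]
mean = [x / 255.0 for x in [125.3, 123.0, 113.9]]
std = [x / 255.0 for x in [63.0, 62.1, 66.7]]


def normalize_pixel(rgb_0_255):
    """Scale an RGB pixel to [0, 1], then apply (x - mean) / std per channel."""
    scaled = [v / 255.0 for v in rgb_0_255]
    return [(s - m) / d for s, m, d in zip(scaled, mean, std)]


# A pixel equal to the dataset mean maps to (approximately) zero in every channel
mean_pixel = normalize_pixel([125.3, 123.0, 113.9])
# A pure white pixel ends up roughly two standard deviations above the mean
white_pixel = normalize_pixel([255, 255, 255])
print(mean_pixel, white_pixel)
```

After this step the network sees inputs that are roughly zero-centered with unit scale per channel, rather than raw values in [0, 255].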

Import the dataset of CIFAR-10

In [None]:
train_dataset_cifar10_da = datasets.CIFAR10(root=current_path + 'data/',
                                     train=True,
                                     transform=train_transform_cifar10_da,
                                     download=True)

test_dataset_cifar10 = datasets.CIFAR10(root=current_path + 'data/',
                                    train=False,
                                    transform=test_transform_cifar10,
                                    download=True)

Wrap the datasets in DataLoaders

In [None]:
# Data Loader (Input Pipeline)
batch_size_cifar10_da = 128
train_loader_cifar10_da = torch.utils.data.DataLoader(dataset=train_dataset_cifar10_da,
                                           batch_size=batch_size_cifar10_da,
                                           shuffle=True,
                                           pin_memory=True,
                                           num_workers=2)

test_loader_cifar10 = torch.utils.data.DataLoader(dataset=test_dataset_cifar10,
                                          batch_size=batch_size_cifar10_da,
                                          shuffle=False,
                                          pin_memory=True,
                                          num_workers=2)
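
As a quick sanity check on what these loaders imply for training, the sketch below works out the number of batches per epoch and the total number of parameter updates. It assumes the standard CIFAR-10 split sizes (50,000 training and 10,000 test images) and the default `DataLoader` behaviour of keeping the final partial batch.

```python
import math

train_images, test_images = 50_000, 10_000  # standard CIFAR-10 split sizes
batch_size = 128
epochs = 200

# The last batch of each pass is smaller because 128 does not divide the set size
train_batches = math.ceil(train_images / batch_size)
test_batches = math.ceil(test_images / batch_size)
last_train_batch = train_images - (train_batches - 1) * batch_size

# One optimizer step per training batch
total_updates = train_batches * epochs
print(train_batches, test_batches, last_train_batch, total_updates)
```

The progress bar in the training cell should therefore show 391 iterations per epoch.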

Define the model

This code block sets up the machine learning model, loss function, optimizer, and learning rate scheduler.

In [None]:
# file_name is used to name the saved model weights and the results
file_name_resnet18_cifar10_da = "resnet18_cifar10_da"

num_classes_cifar10 = 10
resnet18_cifar10_da = ResNet18(num_classes=num_classes_cifar10)


resnet18_cifar10_da = resnet18_cifar10_da.cuda()
learning_rate_resnet18_cifar10_da = 0.1
criterion_resnet18_cifar10_da = nn.CrossEntropyLoss().cuda()
cnn_optimizer_resnet18_cifar10_da = torch.optim.SGD(resnet18_cifar10_da.parameters(), lr=learning_rate_resnet18_cifar10_da,
                                momentum=0.9, nesterov=True, weight_decay=5e-4)
scheduler_resnet18_cifar10_da = MultiStepLR(cnn_optimizer_resnet18_cifar10_da, milestones=[60, 120, 160], gamma=0.2)
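
The learning rate implied by this configuration can be worked out by hand. The sketch below reproduces the step-decay rule in plain Python (multiply the rate by `gamma` each time a milestone is reached), assuming the scheduler is stepped once at the end of every epoch, as in the training loops in this notebook. The function `lr_at_epoch` is a hypothetical helper, not part of PyTorch.

```python
from bisect import bisect_right

base_lr = 0.1
milestones = [60, 120, 160]
gamma = 0.2


def lr_at_epoch(epoch):
    """Learning rate in effect during a 0-indexed epoch under step decay."""
    # Number of milestones already reached determines how often we decayed
    return base_lr * gamma ** bisect_right(milestones, epoch)


schedule = {epoch: lr_at_epoch(epoch) for epoch in (0, 59, 60, 120, 160, 199)}
for epoch, lr in schedule.items():
    print(epoch, round(lr, 6))
```

Under these assumptions the rate is 0.1 for epochs 0 to 59, 0.02 for epochs 60 to 119, 0.004 for epochs 120 to 159, and 0.0008 for the remaining epochs.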

Training ResNet-18 with Data Augmentation

This cell trains the model for the specified number of epochs. Each iteration performs a forward pass, computes the cross-entropy loss, backpropagates, and updates the parameters, while the progress bar shows the running training accuracy and average loss. At the end of each epoch, the model is evaluated on the test set and the test accuracy is printed. After the last epoch, the weights are saved to a checkpoint file and the final test error (in percent) is reported.

In [None]:
epochs = 200
for epoch in range(epochs):

    xentropy_loss_avg = 0.
    correct = 0.
    total = 0.

    progress_bar = tqdm(train_loader_cifar10_da)
    for i, (images, labels) in enumerate(progress_bar):
        progress_bar.set_description('Epoch ' + str(epoch))

        images = images.cuda()
        labels = labels.cuda()

        resnet18_cifar10_da.zero_grad()
        pred = resnet18_cifar10_da(images)

        xentropy_loss = criterion_resnet18_cifar10_da(pred, labels)
        xentropy_loss.backward()
        cnn_optimizer_resnet18_cifar10_da.step()

        xentropy_loss_avg += xentropy_loss.item()

        # Calculate running average of accuracy
        pred = torch.max(pred.data, 1)[1]
        total += labels.size(0)
        correct += (pred == labels.data).sum().item()
        accuracy = correct / total

        progress_bar.set_postfix(
            xentropy='%.3f' % (xentropy_loss_avg / (i + 1)),
            acc='%.3f' % accuracy)

    test_acc_resnet18_cifar10_da = test(test_loader_cifar10,resnet18_cifar10_da)
    tqdm.write('test_acc: %.3f' % (test_acc_resnet18_cifar10_da))
    scheduler_resnet18_cifar10_da.step()     
torch.save(resnet18_cifar10_da.state_dict(), current_path + 'checkpoints/' + file_name_resnet18_cifar10_da + '.pt')


# (1 - accuracy) * 100 gives the test error in percent
final_test_acc_resnet18_cifar10_da = (1 - test(test_loader_cifar10, resnet18_cifar10_da))*100
print('Final test error (%%) for ResNet-18 using Data Augmentation on the CIFAR-10 test set: %.3f' % (final_test_acc_resnet18_cifar10_da))
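
The same training loop is written out for every configuration in this notebook. As a sketch only, the per-epoch logic could be factored into a helper like the one below. The name `train_one_epoch` and its signature are hypothetical, and it differs slightly from the cells above: it takes an explicit `device`, and it weights the average loss by batch size instead of averaging per-batch means. The check at the end uses random synthetic data on the CPU, not CIFAR.

```python
import torch
import torch.nn as nn


def train_one_epoch(model, loader, criterion, optimizer, device):
    """Run one pass over `loader` and return (mean loss, accuracy)."""
    model.train()
    loss_sum, correct, total = 0.0, 0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)

        optimizer.zero_grad()              # clear gradients from the previous step
        logits = model(images)             # forward pass
        loss = criterion(logits, labels)   # cross-entropy loss
        loss.backward()                    # backpropagation
        optimizer.step()                   # parameter update

        # Accumulate statistics for the running averages
        loss_sum += loss.item() * labels.size(0)
        correct += (logits.argmax(dim=1) == labels).sum().item()
        total += labels.size(0)
    return loss_sum / total, correct / total


# Tiny synthetic check: 64 random samples, 8 features, 4 classes
torch.manual_seed(0)
data = torch.utils.data.TensorDataset(torch.randn(64, 8), torch.randint(0, 4, (64,)))
loader = torch.utils.data.DataLoader(data, batch_size=16)
model = nn.Linear(8, 4)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss, acc = train_one_epoch(model, loader, nn.CrossEntropyLoss(), optimizer, torch.device("cpu"))
print(loss, acc)
```

With such a helper, each experiment would reduce to building its loader, model, and optimizer, then calling the function once per epoch before stepping the scheduler.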

### 2.2.4. Training ResNet-18 in CF10 with Data Augmentation and Cutout

Image Processing for CIFAR-10

In [None]:
# Image Preprocessing

normalize_image_cifar10 = transforms.Normalize(mean=[x / 255.0 for x in [125.3, 123.0, 113.9]], std=[x / 255.0 for x in [63.0, 62.1, 66.7]])

train_transform_cifar10_da_co = transforms.Compose([])
train_transform_cifar10_da_co.transforms.append(transforms.RandomCrop(32, padding=4))
train_transform_cifar10_da_co.transforms.append(transforms.RandomHorizontalFlip())
train_transform_cifar10_da_co.transforms.append(transforms.ToTensor())
train_transform_cifar10_da_co.transforms.append(normalize_image_cifar10)

#Add Cutout to the image transformer pipeline
n_holes_cifar10_da_co = 1
length_cifar10_da_co = 16
train_transform_cifar10_da_co.transforms.append(Cutout(n_holes=n_holes_cifar10_da_co, length=length_cifar10_da_co))


test_transform_cifar10 = transforms.Compose([
    transforms.ToTensor(),
    normalize_image_cifar10])
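
To get a feel for how much of each image the chosen `length` removes, the sketch below computes the expected fraction of a 32×32 image that is masked by one hole. It assumes, as in the authors' released implementation, that the patch center is drawn uniformly over pixel positions and that the patch is clipped at the image border, so holes near an edge are smaller. The function name `expected_masked_fraction` is hypothetical.

```python
def expected_masked_fraction(image_size, length):
    """Expected share of pixels zeroed by one square hole with a uniform random center."""
    half = length // 2
    spans = []
    for center in range(image_size):
        lo = max(center - half, 0)           # clip at the top/left border
        hi = min(center + half, image_size)  # clip at the bottom/right border
        spans.append(hi - lo)
    mean_span = sum(spans) / image_size
    # Row and column positions are independent, so the expected area is the product
    return (mean_span ** 2) / (image_size ** 2)


frac_16 = expected_masked_fraction(32, 16)  # setting used for CIFAR-10
frac_8 = expected_masked_fraction(32, 8)    # setting used for CIFAR-100
print(frac_16, frac_8)
```

Under these assumptions a 16-pixel hole hides about 19% of the image on average (at most 25%), while an 8-pixel hole hides about 5.5% (at most 6.25%).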

Import the dataset of CIFAR-10

In [None]:
train_dataset_cifar10_da_co = datasets.CIFAR10(root=current_path + 'data/',
                                     train=True,
                                     transform=train_transform_cifar10_da_co,
                                     download=True)

test_dataset_cifar10 = datasets.CIFAR10(root=current_path + 'data/',
                                    train=False,
                                    transform=test_transform_cifar10,
                                    download=True)

Wrap the datasets in DataLoaders

In [None]:
# Data Loader (Input Pipeline)
batch_size_cifar10_da_co = 128
train_loader_cifar10_da_co = torch.utils.data.DataLoader(dataset=train_dataset_cifar10_da_co,
                                           batch_size=batch_size_cifar10_da_co,
                                           shuffle=True,
                                           pin_memory=True,
                                           num_workers=2)

test_loader_cifar10 = torch.utils.data.DataLoader(dataset=test_dataset_cifar10,
                                          batch_size=batch_size_cifar10_da_co,
                                          shuffle=False,
                                          pin_memory=True,
                                          num_workers=2)

Define the model

This code block sets up the machine learning model, loss function, optimizer, and learning rate scheduler.

In [None]:
# file_name is used to name the saved model weights and the results
file_name_resnet18_cifar10_da_cutout = "resnet18_cifar10_da_cutout"

num_classes_cifar10 = 10
resnet18_cifar10_da_cutout = ResNet18(num_classes=num_classes_cifar10)


resnet18_cifar10_da_cutout = resnet18_cifar10_da_cutout.cuda()
learning_rate_cifar10_da_cutout = 0.1
criterion_cifar10_da_cutout = nn.CrossEntropyLoss().cuda()
cnn_optimizer_cifar10_da_cutout = torch.optim.SGD(resnet18_cifar10_da_cutout.parameters(), lr=learning_rate_cifar10_da_cutout,
                                momentum=0.9, nesterov=True, weight_decay=5e-4)
scheduler_cifar10_da_cutout = MultiStepLR(cnn_optimizer_cifar10_da_cutout, milestones=[60, 120, 160], gamma=0.2)

Training ResNet-18 with Data Augmentation and Cutout

This cell trains the model for the specified number of epochs. Each iteration performs a forward pass, computes the cross-entropy loss, backpropagates, and updates the parameters, while the progress bar shows the running training accuracy and average loss. At the end of each epoch, the model is evaluated on the test set and the test accuracy is printed. After the last epoch, the weights are saved to a checkpoint file and the final test error (in percent) is reported.

In [None]:
epochs = 200
for epoch in range(epochs):

    xentropy_loss_avg = 0.
    correct = 0.
    total = 0.

    progress_bar = tqdm(train_loader_cifar10_da_co)
    for i, (images, labels) in enumerate(progress_bar):
        progress_bar.set_description('Epoch ' + str(epoch))

        images = images.cuda()
        labels = labels.cuda()

        resnet18_cifar10_da_cutout.zero_grad()
        pred = resnet18_cifar10_da_cutout(images)

        xentropy_loss = criterion_cifar10_da_cutout(pred, labels)
        xentropy_loss.backward()
        cnn_optimizer_cifar10_da_cutout.step()

        xentropy_loss_avg += xentropy_loss.item()

        # Calculate running average of accuracy
        pred = torch.max(pred.data, 1)[1]
        total += labels.size(0)
        correct += (pred == labels.data).sum().item()
        accuracy = correct / total

        progress_bar.set_postfix(
            xentropy='%.3f' % (xentropy_loss_avg / (i + 1)),
            acc='%.3f' % accuracy)

    test_acc_cifar10_da_cutout = test(test_loader_cifar10,resnet18_cifar10_da_cutout)
    tqdm.write('test_acc: %.3f' % (test_acc_cifar10_da_cutout))
    scheduler_cifar10_da_cutout.step()     
torch.save(resnet18_cifar10_da_cutout.state_dict(), current_path + 'checkpoints/' + file_name_resnet18_cifar10_da_cutout + '.pt')


# (1 - accuracy) * 100 gives the test error in percent
final_test_acc_resnet18_cifar10_da_cutout = (1 - test(test_loader_cifar10, resnet18_cifar10_da_cutout))*100
print('Final test error (%%) for ResNet-18 using Data Augmentation and Cutout on the CIFAR-10 test set: %.3f' % (final_test_acc_resnet18_cifar10_da_cutout))

In [None]:
# Summary of CIFAR-10 results; each value is a test error in percent
print('Final test error (%%) for ResNet-18 without Cutout on the CIFAR-10 test set: %.3f' % (final_test_acc_resnet18_cifar10))
print('Final test error (%%) for ResNet-18 using Cutout on the CIFAR-10 test set: %.3f' % (final_test_acc_resnet18_cifar10_cutout))
print('Final test error (%%) for ResNet-18 using Data Augmentation on the CIFAR-10 test set: %.3f' % (final_test_acc_resnet18_cifar10_da))
print('Final test error (%%) for ResNet-18 using Data Augmentation and Cutout on the CIFAR-10 test set: %.3f' % (final_test_acc_resnet18_cifar10_da_cutout))

## 2.3 Training ResNet-18 in CIFAR-100

### 2.3.1. Training ResNet-18 in CF100 without Cutout

Image Processing for CIFAR-100

In [None]:
# Image Preprocessing

normalize_image_cifar100 = transforms.Normalize(mean=[x / 255.0 for x in [125.3, 123.0, 113.9]], std=[x / 255.0 for x in [63.0, 62.1, 66.7]])

train_transform_cifar100 = transforms.Compose([])

train_transform_cifar100.transforms.append(transforms.ToTensor())
train_transform_cifar100.transforms.append(normalize_image_cifar100)



test_transform_cifar100 = transforms.Compose([
    transforms.ToTensor(),
    normalize_image_cifar100])

Import the dataset of CIFAR-100

In [None]:
train_dataset_cifar100 = datasets.CIFAR100(root=current_path + 'data/',
                                     train=True,
                                     transform=train_transform_cifar100,
                                     download=True)

test_dataset_cifar100 = datasets.CIFAR100(root=current_path + 'data/',
                                    train=False,
                                    transform=test_transform_cifar100,
                                    download=True)

Wrap the datasets in DataLoaders

In [None]:
# Data Loader (Input Pipeline)
batch_size_cifar100 = 128
train_loader_cifar100 = torch.utils.data.DataLoader(dataset=train_dataset_cifar100,
                                           batch_size=batch_size_cifar100,
                                           shuffle=True,
                                           pin_memory=True,
                                           num_workers=2)

test_loader_cifar100 = torch.utils.data.DataLoader(dataset=test_dataset_cifar100,
                                          batch_size=batch_size_cifar100,
                                          shuffle=False,
                                          pin_memory=True,
                                          num_workers=2)

Define the model

This code block sets up the machine learning model, loss function, optimizer, and learning rate scheduler.

In [None]:
# file_name is used to name the saved model weights and the results
file_name_resnet18_cifar100 = "resnet18_cifar100"

num_classes_cifar100 = 100
resnet18_cifar100 = ResNet18(num_classes=num_classes_cifar100)


resnet18_cifar100 = resnet18_cifar100.cuda()
learning_rate_resnet18_cifar100 = 0.1
criterion_resnet18_cifar100 = nn.CrossEntropyLoss().cuda()
cnn_optimizer_resnet18_cifar100 = torch.optim.SGD(resnet18_cifar100.parameters(), lr=learning_rate_resnet18_cifar100,
                                momentum=0.9, nesterov=True, weight_decay=5e-4)
scheduler_resnet18_cifar100 = MultiStepLR(cnn_optimizer_resnet18_cifar100, milestones=[60, 120, 160], gamma=0.2)

Training ResNet-18 without Cutout

This cell trains the model for the specified number of epochs. Each iteration performs a forward pass, computes the cross-entropy loss, backpropagates, and updates the parameters, while the progress bar shows the running training accuracy and average loss. At the end of each epoch, the model is evaluated on the test set and the test accuracy is printed. After the last epoch, the weights are saved to a checkpoint file and the final test error (in percent) is reported.

In [None]:
epochs = 200
for epoch in range(epochs):

    xentropy_loss_avg = 0.
    correct = 0.
    total = 0.

    progress_bar = tqdm(train_loader_cifar100)
    for i, (images, labels) in enumerate(progress_bar):
        progress_bar.set_description('Epoch ' + str(epoch))

        images = images.cuda()
        labels = labels.cuda()

        resnet18_cifar100.zero_grad()
        pred = resnet18_cifar100(images)

        xentropy_loss = criterion_resnet18_cifar100(pred, labels)
        xentropy_loss.backward()
        cnn_optimizer_resnet18_cifar100.step()

        xentropy_loss_avg += xentropy_loss.item()

        # Calculate running average of accuracy
        pred = torch.max(pred.data, 1)[1]
        total += labels.size(0)
        correct += (pred == labels.data).sum().item()
        accuracy = correct / total

        progress_bar.set_postfix(
            xentropy='%.3f' % (xentropy_loss_avg / (i + 1)),
            acc='%.3f' % accuracy)

    test_accr_resnet18_cifar100 = test(test_loader_cifar100, resnet18_cifar100)
    tqdm.write('test_acc: %.3f' % (test_accr_resnet18_cifar100))

    scheduler_resnet18_cifar100.step()   
    
torch.save(resnet18_cifar100.state_dict(), current_path + 'checkpoints/' + file_name_resnet18_cifar100 + '.pt')


# (1 - accuracy) * 100 gives the test error in percent
final_test_acc_resnet18_cifar100 = (1 - test(test_loader_cifar100, resnet18_cifar100))*100
print('Final test error (%%) for ResNet-18 without Cutout on the CIFAR-100 test set: %.3f' % (final_test_acc_resnet18_cifar100))

### 2.3.2. Training ResNet-18 in CF100 with Cutout

Image Processing for CIFAR-100

In [None]:
# Image Preprocessing

normalize_image_cifar100 = transforms.Normalize(mean=[x / 255.0 for x in [125.3, 123.0, 113.9]], std=[x / 255.0 for x in [63.0, 62.1, 66.7]])

train_transform_cifar100_cutout = transforms.Compose([])

train_transform_cifar100_cutout.transforms.append(transforms.ToTensor())
train_transform_cifar100_cutout.transforms.append(normalize_image_cifar100)

#Add Cutout to the image transformer pipeline
n_holes_cifar100 = 1
length_cifar100 = 8
train_transform_cifar100_cutout.transforms.append(Cutout(n_holes=n_holes_cifar100, length=length_cifar100))


test_transform_cifar100 = transforms.Compose([
    transforms.ToTensor(),
    normalize_image_cifar100])

Import the dataset of CIFAR-100

In [None]:
train_dataset_cifar100_cutout = datasets.CIFAR100(root=current_path + 'data/',
                                     train=True,
                                     transform=train_transform_cifar100_cutout,
                                     download=True)

test_dataset_cifar100 = datasets.CIFAR100(root=current_path + 'data/',
                                    train=False,
                                    transform=test_transform_cifar100,
                                    download=True)

Wrap the datasets in DataLoaders

In [None]:
# Data Loader (Input Pipeline)
batch_size_cifar100_cutout = 128
train_loader_cifar100_cutout = torch.utils.data.DataLoader(dataset=train_dataset_cifar100_cutout,
                                           batch_size=batch_size_cifar100_cutout,
                                           shuffle=True,
                                           pin_memory=True,
                                           num_workers=2)

test_loader_cifar100 = torch.utils.data.DataLoader(dataset=test_dataset_cifar100,
                                          batch_size=batch_size_cifar100_cutout,
                                          shuffle=False,
                                          pin_memory=True,
                                          num_workers=2)

Define the model

This code block sets up the machine learning model, loss function, optimizer, and learning rate scheduler.

In [None]:
# file_name is used to name the saved model weights and the results
file_name_resnet18_cifar100_cutout = "resnet18_cifar100_cutout"

num_classes_cifar100 = 100
resnet18_cifar100_cutout = ResNet18(num_classes=num_classes_cifar100)


resnet18_cifar100_cutout = resnet18_cifar100_cutout.cuda()
learning_rate_resnet18_cifar100_cutout = 0.1
criterion_resnet18_cifar100_cutout = nn.CrossEntropyLoss().cuda()
cnn_optimizer_resnet18_cifar100_cutout = torch.optim.SGD(resnet18_cifar100_cutout.parameters(), lr=learning_rate_resnet18_cifar100_cutout,
                                momentum=0.9, nesterov=True, weight_decay=5e-4)
scheduler_resnet18_cifar100_cutout = MultiStepLR(cnn_optimizer_resnet18_cifar100_cutout, milestones=[60, 120, 160], gamma=0.2)

Training ResNet-18 with Cutout

This cell trains the model for the specified number of epochs. Each iteration performs a forward pass, computes the cross-entropy loss, backpropagates, and updates the parameters, while the progress bar shows the running training accuracy and average loss. At the end of each epoch, the model is evaluated on the test set and the test accuracy is printed. After the last epoch, the weights are saved to a checkpoint file and the final test error (in percent) is reported.

In [None]:
epochs = 200
for epoch in range(epochs):

    xentropy_loss_avg = 0.
    correct = 0.
    total = 0.

    progress_bar = tqdm(train_loader_cifar100_cutout)
    for i, (images, labels) in enumerate(progress_bar):
        progress_bar.set_description('Epoch ' + str(epoch))

        images = images.cuda()
        labels = labels.cuda()

        resnet18_cifar100_cutout.zero_grad()
        pred = resnet18_cifar100_cutout(images)

        xentropy_loss = criterion_resnet18_cifar100_cutout(pred, labels)
        xentropy_loss.backward()
        cnn_optimizer_resnet18_cifar100_cutout.step()

        xentropy_loss_avg += xentropy_loss.item()

        # Calculate running average of accuracy
        pred = torch.max(pred.data, 1)[1]
        total += labels.size(0)
        correct += (pred == labels.data).sum().item()
        accuracy = correct / total

        progress_bar.set_postfix(
            xentropy='%.3f' % (xentropy_loss_avg / (i + 1)),
            acc='%.3f' % accuracy)

    test_acc_cifar100 = test(test_loader_cifar100,resnet18_cifar100_cutout)
    tqdm.write('test_acc: %.3f' % (test_acc_cifar100))
    scheduler_resnet18_cifar100_cutout.step()     
torch.save(resnet18_cifar100_cutout.state_dict(), current_path + 'checkpoints/' + file_name_resnet18_cifar100_cutout + '.pt')


# (1 - accuracy) * 100 gives the test error in percent
final_test_acc_resnet18_cifar100_cutout = (1 - test(test_loader_cifar100, resnet18_cifar100_cutout))*100
print('Final test error (%%) for ResNet-18 using Cutout on the CIFAR-100 test set: %.3f' % (final_test_acc_resnet18_cifar100_cutout))

### 2.3.3. Training ResNet-18 in CF100 with Data Augmentation

Image Processing for CIFAR-100

In [None]:
# Image Preprocessing

normalize_image_cifar100 = transforms.Normalize(mean=[x / 255.0 for x in [125.3, 123.0, 113.9]], std=[x / 255.0 for x in [63.0, 62.1, 66.7]])

train_transform_cifar100_da = transforms.Compose([])
train_transform_cifar100_da.transforms.append(transforms.RandomCrop(32, padding=4))
train_transform_cifar100_da.transforms.append(transforms.RandomHorizontalFlip())
train_transform_cifar100_da.transforms.append(transforms.ToTensor())
train_transform_cifar100_da.transforms.append(normalize_image_cifar100)


test_transform_cifar100 = transforms.Compose([
    transforms.ToTensor(),
    normalize_image_cifar100])

Import the dataset of CIFAR-100

In [None]:
train_dataset_cifar100_da = datasets.CIFAR100(root=current_path + 'data/',
                                     train=True,
                                     transform=train_transform_cifar100_da,
                                     download=True)

test_dataset_cifar100 = datasets.CIFAR100(root=current_path + 'data/',
                                    train=False,
                                    transform=test_transform_cifar100,
                                    download=True)

Wrap the datasets in DataLoaders

In [None]:
# Data Loader (Input Pipeline)
batch_size_cifar100_da = 128
train_loader_cifar100_da = torch.utils.data.DataLoader(dataset=train_dataset_cifar100_da,
                                           batch_size=batch_size_cifar100_da,
                                           shuffle=True,
                                           pin_memory=True,
                                           num_workers=2)

test_loader_cifar100 = torch.utils.data.DataLoader(dataset=test_dataset_cifar100,
                                          batch_size=batch_size_cifar100_da,
                                          shuffle=False,
                                          pin_memory=True,
                                          num_workers=2)

Define the model

This code block sets up the machine learning model, loss function, optimizer, and learning rate scheduler.

In [None]:
# file_name is used to name the saved model weights and the results
file_name_resnet18_cifar100_da = "resnet18_cifar100_da"

num_classes_cifar100 = 100
resnet18_cifar100_da = ResNet18(num_classes=num_classes_cifar100)


resnet18_cifar100_da = resnet18_cifar100_da.cuda()
learning_rate_resnet18_cifar100_da = 0.1
criterion_resnet18_cifar100_da = nn.CrossEntropyLoss().cuda()
cnn_optimizer_resnet18_cifar100_da = torch.optim.SGD(resnet18_cifar100_da.parameters(), lr=learning_rate_resnet18_cifar100_da,
                                momentum=0.9, nesterov=True, weight_decay=5e-4)
scheduler_resnet18_cifar100_da = MultiStepLR(cnn_optimizer_resnet18_cifar100_da, milestones=[60, 120, 160], gamma=0.2)

Training ResNet-18 with Data Augmentation

This cell trains the model for the specified number of epochs. Each iteration performs a forward pass, computes the cross-entropy loss, backpropagates, and updates the parameters, while the progress bar shows the running training accuracy and average loss. At the end of each epoch, the model is evaluated on the test set and the test accuracy is printed. After the last epoch, the weights are saved to a checkpoint file and the final test error (in percent) is reported.

In [None]:
epochs = 200
for epoch in range(epochs):

    xentropy_loss_avg = 0.
    correct = 0.
    total = 0.

    progress_bar = tqdm(train_loader_cifar100_da)
    for i, (images, labels) in enumerate(progress_bar):
        progress_bar.set_description('Epoch ' + str(epoch))

        images = images.cuda()
        labels = labels.cuda()

        resnet18_cifar100_da.zero_grad()
        pred = resnet18_cifar100_da(images)

        xentropy_loss = criterion_resnet18_cifar100_da(pred, labels)
        xentropy_loss.backward()
        cnn_optimizer_resnet18_cifar100_da.step()

        xentropy_loss_avg += xentropy_loss.item()

        # Calculate running average of accuracy
        pred = torch.max(pred.data, 1)[1]
        total += labels.size(0)
        correct += (pred == labels.data).sum().item()
        accuracy = correct / total

        progress_bar.set_postfix(
            xentropy='%.3f' % (xentropy_loss_avg / (i + 1)),
            acc='%.3f' % accuracy)

    test_acc_resnet18_cifar100_da = test(test_loader_cifar100,resnet18_cifar100_da)
    tqdm.write('test_acc: %.3f' % (test_acc_resnet18_cifar100_da))
    scheduler_resnet18_cifar100_da.step()     
torch.save(resnet18_cifar100_da.state_dict(), current_path + 'checkpoints/' + file_name_resnet18_cifar100_da + '.pt')


# (1 - accuracy) * 100 gives the test error in percent
final_test_acc_resnet18_cifar100_da = (1 - test(test_loader_cifar100, resnet18_cifar100_da))*100
print('Final test error (%%) for ResNet-18 using Data Augmentation on the CIFAR-100 test set: %.3f' % (final_test_acc_resnet18_cifar100_da))

### 2.3.4. Training ResNet-18 in CF100 with Data Augmentation and Cutout

Image Processing for CIFAR-100

In [None]:
# Image Preprocessing

normalize_image_cifar100 = transforms.Normalize(mean=[x / 255.0 for x in [125.3, 123.0, 113.9]], std=[x / 255.0 for x in [63.0, 62.1, 66.7]])

train_transform_cifar100_da_co = transforms.Compose([])
train_transform_cifar100_da_co.transforms.append(transforms.RandomCrop(32, padding=4))
train_transform_cifar100_da_co.transforms.append(transforms.RandomHorizontalFlip())
train_transform_cifar100_da_co.transforms.append(transforms.ToTensor())
train_transform_cifar100_da_co.transforms.append(normalize_image_cifar100)

#Add Cutout to the image transformer pipeline
n_holes_cifar100_da_co = 1
length_cifar100_da_co = 8
train_transform_cifar100_da_co.transforms.append(Cutout(n_holes=n_holes_cifar100_da_co, length=length_cifar100_da_co))


test_transform_cifar100 = transforms.Compose([
    transforms.ToTensor(),
    normalize_image_cifar100])

Import the dataset of CIFAR-100

In [None]:
train_dataset_cifar100_da_co = datasets.CIFAR100(root=current_path + 'data/',
                                     train=True,
                                     transform=train_transform_cifar100_da_co,
                                     download=True)

test_dataset_cifar100 = datasets.CIFAR100(root=current_path + 'data/',
                                    train=False,
                                    transform=test_transform_cifar100,
                                    download=True)

Wrap the datasets in DataLoaders

In [None]:
# Data Loader (Input Pipeline)
batch_size_cifar100_da_co = 128
train_loader_cifar100_da_co = torch.utils.data.DataLoader(dataset=train_dataset_cifar100_da_co,
                                           batch_size=batch_size_cifar100_da_co,
                                           shuffle=True,
                                           pin_memory=True,
                                           num_workers=2)

test_loader_cifar100 = torch.utils.data.DataLoader(dataset=test_dataset_cifar100,
                                          batch_size=batch_size_cifar100_da_co,
                                          shuffle=False,
                                          pin_memory=True,
                                          num_workers=2)

Define the model

This code block sets up the machine learning model, loss function, optimizer, and learning rate scheduler.

In [None]:
# file_name is used to name the saved model weights and the results
file_name_resnet18_cifar100_da_cutout = "resnet18_cifar100_da_cutout"

num_classes_cifar100 = 100
resnet18_cifar100_da_cutout = ResNet18(num_classes=num_classes_cifar100)


resnet18_cifar100_da_cutout = resnet18_cifar100_da_cutout.cuda()
learning_rate_cifar100_da_cutout = 0.1
criterion_cifar100_da_cutout = nn.CrossEntropyLoss().cuda()
cnn_optimizer_cifar100_da_cutout = torch.optim.SGD(resnet18_cifar100_da_cutout.parameters(), lr=learning_rate_cifar100_da_cutout,
                                momentum=0.9, nesterov=True, weight_decay=5e-4)
scheduler_cifar100_da_cutout = MultiStepLR(cnn_optimizer_cifar100_da_cutout, milestones=[60, 120, 160], gamma=0.2)

Training ResNet-18 with Data Augmentation and Cutout

This cell trains the model for the specified number of epochs. Each iteration performs a forward pass, computes the cross-entropy loss, backpropagates, and updates the parameters, while the progress bar shows the running training accuracy and average loss. At the end of each epoch, the model is evaluated on the test set and the test accuracy is printed. After the last epoch, the weights are saved to a checkpoint file and the final test error (in percent) is reported.

In [None]:
epochs = 200
for epoch in range(epochs):

    xentropy_loss_avg = 0.
    correct = 0.
    total = 0.

    progress_bar = tqdm(train_loader_cifar100_da_co)
    for i, (images, labels) in enumerate(progress_bar):
        progress_bar.set_description('Epoch ' + str(epoch))

        images = images.cuda()
        labels = labels.cuda()

        resnet18_cifar100_da_cutout.zero_grad()
        pred = resnet18_cifar100_da_cutout(images)

        xentropy_loss = criterion_cifar100_da_cutout(pred, labels)
        xentropy_loss.backward()
        cnn_optimizer_cifar100_da_cutout.step()

        xentropy_loss_avg += xentropy_loss.item()

        # Calculate running average of accuracy
        pred = torch.max(pred.data, 1)[1]
        total += labels.size(0)
        correct += (pred == labels.data).sum().item()
        accuracy = correct / total

        progress_bar.set_postfix(
            xentropy='%.3f' % (xentropy_loss_avg / (i + 1)),
            acc='%.3f' % accuracy)

    test_acc_cifar100_da_cutout = test(test_loader_cifar100,resnet18_cifar100_da_cutout)
    tqdm.write('test_acc: %.3f' % (test_acc_cifar100_da_cutout))
    scheduler_cifar100_da_cutout.step()     
torch.save(resnet18_cifar100_da_cutout.state_dict(),current_path +  'checkpoints/' + file_name_resnet18_cifar100_da_cutout + '.pt')


# (1 - accuracy) * 100 gives the test error in percent
final_test_acc_resnet18_cifar100_da_cutout = (1 - test(test_loader_cifar100, resnet18_cifar100_da_cutout))*100
print('Final test error (%%) for ResNet-18 using Data Augmentation and Cutout on the CIFAR-100 test set: %.3f' % (final_test_acc_resnet18_cifar100_da_cutout))

In [None]:
# Summary of CIFAR-100 results; each value is a test error in percent
print('Final test error (%%) for ResNet-18 without Cutout on the CIFAR-100 test set: %.3f' % (final_test_acc_resnet18_cifar100))
print('Final test error (%%) for ResNet-18 using Cutout on the CIFAR-100 test set: %.3f' % (final_test_acc_resnet18_cifar100_cutout))
print('Final test error (%%) for ResNet-18 using Data Augmentation on the CIFAR-100 test set: %.3f' % (final_test_acc_resnet18_cifar100_da))
print('Final test error (%%) for ResNet-18 using Data Augmentation and Cutout on the CIFAR-100 test set: %.3f' % (final_test_acc_resnet18_cifar100_da_cutout))

# 03. WideResNet

WideResNet model implementation from https://github.com/xternalz/WideResNet-pytorch

Note: for faster training, use Runtime \> Change Runtime Type to run this notebook on a GPU.

In the Cutout paper, the authors claim that:

1.  Cutout improves the robustness and overall performance of convolutional neural networks.
2.  Cutout can be used in conjunction with existing forms of data augmentation and other regularizers to further improve model performance.

In this section, we will evaluate these claims using a WideResNet model. For the WideResNet model, the specific quantitative claims are given in the following table:

Test error (%) with mean/std normalization, averaged over 5 runs. A “+” after the dataset name indicates standard data augmentation (mirror + crop).

| **Network**         | **CIFAR-10** | **CIFAR-10+** | **CIFAR-100** | **CIFAR-100+** | **SVHN** |
|-----------|------------|-------------|------------|-------------|------------|
| WideResNet          | 6.97         | 3.87          | 26.06         | 18.8           | 1.60     |
| WideResNet + cutout | 5.54         | 3.08          | 23.94         | 18.41          | 1.30     |

This table reports the test error of the WideResNet architecture, with and without Cutout, on the CIFAR-10, CIFAR-100, and SVHN datasets. The “+”, as before, indicates the use of standard data augmentation (mirror and crop).

For CIFAR-10, standard augmentation alone reduces the WideResNet test error from 6.97% to 3.87%. Cutout alone reduces it from 6.97% to 5.54%, and combining Cutout with standard augmentation reduces it further to 3.08%.

A comparable trend is seen with the CIFAR-100 dataset. Standard augmentation reduces the WideResNet test error from 26.06% to 18.8%. Cutout alone reduces it to 23.94%, and adding Cutout on top of standard augmentation gives a small further decrease to 18.41%.

Lastly, the SVHN dataset shows the smallest error rates. The table has no “+” column for SVHN: the error is 1.60% without Cutout and 1.30% with Cutout.

According to these reported results, Cutout lowers the test error of the WideResNet model on every dataset tested, both with and without standard augmentation. As with the previous findings, the size of the improvement depends on the dataset: on top of standard augmentation, Cutout lowers the error by 0.79 percentage points on CIFAR-10 but by only 0.39 percentage points on CIFAR-100.
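
To make the size of each improvement concrete, the following sketch computes the absolute and relative error reduction attributed to Cutout for each column of the table above. The numbers are copied from the table; the dictionary and function names are illustrative and are not part of the original notebook.

```python
# Reported WideResNet test errors (%) copied from the table above.
reported_error = {
    "CIFAR-10":   {"baseline": 6.97,  "cutout": 5.54},
    "CIFAR-10+":  {"baseline": 3.87,  "cutout": 3.08},
    "CIFAR-100":  {"baseline": 26.06, "cutout": 23.94},
    "CIFAR-100+": {"baseline": 18.8,  "cutout": 18.41},
    "SVHN":       {"baseline": 1.60,  "cutout": 1.30},
}

def cutout_reduction(baseline, cutout):
    """Return (absolute, relative) reduction in test error due to Cutout."""
    absolute = baseline - cutout      # in percentage points
    relative = absolute / baseline    # as a fraction of the baseline error
    return absolute, relative

for name, err in reported_error.items():
    absolute, relative = cutout_reduction(err["baseline"], err["cutout"])
    print(f"{name:11s} absolute: {absolute:.2f} pts, relative: {100 * relative:.1f}%")
```

Looking at both the absolute and the relative reduction is useful here because the baseline errors differ by an order of magnitude across datasets.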

## Import Library

In [None]:
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable
import torch.backends.cudnn as cudnn
from torch.optim.lr_scheduler import MultiStepLR
from torchvision import datasets, transforms
import numpy as np
import os
from tqdm import tqdm
import math

Check Cuda GPU availability and set seed number

In [None]:
cuda = torch.cuda.is_available()
print(cuda)
cudnn.benchmark = True  # Should make training go faster for large models

seed = 1
torch.manual_seed(seed)
np.random.seed(seed)
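
The cell above seeds PyTorch and NumPy. As a hedged sketch (the helper name `set_seed` and its `deterministic` flag are assumptions, not part of the original notebook), the seeding could be collected in one function that also seeds Python's `random` module and, optionally, asks cuDNN for deterministic kernels. Note that the deterministic option turns off `cudnn.benchmark`, so it trades some speed for repeatability.

```python
import random

import numpy as np
import torch


def set_seed(seed, deterministic=False):
    """Seed the Python, NumPy and PyTorch random number generators."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)
    if deterministic:
        # Prefer repeatable cuDNN kernels over the fastest ones.
        torch.backends.cudnn.benchmark = False
        torch.backends.cudnn.deterministic = True


set_seed(1)
```

Even with fixed seeds, results on a GPU may still differ slightly between runs, which is one reason the paper reports the mean of 5 runs.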

If you are using Google Colab, follow these steps to connect to your Google Drive:

1.  On the left sidebar of the Colab notebook interface, you will see a folder icon with the Google Drive logo. Click on this folder icon to open the file explorer.

2.  If you haven’t connected your Google Drive to Colab yet, it will prompt you to do so. Click the “Mount Drive” button to connect your Google Drive to Colab.

3.  Once your Google Drive is mounted, you can use the file explorer to navigate to the file you want to open. Click on the folders to explore the contents of your Google Drive.

4.  When you find the file you want to open, click the three dots next to its name in the file explorer and choose “Copy path” to copy the full path to your clipboard. Paste the copied path into the ‘current_path’ variable below.

In [None]:
current_path ="./"
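
As an alternative to the manual steps above, the Drive can also be mounted from code. The sketch below is an assumption-laden example: `resolve_current_path` and the folder name `cutout/` are hypothetical, and the function falls back to `"./"` when the notebook is not running on Colab. The returned path ends with a slash because the notebook builds file paths by string concatenation (for example `current_path + 'checkpoints'`).

```python
import os


def resolve_current_path(drive_subdir="cutout/"):
    """Return a Drive-backed working directory on Colab, or './' elsewhere.

    `drive_subdir` is a hypothetical folder name; change it to your own.
    """
    try:
        from google.colab import drive  # only importable inside Colab
    except ImportError:
        return "./"
    drive.mount("/content/drive")
    path = os.path.join("/content/drive/MyDrive", drive_subdir)
    os.makedirs(path, exist_ok=True)
    return path if path.endswith("/") else path + "/"


current_path = resolve_current_path()
print(current_path)
```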

This code block is used for creating a directory named ‘checkpoints’. This directory will be used to store the weights of our models, which are crucial for both preserving our progress during model training and for future use of the trained models.

Creating such a directory and regularly saving model weights is a good practice in machine learning, as it ensures that you can resume your work from where you left off, should the training process be interrupted.

In [None]:
# Create a directory named 'checkpoints' to save the model weights
if not os.path.exists(current_path + 'checkpoints'):
    os.makedirs(current_path + 'checkpoints')
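
The notebook saves only the model weights, once, after the last epoch. To actually resume an interrupted run, the optimizer state and the epoch counter are needed as well. The following is a minimal sketch of that idea, not the notebook's own procedure: `save_checkpoint` and `load_checkpoint` are hypothetical helpers, and a tiny linear model stands in for the real network so that the example runs anywhere.

```python
import os
import tempfile

import torch
import torch.nn as nn


def save_checkpoint(path, model, optimizer, epoch):
    """Store everything needed to continue training after `epoch`."""
    torch.save({"epoch": epoch,
                "model": model.state_dict(),
                "optimizer": optimizer.state_dict()}, path)


def load_checkpoint(path, model, optimizer):
    """Restore model and optimizer in place; return the next epoch index."""
    state = torch.load(path, map_location="cpu")
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optimizer"])
    return state["epoch"] + 1


# Stand-in model and optimizer, used only to exercise the helpers.
model = nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
ckpt = os.path.join(tempfile.mkdtemp(), "demo.pt")
save_checkpoint(ckpt, model, optimizer, epoch=5)
start_epoch = load_checkpoint(ckpt, model, optimizer)
print(start_epoch)
```

When a learning rate scheduler is used, as in this notebook, its state would also have to be restored (or the scheduler stepped forward to `start_epoch`) so that training continues with the right learning rate.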

## 3.1 Implementation Code

### 3.1.1 WideResNet Code

In [None]:
# WideResNet

# From https://github.com/uoguelph-mlrg/Cutout/blob/master/model/wide_resnet.py


class BasicBlockWide(nn.Module):
    def __init__(self, in_planes, out_planes, stride, dropRate=0.0):
        super(BasicBlockWide, self).__init__()
        self.bn1 = nn.BatchNorm2d(in_planes)
        self.relu1 = nn.ReLU(inplace=True)
        self.conv1 = nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                               padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_planes)
        self.relu2 = nn.ReLU(inplace=True)
        self.conv2 = nn.Conv2d(out_planes, out_planes, kernel_size=3, stride=1,
                               padding=1, bias=False)
        self.droprate = dropRate
        self.equalInOut = (in_planes == out_planes)
        self.convShortcut = (not self.equalInOut) and nn.Conv2d(in_planes, out_planes, kernel_size=1, stride=stride,
                               padding=0, bias=False) or None
    def forward(self, x):
        if not self.equalInOut:
            x = self.relu1(self.bn1(x))
        else:
            out = self.relu1(self.bn1(x))
        out = self.relu2(self.bn2(self.conv1(out if self.equalInOut else x)))
        if self.droprate > 0:
            out = F.dropout(out, p=self.droprate, training=self.training)
        out = self.conv2(out)
        return torch.add(x if self.equalInOut else self.convShortcut(x), out)

class NetworkBlock(nn.Module):
    def __init__(self, nb_layers, in_planes, out_planes, block, stride, dropRate=0.0):
        super(NetworkBlock, self).__init__()
        self.layer = self._make_layer(block, in_planes, out_planes, nb_layers, stride, dropRate)
    def _make_layer(self, block, in_planes, out_planes, nb_layers, stride, dropRate):
        layers = []
        for i in range(nb_layers):
            layers.append(block(i == 0 and in_planes or out_planes, out_planes, i == 0 and stride or 1, dropRate))
        return nn.Sequential(*layers)
    def forward(self, x):
        return self.layer(x)

class WideResNet(nn.Module):
    def __init__(self, depth, num_classes, widen_factor=1, dropRate=0.0):
        super(WideResNet, self).__init__()
        nChannels = [16, 16*widen_factor, 32*widen_factor, 64*widen_factor]
        assert((depth - 4) % 6 == 0)
        n = (depth - 4) // 6
        block = BasicBlockWide
        # 1st conv before any network block
        self.conv1 = nn.Conv2d(3, nChannels[0], kernel_size=3, stride=1,
                               padding=1, bias=False)
        # 1st block
        self.block1 = NetworkBlock(n, nChannels[0], nChannels[1], block, 1, dropRate)
        # 2nd block
        self.block2 = NetworkBlock(n, nChannels[1], nChannels[2], block, 2, dropRate)
        # 3rd block
        self.block3 = NetworkBlock(n, nChannels[2], nChannels[3], block, 2, dropRate)
        # global average pooling and classifier
        self.bn1 = nn.BatchNorm2d(nChannels[3])
        self.relu = nn.ReLU(inplace=True)
        self.fc = nn.Linear(nChannels[3], num_classes)
        self.nChannels = nChannels[3]

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                m.weight.data.normal_(0, math.sqrt(2. / n))
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()
            elif isinstance(m, nn.Linear):
                m.bias.data.zero_()
    def forward(self, x):
        out = self.conv1(x)
        out = self.block1(out)
        out = self.block2(out)
        out = self.block3(out)
        out = self.relu(self.bn1(out))

        out = F.avg_pool2d(out, 8)
        out = out.view(-1, self.nChannels)
        out = self.fc(out)
        return out
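
The constructor above derives the network layout from `depth` and `widen_factor`. The short sketch below repeats that arithmetic outside the class to show what the configuration used later in this notebook (depth 28, widening factor 10) produces. The function name `wideresnet_layout` is illustrative, and the feature-map sizes assume 32x32 inputs such as CIFAR images.

```python
def wideresnet_layout(depth, widen_factor, input_size=32):
    """Blocks per group, channel widths, and feature-map size after each group."""
    assert (depth - 4) % 6 == 0, "depth must have the form 6n + 4"
    n = (depth - 4) // 6
    channels = [16, 16 * widen_factor, 32 * widen_factor, 64 * widen_factor]
    sizes, size = [], input_size
    for stride in (1, 2, 2):  # strides of block1, block2, block3
        size //= stride       # 3x3 conv with padding 1 halves an even size at stride 2
        sizes.append(size)
    return n, channels, sizes


n, channels, sizes = wideresnet_layout(depth=28, widen_factor=10)
print(n, channels, sizes)  # 4 [16, 160, 320, 640] [32, 16, 8]
```

The final 8x8 feature map is why `forward` can use `F.avg_pool2d(out, 8)` as a global average pool for 32x32 inputs.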

### 3.1.2. Model Evaluate Test Code

This function evaluates the performance of the model on a given data loader (loader). It sets the model to evaluation mode (eval), computes the fraction of correctly classified examples, switches the model back to training mode (train), and returns the accuracy as a value between 0 and 1.

In [None]:
def test(loader, cnn):
    cnn.eval()    # Change model to 'eval' mode (BN uses moving mean/var).
    correct = 0.
    total = 0.
    for images, labels in loader:
        images = images.cuda()
        labels = labels.cuda()

        with torch.no_grad():
            pred = cnn(images)

        pred = torch.max(pred.data, 1)[1]
        total += labels.size(0)
        correct += (pred == labels).sum().item()

    val_acc = correct / total
    cnn.train()
    return val_acc
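
The function above calls `.cuda()` directly, so it requires a GPU. A device-agnostic variant could look like the sketch below. The name `evaluate` and the `device` argument are assumptions for illustration; the rest of the notebook continues to use `test`.

```python
import torch
import torch.nn as nn


def evaluate(loader, model, device=None):
    """Return top-1 accuracy in [0, 1]; restores the model's previous mode."""
    if device is None:
        device = next(model.parameters()).device
    was_training = model.training
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            pred = model(images).argmax(dim=1)
            total += labels.size(0)
            correct += (pred == labels).sum().item()
    model.train(was_training)
    return correct / total


# Tiny hand-built classifier: class 0 if the first feature is larger, else class 1.
model = nn.Linear(3, 2)
with torch.no_grad():
    model.weight.copy_(torch.tensor([[1., 0., 0.], [0., 1., 0.]]))
    model.bias.zero_()
batch = (torch.tensor([[2., 0., 0.], [0., 2., 0.]]), torch.tensor([0, 1]))
print(evaluate([batch], model))  # 1.0
```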

## 3.2 Training WideResNet in CIFAR-10

### 3.2.1. Training WideResNet in CIFAR-10 without Cutout

Image Processing for CIFAR-10

In [None]:
# Image Preprocessing

normalize_image_cifar10 = transforms.Normalize(mean=[x / 255.0 for x in [125.3, 123.0, 113.9]], std=[x / 255.0 for x in [63.0, 62.1, 66.7]])

train_transform_cifar10 = transforms.Compose([])

train_transform_cifar10.transforms.append(transforms.ToTensor())
train_transform_cifar10.transforms.append(normalize_image_cifar10)



test_transform_cifar10 = transforms.Compose([
    transforms.ToTensor(),
    normalize_image_cifar10])
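
For reference, `transforms.Normalize` subtracts the per-channel mean and divides by the per-channel standard deviation. The constants in the cell above are given on the 0-255 scale and divided by 255 because `ToTensor` scales pixel values to [0, 1]. The sketch below reproduces the same arithmetic with plain tensor operations; the `normalize` function is illustrative and is not used by the notebook.

```python
import torch

# Same constants as in the cell above, rescaled to the [0, 1] pixel range.
mean = torch.tensor([125.3, 123.0, 113.9]) / 255.0
std = torch.tensor([63.0, 62.1, 66.7]) / 255.0


def normalize(img):
    """Per-channel (x - mean) / std for a (C, H, W) tensor with values in [0, 1]."""
    return (img - mean[:, None, None]) / std[:, None, None]


# An image filled with the channel means maps to all zeros.
mean_img = mean[:, None, None].expand(3, 32, 32)
out = normalize(mean_img)
print(out.abs().max().item())
```

One consequence is that a value of zero in a normalized image corresponds to the mean pixel value of the dataset, which matters later when Cutout writes zeros into an already normalized image.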

Import the dataset of CIFAR-10

In [None]:
train_dataset_cifar10 = datasets.CIFAR10(root=current_path + 'data/',
                                     train=True,
                                     transform=train_transform_cifar10,
                                     download=True)

test_dataset_cifar10 = datasets.CIFAR10(root=current_path + 'data/',
                                    train=False,
                                    transform=test_transform_cifar10,
                                    download=True)

Create Dataset as Dataloader

In [None]:
# Data Loader (Input Pipeline)
batch_size_cifar10 = 128
train_loader_cifar10 = torch.utils.data.DataLoader(dataset=train_dataset_cifar10,
                                           batch_size=batch_size_cifar10,
                                           shuffle=True,
                                           pin_memory=True,
                                           num_workers=2)

test_loader_cifar10 = torch.utils.data.DataLoader(dataset=test_dataset_cifar10,
                                          batch_size=batch_size_cifar10,
                                          shuffle=False,
                                          pin_memory=True,
                                          num_workers=2)

Define the model

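This code block sets up the model, loss function, optimizer, and learning rate scheduler: a WideResNet with depth 28, widening factor 10, and dropout rate 0.3; a cross-entropy loss; SGD with Nesterov momentum 0.9, weight decay 5e-4, and an initial learning rate of 0.1; and a schedule that multiplies the learning rate by 0.2 at epochs 60, 120, and 160.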

In [None]:
# file_name is used to name the file that stores the model weights
file_name_wideresnet_cifar10 = "wideresnet_cifar10"

num_classes_cifar10 = 10
wideresnet_cifar10 = WideResNet(depth=28, num_classes=num_classes_cifar10, widen_factor=10, dropRate=0.3)


wideresnet_cifar10 = wideresnet_cifar10.cuda()
learning_rate_wideresnet_cifar10 = 0.1
criterion_wideresnet_cifar10 = nn.CrossEntropyLoss().cuda()
cnn_optimizer_wideresnet_cifar10 = torch.optim.SGD(wideresnet_cifar10.parameters(), lr=learning_rate_wideresnet_cifar10,
                                momentum=0.9, nesterov=True, weight_decay=5e-4)
scheduler_wideresnet_cifar10 = MultiStepLR(cnn_optimizer_wideresnet_cifar10, milestones=[60, 120, 160], gamma=0.2)
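
To see which learning rate is in effect during each epoch, the step schedule can be reproduced with a few lines of plain Python. This is a sketch of the schedule configured above (base rate 0.1, factor 0.2 at epochs 60, 120, and 160), assuming `scheduler.step()` is called once at the end of every epoch as in the training loop below. The function name `lr_at_epoch` is illustrative.

```python
from bisect import bisect_right


def lr_at_epoch(epoch, base_lr=0.1, milestones=(60, 120, 160), gamma=0.2):
    """Learning rate used during 0-based `epoch` under a multi-step schedule."""
    # The rate is multiplied by gamma once for every milestone already reached.
    return base_lr * gamma ** bisect_right(milestones, epoch)


for epoch in (0, 59, 60, 119, 120, 159, 160, 199):
    print(f"epoch {epoch:3d}: lr = {lr_at_epoch(epoch):.4f}")
```

With 200 epochs, this gives four phases: epochs 0-59 at 0.1, epochs 60-119 at 0.02, epochs 120-159 at 0.004, and epochs 160-199 at 0.0008.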

Training WideResNet without Cutout

This code runs the training loop for the model defined above for 200 epochs. Each training step involves a forward pass, loss computation, backpropagation, and a parameter update, while the progress bar displays the running training accuracy and average cross-entropy loss. At the end of each epoch, the model is evaluated on the test set and the test accuracy is printed. After the final epoch, the model weights are saved to the ‘checkpoints’ directory and the final test error (in percent) is reported.

In [None]:
epochs = 200
for epoch in range(epochs):

    xentropy_loss_avg = 0.
    correct = 0.
    total = 0.

    progress_bar = tqdm(train_loader_cifar10)
    for i, (images, labels) in enumerate(progress_bar):
        progress_bar.set_description('Epoch ' + str(epoch))

        images = images.cuda()
        labels = labels.cuda()

        wideresnet_cifar10.zero_grad()
        pred = wideresnet_cifar10(images)

        xentropy_loss = criterion_wideresnet_cifar10(pred, labels)
        xentropy_loss.backward()
        cnn_optimizer_wideresnet_cifar10.step()

        xentropy_loss_avg += xentropy_loss.item()

        # Calculate running average of accuracy
        pred = torch.max(pred.data, 1)[1]
        total += labels.size(0)
        correct += (pred == labels.data).sum().item()
        accuracy = correct / total

        progress_bar.set_postfix(
            xentropy='%.3f' % (xentropy_loss_avg / (i + 1)),
            acc='%.3f' % accuracy)

    test_accr_wideresnet_cifar10 = test(test_loader_cifar10, wideresnet_cifar10)
    tqdm.write('test_acc: %.3f' % (test_accr_wideresnet_cifar10))

    scheduler_wideresnet_cifar10.step()    

    
torch.save(wideresnet_cifar10.state_dict(), current_path + 'checkpoints/' + file_name_wideresnet_cifar10 + '.pt')


# test() returns accuracy in [0, 1]; convert it to test error (%) to match the paper's table
final_test_acc_wideresnet_cifar10 = (1 - test(test_loader_cifar10, wideresnet_cifar10))*100
print('Final Result WideResNet without Cutout for Test CIFAR-10 Dataset: %.3f' % (final_test_acc_wideresnet_cifar10))
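
The same training loop is repeated for every experiment in this notebook, with only the model, loader, and optimizer names changed. As a hedged sketch of how the inner loop could be factored into one reusable function (the name `train_one_epoch` is hypothetical, and a tiny linear model with random data stands in for WideResNet and CIFAR-10 so that the example runs quickly):

```python
import torch
import torch.nn as nn


def train_one_epoch(model, loader, criterion, optimizer, device):
    """Run one pass over `loader`; return (mean loss per example, accuracy)."""
    model.train()
    loss_sum, correct, total = 0.0, 0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        logits = model(images)
        loss = criterion(logits, labels)
        loss.backward()
        optimizer.step()
        loss_sum += loss.item() * labels.size(0)
        correct += (logits.argmax(dim=1) == labels).sum().item()
        total += labels.size(0)
    return loss_sum / total, correct / total


# Smoke test on random data (stand-ins for the real model and dataset).
torch.manual_seed(0)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(8, 3).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9,
                            nesterov=True, weight_decay=5e-4)
criterion = nn.CrossEntropyLoss()
loader = [(torch.randn(16, 8), torch.randint(0, 3, (16,))) for _ in range(4)]
loss, acc = train_one_epoch(model, loader, criterion, optimizer, device)
print(f"loss={loss:.3f} acc={acc:.3f}")
```

Unlike the loop above, this sketch weights each batch loss by the batch size, so the reported mean is per example rather than per batch. The two differ only when the last batch is smaller than the others.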

### 3.2.2. Training WideResNet in CIFAR-10 with Cutout

Cutout Code

In [None]:
# Source: https://github.com/uoguelph-mlrg/Cutout/blob/master/util/cutout.py
class Cutout(object):
    """Randomly mask out one or more patches from an image.

    Args:
        n_holes (int): Number of patches to cut out of each image.
        length (int): The length (in pixels) of each square patch.
    """
    def __init__(self, n_holes, length):
        self.n_holes = n_holes
        self.length = length

    def __call__(self, img):
        """
        Args:
            img (Tensor): Tensor image of size (C, H, W).
        Returns:
            Tensor: Image with n_holes of dimension length x length cut out of it.
        """
        h = img.size(1)
        w = img.size(2)

        mask = np.ones((h, w), np.float32)

        for n in range(self.n_holes):
            y = np.random.randint(h)
            x = np.random.randint(w)

            y1 = np.clip(y - self.length // 2, 0, h)
            y2 = np.clip(y + self.length // 2, 0, h)
            x1 = np.clip(x - self.length // 2, 0, w)
            x2 = np.clip(x + self.length // 2, 0, w)

            mask[y1: y2, x1: x2] = 0.

        mask = torch.from_numpy(mask)
        mask = mask.expand_as(img)
        img = img * mask

        return img
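
Because the patch centre is sampled anywhere in the image and the patch is then clipped to the image bounds, the number of pixels actually masked depends on where the centre falls. The following sketch repeats the mask computation from the class above for a fixed centre, using NumPy only. The function name `cutout_mask` is illustrative and is not part of the original implementation.

```python
import numpy as np


def cutout_mask(h, w, length, y, x):
    """Binary mask with a `length` x `length` hole centred at (y, x), clipped to the image."""
    mask = np.ones((h, w), np.float32)
    y1, y2 = np.clip([y - length // 2, y + length // 2], 0, h)
    x1, x2 = np.clip([x - length // 2, x + length // 2], 0, w)
    mask[y1:y2, x1:x2] = 0.
    return mask


for centre in [(16, 16), (0, 0), (31, 31)]:
    mask = cutout_mask(32, 32, 16, *centre)
    print(centre, int((mask == 0).sum()), "pixels masked")
```

For a 32x32 image and `length=16`, a centre in the middle of the image masks the full 16x16 = 256 pixels, while a centre in the top-left corner masks only 8x8 = 64 pixels.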

Image Processing for CIFAR-10

In [None]:
# Image Preprocessing

normalize_image_cifar10 = transforms.Normalize(mean=[x / 255.0 for x in [125.3, 123.0, 113.9]], std=[x / 255.0 for x in [63.0, 62.1, 66.7]])

train_transform_cifar10_cutout = transforms.Compose([])

train_transform_cifar10_cutout.transforms.append(transforms.ToTensor())
train_transform_cifar10_cutout.transforms.append(normalize_image_cifar10)

# Add Cutout to the transform pipeline (after ToTensor and Normalize, because Cutout expects a tensor)
n_holes_cifar10 = 1
length_cifar10 = 16
train_transform_cifar10_cutout.transforms.append(Cutout(n_holes=n_holes_cifar10, length=length_cifar10))


test_transform_cifar10 = transforms.Compose([
    transforms.ToTensor(),
    normalize_image_cifar10])

Import the dataset of CIFAR-10

In [None]:
train_dataset_cifar10_cutout = datasets.CIFAR10(root=current_path + 'data/',
                                     train=True,
                                     transform=train_transform_cifar10_cutout,
                                     download=True)

test_dataset_cifar10 = datasets.CIFAR10(root=current_path + 'data/',
                                    train=False,
                                    transform=test_transform_cifar10,
                                    download=True)

Create Dataset as Dataloader

In [None]:
# Data Loader (Input Pipeline)
batch_size_cifar10_cutout = 128
train_loader_cifar10_cutout = torch.utils.data.DataLoader(dataset=train_dataset_cifar10_cutout,
                                           batch_size=batch_size_cifar10_cutout,
                                           shuffle=True,
                                           pin_memory=True,
                                           num_workers=2)

test_loader_cifar10 = torch.utils.data.DataLoader(dataset=test_dataset_cifar10,
                                          batch_size=batch_size_cifar10_cutout,
                                          shuffle=False,
                                          pin_memory=True,
                                          num_workers=2)

Define the model

This code block sets up the machine learning model, loss function, optimizer, and learning rate scheduler.

In [None]:
# file_name is used to name the file that stores the model weights
file_name_wideresnet_cifar10_cutout = "wideresnet_cifar10_cutout"

num_classes_cifar10 = 10
wideresnet_cifar10_cutout = WideResNet(depth=28, num_classes=num_classes_cifar10, widen_factor=10, dropRate=0.3)


wideresnet_cifar10_cutout = wideresnet_cifar10_cutout.cuda()
learning_rate_wideresnet_cifar10_cutout = 0.1
criterion_wideresnet_cifar10_cutout = nn.CrossEntropyLoss().cuda()
cnn_optimizer_wideresnet_cifar10_cutout = torch.optim.SGD(wideresnet_cifar10_cutout.parameters(), lr=learning_rate_wideresnet_cifar10_cutout,
                                momentum=0.9, nesterov=True, weight_decay=5e-4)
scheduler_wideresnet_cifar10_cutout = MultiStepLR(cnn_optimizer_wideresnet_cifar10_cutout, milestones=[60, 120, 160], gamma=0.2)

Training WideResNet with Cutout

This code runs the training loop for the model defined above for 200 epochs. Each training step involves a forward pass, loss computation, backpropagation, and a parameter update, while the progress bar displays the running training accuracy and average cross-entropy loss. At the end of each epoch, the model is evaluated on the test set and the test accuracy is printed. After the final epoch, the model weights are saved to the ‘checkpoints’ directory and the final test error (in percent) is reported.

In [None]:
epochs = 200
for epoch in range(epochs):

    xentropy_loss_avg = 0.
    correct = 0.
    total = 0.

    progress_bar = tqdm(train_loader_cifar10_cutout)
    for i, (images, labels) in enumerate(progress_bar):
        progress_bar.set_description('Epoch ' + str(epoch))

        images = images.cuda()
        labels = labels.cuda()

        wideresnet_cifar10_cutout.zero_grad()
        pred = wideresnet_cifar10_cutout(images)

        xentropy_loss = criterion_wideresnet_cifar10_cutout(pred, labels)
        xentropy_loss.backward()
        cnn_optimizer_wideresnet_cifar10_cutout.step()

        xentropy_loss_avg += xentropy_loss.item()

        # Calculate running average of accuracy
        pred = torch.max(pred.data, 1)[1]
        total += labels.size(0)
        correct += (pred == labels.data).sum().item()
        accuracy = correct / total

        progress_bar.set_postfix(
            xentropy='%.3f' % (xentropy_loss_avg / (i + 1)),
            acc='%.3f' % accuracy)

    test_acc_cifar10 = test(test_loader_cifar10,wideresnet_cifar10_cutout)
    tqdm.write('test_acc: %.3f' % (test_acc_cifar10))
    scheduler_wideresnet_cifar10_cutout.step()     
torch.save(wideresnet_cifar10_cutout.state_dict(), current_path + 'checkpoints/' + file_name_wideresnet_cifar10_cutout + '.pt')


# test() returns accuracy in [0, 1]; convert it to test error (%) to match the paper's table
final_test_acc_wideresnet_cifar10_cutout = (1 - test(test_loader_cifar10,wideresnet_cifar10_cutout))*100
print('Final Result WideResNet using Cutout for CIFAR-10 Test Dataset: %.3f' % (final_test_acc_wideresnet_cifar10_cutout))

### 3.2.3. Training WideResNet in CIFAR-10 with Data Augmentation

Image Processing for CIFAR-10

In [None]:
# Image Preprocessing

normalize_image_cifar10 = transforms.Normalize(mean=[x / 255.0 for x in [125.3, 123.0, 113.9]], std=[x / 255.0 for x in [63.0, 62.1, 66.7]])

train_transform_cifar10_da = transforms.Compose([])
train_transform_cifar10_da.transforms.append(transforms.RandomCrop(32, padding=4))
train_transform_cifar10_da.transforms.append(transforms.RandomHorizontalFlip())
train_transform_cifar10_da.transforms.append(transforms.ToTensor())
train_transform_cifar10_da.transforms.append(normalize_image_cifar10)


test_transform_cifar10 = transforms.Compose([
    transforms.ToTensor(),
    normalize_image_cifar10])

Import the dataset of CIFAR-10

In [None]:
train_dataset_cifar10_da = datasets.CIFAR10(root=current_path + 'data/',
                                     train=True,
                                     transform=train_transform_cifar10_da,
                                     download=True)

test_dataset_cifar10 = datasets.CIFAR10(root=current_path + 'data/',
                                    train=False,
                                    transform=test_transform_cifar10,
                                    download=True)

Create Dataset as Dataloader

In [None]:
# Data Loader (Input Pipeline)
batch_size_cifar10_da = 128
train_loader_cifar10_da = torch.utils.data.DataLoader(dataset=train_dataset_cifar10_da,
                                           batch_size=batch_size_cifar10_da,
                                           shuffle=True,
                                           pin_memory=True,
                                           num_workers=2)

test_loader_cifar10 = torch.utils.data.DataLoader(dataset=test_dataset_cifar10,
                                          batch_size=batch_size_cifar10_da,
                                          shuffle=False,
                                          pin_memory=True,
                                          num_workers=2)

Define the model

This code block sets up the machine learning model, loss function, optimizer, and learning rate scheduler.

In [None]:
# file_name is used to name the file that stores the model weights
file_name_wideresnet_cifar10_da = "wideresnet_cifar10_da"

num_classes_cifar10 = 10
wideresnet_cifar10_da = WideResNet(depth=28, num_classes=num_classes_cifar10, widen_factor=10, dropRate=0.3)


wideresnet_cifar10_da = wideresnet_cifar10_da.cuda()
learning_rate_wideresnet_cifar10_da = 0.1
criterion_wideresnet_cifar10_da = nn.CrossEntropyLoss().cuda()
cnn_optimizer_wideresnet_cifar10_da = torch.optim.SGD(wideresnet_cifar10_da.parameters(), lr=learning_rate_wideresnet_cifar10_da,
                                momentum=0.9, nesterov=True, weight_decay=5e-4)
scheduler_wideresnet_cifar10_da = MultiStepLR(cnn_optimizer_wideresnet_cifar10_da, milestones=[60, 120, 160], gamma=0.2)

Training WideResNet with Data Augmentation

This code runs the training loop for the model defined above for 200 epochs. Each training step involves a forward pass, loss computation, backpropagation, and a parameter update, while the progress bar displays the running training accuracy and average cross-entropy loss. At the end of each epoch, the model is evaluated on the test set and the test accuracy is printed. After the final epoch, the model weights are saved to the ‘checkpoints’ directory and the final test error (in percent) is reported.

In [None]:
epochs = 200
for epoch in range(epochs):

    xentropy_loss_avg = 0.
    correct = 0.
    total = 0.

    progress_bar = tqdm(train_loader_cifar10_da)
    for i, (images, labels) in enumerate(progress_bar):
        progress_bar.set_description('Epoch ' + str(epoch))

        images = images.cuda()
        labels = labels.cuda()

        wideresnet_cifar10_da.zero_grad()
        pred = wideresnet_cifar10_da(images)

        xentropy_loss = criterion_wideresnet_cifar10_da(pred, labels)
        xentropy_loss.backward()
        cnn_optimizer_wideresnet_cifar10_da.step()

        xentropy_loss_avg += xentropy_loss.item()

        # Calculate running average of accuracy
        pred = torch.max(pred.data, 1)[1]
        total += labels.size(0)
        correct += (pred == labels.data).sum().item()
        accuracy = correct / total

        progress_bar.set_postfix(
            xentropy='%.3f' % (xentropy_loss_avg / (i + 1)),
            acc='%.3f' % accuracy)

    test_acc_wideresnet_cifar10_da = test(test_loader_cifar10,wideresnet_cifar10_da)
    tqdm.write('test_acc: %.3f' % (test_acc_wideresnet_cifar10_da))
    scheduler_wideresnet_cifar10_da.step()     
torch.save(wideresnet_cifar10_da.state_dict(), current_path + 'checkpoints/' + file_name_wideresnet_cifar10_da + '.pt')


# test() returns accuracy in [0, 1]; convert it to test error (%) to match the paper's table
final_test_acc_wideresnet_cifar10_da = (1 - test(test_loader_cifar10,wideresnet_cifar10_da))*100
print('Final Result WideResNet using Data Augmentation for CIFAR-10 Test Dataset: %.3f' % (final_test_acc_wideresnet_cifar10_da))

### 3.2.4. Training WideResNet in CIFAR-10 with Data Augmentation and Cutout

Image Processing for CIFAR-10

In [None]:
# Image Preprocessing

normalize_image_cifar10 = transforms.Normalize(mean=[x / 255.0 for x in [125.3, 123.0, 113.9]], std=[x / 255.0 for x in [63.0, 62.1, 66.7]])

train_transform_cifar10_da_co = transforms.Compose([])
train_transform_cifar10_da_co.transforms.append(transforms.RandomCrop(32, padding=4))
train_transform_cifar10_da_co.transforms.append(transforms.RandomHorizontalFlip())
train_transform_cifar10_da_co.transforms.append(transforms.ToTensor())
train_transform_cifar10_da_co.transforms.append(normalize_image_cifar10)

# Add Cutout to the transform pipeline (after ToTensor and Normalize, because Cutout expects a tensor)
n_holes_cifar10_da_co = 1
length_cifar10_da_co = 16
train_transform_cifar10_da_co.transforms.append(Cutout(n_holes=n_holes_cifar10_da_co, length=length_cifar10_da_co))


test_transform_cifar10 = transforms.Compose([
    transforms.ToTensor(),
    normalize_image_cifar10])

Import the dataset of CIFAR-10

In [None]:
train_dataset_cifar10_da_co = datasets.CIFAR10(root=current_path + 'data/',
                                     train=True,
                                     transform=train_transform_cifar10_da_co,
                                     download=True)

test_dataset_cifar10 = datasets.CIFAR10(root=current_path + 'data/',
                                    train=False,
                                    transform=test_transform_cifar10,
                                    download=True)

Create Dataset as Dataloader

In [None]:
# Data Loader (Input Pipeline)
batch_size_cifar10_da_co = 128
train_loader_cifar10_da_co = torch.utils.data.DataLoader(dataset=train_dataset_cifar10_da_co,
                                           batch_size=batch_size_cifar10_da_co,
                                           shuffle=True,
                                           pin_memory=True,
                                           num_workers=2)

test_loader_cifar10 = torch.utils.data.DataLoader(dataset=test_dataset_cifar10,
                                          batch_size=batch_size_cifar10_da_co,
                                          shuffle=False,
                                          pin_memory=True,
                                          num_workers=2)

Define the model

This code block sets up the machine learning model, loss function, optimizer, and learning rate scheduler.

In [None]:
# file_name is used to name the file that stores the model weights
file_name_wideresnet_cifar10_da_cutout = "wideresnet_cifar10_da_cutout"

num_classes_cifar10 = 10
wideresnet_cifar10_da_cutout = WideResNet(depth=28, num_classes=num_classes_cifar10, widen_factor=10, dropRate=0.3)


wideresnet_cifar10_da_cutout = wideresnet_cifar10_da_cutout.cuda()
learning_rate_cifar10_da_cutout = 0.1
criterion_cifar10_da_cutout = nn.CrossEntropyLoss().cuda()
cnn_optimizer_cifar10_da_cutout = torch.optim.SGD(wideresnet_cifar10_da_cutout.parameters(), lr=learning_rate_cifar10_da_cutout,
                                momentum=0.9, nesterov=True, weight_decay=5e-4)
scheduler_cifar10_da_cutout = MultiStepLR(cnn_optimizer_cifar10_da_cutout, milestones=[60, 120, 160], gamma=0.2)

Training WideResNet with Data Augmentation and Cutout

This code runs the training loop for the model defined above for 200 epochs. Each training step involves a forward pass, loss computation, backpropagation, and a parameter update, while the progress bar displays the running training accuracy and average cross-entropy loss. At the end of each epoch, the model is evaluated on the test set and the test accuracy is printed. After the final epoch, the model weights are saved to the ‘checkpoints’ directory and the final test error (in percent) is reported.

In [None]:
epochs = 200
for epoch in range(epochs):

    xentropy_loss_avg = 0.
    correct = 0.
    total = 0.

    progress_bar = tqdm(train_loader_cifar10_da_co)
    for i, (images, labels) in enumerate(progress_bar):
        progress_bar.set_description('Epoch ' + str(epoch))

        images = images.cuda()
        labels = labels.cuda()

        wideresnet_cifar10_da_cutout.zero_grad()
        pred = wideresnet_cifar10_da_cutout(images)

        xentropy_loss = criterion_cifar10_da_cutout(pred, labels)
        xentropy_loss.backward()
        cnn_optimizer_cifar10_da_cutout.step()

        xentropy_loss_avg += xentropy_loss.item()

        # Calculate running average of accuracy
        pred = torch.max(pred.data, 1)[1]
        total += labels.size(0)
        correct += (pred == labels.data).sum().item()
        accuracy = correct / total

        progress_bar.set_postfix(
            xentropy='%.3f' % (xentropy_loss_avg / (i + 1)),
            acc='%.3f' % accuracy)

    test_acc_cifar10_da_cutout = test(test_loader_cifar10,wideresnet_cifar10_da_cutout)
    tqdm.write('test_acc: %.3f' % (test_acc_cifar10_da_cutout))
    scheduler_cifar10_da_cutout.step()     
torch.save(wideresnet_cifar10_da_cutout.state_dict(), current_path + 'checkpoints/' + file_name_wideresnet_cifar10_da_cutout + '.pt')


# test() returns accuracy in [0, 1]; convert it to test error (%) to match the paper's table
final_test_acc_wideresnet_cifar10_da_cutout = (1 - test(test_loader_cifar10,wideresnet_cifar10_da_cutout))*100
print('Final Result WideResNet using Data Augmentation and  Cutout for CIFAR-10 Test Dataset: %.3f' % (final_test_acc_wideresnet_cifar10_da_cutout))

In [None]:
# Summary of the final test errors (%) for the four CIFAR-10 WideResNet runs
print('Final Result WideResNet without Cutout for Test CIFAR-10 Dataset: %.3f' % (final_test_acc_wideresnet_cifar10))
print('Final Result WideResNet using Cutout for CIFAR-10 Test Dataset: %.3f' % (final_test_acc_wideresnet_cifar10_cutout))
print('Final Result WideResNet using Data Augmentation for CIFAR-10 Test Dataset: %.3f' % (final_test_acc_wideresnet_cifar10_da))
print('Final Result WideResNet using Data Augmentation and  Cutout for CIFAR-10 Test Dataset: %.3f' % (final_test_acc_wideresnet_cifar10_da_cutout))
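
Once the four CIFAR-10 runs have finished, the measured test errors can be compared with the values reported in the table at the start of this section. The sketch below is one possible way to organize that comparison. The helper names are hypothetical, and `measured` is filled with the reported values purely as a placeholder so that the example runs on its own; replace it with your own results.

```python
# Reported WideResNet test errors (%) for CIFAR-10, copied from the table
# at the start of this section.
reported = {
    "baseline": 6.97,
    "cutout": 5.54,
    "da": 3.87,
    "da_cutout": 3.08,
}


def gaps_to_paper(measured, reported):
    """Measured minus reported test error, in percentage points, per run."""
    return {name: measured[name] - reported[name] for name in reported}


def qualitative_claims_hold(errors):
    """Check the two qualitative claims on a dict of test errors (%)."""
    claim_1 = errors["cutout"] < errors["baseline"]  # Cutout alone helps
    claim_2 = errors["da_cutout"] < errors["da"]     # Cutout helps on top of augmentation
    return claim_1, claim_2


# Placeholder only: replace with your own results, for example
# measured = {"baseline": final_test_acc_wideresnet_cifar10,
#             "cutout": final_test_acc_wideresnet_cifar10_cutout,
#             "da": final_test_acc_wideresnet_cifar10_da,
#             "da_cutout": final_test_acc_wideresnet_cifar10_da_cutout}
measured = dict(reported)

for name, gap in gaps_to_paper(measured, reported).items():
    print(f"{name:10s} measured - reported = {gap:+.2f} pts")
print(qualitative_claims_hold(measured))
```

Keep in mind that the reported values are means over 5 runs, whereas each cell in this notebook produces a single run, so small gaps in either direction are expected.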

## 3.3 Training WideResNet in CIFAR-100

### 3.3.1. Training WideResNet in CIFAR-100 without Cutout

Image Processing for CIFAR-100

In [None]:
# Image Preprocessing

normalize_image_cifar100 = transforms.Normalize(mean=[x / 255.0 for x in [125.3, 123.0, 113.9]], std=[x / 255.0 for x in [63.0, 62.1, 66.7]])

train_transform_cifar100 = transforms.Compose([])

train_transform_cifar100.transforms.append(transforms.ToTensor())
train_transform_cifar100.transforms.append(normalize_image_cifar100)



test_transform_cifar100 = transforms.Compose([
    transforms.ToTensor(),
    normalize_image_cifar100])

Import the dataset of CIFAR-100

In [None]:
train_dataset_cifar100 = datasets.CIFAR100(root=current_path +  'data/',
                                     train=True,
                                     transform=train_transform_cifar100,
                                     download=True)

test_dataset_cifar100 = datasets.CIFAR100(root=current_path + 'data/',
                                    train=False,
                                    transform=test_transform_cifar100,
                                    download=True)

Create Dataset as Dataloader

In [None]:
# Data Loader (Input Pipeline)
batch_size_cifar100 = 128
train_loader_cifar100 = torch.utils.data.DataLoader(dataset=train_dataset_cifar100,
                                           batch_size=batch_size_cifar100,
                                           shuffle=True,
                                           pin_memory=True,
                                           num_workers=2)

test_loader_cifar100 = torch.utils.data.DataLoader(dataset=test_dataset_cifar100,
                                          batch_size=batch_size_cifar100,
                                          shuffle=False,
                                          pin_memory=True,
                                          num_workers=2)

Define the model

This code block sets up the machine learning model, loss function, optimizer, and learning rate scheduler.

In [None]:
# file_name is used to name the file that stores the model weights
file_name_wideresnet_cifar100 = "wideresnet_cifar100"

num_classes_cifar100 = 100
wideresnet_cifar100 = WideResNet(depth=28, num_classes=num_classes_cifar100, widen_factor=10, dropRate=0.3)


wideresnet_cifar100 = wideresnet_cifar100.cuda()
learning_rate_wideresnet_cifar100 = 0.1
criterion_wideresnet_cifar100 = nn.CrossEntropyLoss().cuda()
cnn_optimizer_wideresnet_cifar100 = torch.optim.SGD(wideresnet_cifar100.parameters(), lr=learning_rate_wideresnet_cifar100,
                                momentum=0.9, nesterov=True, weight_decay=5e-4)
scheduler_wideresnet_cifar100 = MultiStepLR(cnn_optimizer_wideresnet_cifar100, milestones=[60, 120, 160], gamma=0.2)
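
To make the learning rate schedule configured above concrete, the following is a minimal pure-Python sketch of the step decay rule that `MultiStepLR` applies. The helper `multistep_lr` is a hypothetical name introduced only for illustration; it assumes `scheduler.step()` is called once at the end of every epoch, as in the training loops in this notebook.

```python
from bisect import bisect_right

def multistep_lr(base_lr, milestones, gamma, epoch):
    """Learning rate in effect during `epoch` (0-indexed).

    Hypothetical helper: the rate is multiplied by `gamma` once for every
    milestone that has already been reached.
    """
    n_decays = bisect_right(sorted(milestones), epoch)
    return base_lr * gamma ** n_decays

# Settings used for CIFAR-100 in this notebook: lr=0.1, milestones=[60, 120, 160], gamma=0.2
for epoch in [0, 59, 60, 119, 120, 159, 160, 199]:
    print(epoch, round(multistep_lr(0.1, [60, 120, 160], 0.2, epoch), 6))
```

With these settings the rate should stay at 0.1 for epochs 0 to 59 and then shrink by a factor of 5 at each milestone. The same helper can be called with the SVHN settings used later in this notebook, for example `multistep_lr(0.01, [80, 120], 0.1, epoch)`.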

Training WideResNet without Cutout

This code runs the training loop for the model over the specified number of epochs. Each iteration performs a forward pass, loss computation, backpropagation, and a parameter update, while the progress bar shows the running training accuracy and average cross-entropy loss. At the end of each epoch, the model is evaluated on the test set and the test accuracy is printed. After the final epoch, the model weights are saved to a checkpoint file and the final test error (in percent) is reported.

In [None]:
epochs = 200
for epoch in range(epochs):

    xentropy_loss_avg = 0.
    correct = 0.
    total = 0.

    progress_bar = tqdm(train_loader_cifar100)
    for i, (images, labels) in enumerate(progress_bar):
        progress_bar.set_description('Epoch ' + str(epoch))

        images = images.cuda()
        labels = labels.cuda()

        wideresnet_cifar100.zero_grad()
        pred = wideresnet_cifar100(images)

        xentropy_loss = criterion_wideresnet_cifar100(pred, labels)
        xentropy_loss.backward()
        cnn_optimizer_wideresnet_cifar100.step()

        xentropy_loss_avg += xentropy_loss.item()

        # Calculate running average of accuracy
        pred = torch.max(pred.data, 1)[1]
        total += labels.size(0)
        correct += (pred == labels.data).sum().item()
        accuracy = correct / total

        progress_bar.set_postfix(
            xentropy='%.3f' % (xentropy_loss_avg / (i + 1)),
            acc='%.3f' % accuracy)

    test_accr_wideresnet_cifar100 = test(test_loader_cifar100, wideresnet_cifar100)
    tqdm.write('test_acc: %.3f' % (test_accr_wideresnet_cifar100))

    scheduler_wideresnet_cifar100.step()    

    
torch.save(wideresnet_cifar100.state_dict(), current_path + 'checkpoints/' + file_name_wideresnet_cifar100 + '.pt')


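# Note: despite the "acc" in its name, this variable stores the test error in percent
# (test() returns accuracy as a fraction in [0, 1])
final_test_acc_wideresnet_cifar100 = (1 - test(test_loader_cifar100, wideresnet_cifar100))*100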
print('Final Result WideResNet without Cutout for Test CIFAR-100 Dataset: %.3f' % (final_test_acc_wideresnet_cifar100))
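
The accuracy bookkeeping in the training loop and the conversion to a test error can be sketched on toy data as follows. This is a NumPy illustration under stated assumptions, not the notebook's `test` function: the helper `batch_correct` and the logits below are made up for the example.

```python
import numpy as np

def batch_correct(logits, labels):
    """Count predictions whose arg-max class matches the label (hypothetical helper)."""
    return int((np.argmax(logits, axis=1) == labels).sum())

# Two made-up batches of logits over 3 classes (illustrative values only)
batches = [
    (np.array([[2.0, 0.1, 0.3], [0.2, 1.5, 0.1]]), np.array([0, 2])),
    (np.array([[0.1, 0.2, 3.0], [1.0, 0.9, 0.8]]), np.array([2, 0])),
]

correct = total = 0
for logits, labels in batches:
    correct += batch_correct(logits, labels)
    total += len(labels)

accuracy = correct / total             # fraction in [0, 1]
error_percent = (1 - accuracy) * 100   # same conversion as in the cell above
print('acc: %.3f, error: %.3f%%' % (accuracy, error_percent))
```

Reporting the error rather than the accuracy makes the number directly comparable to the test error rates given in the paper.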

### 3.2.2. Training WideResNet in CF100 with Cutout

Image Processing for CIFAR-100

In [None]:
# Image Preprocessing

normalize_image_cifar100 = transforms.Normalize(mean=[x / 255.0 for x in [125.3, 123.0, 113.9]], std=[x / 255.0 for x in [63.0, 62.1, 66.7]])

train_transform_cifar100_cutout = transforms.Compose([])

train_transform_cifar100_cutout.transforms.append(transforms.ToTensor())
train_transform_cifar100_cutout.transforms.append(normalize_image_cifar100)

#Add Cutout to the image transformer pipeline
n_holes_cifar100 = 1
length_cifar100 = 8
train_transform_cifar100_cutout.transforms.append(Cutout(n_holes=n_holes_cifar100, length=length_cifar100))


test_transform_cifar100 = transforms.Compose([
    transforms.ToTensor(),
    normalize_image_cifar100])
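
As a rough illustration of what the Cutout step appended above does to each training image, here is a NumPy sketch that builds a single square mask and applies it to a channel-first image. It is a simplified stand-in, not the `Cutout` class used in this notebook; the function name `cutout_mask` and the use of a NumPy random generator are assumptions made for the example.

```python
import numpy as np

def cutout_mask(h, w, length, rng):
    """Return an (h, w) mask that is 0 inside one square patch and 1 elsewhere.

    The patch centre is sampled uniformly over the image, so the patch is
    clipped whenever it extends past a border.
    """
    y = int(rng.integers(h))
    x = int(rng.integers(w))
    y1, y2 = max(y - length // 2, 0), min(y + length // 2, h)
    x1, x2 = max(x - length // 2, 0), min(x + length // 2, w)
    mask = np.ones((h, w), dtype=np.float32)
    mask[y1:y2, x1:x2] = 0.0
    return mask

rng = np.random.default_rng(0)
image = np.ones((3, 32, 32), dtype=np.float32)            # stand-in for a normalized CHW image
masked = image * cutout_mask(32, 32, length=8, rng=rng)   # the mask broadcasts over channels
print(int((masked[0] == 0).sum()), "pixels zeroed per channel")
```

Because Cutout is appended after normalization in the pipeline above, a zeroed pixel corresponds to the per-channel dataset mean rather than to black. With `length=8`, between 16 and 64 pixels are masked, depending on how much of the patch falls outside the image.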

Import the dataset of CIFAR-100

In [None]:
train_dataset_cifar100_cutout = datasets.CIFAR100(root=current_path + 'data/',
                                     train=True,
                                     transform=train_transform_cifar100_cutout,
                                     download=True)

test_dataset_cifar100 = datasets.CIFAR100(root=current_path + 'data/',
                                    train=False,
                                    transform=test_transform_cifar100,
                                    download=True)

Create Dataset as Dataloader

In [None]:
# Data Loader (Input Pipeline)
batch_size_cifar100_cutout = 128
train_loader_cifar100_cutout = torch.utils.data.DataLoader(dataset=train_dataset_cifar100_cutout,
                                           batch_size=batch_size_cifar100_cutout,
                                           shuffle=True,
                                           pin_memory=True,
                                           num_workers=2)

test_loader_cifar100 = torch.utils.data.DataLoader(dataset=test_dataset_cifar100,
                                          batch_size=batch_size_cifar100_cutout,
                                          shuffle=False,
                                          pin_memory=True,
                                          num_workers=2)

Define the model

This code block sets up the machine learning model, loss function, optimizer, and learning rate scheduler.

In [None]:
# file_name is used to name the saved model weights and the results file
file_name_wideresnet_cifar100_cutout = "wideresnet_cifar100_cutout"

num_classes_cifar100 = 100
wideresnet_cifar100_cutout = WideResNet(depth=28, num_classes=num_classes_cifar100, widen_factor=10, dropRate=0.3)


wideresnet_cifar100_cutout = wideresnet_cifar100_cutout.cuda()
learning_rate_wideresnet_cifar100_cutout = 0.1
criterion_wideresnet_cifar100_cutout = nn.CrossEntropyLoss().cuda()
cnn_optimizer_wideresnet_cifar100_cutout = torch.optim.SGD(wideresnet_cifar100_cutout.parameters(), lr=learning_rate_wideresnet_cifar100_cutout,
                                momentum=0.9, nesterov=True, weight_decay=5e-4)
scheduler_wideresnet_cifar100_cutout = MultiStepLR(cnn_optimizer_wideresnet_cifar100_cutout, milestones=[60, 120, 160], gamma=0.2)

Training WideResNet with Cutout

This code runs the training loop for the model over the specified number of epochs. Each iteration performs a forward pass, loss computation, backpropagation, and a parameter update, while the progress bar shows the running training accuracy and average cross-entropy loss. At the end of each epoch, the model is evaluated on the test set and the test accuracy is printed. After the final epoch, the model weights are saved to a checkpoint file and the final test error (in percent) is reported.

In [None]:
epochs = 200
for epoch in range(epochs):

    xentropy_loss_avg = 0.
    correct = 0.
    total = 0.

    progress_bar = tqdm(train_loader_cifar100_cutout)
    for i, (images, labels) in enumerate(progress_bar):
        progress_bar.set_description('Epoch ' + str(epoch))

        images = images.cuda()
        labels = labels.cuda()

        wideresnet_cifar100_cutout.zero_grad()
        pred = wideresnet_cifar100_cutout(images)

        xentropy_loss = criterion_wideresnet_cifar100_cutout(pred, labels)
        xentropy_loss.backward()
        cnn_optimizer_wideresnet_cifar100_cutout.step()

        xentropy_loss_avg += xentropy_loss.item()

        # Calculate running average of accuracy
        pred = torch.max(pred.data, 1)[1]
        total += labels.size(0)
        correct += (pred == labels.data).sum().item()
        accuracy = correct / total

        progress_bar.set_postfix(
            xentropy='%.3f' % (xentropy_loss_avg / (i + 1)),
            acc='%.3f' % accuracy)

    test_acc_cifar100 = test(test_loader_cifar100,wideresnet_cifar100_cutout)
    tqdm.write('test_acc: %.3f' % (test_acc_cifar100))
    scheduler_wideresnet_cifar100_cutout.step()     
torch.save(wideresnet_cifar100_cutout.state_dict(), current_path + 'checkpoints/' + file_name_wideresnet_cifar100_cutout + '.pt')


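# Note: despite the "acc" in its name, this variable stores the test error in percent
# (test() returns accuracy as a fraction in [0, 1])
final_test_acc_wideresnet_cifar100_cutout = (1 - test(test_loader_cifar100,wideresnet_cifar100_cutout))*100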
print('Final Result WideResNet using Cutout for CIFAR-100 Test Dataset: %.3f' % (final_test_acc_wideresnet_cifar100_cutout))

### 3.2.3. Training WideResNet in CF100 with Data Augmentation

Image Processing for CIFAR-100

In [None]:
# Image Preprocessing

normalize_image_cifar100 = transforms.Normalize(mean=[x / 255.0 for x in [125.3, 123.0, 113.9]], std=[x / 255.0 for x in [63.0, 62.1, 66.7]])

train_transform_cifar100_da = transforms.Compose([])
train_transform_cifar100_da.transforms.append(transforms.RandomCrop(32, padding=4))
train_transform_cifar100_da.transforms.append(transforms.RandomHorizontalFlip())
train_transform_cifar100_da.transforms.append(transforms.ToTensor())
train_transform_cifar100_da.transforms.append(normalize_image_cifar100)


test_transform_cifar100 = transforms.Compose([
    transforms.ToTensor(),
    normalize_image_cifar100])
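
The standard augmentation configured above (random crop with 4 pixels of padding, then a random horizontal flip) can be sketched on a plain array as follows. This is a NumPy illustration, not the torchvision implementation: the function `random_crop_flip` is hypothetical, and it operates on a channel-first array, whereas the transforms above are applied to the image before `ToTensor()`.

```python
import numpy as np

def random_crop_flip(image, padding, rng):
    """Zero-pad a CHW image, crop back to the original size at a random
    offset, then mirror it horizontally with probability 0.5."""
    c, h, w = image.shape
    padded = np.pad(image, ((0, 0), (padding, padding), (padding, padding)),
                    mode="constant")
    top = int(rng.integers(2 * padding + 1))    # vertical offset in [0, 2 * padding]
    left = int(rng.integers(2 * padding + 1))   # horizontal offset in [0, 2 * padding]
    out = padded[:, top:top + h, left:left + w]
    if rng.random() < 0.5:
        out = out[:, :, ::-1]                   # horizontal flip
    return out

rng = np.random.default_rng(0)
image = np.arange(3 * 32 * 32, dtype=np.float32).reshape(3, 32, 32)
augmented = random_crop_flip(image, padding=4, rng=rng)
print(augmented.shape)
```

The output keeps the 32x32 size, so the augmented image is a shifted (by up to 4 pixels in each direction) and possibly mirrored view of the original, with zeros filling any area that falls outside the original frame.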

Import the dataset of CIFAR-100

In [None]:
train_dataset_cifar100_da = datasets.CIFAR100(root=current_path + 'data/',
                                     train=True,
                                     transform=train_transform_cifar100_da,
                                     download=True)

test_dataset_cifar100 = datasets.CIFAR100(root=current_path + 'data/',
                                    train=False,
                                    transform=test_transform_cifar100,
                                    download=True)

Create Dataset as Dataloader

In [None]:
# Data Loader (Input Pipeline)
batch_size_cifar100_da = 128
train_loader_cifar100_da = torch.utils.data.DataLoader(dataset=train_dataset_cifar100_da,
                                           batch_size=batch_size_cifar100_da,
                                           shuffle=True,
                                           pin_memory=True,
                                           num_workers=2)

test_loader_cifar100 = torch.utils.data.DataLoader(dataset=test_dataset_cifar100,
                                          batch_size=batch_size_cifar100_da,
                                          shuffle=False,
                                          pin_memory=True,
                                          num_workers=2)

Define the model

This code block sets up the machine learning model, loss function, optimizer, and learning rate scheduler.

In [None]:
# file_name is used to name the saved model weights and the results file
file_name_wideresnet_cifar100_da = "wideresnet_cifar100_da"

num_classes_cifar100 = 100
wideresnet_cifar100_da = WideResNet(depth=28, num_classes=num_classes_cifar100, widen_factor=10, dropRate=0.3)



wideresnet_cifar100_da = wideresnet_cifar100_da.cuda()
learning_rate_wideresnet_cifar100_da = 0.1
criterion_wideresnet_cifar100_da = nn.CrossEntropyLoss().cuda()
cnn_optimizer_wideresnet_cifar100_da = torch.optim.SGD(wideresnet_cifar100_da.parameters(), lr=learning_rate_wideresnet_cifar100_da,
                                momentum=0.9, nesterov=True, weight_decay=5e-4)
scheduler_wideresnet_cifar100_da = MultiStepLR(cnn_optimizer_wideresnet_cifar100_da, milestones=[60, 120, 160], gamma=0.2)

Training WideResNet with Data Augmentation

This code runs the training loop for the model over the specified number of epochs. Each iteration performs a forward pass, loss computation, backpropagation, and a parameter update, while the progress bar shows the running training accuracy and average cross-entropy loss. At the end of each epoch, the model is evaluated on the test set and the test accuracy is printed. After the final epoch, the model weights are saved to a checkpoint file and the final test error (in percent) is reported.

In [None]:
epochs = 200
for epoch in range(epochs):

    xentropy_loss_avg = 0.
    correct = 0.
    total = 0.

    progress_bar = tqdm(train_loader_cifar100_da)
    for i, (images, labels) in enumerate(progress_bar):
        progress_bar.set_description('Epoch ' + str(epoch))

        images = images.cuda()
        labels = labels.cuda()

        wideresnet_cifar100_da.zero_grad()
        pred = wideresnet_cifar100_da(images)

        xentropy_loss = criterion_wideresnet_cifar100_da(pred, labels)
        xentropy_loss.backward()
        cnn_optimizer_wideresnet_cifar100_da.step()

        xentropy_loss_avg += xentropy_loss.item()

        # Calculate running average of accuracy
        pred = torch.max(pred.data, 1)[1]
        total += labels.size(0)
        correct += (pred == labels.data).sum().item()
        accuracy = correct / total

        progress_bar.set_postfix(
            xentropy='%.3f' % (xentropy_loss_avg / (i + 1)),
            acc='%.3f' % accuracy)

    test_acc_wideresnet_cifar100_da = test(test_loader_cifar100,wideresnet_cifar100_da)
    tqdm.write('test_acc: %.3f' % (test_acc_wideresnet_cifar100_da))
    scheduler_wideresnet_cifar100_da.step()     
torch.save(wideresnet_cifar100_da.state_dict(), current_path + 'checkpoints/' + file_name_wideresnet_cifar100_da + '.pt')


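# Note: despite the "acc" in its name, this variable stores the test error in percent
# (test() returns accuracy as a fraction in [0, 1])
final_test_acc_wideresnet_cifar100_da = (1 - test(test_loader_cifar100,wideresnet_cifar100_da))*100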
print('Final Result WideResNet using Data Augmentation for CIFAR-100 Test Dataset: %.3f' % (final_test_acc_wideresnet_cifar100_da))

### 3.2.4. Training WideResNet in CF100 with Data Augmentation with Cutout

Image Processing for CIFAR-100

In [None]:
# Image Preprocessing

normalize_image_cifar100 = transforms.Normalize(mean=[x / 255.0 for x in [125.3, 123.0, 113.9]], std=[x / 255.0 for x in [63.0, 62.1, 66.7]])

train_transform_cifar100_da_co = transforms.Compose([])
train_transform_cifar100_da_co.transforms.append(transforms.RandomCrop(32, padding=4))
train_transform_cifar100_da_co.transforms.append(transforms.RandomHorizontalFlip())
train_transform_cifar100_da_co.transforms.append(transforms.ToTensor())
train_transform_cifar100_da_co.transforms.append(normalize_image_cifar100)

#Add Cutout to the image transformer pipeline
n_holes_cifar100_da_co = 1
length_cifar100_da_co = 8
train_transform_cifar100_da_co.transforms.append(Cutout(n_holes=n_holes_cifar100_da_co, length=length_cifar100_da_co))


test_transform_cifar100 = transforms.Compose([
    transforms.ToTensor(),
    normalize_image_cifar100])

Import the dataset of CIFAR-100

In [None]:
train_dataset_cifar100_da_co = datasets.CIFAR100(root=current_path + 'data/',
                                     train=True,
                                     transform=train_transform_cifar100_da_co,
                                     download=True)

test_dataset_cifar100 = datasets.CIFAR100(root=current_path + 'data/',
                                    train=False,
                                    transform=test_transform_cifar100,
                                    download=True)

Create Dataset as Dataloader

In [None]:
# Data Loader (Input Pipeline)
batch_size_cifar100_da_co = 128
train_loader_cifar100_da_co = torch.utils.data.DataLoader(dataset=train_dataset_cifar100_da_co,
                                           batch_size=batch_size_cifar100_da_co,
                                           shuffle=True,
                                           pin_memory=True,
                                           num_workers=2)

test_loader_cifar100 = torch.utils.data.DataLoader(dataset=test_dataset_cifar100,
                                          batch_size=batch_size_cifar100_da_co,
                                          shuffle=False,
                                          pin_memory=True,
                                          num_workers=2)

Define the model

This code block sets up the machine learning model, loss function, optimizer, and learning rate scheduler.

In [None]:
# file_name is used to name the saved model weights and the results file
file_name_wideresnet_cifar100_da_cutout = "wideresnet_cifar100_da_cutout"

num_classes_cifar100 = 100
wideresnet_cifar100_da_cutout = WideResNet(depth=28, num_classes=num_classes_cifar100, widen_factor=10, dropRate=0.3)


wideresnet_cifar100_da_cutout = wideresnet_cifar100_da_cutout.cuda()
learning_rate_cifar100_da_cutout = 0.1
criterion_cifar100_da_cutout = nn.CrossEntropyLoss().cuda()
cnn_optimizer_cifar100_da_cutout = torch.optim.SGD(wideresnet_cifar100_da_cutout.parameters(), lr=learning_rate_cifar100_da_cutout,
                                momentum=0.9, nesterov=True, weight_decay=5e-4)
scheduler_cifar100_da_cutout = MultiStepLR(cnn_optimizer_cifar100_da_cutout, milestones=[60, 120, 160], gamma=0.2)

Training WideResNet with Data Augmentation and Cutout

This code runs the training loop for the model over the specified number of epochs. Each iteration performs a forward pass, loss computation, backpropagation, and a parameter update, while the progress bar shows the running training accuracy and average cross-entropy loss. At the end of each epoch, the model is evaluated on the test set and the test accuracy is printed. After the final epoch, the model weights are saved to a checkpoint file and the final test error (in percent) is reported.

In [None]:
epochs = 200
for epoch in range(epochs):

    xentropy_loss_avg = 0.
    correct = 0.
    total = 0.

    progress_bar = tqdm(train_loader_cifar100_da_co)
    for i, (images, labels) in enumerate(progress_bar):
        progress_bar.set_description('Epoch ' + str(epoch))

        images = images.cuda()
        labels = labels.cuda()

        wideresnet_cifar100_da_cutout.zero_grad()
        pred = wideresnet_cifar100_da_cutout(images)

        xentropy_loss = criterion_cifar100_da_cutout(pred, labels)
        xentropy_loss.backward()
        cnn_optimizer_cifar100_da_cutout.step()

        xentropy_loss_avg += xentropy_loss.item()

        # Calculate running average of accuracy
        pred = torch.max(pred.data, 1)[1]
        total += labels.size(0)
        correct += (pred == labels.data).sum().item()
        accuracy = correct / total

        progress_bar.set_postfix(
            xentropy='%.3f' % (xentropy_loss_avg / (i + 1)),
            acc='%.3f' % accuracy)

    test_acc_cifar100_da_cutout = test(test_loader_cifar100,wideresnet_cifar100_da_cutout)
    tqdm.write('test_acc: %.3f' % (test_acc_cifar100_da_cutout))
    scheduler_cifar100_da_cutout.step()     
torch.save(wideresnet_cifar100_da_cutout.state_dict(), current_path + 'checkpoints/' + file_name_wideresnet_cifar100_da_cutout + '.pt')


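# Note: despite the "acc" in its name, this variable stores the test error in percent
# (test() returns accuracy as a fraction in [0, 1])
final_test_acc_wideresnet_cifar100_da_cutout = (1 - test(test_loader_cifar100,wideresnet_cifar100_da_cutout))*100
print('Final Result WideResNet using Data Augmentation and Cutout for CIFAR-100 Test Dataset: %.3f' % (final_test_acc_wideresnet_cifar100_da_cutout))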

In [None]:
print('Final Result WideResNet without Cutout for Test CIFAR-100 Dataset: %.3f' % (final_test_acc_wideresnet_cifar100))
print('Final Result WideResNet using Cutout for CIFAR-100 Test Dataset: %.3f' % (final_test_acc_wideresnet_cifar100_cutout))
print('Final Result WideResNet using Data Augmentation for CIFAR-100 Test Dataset: %.3f' % (final_test_acc_wideresnet_cifar100_da))
print('Final Result WideResNet using Data Augmentation and Cutout for CIFAR-100 Test Dataset: %.3f' % (final_test_acc_wideresnet_cifar100_da_cutout))

## 3.4 Training WideResNet in SVHN

### 3.4.1. Training WideResNet in SVHN without Cutout

Image Processing for SVHN

In [None]:
# Image Preprocessing

normalize_image_svhn = transforms.Normalize(mean=[x / 255.0 for x in[109.9, 109.7, 113.8]],std=[x / 255.0 for x in [50.1, 50.6, 50.8]])

train_transform_svhn = transforms.Compose([])

train_transform_svhn.transforms.append(transforms.ToTensor())
train_transform_svhn.transforms.append(normalize_image_svhn)



test_transform_svhn = transforms.Compose([
    transforms.ToTensor(),
    normalize_image_svhn])

Import the dataset of SVHN

In [None]:
train_dataset_svhn = datasets.SVHN(root=current_path + 'data/',
                                    split='train',
                                    transform=train_transform_svhn,
                                    download=True)

extra_dataset_svhn = datasets.SVHN(root=current_path + 'data/',
                                    split='extra',
                                    transform=train_transform_svhn,
                                    download=True)

# Combine both training splits (https://arxiv.org/pdf/1605.07146.pdf)
data_svhn = np.concatenate([train_dataset_svhn.data, extra_dataset_svhn.data], axis=0)
labels_svhn = np.concatenate([train_dataset_svhn.labels, extra_dataset_svhn.labels], axis=0)
train_dataset_svhn.data = data_svhn
train_dataset_svhn.labels = labels_svhn

test_dataset_svhn = datasets.SVHN(root=current_path + 'data/',
                                  split='test',
                                  transform=test_transform_svhn,
                                  download=True)
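
The merging of the `train` and `extra` splits above can be checked on small stand-in arrays. The sketch below assumes that `.data` is an array of shape `(N, 3, 32, 32)` and `.labels` an array of shape `(N,)`; the tiny arrays are hypothetical placeholders, not real SVHN data.

```python
import numpy as np

# Hypothetical, tiny stand-ins for the `.data` and `.labels` arrays of each split
train_data = np.zeros((5, 3, 32, 32), dtype=np.uint8)
train_labels = np.zeros(5, dtype=np.int64)
extra_data = np.ones((7, 3, 32, 32), dtype=np.uint8)
extra_labels = np.ones(7, dtype=np.int64)

# Stack along the sample axis, keeping images and labels aligned
data = np.concatenate([train_data, extra_data], axis=0)
labels = np.concatenate([train_labels, extra_labels], axis=0)

# Images and labels must stay the same length after merging
assert len(data) == len(labels)
print(data.shape, labels.shape)
```

With the real splits (73,257 `train` images and 531,131 `extra` images), the merged training set should contain 604,388 images, which is why one SVHN epoch takes considerably longer than one CIFAR epoch.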

Create Dataset as Dataloader

In [None]:
# Data Loader (Input Pipeline)
batch_size_svhn = 128
train_loader_svhn = torch.utils.data.DataLoader(dataset=train_dataset_svhn,
                                           batch_size=batch_size_svhn,
                                           shuffle=True,
                                           pin_memory=True,
                                           num_workers=2)

test_loader_svhn = torch.utils.data.DataLoader(dataset=test_dataset_svhn,
                                          batch_size=batch_size_svhn,
                                          shuffle=False,
                                          pin_memory=True,
                                          num_workers=2)

Define the model

This code block sets up the machine learning model, loss function, optimizer, and learning rate scheduler.

In [None]:
# file_name is used to name the saved model weights and the results file
file_name_wideresnet_svhn = "wideresnet_svhn"

num_classes_svhn = 10
wideresnet_svhn = WideResNet(depth=16, num_classes=num_classes_svhn, widen_factor=8,dropRate=0.4)


wideresnet_svhn = wideresnet_svhn.cuda()
learning_rate_wideresnet_svhn = 0.01
criterion_wideresnet_svhn = nn.CrossEntropyLoss().cuda()
cnn_optimizer_wideresnet_svhn = torch.optim.SGD(wideresnet_svhn.parameters(), lr=learning_rate_wideresnet_svhn,
                                momentum=0.9, nesterov=True, weight_decay=5e-4)
scheduler_wideresnet_svhn = MultiStepLR(cnn_optimizer_wideresnet_svhn, milestones=[80, 120], gamma=0.1)

Training WideResNet without Cutout

This code runs the training loop for the model over the specified number of epochs. Each iteration performs a forward pass, loss computation, backpropagation, and a parameter update, while the progress bar shows the running training accuracy and average cross-entropy loss. At the end of each epoch, the model is evaluated on the test set and the test accuracy is printed. After the final epoch, the model weights are saved to a checkpoint file and the final test error (in percent) is reported.

In [None]:
epochs = 160
for epoch in range(epochs):

    xentropy_loss_avg = 0.
    correct = 0.
    total = 0.

    progress_bar = tqdm(train_loader_svhn)
    for i, (images, labels) in enumerate(progress_bar):
        progress_bar.set_description('Epoch ' + str(epoch))

        images = images.cuda()
        labels = labels.cuda()

        wideresnet_svhn.zero_grad()
        pred = wideresnet_svhn(images)

        xentropy_loss = criterion_wideresnet_svhn(pred, labels)
        xentropy_loss.backward()
        cnn_optimizer_wideresnet_svhn.step()

        xentropy_loss_avg += xentropy_loss.item()

        # Calculate running average of accuracy
        pred = torch.max(pred.data, 1)[1]
        total += labels.size(0)
        correct += (pred == labels.data).sum().item()
        accuracy = correct / total

        progress_bar.set_postfix(
            xentropy='%.3f' % (xentropy_loss_avg / (i + 1)),
            acc='%.3f' % accuracy)

    test_accr_wideresnet_svhn = test(test_loader_svhn, wideresnet_svhn)
    tqdm.write('test_acc: %.3f' % (test_accr_wideresnet_svhn))

    scheduler_wideresnet_svhn.step()     

    
torch.save(wideresnet_svhn.state_dict(), current_path + 'checkpoints/' + file_name_wideresnet_svhn + '.pt')


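# Note: despite the "acc" in its name, this variable stores the test error in percent
# (test() returns accuracy as a fraction in [0, 1])
final_test_acc_wideresnet_svhn = (1 - test(test_loader_svhn, wideresnet_svhn))*100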
print('Final Result WideResNet without Cutout for Test SVHN Dataset: %.3f' % (final_test_acc_wideresnet_svhn))

### 3.4.2. Training WideResNet in SVHN with Cutout

Image Processing for SVHN

In [None]:
# Image Preprocessing

normalize_image_svhn = transforms.Normalize(mean=[x / 255.0 for x in[109.9, 109.7, 113.8]], std=[x / 255.0 for x in [50.1, 50.6, 50.8]])

train_transform_svhn_cutout = transforms.Compose([])

train_transform_svhn_cutout.transforms.append(transforms.ToTensor())
train_transform_svhn_cutout.transforms.append(normalize_image_svhn)

#Add Cutout to the image transformer pipeline
n_holes_svhn = 1
length_svhn = 20
train_transform_svhn_cutout.transforms.append(Cutout(n_holes=n_holes_svhn, length=length_svhn))


test_transform_svhn = transforms.Compose([
    transforms.ToTensor(),
    normalize_image_svhn])

Import the dataset of SVHN

In [None]:
train_dataset_svhn_cutout = datasets.SVHN(root=current_path + 'data/',
                                    split='train',
                                    transform=train_transform_svhn_cutout,
                                    download=True)

extra_dataset_svhn_cutout = datasets.SVHN(root=current_path + 'data/',
                                    split='extra',
                                    transform=train_transform_svhn_cutout,
                                    download=True)

# Combine both training splits (https://arxiv.org/pdf/1605.07146.pdf)
data_svhn_cutout = np.concatenate([train_dataset_svhn_cutout.data, extra_dataset_svhn_cutout.data], axis=0)
labels_svhn_cutout = np.concatenate([train_dataset_svhn_cutout.labels, extra_dataset_svhn_cutout.labels], axis=0)
train_dataset_svhn_cutout.data = data_svhn_cutout
train_dataset_svhn_cutout.labels = labels_svhn_cutout

test_dataset_svhn = datasets.SVHN(root=current_path + 'data/',
                                  split='test',
                                  transform=test_transform_svhn,
                                  download=True)

Create Dataset as Dataloader

In [None]:
# Data Loader (Input Pipeline)
batch_size_svhn_cutout = 128
train_loader_svhn_cutout = torch.utils.data.DataLoader(dataset=train_dataset_svhn_cutout,
                                           batch_size=batch_size_svhn_cutout,
                                           shuffle=True,
                                           pin_memory=True,
                                           num_workers=2)

test_loader_svhn = torch.utils.data.DataLoader(dataset=test_dataset_svhn,
                                          batch_size=batch_size_svhn_cutout,
                                          shuffle=False,
                                          pin_memory=True,
                                          num_workers=2)

Define the model

This code block sets up the machine learning model, loss function, optimizer, and learning rate scheduler.

In [None]:
# file_name is used to name the saved model weights and the results file
file_name_wideresnet_svhn_cutout = "wideresnet_svhn_cutout"

num_classes_svhn = 10
wideresnet_svhn_cutout = WideResNet(depth=16, num_classes=num_classes_svhn, widen_factor=8,dropRate=0.4)


wideresnet_svhn_cutout = wideresnet_svhn_cutout.cuda()
learning_rate_wideresnet_svhn_cutout = 0.01
criterion_wideresnet_svhn_cutout = nn.CrossEntropyLoss().cuda()
cnn_optimizer_wideresnet_svhn_cutout = torch.optim.SGD(wideresnet_svhn_cutout.parameters(), lr=learning_rate_wideresnet_svhn_cutout,
                                momentum=0.9, nesterov=True, weight_decay=5e-4)
scheduler_wideresnet_svhn_cutout = MultiStepLR(cnn_optimizer_wideresnet_svhn_cutout, milestones=[80, 120], gamma=0.1)

Training WideResNet with Cutout

This code runs the training loop for the model over the specified number of epochs. Each iteration performs a forward pass, loss computation, backpropagation, and a parameter update, while the progress bar shows the running training accuracy and average cross-entropy loss. At the end of each epoch, the model is evaluated on the test set and the test accuracy is printed. After the final epoch, the model weights are saved to a checkpoint file and the final test error (in percent) is reported.

In [None]:
epochs = 160
for epoch in range(epochs):

    xentropy_loss_avg = 0.
    correct = 0.
    total = 0.

    progress_bar = tqdm(train_loader_svhn_cutout)
    for i, (images, labels) in enumerate(progress_bar):
        progress_bar.set_description('Epoch ' + str(epoch))

        images = images.cuda()
        labels = labels.cuda()

        wideresnet_svhn_cutout.zero_grad()
        pred = wideresnet_svhn_cutout(images)

        xentropy_loss = criterion_wideresnet_svhn_cutout(pred, labels)
        xentropy_loss.backward()
        cnn_optimizer_wideresnet_svhn_cutout.step()

        xentropy_loss_avg += xentropy_loss.item()

        # Calculate running average of accuracy
        pred = torch.max(pred.data, 1)[1]
        total += labels.size(0)
        correct += (pred == labels.data).sum().item()
        accuracy = correct / total

        progress_bar.set_postfix(
            xentropy='%.3f' % (xentropy_loss_avg / (i + 1)),
            acc='%.3f' % accuracy)

    test_acc_svhn = test(test_loader_svhn,wideresnet_svhn_cutout)
    tqdm.write('test_acc: %.3f' % (test_acc_svhn))
    scheduler_wideresnet_svhn_cutout.step()     
torch.save(wideresnet_svhn_cutout.state_dict(), current_path + 'checkpoints/' + file_name_wideresnet_svhn_cutout + '.pt')


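# Note: despite the "acc" in its name, this variable stores the test error in percent
# (test() returns accuracy as a fraction in [0, 1])
final_test_acc_wideresnet_svhn_cutout = (1 - test(test_loader_svhn,wideresnet_svhn_cutout))*100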
print('Final Result WideResNet using Cutout for SVHN Test Dataset: %.3f' % (final_test_acc_wideresnet_svhn_cutout))

In [None]:
print('Final Result WideResNet without Cutout for Test SVHN Dataset: %.3f' % (final_test_acc_wideresnet_svhn))
print('Final Result WideResNet using Cutout for SVHN Test Dataset: %.3f' % (final_test_acc_wideresnet_svhn_cutout))

# 04. Grad-CAM

###### What is Grad-CAM?

Grad-CAM (Gradient-weighted Class Activation Mapping) is a technique that provides visual explanations for decisions made by Convolutional Neural Network (CNN) models. It uses the gradients of any target concept, flowing into the final convolutional layer to produce a coarse localization map highlighting the important regions in the image for predicting the concept.

Grad-CAM is not limited to a specific architecture: it can be applied to a wide range of CNN models without changing their structure or retraining them. It is also class-discriminative, meaning the map it produces is specific to the chosen target class, which makes it usable when an image contains more than one class of interest.

By visualizing the model’s focus areas with Grad-CAM, we can assess how effectively Cutout encourages the model to use a broader range of features. For example, if a model trained with Cutout still focuses primarily on a single region, that might suggest the Cutout squares are too small or not numerous enough. Conversely, if the focus areas are spread more widely across the image than for the model trained without Cutout, that would be consistent with the claim that Cutout pushes the model to take more of the image context into account.

If you want to understand more about Grad-CAM, see the original paper: https://arxiv.org/abs/1610.02391
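
The Grad-CAM computation described above can be sketched for a single image as follows. This is a NumPy illustration under stated assumptions: `activations` and `gradients` stand for the feature maps of the chosen convolutional layer and the gradients of the target class score with respect to them, and random arrays are used here in place of values taken from a real network. The final rescaling to [0, 1] is a common visualization step rather than part of the Grad-CAM definition.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Compute a coarse localization map for one image.

    activations, gradients: arrays of shape (K, H, W), taken at the chosen
    convolutional layer for the target class score.
    """
    weights = gradients.mean(axis=(1, 2))              # global-average-pool the gradients: (K,)
    cam = np.tensordot(weights, activations, axes=1)   # weighted sum of feature maps: (H, W)
    cam = np.maximum(cam, 0)                           # ReLU keeps only positive evidence
    if cam.max() > 0:
        cam = cam / cam.max()                          # rescale to [0, 1] for display
    return cam

rng = np.random.default_rng(0)
A = rng.random((16, 4, 4))                 # stand-in feature maps
dY_dA = rng.standard_normal((16, 4, 4))    # stand-in gradients of the class score
heatmap = grad_cam(A, dY_dA)
print(heatmap.shape, float(heatmap.min()), float(heatmap.max()))
```

In practice the resulting map is upsampled to the input resolution and overlaid on the image, which is why the map is described as coarse: its resolution is that of the chosen layer's feature maps.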

## Import Library

In [None]:
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable
import torch.backends.cudnn as cudnn
from torch.optim.lr_scheduler import MultiStepLR
from torchvision import datasets, transforms
import numpy as np
import os
from tqdm import tqdm
import math
import cv2
import matplotlib.pyplot as plt
from PIL import Image

Check CUDA GPU availability and set the random seed

In [None]:
cuda = torch.cuda.is_available()
print(cuda)
cudnn.benchmark = True  # Should make training go faster for large models

seed = 1
torch.manual_seed(seed)
np.random.seed(seed)

If you are using Google Colab, follow these steps to connect to your Google Drive:

1.  On the left sidebar of the Colab notebook interface, you will see a folder icon with the Google Drive logo. Click on this folder icon to open the file explorer.

2.  If you haven’t connected your Google Drive to Colab yet, it will prompt you to do so. Click the “Mount Drive” button to connect your Google Drive to Colab.

3.  Once your Google Drive is mounted, you can use the file explorer to navigate to the file you want to open. Click on the folders to explore the contents of your Google Drive.

4.  When you find the file you want to open, click the three dots next to its name in the file explorer. From the options that appear, choose “Copy path.” This copies the full path of the file to your clipboard. Paste the copied path into ‘current_path’ below.

In [None]:
current_path ="./"

## 4.2 Implementing Grad-CAM for the ResNet Model

In [None]:

# ResNet
# From https://github.com/uoguelph-mlrg/Cutout/blob/master/model/resnet.py

def conv3x3(in_planes, out_planes, stride=1):
    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride, padding=1, bias=False)


class BasicBlock(nn.Module):
    expansion = 1

    def __init__(self, in_planes, planes, stride=1):
        super(BasicBlock, self).__init__()
        self.conv1 = conv3x3(in_planes, planes, stride)
        self.bn1 = nn.BatchNorm2d(planes)
        self.conv2 = conv3x3(planes, planes)
        self.bn2 = nn.BatchNorm2d(planes)

        self.shortcut = nn.Sequential()
        if stride != 1 or in_planes != self.expansion*planes:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_planes, self.expansion*planes, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(self.expansion*planes)
            )

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out += self.shortcut(x)
        out = F.relu(out)
        return out


class Bottleneck(nn.Module):
    expansion = 4

    def __init__(self, in_planes, planes, stride=1):
        super(Bottleneck, self).__init__()
        self.conv1 = nn.Conv2d(in_planes, planes, kernel_size=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes)
        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)
        self.conv3 = nn.Conv2d(planes, self.expansion*planes, kernel_size=1, bias=False)
        self.bn3 = nn.BatchNorm2d(self.expansion*planes)

        self.shortcut = nn.Sequential()
        if stride != 1 or in_planes != self.expansion*planes:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_planes, self.expansion*planes, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(self.expansion*planes)
            )

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = F.relu(self.bn2(self.conv2(out)))
        out = self.bn3(self.conv3(out))
        out += self.shortcut(x)
        out = F.relu(out)
        return out


class ResNet(nn.Module):
    def __init__(self, block, num_blocks, num_classes=10):
        super(ResNet, self).__init__()
        self.in_planes = 64

        self.conv1 = conv3x3(3,64)
        self.bn1 = nn.BatchNorm2d(64)
        self.layer1 = self._make_layer(block, 64, num_blocks[0], stride=1)
        self.layer2 = self._make_layer(block, 128, num_blocks[1], stride=2)
        self.layer3 = self._make_layer(block, 256, num_blocks[2], stride=2)
        self.layer4 = self._make_layer(block, 512, num_blocks[3], stride=2)
        self.linear = nn.Linear(512*block.expansion, num_classes)

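        # Register hooks for Grad-CAM: capture layer4's output and the gradient
        # flowing back into that output
        self.gradients = None
        self.activations = None
        self.layer4.register_forward_hook(self._store_activations_hook)
        # register_backward_hook is deprecated in favor of the full variant
        self.layer4.register_full_backward_hook(self._store_gradients_hook)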

    def _store_activations_hook(self, module, input, output):
        self.activations = output

    def _store_gradients_hook(self, module, grad_input, grad_output):
        self.gradients = grad_output[0]

    def _make_layer(self, block, planes, num_blocks, stride):
        strides = [stride] + [1]*(num_blocks-1)
        layers = []
        for stride in strides:
            layers.append(block(self.in_planes, planes, stride))
            self.in_planes = planes * block.expansion
        return nn.Sequential(*layers)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.layer1(out)
        out = self.layer2(out)
        out = self.layer3(out)
        out = self.layer4(out) 
        out = F.avg_pool2d(out, 4)
        out = out.view(out.size(0), -1)
        out = self.linear(out)
        return out

def ResNet18(num_classes=10):
    return ResNet(BasicBlock, [2,2,2,2], num_classes)
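The two hooks registered in `ResNet.__init__` are what make Grad-CAM possible: the forward hook records the feature maps of `layer4`, and the backward hook records the gradient of the score with respect to those maps. The toy example below illustrates the same mechanism on a single convolution. The layer sizes and the `store` dictionary are illustrative assumptions, and it uses `register_full_backward_hook`, the non-deprecated hook API in recent PyTorch releases.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for a ResNet stage: one conv layer plus a linear head.
features = nn.Conv2d(3, 8, kernel_size=3, padding=1)
head = nn.Linear(8, 10)

store = {}

def save_activations(module, inputs, output):
    # Called right after features.forward(); output has shape (B, 8, H, W).
    store["activations"] = output.detach()

def save_gradients(module, grad_input, grad_output):
    # Called during backward(); grad_output[0] is d(score)/d(output).
    store["gradients"] = grad_output[0].detach()

features.register_forward_hook(save_activations)
features.register_full_backward_hook(save_gradients)

x = torch.randn(1, 3, 4, 4, requires_grad=True)
fmap = features(x)                      # forward hook fires here
logits = head(fmap.mean(dim=(2, 3)))    # global average pool, then classify
logits.max().backward()                 # backward hook fires here

print(store["activations"].shape, store["gradients"].shape)
```

Both stored tensors have the same shape, which is what allows the per-channel weights to be multiplied back onto the feature maps.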

## 4.2.1 Implementing Grad-CAM for the ResNet18 Model on CIFAR-10

In [None]:

# The models stay on the CPU in this section, so map the checkpoints to the CPU
# (this also lets checkpoints saved on a GPU load on a CPU-only machine)
resnet18_gradcam_cifar10 = ResNet18(num_classes=10)
resnet18_gradcam_cifar10.load_state_dict(torch.load(current_path + "checkpoints/resnet18_cifar10.pt", map_location="cpu"))
resnet18_gradcam_cifar10.eval()

resnet18_gradcam_cifar10_cutout = ResNet18(num_classes=10)
resnet18_gradcam_cifar10_cutout.load_state_dict(torch.load(current_path + "checkpoints/resnet18_cifar10_cutout.pt", map_location="cpu"))
resnet18_gradcam_cifar10_cutout.eval()

resnet18_gradcam_cifar10_da = ResNet18(num_classes=10)
resnet18_gradcam_cifar10_da.load_state_dict(torch.load(current_path + "checkpoints/resnet18_cifar10_da.pt", map_location="cpu"))
resnet18_gradcam_cifar10_da.eval()

resnet18_gradcam_cifar10_da_cutout = ResNet18(num_classes=10)
resnet18_gradcam_cifar10_da_cutout.load_state_dict(torch.load(current_path + "checkpoints/resnet18_cifar10_da_cutout.pt", map_location="cpu"))
resnet18_gradcam_cifar10_da_cutout.eval()

Let’s look at the result for an image drawn from the CIFAR-10 test loader

In [None]:
transform_cifar10 = transforms.Compose([
    transforms.Resize((32, 32)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

testset_cifar10 = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform_cifar10)
testloader_cifar10 = torch.utils.data.DataLoader(testset_cifar10, batch_size=1, shuffle=True, num_workers=2)


In [None]:
cifar10_classes = [
    "Airplane", "Automobile", "Bird", "Cat", "Deer",
    "Dog", "Frog", "Horse", "Ship", "Truck"
]

In [None]:
# Get a batch from the testloader
images, labels = next(iter(testloader_cifar10))
input_tensor = images  # As your batch_size is 1, you will have a single image here

# Forward pass
resnet18_gradcam_cifar10.zero_grad()
output_resnet18_gradcam_cifar10 = resnet18_gradcam_cifar10(input_tensor)

resnet18_gradcam_cifar10_cutout.zero_grad()
output_resnet18_gradcam_cifar10_cutout = resnet18_gradcam_cifar10_cutout(input_tensor)

resnet18_gradcam_cifar10_da.zero_grad()
output_resnet18_gradcam_cifar10_da = resnet18_gradcam_cifar10_da(input_tensor)

resnet18_gradcam_cifar10_da_cutout.zero_grad()
output_resnet18_gradcam_cifar10_da_cutout = resnet18_gradcam_cifar10_da_cutout(input_tensor)

# Get the index of the largest logit (the predicted class) and backpropagate from it
target_resnet18_gradcam_cifar10 = output_resnet18_gradcam_cifar10.argmax(1)
output_resnet18_gradcam_cifar10.max().backward()

target_resnet18_gradcam_cifar10_cutout = output_resnet18_gradcam_cifar10_cutout.argmax(1)
output_resnet18_gradcam_cifar10_cutout.max().backward()

target_resnet18_gradcam_cifar10_da = output_resnet18_gradcam_cifar10_da.argmax(1)
output_resnet18_gradcam_cifar10_da.max().backward()

target_resnet18_gradcam_cifar10_da_cutout = output_resnet18_gradcam_cifar10_da_cutout.argmax(1)
output_resnet18_gradcam_cifar10_da_cutout.max().backward()

# Map the predicted class indices to the class labels
predicted_class_resnet18_gradcam_cifar10 = cifar10_classes[target_resnet18_gradcam_cifar10.item()]
predicted_class_resnet18_gradcam_cifar10_cutout = cifar10_classes[target_resnet18_gradcam_cifar10_cutout.item()]
predicted_class_resnet18_gradcam_cifar10_da = cifar10_classes[target_resnet18_gradcam_cifar10_da.item()]
predicted_class_resnet18_gradcam_cifar10_da_cutout = cifar10_classes[target_resnet18_gradcam_cifar10_da_cutout.item()]


# Get the gradients and activations
gradients_resnet18_gradcam_cifar10 = resnet18_gradcam_cifar10.gradients.detach().cpu()
activations_resnet18_gradcam_cifar10 = resnet18_gradcam_cifar10.activations.detach().cpu()

gradients_resnet18_gradcam_cifar10_cutout = resnet18_gradcam_cifar10_cutout.gradients.detach().cpu()
activations_resnet18_gradcam_cifar10_cutout = resnet18_gradcam_cifar10_cutout.activations.detach().cpu()

gradients_resnet18_gradcam_cifar10_da = resnet18_gradcam_cifar10_da.gradients.detach().cpu()
activations_resnet18_gradcam_cifar10_da = resnet18_gradcam_cifar10_da.activations.detach().cpu()

gradients_resnet18_gradcam_cifar10_da_cutout = resnet18_gradcam_cifar10_da_cutout.gradients.detach().cpu()
activations_resnet18_gradcam_cifar10_da_cutout = resnet18_gradcam_cifar10_da_cutout.activations.detach().cpu()


# Calculate the weights
weights_resnet18_gradcam_cifar10 = gradients_resnet18_gradcam_cifar10.mean(dim=(2, 3), keepdim=True)

weights_resnet18_gradcam_cifar10_cutout = gradients_resnet18_gradcam_cifar10_cutout.mean(dim=(2, 3), keepdim=True)

weights_resnet18_gradcam_cifar10_da = gradients_resnet18_gradcam_cifar10_da.mean(dim=(2, 3), keepdim=True)

weights_resnet18_gradcam_cifar10_da_cutout = gradients_resnet18_gradcam_cifar10_da_cutout.mean(dim=(2, 3), keepdim=True)

# Calculate the weighted sum of activations (Grad-CAM)
cam_resnet18_gradcam_cifar10 = (weights_resnet18_gradcam_cifar10 * activations_resnet18_gradcam_cifar10).sum(dim=1, keepdim=True)
cam_resnet18_gradcam_cifar10 = F.relu(cam_resnet18_gradcam_cifar10)  # apply ReLU to the heatmap
cam_resnet18_gradcam_cifar10 = F.interpolate(cam_resnet18_gradcam_cifar10, size=(32, 32), mode='bilinear', align_corners=False)
cam_resnet18_gradcam_cifar10 = cam_resnet18_gradcam_cifar10.squeeze().numpy()

cam_resnet18_gradcam_cifar10_cutout = (weights_resnet18_gradcam_cifar10_cutout * activations_resnet18_gradcam_cifar10_cutout).sum(dim=1, keepdim=True)
cam_resnet18_gradcam_cifar10_cutout = F.relu(cam_resnet18_gradcam_cifar10_cutout)  # apply ReLU to the heatmap
cam_resnet18_gradcam_cifar10_cutout = F.interpolate(cam_resnet18_gradcam_cifar10_cutout, size=(32, 32), mode='bilinear', align_corners=False)
cam_resnet18_gradcam_cifar10_cutout = cam_resnet18_gradcam_cifar10_cutout.squeeze().numpy()

cam_resnet18_gradcam_cifar10_da = (weights_resnet18_gradcam_cifar10_da * activations_resnet18_gradcam_cifar10_da).sum(dim=1, keepdim=True)
cam_resnet18_gradcam_cifar10_da = F.relu(cam_resnet18_gradcam_cifar10_da)  # apply ReLU to the heatmap
cam_resnet18_gradcam_cifar10_da = F.interpolate(cam_resnet18_gradcam_cifar10_da, size=(32, 32), mode='bilinear', align_corners=False)
cam_resnet18_gradcam_cifar10_da = cam_resnet18_gradcam_cifar10_da.squeeze().numpy()

cam_resnet18_gradcam_cifar10_da_cutout = (weights_resnet18_gradcam_cifar10_da_cutout * activations_resnet18_gradcam_cifar10_da_cutout).sum(dim=1, keepdim=True)
cam_resnet18_gradcam_cifar10_da_cutout = F.relu(cam_resnet18_gradcam_cifar10_da_cutout)  # apply ReLU to the heatmap
cam_resnet18_gradcam_cifar10_da_cutout = F.interpolate(cam_resnet18_gradcam_cifar10_da_cutout, size=(32, 32), mode='bilinear', align_corners=False)
cam_resnet18_gradcam_cifar10_da_cutout = cam_resnet18_gradcam_cifar10_da_cutout.squeeze().numpy()


# Normalize the heatmap to [0, 1]; the small epsilon avoids 0/0 when a map is all zeros
cam_resnet18_gradcam_cifar10 -= cam_resnet18_gradcam_cifar10.min()
cam_resnet18_gradcam_cifar10 /= cam_resnet18_gradcam_cifar10.max() + 1e-8

cam_resnet18_gradcam_cifar10_cutout -= cam_resnet18_gradcam_cifar10_cutout.min()
cam_resnet18_gradcam_cifar10_cutout /= cam_resnet18_gradcam_cifar10_cutout.max() + 1e-8

cam_resnet18_gradcam_cifar10_da -= cam_resnet18_gradcam_cifar10_da.min()
cam_resnet18_gradcam_cifar10_da /= cam_resnet18_gradcam_cifar10_da.max() + 1e-8

cam_resnet18_gradcam_cifar10_da_cutout -= cam_resnet18_gradcam_cifar10_da_cutout.min()
cam_resnet18_gradcam_cifar10_da_cutout /= cam_resnet18_gradcam_cifar10_da_cutout.max() + 1e-8

# Since the images from the dataloader are normalized, you have to denormalize them before plotting
mean = torch.tensor([0.485, 0.456, 0.406])
std = torch.tensor([0.229, 0.224, 0.225])
img = images.squeeze().detach().cpu() * std[..., None, None] + mean[..., None, None]
img = img.permute(1, 2, 0).numpy()

# Superimpose the heatmap onto the original image
heatmap_resnet18_gradcam_cifar10 = cv2.applyColorMap(np.uint8(255 * cam_resnet18_gradcam_cifar10), cv2.COLORMAP_JET)
heatmap_resnet18_gradcam_cifar10 = cv2.cvtColor(heatmap_resnet18_gradcam_cifar10, cv2.COLOR_BGR2RGB)
superimposed_img_resnet18_gradcam_cifar10 = heatmap_resnet18_gradcam_cifar10 * 0.4 + img * 255

heatmap_resnet18_gradcam_cifar10_cutout = cv2.applyColorMap(np.uint8(255 * cam_resnet18_gradcam_cifar10_cutout), cv2.COLORMAP_JET)
heatmap_resnet18_gradcam_cifar10_cutout = cv2.cvtColor(heatmap_resnet18_gradcam_cifar10_cutout, cv2.COLOR_BGR2RGB)
superimposed_img_resnet18_gradcam_cifar10_cutout = heatmap_resnet18_gradcam_cifar10_cutout * 0.4 + img * 255

heatmap_resnet18_gradcam_cifar10_da = cv2.applyColorMap(np.uint8(255 * cam_resnet18_gradcam_cifar10_da), cv2.COLORMAP_JET)
heatmap_resnet18_gradcam_cifar10_da = cv2.cvtColor(heatmap_resnet18_gradcam_cifar10_da, cv2.COLOR_BGR2RGB)
superimposed_img_resnet18_gradcam_cifar10_da = heatmap_resnet18_gradcam_cifar10_da * 0.4 + img * 255

heatmap_resnet18_gradcam_cifar10_da_cutout = cv2.applyColorMap(np.uint8(255 * cam_resnet18_gradcam_cifar10_da_cutout), cv2.COLORMAP_JET)
heatmap_resnet18_gradcam_cifar10_da_cutout = cv2.cvtColor(heatmap_resnet18_gradcam_cifar10_da_cutout, cv2.COLOR_BGR2RGB)
superimposed_img_resnet18_gradcam_cifar10_da_cutout = heatmap_resnet18_gradcam_cifar10_da_cutout * 0.4 + img * 255

class_label = str(labels.item())

# Display the original image and the Grad-CAM
fig, ax = plt.subplots(nrows=1, ncols=5)

ax[0].imshow(img)
ax[0].set_title('(Class: ' + cifar10_classes[int(class_label)] + ')')
ax[0].axis('off')
ax[1].imshow(superimposed_img_resnet18_gradcam_cifar10 / 255)
ax[1].set_title(predicted_class_resnet18_gradcam_cifar10)
ax[1].axis('off')
ax[2].imshow(superimposed_img_resnet18_gradcam_cifar10_cutout / 255)
ax[2].set_title(cifar10_classes[target_resnet18_gradcam_cifar10_cutout.item()])
ax[2].axis('off')
ax[3].imshow(superimposed_img_resnet18_gradcam_cifar10_da / 255)
ax[3].set_title(predicted_class_resnet18_gradcam_cifar10_da)
ax[3].axis('off')
ax[4].imshow(superimposed_img_resnet18_gradcam_cifar10_da_cutout / 255)
ax[4].set_title(predicted_class_resnet18_gradcam_cifar10_da_cutout)
ax[4].axis('off')

fig.suptitle("Original Image - Grad-CAM - GC with CO - GC with DA - GC with CO&DA")
plt.show()
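The overlay step in the cell above adds a scaled heatmap to the image, so some pixel values can exceed the displayable range and `imshow` will clip them with a warning. A minimal sketch of an explicit blend-and-clip step is shown below. The helper name `overlay_heatmap` is hypothetical, and it uses a simplified red-to-blue colormap instead of OpenCV's JET colormap to stay dependency-free.

```python
import numpy as np

def overlay_heatmap(image, cam, alpha=0.4):
    """Blend a [0, 1] heatmap into a [0, 1] RGB image and clip the result.

    image: float array of shape (H, W, 3); cam: float array of shape (H, W).
    """
    # Simplified colormap (an assumption): hot regions go red, cold regions blue.
    heat_rgb = np.stack([cam, np.zeros_like(cam), 1.0 - cam], axis=-1)
    blended = image + alpha * heat_rgb
    return np.clip(blended, 0.0, 1.0)  # keep values displayable by imshow

image = np.full((2, 2, 3), 0.8)
cam = np.array([[1.0, 0.0], [0.5, 0.0]])
out = overlay_heatmap(image, cam)
print(out.min(), out.max())
```

The `alpha` value of 0.4 mirrors the weighting used in the notebook; clipping only changes pixels that were already saturated.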



Now try loading your own image, preprocessing it, and converting it into a PyTorch tensor. Choose an image that belongs to one of the CIFAR-10 classes (airplane, automobile, bird, cat, deer, dog, frog, horse, ship, or truck). The preprocessing steps should match the ones used to train the model. Suppose the image is saved as `image.jpeg`:

In [None]:
# Load the image
image_path = "image.jpeg"
image = Image.open(current_path + image_path).convert("RGB")  # force 3 channels (handles grayscale or RGBA files)

# Define the transformations: resize, to tensor, normalize (replace the mean and std with values you used for training)
preprocess = transforms.Compose([
    transforms.Resize((32, 32)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

# Preprocess the image
input_tensor = preprocess(image)
input_tensor = input_tensor.unsqueeze(0)  # add batch dimension.  C,H,W => B,C,H,W
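Normalization is a per-channel affine map, so it can be inverted exactly when an image needs to be displayed again. The sketch below checks that round trip in NumPy with the same mean and standard deviation values used in the preprocessing cell above; the function names `normalize` and `denormalize` are illustrative assumptions.

```python
import numpy as np

# Same statistics as in the transforms.Normalize call, shaped for (C, H, W) arrays.
mean = np.array([0.485, 0.456, 0.406]).reshape(3, 1, 1)
std = np.array([0.229, 0.224, 0.225]).reshape(3, 1, 1)

def normalize(chw):
    """Map pixel values in [0, 1] to the model's input scale."""
    return (chw - mean) / std

def denormalize(chw):
    """Invert normalize() so the image can be plotted."""
    return chw * std + mean

rng = np.random.default_rng(0)
pixels = rng.random((3, 32, 32))          # stand-in for a ToTensor() output
restored = denormalize(normalize(pixels))
print(np.allclose(restored, pixels))
```

If the statistics used at inference differ from those used during training, the model sees inputs on a different scale, which can change both its predictions and its Grad-CAM maps.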

Apply Grad-CAM

In [None]:
# Forward pass
resnet18_gradcam_cifar10.zero_grad()
output_resnet18_gradcam_cifar10 = resnet18_gradcam_cifar10(input_tensor)

resnet18_gradcam_cifar10_cutout.zero_grad()
output_resnet18_gradcam_cifar10_cutout = resnet18_gradcam_cifar10_cutout(input_tensor)

resnet18_gradcam_cifar10_da.zero_grad()
output_resnet18_gradcam_cifar10_da = resnet18_gradcam_cifar10_da(input_tensor)

resnet18_gradcam_cifar10_da_cutout.zero_grad()
output_resnet18_gradcam_cifar10_da_cutout = resnet18_gradcam_cifar10_da_cutout(input_tensor)

# Get the index of the largest logit (the predicted class) and backpropagate from it
target_resnet18_gradcam_cifar10 = output_resnet18_gradcam_cifar10.argmax(1)
output_resnet18_gradcam_cifar10.max().backward()

target_resnet18_gradcam_cifar10_cutout = output_resnet18_gradcam_cifar10_cutout.argmax(1)
output_resnet18_gradcam_cifar10_cutout.max().backward()

target_resnet18_gradcam_cifar10_da = output_resnet18_gradcam_cifar10_da.argmax(1)
output_resnet18_gradcam_cifar10_da.max().backward()

target_resnet18_gradcam_cifar10_da_cutout = output_resnet18_gradcam_cifar10_da_cutout.argmax(1)
output_resnet18_gradcam_cifar10_da_cutout.max().backward()

# Map the predicted class indices to the class labels
predicted_class_resnet18_gradcam_cifar10 = cifar10_classes[target_resnet18_gradcam_cifar10.item()]
predicted_class_resnet18_gradcam_cifar10_cutout = cifar10_classes[target_resnet18_gradcam_cifar10_cutout.item()]
predicted_class_resnet18_gradcam_cifar10_da = cifar10_classes[target_resnet18_gradcam_cifar10_da.item()]
predicted_class_resnet18_gradcam_cifar10_da_cutout = cifar10_classes[target_resnet18_gradcam_cifar10_da_cutout.item()]


# Get the gradients and activations
gradients_resnet18_gradcam_cifar10 = resnet18_gradcam_cifar10.gradients.detach().cpu()
activations_resnet18_gradcam_cifar10 = resnet18_gradcam_cifar10.activations.detach().cpu()

gradients_resnet18_gradcam_cifar10_cutout = resnet18_gradcam_cifar10_cutout.gradients.detach().cpu()
activations_resnet18_gradcam_cifar10_cutout = resnet18_gradcam_cifar10_cutout.activations.detach().cpu()

gradients_resnet18_gradcam_cifar10_da = resnet18_gradcam_cifar10_da.gradients.detach().cpu()
activations_resnet18_gradcam_cifar10_da = resnet18_gradcam_cifar10_da.activations.detach().cpu()

gradients_resnet18_gradcam_cifar10_da_cutout = resnet18_gradcam_cifar10_da_cutout.gradients.detach().cpu()
activations_resnet18_gradcam_cifar10_da_cutout = resnet18_gradcam_cifar10_da_cutout.activations.detach().cpu()


# Calculate the weights
weights_resnet18_gradcam_cifar10 = gradients_resnet18_gradcam_cifar10.mean(dim=(2, 3), keepdim=True)

weights_resnet18_gradcam_cifar10_cutout = gradients_resnet18_gradcam_cifar10_cutout.mean(dim=(2, 3), keepdim=True)

weights_resnet18_gradcam_cifar10_da = gradients_resnet18_gradcam_cifar10_da.mean(dim=(2, 3), keepdim=True)

weights_resnet18_gradcam_cifar10_da_cutout = gradients_resnet18_gradcam_cifar10_da_cutout.mean(dim=(2, 3), keepdim=True)

# Calculate the weighted sum of activations (Grad-CAM)
cam_resnet18_gradcam_cifar10 = (weights_resnet18_gradcam_cifar10 * activations_resnet18_gradcam_cifar10).sum(dim=1, keepdim=True)
cam_resnet18_gradcam_cifar10 = F.relu(cam_resnet18_gradcam_cifar10)  # apply ReLU to the heatmap
cam_resnet18_gradcam_cifar10 = F.interpolate(cam_resnet18_gradcam_cifar10, size=(32, 32), mode='bilinear', align_corners=False)
cam_resnet18_gradcam_cifar10 = cam_resnet18_gradcam_cifar10.squeeze().numpy()

cam_resnet18_gradcam_cifar10_cutout = (weights_resnet18_gradcam_cifar10_cutout * activations_resnet18_gradcam_cifar10_cutout).sum(dim=1, keepdim=True)
cam_resnet18_gradcam_cifar10_cutout = F.relu(cam_resnet18_gradcam_cifar10_cutout)  # apply ReLU to the heatmap
cam_resnet18_gradcam_cifar10_cutout = F.interpolate(cam_resnet18_gradcam_cifar10_cutout, size=(32, 32), mode='bilinear', align_corners=False)
cam_resnet18_gradcam_cifar10_cutout = cam_resnet18_gradcam_cifar10_cutout.squeeze().numpy()

cam_resnet18_gradcam_cifar10_da = (weights_resnet18_gradcam_cifar10_da * activations_resnet18_gradcam_cifar10_da).sum(dim=1, keepdim=True)
cam_resnet18_gradcam_cifar10_da = F.relu(cam_resnet18_gradcam_cifar10_da)  # apply ReLU to the heatmap
cam_resnet18_gradcam_cifar10_da = F.interpolate(cam_resnet18_gradcam_cifar10_da, size=(32, 32), mode='bilinear', align_corners=False)
cam_resnet18_gradcam_cifar10_da = cam_resnet18_gradcam_cifar10_da.squeeze().numpy()

cam_resnet18_gradcam_cifar10_da_cutout = (weights_resnet18_gradcam_cifar10_da_cutout * activations_resnet18_gradcam_cifar10_da_cutout).sum(dim=1, keepdim=True)
cam_resnet18_gradcam_cifar10_da_cutout = F.relu(cam_resnet18_gradcam_cifar10_da_cutout)  # apply ReLU to the heatmap
cam_resnet18_gradcam_cifar10_da_cutout = F.interpolate(cam_resnet18_gradcam_cifar10_da_cutout, size=(32, 32), mode='bilinear', align_corners=False)
cam_resnet18_gradcam_cifar10_da_cutout = cam_resnet18_gradcam_cifar10_da_cutout.squeeze().numpy()


# Normalize the heatmap to [0, 1]; the small epsilon avoids 0/0 when a map is all zeros
cam_resnet18_gradcam_cifar10 -= cam_resnet18_gradcam_cifar10.min()
cam_resnet18_gradcam_cifar10 /= cam_resnet18_gradcam_cifar10.max() + 1e-8

cam_resnet18_gradcam_cifar10_cutout -= cam_resnet18_gradcam_cifar10_cutout.min()
cam_resnet18_gradcam_cifar10_cutout /= cam_resnet18_gradcam_cifar10_cutout.max() + 1e-8

cam_resnet18_gradcam_cifar10_da -= cam_resnet18_gradcam_cifar10_da.min()
cam_resnet18_gradcam_cifar10_da /= cam_resnet18_gradcam_cifar10_da.max() + 1e-8

cam_resnet18_gradcam_cifar10_da_cutout -= cam_resnet18_gradcam_cifar10_da_cutout.min()
cam_resnet18_gradcam_cifar10_da_cutout /= cam_resnet18_gradcam_cifar10_da_cutout.max() + 1e-8

# Reload the original image with OpenCV for display: resize it, convert BGR to RGB,
# and scale it to floats in [0, 1] so that `img * 255` below does not overflow uint8
img = cv2.imread(current_path + image_path)
img = cv2.resize(img, (32, 32))
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0


# Superimpose the heatmap onto the original image
heatmap_resnet18_gradcam_cifar10 = cv2.applyColorMap(np.uint8(255 * cam_resnet18_gradcam_cifar10), cv2.COLORMAP_JET)
heatmap_resnet18_gradcam_cifar10 = cv2.cvtColor(heatmap_resnet18_gradcam_cifar10, cv2.COLOR_BGR2RGB)
superimposed_img_resnet18_gradcam_cifar10 = heatmap_resnet18_gradcam_cifar10 * 0.4 + img * 255

heatmap_resnet18_gradcam_cifar10_cutout = cv2.applyColorMap(np.uint8(255 * cam_resnet18_gradcam_cifar10_cutout), cv2.COLORMAP_JET)
heatmap_resnet18_gradcam_cifar10_cutout = cv2.cvtColor(heatmap_resnet18_gradcam_cifar10_cutout, cv2.COLOR_BGR2RGB)
superimposed_img_resnet18_gradcam_cifar10_cutout = heatmap_resnet18_gradcam_cifar10_cutout * 0.4 + img * 255

heatmap_resnet18_gradcam_cifar10_da = cv2.applyColorMap(np.uint8(255 * cam_resnet18_gradcam_cifar10_da), cv2.COLORMAP_JET)
heatmap_resnet18_gradcam_cifar10_da = cv2.cvtColor(heatmap_resnet18_gradcam_cifar10_da, cv2.COLOR_BGR2RGB)
superimposed_img_resnet18_gradcam_cifar10_da = heatmap_resnet18_gradcam_cifar10_da * 0.4 + img * 255

heatmap_resnet18_gradcam_cifar10_da_cutout = cv2.applyColorMap(np.uint8(255 * cam_resnet18_gradcam_cifar10_da_cutout), cv2.COLORMAP_JET)
heatmap_resnet18_gradcam_cifar10_da_cutout = cv2.cvtColor(heatmap_resnet18_gradcam_cifar10_da_cutout, cv2.COLOR_BGR2RGB)
superimposed_img_resnet18_gradcam_cifar10_da_cutout = heatmap_resnet18_gradcam_cifar10_da_cutout * 0.4 + img * 255

Visualize the image and the Grad-CAM heatmap

In [None]:
# Display the original image and the Grad-CAM
fig, ax = plt.subplots(nrows=1, ncols=5)

ax[0].imshow(img)
ax[0].set_title('Input image')  # a custom image has no ground-truth label from the dataset
ax[0].axis('off')
ax[1].imshow(superimposed_img_resnet18_gradcam_cifar10 / 255)
ax[1].set_title(predicted_class_resnet18_gradcam_cifar10)
ax[1].axis('off')
ax[2].imshow(superimposed_img_resnet18_gradcam_cifar10_cutout / 255)
ax[2].set_title(cifar10_classes[target_resnet18_gradcam_cifar10_cutout.item()])
ax[2].axis('off')
ax[3].imshow(superimposed_img_resnet18_gradcam_cifar10_da / 255)
ax[3].set_title(predicted_class_resnet18_gradcam_cifar10_da)
ax[3].axis('off')
ax[4].imshow(superimposed_img_resnet18_gradcam_cifar10_da_cutout / 255)
ax[4].set_title(predicted_class_resnet18_gradcam_cifar10_da_cutout)
ax[4].axis('off')

fig.suptitle("Original Image - Grad-CAM - GC with CO - GC with DA - GC with CO&DA")

plt.show()


## 4.2.2 Implementing Grad-CAM for the ResNet18 Model on CIFAR-100

In [None]:

# The models stay on the CPU in this section, so map the checkpoints to the CPU
# (this also lets checkpoints saved on a GPU load on a CPU-only machine)
resnet18_gradcam_cifar100 = ResNet18(num_classes=100)
resnet18_gradcam_cifar100.load_state_dict(torch.load(current_path + "checkpoints/resnet18_cifar100.pt", map_location="cpu"))
resnet18_gradcam_cifar100.eval()

resnet18_gradcam_cifar100_cutout = ResNet18(num_classes=100)
resnet18_gradcam_cifar100_cutout.load_state_dict(torch.load(current_path + "checkpoints/resnet18_cifar100_cutout.pt", map_location="cpu"))
resnet18_gradcam_cifar100_cutout.eval()

resnet18_gradcam_cifar100_da = ResNet18(num_classes=100)
resnet18_gradcam_cifar100_da.load_state_dict(torch.load(current_path + "checkpoints/resnet18_cifar100_da.pt", map_location="cpu"))
resnet18_gradcam_cifar100_da.eval()

resnet18_gradcam_cifar100_da_cutout = ResNet18(num_classes=100)
resnet18_gradcam_cifar100_da_cutout.load_state_dict(torch.load(current_path + "checkpoints/resnet18_cifar100_da_cutout.pt", map_location="cpu"))
resnet18_gradcam_cifar100_da_cutout.eval()

Let’s look at the result for an image drawn from the CIFAR-100 test loader

In [None]:
transform_cifar100 = transforms.Compose([
    transforms.Resize((32, 32)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

testset_cifar100 = torchvision.datasets.CIFAR100(root='./data', train=False, download=True, transform=transform_cifar100)
testloader_cifar100 = torch.utils.data.DataLoader(testset_cifar100, batch_size=1, shuffle=True, num_workers=2)


In [None]:
cifar100_classes = [
    "Apple", "Aquarium fish", "Baby", "Bear", "Beaver", "Bed", "Bee", "Beetle", 
    "Bicycle", "Bottle", "Bowl", "Boy", "Bridge", "Bus", "Butterfly", "Camel", 
    "Can", "Castle", "Caterpillar", "Cattle", "Chair", "Chimpanzee", "Clock", 
    "Cloud", "Cockroach", "Couch", "Crab", "Crocodile", "Cup", "Dinosaur", 
    "Dolphin", "Elephant", "Flatfish", "Forest", "Fox", "Girl", "Hamster", 
    "House", "Kangaroo", "Computer keyboard", "Lamp", "Lawn-mower", "Leopard", "Lion",
    "Lizard", "Lobster", "Man", "Maple tree", "Motorcycle", "Mountain", "Mouse",
    "Mushrooms", "Oak tree", "Oranges", "Orchids", "Otter", "Palm tree", "Pears",
    "Pickup truck", "Pine tree", "Plain", "Plates", "Poppies", "Porcupine",
    "Possum", "Rabbit", "Raccoon", "Ray", "Road", "Rocket", "Roses", "Sea", "Seal",
    "Shark", "Shrew", "Skunk", "Skyscraper", "Snail", "Snake", "Spider", "Squirrel",
    "Streetcar", "Sunflowers", "Sweet peppers", "Table", "Tank", "Telephone", "Television", 
    "Tiger", "Tractor", "Train", "Trout", "Tulips", "Turtle", "Wardrobe", "Whale", 
    "Willow tree", "Wolf", "Woman", "Worm"
]


In [None]:
# Get a batch from the testloader
images, labels = next(iter(testloader_cifar100))
input_tensor = images  # As your batch_size is 1, you will have a single image here

# Forward pass
resnet18_gradcam_cifar100.zero_grad()
output_resnet18_gradcam_cifar100 = resnet18_gradcam_cifar100(input_tensor)

resnet18_gradcam_cifar100_cutout.zero_grad()
output_resnet18_gradcam_cifar100_cutout = resnet18_gradcam_cifar100_cutout(input_tensor)

resnet18_gradcam_cifar100_da.zero_grad()
output_resnet18_gradcam_cifar100_da = resnet18_gradcam_cifar100_da(input_tensor)

resnet18_gradcam_cifar100_da_cutout.zero_grad()
output_resnet18_gradcam_cifar100_da_cutout = resnet18_gradcam_cifar100_da_cutout(input_tensor)

# Get the index of the largest logit (the predicted class) and backpropagate from it
target_resnet18_gradcam_cifar100 = output_resnet18_gradcam_cifar100.argmax(1)
output_resnet18_gradcam_cifar100.max().backward()

target_resnet18_gradcam_cifar100_cutout = output_resnet18_gradcam_cifar100_cutout.argmax(1)
output_resnet18_gradcam_cifar100_cutout.max().backward()

target_resnet18_gradcam_cifar100_da = output_resnet18_gradcam_cifar100_da.argmax(1)
output_resnet18_gradcam_cifar100_da.max().backward()

target_resnet18_gradcam_cifar100_da_cutout = output_resnet18_gradcam_cifar100_da_cutout.argmax(1)
output_resnet18_gradcam_cifar100_da_cutout.max().backward()

# Map the predicted class indices to the class labels
predicted_class_resnet18_gradcam_cifar100 = cifar100_classes[target_resnet18_gradcam_cifar100.item()]
predicted_class_resnet18_gradcam_cifar100_cutout = cifar100_classes[target_resnet18_gradcam_cifar100_cutout.item()]
predicted_class_resnet18_gradcam_cifar100_da = cifar100_classes[target_resnet18_gradcam_cifar100_da.item()]
predicted_class_resnet18_gradcam_cifar100_da_cutout = cifar100_classes[target_resnet18_gradcam_cifar100_da_cutout.item()]


# Get the gradients and activations
gradients_resnet18_gradcam_cifar100 = resnet18_gradcam_cifar100.gradients.detach().cpu()
activations_resnet18_gradcam_cifar100 = resnet18_gradcam_cifar100.activations.detach().cpu()

gradients_resnet18_gradcam_cifar100_cutout = resnet18_gradcam_cifar100_cutout.gradients.detach().cpu()
activations_resnet18_gradcam_cifar100_cutout = resnet18_gradcam_cifar100_cutout.activations.detach().cpu()

gradients_resnet18_gradcam_cifar100_da = resnet18_gradcam_cifar100_da.gradients.detach().cpu()
activations_resnet18_gradcam_cifar100_da = resnet18_gradcam_cifar100_da.activations.detach().cpu()

gradients_resnet18_gradcam_cifar100_da_cutout = resnet18_gradcam_cifar100_da_cutout.gradients.detach().cpu()
activations_resnet18_gradcam_cifar100_da_cutout = resnet18_gradcam_cifar100_da_cutout.activations.detach().cpu()


# Calculate the weights
weights_resnet18_gradcam_cifar100 = gradients_resnet18_gradcam_cifar100.mean(dim=(2, 3), keepdim=True)

weights_resnet18_gradcam_cifar100_cutout = gradients_resnet18_gradcam_cifar100_cutout.mean(dim=(2, 3), keepdim=True)

weights_resnet18_gradcam_cifar100_da = gradients_resnet18_gradcam_cifar100_da.mean(dim=(2, 3), keepdim=True)

weights_resnet18_gradcam_cifar100_da_cutout = gradients_resnet18_gradcam_cifar100_da_cutout.mean(dim=(2, 3), keepdim=True)

# Calculate the weighted sum of activations (Grad-CAM)
cam_resnet18_gradcam_cifar100 = (weights_resnet18_gradcam_cifar100 * activations_resnet18_gradcam_cifar100).sum(dim=1, keepdim=True)
cam_resnet18_gradcam_cifar100 = F.relu(cam_resnet18_gradcam_cifar100)  # apply ReLU to the heatmap
cam_resnet18_gradcam_cifar100 = F.interpolate(cam_resnet18_gradcam_cifar100, size=(32, 32), mode='bilinear', align_corners=False)
cam_resnet18_gradcam_cifar100 = cam_resnet18_gradcam_cifar100.squeeze().numpy()

cam_resnet18_gradcam_cifar100_cutout = (weights_resnet18_gradcam_cifar100_cutout * activations_resnet18_gradcam_cifar100_cutout).sum(dim=1, keepdim=True)
cam_resnet18_gradcam_cifar100_cutout = F.relu(cam_resnet18_gradcam_cifar100_cutout)  # apply ReLU to the heatmap
cam_resnet18_gradcam_cifar100_cutout = F.interpolate(cam_resnet18_gradcam_cifar100_cutout, size=(32, 32), mode='bilinear', align_corners=False)
cam_resnet18_gradcam_cifar100_cutout = cam_resnet18_gradcam_cifar100_cutout.squeeze().numpy()

cam_resnet18_gradcam_cifar100_da = (weights_resnet18_gradcam_cifar100_da * activations_resnet18_gradcam_cifar100_da).sum(dim=1, keepdim=True)
cam_resnet18_gradcam_cifar100_da = F.relu(cam_resnet18_gradcam_cifar100_da)  # apply ReLU to the heatmap
cam_resnet18_gradcam_cifar100_da = F.interpolate(cam_resnet18_gradcam_cifar100_da, size=(32, 32), mode='bilinear', align_corners=False)
cam_resnet18_gradcam_cifar100_da = cam_resnet18_gradcam_cifar100_da.squeeze().numpy()

cam_resnet18_gradcam_cifar100_da_cutout = (weights_resnet18_gradcam_cifar100_da_cutout * activations_resnet18_gradcam_cifar100_da_cutout).sum(dim=1, keepdim=True)
cam_resnet18_gradcam_cifar100_da_cutout = F.relu(cam_resnet18_gradcam_cifar100_da_cutout)  # apply ReLU to the heatmap
cam_resnet18_gradcam_cifar100_da_cutout = F.interpolate(cam_resnet18_gradcam_cifar100_da_cutout, size=(32, 32), mode='bilinear', align_corners=False)
cam_resnet18_gradcam_cifar100_da_cutout = cam_resnet18_gradcam_cifar100_da_cutout.squeeze().numpy()


# Normalize the heatmap to [0, 1]; the small epsilon avoids 0/0 when a map is all zeros
cam_resnet18_gradcam_cifar100 -= cam_resnet18_gradcam_cifar100.min()
cam_resnet18_gradcam_cifar100 /= cam_resnet18_gradcam_cifar100.max() + 1e-8

cam_resnet18_gradcam_cifar100_cutout -= cam_resnet18_gradcam_cifar100_cutout.min()
cam_resnet18_gradcam_cifar100_cutout /= cam_resnet18_gradcam_cifar100_cutout.max() + 1e-8

cam_resnet18_gradcam_cifar100_da -= cam_resnet18_gradcam_cifar100_da.min()
cam_resnet18_gradcam_cifar100_da /= cam_resnet18_gradcam_cifar100_da.max() + 1e-8

cam_resnet18_gradcam_cifar100_da_cutout -= cam_resnet18_gradcam_cifar100_da_cutout.min()
cam_resnet18_gradcam_cifar100_da_cutout /= cam_resnet18_gradcam_cifar100_da_cutout.max() + 1e-8

# Since the images from the dataloader are normalized, you have to denormalize them before plotting
mean = torch.tensor([0.485, 0.456, 0.406])
std = torch.tensor([0.229, 0.224, 0.225])
img = images.squeeze().detach().cpu() * std[..., None, None] + mean[..., None, None]
img = img.permute(1, 2, 0).numpy()

# Superimpose the heatmap onto the original image
heatmap_resnet18_gradcam_cifar100 = cv2.applyColorMap(np.uint8(255 * cam_resnet18_gradcam_cifar100), cv2.COLORMAP_JET)
heatmap_resnet18_gradcam_cifar100 = cv2.cvtColor(heatmap_resnet18_gradcam_cifar100, cv2.COLOR_BGR2RGB)
superimposed_img_resnet18_gradcam_cifar100 = heatmap_resnet18_gradcam_cifar100 * 0.4 + img * 255

heatmap_resnet18_gradcam_cifar100_cutout = cv2.applyColorMap(np.uint8(255 * cam_resnet18_gradcam_cifar100_cutout), cv2.COLORMAP_JET)
heatmap_resnet18_gradcam_cifar100_cutout = cv2.cvtColor(heatmap_resnet18_gradcam_cifar100_cutout, cv2.COLOR_BGR2RGB)
superimposed_img_resnet18_gradcam_cifar100_cutout = heatmap_resnet18_gradcam_cifar100_cutout * 0.4 + img * 255

heatmap_resnet18_gradcam_cifar100_da = cv2.applyColorMap(np.uint8(255 * cam_resnet18_gradcam_cifar100_da), cv2.COLORMAP_JET)
heatmap_resnet18_gradcam_cifar100_da = cv2.cvtColor(heatmap_resnet18_gradcam_cifar100_da, cv2.COLOR_BGR2RGB)
superimposed_img_resnet18_gradcam_cifar100_da = heatmap_resnet18_gradcam_cifar100_da * 0.4 + img * 255

heatmap_resnet18_gradcam_cifar100_da_cutout = cv2.applyColorMap(np.uint8(255 * cam_resnet18_gradcam_cifar100_da_cutout), cv2.COLORMAP_JET)
heatmap_resnet18_gradcam_cifar100_da_cutout = cv2.cvtColor(heatmap_resnet18_gradcam_cifar100_da_cutout, cv2.COLOR_BGR2RGB)
superimposed_img_resnet18_gradcam_cifar100_da_cutout = heatmap_resnet18_gradcam_cifar100_da_cutout * 0.4 + img * 255

class_label = str(labels.item())

# Display the original image and the Grad-CAM
fig, ax = plt.subplots(nrows=1, ncols=5)

ax[0].imshow(img)
ax[0].set_title('(Class: ' + cifar100_classes[int(class_label)] + ')')
ax[0].axis('off')
ax[1].imshow(superimposed_img_resnet18_gradcam_cifar100 / 255)
ax[1].set_title(predicted_class_resnet18_gradcam_cifar100)
ax[1].axis('off')
ax[2].imshow(superimposed_img_resnet18_gradcam_cifar100_cutout / 255)
ax[2].set_title(cifar100_classes[target_resnet18_gradcam_cifar100_cutout.item()])  # prediction of the Cutout model
ax[2].axis('off')
ax[3].imshow(superimposed_img_resnet18_gradcam_cifar100_da / 255)
ax[3].set_title(predicted_class_resnet18_gradcam_cifar100_da)
ax[3].axis('off')
ax[4].imshow(superimposed_img_resnet18_gradcam_cifar100_da_cutout / 255)
ax[4].set_title(predicted_class_resnet18_gradcam_cifar100_da_cutout)
ax[4].axis('off')

fig.suptitle("Original image | Grad-CAM: baseline | Cutout | DA | Cutout + DA")
plt.show()
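
The four heatmap computations above follow the same steps: average the gradients per channel, take the weighted sum of the activation maps, apply ReLU, upsample, and min-max normalize. They can be folded into one helper. The following is a minimal sketch and not the notebook's own code: the function name `grad_cam_heatmap`, the `eps` guard against an all-zero map, and the random stand-in tensors (512 channels on a 4x4 map) are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F


def grad_cam_heatmap(activations, gradients, size=(32, 32), eps=1e-8):
    """Return a Grad-CAM heatmap in [0, 1] as a 2-D NumPy array.

    activations, gradients: tensors of shape (1, C, H, W) captured by the hooks.
    """
    # Channel weights: global-average-pool the gradients over the spatial dimensions
    weights = gradients.mean(dim=(2, 3), keepdim=True)
    # Weighted sum of the activation maps, followed by ReLU
    cam = F.relu((weights * activations).sum(dim=1, keepdim=True))
    # Upsample to the input resolution
    cam = F.interpolate(cam, size=size, mode='bilinear', align_corners=False)
    cam = cam.squeeze().detach().cpu().numpy()
    # Min-max normalize; eps avoids a division by zero when the map is all zeros
    cam = cam - cam.min()
    return cam / (cam.max() + eps)


# Random stand-ins for the hooked tensors (hypothetical shapes)
torch.manual_seed(0)
demo_cam = grad_cam_heatmap(torch.rand(1, 512, 4, 4), torch.randn(1, 512, 4, 4))
zero_cam = grad_cam_heatmap(torch.zeros(1, 4, 2, 2), torch.zeros(1, 4, 2, 2))
print(demo_cam.shape)
```

With such a helper, each model variant needs a single call, for example `grad_cam_heatmap(model.activations, model.gradients)`, which also makes copy-paste slips between variants less likely.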



Now try loading your own image, preprocessing it, and converting it into a PyTorch tensor. Since the models in this section were trained on CIFAR-100, choose an image whose subject belongs to one of the CIFAR-100 classes (for example, apple, bicycle, or tiger). The preprocessing steps should match the ones used when training the model. Suppose the image is saved as `image.jpeg`:

In [None]:
# Load the image
image_path = "image.jpeg"
image = Image.open(current_path + image_path).convert("RGB")  # force 3 channels (handles grayscale or RGBA files)

# Define the transformations: resize, to tensor, normalize (replace the mean and std with values you used for training)
preprocess = transforms.Compose([
    transforms.Resize((32, 32)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

# Preprocess the image
input_tensor = preprocess(image)
input_tensor = input_tensor.unsqueeze(0)  # add batch dimension.  C,H,W => B,C,H,W
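
To make the preprocessing concrete, here is a NumPy-only sketch of what `ToTensor`, `Normalize`, and `unsqueeze(0)` compute, together with the inverse step that is needed before plotting. The helper names `to_normalized_batch` and `denormalize` and the random stand-in image are assumptions for illustration; the mean and std are the values used in the cell above.

```python
import numpy as np

mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
std = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def to_normalized_batch(img_hwc_uint8):
    """Mimic ToTensor + Normalize + unsqueeze(0) on an H x W x 3 uint8 RGB array."""
    x = img_hwc_uint8.astype(np.float32) / 255.0   # ToTensor: scale to [0, 1]
    x = (x - mean) / std                           # Normalize: per-channel standardization
    x = x.transpose(2, 0, 1)                       # H,W,C => C,H,W
    return x[None, ...]                            # add batch dimension => B,C,H,W


def denormalize(batch):
    """Invert to_normalized_batch, returning an H x W x 3 float array in [0, 1]."""
    x = batch[0].transpose(1, 2, 0)                # C,H,W => H,W,C
    return x * std + mean


rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)  # random stand-in image
batch = to_normalized_batch(rgb)
restored = denormalize(batch)
print(batch.shape)
```

The round trip recovers the original pixel values, which is why the same mean and std must be used for normalizing and for de-normalizing.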

Apply Grad-CAM

In [None]:
# Forward pass
resnet18_gradcam_cifar100.zero_grad()
output_resnet18_gradcam_cifar100 = resnet18_gradcam_cifar100(input_tensor)

resnet18_gradcam_cifar100_cutout.zero_grad()
output_resnet18_gradcam_cifar100_cutout = resnet18_gradcam_cifar100_cutout(input_tensor)

resnet18_gradcam_cifar100_da.zero_grad()
output_resnet18_gradcam_cifar100_da = resnet18_gradcam_cifar100_da(input_tensor)

resnet18_gradcam_cifar100_da_cutout.zero_grad()
output_resnet18_gradcam_cifar100_da_cutout = resnet18_gradcam_cifar100_da_cutout(input_tensor)

# Get the predicted class index (argmax of the logits) and backpropagate the top logit
target_resnet18_gradcam_cifar100 = output_resnet18_gradcam_cifar100.argmax(1)
output_resnet18_gradcam_cifar100.max().backward()

target_resnet18_gradcam_cifar100_cutout = output_resnet18_gradcam_cifar100_cutout.argmax(1)
output_resnet18_gradcam_cifar100_cutout.max().backward()

target_resnet18_gradcam_cifar100_da = output_resnet18_gradcam_cifar100_da.argmax(1)
output_resnet18_gradcam_cifar100_da.max().backward()

target_resnet18_gradcam_cifar100_da_cutout = output_resnet18_gradcam_cifar100_da_cutout.argmax(1)
output_resnet18_gradcam_cifar100_da_cutout.max().backward()

# Map the predicted class indices to the class labels
predicted_class_resnet18_gradcam_cifar100 = cifar100_classes[target_resnet18_gradcam_cifar100.item()]
predicted_class_resnet18_gradcam_cifar100_cutout = cifar100_classes[target_resnet18_gradcam_cifar100_cutout.item()]
predicted_class_resnet18_gradcam_cifar100_da = cifar100_classes[target_resnet18_gradcam_cifar100_da.item()]
predicted_class_resnet18_gradcam_cifar100_da_cutout = cifar100_classes[target_resnet18_gradcam_cifar100_da_cutout.item()]


# Get the gradients and activations
gradients_resnet18_gradcam_cifar100 = resnet18_gradcam_cifar100.gradients.detach().cpu()
activations_resnet18_gradcam_cifar100 = resnet18_gradcam_cifar100.activations.detach().cpu()

gradients_resnet18_gradcam_cifar100_cutout = resnet18_gradcam_cifar100_cutout.gradients.detach().cpu()
activations_resnet18_gradcam_cifar100_cutout = resnet18_gradcam_cifar100_cutout.activations.detach().cpu()

gradients_resnet18_gradcam_cifar100_da = resnet18_gradcam_cifar100_da.gradients.detach().cpu()
activations_resnet18_gradcam_cifar100_da = resnet18_gradcam_cifar100_da.activations.detach().cpu()

gradients_resnet18_gradcam_cifar100_da_cutout = resnet18_gradcam_cifar100_da_cutout.gradients.detach().cpu()
activations_resnet18_gradcam_cifar100_da_cutout = resnet18_gradcam_cifar100_da_cutout.activations.detach().cpu()


# Calculate the weights
weights_resnet18_gradcam_cifar100 = gradients_resnet18_gradcam_cifar100.mean(dim=(2, 3), keepdim=True)

weights_resnet18_gradcam_cifar100_cutout = gradients_resnet18_gradcam_cifar100_cutout.mean(dim=(2, 3), keepdim=True)

weights_resnet18_gradcam_cifar100_da = gradients_resnet18_gradcam_cifar100_da.mean(dim=(2, 3), keepdim=True)

weights_resnet18_gradcam_cifar100_da_cutout = gradients_resnet18_gradcam_cifar100_da_cutout.mean(dim=(2, 3), keepdim=True)

# Calculate the weighted sum of activations (Grad-CAM)
cam_resnet18_gradcam_cifar100 = (weights_resnet18_gradcam_cifar100 * activations_resnet18_gradcam_cifar100).sum(dim=1, keepdim=True)
cam_resnet18_gradcam_cifar100 = F.relu(cam_resnet18_gradcam_cifar100)  # apply ReLU to the heatmap
cam_resnet18_gradcam_cifar100 = F.interpolate(cam_resnet18_gradcam_cifar100, size=(32, 32), mode='bilinear', align_corners=False)
cam_resnet18_gradcam_cifar100 = cam_resnet18_gradcam_cifar100.squeeze().numpy()

cam_resnet18_gradcam_cifar100_cutout = (weights_resnet18_gradcam_cifar100_cutout * activations_resnet18_gradcam_cifar100_cutout).sum(dim=1, keepdim=True)
cam_resnet18_gradcam_cifar100_cutout = F.relu(cam_resnet18_gradcam_cifar100_cutout)  # apply ReLU to the heatmap
cam_resnet18_gradcam_cifar100_cutout = F.interpolate(cam_resnet18_gradcam_cifar100_cutout, size=(32, 32), mode='bilinear', align_corners=False)
cam_resnet18_gradcam_cifar100_cutout = cam_resnet18_gradcam_cifar100_cutout.squeeze().numpy()

cam_resnet18_gradcam_cifar100_da = (weights_resnet18_gradcam_cifar100_da * activations_resnet18_gradcam_cifar100_da).sum(dim=1, keepdim=True)
cam_resnet18_gradcam_cifar100_da = F.relu(cam_resnet18_gradcam_cifar100_da)  # apply ReLU to the heatmap
cam_resnet18_gradcam_cifar100_da = F.interpolate(cam_resnet18_gradcam_cifar100_da, size=(32, 32), mode='bilinear', align_corners=False)
cam_resnet18_gradcam_cifar100_da = cam_resnet18_gradcam_cifar100_da.squeeze().numpy()

cam_resnet18_gradcam_cifar100_da_cutout = (weights_resnet18_gradcam_cifar100_da_cutout * activations_resnet18_gradcam_cifar100_da_cutout).sum(dim=1, keepdim=True)
cam_resnet18_gradcam_cifar100_da_cutout = F.relu(cam_resnet18_gradcam_cifar100_da_cutout)  # apply ReLU to the heatmap
cam_resnet18_gradcam_cifar100_da_cutout = F.interpolate(cam_resnet18_gradcam_cifar100_da_cutout, size=(32, 32), mode='bilinear', align_corners=False)
cam_resnet18_gradcam_cifar100_da_cutout = cam_resnet18_gradcam_cifar100_da_cutout.squeeze().numpy()


# Normalize the heatmap
cam_resnet18_gradcam_cifar100 -= cam_resnet18_gradcam_cifar100.min()
cam_resnet18_gradcam_cifar100 /= cam_resnet18_gradcam_cifar100.max()

cam_resnet18_gradcam_cifar100_cutout -= cam_resnet18_gradcam_cifar100_cutout.min()
cam_resnet18_gradcam_cifar100_cutout /= cam_resnet18_gradcam_cifar100_cutout.max()

cam_resnet18_gradcam_cifar100_da -= cam_resnet18_gradcam_cifar100_da.min()
cam_resnet18_gradcam_cifar100_da /= cam_resnet18_gradcam_cifar100_da.max()

cam_resnet18_gradcam_cifar100_da_cutout -= cam_resnet18_gradcam_cifar100_da_cutout.min()
cam_resnet18_gradcam_cifar100_da_cutout /= cam_resnet18_gradcam_cifar100_da_cutout.max()

# Load the original image for plotting and scale it to [0, 1] floats, so that `img * 255` below does not overflow uint8
img = cv2.imread(current_path + image_path)
img = cv2.resize(img, (32, 32))
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0


# Superimpose the heatmap onto the original image
heatmap_resnet18_gradcam_cifar100 = cv2.applyColorMap(np.uint8(255 * cam_resnet18_gradcam_cifar100), cv2.COLORMAP_JET)
heatmap_resnet18_gradcam_cifar100 = cv2.cvtColor(heatmap_resnet18_gradcam_cifar100, cv2.COLOR_BGR2RGB)
superimposed_img_resnet18_gradcam_cifar100 = heatmap_resnet18_gradcam_cifar100 * 0.4 + img * 255

heatmap_resnet18_gradcam_cifar100_cutout = cv2.applyColorMap(np.uint8(255 * cam_resnet18_gradcam_cifar100_cutout), cv2.COLORMAP_JET)
heatmap_resnet18_gradcam_cifar100_cutout = cv2.cvtColor(heatmap_resnet18_gradcam_cifar100_cutout, cv2.COLOR_BGR2RGB)
superimposed_img_resnet18_gradcam_cifar100_cutout = heatmap_resnet18_gradcam_cifar100_cutout * 0.4 + img * 255

heatmap_resnet18_gradcam_cifar100_da = cv2.applyColorMap(np.uint8(255 * cam_resnet18_gradcam_cifar100_da), cv2.COLORMAP_JET)
heatmap_resnet18_gradcam_cifar100_da = cv2.cvtColor(heatmap_resnet18_gradcam_cifar100_da, cv2.COLOR_BGR2RGB)
superimposed_img_resnet18_gradcam_cifar100_da = heatmap_resnet18_gradcam_cifar100_da * 0.4 + img * 255

heatmap_resnet18_gradcam_cifar100_da_cutout = cv2.applyColorMap(np.uint8(255 * cam_resnet18_gradcam_cifar100_da_cutout), cv2.COLORMAP_JET)
heatmap_resnet18_gradcam_cifar100_da_cutout = cv2.cvtColor(heatmap_resnet18_gradcam_cifar100_da_cutout, cv2.COLOR_BGR2RGB)
superimposed_img_resnet18_gradcam_cifar100_da_cutout = heatmap_resnet18_gradcam_cifar100_da_cutout * 0.4 + img * 255

class_label = str(labels.item())
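
An image loaded with `cv2.imread` is a `uint8` array with values in [0, 255], whereas an image taken from the dataloader and de-normalized is a float array in [0, 1]. Arithmetic on `uint8` arrays wraps around modulo 256, so the overlay step has to be careful about the data type. The following is a small sketch of a blending helper that accepts either kind of input; the name `overlay`, the `alpha` default, and the pure-red stand-in heatmap are assumptions for illustration and do not replace `cv2.applyColorMap`.

```python
import numpy as np


def overlay(img_rgb, heatmap_rgb, alpha=0.4):
    """Blend a colour heatmap onto an image; both inputs may be uint8 or float.

    Returns a float array in [0, 1] that plt.imshow can display directly.
    """
    def as_unit_float(a):
        a = np.asarray(a)
        # uint8 arrays are in [0, 255]; float arrays are assumed to be in [0, 1] already
        if a.dtype == np.uint8:
            return a.astype(np.float32) / 255.0
        return a.astype(np.float32)

    blended = as_unit_float(img_rgb) + alpha * as_unit_float(heatmap_rgb)
    return np.clip(blended, 0.0, 1.0)


# uint8 arithmetic wraps around: 2 * 255 = 510, and 510 mod 256 = 254
pixel = np.array([2], dtype=np.uint8)
wrapped = pixel * np.uint8(255)

img = np.full((32, 32, 3), 200, dtype=np.uint8)   # stand-in for a cv2-loaded image
heat = np.zeros((32, 32, 3), dtype=np.uint8)
heat[..., 0] = 255                                # stand-in heatmap (pure red)
out = overlay(img, heat)
print(out.dtype, out.min(), out.max())
```

Clipping to [0, 1] also avoids the warning that `plt.imshow` emits when float values fall outside its valid range.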

Visualize the image and the Grad-CAM heatmap

In [None]:
# Display the original image and the Grad-CAM
fig, ax = plt.subplots(nrows=1, ncols=5)

ax[0].imshow(img)
ax[0].set_title('Input image')  # a custom image has no ground-truth label; `labels` belongs to the test-set sample
ax[0].axis('off')
ax[1].imshow(superimposed_img_resnet18_gradcam_cifar100 / 255)
ax[1].set_title(predicted_class_resnet18_gradcam_cifar100)
ax[1].axis('off')
ax[2].imshow(superimposed_img_resnet18_gradcam_cifar100_cutout / 255)
ax[2].set_title(cifar100_classes[target_resnet18_gradcam_cifar100_cutout.item()])  # prediction of the Cutout model
ax[2].axis('off')
ax[3].imshow(superimposed_img_resnet18_gradcam_cifar100_da / 255)
ax[3].set_title(predicted_class_resnet18_gradcam_cifar100_da)
ax[3].axis('off')
ax[4].imshow(superimposed_img_resnet18_gradcam_cifar100_da_cutout / 255)
ax[4].set_title(predicted_class_resnet18_gradcam_cifar100_da_cutout)
ax[4].axis('off')

fig.suptitle("Original image | Grad-CAM: baseline | Cutout | DA | Cutout + DA")

plt.show()


## 4.3 Implementation Grad-CAM for WideResNet Model

### WideResNet Code

In [None]:
# WideResNet

# From https://github.com/uoguelph-mlrg/Cutout/blob/master/model/wide_resnet.py

class BasicBlockWide(nn.Module):
    def __init__(self, in_planes, out_planes, stride, dropRate=0.0):
        super(BasicBlockWide, self).__init__()
        self.bn1 = nn.BatchNorm2d(in_planes)
        self.relu1 = nn.ReLU(inplace=True)
        self.conv1 = nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                               padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_planes)
        self.relu2 = nn.ReLU(inplace=True)
        self.conv2 = nn.Conv2d(out_planes, out_planes, kernel_size=3, stride=1,
                               padding=1, bias=False)
        self.droprate = dropRate
        self.equalInOut = (in_planes == out_planes)
        self.convShortcut = (not self.equalInOut) and nn.Conv2d(in_planes, out_planes, kernel_size=1, stride=stride,
                               padding=0, bias=False) or None
    def forward(self, x):
        if not self.equalInOut:
            x = self.relu1(self.bn1(x))
        else:
            out = self.relu1(self.bn1(x))
        out = self.relu2(self.bn2(self.conv1(out if self.equalInOut else x)))
        if self.droprate > 0:
            out = F.dropout(out, p=self.droprate, training=self.training)
        out = self.conv2(out)
        return torch.add(x if self.equalInOut else self.convShortcut(x), out)

class NetworkBlock(nn.Module):
    def __init__(self, nb_layers, in_planes, out_planes, block, stride, dropRate=0.0):
        super(NetworkBlock, self).__init__()
        self.layer = self._make_layer(block, in_planes, out_planes, nb_layers, stride, dropRate)
    def _make_layer(self, block, in_planes, out_planes, nb_layers, stride, dropRate):
        layers = []
        for i in range(nb_layers):
            layers.append(block(i == 0 and in_planes or out_planes, out_planes, i == 0 and stride or 1, dropRate))
        return nn.Sequential(*layers)
    def forward(self, x):
        return self.layer(x)

class WideResNet(nn.Module):
    def __init__(self, depth, num_classes, widen_factor=1, dropRate=0.0):
        super(WideResNet, self).__init__()
        nChannels = [16, 16*widen_factor, 32*widen_factor, 64*widen_factor]
        assert((depth - 4) % 6 == 0)
        n = (depth - 4) // 6
        block = BasicBlockWide
        # 1st conv before any network block
        self.conv1 = nn.Conv2d(3, nChannels[0], kernel_size=3, stride=1,
                               padding=1, bias=False)
        # 1st block
        self.block1 = NetworkBlock(n, nChannels[0], nChannels[1], block, 1, dropRate)
        # 2nd block
        self.block2 = NetworkBlock(n, nChannels[1], nChannels[2], block, 2, dropRate)
        # 3rd block
        self.block3 = NetworkBlock(n, nChannels[2], nChannels[3], block, 2, dropRate)
        # global average pooling and classifier
        self.bn1 = nn.BatchNorm2d(nChannels[3])
        self.relu = nn.ReLU(inplace=True)
        self.fc = nn.Linear(nChannels[3], num_classes)
        self.nChannels = nChannels[3]

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                m.weight.data.normal_(0, math.sqrt(2. / n))
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()
            elif isinstance(m, nn.Linear):
                m.bias.data.zero_()

        # Register hooks for Grad-CAM
        self.gradients = None
        self.activations = None
        self.block3.register_forward_hook(self._store_activations_hook)
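        # register_backward_hook is deprecated; the full backward hook reports grad_output for the whole block
        self.block3.register_full_backward_hook(self._store_gradients_hook)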

    def _store_activations_hook(self, module, input, output):
        self.activations = output

    def _store_gradients_hook(self, module, grad_input, grad_output):
        self.gradients = grad_output[0]

    def forward(self, x):
        out = self.conv1(x)
        out = self.block1(out)
        out = self.block2(out)
        out = self.block3(out)
        out = self.relu(self.bn1(out))

        out = F.avg_pool2d(out, 8)
        out = out.view(-1, self.nChannels)
        out = self.fc(out)
        return out
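
The hook mechanism used by `WideResNet` can be tried in isolation on a toy network. The sketch below is hypothetical (the class `TinyHookedNet` and its layer sizes are made up for illustration): a forward hook stores the output of one layer, and a full backward hook stores the gradient of the class score with respect to that output. It assumes a recent PyTorch release in which `register_full_backward_hook` is available.

```python
import torch
import torch.nn as nn


class TinyHookedNet(nn.Module):
    """Toy CNN (hypothetical, for illustration) that records what Grad-CAM needs."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.stem = nn.Conv2d(3, 4, kernel_size=3, padding=1)
        self.features = nn.Conv2d(4, 8, kernel_size=3, padding=1)
        self.fc = nn.Linear(8, num_classes)
        self.activations = None
        self.gradients = None
        # Forward hook: called with the layer output during every forward pass
        self.features.register_forward_hook(self._save_activations)
        # Full backward hook: called with the gradient w.r.t. the layer output
        self.features.register_full_backward_hook(self._save_gradients)

    def _save_activations(self, module, inputs, output):
        self.activations = output

    def _save_gradients(self, module, grad_input, grad_output):
        self.gradients = grad_output[0]

    def forward(self, x):
        out = torch.relu(self.features(self.stem(x)))
        out = out.mean(dim=(2, 3))          # global average pooling
        return self.fc(out)


torch.manual_seed(0)
net = TinyHookedNet().eval()
x = torch.randn(1, 3, 32, 32)
logits = net(x)
top_class = logits.argmax(1).item()
logits[0, top_class].backward()             # backpropagate the top-class score
print(net.activations.shape, net.gradients.shape)
```

Both stored tensors have the shape of the hooked layer's output, which is what the per-channel averaging in Grad-CAM relies on.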

## 4.3.1 Implementation Grad-CAM for WideResNet Model for CIFAR-10

In [None]:

# map_location='cpu' lets checkpoints that were saved on a GPU load on a CPU-only machine
wideresnet_gradcam_cifar10 = WideResNet(depth=28, num_classes=10, widen_factor=10, dropRate=0.3)
wideresnet_gradcam_cifar10.load_state_dict(torch.load(current_path + "checkpoints/wideresnet_cifar10.pt", map_location='cpu'))
wideresnet_gradcam_cifar10.eval()

wideresnet_gradcam_cifar10_cutout = WideResNet(depth=28, num_classes=10, widen_factor=10, dropRate=0.3)
wideresnet_gradcam_cifar10_cutout.load_state_dict(torch.load(current_path + "checkpoints/wideresnet_cifar10_cutout.pt", map_location='cpu'))
wideresnet_gradcam_cifar10_cutout.eval()

wideresnet_gradcam_cifar10_da = WideResNet(depth=28, num_classes=10, widen_factor=10, dropRate=0.3)
wideresnet_gradcam_cifar10_da.load_state_dict(torch.load(current_path + "checkpoints/wideresnet_cifar10_da.pt", map_location='cpu'))
wideresnet_gradcam_cifar10_da.eval()

wideresnet_gradcam_cifar10_da_cutout = WideResNet(depth=28, num_classes=10, widen_factor=10, dropRate=0.3)
wideresnet_gradcam_cifar10_da_cutout.load_state_dict(torch.load(current_path + "checkpoints/wideresnet_cifar10_da_cutout.pt", map_location='cpu'))
wideresnet_gradcam_cifar10_da_cutout.eval()
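
The constructor arguments `depth=28` and `widen_factor=10` determine the shape of the feature map that the Grad-CAM hooks on `block3` will see. The arithmetic from the `WideResNet` class can be reproduced in a few lines; the helper name `wideresnet_summary` is an assumption for illustration.

```python
def wideresnet_summary(depth, widen_factor, input_size=32):
    """Derive block count, channel widths and final map size from the constructor arguments."""
    assert (depth - 4) % 6 == 0, "depth must be of the form 6n + 4"
    n = (depth - 4) // 6                      # BasicBlockWide units per group
    channels = [16, 16 * widen_factor, 32 * widen_factor, 64 * widen_factor]
    # block1 uses stride 1; block2 and block3 each halve the spatial resolution
    final_map = input_size // 2 // 2
    return {"blocks_per_group": n, "channels": channels, "final_map": final_map}


summary = wideresnet_summary(depth=28, widen_factor=10)
print(summary)
# {'blocks_per_group': 4, 'channels': [16, 160, 320, 640], 'final_map': 8}
```

For a 32x32 input, `block3` therefore outputs 640 channels on an 8x8 grid, which matches the `F.avg_pool2d(out, 8)` call in `forward` and is the resolution that Grad-CAM upsamples back to 32x32.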

Let’s look at the result for a sample drawn from the CIFAR-10 test loader.

In [None]:
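# Note: the mean and std below are the ImageNet channel statistics; use the values the checkpoints were trained with
transform_cifar10 = transforms.Compose([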
    transforms.Resize((32, 32)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

testset_cifar10 = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform_cifar10)
testloader_cifar10 = torch.utils.data.DataLoader(testset_cifar10, batch_size=1, shuffle=True, num_workers=2)


In [None]:
cifar10_classes = [
    "Airplane", "Automobile", "Bird", "Cat", "Deer",
    "Dog", "Frog", "Horse", "Ship", "Truck"
]
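
The class list is used to turn the index of the largest logit into a readable label. If you also want to see how confident a model is, the logits can be passed through a softmax and the top few classes listed. This is a NumPy-only sketch; the helper `top_k_predictions` and the example logits are made up for illustration.

```python
import numpy as np


def top_k_predictions(logits, class_names, k=3):
    """Return the k most likely (class name, probability) pairs for one logit vector."""
    logits = np.asarray(logits, dtype=np.float64)
    shifted = logits - logits.max()            # subtract the max for numerical stability
    probs = np.exp(shifted) / np.exp(shifted).sum()
    order = np.argsort(probs)[::-1][:k]        # indices sorted by decreasing probability
    return [(class_names[i], float(probs[i])) for i in order]


demo_classes = [
    "Airplane", "Automobile", "Bird", "Cat", "Deer",
    "Dog", "Frog", "Horse", "Ship", "Truck"
]
demo_logits = [0.1, 0.2, 0.3, 4.0, 0.5, 2.5, 0.0, 0.1, 0.2, 0.3]  # hypothetical model output
top3 = top_k_predictions(demo_logits, demo_classes)
for name, prob in top3:
    print(f"{name}: {prob:.3f}")
```

For a model output of shape `(1, 10)`, the call would be `top_k_predictions(output[0].detach().cpu().numpy(), cifar10_classes)`.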

In [None]:
# Get a batch from the testloader
images, labels = next(iter(testloader_cifar10))
input_tensor = images  # As your batch_size is 1, you will have a single image here

# Forward pass
wideresnet_gradcam_cifar10.zero_grad()
output_wideresnet_gradcam_cifar10 = wideresnet_gradcam_cifar10(input_tensor)

wideresnet_gradcam_cifar10_cutout.zero_grad()
output_wideresnet_gradcam_cifar10_cutout = wideresnet_gradcam_cifar10_cutout(input_tensor)

wideresnet_gradcam_cifar10_da.zero_grad()
output_wideresnet_gradcam_cifar10_da = wideresnet_gradcam_cifar10_da(input_tensor)

wideresnet_gradcam_cifar10_da_cutout.zero_grad()
output_wideresnet_gradcam_cifar10_da_cutout = wideresnet_gradcam_cifar10_da_cutout(input_tensor)

# Get the predicted class index (argmax of the logits) and backpropagate the top logit
target_wideresnet_gradcam_cifar10 = output_wideresnet_gradcam_cifar10.argmax(1)
output_wideresnet_gradcam_cifar10.max().backward()

target_wideresnet_gradcam_cifar10_cutout = output_wideresnet_gradcam_cifar10_cutout.argmax(1)
output_wideresnet_gradcam_cifar10_cutout.max().backward()

target_wideresnet_gradcam_cifar10_da = output_wideresnet_gradcam_cifar10_da.argmax(1)
output_wideresnet_gradcam_cifar10_da.max().backward()

target_wideresnet_gradcam_cifar10_da_cutout = output_wideresnet_gradcam_cifar10_da_cutout.argmax(1)
output_wideresnet_gradcam_cifar10_da_cutout.max().backward()

# Map the predicted class indices to the class labels
predicted_class_wideresnet_gradcam_cifar10 = cifar10_classes[target_wideresnet_gradcam_cifar10.item()]
predicted_class_wideresnet_gradcam_cifar10_cutout = cifar10_classes[target_wideresnet_gradcam_cifar10_cutout.item()]
predicted_class_wideresnet_gradcam_cifar10_da = cifar10_classes[target_wideresnet_gradcam_cifar10_da.item()]
predicted_class_wideresnet_gradcam_cifar10_da_cutout = cifar10_classes[target_wideresnet_gradcam_cifar10_da_cutout.item()]


# Get the gradients and activations
gradients_wideresnet_gradcam_cifar10 = wideresnet_gradcam_cifar10.gradients.detach().cpu()
activations_wideresnet_gradcam_cifar10 = wideresnet_gradcam_cifar10.activations.detach().cpu()

gradients_wideresnet_gradcam_cifar10_cutout = wideresnet_gradcam_cifar10_cutout.gradients.detach().cpu()
activations_wideresnet_gradcam_cifar10_cutout = wideresnet_gradcam_cifar10_cutout.activations.detach().cpu()

gradients_wideresnet_gradcam_cifar10_da = wideresnet_gradcam_cifar10_da.gradients.detach().cpu()
activations_wideresnet_gradcam_cifar10_da = wideresnet_gradcam_cifar10_da.activations.detach().cpu()

gradients_wideresnet_gradcam_cifar10_da_cutout = wideresnet_gradcam_cifar10_da_cutout.gradients.detach().cpu()
activations_wideresnet_gradcam_cifar10_da_cutout = wideresnet_gradcam_cifar10_da_cutout.activations.detach().cpu()


# Calculate the weights
weights_wideresnet_gradcam_cifar10 = gradients_wideresnet_gradcam_cifar10.mean(dim=(2, 3), keepdim=True)

weights_wideresnet_gradcam_cifar10_cutout = gradients_wideresnet_gradcam_cifar10_cutout.mean(dim=(2, 3), keepdim=True)

weights_wideresnet_gradcam_cifar10_da = gradients_wideresnet_gradcam_cifar10_da.mean(dim=(2, 3), keepdim=True)

weights_wideresnet_gradcam_cifar10_da_cutout = gradients_wideresnet_gradcam_cifar10_da_cutout.mean(dim=(2, 3), keepdim=True)

# Calculate the weighted sum of activations (Grad-CAM)
cam_wideresnet_gradcam_cifar10 = (weights_wideresnet_gradcam_cifar10 * activations_wideresnet_gradcam_cifar10).sum(dim=1, keepdim=True)
cam_wideresnet_gradcam_cifar10 = F.relu(cam_wideresnet_gradcam_cifar10)  # apply ReLU to the heatmap
cam_wideresnet_gradcam_cifar10 = F.interpolate(cam_wideresnet_gradcam_cifar10, size=(32, 32), mode='bilinear', align_corners=False)
cam_wideresnet_gradcam_cifar10 = cam_wideresnet_gradcam_cifar10.squeeze().numpy()

cam_wideresnet_gradcam_cifar10_cutout = (weights_wideresnet_gradcam_cifar10_cutout * activations_wideresnet_gradcam_cifar10_cutout).sum(dim=1, keepdim=True)
cam_wideresnet_gradcam_cifar10_cutout = F.relu(cam_wideresnet_gradcam_cifar10_cutout)  # apply ReLU to the heatmap
cam_wideresnet_gradcam_cifar10_cutout = F.interpolate(cam_wideresnet_gradcam_cifar10_cutout, size=(32, 32), mode='bilinear', align_corners=False)
cam_wideresnet_gradcam_cifar10_cutout = cam_wideresnet_gradcam_cifar10_cutout.squeeze().numpy()

cam_wideresnet_gradcam_cifar10_da = (weights_wideresnet_gradcam_cifar10_da * activations_wideresnet_gradcam_cifar10_da).sum(dim=1, keepdim=True)
cam_wideresnet_gradcam_cifar10_da = F.relu(cam_wideresnet_gradcam_cifar10_da)  # apply ReLU to the heatmap
cam_wideresnet_gradcam_cifar10_da = F.interpolate(cam_wideresnet_gradcam_cifar10_da, size=(32, 32), mode='bilinear', align_corners=False)
cam_wideresnet_gradcam_cifar10_da = cam_wideresnet_gradcam_cifar10_da.squeeze().numpy()

cam_wideresnet_gradcam_cifar10_da_cutout = (weights_wideresnet_gradcam_cifar10_da_cutout * activations_wideresnet_gradcam_cifar10_da_cutout).sum(dim=1, keepdim=True)
cam_wideresnet_gradcam_cifar10_da_cutout = F.relu(cam_wideresnet_gradcam_cifar10_da_cutout)  # apply ReLU to the heatmap
cam_wideresnet_gradcam_cifar10_da_cutout = F.interpolate(cam_wideresnet_gradcam_cifar10_da_cutout, size=(32, 32), mode='bilinear', align_corners=False)
cam_wideresnet_gradcam_cifar10_da_cutout = cam_wideresnet_gradcam_cifar10_da_cutout.squeeze().numpy()


# Normalize the heatmap
cam_wideresnet_gradcam_cifar10 -= cam_wideresnet_gradcam_cifar10.min()
cam_wideresnet_gradcam_cifar10 /= cam_wideresnet_gradcam_cifar10.max()

cam_wideresnet_gradcam_cifar10_cutout -= cam_wideresnet_gradcam_cifar10_cutout.min()
cam_wideresnet_gradcam_cifar10_cutout /= cam_wideresnet_gradcam_cifar10_cutout.max()

cam_wideresnet_gradcam_cifar10_da -= cam_wideresnet_gradcam_cifar10_da.min()
cam_wideresnet_gradcam_cifar10_da /= cam_wideresnet_gradcam_cifar10_da.max()

cam_wideresnet_gradcam_cifar10_da_cutout -= cam_wideresnet_gradcam_cifar10_da_cutout.min()
cam_wideresnet_gradcam_cifar10_da_cutout /= cam_wideresnet_gradcam_cifar10_da_cutout.max()

# Since the images from the dataloader are normalized, you have to denormalize them before plotting
mean = torch.tensor([0.485, 0.456, 0.406])
std = torch.tensor([0.229, 0.224, 0.225])
img = images.squeeze().detach().cpu() * std[..., None, None] + mean[..., None, None]
img = img.permute(1, 2, 0).numpy()

# Superimpose the heatmap onto the original image
heatmap_wideresnet_gradcam_cifar10 = cv2.applyColorMap(np.uint8(255 * cam_wideresnet_gradcam_cifar10), cv2.COLORMAP_JET)
heatmap_wideresnet_gradcam_cifar10 = cv2.cvtColor(heatmap_wideresnet_gradcam_cifar10, cv2.COLOR_BGR2RGB)
superimposed_img_wideresnet_gradcam_cifar10 = heatmap_wideresnet_gradcam_cifar10 * 0.4 + img * 255

heatmap_wideresnet_gradcam_cifar10_cutout = cv2.applyColorMap(np.uint8(255 * cam_wideresnet_gradcam_cifar10_cutout), cv2.COLORMAP_JET)
heatmap_wideresnet_gradcam_cifar10_cutout = cv2.cvtColor(heatmap_wideresnet_gradcam_cifar10_cutout, cv2.COLOR_BGR2RGB)
superimposed_img_wideresnet_gradcam_cifar10_cutout = heatmap_wideresnet_gradcam_cifar10_cutout * 0.4 + img * 255

heatmap_wideresnet_gradcam_cifar10_da = cv2.applyColorMap(np.uint8(255 * cam_wideresnet_gradcam_cifar10_da), cv2.COLORMAP_JET)
heatmap_wideresnet_gradcam_cifar10_da = cv2.cvtColor(heatmap_wideresnet_gradcam_cifar10_da, cv2.COLOR_BGR2RGB)
superimposed_img_wideresnet_gradcam_cifar10_da = heatmap_wideresnet_gradcam_cifar10_da * 0.4 + img * 255

heatmap_wideresnet_gradcam_cifar10_da_cutout = cv2.applyColorMap(np.uint8(255 * cam_wideresnet_gradcam_cifar10_da_cutout), cv2.COLORMAP_JET)
heatmap_wideresnet_gradcam_cifar10_da_cutout = cv2.cvtColor(heatmap_wideresnet_gradcam_cifar10_da_cutout, cv2.COLOR_BGR2RGB)
superimposed_img_wideresnet_gradcam_cifar10_da_cutout = heatmap_wideresnet_gradcam_cifar10_da_cutout * 0.4 + img * 255

class_label = str(labels.item())

# Display the original image and the Grad-CAM
fig, ax = plt.subplots(nrows=1, ncols=5, constrained_layout=True)

ax[0].imshow(img)
ax[0].set_title('(Class: ' + cifar10_classes[int(class_label)] + ')')
ax[0].axis('off')
ax[1].imshow(superimposed_img_wideresnet_gradcam_cifar10 / 255)
ax[1].set_title('Pred: ' + predicted_class_wideresnet_gradcam_cifar10)
ax[1].axis('off')
ax[2].imshow(superimposed_img_wideresnet_gradcam_cifar10_cutout / 255)
ax[2].set_title(cifar10_classes[target_wideresnet_gradcam_cifar10_cutout.item()])  # prediction of the Cutout model
ax[2].axis('off')
ax[3].imshow(superimposed_img_wideresnet_gradcam_cifar10_da / 255)
ax[3].set_title(predicted_class_wideresnet_gradcam_cifar10_da)
ax[3].axis('off')
ax[4].imshow(superimposed_img_wideresnet_gradcam_cifar10_da_cutout / 255)
ax[4].set_title(predicted_class_wideresnet_gradcam_cifar10_da_cutout)
ax[4].axis('off')

fig.suptitle("Original image | Grad-CAM: baseline | Cutout | DA | Cutout + DA")
plt.show()
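
Because the same steps are repeated for four model variants, it is easy to mix up variable names when copying code. One way to reduce that risk is to keep the models in a dictionary and loop over it. The sketch below uses hypothetical toy models (`make_toy_model`) in place of the trained checkpoints and only collects the predicted labels that would be used as panel titles.

```python
import torch
import torch.nn as nn

class_names = [
    "Airplane", "Automobile", "Bird", "Cat", "Deer",
    "Dog", "Frog", "Horse", "Ship", "Truck"
]


def make_toy_model():
    # Hypothetical stand-in for a trained checkpoint
    return nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, len(class_names)))


torch.manual_seed(0)
variants = {
    "baseline": make_toy_model(),
    "cutout": make_toy_model(),
    "da": make_toy_model(),
    "da_cutout": make_toy_model(),
}

x = torch.randn(1, 3, 32, 32)                  # stand-in for one test image
predicted = {}
for name, model in variants.items():
    model.eval()
    model.zero_grad()
    logits = model(x)
    idx = logits.argmax(1).item()
    logits[0, idx].backward()                  # same scalar as logits.max() for a batch of one
    predicted[name] = class_names[idx]

titles = [f"{name}: {label}" for name, label in predicted.items()]
print(titles)
```

Each title is built from the entry that belongs to its own variant, so a panel cannot accidentally show another model's prediction.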

Now try loading your own image, preprocessing it, and converting it into a PyTorch tensor. Choose an image whose subject belongs to one of the CIFAR-10 classes (airplane, automobile, bird, cat, deer, dog, frog, horse, ship, or truck). The preprocessing steps should match the ones used when training the model. Suppose the image is saved as `image.jpeg`:

In [None]:
# Load the image
image_path = "image.jpeg"
image = Image.open(current_path + image_path).convert("RGB")  # force 3 channels (handles grayscale or RGBA files)

# Define the transformations: resize, to tensor, normalize (replace the mean and std with values you used for training)
preprocess = transforms.Compose([
    transforms.Resize((32, 32)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

# Preprocess the image
input_tensor = preprocess(image)
input_tensor = input_tensor.unsqueeze(0)  # add batch dimension.  C,H,W => B,C,H,W

Apply Grad-CAM

In [None]:
# Forward pass
wideresnet_gradcam_cifar10.zero_grad()
output_wideresnet_gradcam_cifar10 = wideresnet_gradcam_cifar10(input_tensor)

wideresnet_gradcam_cifar10_cutout.zero_grad()
output_wideresnet_gradcam_cifar10_cutout = wideresnet_gradcam_cifar10_cutout(input_tensor)

wideresnet_gradcam_cifar10_da.zero_grad()
output_wideresnet_gradcam_cifar10_da = wideresnet_gradcam_cifar10_da(input_tensor)

wideresnet_gradcam_cifar10_da_cutout.zero_grad()
output_wideresnet_gradcam_cifar10_da_cutout = wideresnet_gradcam_cifar10_da_cutout(input_tensor)

# Get the predicted class index (argmax of the logits) and backpropagate the top logit
target_wideresnet_gradcam_cifar10 = output_wideresnet_gradcam_cifar10.argmax(1)
output_wideresnet_gradcam_cifar10.max().backward()

target_wideresnet_gradcam_cifar10_cutout = output_wideresnet_gradcam_cifar10_cutout.argmax(1)
output_wideresnet_gradcam_cifar10_cutout.max().backward()

target_wideresnet_gradcam_cifar10_da = output_wideresnet_gradcam_cifar10_da.argmax(1)
output_wideresnet_gradcam_cifar10_da.max().backward()

target_wideresnet_gradcam_cifar10_da_cutout = output_wideresnet_gradcam_cifar10_da_cutout.argmax(1)
output_wideresnet_gradcam_cifar10_da_cutout.max().backward()

# Map the predicted class indices to the class labels
predicted_class_wideresnet_gradcam_cifar10 = cifar10_classes[target_wideresnet_gradcam_cifar10.item()]
predicted_class_wideresnet_gradcam_cifar10_cutout = cifar10_classes[target_wideresnet_gradcam_cifar10_cutout.item()]
predicted_class_wideresnet_gradcam_cifar10_da = cifar10_classes[target_wideresnet_gradcam_cifar10_da.item()]
predicted_class_wideresnet_gradcam_cifar10_da_cutout = cifar10_classes[target_wideresnet_gradcam_cifar10_da_cutout.item()]


# Get the gradients and activations
gradients_wideresnet_gradcam_cifar10 = wideresnet_gradcam_cifar10.gradients.detach().cpu()
activations_wideresnet_gradcam_cifar10 = wideresnet_gradcam_cifar10.activations.detach().cpu()

gradients_wideresnet_gradcam_cifar10_cutout = wideresnet_gradcam_cifar10_cutout.gradients.detach().cpu()
activations_wideresnet_gradcam_cifar10_cutout = wideresnet_gradcam_cifar10_cutout.activations.detach().cpu()

gradients_wideresnet_gradcam_cifar10_da = wideresnet_gradcam_cifar10_da.gradients.detach().cpu()
activations_wideresnet_gradcam_cifar10_da = wideresnet_gradcam_cifar10_da.activations.detach().cpu()

gradients_wideresnet_gradcam_cifar10_da_cutout = wideresnet_gradcam_cifar10_da_cutout.gradients.detach().cpu()
activations_wideresnet_gradcam_cifar10_da_cutout = wideresnet_gradcam_cifar10_da_cutout.activations.detach().cpu()


# Calculate the weights
weights_wideresnet_gradcam_cifar10 = gradients_wideresnet_gradcam_cifar10.mean(dim=(2, 3), keepdim=True)

weights_wideresnet_gradcam_cifar10_cutout = gradients_wideresnet_gradcam_cifar10_cutout.mean(dim=(2, 3), keepdim=True)

weights_wideresnet_gradcam_cifar10_da = gradients_wideresnet_gradcam_cifar10_da.mean(dim=(2, 3), keepdim=True)

weights_wideresnet_gradcam_cifar10_da_cutout = gradients_wideresnet_gradcam_cifar10_da_cutout.mean(dim=(2, 3), keepdim=True)

# Calculate the weighted sum of activations (Grad-CAM)
cam_wideresnet_gradcam_cifar10 = (weights_wideresnet_gradcam_cifar10 * activations_wideresnet_gradcam_cifar10).sum(dim=1, keepdim=True)
cam_wideresnet_gradcam_cifar10 = F.relu(cam_wideresnet_gradcam_cifar10)  # apply ReLU to the heatmap
cam_wideresnet_gradcam_cifar10 = F.interpolate(cam_wideresnet_gradcam_cifar10, size=(32, 32), mode='bilinear', align_corners=False)
cam_wideresnet_gradcam_cifar10 = cam_wideresnet_gradcam_cifar10.squeeze().numpy()

cam_wideresnet_gradcam_cifar10_cutout = (weights_wideresnet_gradcam_cifar10_cutout * activations_wideresnet_gradcam_cifar10_cutout).sum(dim=1, keepdim=True)
cam_wideresnet_gradcam_cifar10_cutout = F.relu(cam_wideresnet_gradcam_cifar10_cutout)  # apply ReLU to the heatmap
cam_wideresnet_gradcam_cifar10_cutout = F.interpolate(cam_wideresnet_gradcam_cifar10_cutout, size=(32, 32), mode='bilinear', align_corners=False)
cam_wideresnet_gradcam_cifar10_cutout = cam_wideresnet_gradcam_cifar10_cutout.squeeze().numpy()

cam_wideresnet_gradcam_cifar10_da = (weights_wideresnet_gradcam_cifar10_da * activations_wideresnet_gradcam_cifar10_da).sum(dim=1, keepdim=True)
cam_wideresnet_gradcam_cifar10_da = F.relu(cam_wideresnet_gradcam_cifar10_da)  # apply ReLU to the heatmap
cam_wideresnet_gradcam_cifar10_da = F.interpolate(cam_wideresnet_gradcam_cifar10_da, size=(32, 32), mode='bilinear', align_corners=False)
cam_wideresnet_gradcam_cifar10_da = cam_wideresnet_gradcam_cifar10_da.squeeze().numpy()

cam_wideresnet_gradcam_cifar10_da_cutout = (weights_wideresnet_gradcam_cifar10_da_cutout * activations_wideresnet_gradcam_cifar10_da_cutout).sum(dim=1, keepdim=True)
cam_wideresnet_gradcam_cifar10_da_cutout = F.relu(cam_wideresnet_gradcam_cifar10_da_cutout)  # apply ReLU to the heatmap
cam_wideresnet_gradcam_cifar10_da_cutout = F.interpolate(cam_wideresnet_gradcam_cifar10_da_cutout, size=(32, 32), mode='bilinear', align_corners=False)
cam_wideresnet_gradcam_cifar10_da_cutout = cam_wideresnet_gradcam_cifar10_da_cutout.squeeze().numpy()


# Normalize each heatmap to [0, 1]; the small epsilon avoids division by zero
# when a heatmap is all zeros after the ReLU
cam_wideresnet_gradcam_cifar10 -= cam_wideresnet_gradcam_cifar10.min()
cam_wideresnet_gradcam_cifar10 /= (cam_wideresnet_gradcam_cifar10.max() + 1e-8)

cam_wideresnet_gradcam_cifar10_cutout -= cam_wideresnet_gradcam_cifar10_cutout.min()
cam_wideresnet_gradcam_cifar10_cutout /= (cam_wideresnet_gradcam_cifar10_cutout.max() + 1e-8)

cam_wideresnet_gradcam_cifar10_da -= cam_wideresnet_gradcam_cifar10_da.min()
cam_wideresnet_gradcam_cifar10_da /= (cam_wideresnet_gradcam_cifar10_da.max() + 1e-8)

cam_wideresnet_gradcam_cifar10_da_cutout -= cam_wideresnet_gradcam_cifar10_da_cutout.min()
cam_wideresnet_gradcam_cifar10_da_cutout /= (cam_wideresnet_gradcam_cifar10_da_cutout.max() + 1e-8)

# Load the original image
img = cv2.imread(current_path + image_path)
img = cv2.resize(img, (32, 32))
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# Superimpose each heatmap onto the original image.
# cv2.applyColorMap returns BGR, so convert to RGB to match img.
# img is uint8 in [0, 255]; multiplying it by 255 would overflow and wrap around,
# so blend with weights 0.4 / 0.6 to keep the result in [0, 255].
heatmap_wideresnet_gradcam_cifar10 = cv2.applyColorMap(np.uint8(255 * cam_wideresnet_gradcam_cifar10), cv2.COLORMAP_JET)
heatmap_wideresnet_gradcam_cifar10 = cv2.cvtColor(heatmap_wideresnet_gradcam_cifar10, cv2.COLOR_BGR2RGB)
superimposed_img_wideresnet_gradcam_cifar10 = heatmap_wideresnet_gradcam_cifar10 * 0.4 + img * 0.6

heatmap_wideresnet_gradcam_cifar10_cutout = cv2.applyColorMap(np.uint8(255 * cam_wideresnet_gradcam_cifar10_cutout), cv2.COLORMAP_JET)
heatmap_wideresnet_gradcam_cifar10_cutout = cv2.cvtColor(heatmap_wideresnet_gradcam_cifar10_cutout, cv2.COLOR_BGR2RGB)
superimposed_img_wideresnet_gradcam_cifar10_cutout = heatmap_wideresnet_gradcam_cifar10_cutout * 0.4 + img * 0.6

heatmap_wideresnet_gradcam_cifar10_da = cv2.applyColorMap(np.uint8(255 * cam_wideresnet_gradcam_cifar10_da), cv2.COLORMAP_JET)
heatmap_wideresnet_gradcam_cifar10_da = cv2.cvtColor(heatmap_wideresnet_gradcam_cifar10_da, cv2.COLOR_BGR2RGB)
superimposed_img_wideresnet_gradcam_cifar10_da = heatmap_wideresnet_gradcam_cifar10_da * 0.4 + img * 0.6

heatmap_wideresnet_gradcam_cifar10_da_cutout = cv2.applyColorMap(np.uint8(255 * cam_wideresnet_gradcam_cifar10_da_cutout), cv2.COLORMAP_JET)
heatmap_wideresnet_gradcam_cifar10_da_cutout = cv2.cvtColor(heatmap_wideresnet_gradcam_cifar10_da_cutout, cv2.COLOR_BGR2RGB)
superimposed_img_wideresnet_gradcam_cifar10_da_cutout = heatmap_wideresnet_gradcam_cifar10_da_cutout * 0.4 + img * 0.6

class_label = str(labels.item())

# Display the original image next to the four Grad-CAM overlays
# (baseline, Cutout, data augmentation, data augmentation + Cutout)
fig, ax = plt.subplots(nrows=1, ncols=5, constrained_layout=True)

ax[0].imshow(img)
ax[0].set_title('(Class: ' + cifar10_classes[int(class_label)] + ')')
ax[0].axis('off')
ax[1].imshow(superimposed_img_wideresnet_gradcam_cifar10 / 255)
ax[1].set_title('Pred: ' + predicted_class_wideresnet_gradcam_cifar10)
ax[1].axis('off')
ax[2].imshow(superimposed_img_wideresnet_gradcam_cifar10_cutout / 255)
ax[2].set_title(predicted_class_wideresnet_gradcam_cifar10_cutout)
ax[2].axis('off')
ax[3].imshow(superimposed_img_wideresnet_gradcam_cifar10_da / 255)
ax[3].set_title(predicted_class_wideresnet_gradcam_cifar10_da)
ax[3].axis('off')
ax[4].imshow(superimposed_img_wideresnet_gradcam_cifar10_da_cutout / 255)
ax[4].set_title(predicted_class_wideresnet_gradcam_cifar10_da_cutout)
ax[4].axis('off')

fig.suptitle("Original Image - Grad-CAM - GC with CO - GC with DA - GC with CO & DA")
plt.show()