This notebook implements [Assignment 4](https://github.com/sprintml/tml_2024/blob/main/Assignment4.pdf), Task 3 of the Trustworthy Machine Learning course offered in the Summer Semester 2024 at Saarland University. The task is to obtain annotations (class-activation heatmaps) for 10 ImageNet images using [Grad-CAM and related methods](https://github.com/jacobgil/pytorch-grad-cam?tab=readme-ov-file#using-from-code-as-a-librarys), and to use them to explain the predictions made by a ResNet-50 model. The report analyzing the results of this task can be accessed [here](https://github.com/nupur412/TML_Assignment4_Explainability/blob/main/TML_Task_3_Report.pdf).

In [5]:
import matplotlib.pyplot as plt
from PIL import Image
import torch.nn as nn
import numpy as np
import os, json

import torch
from torchvision import models, transforms
from torchvision.models import resnet50, ResNet50_Weights
from torch.autograd import Variable
import torch.nn.functional as F

In [2]:
imgs = ['/content/n02098286_West_Highland_white_terrier.JPEG', '/content/n02018207_American_coot.JPEG', '/content/n04037443_racer.JPEG',
        '/content/n02007558_flamingo.JPEG', '/content/n01608432_kite.JPEG', '/content/n01443537_goldfish.JPEG',
        '/content/n01491361_tiger_shark.JPEG', '/content/n01616318_vulture.JPEG', '/content/n01677366_common_iguana.JPEG',
        '/content/n07747607_orange.JPEG']
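
Each file name above encodes the WordNet ID and the class name, which is handy when labelling the resulting heatmaps. The helper below is a small sketch for splitting the two; the function name `parse_imagenet_filename` is hypothetical and not part of the assignment or of any library, and it assumes the `<wnid>_<class_name>.JPEG` naming pattern used in the list above.

```python
from pathlib import Path


def parse_imagenet_filename(path):
    """Split '<wnid>_<class_name>.JPEG' into (wnid, readable class name).

    Hypothetical helper: assumes the naming pattern used in this notebook.
    """
    stem = Path(path).stem                  # drop the directory and the extension
    wnid, _, label = stem.partition("_")    # split at the first underscore only
    return wnid, label.replace("_", " ")    # underscores in the label become spaces


wnid, label = parse_imagenet_filename(
    "/content/n02098286_West_Highland_white_terrier.JPEG"
)
print(wnid, "->", label)
```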

In [6]:
model = resnet50(weights=ResNet50_Weights.IMAGENET1K_V2)

In [7]:
model.layer4[-1]

Bottleneck(
  (conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
  (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
  (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
  (bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
)

We use the pytorch-grad-cam library to obtain explanations of the model's predictions.

In [8]:
! git clone https://github.com/jacobgil/pytorch-grad-cam.git

Cloning into 'pytorch-grad-cam'...
remote: Enumerating objects: 1194, done.
remote: Counting objects: 100% (96/96), done.
remote: Compressing objects: 100% (72/72), done.
remote: Total 1194 (delta 56), reused 43 (delta 24), pack-reused 1098
Receiving objects: 100% (1194/1194), 133.62 MiB | 14.23 MiB/s, done.
Resolving deltas: 100% (668/668), done.


In [9]:
os.chdir('/content/pytorch-grad-cam')

In [10]:
! pip install ttach

Collecting ttach
  Downloading ttach-0.0.3-py3-none-any.whl.metadata (5.2 kB)
Downloading ttach-0.0.3-py3-none-any.whl (9.8 kB)
Installing collected packages: ttach
Successfully installed ttach-0.0.3


In [11]:
from pytorch_grad_cam import GradCAM, HiResCAM, ScoreCAM, GradCAMPlusPlus, AblationCAM, XGradCAM, EigenCAM, FullGrad
from pytorch_grad_cam.utils.model_targets import ClassifierOutputTarget
from pytorch_grad_cam.utils.image import show_cam_on_image

The task asks for the gradient of the output with respect to the last convolutional layer, so we modify the target layers in cam.py accordingly and use the last block of layer4.

In [12]:
target_layers = [model.layer4[-1]]
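
For intuition about what is computed at the chosen target layer, here is a minimal NumPy sketch of the Grad-CAM weighting step in its standard formulation: the gradients are averaged over the spatial dimensions to get one weight per channel, the activation maps are combined with those weights, and a ReLU keeps only the positive evidence. This is not the pytorch-grad-cam implementation. The function name `grad_cam_map` and the toy arrays are made up for illustration; in the real setting the activations and gradients would be captured from `model.layer4[-1]`, which has 2048 output channels.

```python
import numpy as np


def grad_cam_map(activations, gradients):
    """Grad-CAM heatmap from one layer's activations and gradients.

    Both inputs have shape (C, H, W). Hypothetical helper for illustration.
    """
    # One weight per channel: average the gradients over the spatial dims.
    weights = gradients.mean(axis=(1, 2))                 # shape (C,)
    # Weighted sum of the activation maps over channels, then ReLU.
    cam = np.tensordot(weights, activations, axes=1)      # shape (H, W)
    cam = np.maximum(cam, 0.0)
    # Scale to [0, 1] for display; leave an all-zero map untouched.
    peak = cam.max()
    return cam / peak if peak > 0 else cam


# Toy example: 2 channels on a 2x2 grid (made-up numbers).
acts = np.array([[[1.0, 2.0], [3.0, 4.0]],
                 [[4.0, 3.0], [2.0, 1.0]]])
grads = np.array([[[1.0, 1.0], [1.0, 1.0]],
                  [[-1.0, -1.0], [-1.0, -1.0]]])
heatmap = grad_cam_map(acts, grads)
print(heatmap)
```

In the toy example the first channel gets weight 1 and the second gets weight -1, so only the locations where the first channel dominates remain in the heatmap after the ReLU.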

Obtaining the explanations for each of the 10 ImageNet data points using Grad-CAM, Ablation-CAM and Score-CAM

In [None]:
! python cam.py --image-path /content/n02098286_West_Highland_white_terrier.JPEG --device cuda --method ablationcam --output-dir /content/pytorch-grad-cam

Using device "cuda" for acceleration
100% 64/64 [00:17<00:00,  3.69it/s]


In [None]:
! python cam.py --image-path /content/n01443537_goldfish.JPEG --device cuda --method scorecam --output-dir /content/pytorch-grad-cam

Using device "cuda" for acceleration
100% 64/64 [00:11<00:00,  5.42it/s]


In [None]:
! python cam.py --image-path /content/n07747607_orange.JPEG --device cuda --method gradcam --output-dir /content/pytorch-grad-cam

Using device "cuda" for acceleration
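
Only three image and method combinations are shown above. The remaining runs could be generated rather than typed by hand; the sketch below builds one `cam.py` command per image and method, using only the flags that appear in the cells above. The function name `build_cam_commands` and the shortened example lists are assumptions made for illustration.

```python
from itertools import product


def build_cam_commands(image_paths, methods, output_dir, device="cuda"):
    """Return one cam.py argument list per (image, method) pair.

    Hypothetical helper: it only assembles the commands, it does not run them.
    """
    commands = []
    for image_path, method in product(image_paths, methods):
        commands.append([
            "python", "cam.py",
            "--image-path", image_path,
            "--device", device,
            "--method", method,
            "--output-dir", output_dir,
        ])
    return commands


# Shortened example; in the notebook, `imgs` would hold all 10 paths.
example_images = [
    "/content/n02098286_West_Highland_white_terrier.JPEG",
    "/content/n01443537_goldfish.JPEG",
    "/content/n07747607_orange.JPEG",
]
example_methods = ["gradcam", "ablationcam", "scorecam"]
commands = build_cam_commands(
    example_images, example_methods, "/content/pytorch-grad-cam"
)
print(len(commands), "commands, first:", " ".join(commands[0]))
```

Each argument list can then be passed to `subprocess.run(cmd, check=True)` from inside the cloned repository directory, so that a failing run stops the loop instead of passing silently.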
