# Class Activation Mapping In PyTorch
Have you ever wondered how a neural network such as ResNet decides that an image shows a cat or a flower in a field? Class Activation Mapping (CAM) offers some insight into this process by overlaying a heatmap on the original image, showing us which regions contributed most strongly to the model's belief that this cat was indeed a cat.

This script will demonstrate how to use a pretrained model, in PyTorch, 
to make predictions and then visualize them with CAM. Specifically, we 
will be using ResNet18 with a cat image.

References used to make this script:

PyTorch pretrained models doc:

    http://pytorch.org/docs/master/torchvision/models.html

PyTorch image transforms example:

    http://pytorch.org/tutorials/beginner/data_loading_tutorial.html#transforms

Example code:

    http://blog.outcome.io/pytorch-quick-start-classifying-an-image/

Firstly, we’re going to need a picture of a cat. And thankfully, here’s one I took earlier of a rather suspicious cat that is wondering why the strange man is back in his house again.

In [None]:
%matplotlib inline
%reload_ext autoreload
%autoreload 2

In [None]:
import numpy as np
import skimage.transform
from PIL import Image
from matplotlib.pyplot import imshow

from torchvision import models, transforms
from torch.nn import functional as F
from torch import topk

import io
import requests

In [None]:
# input image
IMG_URL = 'http://media.mlive.com/news_impact/photo/9933031-large.jpg'

# Random cat img taken from Google
# IMG_URL = 'https://upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Cat03.jpg/1200px-Cat03.jpg'

# ImageNet class labels as JSON, courtesy of the 'Example code' link above.
LABELS_URL = 'https://s3.amazonaws.com/outcome-blog/imagenet/labels.json'

In [None]:
# Let's get our class labels.
response = requests.get(LABELS_URL)  # Make an HTTP GET request and store the response.
labels = {int(key): value for key, value in response.json().items()}

In [None]:
# Let's get our img.
response = requests.get(IMG_URL)
# Read bytes and store as an img.
image = Image.open(io.BytesIO(response.content))

# let's take a look at it
imshow(image);

Next, we’re going to set up some torchvision transforms to scale the image to the 224x224 required for ResNet and also to normalize it to the ImageNet mean/std. 

Now that we have an image, we need to preprocess it.
We need to:
* resize the image to 224x224; the original is pretty big (~1200x1200px).
* convert it to a PyTorch Tensor.
* normalize it, as noted in the PyTorch pretrained models doc, with mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225].

We can do all this preprocessing using a transform pipeline.
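
To make the normalization step concrete, here is a small NumPy sketch of the per-channel arithmetic, `(x - mean) / std`. The sample pixel is made up for illustration; only the mean/std values come from the text above.

```python
import numpy as np

# ImageNet channel statistics quoted above (RGB order).
mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])

# Hypothetical mid-grey pixel, already scaled to 0..1 as ToTensor() would do.
pixel = np.array([0.5, 0.5, 0.5])
normalized_pixel = (pixel - mean) / std
print(normalized_pixel)

# Extremes reachable from inputs in 0..1.
lowest = (0.0 - mean) / std
highest = (1.0 - mean) / std
print(lowest.min(), highest.max())
```

The extremes land at roughly -2.1 and 2.6, so the normalized tensor is centred near zero but is not confined to a fixed -1..1 interval.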

In [None]:
# Imagenet mean/std

normalize = transforms.Normalize(
   mean=[0.485, 0.456, 0.406],
   std=[0.229, 0.224, 0.225]
)

# Preprocessing - scale to 224x224 for the model, convert to a tensor,
# and standardize each channel with the ImageNet mean/std
min_img_size = 224  # The min size, as noted in the PyTorch pretrained models doc, is 224 px.
preprocess = transforms.Compose([
   transforms.Resize((min_img_size,min_img_size)),
   transforms.ToTensor(),
   normalize
])

display_transform = transforms.Compose([
   transforms.Resize((224,224))])

In [None]:
img_tensor = preprocess(image)
print(img_tensor.shape)

In [None]:
# PyTorch pretrained models expect the Tensor dims to be (num input imgs, num color channels, height, width).
# Currently however, we have (num color channels, height, width); let's fix this by inserting a new axis.
img = img_tensor.unsqueeze(0)  # Insert the new axis at index 0 i.e. in front of the other axes/dims. 

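# Send it to the GPU and enable gradient tracking on the input.
# (CAM itself only needs the forward pass, so requires_grad_() is optional here.)
prediction_var = img.cuda()
prediction_var.requires_grad_();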
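
The batch-axis step can be checked on a plain NumPy array. This is only an illustration with a zero-filled stand-in for the image, where `np.expand_dims` plays the role of `unsqueeze(0)`.

```python
import numpy as np

# Stand-in for a preprocessed image: (channels, height, width).
chw = np.zeros((3, 224, 224), dtype=np.float32)

# Add a leading batch axis, as img_tensor.unsqueeze(0) does in PyTorch.
nchw = np.expand_dims(chw, axis=0)
print(chw.shape, "->", nchw.shape)
```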

Having converted our image into a PyTorch variable, we need a model to generate a prediction. Let’s use ResNet18, put it in evaluation mode, and stick it on the GPU using the CUDA libraries.

In [None]:
# Now let's load our model.
# Note: torchvision 0.13+ prefers weights=models.ResNet18_Weights.DEFAULT over pretrained=True.
model = models.resnet18(pretrained=True)
model.cuda()
model.eval()

This next bit of code is swiped from Jeremy Howard’s fast.ai course. It basically allows you to easily attach a hook to any model (or any part of a model - here we’re going to grab the final convnet layer in ResNet18) which will save the activation features as an instance variable.

In [None]:
class SaveFeatures():
    """Keeps the output of module `m` from the most recent forward pass as a NumPy array."""
    features = None
    def __init__(self, m): self.hook = m.register_forward_hook(self.hook_fn)
    def hook_fn(self, module, input, output): self.features = output.detach().cpu().numpy()
    def remove(self): self.hook.remove()
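
For readers who have not used forward hooks before, here is a minimal sketch of the same mechanism on a toy network that runs on the CPU. The layer sizes and the names `toy_model`, `captured` and `save_output` are invented for illustration and have nothing to do with ResNet18.

```python
import torch
from torch import nn

# Hypothetical toy network, standing in for ResNet18 so nothing is downloaded.
toy_model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 4),
)

captured = {}

def save_output(module, inputs, output):
    # Called by PyTorch right after `module` finishes its forward pass.
    captured["features"] = output.detach().cpu().numpy()

# Attach the hook to the ReLU that follows the convolution.
handle = toy_model[1].register_forward_hook(save_output)
with torch.no_grad():
    logits = toy_model(torch.randn(1, 3, 16, 16))
handle.remove()  # Later forward passes no longer trigger the hook.

print(captured["features"].shape)  # (batch, channels, height, width)
print(logits.shape)
```

Removing the handle once the features are captured keeps later forward passes from overwriting them.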

In [None]:
final_layer = model._modules.get('layer4')

activated_features = SaveFeatures(final_layer)

Having set that up, we run the image through our model and get the prediction. We then run that through a softmax layer to turn that prediction into a series of probabilities for each of the 1000 classes in ImageNet.

In [None]:
# and get a prediction!
prediction = model(prediction_var)
# Returns a Tensor of shape (batch, num class labels)
print(prediction.shape)

In [None]:
pred_probabilities = F.softmax(prediction, dim=-1).data.squeeze()
activated_features.remove()
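
As a rough illustration of what the softmax step does to the raw scores, here is a NumPy sketch on made-up numbers. The `softmax` helper below is written for this example and is not the PyTorch implementation.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)

# Made-up scores for a 4-class problem, shaped (batch, classes).
toy_logits = np.array([[2.0, 1.0, 0.1, -1.0]])
toy_probs = softmax(toy_logits).squeeze()
print(toy_probs, toy_probs.sum())
```

The ordering of the scores is preserved; softmax only rescales them so that they are positive and sum to one.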

Using topk(), we can see that our model is 36% confident that this picture is class 282. Looking that up in the ImageNet classes, that gives us…’tiger cat’. I would say that’s not a bad guess!

In [None]:
prob, class_idx = topk(pred_probabilities,1)

In [None]:
print(prob.item(), class_idx.item(), labels[class_idx.item()])  # Converts the index to a string using our labels dict
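
The top-k lookup can be sketched in NumPy as well. The probabilities and labels below are invented for illustration, and `top_k` is a hypothetical helper rather than the `torch.topk` implementation.

```python
import numpy as np

def top_k(probabilities, k):
    """Return (values, indices) of the k largest entries, largest first."""
    order = np.argsort(probabilities)[::-1][:k]
    return probabilities[order], order

# Made-up probabilities and labels, purely for illustration.
toy_probs = np.array([0.05, 0.60, 0.10, 0.25])
toy_labels = {0: "dog", 1: "tiger cat", 2: "fox", 3: "tabby"}

values, indices = top_k(toy_probs, 2)
for p, i in zip(values, indices):
    print(float(p), int(i), toy_labels[int(i)])
```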

Having made the guess, let’s see where the neural network was focusing its attention. The getCAM() function here takes the activated features of the final convolutional block, the weights of the fully-connected layer (which sits directly after the global average pooling), and the class index we want to investigate (282/‘tiger cat’ in our case). We index into the fully-connected layer to get the weights for that class and take the dot product with the feature maps from the image, which gives one weighted sum per spatial position.

(this code is based on the paper that introduced CAM)
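
In the paper's notation, with $f_k(x, y)$ the activation of feature map $k$ at spatial position $(x, y)$ and $w_k^c$ the fully-connected weight linking map $k$ to class $c$, the dot product computes:

```latex
M_c(x, y) = \sum_{k} w_k^{c} \, f_k(x, y)
```

For ResNet18 with a 224x224 input, $k$ runs over the 512 channels of `layer4` and $(x, y)$ over a 7x7 grid. Because the fully-connected layer follows an average pooling, averaging $M_c$ over all positions gives the class score before softmax, up to the bias term.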

In [None]:
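def getCAM(feature_conv, weight_fc, class_idx):
    # feature_conv: (1, nc, h, w) activations; weight_fc: (num_classes, nc) fc weights.
    _, nc, h, w = feature_conv.shape
    # Weighted sum of the nc feature maps, using the fc weights of the chosen class.
    cam = weight_fc[class_idx].dot(feature_conv.reshape((nc, h*w)))
    cam = cam.reshape(h, w)
    # Rescale to the 0..1 range for display.
    cam = cam - np.min(cam)
    cam_img = cam / np.max(cam)
    return [cam_img]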
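
To see the weighted sum at work, here is a tiny worked example on made-up numbers: two 2x2 feature maps and three classes. `toy_cam` is a hypothetical re-statement of the same idea, with a guard for flat maps added as an assumption of this sketch.

```python
import numpy as np

# Made-up activations: a batch of 1 image, 2 channels, on a 2x2 grid.
toy_features = np.array([[[[1.0, 0.0],
                           [0.0, 0.0]],
                          [[0.0, 0.0],
                           [0.0, 2.0]]]])   # shape (1, 2, 2, 2)
# Made-up fc weights for 3 classes over those 2 channels.
toy_weights = np.array([[1.0, 0.0],
                        [0.0, 1.0],
                        [0.5, 0.5]])        # shape (3, 2)

def toy_cam(features, weights, class_idx):
    """Weighted sum of channel maps, then min-max scaling to 0..1."""
    # Contract the channel axis: (nc,) with (nc, h, w) -> (h, w).
    cam = np.tensordot(weights[class_idx], features[0], axes=1)
    cam = cam - cam.min()
    peak = cam.max()
    # Guard against a flat map, where min-max scaling would divide by zero.
    return cam / peak if peak > 0 else cam

print(toy_cam(toy_features, toy_weights, 0))  # only channel 0 contributes
print(toy_cam(toy_features, toy_weights, 2))  # both channels contribute
```

Class 0 weights only the first channel, so its map lights up the top-left cell; class 2 mixes both channels, so the bottom-right cell (activation 2.0) dominates after scaling.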

In [None]:
weight_bias = model._modules.get('fc').parameters()

In [None]:
weight, bias = tuple(weight_bias)

In [None]:
weight_softmax = weight.cpu().data.numpy()

In [None]:
print(weight_softmax.shape)

In [None]:
w = weight_softmax[class_idx.item()]
print(w.shape)

In [None]:
overlay = getCAM(activated_features.features, weight_softmax, class_idx.item() )

Now we can see our heatmap and overlay it onto Casper. It doesn’t make him look any happier, but we can see exactly where the model made its mind up about him.

In [None]:
imshow(overlay[0], alpha=0.5, cmap='jet')

In [None]:
imshow(skimage.transform.resize(overlay[0], img_tensor.shape[1:3]), cmap='jet');

In [None]:
imshow(display_transform(image))
imshow(skimage.transform.resize(overlay[0], img_tensor.shape[1:3]), alpha=0.5, cmap='jet');
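
The overlay only lines up because the small CAM grid is stretched to the image size first. As a rough stand-in for that resize step, the sketch below uses blocky nearest-neighbour repetition with `np.kron` instead of the interpolation that `skimage.transform.resize` performs, and the heatmap values are made up.

```python
import numpy as np

# Made-up 7x7 heatmap, the grid size ResNet18's layer4 yields for a 224x224 input.
small_map = np.arange(49, dtype=float).reshape(7, 7) / 48.0

# Nearest-neighbour upsampling: repeat every cell as a 32x32 block (7 * 32 = 224).
scale = 224 // small_map.shape[0]
big_map = np.kron(small_map, np.ones((scale, scale)))
print(small_map.shape, "->", big_map.shape)
```

Each cell of the small map therefore covers a 32x32 patch of the input image, which is why the heatmap looks coarse next to the photo.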

But wait, there’s a bit more - we can also look at the model’s second choice for cat.

In [None]:
probs, class_indices = topk(pred_probabilities,2)
probs, class_indices

In [None]:
print(probs[0].item(), class_indices[0].item(), labels[class_indices[0].item()])
print(probs[1].item(), class_indices[1].item(), labels[class_indices[1].item()])

In [None]:
overlay = getCAM(activated_features.features, weight_softmax, class_indices[1].item() )
imshow(skimage.transform.resize(overlay[0], img_tensor.shape[1:3]), alpha=0.5, cmap='jet');

In [None]:
imshow(display_transform(image))
imshow(skimage.transform.resize(overlay[0], img_tensor.shape[1:3]), alpha=0.5, cmap='jet');