# Lab 6.1: Using LIME with PyTorch

In this tutorial we show how to use the LIME framework with PyTorch. We will use LIME to explain the prediction made by one of the pretrained ImageNet models.

Let's start by importing our dependencies.

In [None]:
# Install LIME package
# On Google Colab
!pip install lime
# On your personal laptop with Anaconda
# conda install -c conda-forge lime

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt
from PIL import Image
import torch.nn as nn
import numpy as np
import os, json

import torch
from torchvision import models, transforms
from torch.autograd import Variable
import torch.nn.functional as F

In [None]:
print(torch.__version__)

In [None]:
# Download the image and the class index
# You can alternatively use "wget"
!curl https://raw.githubusercontent.com/marcotcr/lime/master/doc/notebooks/data/imagenet_class_index.json --output imagenet_class_index.json 
!curl https://raw.githubusercontent.com/EliSchwartz/imagenet-sample-images/master/n02085620_Chihuahua.JPEG --output dog.png
!ls -l

In [None]:
# Load the JSON file. Each value of the resulting dict is a [WordNet id, class name] pair.
file_name_imagenet_class = "imagenet_class_index.json"
with open(file_name_imagenet_class, 'r') as f:
    class_idx = json.load(f)
    
# Print the class names
for c in class_idx.values():
    print(c)
    

### Question 1: What is ImageNet?

#### Answer:

The ImageNet project is a large visual database designed for use in visual object recognition software research. More than 14 million images have been hand-annotated by the project to indicate what objects are pictured and in at least one million of the images, bounding boxes are also provided.

ImageNet is a large collection of images, labeled against WordNet 3.0 and described at http://image-net.org/. 
WordNet is a lexical database of semantic relations between words in more than 200 languages. 

### Question 2: What is Inception?

#### Answer:

Inception-v3 is a convolutional neural network architecture from the Inception family. It is 48 layers deep. A pretrained version of the network, trained on more than a million images from the ImageNet database, is available. It can classify images into 1000 object categories, such as keyboard, mouse, pencil, and many animals. As a result, the network has learned rich feature representations for a wide range of images.

Load our test image and see how it looks.

In [None]:
def get_image(path):
    with open(os.path.abspath(path), 'rb') as f:
        with Image.open(f) as img:
            return img.convert('RGB') 
        
img = get_image('dog.png')
imgplot = plt.imshow(img)

We need to convert this image to a PyTorch tensor and normalize it with the per-channel mean and standard deviation used by our pretrained model.

In [None]:
# Resize the image, take the center crop our model expects, then convert to a normalized tensor
def get_input_transform():
    normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                    std=[0.229, 0.224, 0.225])       
    transf = transforms.Compose([
        transforms.Resize((256, 256)),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        normalize
    ])    

    return transf

def get_input_tensors(img):
    transf = get_input_transform()
    # unsqueeze converts a single image to a batch of 1
    return transf(img).unsqueeze(0)
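
To make the normalization step concrete, here is a minimal pure-Python sketch of the per-channel arithmetic that `transforms.Normalize` applies once `ToTensor` has scaled the pixel values to [0, 1]. The helper name `normalize_pixel` is hypothetical and only used for illustration; the mean and standard deviation values are the ones from the cell above.

```python
# Per-channel statistics copied from get_input_transform above.
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]

def normalize_pixel(rgb_uint8):
    """Normalize one RGB pixel given as three integers in [0, 255].

    Hypothetical helper for illustration; it is not part of torchvision.
    """
    # ToTensor scales 8-bit values to floats in [0, 1].
    scaled = [v / 255.0 for v in rgb_uint8]
    # Normalize subtracts the channel mean and divides by the channel std.
    return [(x - m) / s for x, m, s in zip(scaled, MEAN, STD)]

white = normalize_pixel([255, 255, 255])
black = normalize_pixel([0, 0, 0])
print(white)
print(black)
```

After this step a pure white pixel maps to positive values and a pure black pixel to negative values in every channel, which is the input range the pretrained weights were trained on.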

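### Question 3: What is a pretrained model?

#### Answer:

A pretrained model is a network whose weights have already been trained on a large dataset (here, ImageNet), so it can be used for prediction without training from scratch. The pretrained Inception-v3 model is described on the PyTorch website:

https://pytorch.org/hub/pytorch_vision_inception_v3/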

In [None]:
# Note: torchvision 0.13 and later deprecate `pretrained=True` in favor of the `weights` argument.
model = models.inception_v3(pretrained=True)


### Question 4: What do we call the approach of using a pretrained model for a new task?

#### Answer:

We call it transfer learning.



### Question 5: Print the pretrained model and analyse its architecture (how many layers, of which types, etc.)

In [None]:
# fill in this cell
print(model)

Load the label texts for ImageNet predictions so we know what the model is predicting.

The file 'imagenet_class_index.json' contains the mapping from ImageNet class index to ImageNet class name. We use it to look up the class name of a predicted index.

In [None]:
idx2label, cls2label, cls2idx = [], {}, {}
with open(os.path.abspath(file_name_imagenet_class), 'r') as read_file:
    class_idx = json.load(read_file)
    # Create the list of class labels
    idx2label = [class_idx[str(k)][1] for k in range(len(class_idx))]
    cls2label = {class_idx[str(k)][0]: class_idx[str(k)][1] for k in range(len(class_idx))}
    cls2idx = {class_idx[str(k)][0]: k for k in range(len(class_idx))}    
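
As a small illustration of the three lookups built above, the sketch below applies the same comprehensions to a three-entry excerpt in the format of `imagenet_class_index.json` (index as a string key, then a `[WordNet id, class name]` pair). The `sample_*` names are hypothetical and exist only for this example.

```python
import json

# Three entries in the same format as imagenet_class_index.json:
# "index": [WordNet id, class name]
sample_json = '''
{"0": ["n01440764", "tench"],
 "1": ["n01443537", "goldfish"],
 "2": ["n01484850", "great_white_shark"]}
'''
sample_idx = json.loads(sample_json)

# Position k in the list holds the class name of index k.
sample_idx2label = [sample_idx[str(k)][1] for k in range(len(sample_idx))]
# WordNet id -> class name.
sample_cls2label = {v[0]: v[1] for v in sample_idx.values()}
# WordNet id -> integer class index.
sample_cls2idx = {sample_idx[str(k)][0]: k for k in range(len(sample_idx))}

print(sample_idx2label[1])
print(sample_cls2idx["n01484850"])
```

The list form is what we need later, because the model returns integer class indices.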

Get the prediction for our image.

In [None]:
img_t = get_input_tensors(img)
model.eval()
logits = model(img_t)

### Question 6: What is the meaning of Top N accuracy?

#### Answer:
Top N accuracy measures how often the true class is among the N classes to which the classifier assigns the highest scores.
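
The definition can be sketched in a few lines of plain Python. The function name `top_n_accuracy` and the toy scores below are assumptions made for illustration; they do not come from the model used in this lab.

```python
def top_n_accuracy(scores, true_labels, n):
    """Fraction of samples whose true label is among the n highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, true_labels):
        # Indices of the n largest scores in this row.
        top_n = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:n]
        hits += label in top_n
    return hits / len(true_labels)

# Hypothetical scores for 3 samples over 4 classes.
scores = [
    [0.1, 0.6, 0.2, 0.1],   # true class 1 is ranked first
    [0.4, 0.3, 0.2, 0.1],   # true class 1 is ranked second
    [0.1, 0.2, 0.3, 0.4],   # true class 0 is ranked last
]
labels = [1, 1, 0]

top1 = top_n_accuracy(scores, labels, 1)
top2 = top_n_accuracy(scores, labels, 2)
print(top1, top2)
```

Top 1 accuracy counts only the first sample as correct, while top 2 accuracy also counts the second one, so the value can only grow as N increases.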

The predictions we got are logits. Let's pass them through softmax to get probabilities, then take the class labels of the top 5 predictions.

In [None]:
probs = F.softmax(logits, dim=1)
probs5 = probs.topk(5)
tuple((p,c, idx2label[c]) for p, c in zip(probs5[0][0].detach().numpy(), probs5[1][0].detach().numpy()))
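
For readers who want to see what the softmax and top-k steps compute, here is a minimal sketch in plain Python. The helpers `softmax` and `top_k` and the toy logits are assumptions for illustration; in the notebook the same work is done by `F.softmax` and `Tensor.topk`.

```python
import math

def softmax(logits):
    """Convert a list of logits to probabilities that sum to 1."""
    # Subtracting the maximum does not change the result but avoids overflow.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(probs, k):
    """Return (probability, index) pairs for the k largest probabilities."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(probs[i], i) for i in order[:k]]

toy_logits = [2.0, 1.0, 0.1, -1.0]   # hypothetical values
toy_probs = softmax(toy_logits)
toy_top2 = top_k(toy_probs, 2)
print(toy_top2)
```

Softmax preserves the ordering of the logits, so the top classes are the same before and after; it only rescales the scores into probabilities.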

We are getting ready to use LIME. LIME produces an array of images from the original input image using a perturbation algorithm. So we need to provide two things: (1) the original image as a numpy array and (2) a classification function that takes an array of perturbed images as input and produces the probabilities for each class for each image as output.
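
To give an idea of what a perturbed image looks like, the sketch below switches off some regions (superpixels) of a tiny grayscale "image" stored as nested lists. The function `perturb_image`, the toy data, and the constant fill value are all assumptions for illustration; this is not LIME's actual implementation.

```python
def perturb_image(image, segments, keep, fill_value=0):
    """Return a copy of `image` in which switched-off superpixels are filled.

    image:    2D list of pixel values
    segments: 2D list with the superpixel id of every pixel
    keep:     list where keep[s] is 1 to keep superpixel s and 0 to hide it
    """
    return [
        [pix if keep[seg] else fill_value for pix, seg in zip(img_row, seg_row)]
        for img_row, seg_row in zip(image, segments)
    ]

toy_image = [
    [10, 20, 30],
    [40, 50, 60],
]
toy_segments = [
    [0, 0, 1],
    [2, 2, 1],
]
# Keep superpixels 0 and 2, hide superpixel 1.
perturbed = perturb_image(toy_image, toy_segments, keep=[1, 0, 1])
print(perturbed)
```

Each perturbed image corresponds to one binary vector such as `[1, 0, 1]`, and the classifier's output on these images is what the explanation is later fitted to.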

For PyTorch, we first need to define two separate transforms: (1) one that takes a PIL image and resizes and crops it, and (2) one that takes the resized, cropped image and applies normalization.

In [None]:
def get_pil_transform(): 
    transf = transforms.Compose([
        transforms.Resize((256, 256)),
        transforms.CenterCrop(224)
    ])    

    return transf

def get_preprocess_transform():
    normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                    std=[0.229, 0.224, 0.225])     
    transf = transforms.Compose([
        transforms.ToTensor(),
        normalize
    ])    

    return transf    

pill_transf = get_pil_transform()
preprocess_transform = get_preprocess_transform()

Now we are ready to define the classification function that LIME needs. The input to this function is a numpy array of images, where each image is an ndarray of shape (height, width, channel), matching the array we pass to LIME. The output is a numpy array of shape (image index, classes), where each value is the probability for that image and class combination.

In [None]:
def batch_predict(images):
    model.eval()
    batch = torch.stack(tuple(preprocess_transform(i) for i in images), dim=0)

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)
    batch = batch.to(device)
    
    logits = model(batch)
    probs = F.softmax(logits, dim=1)
    return probs.detach().cpu().numpy()

Let's test the function for the sample image.

In [None]:
test_pred = batch_predict([pill_transf(img)])
test_pred.squeeze().argmax()

Import LIME and create an explanation for this prediction.

In [None]:
from lime import lime_image

In [None]:
explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(np.array(pill_transf(img)), 
                                         batch_predict, # classification function
                                         top_labels=5, 
                                         num_samples=100) # number of images that will be sent to classification function

# The algorithm generates neighborhood data by randomly perturbing features from the instance. 
# It then learns locally weighted linear models on this neighborhood data to explain each of the classes 
# in an interpretable way.
# num_samples – size of the neighborhood used to learn the linear model
# top_labels – if not None, ignore labels and produce explanations for the K labels with the highest prediction probabilities
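
The idea described in the comments above (perturb, weight by proximity, fit a linear model) can be sketched without any library. Everything in this block is an assumption made for illustration: the function names, the toy black-box model, the distance measure (fraction of features switched off), and the exponential kernel. It is a simplified sketch of the technique, not the code LIME runs.

```python
import math
import random

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def explain_locally(predict_fn, num_features, num_samples=200,
                    kernel_width=0.25, seed=0):
    """Fit a proximity-weighted linear model around the all-ones instance."""
    rng = random.Random(seed)
    rows, targets, weights = [], [], []
    for i in range(num_samples):
        # The first sample is the unperturbed instance (all features on).
        z = ([1] * num_features if i == 0
             else [rng.randint(0, 1) for _ in range(num_features)])
        # Assumed distance: fraction of features that were switched off.
        distance = sum(1 - v for v in z) / num_features
        weights.append(math.exp(-(distance ** 2) / kernel_width ** 2))
        rows.append(z + [1.0])          # last column is the intercept
        targets.append(predict_fn(z))
    d = num_features + 1
    # Weighted normal equations: (X^T W X) beta = X^T W y.
    xtwx = [[sum(w * r[i] * r[j] for w, r in zip(weights, rows))
             for j in range(d)] for i in range(d)]
    xtwy = [sum(w * r[i] * y for w, r, y in zip(weights, rows, targets))
            for i in range(d)]
    for i in range(d):
        xtwx[i][i] += 1e-9              # tiny ridge term for stability
    beta = solve_linear_system(xtwx, xtwy)
    return beta[:num_features], beta[num_features]

def toy_model(z):
    """Hypothetical black box: feature 0 matters most, feature 2 not at all."""
    return 0.1 + 0.7 * z[0] + 0.2 * z[1]

coefs, intercept = explain_locally(toy_model, num_features=3)
print([round(c, 3) for c in coefs], round(intercept, 3))
```

In the image setting each binary feature stands for one superpixel, and the fitted coefficients play the role of the superpixel weights that the explanation reports. A larger `num_samples` gives the linear fit more neighborhood data at the cost of more calls to the classifier.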

Let's apply the mask to the image and see the areas that support the top prediction.

In [None]:
from skimage.segmentation import mark_boundaries

In [None]:
temp, mask = explanation.get_image_and_mask(explanation.top_labels[0], 
                                            positive_only=True, negative_only=False, 
                                            num_features=5, 
                                            hide_rest=True)
# Parameters:
# hide_rest – if True, make the non-explanation part of the return image gray
# num_features – number of superpixels to include in explanation

img_boundry1 = mark_boundaries(temp/255.0, mask)
imgplot = plt.imshow(img_boundry1)
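
As a rough illustration of how `positive_only` and `num_features` interact, the sketch below builds a mask from per-superpixel weights on a toy segmentation. The function `positive_mask` and the toy weights are hypothetical, and this is a simplified sketch rather than the library's implementation.

```python
def positive_mask(segments, segment_weights, num_features):
    """Mark pixels that belong to the top positively weighted superpixels."""
    # Keep only superpixels with a positive weight, strongest first.
    positive = sorted(
        ((seg, w) for seg, w in segment_weights.items() if w > 0),
        key=lambda item: item[1],
        reverse=True,
    )
    selected = {seg for seg, _ in positive[:num_features]}
    return [[1 if seg in selected else 0 for seg in row] for row in segments]

toy_segments = [
    [0, 0, 1],
    [2, 2, 1],
]
# Hypothetical explanation weights: superpixel 1 argues against the class.
toy_weights = {0: 0.5, 1: -0.3, 2: 0.1}

mask_top1 = positive_mask(toy_segments, toy_weights, num_features=1)
mask_top5 = positive_mask(toy_segments, toy_weights, num_features=5)
print(mask_top1)
print(mask_top5)
```

Asking for more features than there are positive superpixels simply returns all positive ones, which is why a large `num_features` can still leave parts of the image hidden.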

Let's turn on the areas that contribute against the top prediction.

In [None]:
temp, mask = explanation.get_image_and_mask(explanation.top_labels[0], 
                                            positive_only=False, negative_only=True,
                                            num_features=5, 
                                            hide_rest=True)
img_boundry2 = mark_boundaries(temp/255.0, mask)
imgplot = plt.imshow(img_boundry2)

Let's turn on both positive and negative areas.

In [None]:
temp, mask = explanation.get_image_and_mask(explanation.top_labels[0], 
                                            positive_only=False, negative_only=False,
                                            num_features=100, 
                                            hide_rest=True)
img_boundry3 = mark_boundaries(temp/255.0, mask)
imgplot = plt.imshow(img_boundry3)