# Pre-trained deep neural network usage and interpretation

<table>
    <tr>
        <td><img src="img/timber_wolf.png" style="height: 200px;"></td>
        <td><img src="img/platypus.png" style="height: 200px;"></td>
        <td><img src="img/westie.png" style="height: 200px;"></td>
    </tr>
</table>

In this tutorial notebook we will use a pre-trained deep convolutional network to automatically recognize a variety of objects in images. We will also use the Grad-CAM interpretation technique to understand which parts of an image drive the network's decisions.

## Preliminaries

The following code makes all plots and images render inside the notebook:

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

## Loading a pre-trained neural network

We will start by loading a pre-trained deep convolutional neural network: Xception. This architecture builds on the Inception family of networks, replacing Inception modules with depthwise separable convolutions and adding residual connections like those of the 152-layer ResNet that won the ImageNet Large Scale Visual Recognition Challenge in 2015. The network was trained to recognize [1000 different classes of objects](http://image-net.org/challenges/LSVRC/2017/browse-synsets).

In [None]:
from keras.applications.xception import Xception

# Load the network configured to process 299x299-pixel color (RGB) images
model = Xception(input_shape=(299, 299, 3))

We can get a description of the network structure as follows:

In [None]:
model.summary()

## Using the neural network to detect objects in an image

To begin with, we will use an image of oranges from Wikipedia. The following code downloads the image and loads it as a NumPy array:

In [None]:
from PIL import Image
import numpy as np

!wget -O input.jpg "https://upload.wikimedia.org/wikipedia/commons/b/b0/OrangeBloss_wb.jpg"
img = np.array(Image.open("./input.jpg"))

We can now visualize the downloaded image

In [None]:
display(Image.fromarray(img))

The following function transforms the image into a tensor suitable for Xception. In particular, it will:

* Resize the image to the 299 x 299 pixels expected by Xception.
* Bundle the image in a tensor that represents a batch of a single image.
* Call the Xception preprocessing function to apply all the pixel normalization steps this network needs.
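
To make the last two steps concrete, here is a minimal NumPy-only sketch. It assumes the Xception normalization amounts to rescaling pixel values from [0, 255] to [-1, 1]; the helper `scale_to_unit_range` and the random `fake_img` are hypothetical names used only for illustration, and the real `preprocess_input` should be used in practice.

```python
import numpy as np

def scale_to_unit_range(pixels):
    """Map pixel values from [0, 255] to [-1, 1] (assumed Xception-style scaling)."""
    return np.asarray(pixels, dtype=np.float32) / 127.5 - 1.0

# Hypothetical stand-in for an image that was already resized to 299x299 RGB
fake_img = np.random.default_rng(0).integers(0, 256, size=(299, 299, 3))

# Add the leading batch axis, then rescale the pixel values
batch = scale_to_unit_range(np.expand_dims(fake_img, axis=0))
print(batch.shape, batch.min(), batch.max())
```

The leading axis of size 1 is what lets the network treat a single image as a batch.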

In [None]:
from keras.applications.xception import preprocess_input
import numpy as np
from skimage.transform import resize

def preprocess_image(img):
    img = resize(img, (299, 299, 3), preserve_range=True, mode="reflect", anti_aliasing=True)
    img = np.expand_dims(img, axis=0)
    return preprocess_input(img)

After this preprocessing we can obtain predictions from the network. We will use the Xception **decode_predictions** function to translate the raw output into the names and scores of the top 5 predicted classes for this image.

In [None]:
from keras.applications.xception import decode_predictions

preds = model.predict(preprocess_image(img))
print('Predicted:', decode_predictions(preds)[0])

The top predicted class is **orange** with very high probability, followed by **lemon** with a small probability. That makes sense!
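
The decoding step itself is simple: sort the class probabilities and look up the names of the highest-ranked ones. The sketch below illustrates the idea on a toy 4-class output; `top_k_labels`, `toy_labels`, and `toy_probs` are hypothetical, and the values are made up rather than taken from Xception.

```python
import numpy as np

def top_k_labels(probs, labels, k=5):
    """Return the k (label, probability) pairs with the highest probability."""
    order = np.argsort(probs)[::-1][:k]  # indices sorted by decreasing probability
    return [(labels[i], float(probs[i])) for i in order]

# Hypothetical 4-class output, not real Xception predictions
toy_labels = ["orange", "lemon", "banana", "wolf"]
toy_probs = np.array([0.90, 0.07, 0.02, 0.01])
print(top_k_labels(toy_probs, toy_labels, k=2))
```

The Keras function additionally returns the ImageNet class identifier, so each of its entries is a `(class_id, class_name, score)` tuple.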

<img src="https://albarji-labs-materials.s3-eu-west-1.amazonaws.com/question.png" height="80" width="80" style="float: right;"/>

***

<font color=#ad3e26>
Using the functions above, write in the cell below the code needed to download a new image, preprocess it, and obtain the network predictions. You can use any image found on the internet! Just look for an image you like and use its URL in the <i>wget</i> command. How well does the network work with the images you have chosen?
</font>

***

In [None]:
# Try with any image URL you want! Here is a photograph of a couple of wolves, but feel free to change it!
!wget -O input.jpg "https://cdn.pixabay.com/photo/2019/12/19/22/48/wolf-4707294_960_720.jpg"
img = np.array(Image.open("./input.jpg"))

display(Image.fromarray(img))
preds = model.predict(preprocess_image(img))
print('Predicted:', decode_predictions(preds)[0])

## Getting explanations from the network

We can use an approximation method to ask the neural network for an explanation of its decisions. In particular, we will use the [Grad-CAM algorithm](http://gradcam.cloudcv.org/), which estimates how much each region of the image contributes to a predicted class. In this way, we can highlight the parts of the image that contribute most to the decision.
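
For reference, the Grad-CAM computation is usually written as follows. The symbols are introduced here for illustration and are not used elsewhere in this notebook: $A^k$ is the $k$-th feature map of the last convolutional layer, $y^c$ is the network score for class $c$, and $Z$ is the number of spatial positions in a feature map.

```latex
\alpha_k^c = \frac{1}{Z} \sum_{i} \sum_{j} \frac{\partial y^c}{\partial A^k_{ij}},
\qquad
L^c_{\text{Grad-CAM}} = \mathrm{ReLU}\!\left( \sum_{k} \alpha_k^c \, A^k \right)
```

The implementation in this notebook averages over channels instead of summing them. This differs only by a constant factor, which the final normalization of the heatmap removes.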

Since the Grad-CAM algorithm is not built into Keras, we will need to build it step by step. Let's see how! First, we define a function that produces a heatmap of activations following the Grad-CAM steps. Note how the function needs to receive the image to analyze, the neural network model, the name of the last convolutional layer, and the names of the classifier layers that follow it. Grad-CAM analyzes the gradient of the class score with respect to the output of that last convolutional layer.
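
Before wiring this into Keras, the arithmetic at the heart of the method can be sketched on synthetic arrays. This is only an illustration under stated assumptions: `gradcam_from_arrays` is a hypothetical helper, and the random `acts` and `grads` stand in for the activations and gradients a real network would provide.

```python
import numpy as np

def gradcam_from_arrays(activations, gradients):
    """Combine feature maps (H, W, C) and their gradients (H, W, C) into a heatmap."""
    # One weight per channel: the gradient averaged over all spatial positions
    weights = gradients.mean(axis=(0, 1))
    # Weighted sum of the channels, keeping only positive evidence for the class
    heatmap = np.maximum((activations * weights).sum(axis=-1), 0)
    # Normalize to [0, 1], guarding against an all-zero map
    peak = heatmap.max()
    return heatmap / peak if peak > 0 else heatmap

rng = np.random.default_rng(0)
acts = rng.random((10, 10, 8))        # synthetic activations
grads = rng.normal(size=(10, 10, 8))  # synthetic gradients
heat = gradcam_from_arrays(acts, grads)
print(heat.shape, heat.min(), heat.max())
```

The resulting heatmap has the spatial resolution of the feature map, not of the input image, which is why it has to be resized before being overlaid on the picture.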

In [None]:
import matplotlib.cm as cm

from keras import Input, Model
import tensorflow as tf

def make_gradcam_heatmap(img_array, model, last_conv_layer_name, classifier_layer_names, top_n=0):
    """Creates a Grad-CAM heatmap showing hot spots for a given predicted class
    The target class is the one ranked top_n among the predictions (0 is the top class).
    Adapted from https://keras.io/examples/vision/grad_cam/#the-gradcam-algorithm
    """
    # First, we create a model that maps the input image to the activations
    # of the last conv layer
    last_conv_layer = model.get_layer(last_conv_layer_name)
    last_conv_layer_model = Model(model.inputs, last_conv_layer.output)

    # Second, we create a model that maps the activations of the last conv
    # layer to the final class predictions
    classifier_input = Input(shape=last_conv_layer.output.shape[1:])
    x = classifier_input
    for layer_name in classifier_layer_names:
        x = model.get_layer(layer_name)(x)
    classifier_model = Model(classifier_input, x)

    # Then, we compute the gradient of the top predicted class for our input image
    # with respect to the activations of the last conv layer
    with tf.GradientTape() as tape:
        # Compute activations of the last conv layer and make the tape watch it
        last_conv_layer_output = last_conv_layer_model(img_array)
        tape.watch(last_conv_layer_output)
        # Compute class predictions for single image
        preds = classifier_model(last_conv_layer_output)[0]
        # Get the output for the class ranked top_n (0 = top prediction)
        idx = tf.math.top_k(preds, k=top_n+1).indices[top_n]
        target_class_channel = preds[idx]

    # This is the gradient of the target predicted class with regard to
    # the output feature map of the last conv layer
    grads = tape.gradient(target_class_channel, last_conv_layer_output)

    # This is a vector where each entry is the mean intensity of the gradient
    # over a specific feature map channel
    pooled_grads = tf.reduce_mean(grads, axis=(0, 1, 2))

    # We multiply each channel in the feature map array
    # by "how important this channel is" with regard to the top predicted class
    last_conv_layer_output = last_conv_layer_output.numpy()[0]
    pooled_grads = pooled_grads.numpy()
    for i in range(pooled_grads.shape[-1]):
        last_conv_layer_output[:, :, i] *= pooled_grads[i]

    # The channel-wise mean of the resulting feature map
    # is our heatmap of class activation
    heatmap = np.mean(last_conv_layer_output, axis=-1)

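    # Filter out negative values, as we want to focus on regions that contribute positively to the class
    # Also, for visualization purposes, we normalize the heatmap between 0 and 1,
    # skipping the division when no region contributes positively
    heatmap = np.maximum(heatmap, 0)
    max_value = np.max(heatmap)
    return heatmap / max_value if max_value > 0 else heatmap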

To obtain a more appealing visualization we define the following function, which superimposes the heatmap onto the original image, producing a new image.
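
The core of this step is a weighted blend of two RGB arrays. The sketch below shows that blend in isolation with NumPy; `blend_heatmap` and the toy arrays are hypothetical, and the sketch clips the result to the 0-255 range, which is an assumption that may differ from how the Keras image utilities rescale values.

```python
import numpy as np

def blend_heatmap(img, heatmap_rgb, alpha=0.4):
    """Overlay an RGB heatmap (0-255) on an RGB image (0-255), both shaped (H, W, 3)."""
    blended = img.astype(np.float32) + alpha * heatmap_rgb.astype(np.float32)
    # Round and clip so the result is a valid 8-bit image
    return np.clip(np.rint(blended), 0, 255).astype(np.uint8)

# Toy example: a uniform gray image with a red band covering its top half
toy_img = np.full((4, 4, 3), 100, dtype=np.uint8)
toy_heat = np.zeros((4, 4, 3), dtype=np.uint8)
toy_heat[:2, :, 0] = 255
overlay = blend_heatmap(toy_img, toy_heat)
print(overlay[0, 0], overlay[3, 3])
```

A small `alpha` keeps the original image visible under the colors; larger values make the heatmap dominate.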

In [None]:
from keras.preprocessing import image

def superimposed_heatmap(img, heatmap):
    """Creates a new image superimposing a given heatmap on top of an image
    
    Adapted from https://keras.io/examples/vision/grad_cam/#the-gradcam-algorithm
    """
    # We rescale heatmap to a range 0-255
    heatmap = np.uint8(255 * heatmap)

    # We use the jet colormap to colorize the heatmap
    # (plt is matplotlib.pyplot, imported at the top of the notebook)
    jet = plt.get_cmap("jet")

    # We use RGB values of the colormap
    jet_colors = jet(np.arange(256))[:, :3]
    jet_heatmap = jet_colors[heatmap]

    # We create an image with RGB colorized heatmap
    jet_heatmap = image.array_to_img(jet_heatmap)
    jet_heatmap = jet_heatmap.resize((img.shape[1], img.shape[0]))
    jet_heatmap = image.img_to_array(jet_heatmap)

    # Superimpose the heatmap on original image
    superimposed_img = jet_heatmap * 0.4 + img
    superimposed_img = image.array_to_img(superimposed_img)
    
    return superimposed_img

Finally we create a function that prints out a full Grad-CAM classification report for a given image. It will display the original image together with Grad-CAM visualizations for the top predicted classes.

In [None]:
def gradcam_report(img, model, last_conv_layer_name, classifier_layer_names, top_n=3):
    """Displays an image followed by Grad-CAM overlays for its top_n predicted classes"""
    # Obtain predictions
    x = preprocess_image(img)
    preds = model.predict(x)
    decoded_preds = decode_predictions(preds, top=top_n)[0]
    
    # Compute a grad cam visualization for each one of the top predicted classes
    heatmaps = [
        make_gradcam_heatmap(x, model, last_conv_layer_name, classifier_layer_names, i)
        for i in range(top_n)
    ]
    img_array = image.img_to_array(img)
    superimposed_imgs = [superimposed_heatmap(img_array, heatmap) for heatmap in heatmaps]
    
    # Show report
    display(Image.fromarray(img))
    print("Original image")
    
    for decoded_pred, visual in zip(decoded_preds, superimposed_imgs):
        display(visual)
        print(f"{decoded_pred[1]} ({decoded_pred[2]*100:.1f}%)")

Let's try it! 

As **last_conv_layer_name** we should use the name of the layer that produces the final convolutional feature map. In the Xception network the last convolution, `block14_sepconv2`, is followed by batch normalization and a ReLU activation, so we take the output of the activation layer, `block14_sepconv2_act`. The **classifier_layer_names** must list every layer between that output and the predictions, so that the rebuilt classifier matches the original network. For Xception these are the final pooling layer and the predictions layer, `["avg_pool", "predictions"]`.

If you want to use another neural network, you will need to check the model summary to find the names of the appropriate layers.

In [None]:
# Display Grad-CAM
gradcam_report(img, model, "block14_sepconv2_act", ["avg_pool", "predictions"], top_n=5)
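
When adapting this to a different architecture, one possible shortcut is to look for the last layer whose output still has spatial dimensions. The sketch below works on a plain list of `(name, output_shape)` pairs; `last_spatial_layer` and `toy_shapes` are hypothetical, and the commented line showing how to build the list from a Keras model assumes every layer has a single output.

```python
def last_spatial_layer(layer_shapes):
    """Return the name of the last layer whose output is 4-D (batch, H, W, C)."""
    for name, shape in reversed(layer_shapes):
        if len(shape) == 4:
            return name
    raise ValueError("no layer with a 4-D output was found")

# Hypothetical tail of a network, written by hand for illustration
toy_shapes = [
    ("conv_a", (None, 10, 10, 64)),
    ("conv_a_act", (None, 10, 10, 64)),
    ("avg_pool", (None, 64)),
    ("predictions", (None, 1000)),
]
print(last_spatial_layer(toy_shapes))

# With a real Keras model the list could be built like this (assumption, see above):
# layer_shapes = [(layer.name, tuple(layer.output.shape)) for layer in model.layers]
```

The layers that come after the one returned are the candidates for the classifier layer names; it is still worth confirming them against the model summary.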

<img src="https://albarji-labs-materials.s3-eu-west-1.amazonaws.com/question.png" height="80" width="80" style="float: right;"/>

***

<font color=#ad3e26>
Use the cell below to obtain explanations for other images of your choice. Can you find images for which the explanations make sense? What about images where the network predicts the correct class, but for which the explanations do not make sense?
</font>

***

In [None]:
# Here we use an image of a platypus to try to confuse the neural network, but try something different yourself!
!wget -O input.jpg "https://upload.wikimedia.org/wikipedia/commons/thumb/8/88/A_duck_billed_platypus_%28watermole%29._Colour_lithograph_after_Wellcome_V0021174ER.jpg/397px-A_duck_billed_platypus_%28watermole%29._Colour_lithograph_after_Wellcome_V0021174ER.jpg"

img = np.array(Image.open("./input.jpg"))

gradcam_report(img, model, "block14_sepconv2_act", ["avg_pool", "predictions"], top_n=5)