---
# <div align="center"><font color='blue'> COSC 2779 | Deep Learning  </font></div>
## <div align="center"> <font color='blue'> Week 12 Lab Exercises: **Model Interpretation**</font></div>
---

# Introduction

This lab introduces some basic techniques for interpreting deep neural network models.

In this lab, you will:
- Use self-developed scripts to do
  - Feature Visualisation
  - Feature Attribution
- Use functionality in the `tf-explain` library to do model interpretation.

![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)  This notebook is designed to run on Google Colab. If you would like to run it on your local machine, make sure that you have installed TensorFlow 2.0 or later.

## Setting up the Notebook

Let's first load the packages we need.

In [None]:
import tensorflow as tf
AUTOTUNE = tf.data.experimental.AUTOTUNE
import numpy as np
import pandas as pd

import tensorflow_datasets as tfds
import pathlib
import shutil
import tempfile

from  IPython import display
from matplotlib import pyplot as plt

from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from tensorflow.keras import layers, losses
from tensorflow.keras.datasets import fashion_mnist
from tensorflow.keras.models import Model

## Setting up Instrumentation

We can use TensorBoard to view the learning curves and the activation and weight histograms. Let's first set it up.

In [None]:
logdir = pathlib.Path(tempfile.mkdtemp())/"tensorboard_logs"
shutil.rmtree(logdir, ignore_errors=True)

# Load the TensorBoard notebook extension
%load_ext tensorboard

# Open an embedded TensorBoard viewer
%tensorboard --logdir {logdir}/models

## Setting up the model

First we will read an image that we will be using for this lab. 

In [None]:
from google.colab import files
upload_file = files.upload()

In [None]:
IMAGE_PATH = './' + list(upload_file.keys())[0]

In this lab we will try to interpret the ResNet50V2 model trained on the ImageNet task. First we need to get the model from Keras Applications and load the weights.

In [None]:
# Model to examine
model = tf.keras.applications.ResNet50V2(weights='imagenet', include_top=True)

from tensorflow.keras.utils import plot_model
# plot_model(model, show_shapes=True)

In [None]:
plot_model(model, show_shapes=True)

Now let's predict the class for the uploaded image and print the results.

In [None]:
# Preprocess the Image to pass as input
img = tf.keras.preprocessing.image.load_img(IMAGE_PATH, target_size=(224, 224))
img = tf.keras.preprocessing.image.img_to_array(img)
img_input = tf.expand_dims(img, axis=0)
img_input = tf.cast(img_input, tf.float32)
img_input = tf.keras.applications.resnet_v2.preprocess_input(img_input)

# Do the prediction
prediction = model.predict(img_input)

# Get the top 5 predictions and print
top5_pred = tf.keras.applications.imagenet_utils.decode_predictions(prediction, top=5)

for class_id, class_name , prob in top5_pred[0]:
  print('{} : {:2.2f}'.format(class_name, prob))

plt.imshow(img.astype(np.uint8))
plt.axis('off')
plt.show()
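
The preprocessing step above rescales pixel values before they reach the network. As a rough illustration, assuming the `resnet_v2` preprocessing maps `[0, 255]` to `[-1, 1]` via `x / 127.5 - 1`, the sketch below shows that scaling and its inverse in plain NumPy. The helper names `to_model_range` and `to_display_range` are hypothetical and are not part of Keras.

```python
import numpy as np

def to_model_range(pixels):
    """Map pixel values in [0, 255] to [-1, 1] (assumed resnet_v2 scaling)."""
    return pixels / 127.5 - 1.0

def to_display_range(scaled):
    """Invert the scaling so the array can be shown as an 8-bit image."""
    return np.clip((scaled / 2.0 + 0.5) * 255.0, 0, 255).astype(np.uint8)

# Three representative pixel values: black, mid-grey, white
pixels = np.array([0.0, 127.5, 255.0], dtype=np.float32)
scaled = to_model_range(pixels)
restored = to_display_range(scaled)

print(scaled)
print(restored)
```

The inverse mapping is handy whenever a preprocessed image needs to be displayed with `plt.imshow`, since the network input itself is no longer in the displayable range.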

# Model interpretation with self-developed scripts

Let's write some Python scripts to do model interpretation. This section is designed to give you a brief overview of the underlying techniques.

## Feature Visualisation

In this segment we will try out some feature visualisation techniques.

### Plot learned weights of the first layer

In [None]:
weights = model.get_layer('conv1_conv').get_weights()[0]
print(weights.shape)

plt.figure(figsize=(10,10))

for i in range(0, weights.shape[-1]):
  weights_ = weights[:,:,:,i]
  weights_ = (weights_-weights_.min())/(weights_.max()-weights_.min())
  plt.subplot(8,8,i+1)
  plt.imshow(weights_)
  plt.axis('off')

plt.show()

### Output visualisation of a layer

For middle layers in a CNN we can simply visualise what comes out of the activation layers. Does the output still look relevant? Or does it look like random noise? By examining how the image transits through the network, you can validate that it focuses on the right regions.

In [None]:
layers_name = ['conv2_block2_2_relu']

# Get the outputs of layers we want to inspect
outputs = [
    layer.output for layer in model.layers
    if layer.name in layers_name
]

# Create a connection between the input and those target outputs
activations_model = tf.keras.models.Model(model.inputs, outputs=outputs)
activations_model.compile(optimizer='adam', loss='categorical_crossentropy')

# Get their outputs
activations_1 = activations_model.predict(img_input)

print('Activation Shape: ', activations_1.shape)

In [None]:
plt.figure(figsize=(20,20))
for i in range(1,64+1):
  plt.subplot(8,8,i)
  plt.imshow(activations_1[0,:,:,i-1])
  plt.axis('off')
  
plt.show()

### Maximal Activation Input

Seeing what is coming out of a layer is great, but what if we could understand what makes a kernel activate?

Here we want to generate an input to the network that maximises the output of a given filter. Therefore, we create a new submodel that goes from the input layer to the layer we are interested in. The loss function we seek to maximise is the mean of the chosen filter's output in this layer.

We start from a random noise image and then update the image pixels using backprop of the loss defined above. Note that here we do not update the weights of the CNN.

In [None]:
# Layer name to inspect
layer_name = 'conv5_block2_out'

# Create a connection between the input and the target layer
submodel = tf.keras.models.Model(model.inputs, model.get_layer(layer_name).output)

epochs = 100
step_size = 0.1
filter_index = 25

# Initiate random noise
input_img_data = np.random.random((1, 224, 224, 3))
input_img_data = (input_img_data - 0.5) * 20 + 128.
input_img_data = tf.cast(input_img_data, tf.float32)
input_img_data = tf.keras.applications.resnet_v2.preprocess_input(input_img_data)

input_img_data = tf.Variable(input_img_data)

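# Run gradient ascent on the input image
for _ in range(epochs):
    with tf.GradientTape() as tape:
        outputs = submodel(input_img_data)
        # Mean activation of the chosen filter, minus a small L2 penalty that keeps pixel values bounded
        loss_value = tf.reduce_mean(outputs[:,:,:, filter_index]) - 0.01 * tf.reduce_mean(input_img_data**2)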
    grads = tape.gradient(loss_value, input_img_data)
    normalized_grads = grads / (tf.sqrt(tf.reduce_mean(tf.square(grads))) + 1e-5)
    input_img_data.assign_add(normalized_grads * step_size)

In [None]:
resmap = input_img_data[0,:,:,:].numpy()
resmap = (resmap - resmap.min()) / (resmap.max() - resmap.min())
plt.figure(figsize=(10,10))
plt.imshow((resmap*255).astype(np.uint8))
plt.axis('off')
plt.show()
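
The update rule used in the loop above (normalise the gradient by its root mean square, then take a fixed-size step) can be tried on a toy problem without a network. The sketch below is only an illustration under simplifying assumptions: the "filter response" is a hypothetical linear function `mean(w * x)` whose gradient is known in closed form, so no automatic differentiation is needed.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 8))          # hypothetical fixed "filter"
x = rng.normal(size=(8, 8)) * 0.1    # start from low-amplitude noise

def response(x):
    """Toy filter response: mean of the element-wise product with w."""
    return float(np.mean(w * x))

def response_grad(x):
    """Analytic gradient of the toy response with respect to x."""
    return w / w.size

step_size = 0.1
history = [response(x)]
for _ in range(50):
    grads = response_grad(x)
    # Same normalisation as in the notebook: divide by the RMS of the gradient
    normalized = grads / (np.sqrt(np.mean(np.square(grads))) + 1e-5)
    x = x + step_size * normalized
    history.append(response(x))

print(history[0], history[-1])
```

Because the gradient is rescaled to unit RMS, the step size alone controls how far the input moves per iteration, regardless of how small the raw gradients are.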

## Feature Attribution

In this section we are interested in knowing how each input contributed to a particular prediction.



### Saliency via occlusion
For this we are first going to use saliency via occlusion.

In this method, we mask part of the image before it is fed to the CNN and see how much the predicted probability for a class changes.

In [None]:
# Create function to apply a grey patch on an image
def apply_grey_patch(image, top_left_x, top_left_y, patch_size):
    patched_image = np.array(image, copy=True)
    patched_image[top_left_y:top_left_y + patch_size, top_left_x:top_left_x + patch_size, :] = 127.5

    return patched_image

# Load image
img = tf.keras.preprocessing.image.load_img(IMAGE_PATH, target_size=(224, 224))
img = tf.keras.preprocessing.image.img_to_array(img)

# category to id mapping in ImageNet
# German_shepherd -> 235
# tennis_ball -> 852
# Walker_hound -> 166
# Tabby_cat -> 281


CLASS_INDEX = 235  # Imagenet class index
PATCH_SIZE = 60

sensitivity_map = np.zeros((img.shape[0], img.shape[1]))
count_map = np.zeros((img.shape[0], img.shape[1]))

# Iterate the patch over the image
for top_left_x in range(0, img.shape[1], PATCH_SIZE//8):
    for top_left_y in range(0, img.shape[0], PATCH_SIZE//8):
        patched_image = apply_grey_patch(img, top_left_x, top_left_y, PATCH_SIZE)

        patched_image = tf.expand_dims(patched_image, axis=0)
        patched_image = tf.cast(patched_image, tf.float32)
        patched_image = tf.keras.applications.resnet_v2.preprocess_input(patched_image)

        predicted_classes = model.predict(patched_image)[0]
        confidence = predicted_classes[CLASS_INDEX]
        
        # Save confidence for this specific patched image in map
        sensitivity_map[
            top_left_y:top_left_y + PATCH_SIZE,
            top_left_x:top_left_x + PATCH_SIZE,
        ] = sensitivity_map[top_left_y:top_left_y + PATCH_SIZE,top_left_x:top_left_x + PATCH_SIZE,] + confidence

        count_map[
            top_left_y:top_left_y + PATCH_SIZE,
            top_left_x:top_left_x + PATCH_SIZE,
        ] = count_map[top_left_y:top_left_y + PATCH_SIZE,top_left_x:top_left_x + PATCH_SIZE,] + 1.0

sensitivity_map = sensitivity_map / count_map

Plot the sensitivity map.

In [None]:
sensitivity_map = (sensitivity_map - sensitivity_map.min()) / (sensitivity_map.max() - sensitivity_map.min())
sensitivity_map = (sensitivity_map -1 )*-1 # Invert color space

plt.figure(figsize=(5,5))
plt.imshow(img.astype(np.uint8))
plt.imshow((sensitivity_map*255).astype(np.uint8), alpha=.5,cmap='jet')
plt.axis('off')
plt.show()
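
To see the occlusion idea in isolation, the following sketch replaces the CNN with a hypothetical scoring function, `toy_score`, that simply measures the brightness of a known "object" region. Everything here (the 24x24 image, the patch size, the stride and the scoring function) is an assumption made for illustration; only the accumulate-and-average logic mirrors the cells above.

```python
import numpy as np

image = np.zeros((24, 24), dtype=np.float32)
image[8:16, 8:16] = 1.0   # the "object" the toy classifier looks at

def toy_score(img):
    """Hypothetical stand-in for the class probability."""
    return float(img[8:16, 8:16].mean())

patch, stride = 8, 4
score_sum = np.zeros_like(image)
count = np.zeros_like(image)

for top in range(0, image.shape[0], stride):
    for left in range(0, image.shape[1], stride):
        occluded = image.copy()
        occluded[top:top + patch, left:left + patch] = 0.0
        s = toy_score(occluded)
        # Credit the score to every pixel hidden by this patch
        score_sum[top:top + patch, left:left + patch] += s
        count[top:top + patch, left:left + patch] += 1.0

mean_score = score_sum / count
# A large drop from the unoccluded score means the pixel matters
importance = toy_score(image) - mean_score

print(importance[12, 12], importance[0, 0])
```

Pixels inside the object end up with the largest drop in score, while pixels that never overlap the object when occluded get an importance of zero.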

### Saliency via Backprop

Compute the gradient of the class score with respect to the image pixels, then take the absolute value and the maximum over the RGB channels.

In [None]:
import matplotlib.cm as cm

CLASS_INDEX = 235
LAYER_NAME = 'predictions'

# Read and pre process the image
img = tf.keras.preprocessing.image.load_img(IMAGE_PATH, target_size=(224, 224))
img = tf.keras.preprocessing.image.img_to_array(img)
img_input = tf.expand_dims(img, axis=0)
img_input = tf.cast(img_input, tf.float32)
img_input = tf.keras.applications.resnet_v2.preprocess_input(img_input)

# Create a sub model
img_input = tf.Variable(img_input)
grad_model_bp = tf.keras.models.Model([model.inputs], [model.get_layer(LAYER_NAME).output])

# compute the gradients
with tf.GradientTape() as tape:
    predictions = grad_model_bp(img_input)
    loss = predictions[:, CLASS_INDEX]

grads = tape.gradient(loss, img_input)


#Normalize and plot
print(grads.numpy().shape)
grads = grads.numpy()[0,:,:,:]
grads = np.abs(grads)
grads = np.max(grads, axis=-1)
print(grads.shape)

grads = (grads - grads.min()) / (grads.max() - grads.min())
plt.figure(figsize=(5,5))
plt.imshow(img.astype(np.uint8))
plt.imshow((grads*255).astype(np.uint8), alpha=.5,cmap='jet')
plt.axis('off')
plt.show()
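
The post-processing applied to the gradients above (absolute value, maximum over channels, min-max scaling) is easy to check on a small hand-made array. The sketch below assumes the gradient is already available as an `(H, W, C)` NumPy array; the helper name `saliency_from_grads` and the example values are hypothetical.

```python
import numpy as np

def saliency_from_grads(grads):
    """Collapse an (H, W, C) gradient into an (H, W) map scaled to [0, 1]."""
    sal = np.max(np.abs(grads), axis=-1)
    return (sal - sal.min()) / (sal.max() - sal.min() + 1e-8)

grads = np.zeros((4, 4, 3), dtype=np.float32)
grads[1, 2] = [0.1, -0.9, 0.3]   # strong negative gradient in the green channel
grads[3, 0] = [0.2, 0.0, -0.1]   # weaker gradient elsewhere

saliency = saliency_from_grads(grads)
print(saliency)
```

Taking the absolute value first means that a pixel counts as salient whether increasing it raises or lowers the class score, which is why the strongly negative entry dominates the map here.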

### GRAD-CAM

Grad-CAM combines saliency by backprop and class activation maps. This method is mostly applicable to CNN architectures that use global average pooling.

In [None]:
import matplotlib.cm as cm

LAYER_NAME = 'post_relu'
CLASS_INDEX = 235

#read image and preprocess
img = tf.keras.preprocessing.image.load_img(IMAGE_PATH, target_size=(224, 224))
img = tf.keras.preprocessing.image.img_to_array(img)
img_input = tf.expand_dims(img, axis=0)
img_input = tf.cast(img_input, tf.float32)
img_input = tf.keras.applications.resnet_v2.preprocess_input(img_input)

# get a sub model to output final prediction and the output before GAP layer
grad_model = tf.keras.models.Model([model.inputs], [model.get_layer(LAYER_NAME).output, model.output])

# compute the gradients
with tf.GradientTape() as tape:
    conv_outputs, predictions = grad_model(img_input)
    loss = predictions[:, CLASS_INDEX]

output = conv_outputs[0]
grads = tape.gradient(loss, conv_outputs)[0]

# Guided gradients: keep only positive gradients at positions with positive activations
gate_f = tf.cast(output > 0, 'float32')
gate_r = tf.cast(grads > 0, 'float32')
guided_grads = gate_f * gate_r * grads

weights = tf.reduce_mean(guided_grads, axis=(0, 1))

# CAM: weighted sum of the feature maps, with the pooled gradients as channel weights
cam = tf.reduce_sum(weights * output, axis=-1)


#Plot results
cam = cam.numpy()
cam = np.maximum(cam, 0)
heatmap = (cam - cam.min()) / (cam.max() - cam.min())
                                               
from PIL import Image
heatmap = Image.fromarray(np.uint8(heatmap*255))
heatmap = heatmap.resize((224, 224))

plt.figure(figsize=(5,5))
plt.imshow(img.astype(np.uint8))
plt.imshow(heatmap, alpha=.5,cmap='jet')
plt.axis('off')
plt.show()
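
The core Grad-CAM computation (average the gradients over the spatial dimensions to get one weight per channel, take the weighted sum of the feature maps, then apply a ReLU) can be sketched in NumPy on tiny arrays. This is a simplified illustration that omits the guided-gradient gating used in the cell above and scales by the maximum only; the function name `grad_cam_map` and the example inputs are hypothetical.

```python
import numpy as np

def grad_cam_map(feature_maps, grads):
    """feature_maps, grads: (H, W, K) arrays. Returns an (H, W) map in [0, 1]."""
    weights = grads.mean(axis=(0, 1))                           # one weight per channel
    cam = np.tensordot(feature_maps, weights, axes=([2], [0]))  # weighted sum over channels
    cam = np.maximum(cam, 0.0)                                  # keep positive evidence only
    return cam / (cam.max() + 1e-8)

feature_maps = np.zeros((3, 3, 2), dtype=np.float32)
feature_maps[0, 0, 0] = 2.0   # channel 0 fires in the top-left corner
feature_maps[2, 2, 1] = 2.0   # channel 1 fires in the bottom-right corner

grads = np.zeros((3, 3, 2), dtype=np.float32)
grads[..., 0] = 1.0    # class score rises with channel 0
grads[..., 1] = -1.0   # class score falls with channel 1

heatmap = grad_cam_map(feature_maps, grads)
print(heatmap)
```

Only the location where the positively weighted channel fires survives the ReLU, which is the behaviour that makes the heatmap class-specific.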


# Using the `tf-explain` library for model interpretation

Let's look at some examples of using the tf-explain library.

In [None]:
!pip install tf-explain

## SmoothGrad

Let's see why the ResNet model classifies the given image as a German shepherd.

In [None]:
from tf_explain.core.smoothgrad import SmoothGrad

CLASS_INDEX = 235 # German shepherd
model = tf.keras.applications.ResNet50V2(weights='imagenet', include_top=True)

#read image and preprocess
img = tf.keras.preprocessing.image.load_img(IMAGE_PATH, target_size=(224, 224))
img = tf.keras.preprocessing.image.img_to_array(img)
img = tf.keras.applications.resnet_v2.preprocess_input(img)

data = ([img], None)

explainer = SmoothGrad()
# Compute SmoothGrad on ResNet50V2
grid = explainer.explain(data, model, CLASS_INDEX, 20, 1.,)
explainer.save(grid, '.', 'smoothgrad.png')

In [None]:
plt.figure(figsize=(5,5))
plt.imshow(((img/2.0+0.5)*255).astype(np.uint8))
plt.imshow(grid, alpha=.5,cmap='jet')
plt.axis('off')
plt.show()

Let's see why the ResNet model would give a high probability to the tennis ball class for the given image.

In [None]:
CLASS_INDEX = 852 # tennis ball

grid = explainer.explain(data, model, CLASS_INDEX, 20, .05)
explainer.save(grid, '.', 'smoothgrad.png')

plt.figure(figsize=(5,5))
plt.imshow(((img/2.0+0.5)*255).astype(np.uint8))
plt.imshow(grid, alpha=.5,cmap='jet')
plt.axis('off')
plt.show()
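
For intuition about what SmoothGrad computes, the sketch below averages gradients over several noisy copies of an input, using a toy score `sum(relu(x))` whose gradient is a hard step. It is a minimal NumPy illustration, not the `tf-explain` implementation; the function names, the parameter names `num_samples` and `noise`, and their default values are assumptions chosen for this example.

```python
import numpy as np

def grad_fn(x):
    """Gradient of the toy score sum(relu(x)): 1 where x > 0, else 0."""
    return (x > 0).astype(np.float32)

def smoothgrad(x, grad_fn, num_samples=200, noise=0.5, seed=0):
    """Average grad_fn over num_samples Gaussian-perturbed copies of x."""
    rng = np.random.default_rng(seed)
    total = np.zeros_like(x, dtype=np.float32)
    for _ in range(num_samples):
        noisy = x + rng.normal(scale=noise, size=x.shape)
        total += grad_fn(noisy)
    return total / num_samples

x = np.array([-2.0, 0.0, 2.0], dtype=np.float32)
plain = grad_fn(x)
smooth = smoothgrad(x, grad_fn)

print(plain)
print(smooth)
```

The plain gradient jumps abruptly at zero, whereas the averaged gradient varies gradually around it, which is the smoothing effect that tends to make the resulting saliency maps less noisy.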

## GradCAM

In [None]:
model.summary()

In [None]:
from tf_explain.core.grad_cam import GradCAM
CLASS_INDEX = 235 # German shepherd

img = tf.keras.preprocessing.image.load_img(IMAGE_PATH, target_size=(224, 224))
img = tf.keras.preprocessing.image.img_to_array(img)
img = tf.keras.applications.resnet_v2.preprocess_input(img)

data = ([img], None)

explainer = GradCAM()
grid = explainer.explain(data, model, class_index=CLASS_INDEX, layer_name="conv5_block3_3_conv" )
explainer.save(grid, '.', 'grad_cam.png')

plt.figure(figsize=(5,5))
plt.imshow(((img/2.0+0.5)*255).astype(np.uint8))
plt.imshow(grid, alpha=.5,cmap='jet')
plt.axis('off')
plt.show()

In [None]:
CLASS_INDEX = 852 # tennis ball
grid = explainer.explain(data, model, class_index=CLASS_INDEX, layer_name="conv5_block3_3_conv" )
explainer.save(grid, '.', 'grad_cam.png')

plt.figure(figsize=(5,5))
plt.imshow(((img/2.0+0.5)*255).astype(np.uint8))
plt.imshow(grid, alpha=.5,cmap='jet')
plt.axis('off')
plt.show()

More information on `tf-explain` is available in the [tf-explain documentation](https://tf-explain.readthedocs.io/en/latest/overview.html).