# Interpretable Machine Learning
## Exercise Sheet: 9
## This exercise sheet covers chapters 10.1 and 10.2 from the IML book by Christoph Molnar
Kristin Blesch (blesch@leibniz-bips.de)<br>
Niklas Koenen (koenen@leibniz-bips.de)
<hr style="border:1.5px solid gray"> </hr>

# 1) Learned Features

## a) Feature Visualization

**I)** Describe the Feature Visualization method and formulate this method as an optimization problem for a single neuron, a channel, and the entire layer. Which approaches are available for solving this optimization problem?

**Solution:**

**II)** Consider the following neural network with two layers for a classification problem with two classes and input $x \in \mathbb{R}^2$:
$$\text{model}(x) = \text{softmax}\Big( W_2 \cdot \text{ReLU}\big(W_1 \cdot x + b_1 \big) + b_2 \Big)$$

In [None]:
import torch
import numpy as np
import matplotlib.pyplot as plt

# Define model weights and bias
W1 = torch.tensor(
    (( 0.5006,  0.2231),
     (-0.8746, -1.0410),
     ( 0.1379,  0.0027),
     ( 0.0556, -0.3049)))
b1 = torch.tensor((0.5872, 0.9070, -0.6963, 0.2474))

W2 = torch.tensor(
    (( 0.9417, -0.5311, -0.1064, -0.4385),
     (-0.5608,  0.3371, -0.0343, -0.2047)))
b2 = torch.tensor((0.1320, 0.1329))

# Define model
def model(x):
  # First layer
  x = torch.matmul(x, W1.t()) + b1
  x = torch.relu(x)

  # Second layer
  x = torch.matmul(x, W2.t()) + b2
  x = torch.softmax(x, dim = 1)
  
  return x

# Create a n x n grid and the coordinates for the contour plot
n = 100
x, y = np.meshgrid(np.linspace(-10,10,n),np.linspace(-10,10,n))
coords = np.stack((x.reshape(-1), y.reshape(-1)), axis = 1)
coords_torch = torch.tensor(coords, dtype = torch.float)

Instead of optimizing a unit for only one input, we now create a contour plot ([`matplotlib.pyplot.contourf`](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.contourf.html#matplotlib-pyplot-contourf)) to see the activation for all possible inputs from $\mathbb{R}^2$ (this is only possible in this case because the input space is 2-dimensional).  

- Create these plots once for the first neuron in the second layer (before Softmax) and for the second neuron in the first layer (after ReLU).
- Which input solves the optimization problem globally (over the whole plane) in each case? Can this question be answered at all? If not, how is the problem usually handled?
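
As a rough aid for reading such a plot, the following sketch locates the largest value on a grid. The function `z` is a hypothetical stand-in for a unit's activation, not one of the units above, and the result is only the best grid point, which says nothing about inputs outside the plotted range.

```python
import numpy as np

# Toy grid, built in the same way as the grid in the setup cell
n = 101
x, y = np.meshgrid(np.linspace(-10, 10, n), np.linspace(-10, 10, n))

# Hypothetical stand-in for a unit's activation, with its peak at (1, -2)
z = -(x - 1.0) ** 2 - (y + 2.0) ** 2

# Flat index of the largest value, converted to a (row, column) position
row, col = np.unravel_index(np.argmax(z), z.shape)
x_best, y_best = x[row, col], y[row, col]
print(f"grid maximizer: ({x_best:.2f}, {y_best:.2f}), activation {z[row, col]:.4f}")
```

If the largest value sits on the border of the grid, that is a hint that the activation may keep growing outside the plotted range.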

**Solution:**

In [None]:
# Function to return the activation of neuron 1 in layer 2 (before softmax)
def get_act_layer2_neuron1(x):
  #
  # to do!
  #
  return x

# Create the contour plot

# Get activations for selected unit and convert it from torch to numpy
z = get_act_layer2_neuron1(coords_torch).numpy()
# Reshape z to an n x n matrix
z = z.reshape((n,n))

# Plot the result
plt.contourf(x, y, z, levels = 20, cmap = 'bwr', vmin = -np.max(np.abs(z)), vmax = np.max(np.abs(z)))
plt.colorbar()
plt.show()

In [None]:
# Function to return the activation of neuron 2 in layer 1 (after ReLU)
def get_act_layer1_neuron2(x):
  #
  # to do!
  #
  return x

# Create the contour plot

# Get activations for selected unit and convert it from torch to numpy
z = get_act_layer1_neuron2(coords_torch).numpy()
# Reshape z to an n x n matrix
z = z.reshape((n,n))

# Plot the result
plt.contourf(x, y, z, levels = 20, cmap = 'bwr', vmin = -np.max(np.abs(z)), vmax = np.max(np.abs(z)))
plt.colorbar()
plt.show()

## b) Network Dissection

**I)** What is the difference between the Feature Visualization and Network Dissection methods?

**Solution:**

**II)** Explain the three steps of the network dissection algorithm.

**Solution:**

# 2) Pixel Attribution

## a) Theory

**I)** What is the basic idea behind pixel attribution methods, and what is the mathematical justification for using gradients to interpret the model prediction?

**Solution:**

**II)** Explain the intuition behind the Grad-CAM method and describe the corresponding algorithm.

**Solution:**

## b) Programming exercise
In this task, we use the pretrained network [Inception v3](https://arxiv.org/abs/1512.00567) in `torch` and `torchvision` to apply some pixel attribution methods on the following image:

In [None]:
import torchvision
import torchvision.models as models
import torchvision.transforms as transforms
from PIL import Image
import matplotlib.pyplot as plt

transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize(mean=[0.485, 0.456, 0.406],
                          std=[0.229, 0.224, 0.225]),
     transforms.Resize((299,299))])

# Define plot function for torch.tensors of form (1, C, H, W)
def plot_image(x):
  # take the sum of the absolute values over the channels
  x = x[0,].abs().sum(dim = 0)
  # convert it to numpy and transform it to make artifacts more explicit
  x = x.numpy()**0.75
  plt.imshow(x)
  plt.show()

# Load and preprocess image
img = Image.open('imagenet_rooster.png')  # The image must be in your working directory
x = transform(img)  # Preprocess image
x = x.unsqueeze(0)

# Get model from torchvision
model = models.inception_v3(pretrained = True)
model.eval()

# Show image
plt.imshow(img)
plt.show()

**I)** Compute the Inception v3 prediction for the image as class probabilities and output the index of the class with the highest probability. Hint: The model outputs pre-softmax values (logits), i.e. the last layer has no activation function.  
**Bonus:** What label does this index correspond to?
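
The step from pre-softmax values to a predicted class can be sketched on a toy tensor. The logits below are made up for illustration (four classes, batch size one) and are not outputs of Inception v3.

```python
import torch

# Hypothetical logits for a batch of one input and four classes
logits = torch.tensor([[1.0, -0.5, 3.0, 0.2]])

# Softmax over the class dimension turns logits into probabilities
probs = torch.softmax(logits, dim=1)

# Probability and index of the most likely class
top_prob, top_idx = probs.max(dim=1)
print(top_idx.item(), round(top_prob.item(), 4))
```

Because softmax is monotone, the index of the largest probability is the same as the index of the largest logit.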

**Solution:**

**II)** Apply the Vanilla Gradient method to the image for the model output with the highest probability from the last subtask. Plot your result.  
**Hint:** For a scalar output `out`, calling `out.backward()` computes the gradients of `out` with respect to the input tensor `input`. Afterwards, the gradients are available in `input.grad`.
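
A minimal sketch of this mechanism on a toy function, assuming $f(v) = \sum_i v_i^2$ in place of the network (the name `inp` is chosen here only to avoid shadowing Python's built-in `input`):

```python
import torch

# Toy input that tracks gradients, standing in for the image tensor
inp = torch.tensor([1.0, -2.0, 3.0], requires_grad=True)

# Scalar output: f(inp) = sum(inp ** 2), whose gradient is 2 * inp
out = (inp ** 2).sum()

# Backpropagate; the gradient of `out` w.r.t. `inp` is stored in inp.grad
out.backward()
print(inp.grad)
```

For a model output of shape `(1, C)`, a scalar has to be selected first (for example by indexing a single class) before `backward()` can be called without further arguments.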

**Solution:**

In [None]:
# Clone the torch-converted image
input = x.clone()
# Tell torch to track the gradients for this value
input.requires_grad = True

#
# to do
#
# grad = ...
# plot_image(grad)

**III)** Apply the Grad-CAM method to the image for the model output with the highest probability from task **I)** and plot your result.
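
Grad-CAM needs the gradients of a class score with respect to intermediate feature maps. The sketch below shows one way to obtain them with a forward hook and `torch.autograd.grad` on a small, hypothetical network (`net` and `save_input` are illustrative names, not part of Inception v3). It stops at the gradients and does not compute the Grad-CAM map itself.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy network: one convolution, global pooling, linear classifier
net = nn.Sequential(
    nn.Conv2d(1, 3, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(3, 2),
)

stored = {}

def save_input(module, inputs, output):
    # `inputs` is a tuple of the positional arguments passed to the module
    stored["feature_maps"] = inputs[0]

# Capture what enters the pooling layer, i.e. the last feature maps
handle = net[2].register_forward_hook(save_input)

scores = net(torch.randn(1, 1, 5, 5))
handle.remove()  # detach the hook once it is no longer needed

# Gradient of one class score w.r.t. the stored feature maps
grads = torch.autograd.grad(scores[0, 0], stored["feature_maps"])[0]
print(stored["feature_maps"].shape, grads.shape)
```

The stored tensor is part of the computation graph of `scores`, which is why `torch.autograd.grad` can differentiate with respect to it.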

**Solution:**

In [None]:
# We need a hook to calculate the gradients of the output w.r.t. the last convolutional layer.
# A forward hook runs each time the layer's forward pass is executed; this one stores only the layer's input.
hook = {}
def get_input_hook(self, input, output):
  hook['input'] = input[0]

# We register the hook on the layer 'avgpool', which follows the last convolutional block;
# its input has shape (*, 2048, 8, 8)
model.avgpool.register_forward_hook(get_input_hook)

# Clone the torch-converted image and tell torch to track the gradients for this input
input = x.clone()
input.requires_grad = True

# to do (start) -----------------------------------------------------------------------------------------------------

# get the model output with the highest class probability
# out = ...

# Calculate gradients w.r.t. the last convolutional layer
grad_A_k = torch.autograd.grad(out, hook['input'])[0]

# Calculate a_k
# a_k = ...

# Get the feature maps A_k (stored in the forward-hook)
# A_k = ...

# Calculate the localization map (ReLU(sum(a_k * A_k)))
# grad_cam = ...

# to do (end) -------------------------------------------------------------------------------------------------------

# Plot the result
plot_image(grad_cam)

**IV)** Combine both previous results in the Guided Grad-CAM method and plot your result. Use the function [`torch.nn.Upsample`](https://pytorch.org/docs/stable/generated/torch.nn.Upsample.html#torch.nn.Upsample) for upsampling the Grad-CAM result.
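
How `torch.nn.Upsample` and the element-wise product fit together can be sketched on toy tensors. The shapes are deliberately tiny, and the bilinear mode is an assumption made for this illustration; other interpolation modes are available.

```python
import torch

# Toy coarse map of shape (1, 1, 2, 2), standing in for a Grad-CAM result
coarse = torch.tensor([[[[0.0, 1.0],
                         [2.0, 3.0]]]])

# Bilinear interpolation to a fixed target size (assumed mode; others exist)
upsample = torch.nn.Upsample(size=(4, 4), mode="bilinear", align_corners=False)
fine = upsample(coarse)

# Element-wise product with a gradient map of shape (1, 3, 4, 4);
# the single channel of `fine` broadcasts over the three channels
grad = torch.ones(1, 3, 4, 4)
combined = fine * grad
print(fine.shape, combined.shape)
```

`Upsample` expects a 4-dimensional input of the form `(N, C, H, W)` for 2D interpolation, so a map without batch or channel dimension has to be reshaped first.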

**Solution:**

In [None]:
# Upsample the Grad-CAM result to size (1,1,299,299)
# grad_cam_ups = ...

# Multiply heatmaps
# guided_grad_cam = ...

# Plot the result
plot_image(guided_grad_cam)