In [1]:
%run Latex_macros.ipynb

<IPython.core.display.Latex object>

In [2]:
# My standard magic !  You will see this in almost all my notebooks.

from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

# Reload all modules imported with %aimport
%load_ext autoreload
%autoreload 1

%matplotlib inline

In [3]:
from IPython.display import Image

import cnn_helper
%aimport cnn_helper
cnnh = cnn_helper.CNN_Helper()

# Interpretation by Inverting 

Our initial exploration of Interpretability emphasized some pretty simple methods.

We continue our quest utilizing slightly more advanced ideas.

The general flavor of these ideas is as follows:
    
- If we can map an  individual feature (at a single spatial location of feature map $k$ of layer $\ll$)
- Back to the region of *input* features that affect it
- Then perhaps we can interpret the feature map $k$ of layer $\ll$: $\y_{\llp,k}$


We call these methods "inversion" as we map outputs (layer $\ll$ representations $\y_\llp$ )
back to inputs ($\x$).


# Receptive Field: From Feature Map to Input

Mapping an element of layer $\ll$ back to regions of layer $0$
requires the concept of *receptive field* that was introduced in [the module on CNN](CNN_Space_and_Time.ipynb#Receptive-field).

Let's review.


Since a Convolutional layer $\ll$
- Preserves the spatial dimension of its input (the output of layer $(\ll-1)$), assuming full padding
- We can relate a single feature at a particular spatial location of a feature map
- To the  spatial locations of layer $0$, the input, that affect the layer $\ll$ feature

We can determine spatial locations of the layer $0$ features influencing this single layer $\ll$ location  by working backwards from layer $\ll$.

<div>
    <center><strong>Conv 2D Receptive field: 2 layers</strong></center>
    <br>
<img src=images/Conv2d_receptive_2.png width=900>
    </div>

**Aside: Notes on the diagram**

The column under layer $(\ll-1)$ depicts
- A *single* feature map at different times (i.e., when the kernel is centered at different layer $\ll$ spatial locations)
- **Not** different layer $(\ll-1)$ feature maps!

We also omit feature map/channel subscripts (i.e., writing $\y_\llp$ rather than $\y_{\llp, \ldots, k}$)
as they are not necessary for our purpose.

This follows from the mechanics of the convolutional dot product
- Each feature map $k$ at layer $\ll$ 
- Is a function of *all* the feature maps at layer $(\ll-1)$
- So all feature maps at layer $\ll$ depend on the same spatial locations of layer $(\ll-1)$
- And these spatial locations are identical across all feature maps/channels of layer $(\ll-1)$   



<div>
    <center><strong>Convolution: preserve spatial dimension, change channel dimension</strong></center>
    <br>
<img src=images/Conv3d_2.png>
    </div>

Using a kernel with spatial dimension $(3 \times 3)$ for the Convolution of each layer
- The spatial locations in layer $\ll$
- Are color coded to match the spatial locations in layer $(\ll-1)$
- That affect it

So the yellow location in layer $\ll$ is a function of the yellow locations in layer $(\ll-1)$

Moving forward one layer: the central location in layer $(\ll+1)$
- Is a function of the spatial locations in layer $\ll$ that are encircled by the dashed square
- Which in turn are a function of a larger number of layer $(\ll-1)$ locations

In general
- The number of layer $(\ll-1)$ spatial locations
- That affect a given spatial location in layer $\ll' \ge \ll$
- **Grows** as $\ll'$ increases
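The growth can be made concrete with a little arithmetic. The following is a minimal sketch, assuming square kernels and looking along one spatial axis only; the function name `receptive_field_size` and its `strides` argument are hypothetical, introduced only for illustration.

```python
def receptive_field_size(kernel_sizes, strides=None):
    """Width (along one spatial axis) of the input region seen by one output location.

    kernel_sizes: kernel width of each Convolutional layer, from first to last.
    strides:      stride of each layer (assumed to be 1 everywhere if omitted).
    """
    if strides is None:
        strides = [1] * len(kernel_sizes)

    size = 1   # a single location "sees" itself
    jump = 1   # distance, in input locations, between adjacent locations of the current layer
    for k, s in zip(kernel_sizes, strides):
        size += (k - 1) * jump   # each layer widens the region by (k - 1) "jumps"
        jump *= s
    return size

# Stacks of 1, 2 and 3 layers, each with a (3 x 3) kernel and stride 1
sizes = [receptive_field_size([3] * n_layers) for n_layers in (1, 2, 3)]
print(sizes)   # [3, 5, 7]
```

With stride 1 and $(3 \times 3)$ kernels, each additional layer widens the region by 2 locations per axis, which matches the dashed square in the diagram.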

We can continue this process backwards from layer $\ll$ to layer $0$
- Finally determining the set of input features (region of the input)
- Affecting a single spatial location at layer $\ll$

This region of layer $0$ spatial locations
- Is called the **receptive field** of the layer $\ll$ spatial location
- It is what this single layer $\ll$ spatial location "sees" (i.e., depends on)

- Let $\idxspatial$ denote the spatial indices of a single location 
    - Length of $\idxspatial$ depends on shape of data: one-dimensional, two-dimensional
- Let
$$
\y_{\llp, \idxspatial, k}
$$
denote the value of the $k^{th}$ feature of layer $\ll$ at spatial location $\idxspatial$
    
- In particular, we can refer to input features as
$$
 \y_{(0), \idxspatial, k}
$$

The receptive field $\mathcal{R}_{\llp, \idxspatial}$  of spatial location $\idxspatial$ of layer $\ll$ is

$$
\mathcal{R}_{\llp, \idxspatial} = \left\{ \idxspatial' \text{ at layer } 0 \; | \; \y_{\llp, \idxspatial, k}
 \text{ depends on } \y_{(0), \idxspatial', k'}\right\}
$$ 
for some
$$
\begin{array}{lll}
 1  \le & k  & \le n_\llp \\
 1  \le & k' & \le n_{(0)} \\
\end{array}
$$
where
$$
\begin{array}{ll}
\y_{\llp, \idxspatial, k} & \text{ is the feature at spatial location } \idxspatial \text{ of feature map }k \text{ of layer } \ll \\
\end{array}
$$

(Note that $k, k'$ are really not necessary since the spatial locations are shared across all channels during convolution)
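The set $\mathcal{R}_{\llp, \idxspatial}$ can be enumerated by working backwards, one layer at a time. Here is a minimal sketch for one-dimensional data, assuming stride 1, odd kernel widths, and padding that preserves the spatial dimension; the function name `receptive_field_1d` is hypothetical.

```python
def receptive_field_1d(location, kernel_sizes, input_length):
    """Input indices that can affect `location` of the last layer.

    Assumes a stack of stride-1 1D convolutions with odd kernel widths and
    padding that keeps the spatial dimension equal to `input_length`.
    """
    lo = hi = location
    # Work backwards: each layer widens the interval by (k // 2) on each side
    for k in reversed(kernel_sizes):
        half = k // 2
        lo, hi = lo - half, hi + half
    # Locations that fall outside the input are padding, not input features
    lo, hi = max(lo, 0), min(hi, input_length - 1)
    return set(range(lo, hi + 1))

print(sorted(receptive_field_1d(4, [3, 3], input_length=9)))   # [2, 3, 4, 5, 6]
print(sorted(receptive_field_1d(0, [3, 3], input_length=9)))   # [0, 1, 2]
```

Because the spatial locations are shared across channels, the sketch tracks only spatial indices and never mentions $k$ or $k'$.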

# Saliency maps: Relating a feature to the input

The receptive field of a single location at layer $\ll$ 
- Defines *which* layer $0$ spatial locations affect the layer $\ll$ location
- But it does not measure the *magnitude* of the effect
- Which may be different for each feature $k'$ of layer $0$ at the same spatial location


We therefore compute the *sensitivity* of a feature
- At spatial location $\idxspatial$ of feature map $k$ of  layer $\ll$ of example $i$
- To a change in the feature at spatial location $\idxspatial'$ feature map $k'$ of layer $0$


$$
\mathcal{s}^\ip_{\llp, \idxspatial, k, (0), \idxspatial', k'} =  \frac{\partial \y^\ip_{\llp, \idxspatial, k}}{\partial  \y^\ip_{(0), \idxspatial', k'}}
$$

Note that the sensitivity
- Is *conditional* on the value of the input $\x^\ip$
- Since the derivative's value depends on the particular input


**Saliency maps**

Given layer $\ll$ and feature $1 \le k \le n_\llp$
- We can construct an image
- With same spatial dimension as layer $0$ (the input)
- By creating a grid of
$$\mathcal{s}^\ip_{\llp, \idxspatial, k, (0), \idxspatial', k'}$$
for each $\idxspatial'$ in  layer $0$
- Resulting in a "heat map" of how each input pixel affects the single spatial location of feature $k$ in layer $\ll$
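To make the grid of sensitivities concrete, here is a minimal sketch that estimates each partial derivative numerically (by central differences) for a tiny one-dimensional network with a single channel. Everything in it is an assumption made for illustration: the two kernels, the input values, and the helper names `conv1d_same`, `feature` and `saliency_map` are hypothetical.

```python
def conv1d_same(x, kernel):
    """1D convolution (cross-correlation) with zero padding; odd kernel width."""
    half = len(kernel) // 2
    n = len(x)
    return [
        sum(kernel[j] * x[i + j - half]
            for j in range(len(kernel)) if 0 <= i + j - half < n)
        for i in range(n)
    ]

def relu(v):
    return [max(0.0, a) for a in v]

def feature(x, kernels, location):
    """Value at `location` of the (single) feature map of the last layer."""
    y = x
    for kernel in kernels:
        y = relu(conv1d_same(y, kernel))
    return y[location]

def saliency_map(x, kernels, location, eps=1e-4):
    """Central-difference estimate of d feature / d x[j] for every input location j."""
    grads = []
    for j in range(len(x)):
        up, down = list(x), list(x)
        up[j] += eps
        down[j] -= eps
        diff = feature(up, kernels, location) - feature(down, kernels, location)
        grads.append(diff / (2 * eps))
    return grads

kernels = [[1.0, 2.0, 1.0], [0.5, 1.0, -0.5]]         # hypothetical weights, 2 layers
x = [0.2, 0.1, 0.4, 0.9, 0.3, 0.7, 0.5, 0.6, 0.8]     # hypothetical input example
s = saliency_map(x, kernels, location=4)
print([round(v, 3) for v in s])
```

The result has one entry per input location, so it can be displayed as a heat map aligned with the input. Entries for input locations that the chosen feature does not depend on come out as zero.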

Rather than doing this for 
- each spatial location $\idxspatial$ in layer $\ll$
- each input $\x^\ip$ in $\X$

We will create a handful of Saliency Maps for feature $k$ of layer $\ll$
- Limited to the Maximally Activating Examples for feature map $k$ of layer $\ll$

Recall, Maximally Activating Examples are defined with respect to feature map $k$ of layer $\ll$ as follows
- For each $\x^\ip$: compute the (absolute value) of the maximum (across spatial locations) of the feature map
- The Maximally Activating Examples are the examples with highest maximum values
- That is: the examples to which the feature map reacts most intensely

$\text{MaxAct}_{\llp,k} = [ i_1, \ldots, i_m ]$ is the permutation of example indices, i.e., $[ i | 1 \le i \le m]$, that sorts $\summaxact^\ip_{\llp,k}$ in decreasing order
- where $\summaxact^\ip_{\llp,k}$ is the largest expression of the pattern anywhere in the spatial dimension of example $i$
$$
\summaxact^\ip_{\llp,k} = \max_{ \idxspatial } \y^\ip_{(\ll), \idxspatial,k}
$$
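The selection of Maximally Activating Examples can be sketched in a few lines. This is only an illustration: the function name `max_activating_examples`, the nested-list representation of a feature map, and the toy values are all assumptions, and example indices here start at 0 rather than 1.

```python
def max_activating_examples(feature_maps, top=3):
    """Indices of the examples that most strongly activate one feature map.

    feature_maps[i] is feature map k of layer l for example i,
    given as a 2D list (rows x columns) of activations.
    """
    # Summary per example: the largest activation anywhere in the spatial dimensions
    summary = [max(max(row) for row in fmap) for fmap in feature_maps]
    # Example indices sorted by decreasing summary value
    order = sorted(range(len(feature_maps)), key=lambda i: summary[i], reverse=True)
    return order[:top], summary

feature_maps = [
    [[0.1, 0.3], [0.2, 0.0]],   # example 0
    [[0.9, 0.1], [0.0, 0.4]],   # example 1
    [[0.2, 0.2], [0.7, 0.1]],   # example 2
]
top, summary = max_activating_examples(feature_maps, top=2)
print(top, summary)   # [1, 2] [0.3, 0.9, 0.7]
```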


Note that the Saliency Map
- Has the same number of features/channels as layer $0$: $n_{(0)}$
- Can be non-zero only within the receptive field of the layer $\ll$ location
    - the input pixels not in the receptive field have zero sensitivity, since the layer $\ll$ feature does not depend on them

For any input $\x^\ip$
- We can view the Saliency Map for one layer $\ll$ feature $k$ given $\x^\ip$
- Along with the corresponding *patch* of $\x^\ip$ (part of $\x^\ip$ within receptive field of layer $\ll$)

Here are visualizations of Saliency Maps and corresponding Patches of 9 images (the ones most activating the feature)
- For one layer 2 feature map
- At the spatial location $\idxspatial$ with maximum intensity in each image

<center><strong>Saliency Maps and Corresponding Patches<br>Single Layer 2 Feature Map<br>On multiple input images</strong></center>
<table>
    <tr>
        <td><img src="images/ZF_p4_115_row10_col3_mag.png"></td>
        <td><img src="images/ZF_p4_115_row10_col3_patch_mag.png"></td>
    </tr>
    <tr>
        <td colspan=2><center>Layer 2 Feature Map (Row 10, col 3).</center></td>
    </tr>
</table>
Attribution: https://arxiv.org/abs/1311.2901


The images are small because the Receptive Field of layer 2 is not that large.

We can hypothesize that this Feature Map is responsible for creating the synthetic feature
>"There is an eye in the input"

What is particularly interesting is that, by the time we get deeper into the network
- More complex "patterns" are being recognized
- Perhaps due to the Receptive Field getting larger

Could the interpretation of the following feature map (with high intensity "hot" colors on lips and cheeks) be the following?
>"Is face with smile present"

<center><strong>Saliency Maps and Corresponding Patches<br>Single Layer 5 Feature Map<br>On 9 Maximally Activating Input images</strong></center>

<table>
    <tr>
        <td><img src="images/ZF_p4_118_row11_col1_mag.png"></td>
        <td><img src="images/ZF_p4_118_row11_col1_patch_mag.png"></td>
    </tr>
    <tr>
        <td colspan=2><center>Layer 5 ? Feature Map (Row 11, col 1).</center></td>
    </tr>
</table>
Attribution: https://arxiv.org/abs/1311.2901


## Video: interactive interpretation of features

There is a nice video by [Yosinski](https://youtu.be/AgkfIQ4IGaM) which examines the behavior of
a Neural Network's layers on video images rather than stills.


# Computing the Saliency Map: Inverting a Convolution
Let's show how to compute the Saliency Map.

- We feed $\x^\ip$ as input
$$
\y_{(0)} = \x^\ip
$$
- Compute $\y^\ip_{\llp, \idxspatial, k}$ by moving left to right through the layers from $0$ to $\ll$
- Compute the sensitivities by moving right to left, from layer $\ll$ to layer $0$

This is very much like the Forward and Backward Passes we saw in
the module [Back propagation](Training_Neural_Network_Backprop.ipynb)


Recall the pictures:


<div>
    <center><strong>Forward Pass: Input to Loss</strong></center>
<br>
<img src="images/NN_Layers_plus_Loss_forward.png">
</div>

<div>
    <center><strong>Backward pass: Loss to Weights</strong></center>
<br>
<img src="images/NN_Layers_plus_Loss_backward.png">
</div>

The main difference is
- We truncate the network at layer $\ll$
- Take the derivative of $\y^\ip_\llp$ (given $\x^\ip$) rather than the Loss $\loss$
- Take derivatives with respect to input features $\y^\ip_{(0), \idxspatial', k'}$ rather than weights $\W$


The Forward Pass
- Mapping $\x^\ip$ to $\y^\ip_\llp$
- Is called *Convolution*

The Backward Pass
- Mapping $\y^\ip_\llp$ to a Saliency Map (grid of sensitivities)
- Is called *Deconvolution* or *Convolution Transpose*
- It is like inverting the Convolution
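The two passes can be sketched for a tiny one-dimensional, single-channel network. This is an illustration under stated assumptions (stride 1, zero padding, odd kernel widths, ReLU activations), not the implementation used to produce the figures; the kernels, the input, and the names `conv1d_same`, `conv1d_transpose` and `saliency_by_backprop` are hypothetical.

```python
def conv1d_same(x, kernel):
    """Forward pass: 1D convolution (cross-correlation) with zero padding."""
    half = len(kernel) // 2
    n = len(x)
    return [
        sum(kernel[j] * x[i + j - half]
            for j in range(len(kernel)) if 0 <= i + j - half < n)
        for i in range(n)
    ]

def conv1d_transpose(grad_out, kernel):
    """Backward pass of conv1d_same: derivative w.r.t. its input."""
    half = len(kernel) // 2
    n = len(grad_out)
    grad_in = [0.0] * n
    for i in range(n):                  # output location
        for j in range(len(kernel)):
            src = i + j - half          # input location used by output i via weight j
            if 0 <= src < n:
                grad_in[src] += kernel[j] * grad_out[i]
    return grad_in

def saliency_by_backprop(x, kernels, location):
    # Forward pass (left to right): remember where each ReLU was active
    y, masks = x, []
    for kernel in kernels:
        pre = conv1d_same(y, kernel)
        masks.append([1.0 if a > 0 else 0.0 for a in pre])
        y = [max(0.0, a) for a in pre]
    # Backward pass (right to left): start from the single chosen location
    grad = [1.0 if i == location else 0.0 for i in range(len(x))]
    for kernel, mask in zip(reversed(kernels), reversed(masks)):
        grad = [g * m for g, m in zip(grad, mask)]   # back through the ReLU
        grad = conv1d_transpose(grad, kernel)        # back through the convolution
    return grad

kernels = [[1.0, 2.0, 1.0], [0.5, 1.0, -0.5]]         # hypothetical weights, 2 layers
x = [0.2, 0.1, 0.4, 0.9, 0.3, 0.7, 0.5, 0.6, 0.8]     # hypothetical input example
print(saliency_by_backprop(x, kernels, location=4))
# [0.0, 0.0, 0.5, 2.0, 2.0, 0.0, -0.5, 0.0, 0.0]
```

Note that `conv1d_transpose` scatters each output derivative back onto the input locations that produced it, using the same kernel weights as the forward pass. That is the sense in which the Backward Pass is a Convolution Transpose.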

Naive back propagation does not always give the best results.

Zeiler and Fergus (and similar related papers) modify Back propagation 
- In an attempt to get better intuition as to which input features most affect a layer $\ll$ feature
- For example: ignore the *sign* of the derivatives as they flow backwards
    - Look for strong positive or negative influences, not caring which

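A well-known modification of this kind is *Guided Back propagation*, which, at each ReLU, passes backward only the positive derivatives at locations where the forward activation was positive.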
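For comparison, here is a sketch of the backward rule at a single ReLU under three variants as they are commonly described in the literature on these methods: plain back propagation, the "deconvnet" rule, and Guided Back propagation. The function name `relu_backward`, its `rule` argument, and the toy values are hypothetical.

```python
def relu_backward(forward_input, grad_out, rule="backprop"):
    """Signal passed backward through a ReLU under three commonly described rules."""
    grad_in = []
    for f, g in zip(forward_input, grad_out):
        if rule == "backprop":       # pass where the forward input was positive
            keep = f > 0
        elif rule == "deconvnet":    # pass where the backward signal is positive
            keep = g > 0
        elif rule == "guided":       # both conditions must hold
            keep = f > 0 and g > 0
        else:
            raise ValueError(f"unknown rule: {rule}")
        grad_in.append(g if keep else 0.0)
    return grad_in

forward_input = [1.0, -1.0, 2.0, -2.0]   # inputs to the ReLU on the Forward Pass
grad_out = [0.5, 0.5, -0.5, -0.5]        # signal arriving on the Backward Pass
for rule in ("backprop", "deconvnet", "guided"):
    print(rule, relu_backward(forward_input, grad_out, rule))
# backprop [0.5, 0.0, -0.5, 0.0]
# deconvnet [0.5, 0.5, 0.0, 0.0]
# guided [0.5, 0.0, 0.0, 0.0]
```

Only the rule at the ReLU changes; the rest of the Backward Pass is the same in all three variants.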

We also mention that back propagation through some layers is a technical challenge
- Max Pooling selects one value among all the spatial locations in its window
- On the Backward Pass: which location should receive the derivative?
- Solution: "switches" that record the location of the max on the Forward Pass
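The role of the switches can be sketched as follows, assuming one-dimensional data and non-overlapping pooling windows. The function names `max_pool_forward` and `max_pool_backward` and the toy values are hypothetical.

```python
def max_pool_forward(x, size):
    """Non-overlapping 1D max pooling.

    Returns the pooled values and the "switches": the input position of each max.
    """
    pooled, switches = [], []
    for start in range(0, len(x) - size + 1, size):
        window = x[start:start + size]
        offset = max(range(size), key=lambda j: window[j])   # argmax within the window
        pooled.append(window[offset])
        switches.append(start + offset)
    return pooled, switches

def max_pool_backward(grad_out, switches, input_length):
    """Route each derivative back to the position that won the max; others get zero."""
    grad_in = [0.0] * input_length
    for g, pos in zip(grad_out, switches):
        grad_in[pos] += g
    return grad_in

x = [0.1, 0.8, 0.3, 0.2, 0.9, 0.4]
pooled, switches = max_pool_forward(x, size=2)
print(pooled, switches)                                        # [0.8, 0.3, 0.9] [1, 2, 4]
print(max_pool_backward([1.0, 2.0, 3.0], switches, len(x)))    # [0.0, 1.0, 2.0, 0.0, 3.0, 0.0]
```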

<div>
    <center><strong>Conv 2D: Global Pooling (Max/Average)</strong> </center>
    <br>
<img src=images/GlobalPoolingLayer.png>
    </div>

# Conclusion

We explored the idea of "inverting" the Convolution process
- Instead of going from input (layer $0$) to layer $\ll$
- We proceed backward from a single location in a single feature map of layer $\ll$
- In an attempt to interpret the feature that the layer $\ll$ feature map is recognizing

By mapping back to input layer $0$
- We avoid the difficulty that arises 
when trying to interpret layer $\ll$'s features as combinations of layer $(\ll-1)$'s synthetic features.

Detailed experiments by Zeiler and Fergus
- Support the hypothesis that
- Deeper layers recognize features of increasing complexity

In [4]:
print("Done")

Done
