In [None]:
%%bash
# If you are on Google Colab, this sets up everything needed.
# If not, you will want to pip install the cs7150lib as shown below.
!(stat -t /usr/local/lib/*/dist-packages/google/colab > /dev/null 2>&1) && exit
pip install git+https://github.com/cs7150/cs7150lib@main

# Examining and visualizing convolutions

First we import the libraries and widgets that will be used in the experiments below.

In [None]:
import torch, os
from torchvision.models import alexnet
from torchvision.transforms import Compose, ToTensor, Normalize, Resize
from baukit import ImageFolderSet, show, renormalize, set_requires_grad
from torchvision.datasets.utils import download_and_extract_archive
from cs7150 import ConvolutionWidget, ConvolutionNetWidget


## 1. Make a vertically striped array of numbers

Add code below so that `vdata` contains a circle of vertical stripes, like this:
```
sdata = torch.tensor([[
    [1.0 if i % 3 == 0 else -1.0 for i in range(32)]
    for _ in range(32)]])
mdata = torch.tensor([[
    [1.0 if (i**2 + j**2 < 12**2) else 0.0 for j in range(-16, 16)]
    for i in range(-16, 16)]])
vdata = sdata * mdata    
```

In [None]:
import PIL
vdata = torch.zeros(1, 32, 32)
# TODO: ADD YOUR CODE HERE.
print(vdata[:8,:8])

## 2. See the interaction with a convolution

Click on the middle "convolution" widget below, and see how the vertical stripe data interacts with a convolution.

1. Adjust the convolution to be a vertical edge detector (with a vertical stripe).  What is the result?

2. Adjust the convolution to be a horizontal edge detector (with a horizontal stripe).  What happens?

Once you have created a horizontal edge detector that is blind to the vertical edges, click on the image to interrupt the purely vertical lines.  What effect do you see?
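
The same experiment can be reproduced outside the widget. The sketch below uses plain `torch.nn.functional.conv2d` rather than `ConvolutionWidget`, and the particular kernels (a bright middle column or row with weights -0.5, 1.0, -0.5) are an assumed choice for illustration, not the only possible edge detectors.

```python
import torch
import torch.nn.functional as F

# Vertical stripes: +1 in every third column, -1 elsewhere (same pattern as sdata).
stripes = torch.tensor([[1.0 if i % 3 == 0 else -1.0 for i in range(32)]
                        for _ in range(32)])[None, None]      # shape 1x1x32x32

# A kernel with a bright middle column responds to vertical stripes.
vertical = torch.tensor([[-0.5, 1.0, -0.5]] * 3)[None, None]  # shape 1x1x3x3
# Its transpose has a bright middle row and responds to horizontal stripes.
horizontal = vertical.transpose(2, 3).contiguous()

v_out = F.conv2d(stripes, vertical)    # no padding, so the output is 30x30
h_out = F.conv2d(stripes, horizontal)

v_peak = v_out.abs().max().item()
h_peak = h_out.abs().max().item()
print(v_peak, h_peak)
```

Each column of the horizontal kernel sums to zero, so on data that does not change from row to row its response cancels exactly. Any pixel you click in the widget breaks that row-to-row sameness, which is why it shows up in the horizontal detector's output.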

In [None]:
widget = ConvolutionWidget(vdata, kernel_size=3)
show(widget)

## 3. Modify the convolution in code

Modify the code below to alter the convolution in the widget above.
Use the code to make a horizontal edge-detector with row weights [-0.5, 1.0, -0.5].

Why does the convolution weight have four dimensions?
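
As a hint, it can help to look at the weight shape of a standalone `Conv2d` with more than one channel. This sketch assumes that `widget.net[0]` is an ordinary `torch.nn.Conv2d`; the channel counts chosen here are arbitrary.

```python
import torch
from torch.nn import Conv2d

# 3 input channels (e.g. RGB), 8 output channels, 5x5 kernels.
conv = Conv2d(in_channels=3, out_channels=8, kernel_size=5, bias=False)
weight_shape = tuple(conv.weight.shape)
print(weight_shape)   # (out_channels, in_channels, kernel_height, kernel_width)

# The single-channel case used in this notebook still keeps all four axes.
single = Conv2d(1, 1, kernel_size=3, bias=False)
single_shape = tuple(single.weight.shape)
print(single_shape)
```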

In [None]:
# TODO: add some code here
widget.net[0].weight[0,0,1,:] = 1.0
print(widget.net[0].weight)
widget.redraw()

## 4. Experiment with a stack of two convolutions

The code below provides a stack of two convolutions.

If you stack a vertical edge detector after a horizontal edge detector, what will it detect?
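
One way to reason about a stack is to note that two convolutions with no nonlinearity between them act like a single convolution with a larger kernel. The sketch below checks this numerically with plain `torch.nn.functional.conv2d`. The kernels and the random input are assumptions for illustration, and the widget's own layers may differ (for example in padding).

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(1, 1, 32, 32)

row = torch.tensor([-0.5, 1.0, -0.5])
horizontal = row[:, None].repeat(1, 3)[None, None]   # bright middle row
vertical = row[None, :].repeat(3, 1)[None, None]     # bright middle column

# Apply the horizontal detector first, then the vertical one.
stacked = F.conv2d(F.conv2d(x, horizontal), vertical)           # 28x28

# The same result from one 5x5 kernel: the full convolution of the two kernels.
# F.conv2d computes cross-correlation, hence the flip.
combined_kernel = F.conv2d(horizontal, vertical.flip(2, 3), padding=2)
combined = F.conv2d(x, combined_kernel)                          # 28x28

max_diff = (stacked - combined).abs().max().item()
print(combined_kernel[0, 0])
print(max_diff)
```

Printing `combined_kernel` shows which 5x5 pattern the two-layer stack is actually looking for.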

In [None]:
ConvolutionWidget(vdata, depth=2)

## 5. Make a single-dot piece of data

Now make an array `ddata` of shape 1x32x32 that is -1 everywhere except for a single 1 at the center location.

In [None]:
import PIL
ddata = torch.ones(1, 32, 32) * -1
# TODO: ADD YOUR CODE HERE.


## 6. Visualize the effect of a stack of convolutions on a single dot.

Now visualize the downstream pixels that are affected by the dot.

* Try varying the convolution patterns.  What is the biggest area that you can affect?  This is the inverse of the receptive field.  The receptive field asks "what is the biggest area that can affect a single pixel in the output" which is a similar shape, but in the input.

* Try varying the `kernel_size` and the `depth`.  What effect does it have on the inverse receptive field?

* Do you notice any edge effects?  Why do these appear?  What happens if you change the padding?

Once you have played with this, look at the difference between left, right, bottom, and top when you adjust the convolutions:

* Did you notice that the coordinates for convolutions are inverted from image coordinates?
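
The size of the affected area can also be measured directly. The sketch below is a simplified stand-in for the widget: it uses a zero background instead of -1 and sets every weight to 1 so that nothing cancels. `dot_footprint` is a hypothetical helper written for this example.

```python
import torch
from torch.nn import Sequential, Conv2d

def dot_footprint(kernel_size, depth, size=32):
    """Side length of the square reached by one centre pixel (all-ones kernels)."""
    layers = [Conv2d(1, 1, kernel_size, padding=kernel_size // 2, bias=False)
              for _ in range(depth)]
    net = Sequential(*layers)
    with torch.no_grad():
        for layer in net:
            layer.weight.fill_(1.0)
        dot = torch.zeros(1, 1, size, size)
        dot[0, 0, size // 2, size // 2] = 1.0
        out = net(dot)[0, 0]
    # Count the rows that contain at least one affected pixel.
    touched_rows = (out != 0).any(dim=1).sum().item()
    return touched_rows

side_3x3 = dot_footprint(kernel_size=3, depth=3)
side_5x5 = dot_footprint(kernel_size=5, depth=2)
print(side_3x3, side_5x5)
```

Each layer widens the footprint by `kernel_size - 1` pixels, so the side length is `depth * (kernel_size - 1) + 1`. With the notebook's -1 background, zero padding additionally makes the border differ from the interior, which is one source of edge effects.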


In [None]:
ConvolutionWidget(ddata, kernel_size=3, depth=3, padding=1)

## 7. Visualize the receptive field of a stack of convolutions

Read and understand the code below.

Experiment with a different stack of convolutions.  What does it tell you about the receptive field?
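
An alternative way to see a receptive field, sketched here under the assumption of a purely linear stack like the one in this section, is to ask autograd which input pixels have a nonzero gradient with respect to one output pixel. This does not use `sliding_window`; it is a different technique that should highlight the same region.

```python
import torch
from torch.nn import Sequential, Conv2d

torch.manual_seed(0)
net = Sequential(
    Conv2d(1, 1, kernel_size=3, padding=1, bias=False),
    Conv2d(1, 1, kernel_size=3, padding=1, bias=False),
)

x = torch.zeros(1, 1, 32, 32, requires_grad=True)
net(x)[0, 0, 16, 16].backward()     # gradient of a single output pixel
grad = x.grad[0, 0]

# Rows and columns of the input that influence output pixel (16, 16).
rows = grad.abs().sum(dim=1).nonzero().flatten()
cols = grad.abs().sum(dim=0).nonzero().flatten()
field = (len(rows), len(cols))
print(field, rows.min().item(), rows.max().item())
```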

In [None]:
from baukit import show, renormalize
from torch.nn import Sequential, Conv2d
import torch
from cs7150 import sliding_window

with torch.no_grad():
    net = Sequential(
        Conv2d(1, 1, kernel_size=3, padding=1, bias=False),
        Conv2d(1, 1, kernel_size=3, padding=1, bias=False),
    )

    heatmap = torch.zeros(32, 32)
    for inp in sliding_window(heatmap):
        out = net(inp[None])[0,16,16]
        heatmap += inp * out * 30

    show(show.style(width=150, imageRendering='pixelated'), renormalize.as_image(heatmap[None]))


## 8. Load a pretrained alexnet

The code below loads a pretrained AlexNet, the famous network published by Alex Krizhevsky in 2012.

Examine the network's layers.  Notice that `net.features` is a stack of convolutions.
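
The receptive-field idea from the earlier sections carries over to this stack. The sketch below applies the standard recurrence (each layer adds `(kernel - 1) * jump` pixels, and strides multiply the jump) to the kernel sizes and strides of torchvision's AlexNet `features`, which you can check against the printout. The layer names `conv1`, `pool1`, and so on are labels invented for this example.

```python
# (name, kernel_size, stride) for the convolution and pooling layers in
# torchvision's AlexNet `features`; ReLU layers do not change the receptive field.
layers = [
    ("conv1", 11, 4), ("pool1", 3, 2),
    ("conv2", 5, 1), ("pool2", 3, 2),
    ("conv3", 3, 1), ("conv4", 3, 1), ("conv5", 3, 1),
    ("pool3", 3, 2),
]

def receptive_fields(layers):
    """Receptive field size (in input pixels) after each layer."""
    field, jump = 1, 1
    sizes = {}
    for name, kernel, stride in layers:
        field += (kernel - 1) * jump   # each layer widens the field
        jump *= stride                 # strides spread later layers further apart
        sizes[name] = field
    return sizes

sizes = receptive_fields(layers)
for name, size in sizes.items():
    print(name, size)
```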

In [None]:
net = alexnet(pretrained=True)
net

## 9. Test the accuracy of alexnet

The code below downloads a small sample of ImageNet and tests the accuracy of AlexNet on it.

It shows the first 12 examples.  How does it do?

 1. Modify the code (remove the "break") so that it tests all 10k training examples.
 2. Now modify the code (change from the "/train" directory to the "/val" directory) to test it on held-out examples.

What is your impression of the accuracy of the model?

In [None]:
from baukit import pbar
if not os.path.isdir('imagenet10k'):
    download_and_extract_archive('https://cs7150.baulab.info/2022-Fall/data/imagenet10k.zip', 'imagenet10k')
preprocess = Compose([
    ToTensor(),
    Resize(227),
    Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])
ds = ImageFolderSet('imagenet10k/train', transform=preprocess, classification=True, shuffle=True)
net.eval()  # inference mode: turns off the dropout in AlexNet's classifier
with torch.no_grad():
    examples = []
    correct = 0
    tested = 0
    for i, (im, label) in enumerate(pbar(ds)):
        pred = net(im[None]).argmax(1).item()
        # Tally every example that was run, including the last one before the break.
        tested += 1
        if pred == label:
            correct += 1
        if len(examples) < 12:
            examples.append([
                f'pred: {ds.classes[pred]}',
                f'true: {ds.classes[label]}',
                [renormalize.as_image(im, source=ds)]])
            if len(examples) == 12:
                show(show.WRAP, *[examples])
                break
print('correct:', correct, 'out of', tested)

## 10. Explore the convolutional stack of alexnet

The widget below runs the `features` subnetwork of alexnet on the first dataset example,
and shows the image data as it passes through.

Since each layer deals with many channels of data, each box shows the number of possible channels.

(Note that the maximum channel indices, counting from 0, are 2, 63, 191, 383, 255, 255; you can read these sizes out of the network printout below.)

Explore the different channels of alexnet filters.  Can you find any filters that look like edge-detectors?

In [None]:
net = alexnet(pretrained=True)
w = ConvolutionNetWidget(ds[0][0], net=net.features)
w

In [None]:
net.features

## 11. Extra: we will do a sliding window heatmap of alexnet's salience.

Here we will construct a new example by hand, if there is enough time, using Matt Zeiler's masking salience technique.
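
As a starting point, the general occlusion idea can be sketched as follows: mask one patch of the image at a time and record how much the model's score drops. This is an assumed, simplified version and not necessarily the exact procedure intended for this section. `occlusion_heatmap` and `corner_score` are hypothetical names, and a toy scoring function stands in for the network so the result is easy to verify.

```python
import torch

def occlusion_heatmap(score_fn, image, patch=8, stride=8, fill=0.0):
    """Drop in score when each patch of `image` (CxHxW) is masked out."""
    _, height, width = image.shape
    base = score_fn(image)
    heatmap = torch.zeros(height, width)
    for top in range(0, height - patch + 1, stride):
        for left in range(0, width - patch + 1, stride):
            masked = image.clone()
            masked[:, top:top + patch, left:left + patch] = fill
            heatmap[top:top + patch, left:left + patch] = base - score_fn(masked)
    return heatmap

# Toy scorer that only "looks at" the top-left 8x8 corner of the image.
def corner_score(im):
    return im[:, :8, :8].sum().item()

image = torch.ones(1, 32, 32)
heatmap = occlusion_heatmap(corner_score, image)
print(heatmap[:8, :8].max().item(), heatmap[8:, 8:].abs().max().item())
```

With AlexNet, the scoring function could instead return the logit of the true class, for example `lambda im: net(im[None])[0, label].item()` inside `torch.no_grad()`, with a patch size and fill value chosen to suit the normalized 227-pixel inputs.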

## 12. Explore a CNN using Polo Chau's CNN Explainer

Once you're done exploring in pytorch, you can visit the following interactive JavaScript widget, which lets you explore a convolutional network through a polished UI running in your browser (the demo uses a small VGG-style network rather than AlexNet):

https://poloclub.github.io/cnn-explainer/