# Lab 4: Introduction to Convolutional Neural Networks

## 1 Numpy Edge Detection
To build intuition about convolutions, we begin by implementing an image edge detection filter in NumPy.

In [None]:
from skimage import data
import matplotlib.pyplot as plt

# Load and visualize a sample image
camera = data.camera()
plt.figure(figsize=(4, 4))
plt.imshow(camera, cmap='gray')
plt.axis('off')
plt.show()

In [None]:
import numpy as np

# Initialize the edge detection kernel
# kernel (numpy.ndarray): 3x3 edge detection kernel
kernel = np.array([[-1, -1, -1],
                   [-1, 8, -1],
                   [-1, -1, -1]])
kernel = kernel / 8.0
print(f'Edge detection kernel:\n\n {kernel}')

### 1.1 Numpy 2D Convolution
Write a double for loop to convolve the edge detection kernel with the image. Apply the filter with `stride=2`. Plot the absolute value of the edge detection output using matplotlib's `imshow`.
Your final output should look like the image below.

<center>
<img src="https://drive.google.com/uc?id=11fFJ3QrbF87w8ChZF45p-5uXtmxZU8Qv">
</center>
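To see how the stride affects the number of output values, here is a minimal 1D sketch of the same sliding-window idea. The function name `strided_correlate_1d` and the toy signal are illustrative assumptions, not part of the lab. As in deep learning libraries, the kernel is not flipped, so strictly speaking this is a cross-correlation.

```python
import numpy as np

def strided_correlate_1d(signal, kernel, stride=1):
    """Slide `kernel` over `signal` (no padding), moving `stride` samples per step."""
    k = len(kernel)
    # number of window positions that fit entirely inside the signal
    n_out = (len(signal) - k) // stride + 1
    out = np.zeros(n_out)
    for i in range(n_out):
        start = i * stride
        # elementwise product of the window and the kernel, then sum
        out[i] = np.sum(signal[start:start + k] * kernel)
    return out

# a step signal and a simple difference kernel (hypothetical toy data)
signal = np.array([0., 0., 0., 1., 1., 1., 1.])
diff_kernel = np.array([-1., 0., 1.])

result = strided_correlate_1d(signal, diff_kernel, stride=2)
print(result)  # [0. 1. 0.]
```

The only non-zero response is at the window that straddles the step, which is the 1D analogue of an edge. The 2D case adds a second loop over the other spatial axis.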




In [None]:
# height and width of the image
H, W = camera.shape

# stride of the convolution
stride = 2

# edge detection kernel size
kernel_size = kernel.shape[0]


### TODO: convolve edge detection kernel with the camera image


### TODO: plot absolute value of the output


## 2 PyTorch Convolution

Now let's take a look at `torch.nn.Conv2d`. Run the cell below to convolve 5 randomly initialized kernels with the cameraman image and inspect the shapes of the output and the parameters:


In [None]:
import torch

# initialize convolutional kernel
conv_nn = torch.nn.Conv2d(1, 5, kernel_size=3, stride=2)

# set the kernel bias to zero
conv_nn.bias.data.zero_()

# convert camera image to a torch.tensor of shape (1, 1, H, W)
img_in = torch.tensor(camera, dtype=torch.float32)[None, None, :, :]

# forward pass
filtered_camera = conv_nn(img_in)

print(f'output shape (batch_size, out_channels, H_out, W_out): {filtered_camera.shape}')
print(f'kernel shape (out_channels, in_channels, kernel_size[0], kernel_size[1]): {conv_nn.weight.data.shape}\n')

# to compute the output size, keep in mind these parameters and the H_out, W_out formula from the torch.nn.Conv2d docs
print('Convolution layer parameters:')
print(f'Dilation: {conv_nn.dilation}')
print(f'Stride: {conv_nn.stride}')
print(f'Padding: {conv_nn.padding}')
print(f'Kernel size: {conv_nn.kernel_size}')
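The `torch.nn.Conv2d` documentation gives the output height as $H_{out} = \lfloor (H_{in} + 2 \cdot \text{padding} - \text{dilation} \cdot (\text{kernel\_size} - 1) - 1) / \text{stride} + 1 \rfloor$, and the same expression applies to the width. A small helper, hypothetically named `conv2d_out_size`, evaluates that formula along one axis so you can check the shapes printed above by hand:

```python
def conv2d_out_size(n_in, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along one spatial axis, following the Conv2d docs formula."""
    return (n_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

# a 12-pixel axis with a 3x3 kernel, stride 2, no padding
print(conv2d_out_size(12, kernel_size=3, stride=2))  # 5

# padding of 1 with a 3x3 kernel and stride 1 preserves the size
print(conv2d_out_size(12, kernel_size=3, stride=1, padding=1))  # 12
```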


### 2.1 Functional 2D Convolution

Consider a minibatch of randomly generated images (`toy_train_images`). Pass these images through the randomly initialized convolutional layer above.

Take the weights from the convolution layer above and implement the convolution using a double for loop.

**Note**: By default, PyTorch uses the channels-first representation of images $(N, C, H, W)$, as opposed to channels-last $(N, H, W, C)$, where $N=$ number of samples, $C=$ number of image channels (e.g. 3 for RGB), $H=$ image height, and $W=$ image width.


In [None]:
import torch.nn as nn
import copy

# toy minibatch hyperparameters
mini_batch = 10
height, width = (12, 12)
in_channels = 1
out_channels = 5

# generating minibatch from uniform distribution
toy_train_images = torch.rand(mini_batch, in_channels, height, width) 


### TODO: Copy the weights from previous cell's convolution layer
### Hint: You can use copy.deepcopy
my_weights = None

def my_conv_nn(X, kernel_weights):
    """Uses a double for loop to convolve the input images `X` with `kernel_weights`
    with a fixed stride of 2.

    Args:
        X (torch.Tensor): a minibatch of images of shape (batch_size, in_channels, H, W)
        kernel_weights (torch.Tensor): convolution kernels of shape (out_channels, in_channels, 3, 3)

    Returns:
        (torch.Tensor): Convolution result

    Shape:
        - X: Of shape (N, C_in, H_in, W_in)
        - kernel_weights: Of shape (C_out, C_in, 3, 3)
        - output: (N, C_out, H_out, W_out)
    """
    # convolution hyperparameters (read the spatial size from the input itself)
    H, W = X.shape[-2:]
    stride = 2

    ### TODO: Use for loop to implement a convolution

    return None

Confirm your custom function has the same behavior as `torch.nn.Conv2d` on the toy minibatch.

In [None]:
my_out = my_conv_nn(toy_train_images, my_weights)
torch_out = conv_nn(toy_train_images)
assert my_weights.shape == (5, 1, 3, 3), f"Incorrect shape for 'my_weights' ({my_weights.shape})."
assert torch.is_tensor(my_out), "Your function output is not a torch.Tensor"
assert my_out.shape == torch_out.shape, f"Incorrect output shape ({my_out.shape})."
assert torch.norm(my_out - torch_out) < 1e-3, "Incorrect function output values compared to torch module"
print('Well done! Your function has the same behaviour as torch.nn.Conv2d')

### 2.2 Modular 2D Convolution

Build a small convnet using `torch.nn.Module` with two convolutional layers and forward pass the astronaut image through it.<br> You do not need to train the model for this exercise. Use `torch.nn.Conv2d` for the convolutional layers.

The convnet should have the following specifications:<br>

* Activation function: `ReLU` <br>
* Layer1: convolution layer with filter size `(5,5)`, out_channels `16`, stride `2` <br>
* Layer2: `(2,2)` pooling layer <br>
* Layer3: convolution layer with filter size `(3,3)`, out_channels `32` <br>
* Layer4: linear layer with `5` output features


In your forward function add print statements to show the size of the image at each layer. 
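Before writing the model, it can help to trace the spatial size through the layers by hand, since the linear layer needs to know how many features the flattened feature map contains. The sketch below does this for a hypothetical 64x64 input (not the astronaut image). It assumes no padding, a stride of 1 for the second convolution, and a pooling stride equal to the pooling window, which are the PyTorch defaults. The helper name `conv_out` is made up for illustration.

```python
def conv_out(n, kernel_size, stride=1, padding=0):
    """Output length along one axis for a convolution or pooling window."""
    return (n + 2 * padding - kernel_size) // stride + 1

size_in = 64                                 # hypothetical square input, 64x64
after_conv1 = conv_out(size_in, 5, stride=2)      # 5x5 conv, stride 2
after_pool = conv_out(after_conv1, 2, stride=2)   # 2x2 pooling
after_conv2 = conv_out(after_pool, 3)             # 3x3 conv, stride 1

# the second convolution has 32 output channels
flat_features = 32 * after_conv2 * after_conv2

print(after_conv1, after_pool, after_conv2)  # 30 15 13
print(flat_features)                         # 5408
```

Repeating the same arithmetic with the real image size gives the `in_features` value for the linear layer, and the print statements in your `forward` should agree with it.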


In [None]:
import matplotlib.pyplot as plt
import numpy as np

from skimage import data

# load the astronaut image
astronaut_np = data.astronaut()
print(f"astronaut.shape: {astronaut_np.shape}")

# visualize the original astronaut image
fig = plt.figure()
fig.suptitle("Astronaut")
plt.imshow(astronaut_np)
plt.axis('off')
plt.show()

Run the cell below to convert the astronaut image into a tensor and reshape it into the shape that PyTorch expects.

In [None]:
# convert the astronaut image to torch.tensor
astronaut = torch.tensor(astronaut_np, dtype=torch.float32)

# torch convolutions expect channels first representation
# of shape (N, C, H, W)
astronaut = astronaut.permute(2, 0, 1).unsqueeze(0)
print(f'astronaut.shape: {astronaut.shape}')

In [None]:
import torch.nn.functional as F
from skimage import data

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        ### TODO: Define layers based on description above
        self.conv1 = None
        self.pool = None
        self.conv2 = None
        self.linear = None


    def forward(self, x):
        ### TODO: Complete the forward pass, print the image size after each layer
        pass

# forward pass the channels-first astronaut tensor of shape (1, 3, H, W)
# created in the previous cell
print(f"astronaut.shape: {astronaut.shape}")

model = MyModel()
model(astronaut)

## 3 Pretrained AlexNet Model


In this section, we will visualize a subset of the first layer filters of AlexNet and the result of applying these filters to the astronaut image.

Run the cell below to download the trained [AlexNet](https://papers.nips.cc/paper/2012/hash/c399862d3b9d6b76c8436e924a68c45b-Abstract.html) model using [PyTorch Hub](https://pytorch.org/docs/stable/hub.html)'s [`torch.hub.load()`](https://pytorch.org/docs/stable/hub.html#torch.hub.load) method. The model is switched to `eval()` mode since we will not be doing any training in this lab:

In [None]:
import torch

# load the alexnet model using pytorch hub from:
# https://github.com/pytorch/vision/blob/winbuild/v0.6.0/torchvision/models/alexnet.py
model = torch.hub.load('pytorch/vision:v0.6.0', 'alexnet', pretrained=True)

# switch the model to "eval" mode since we are not doing any further training
model.eval()

# print the model architecture
print(model)

Since we are using a pretrained model, we need to make sure that our data has a similar distribution to the data the model was trained on. In our case, this means preprocessing the image in the same way as the [original training pipeline](https://github.com/pytorch/examples/blob/97304e232807082c2e7b54c597615dc0ad8f6173/imagenet/main.py#L197-L198): scale the pixel values to [0, 1], then normalize each channel with the ImageNet mean and standard deviation.

Run the cell below to preprocess and visualize the astronaut image.

In [None]:
import matplotlib.pyplot as plt
import numpy as np

from skimage import data

def image_normalizer(image):
    r"""Rescales the input to the range [0, 1] (min-max scaling).

    Args:
        image (np.ndarray or torch.Tensor): image to be rescaled

    Returns:
        (np.ndarray or torch.Tensor): rescaled image

    Shape:
        - image: (*) Any shape
        - output: Same shape as input
    """
    return (image - image.min()) / (image.max() - image.min())

# the per-channel mean and standard deviation of the ImageNet dataset
# that were used for preprocessing the AlexNet training data
# (kept as float32 torch tensors so the result matches the model's dtype)
mean = torch.tensor([0.485, 0.456, 0.406])
std = torch.tensor([0.229, 0.224, 0.225])

# preprocess the astronaut tensor from Part 2 (shape (1, 3, H, W), values in [0, 255])
astronaut_processed = astronaut / 255.0
astronaut_processed = (astronaut_processed - mean[None, :, None, None]) / std[None, :, None, None]

# visualize the original and preprocessed astronaut image
astro_processed_np = astronaut_processed.squeeze().permute(1, 2, 0).cpu().numpy()
fig, ax = plt.subplots(1, 2)
ax[0].set_title("Original Image")
ax[0].imshow(astronaut_np)
ax[0].axis('off')
ax[1].set_title("Preprocessed Image")
ax[1].imshow(image_normalizer(astro_processed_np))
ax[1].axis('off')
plt.show()
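As a quick sanity check on the broadcasting used in the preprocessing cell, the sketch below applies the same per-channel normalization to a tiny synthetic batch in NumPy. The batch is a made-up example in which every pixel equals the ImageNet channel mean (on the 0 to 255 scale), so the normalized result should be approximately zero everywhere.

```python
import numpy as np

mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])

# hypothetical channels-first batch: one 2x2 RGB image whose pixels
# all equal the channel mean expressed on the 0-255 scale
batch = np.ones((1, 3, 2, 2)) * (mean * 255.0)[None, :, None, None]

# indexing with [None, :, None, None] reshapes (3,) to (1, 3, 1, 1),
# so each channel is shifted and scaled by its own statistics
normalized = (batch / 255.0 - mean[None, :, None, None]) / std[None, :, None, None]

print(normalized.shape)
print(np.allclose(normalized, 0.0))
```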

### 3.1 Visualizing AlexNet Kernels

The kernels (filters) in the first layer of AlexNet are of size 11×11. Visualize a randomly selected subset of 20 of these first-layer filters, as well as the respective output of convolving each kernel with the astronaut image. Your answer will look something like this:
<center>
<img src="https://drive.google.com/uc?id=1azdohWuo3EEO9KC0szZmJOywX54jXlPz">
</center>
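Kernel weights are stored channels-first and contain negative values, so they cannot be passed to `imshow` directly. The sketch below shows one way to prepare a single kernel for display, using a randomly generated stand-in of shape (3, 11, 11) rather than real AlexNet weights. The min-max scaling mirrors what the `image_normalizer()` function in the cell above does.

```python
import numpy as np

rng = np.random.default_rng(0)

# stand-in for one first-layer kernel: (in_channels, height, width)
fake_kernel = rng.standard_normal((3, 11, 11))

# move channels last so the array has the (H, W, C) layout imshow expects
kernel_hwc = np.transpose(fake_kernel, (1, 2, 0))

# min-max scale to [0, 1] so negative weights become valid colour values
kernel_vis = (kernel_hwc - kernel_hwc.min()) / (kernel_hwc.max() - kernel_hwc.min())

print(kernel_vis.shape)                    # (11, 11, 3)
print(kernel_vis.min(), kernel_vis.max())  # 0.0 1.0
```

With a torch tensor, `permute(1, 2, 0)` plays the role of `np.transpose`, and the tensor typically needs `.detach()` before conversion to NumPy because the weights track gradients.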

In [None]:
# random seed
np.random.seed(691)

# get the weights of the first layer's kernels of the model
# https://github.com/pytorch/vision/blob/9dff1b40ee9741216686556cc59fbf16964c8156/torchvision/models/alexnet.py#L18
conv_sequential = model.features
conv0 = conv_sequential[0]
conv0_weights = conv0.weight

# indices of kernels to show
random_inds = np.random.permutation(64)[0:20]

### TODO: convolve the astronaut image with the kernel weights and obtain the outputs


### TODO: plot the 20 kernels corresponding to the 20 indices in `random_inds` and
### their convolution outputs. You may use the `image_normalizer()` function
### provided in the cell above to scale the kernel weights and outputs
### for visualization.
for i, ind in enumerate(random_inds):
    pass