<a href="https://colab.research.google.com/github/wandb/edu/blob/main/lightning/cnn/convolution_and_pooling.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Convolution and Pooling

## Utility Code

In [None]:
%%capture
import torch
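from torch import tensor as T  # torch.tensor is a function, not a module, so it cannot be imported with "import torch.tensor"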

import matplotlib.pyplot as plt
from skimage import io

# pull down images
!wget https://raw.githubusercontent.com/wandb/edu/main/lightning/cnn/ims/smiley.png
!wget https://raw.githubusercontent.com/wandb/edu/main/lightning/cnn/ims/dog.jpg

In [None]:
def prepare_kernel(kernel):
    """prepare a kernel for use with torch.conv2d
    technical details not necessary for understanding convolutions"""
    kernel = kernel.double()  # cast to double precision
    kernel = kernel - torch.mean(kernel)  # standardize: centered at 0 (out of place, so the caller's tensor is never modified)
    kernel = kernel / torch.linalg.norm(kernel)  # standardize: unit "length"
    kernel = kernel[None, :, :]  # conv2d has an input channel dimension
    kernel = kernel[None, :, :, :]  # conv2d expects a bank of filters 
    return kernel

def imshow(im):
    """convenience function for plotting images"""
    plt.imshow(torch.atleast_2d(torch.squeeze(im)), cmap="Greys")
    plt.axis("off")
    plt.colorbar()

## Load an Image

In [None]:
impath = "smiley.png"  # options: dog.jpg, smiley.png
raw_image = io.imread(impath, as_gray=True)
image = T(raw_image)[None, None, :, :]  # detail: conv2d expects batches of images with multiple channels
image = -1 * (image - torch.mean(image))

imshow(image)

## Define a convolution kernel and view it

In [None]:
kernel = T([[0, 0, 1],  # change this around to define your own kernels!
            [0, 1, 1],  # see exercises for a suggestion
            [1, 1, 1]])
# kernel = 1 - kernel  # try this to flip the kernel: dark edges to light edges, etc.
t_kernel = prepare_kernel(kernel)

imshow(kernel)
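
To see what `prepare_kernel` does to a kernel like the one above, the sketch below repeats its steps inline on the same 3×3 values and checks the result. The names `raw` and `prepared` are illustrative and not part of the notebook.

```python
import torch

# same values as the example kernel above (illustrative copy)
raw = torch.tensor([[0, 0, 1],
                    [0, 1, 1],
                    [1, 1, 1]])

prepared = raw.double()                             # integers -> double precision
prepared = prepared - prepared.mean()               # center the values at 0
prepared = prepared / torch.linalg.norm(prepared)   # scale to unit norm
prepared = prepared[None, None, :, :]               # add filter and channel dimensions

print(prepared.shape)                       # torch.Size([1, 1, 3, 3])
print(prepared.mean().item())               # approximately 0
print(torch.linalg.norm(prepared).item())   # approximately 1
```

Because the prepared kernel sums to zero, it gives no response on regions of constant brightness; it only responds where the image changes.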

## Apply the kernel and see the results

In [None]:
features = torch.conv2d(image, t_kernel)

# exercise code goes here; see below

imshow(features)
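
With its default settings (no padding, stride 1), `torch.conv2d` only places the kernel where it fits entirely inside the image, so the output is smaller than the input by `kernel_size - 1` in each spatial dimension. A minimal sketch with a random stand-in image (the 28×28 size and the names `toy_image`, `toy_kernel`, and `toy_features` are arbitrary assumptions):

```python
import torch

toy_image = torch.rand(1, 1, 28, 28, dtype=torch.double)  # (batch, channels, height, width)
toy_kernel = torch.ones(1, 1, 3, 3, dtype=torch.double)   # (filters, channels, height, width)

toy_features = torch.conv2d(toy_image, toy_kernel)
print(toy_features.shape)  # torch.Size([1, 1, 26, 26])
```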

## Exercises

#### **Exercise**: Convolutions are usually followed by nonlinearities. Add `torch.relu` to the operations above. How does the output change?
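
As a hint for what to expect: `torch.relu` replaces every negative entry with 0 and leaves non-negative entries unchanged. The sketch below uses a small hand-made tensor (values chosen arbitrarily for illustration) rather than the notebook's `features`.

```python
import torch

toy_features = torch.tensor([[-2.0, -0.5, 0.0],
                             [ 0.5,  1.0, 3.0]])

rectified = torch.relu(toy_features)
print(rectified.tolist())  # [[0.0, 0.0, 0.0], [0.5, 1.0, 3.0]]
```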

#### **Exercise**: CNNs use pooling to reduce the size of representations passing through the network. Add `torch.max_pool2d` to the operations above (after the nonlinearity, if you have one). How does the output change? _Note_: `max_pool2d` requires two arguments: the tensor to apply the pooling to, and a `kernel_size` for the pooling. Reasonable values might be between 2 and 5.
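
As a hint for the shapes involved: when no stride is given, `max_pool2d` steps by `kernel_size`, so a `kernel_size` of 2 halves the height and width and keeps the largest value in each 2×2 window. A sketch on a toy 4×4 tensor (the values are arbitrary and not taken from the notebook):

```python
import torch

# a 4x4 "feature map" with batch and channel dimensions: shape (1, 1, 4, 4)
toy_features = torch.arange(16, dtype=torch.double).reshape(1, 1, 4, 4)

pooled = torch.max_pool2d(toy_features, kernel_size=2)
print(pooled.shape)               # torch.Size([1, 1, 2, 2])
print(pooled.squeeze().tolist())  # [[5.0, 7.0], [13.0, 15.0]]
```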

#### **Exercise**: The [Sobel operator](https://en.wikipedia.org/wiki/Sobel_operator) can be used as a rough detector for edges in any direction. One piece of it, $G_x$ (defined below), makes for a nice horizontal edge detector as a convolution kernel. Try it out as a `kernel` above.

$$
G_x = \left[\begin{array}{ccc}
    1 & 2 & 1 \\
    0 & 0 & 0 \\
    -1 & -2 & -1
    \end{array}\right]
$$
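
To check that this kernel behaves as a horizontal edge detector, the sketch below applies the standard horizontal-edge Sobel kernel (rows `1 2 1`, `0 0 0`, `-1 -2 -1`) to a synthetic image with a single horizontal edge. The 6×6 image and the names `sobel`, `toy_image`, and `response` are assumptions made for illustration.

```python
import torch

# horizontal-edge Sobel kernel, shaped (filters, channels, height, width)
sobel = torch.tensor([[ 1.,  2.,  1.],
                      [ 0.,  0.,  0.],
                      [-1., -2., -1.]], dtype=torch.double)[None, None, :, :]

# synthetic 6x6 image: top half is 1, bottom half is 0, giving one horizontal edge
toy_image = torch.zeros(1, 1, 6, 6, dtype=torch.double)
toy_image[:, :, :3, :] = 1.0

response = torch.conv2d(toy_image, sobel)
print(response.squeeze().tolist())
# [[0.0, 0.0, 0.0, 0.0], [4.0, 4.0, 4.0, 4.0], [4.0, 4.0, 4.0, 4.0], [0.0, 0.0, 0.0, 0.0]]
```

The response is nonzero only in the rows whose 3×3 window straddles the edge. Transposing `toy_image` so that the edge runs vertically would give a response of zero everywhere.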