# Convolutional Neural Networks

We'll explore convolutions and implement some convolutional neural networks. 

## Convolutions in 1D

We'll start by implementing a simple 1D convolution with a _rectangular kernel_ that works as a running average.

In [None]:
import numpy as np

signal = np.array([1, 2, 3, 4, 5, 4, 3, 2, 1])

filter1d = np.ones(2) / 2

conv1d_length = signal.shape[0] - filter1d.shape[0] + 1
conv1d = np.zeros((conv1d_length,))
for i in range(conv1d_length):
    conv1d[i] = np.sum(signal[i:i+filter1d.shape[0]] * filter1d)

print(conv1d)

Try modifying the code above by changing the `signal` and the kernel (`filter1d`).
For example, use the following kernels:

- _Prewitt kernel_ `[1, 0, -1]` for edge detection or differentiation (equal to the _Sobel kernel_ in 1D).

- _Gaussian kernel_ `[.25, .5, .75, .5, .25]` for smoothing (divide it by its sum, `2.25`, to keep the overall signal level unchanged).
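The loop above slides the kernel over the signal without flipping it, which is strictly a cross-correlation; for symmetric kernels the two operations coincide. As a rough cross-check, the sketch below wraps the loop in a helper (the name `correlate1d_valid` is ours, not part of the notebook) and compares it with NumPy's `np.correlate` and `np.convolve` for the Prewitt kernel.

```python
import numpy as np


def correlate1d_valid(signal, kernel):
    """Hypothetical helper: slide `kernel` over `signal` without flipping it."""
    length = signal.shape[0] - kernel.shape[0] + 1
    out = np.zeros(length)
    for i in range(length):
        out[i] = np.sum(signal[i:i + kernel.shape[0]] * kernel)
    return out


signal = np.array([1, 2, 3, 4, 5, 4, 3, 2, 1])
prewitt = np.array([1, 0, -1])

loop_result = correlate1d_valid(signal, prewitt)
corr_result = np.correlate(signal, prewitt, mode="valid")  # no kernel flip
conv_result = np.convolve(signal, prewitt, mode="valid")   # flips the kernel

print(loop_result)
print(corr_result)
print(conv_result)
```

For this signal the loop and `np.correlate` agree, while `np.convolve` returns the same values with the opposite sign because it flips the antisymmetric kernel.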

## Convolutions in 2D

We'll now move on and implement a 2D convolution with a _rectangular kernel_ that works as a local averaging.

In [None]:
import numpy as np

image = np.array([
    [1, 1, 0, 0, 1, 1, 0, 0, 1, 1], 
    [1, 1, 0, 0, 1, 1, 0, 0, 1, 1], 
    [0, 0, 1, 1, 0, 0, 1, 1, 0, 0], 
    [0, 0, 1, 1, 0, 0, 1, 1, 0, 0], 
])

filter2d = np.ones((2, 2)) / 4

conv2d_height = image.shape[0] - filter2d.shape[0] + 1
conv2d_width = image.shape[1] - filter2d.shape[1] + 1
conv2d = np.zeros((conv2d_height, conv2d_width))
for i in range(conv2d_height):
    for j in range(conv2d_width):
        conv2d[i, j] = np.sum(
            image[i:i+filter2d.shape[0], j:j+filter2d.shape[1]] * filter2d
        )
        
print(conv2d)

Try modifying the code above by changing the `image` and the kernel (`filter2d`).
For example, use the following kernels:

- _Prewitt kernel_ to detect edges:
    ```python
    kernel = np.array([
        [-1, 0, 1],
        [-1, 0, 1],
        [-1, 0, 1]
    ])
    ```

- _Sobel kernel_ to detect edges:
    ```python
    kernel = np.array([
        [-1, 0, 1],
        [-2, 0, 2],
        [-1, 0, 1]
    ])
    ```

- _Gaussian kernel_ to smooth the image:
    ```python
    kernel = np.array([
        [.04, .08, .12, .08, .04], 
        [.08, .16, .24, .16, .08], 
        [.12, .24, .36, .24, .12], 
        [.08, .16, .24, .16, .08], 
        [.04, .08, .12, .08, .04]
    ])
    ```
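To try these kernels without rewriting the nested loops each time, one option is a small reusable helper. The sketch below is one possible vectorized version, assuming NumPy 1.20 or later for `sliding_window_view`; the helper name `correlate2d_valid` is ours. Like the loop above, it does not flip the kernel.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def correlate2d_valid(image, kernel):
    """Hypothetical helper: slide `kernel` over `image` without flipping it."""
    # windows has shape (out_height, out_width, kernel_height, kernel_width).
    windows = sliding_window_view(image, kernel.shape)
    # Multiply each window by the kernel and sum over the kernel axes.
    return np.einsum("ijkl,kl->ij", windows, kernel)


image = np.array([
    [1, 1, 0, 0, 1, 1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1, 1, 0, 0, 1, 1],
    [0, 0, 1, 1, 0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0, 1, 1, 0, 0],
])

prewitt = np.array([
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
])

average = correlate2d_valid(image, np.ones((2, 2)) / 4)
edges = correlate2d_valid(image, prewitt)

print(average.shape, edges.shape)
print(edges)
```

The output shrinks by the kernel size minus one along each axis, so the $2 \times 2$ average of the $4 \times 10$ image has shape `(3, 9)` and the $3 \times 3$ Prewitt response has shape `(2, 8)`.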

## Convolutional Layers

We'll now implement a convolutional layer in PyTorch. 

### Sample Image

We start by creating a grayscale image, i.e., an image with a single color channel, stored in the `image` PyTorch tensor.
This tensor has four dimensions, corresponding to batch size (`1`), color channels (`1`), height (`H`), and width (`W`).

In [None]:
import torch

H = 12
W = 16
S = 4
image = torch.zeros(1, 1, H, W)
for idx in range(0, H, S):
    for idy in range(0, W, S):
        image[0, 0, idx:idx + S, idy:idy + S] = (-1)**(idx // S + idy // S)

We can now implement the `plot_image()` function...

```python
def plot_image(image):
    import matplotlib.pyplot as plt
    
    plt.imshow(image, cmap="gray", aspect="equal", 
               extent=[0, image.shape[1], 0, image.shape[0]])
    plt.colorbar()
    plt.xticks(range(0, image.shape[1] + 1))
    plt.yticks(range(0, image.shape[0] + 1))
    plt.grid(color="red", linewidth=1)
    plt.tight_layout()
    plt.show()
```

... and use it to show the image.

In [None]:
from plotting_cnn import plot_image

plot_image(image.squeeze())

### Convolutional Layer

We'll now implement a convolutional layer in PyTorch with one input channel (`in_channels=1`), two output channels (`out_channels=2`), and a rectangular kernel with size $1 \times 3$ (`kernel_size=(1, 3)`).

We then initialize its weights so that the first output channel performs a local averaging and the second detects edges along the horizontal direction.

In [None]:
import torch.nn as nn

conv = nn.Conv2d(in_channels=1, out_channels=2, kernel_size=(1, 3))
filters = torch.zeros(conv.out_channels, conv.in_channels, *conv.kernel_size)
filters[0, 0, :, :] = torch.tensor([[1.0, 1.0, 1.0]]) / 3  # local averaging
filters[1, 0, :, :] = torch.tensor([[-1.0, 0.0, 1.0]])  # horizontal difference
conv.weight = nn.Parameter(filters)
nn.init.zeros_(conv.bias)  # the default bias is random; zero it so only the filters act

features_conv = conv(image)
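With no padding and a stride of 1, each spatial dimension shrinks by the kernel size minus one, so a $1 \times 3$ kernel leaves the height unchanged and reduces the width by 2. The sketch below checks this on a hand-made row with a single step edge. It uses `bias=False` as a simplifying assumption, so that the output depends only on the two hand-set filters.

```python
import torch
import torch.nn as nn

# Assumption: bias=False so the output depends only on the two filters.
conv = nn.Conv2d(in_channels=1, out_channels=2, kernel_size=(1, 3), bias=False)
with torch.no_grad():
    conv.weight[0, 0] = torch.tensor([[1.0, 1.0, 1.0]]) / 3  # local averaging
    conv.weight[1, 0] = torch.tensor([[-1.0, 0.0, 1.0]])     # horizontal difference

# One row with a step from 0 to 3, shaped (batch, channels, height, width).
row = torch.tensor([[[[0.0, 0.0, 0.0, 3.0, 3.0, 3.0]]]])
features = conv(row).detach()

print(features.shape)  # the width shrinks from 6 to 4
print(features[0, 0])  # averaging channel: the step is smoothed into a ramp
print(features[0, 1])  # difference channel: non-zero only around the step

# The same rule applied to a 12 x 16 input gives a 12 x 14 output.
print(conv(torch.zeros(1, 1, 12, 16)).shape)
```

The difference channel responds only where the window straddles the step, which is why the second filter acts as an edge detector along the horizontal direction.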

We can now implement the `plot_channels()` function...

```python
def plot_channels(channels):
    import matplotlib.pyplot as plt

    fig, axs = plt.subplots(1, channels.shape[0], figsize=(15, 5))

    for i, (channel, ax) in enumerate(zip(channels, axs)):
        im = ax.imshow(channel, cmap="gray", aspect="equal", 
                       extent=[0, channel.shape[1], 0, channel.shape[0]])
        fig.colorbar(im, ax=ax)
        ax.set_title(f"Channel {i}")
        ax.set_xticks(range(0, channel.shape[1] + 1))
        ax.set_yticks(range(0, channel.shape[0] + 1))
        ax.grid(color="red", linewidth=1)

    plt.tight_layout()
    plt.show()
```

... and use it to plot the output features of the convolutional layer.

In [None]:
from plotting_cnn import plot_channels

plot_channels(features_conv[0].detach())

### ReLU Activation

We'll now add the `torch.nn.ReLU` activation to the convolutional layer.

To combine the convolutional layer and the ReLU, we use the `torch.nn.Sequential` container, which applies its layers one after the other.

In [None]:
relu = nn.ReLU()
model_relu = nn.Sequential(conv, relu)
features_relu = model_relu(image)
plot_channels(features_relu[0].detach())
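To see what the activation does in isolation, the sketch below applies `nn.ReLU` to a few hand-picked values. The `nn.Identity` layer is only a stand-in for the convolutional layer, used here to show that `nn.Sequential` applies its modules in order.

```python
import torch
import torch.nn as nn

relu = nn.ReLU()
values = torch.tensor([-2.0, -0.5, 0.0, 0.5, 2.0])

# Negative entries become 0; non-negative entries pass through unchanged.
activated = relu(values)
print(activated)

# nn.Sequential chains modules, so this computes relu(identity(values)).
pipeline = nn.Sequential(nn.Identity(), nn.ReLU())
print(torch.equal(pipeline(values), activated))
```

In the edge-detection channel above, this means that only differences of one sign survive the activation.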

### Pooling Layer

We'll now add to the convolutional layer a `torch.nn.MaxPool2d` pooling layer with a rectangular kernel with size $2 \times 1$ (`kernel_size=(2, 1)`) and a stride of 2 in the vertical direction and 1 in the horizontal direction (`stride=(2, 1)`).

In [None]:
pool = nn.MaxPool2d(kernel_size=(2, 1), stride=(2, 1))
model_pool = nn.Sequential(conv, pool)
features_pool = model_pool(image)
plot_channels(features_pool[0].detach())
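As a small illustration of this pooling configuration, the sketch below applies the same `nn.MaxPool2d` settings to a hand-made $4 \times 2$ tensor, so the effect can be read off directly.

```python
import torch
import torch.nn as nn

pool = nn.MaxPool2d(kernel_size=(2, 1), stride=(2, 1))

# A 4 x 2 single-channel input shaped (batch, channels, height, width).
x = torch.arange(8, dtype=torch.float32).reshape(1, 1, 4, 2)
pooled = pool(x)

print(x[0, 0])       # rows [0, 1], [2, 3], [4, 5], [6, 7]
print(pooled[0, 0])  # maximum over each pair of consecutive rows
print(pooled.shape)  # the height is halved, the width is unchanged
```

Applied to the $12 \times 14$ feature maps of the convolutional layer, the same layer produces $6 \times 14$ feature maps.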

### Upsampling Layer

We'll now add to the convolutional layer a `torch.nn.Upsample` upsampling layer with a scale factor of 2 in the vertical direction and 1 in the horizontal direction (`scale_factor=(2, 1)`).

In [None]:
upsample = nn.Upsample(scale_factor=(2, 1))
model_upsample = nn.Sequential(conv, upsample)
features_upsample = model_upsample(image)
plot_channels(features_upsample[0].detach())
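The sketch below applies the same upsampling settings to a hand-made $2 \times 2$ tensor. It relies on the default interpolation mode of `nn.Upsample`, which is nearest-neighbor, so each row is simply repeated.

```python
import torch
import torch.nn as nn

upsample = nn.Upsample(scale_factor=(2, 1))  # default mode is "nearest"

# A 2 x 2 single-channel input shaped (batch, channels, height, width).
x = torch.tensor([[1.0, 2.0], [3.0, 4.0]]).reshape(1, 1, 2, 2)
upsampled = upsample(x)

print(upsampled[0, 0])  # each row appears twice
print(upsampled.shape)  # the height doubles, the width is unchanged
```

Applied to the $12 \times 14$ feature maps of the convolutional layer, the same layer produces $24 \times 14$ feature maps.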

## CNN Architectures for Image Transformation

We can combine multiple convolutional, activation, downsampling, and upsampling layers to construct complex convolutional architectures to transform images.

Here, we show an example with two convolutional layers with ReLU activation and max-pooling.

In [None]:
model_trans = nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=16, kernel_size=3),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),
    nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),
)

image_trans = model_trans(image)

print(f"Input image with shape {image.shape}")
print(f"Output image with shape {image_trans.shape}")
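The output shape can also be worked out by hand. For a layer with kernel size $k$, stride $s$, and padding $p$ (and no dilation), an input of size $n$ becomes $\lfloor (n + 2p - k) / s \rfloor + 1$. The sketch below traces the $12 \times 16$ image through the four shape-changing layers of the model above; the helper names `output_size` and `trace_shapes` are ours.

```python
def output_size(size, kernel_size, stride=1, padding=0):
    """Output length of a convolution or pooling layer without dilation."""
    return (size + 2 * padding - kernel_size) // stride + 1


def trace_shapes(height, width, layers):
    """Hypothetical helper: list the (height, width) after each layer."""
    shapes = [(height, width)]
    for kernel_size, stride in layers:
        height = output_size(height, kernel_size, stride)
        width = output_size(width, kernel_size, stride)
        shapes.append((height, width))
    return shapes


# (kernel_size, stride) for conv, pool, conv, pool; ReLU keeps the shape.
layers = [(3, 1), (2, 2), (3, 1), (2, 2)]
shapes = trace_shapes(12, 16, layers)
print(shapes)
```

The last entry, $1 \times 2$, is the spatial size of each of the 32 output channels.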

## CNN Architectures for Image Classification

For image classification, we typically need to flatten the feature maps and use some dense layers at the output (in this case, a single dense layer followed by a log-softmax activation).

In [None]:
model_class = nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=16, kernel_size=3),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),
    nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),
    nn.Flatten(),
    # 32 channels of size 1 x 2 remain after the second pooling layer.
    nn.Linear(in_features=32 * 1 * 2, out_features=2),
    nn.LogSoftmax(dim=1),
)

classification = model_class(image)

print(f"Input image with {image.shape}")
print(f"Output classes with {classification.shape}")
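The final `nn.LogSoftmax` layer returns log-probabilities rather than probabilities. The sketch below illustrates this on random stand-in logits (not the output of the model above) and shows that `nn.NLLLoss` is the matching loss; the target label is hypothetical.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # only to make the illustrative logits reproducible

logits = torch.randn(1, 2)  # stand-in for the output of the last nn.Linear
log_probs = nn.LogSoftmax(dim=1)(logits)
probs = log_probs.exp()  # exponentiate to recover class probabilities

print(probs, probs.sum(dim=1))  # the probabilities sum to 1 for each sample

# nn.NLLLoss expects log-probabilities, so it pairs with nn.LogSoftmax.
target = torch.tensor([1])  # hypothetical class label
loss = nn.NLLLoss()(log_probs, target)

# nn.CrossEntropyLoss on the raw logits gives the same value.
print(torch.isclose(loss, nn.CrossEntropyLoss()(logits, target)))
```

When training with `nn.CrossEntropyLoss` instead, the `nn.LogSoftmax` layer would be dropped from the model, since that loss applies the log-softmax internally.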