# 3. Convolutional Neural Networks


## Lesson 1: Convolutional Neural Networks


### 1. Define a convolutional layer + a pooling layer

In [2]:
import torch.nn as nn
import torch.nn.functional as F

As with a custom MLP, a custom CNN needs two parts:

1. the layers (here a convolutional layer and a pooling layer), defined in `__init__`
2. the feedforward behavior, defined in `forward`


In [3]:
class cnn(nn.Module):
    def __init__(self):
        super().__init__()
        # convolutional layer
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=5, kernel_size=3, stride=1, padding=0)
        
        # pooling layer
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = self.pool(x)
        
        return x

**Note:**

1. `kernel_size` can be a scalar or a tuple, e.g. `kernel_size=3` is equivalent to `kernel_size=(3, 3)`
2. `stride` can be a scalar or a tuple too

[torch.nn.Conv2d doc](https://pytorch.org/docs/stable/nn.html#conv2d)

[torch.nn.MaxPool2d doc](https://pytorch.org/docs/stable/nn.html#maxpool2d)
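
The scalar/tuple equivalence noted above can be checked directly. This is a minimal sketch; the 32x32 input and the non-square `(3, 5)` kernel are assumptions chosen only for illustration.

```python
import torch
import torch.nn as nn

# A scalar kernel_size/stride is expanded to a (height, width) tuple.
conv_scalar = nn.Conv2d(in_channels=3, out_channels=5, kernel_size=3, stride=1)
conv_tuple = nn.Conv2d(in_channels=3, out_channels=5, kernel_size=(3, 3), stride=(1, 1))

print(conv_scalar.kernel_size, conv_tuple.kernel_size)
print(conv_scalar.weight.shape == conv_tuple.weight.shape)

# A tuple also allows a non-square kernel, given as (height, width).
conv_rect = nn.Conv2d(in_channels=3, out_channels=5, kernel_size=(3, 5))
x = torch.randn(1, 3, 32, 32)  # hypothetical 32x32 RGB input, shape (N, C, H, W)
rect_shape = conv_rect(x).shape
print(rect_shape)  # height shrinks by 2, width by 4
```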


### 2. Sequential Models

We can also build a CNN by grouping its layers in an `nn.Sequential` container inside `__init__`.

In [None]:
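class cnn2(nn.Module):
    def __init__(self):
        super().__init__()
        
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 2, stride=2),
            nn.MaxPool2d(2, 2),
            nn.ReLU(True),  # inplace=True; Python's boolean is `True`, not `true`
            
            nn.Conv2d(16, 32, 3, padding=1),
            nn.MaxPool2d(2, 2),
            nn.ReLU(True)
        )

    def forward(self, x):
        # run the input through every layer of the container, in order
        return self.features(x)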
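
To see what such a stack of layers produces, it can be called on a dummy batch. This is a minimal sketch that rebuilds the same layers as a standalone `nn.Sequential`; the single-channel 28x28 input is an assumption (the notes do not state an input size).

```python
import torch
import torch.nn as nn

# Same layer stack as above, built standalone so the snippet runs on its own.
features = nn.Sequential(
    nn.Conv2d(1, 16, 2, stride=2),
    nn.MaxPool2d(2, 2),
    nn.ReLU(True),
    nn.Conv2d(16, 32, 3, padding=1),
    nn.MaxPool2d(2, 2),
    nn.ReLU(True),
)

x = torch.randn(1, 1, 28, 28)  # hypothetical grayscale input, shape (N, C, H, W)
out = features(x)

# Spatial size: 28 -> 14 (conv, stride 2) -> 7 (pool) -> 7 (conv, padding 1) -> 3 (pool)
print(out.shape)
```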

**Calculate number of parameters in a CNN**

Let:

1. `K` : number of filters
2. `F` : kernel size (assume kernel is square)
3. `D_in` : input depth

Then:

1. For each filter, the number of its parameters is `F*F*D_in`
2. For *K* filters, the number of their parameters is `K*F*F*D_in`
3. Each filter also has one bias term, so the layer has `K*F*F*D_in + K` parameters in total
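
The count above can be compared against what PyTorch actually allocates. This is a minimal sketch; `conv_param_count` is a hypothetical helper written for this note, not a PyTorch function.

```python
import torch.nn as nn

def conv_param_count(K, F, D_in):
    """Weights (K*F*F*D_in) plus one bias per filter (K)."""
    return K * F * F * D_in + K

# Example layer: 10 filters of size 3x3 over a 3-channel input.
conv = nn.Conv2d(in_channels=3, out_channels=10, kernel_size=3)

expected = conv_param_count(K=10, F=3, D_in=3)
actual = sum(p.numel() for p in conv.parameters())  # weight + bias elements
print(expected, actual)
```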

**Calculate the shape of a CNN output**

Let:

1. `K` : number of filters
2. `F` : kernel size
3. `S` : kernel stride (step size)
4. `P` : padding for the input
5. `W_in` : input size (assume output of previous layer is a square)

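The spatial size (height and width) of the layer's output is $W_{out} = \left\lfloor \frac{W_{in} - F + 2P}{S} \right\rfloor + 1$, rounded down when the division is not exact. The output depth is `K`.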
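
The output-size formula can be written as a small function and compared with a real layer. This is a minimal sketch; `conv_output_size` is a hypothetical helper, and it rounds down when the stride does not divide evenly, which matches what `nn.Conv2d` does.

```python
import torch
import torch.nn as nn

def conv_output_size(W_in, F, S=1, P=0):
    """Spatial output size for a square input and square kernel."""
    return (W_in - F + 2 * P) // S + 1

# Compare the formula with an actual convolution (values chosen for illustration).
W_in, F, S, P = 28, 3, 2, 0
conv = nn.Conv2d(in_channels=1, out_channels=4, kernel_size=F, stride=S, padding=P)
out = conv(torch.randn(1, 1, W_in, W_in))

predicted = conv_output_size(W_in, F, S, P)
print(predicted, out.shape[-1])
```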

### 3. Flattening

After the last convolutional (or pooling) layer, the output feature maps must be flattened into a vector per sample, so that they can be passed to the next stage, typically a fully connected network (MLP).
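
Flattening can be sketched as follows. The batch size of 4, the `(20, 16, 16)` feature-map shape, and the 10-unit linear layer are assumptions made only for this example.

```python
import torch
import torch.nn as nn

# Hypothetical output of the last conv/pool layer: (N, C, H, W)
feature_maps = torch.randn(4, 20, 16, 16)

# Keep the batch dimension and merge C, H, W into one vector per sample.
flat = torch.flatten(feature_maps, start_dim=1)
flat_view = feature_maps.view(feature_maps.size(0), -1)  # equivalent alternative

# The flattened size must match in_features of the first fully connected layer.
fc = nn.Linear(20 * 16 * 16, 10)
logits = fc(flat)
print(flat.shape, logits.shape)
```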


### 4. Quiz

1. Quiz on depth
2. Quiz on output size
3. Quiz on number of parameters


In [46]:
# 4.2 Quiz on output size
class cnn3(nn.Module):
    def __init__(self):
        super(cnn3, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=10, kernel_size=3)
        self.pool1 = nn.MaxPool2d(kernel_size=4, stride=4)
        
        self.conv2 = nn.Conv2d(in_channels=10, out_channels=20, kernel_size=5, padding=2)
        self.pool2 = nn.MaxPool2d(2, 2)
    
    def forward(self, x):
        x = F.relu(self.pool1(self.conv1(x)))
        x = F.relu(self.pool2(self.conv2(x)))
        
        return x

In [47]:
import torch
import numpy as np

# input image is 130x130 with 3 channels; PyTorch expects (N, C, H, W), so the batch is (1, 3, 130, 130)
input_img = np.random.randn(1, 3, 130, 130)
input_tensor = torch.from_numpy(input_img).float()
print(input_tensor.shape)
print(input_tensor)

# define the cnn3
c3 = cnn3()
c3

torch.Size([1, 3, 130, 130])
tensor([[[[-1.8209e+00, -2.1117e+00, -1.5673e+00,  ...,  1.0953e-01,
            2.1953e-01, -9.2199e-01],
          [-1.9189e+00, -4.6497e-01, -2.0496e-01,  ..., -1.1678e+00,
           -7.7826e-01, -5.4527e-02],
          [-2.2725e-01,  4.9749e-01,  1.9517e-03,  ...,  2.4311e-01,
           -9.0033e-02,  7.9273e-01],
          ...,
          [ 2.1072e-01, -1.2518e+00,  1.6175e+00,  ...,  7.2644e-01,
            6.7363e-01, -2.0206e+00],
          [ 6.7155e-01,  5.9673e-01,  7.1884e-01,  ...,  8.4635e-01,
            8.0099e-01, -1.3762e+00],
          [ 4.9792e-01,  1.0314e+00, -2.3386e-01,  ..., -1.9264e+00,
            8.6474e-01, -5.2759e-01]],

         [[-4.0858e-01, -1.8259e+00, -1.6555e-01,  ..., -3.1100e-01,
            4.8991e-01, -9.1660e-01],
          [-3.6151e-01, -2.2194e-01, -1.3652e+00,  ..., -8.9918e-01,
            3.2796e-01, -6.7639e-01],
          [-1.1627e+00, -9.0478e-01, -4.8279e-02,  ..., -1.4363e+00,
           -8.5079e-01,  1.76

cnn3(
  (conv1): Conv2d(3, 10, kernel_size=(3, 3), stride=(1, 1))
  (pool1): MaxPool2d(kernel_size=4, stride=4, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(10, 20, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
  (pool2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)

**Note:**

According to this [post](https://stackoverflow.com/questions/57237381/runtimeerror-expected-4-dimensional-input-for-4-dimensional-weight-32-3-3-but) and the [torch.nn.Conv2d doc](https://pytorch.org/docs/stable/nn.html#conv2d), `nn.Conv2d` expects an input of shape $(N, C_{in}, H, W)$ and produces an output of shape $(N, C_{out}, H_{out}, W_{out})$.


In [48]:
# feed-forward
output = c3(input_tensor)
print(output.shape)

torch.Size([1, 20, 16, 16])
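
This shape can be checked by hand by applying the output-size formula layer by layer, treating each pooling layer like a kernel whose size and stride are those of the pool:

$$
\begin{aligned}
\text{conv1: } & \left\lfloor \frac{130 - 3 + 2 \cdot 0}{1} \right\rfloor + 1 = 128 \\
\text{pool1: } & \left\lfloor \frac{128 - 4}{4} \right\rfloor + 1 = 32 \\
\text{conv2: } & \left\lfloor \frac{32 - 5 + 2 \cdot 2}{1} \right\rfloor + 1 = 32 \\
\text{pool2: } & \left\lfloor \frac{32 - 2}{2} \right\rfloor + 1 = 16
\end{aligned}
$$

The depth equals the `out_channels` of the last convolution (20), which gives $(1, 20, 16, 16)$.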


In [55]:
# Quiz 3: check total number of parameters in cnn3
from torchsummary import summary
summary(c3, input_size=(3, 130, 130))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1         [-1, 10, 128, 128]             280
         MaxPool2d-2           [-1, 10, 32, 32]               0
            Conv2d-3           [-1, 20, 32, 32]           5,020
         MaxPool2d-4           [-1, 20, 16, 16]               0
Total params: 5,300
Trainable params: 5,300
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.19
Forward/backward pass size (MB): 1.52
Params size (MB): 0.02
Estimated Total Size (MB): 1.74
----------------------------------------------------------------


**Note:**

To install torchsummary, run `pip install torchsummary`.

[github repo](https://github.com/sksq96/pytorch-summary)
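
The totals in the summary can also be cross-checked without torchsummary by counting parameter elements directly. This is a minimal sketch that recreates only the two convolutional layers of `cnn3` (the pooling layers have no parameters); the `layers` dictionary is a hypothetical name used for this example.

```python
import torch.nn as nn

# Same hyperparameters as conv1 and conv2 in cnn3.
layers = {
    "conv1": nn.Conv2d(in_channels=3, out_channels=10, kernel_size=3),
    "conv2": nn.Conv2d(in_channels=10, out_channels=20, kernel_size=5, padding=2),
}

# Count weight and bias elements per layer, then sum them.
counts = {name: sum(p.numel() for p in layer.parameters())
          for name, layer in layers.items()}
total = sum(counts.values())

print(counts)  # conv1: 10*3*3*3 + 10, conv2: 20*5*5*10 + 20
print(total)
```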