Understanding the number of parameters in a convolutional layer
======================

In [3]:
import torch
import torch.nn as nn

## Example 1

### Model Architecture

Take the following simple ConvNet model with only one convolutional layer:

1. **Convolutional Layer (conv1)**
    - Input channels: 3
    - Output channels: 16
    - Kernel size: 3x3
    - Padding: 1

2. **Fully Connected Layer 1 (fc1)**
    - Input features: 16 * 16 * 16 (16 channels of size 16x16 after 2x2 max pooling, which implies 32x32 input images)
    - Output features: 32

3. **Fully Connected Layer 2 (fc2)**
    - Input features: 32
    - Output features: 2
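The `16 * 16 * 16` input size of `fc1` can be checked by tracing tensor shapes through the convolution and the pooling step. The snippet below is a minimal sketch that assumes 3-channel 32x32 inputs; the input size is not stated in the text and is inferred from the layer dimensions.

```python
import torch
import torch.nn as nn

# Assumption: a batch of one 3-channel 32x32 image (inferred, not stated in the text).
x = torch.randn(1, 3, 32, 32)

conv1 = nn.Conv2d(3, 16, kernel_size=3, padding=1)
pool = nn.MaxPool2d(2)

after_conv = conv1(x)          # a 3x3 kernel with padding=1 keeps height and width
after_pool = pool(after_conv)  # 2x2 max pooling halves height and width

print(after_conv.shape)       # torch.Size([1, 16, 32, 32])
print(after_pool.shape)       # torch.Size([1, 16, 16, 16])
print(after_pool[0].numel())  # 4096 = 16 * 16 * 16 features per image
```

Neither the padding nor the pooling layer adds parameters; they only determine how many features reach `fc1`.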

In [5]:
class SimpleNet(nn.Module):
    def __init__(self):
        super(SimpleNet, self).__init__()
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(2)
        self.fc1 = nn.Linear(16 * 16 * 16, 32)
        self.fc2 = nn.Linear(32, 2)

    def forward(self, x):
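        # conv -> tanh -> 2x2 max pool; F was never imported, so use torch.tanh
        out = self.pool(torch.tanh(self.conv1(x)))
        # flatten each sample to 16 * 16 * 16 = 4096 features for fc1
        out = out.view(-1, 16 * 16 * 16)
        out = torch.tanh(self.fc1(out))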
        out = self.fc2(out)
        return out

### Parameter Calculation

How many parameters does it have?

In [6]:
model = SimpleNet()

numel_list = [p.numel() for p in model.parameters()]
sum(numel_list), numel_list

(131618, [432, 16, 131072, 32, 64, 2])

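Let's break down the parameter count of `SimpleNet` layer by layer. The list above holds, in order, the weight and bias counts for `conv1` (432, 16), `fc1` (131072, 32), and `fc2` (64, 2).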

1. **Convolutional Layer (`conv1`)**: `self.conv1 = nn.Conv2d(3, 16, kernel_size=3, padding=1)`
    - Number of parameters = (Number of filters) * (Channels per filter) * (Filter height) * (Filter width) + (Number of filters)
    - Number of filters: 16
    - Channels per filter: 3
    - Filter height: 3
    - Filter width: 3
    - Biases: 16
    - Total parameters: $16 \times 3 \times 3 \times 3 + 16 = 16 \times 27 + 16 = 448$

2. **Fully Connected Layer 1 (`fc1`)**: `self.fc1 = nn.Linear(16 * 16 * 16, 32)`
    - Number of parameters = (Input features) * (Output features) + (Output features)
    - Input features: 16 * 16 * 16 = 4096
    - Output features: 32
    - Biases: 32
    - Total parameters: $4096 \times 32 + 32 = 131072 + 32 = 131104$

3. **Fully Connected Layer 2 (`fc2`)**: `self.fc2 = nn.Linear(32, 2)`
    - Number of parameters = (Input features) * (Output features) + (Output features)
    - Input features: 32
    - Output features: 2
    - Biases: 2
    - Total parameters: $32 \times 2 + 2 = 64 + 2 = 66$
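The two formulas used in this breakdown can be written as small helper functions. This is a sketch in plain Python; the function names `conv2d_param_count` and `linear_param_count` are hypothetical, and the convolution formula assumes the default `groups=1`.

```python
def conv2d_param_count(in_channels, out_channels, kernel_height, kernel_width, bias=True):
    """Weights: one (in_channels x kH x kW) filter per output channel, plus one bias each.

    Assumes a standard convolution (groups=1).
    """
    weights = out_channels * in_channels * kernel_height * kernel_width
    biases = out_channels if bias else 0
    return weights + biases


def linear_param_count(in_features, out_features, bias=True):
    """Weights: an (out_features x in_features) matrix, plus one bias per output."""
    weights = in_features * out_features
    biases = out_features if bias else 0
    return weights + biases


conv1_params = conv2d_param_count(3, 16, 3, 3)
fc1_params = linear_param_count(16 * 16 * 16, 32)
fc2_params = linear_param_count(32, 2)

print(conv1_params, fc1_params, fc2_params)  # 448 131104 66
```

Note that the convolution count depends only on the channel counts and the kernel size, not on the image size, whereas the fully connected count grows with the number of input features.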

### Total Number of Parameters

Sum all the parameters from each layer:
- Convolutional Layer (conv1): 448
- Fully Connected Layer 1 (fc1): 131104
- Fully Connected Layer 2 (fc2): 66

Total parameters: $448 + 131104 + 66 = 131618$
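The same per-layer totals can be read directly from PyTorch by summing `numel()` over each layer's parameters. The sketch below rebuilds the three parameterized layers in a plain dictionary (an illustrative arrangement, not the `SimpleNet` class itself) so that it runs on its own.

```python
import torch.nn as nn

# The three layers of SimpleNet that carry parameters; pooling and tanh have none.
layers = {
    "conv1": nn.Conv2d(3, 16, kernel_size=3, padding=1),
    "fc1": nn.Linear(16 * 16 * 16, 32),
    "fc2": nn.Linear(32, 2),
}

# Weights plus biases for each layer.
per_layer = {
    name: sum(p.numel() for p in layer.parameters())
    for name, layer in layers.items()
}
total = sum(per_layer.values())

print(per_layer)  # {'conv1': 448, 'fc1': 131104, 'fc2': 66}
print(total)      # 131618
```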

### Summary

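`SimpleNet` has a total of **131618 parameters**, which matches the sum computed from `model.parameters()` above. Almost all of them (131104) belong to `fc1`; the convolutional layer contributes only 448.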

## Example 2

Here is another example, a convolutional layer with 16 input channels and 8 output channels:

    self.conv2 = nn.Conv2d(16, 8, kernel_size=3, padding=1)

### Architecture

- Input channels: 16
- Output channels: 8
- Kernel size: 3x3
- Padding: 1

### Parameter Calculation

- Number of parameters = (Number of filters) * (Channels per filter) * (Filter height) * (Filter width) + (Number of filters)
- Number of filters: 8
- Channels per filter: 16
- Filter height: 3
- Filter width: 3
- Biases: 8

Total parameters: $8 \times 16 \times 3 \times 3 + 8 = 8 \times 144 + 8 = 1152 + 8 = 1160$
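This result can be confirmed by instantiating the layer and inspecting its weight and bias tensors. The sketch below also builds a variant without padding (the name `conv2_no_pad` is illustrative) to show that padding does not change the parameter count.

```python
import torch.nn as nn

conv2 = nn.Conv2d(16, 8, kernel_size=3, padding=1)

# Weight layout: (output channels, input channels, kernel height, kernel width).
print(tuple(conv2.weight.shape))                   # (8, 16, 3, 3)
print(conv2.weight.numel(), conv2.bias.numel())    # 1152 8
print(sum(p.numel() for p in conv2.parameters()))  # 1160

# Padding affects the output size, not the number of learnable parameters.
conv2_no_pad = nn.Conv2d(16, 8, kernel_size=3, padding=0)
print(sum(p.numel() for p in conv2_no_pad.parameters()))  # 1160
```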