## LeNet5 Input Shape Calculation

### [Link to my YouTube video explaining this whole notebook](https://www.youtube.com/watch?v=ys3VRBW4qx8&list=PLxqBkZuBynVRyOJs4RWmB_fKlOVe5S8CR&index=10)

[![Imgur](https://imgur.com/1SEnRQi.png)](https://www.youtube.com/watch?v=ys3VRBW4qx8&list=PLxqBkZuBynVRyOJs4RWmB_fKlOVe5S8CR&index=10)


--------------------------

### LeNet5 Architecture as per original paper

![Imgur](https://imgur.com/P51KCHJ.png)

------------

## Formulae for calculating CNN Output shapes

![Imgur](https://imgur.com/v9QLjHk.png)

### For a Pooling Layer, the same formula becomes

![Imgur](https://imgur.com/ug3pqsb.png)
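
The same two formulas can be written as small Python helpers. This is only a minimal sketch: the function names `conv_output_size` and `pool_output_size` are made up for illustration, and the code assumes the figures above show the relation used in the manual calculations below, `(input - kernel + 2 * padding) / stride + 1`. Floor division is an added assumption for the case where the stride does not divide evenly, which is what PyTorch does by default.

```python
def conv_output_size(input_size: int, kernel_size: int, padding: int = 0, stride: int = 1) -> int:
    """Spatial output size (width or height) of a convolution layer."""
    return (input_size - kernel_size + 2 * padding) // stride + 1


def pool_output_size(input_size: int, filter_size: int, stride: int) -> int:
    """Spatial output size (width or height) of a pooling layer without padding."""
    return (input_size - filter_size) // stride + 1


print(conv_output_size(32, kernel_size=5, padding=0, stride=1))  # 28
print(pool_output_size(28, filter_size=2, stride=2))             # 14
```

Width and height are computed separately with the same function, so a non-square input only needs two calls.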

### Actual Manual Calculation of Output Shapes from various Layers

-------------------------

## First Conv Layer

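Input Image = 32 * 32 * 1 (height * width * channels).

Convolution Layer 1 : K = 6 (number of kernels) , S = 1 (stride) , P = 0 (padding) , kernel_size = 5 * 5. **Input Height and Width = 32**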

    Output Width  = (input_width - kernel_width + 2 * padding) / stride + 1 .
                  = (32 - 5 + 2 * 0) / 1 + 1 = 28 .

    Output Height = (input_height - kernel_height + 2 * padding) / stride + 1 .
                  = (32 - 5 + 2 * 0) / 1 + 1 = 28 .

    Output Depth  = Number of kernels .
                  = 6 .
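
One way to see where the 28 comes from is to count the positions at which the 5-wide kernel still fits inside the 32-wide input. The snippet below is an illustrative sketch in plain Python, not part of the original notebook; the variable names are chosen here for clarity.

```python
input_width, kernel_width, stride = 32, 5, 1

# Left-edge positions at which the whole kernel still fits inside the input.
positions = list(range(0, input_width - kernel_width + 1, stride))

print(positions[0], positions[-1])  # 0 27
print(len(positions))               # 28
```

Each position yields one output value, so the number of positions is the output width. The same count applies to the height.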

---------------------

## First Pooling Layer

Pooling Layer 1 : S = 2 , P = 0  , filter_size = 2 * 2. **Input Height and Width = 28**

    Output Width  = (input_width - filter_width) / stride + 1 .
                  = (28 - 2) / 2 + 1 = 14

    Output Height = (input_height - filter_height) / stride + 1 .
                  = (28 - 2) / 2 + 1 = 14

    Output Depth  = Same as the input depth (pooling is applied to each channel separately) .
                  = 6 .
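
To make the halving concrete, here is a small plain-Python sketch of 2 * 2 pooling with stride 2 on a single 4 * 4 feature map. It uses max pooling, as the PyTorch implementation later in this notebook does. The helper name `max_pool_2x2` and the sample values are made up for illustration.

```python
def max_pool_2x2(feature_map):
    """2 * 2 max pooling with stride 2 on a list-of-lists feature map."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(
                feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1],
            )
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 6, 1, 1],
]
pooled = max_pool_2x2(feature_map)
print(pooled)  # [[4, 5], [6, 7]]
```

Because the stride equals the filter size, the windows do not overlap, and a 28 * 28 map shrinks to 14 * 14 in the same way that this 4 * 4 map shrinks to 2 * 2.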

-----------------------------

## Second Conv Layer


Convolution Layer 2 : K = 16 , S = 1 , P = 0 , kernel_size = 5*5. **Input Height and Width = 14**

    Output Width  = (input_width - kernel_width + 2 * padding) / stride + 1 .
                  = (14 - 5 + 2 * 0) / 1 + 1 = 10 .

    Output Height = (input_height - kernel_height + 2 * padding) / stride + 1 .
                  = (14 - 5 + 2 * 0) / 1 + 1 = 10 .

    Output Depth  = 16 .

-----------------------------

## Second Pooling Layer


Pooling Layer 2 : S = 2 , P = 0  , filter_size = 2 * 2. **Input Height and Width = 10**

    Output Width  = (input_width - filter_width) / stride + 1 .
                  = (10 - 2) / 2 + 1 = 5

    Output Height = (input_height - filter_height) / stride + 1 .
                  = (10 - 2) / 2 + 1 = 5

    Output Depth  = 16 .
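
The four calculations so far can be chained in a short loop. This is a sketch under the same settings as the manual steps (stride, padding and kernel sizes copied from the sections above); the layer names in the list are labels chosen here. The final product, 16 * 5 * 5 = 400, is the number of features left after flattening, which is why the PyTorch implementation in this notebook uses `nn.Linear(400, 120)`.

```python
# (layer name, kernel/filter size, stride, padding, output channels)
layers = [
    ("conv1", 5, 1, 0, 6),
    ("pool1", 2, 2, 0, None),  # pooling keeps the channel count
    ("conv2", 5, 1, 0, 16),
    ("pool2", 2, 2, 0, None),
]

size, channels = 32, 1  # 32 * 32 * 1 input image
shapes = []
for name, k, s, p, out_channels in layers:
    size = (size - k + 2 * p) // s + 1
    if out_channels is not None:
        channels = out_channels
    shapes.append((name, channels, size, size))
    print(f"{name}: {channels} x {size} x {size}")

flattened = channels * size * size
print("flattened features:", flattened)  # 400
```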

-----------------------------

## First Fully Connected Linear Layer

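C5 Layer (written as a convolution) : S = 1 , P = 0 , K = 120 , kernel_size = 5 * 5. **Input Height and Width = 5**

Because the 5 * 5 kernel covers the whole 5 * 5 input, each kernel produces a single value, so this layer is equivalent to a fully connected layer on the flattened 16 * 5 * 5 = 400 inputs.

    Output Width  = (input_width - kernel_width) / stride + 1 .
                  = (5 - 5) / 1 + 1 = 1

    Output Height = (input_height - kernel_height) / stride + 1 .
                  = (5 - 5) / 1 + 1 = 1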

    Output Depth  = 120 .

    Output Vector = Output Width * Output Height * Output Depth .
                  = 1 * 1 * 120 .
                  = 120 .
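
The equivalence between this 5 * 5 convolution and a fully connected layer can be checked numerically for one output unit. The following is an illustrative plain-Python sketch with random integer values, not code from the notebook. It ignores the bias term, which is the same in both views.

```python
import random

random.seed(0)
channels, height, width = 16, 5, 5

# One input volume and one kernel of the same 16 x 5 x 5 shape (random integers).
x = [[[random.randint(-3, 3) for _ in range(width)] for _ in range(height)] for _ in range(channels)]
w = [[[random.randint(-3, 3) for _ in range(width)] for _ in range(height)] for _ in range(channels)]

# Convolution view: the kernel fits at exactly one position, so the output is 1 x 1.
conv_value = sum(
    w[c][i][j] * x[c][i][j]
    for c in range(channels) for i in range(height) for j in range(width)
)

# Fully connected view: flatten both volumes to 400-vectors and take a dot product.
x_flat = [v for plane in x for row in plane for v in row]
w_flat = [v for plane in w for row in plane for v in row]
fc_value = sum(wi * xi for wi, xi in zip(w_flat, x_flat))

print(len(x_flat))             # 400
print(conv_value == fc_value)  # True
```

Repeating this with 120 different kernels gives the 120 output values, which matches a linear layer with 400 inputs and 120 outputs.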


## PyTorch Implementation of above LeNet5

    import torch
    import torch.nn as nn
    from torch.nn import Conv2d, AvgPool2d, Linear, Module
    import torch.nn.functional as F
    from torchsummary import summary

    class LeNet5(nn.Module):
        """LeNet-5 style CNN for 1 x 32 x 32 inputs.

        Unlike the original paper, this version uses batch normalization,
        ReLU activations and max pooling.
        """

        def __init__(self, num_classes):
            super().__init__()

            self.layer1 = nn.Sequential(
                nn.Conv2d(in_channels=1, out_channels=6, kernel_size=(5, 5), stride=(1, 1), padding=(0, 0)),  # Layer 1
                nn.BatchNorm2d(6),
                nn.ReLU(),
                nn.MaxPool2d(kernel_size=(2, 2), stride=(2, 2)),  # Layer 2
            )

            self.layer2 = nn.Sequential(
                nn.Conv2d(in_channels=6, out_channels=16, kernel_size=(5, 5), stride=(1, 1), padding=(0, 0)),  # Layer 3
                nn.BatchNorm2d(16),
                nn.ReLU(),
                nn.MaxPool2d(kernel_size=(2, 2), stride=(2, 2)),  # Layer 4
            )

            self.fc = nn.Linear(400, 120)  # Layer 5: 16 * 5 * 5 = 400 flattened features
            self.relu = nn.ReLU()

            self.fc1 = nn.Linear(120, 84)  # Layer 6
            self.relu1 = nn.ReLU()

            self.fc2 = nn.Linear(84, num_classes)  # Final layer

        def forward(self, x):
            output = self.layer1(x)
            output = self.layer2(output)
            # print('output after layer2', output.size())  # torch.Size([32, 16, 5, 5])
            output = output.reshape(output.size(0), -1)  # See note below for this line
            # print('output after resize', output.size())  # torch.Size([32, 400])
            output = self.fc(output)
            output = self.relu(output)
            output = self.fc1(output)
            output = self.relu1(output)
            output = self.fc2(output)
            return output
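
As a sanity check on the layer sizes in this model, the number of learnable parameters per layer can be worked out by hand. The sketch below is plain Python and does not need PyTorch. The helper names are made up for illustration, and it assumes the PyTorch defaults (a bias term in every convolution and linear layer, one learnable scale and one shift per batch-norm channel) and `num_classes = 10`.

```python
def conv_params(in_channels, out_channels, kernel_size):
    # One kernel_size * kernel_size * in_channels kernel plus one bias per output channel.
    return kernel_size * kernel_size * in_channels * out_channels + out_channels


def batchnorm_params(channels):
    # One learnable scale and one learnable shift per channel.
    return 2 * channels


def linear_params(in_features, out_features):
    # Weight matrix plus one bias per output unit.
    return in_features * out_features + out_features


counts = {
    "conv1": conv_params(1, 6, 5),
    "bn1": batchnorm_params(6),
    "conv2": conv_params(6, 16, 5),
    "bn2": batchnorm_params(16),
    "fc": linear_params(400, 120),
    "fc1": linear_params(120, 84),
    "fc2": linear_params(84, 10),  # assumes num_classes = 10
}
for name, n in counts.items():
    print(f"{name}: {n:,}")
print(f"total: {sum(counts.values()):,}")  # total: 61,750
```

These figures agree with the `Param #` column that `torchsummary` reports for `LeNet5(10)`.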

![Imgur](https://imgur.com/P51KCHJ.png)

    lenet = LeNet5(10)

    summary(lenet, input_size=(1, 32, 32), device='cpu')

Output:

    ----------------------------------------------------------------
            Layer (type)               Output Shape         Param #
                Conv2d-1            [-1, 6, 28, 28]             156
           BatchNorm2d-2            [-1, 6, 28, 28]              12
                  ReLU-3            [-1, 6, 28, 28]               0
             MaxPool2d-4            [-1, 6, 14, 14]               0
                Conv2d-5           [-1, 16, 10, 10]           2,416
           BatchNorm2d-6           [-1, 16, 10, 10]              32
                  ReLU-7           [-1, 16, 10, 10]               0
             MaxPool2d-8             [-1, 16, 5, 5]               0
                Linear-9                  [-1, 120]          48,120
                 ReLU-10                  [-1, 120]               0
               Linear-11                   [-1, 84]          10,164
                 ReLU-12                   [-1, 84]               0
               Linear-13                   [-1, 10]             850
    Total params: 61,750
Trainable params: 