![](https://debuggercafe.com/wp-content/uploads/2019/11/lenet-5-e1574774376835.png)  
### [LeNet-5 official Paper](http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf)
### Neural Network Basics Covered in This Notebook
1. Getting to know the `torch.nn` package in PyTorch.
2. Building a simple neural network architecture.
3. Defining optimizers and loss functions in PyTorch.
4. Backpropagating the gradients.

### The torch.nn package

- `torch.nn.Module` is the base class for all neural networks.
- `torch.nn` contains the neural network layers, such as convolution (`Conv2d`) and linear (`Linear`) layers.
- `torch.nn.functional` provides the stateless operations applied during the forward pass of the network, such as pooling, dropout, and activation functions.
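To illustrate the split described above, the sketch below applies the same ReLU activation once through a `torch.nn` module and once through a `torch.nn.functional` call. The tensor shape and layer sizes are arbitrary choices for this example, not values from the notebook.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(4, 8)  # arbitrary batch: 4 samples, 8 features each

# Layers with learnable parameters are modules in torch.nn
linear = nn.Linear(8, 3)
relu_module = nn.ReLU()

# The same activation is also available as a stateless function
out_module = relu_module(linear(x))
out_functional = F.relu(linear(x))

same_result = torch.equal(out_module, out_functional)
print(same_result)
print(out_functional.shape)
```

Both forms compute the same values. The functional form is convenient inside `forward()` when the operation has no parameters to learn.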

### Define Neural Network

In [1]:
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchsummary import summary

In [2]:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, (3, 3)) # (input channels, output channels, kernel size)
        self.conv2 = nn.Conv2d(6, 16, (3, 3)) # (input channels, output channels, kernel size)

        # Input size of the first linear layer = output channels of conv2 x height x width
        self.fc1 = nn.Linear(16 * 6 * 6, 120) # (input features, output features)
        self.fc2 = nn.Linear(120, 84) # (input features, output features)
        self.fc3 = nn.Linear(84, 10) # (input features, output features)
        
    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2)) # conv -> ReLU -> 2x2 max pool: 32x32 -> 30x30 -> 15x15
        x = F.max_pool2d(F.relu(self.conv2(x)), (2, 2)) # conv -> ReLU -> 2x2 max pool: 15x15 -> 13x13 -> 6x6
        x = x.view(x.size(0), -1) # Flatten to (batch size, 16 * 6 * 6)
        x = F.relu(self.fc1(x)) # fully connected layer followed by ReLU
        x = F.relu(self.fc2(x)) # fully connected layer followed by ReLU
        x = self.fc3(x) # output layer: raw scores, no activation
        return x # return the output
    
net = Net()
print(net)
summary(net, (1, 32, 32), depth=1) # summary() prints the table itself

Net(
  (conv1): Conv2d(1, 6, kernel_size=(3, 3), stride=(1, 1))
  (conv2): Conv2d(6, 16, kernel_size=(3, 3), stride=(1, 1))
  (fc1): Linear(in_features=576, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)


Layer (type:depth-idx)                   Output Shape              Param #
├─Conv2d: 1-1                            [-1, 6, 30, 30]           60
├─Conv2d: 1-2                            [-1, 16, 13, 13]          880
├─Linear: 1-3                            [-1, 120]                 69,240
├─Linear: 1-4                            [-1, 84]                  10,164
├─Linear: 1-5                            [-1, 10]                  850
Total params: 81,194
Trainable params: 81,194
Non-trainable params: 0
Total mult-adds (M): 0.27
Input size (MB): 0.00
Forward/backward pass size (MB): 0.06
Params size (MB): 0.31
Estimated Total Size (MB): 0.38

### `class Net(nn.Module)` Class Explained

- The `Net` class contains two important methods: `__init__()` and `forward()`.
- `super(Net, self).__init__()` calls the constructor of `nn.Module`, so `Net` inherits its functionality, such as registering the layers and tracking their parameters.
- The network is then built from `nn.Conv2d(input channels, output channels, kernel size)` layers ([Conv2d](https://pytorch.org/docs/stable/nn.html?highlight=conv2d#torch.nn.Conv2d)). The output height and width of each convolution are calculated as follows.
![Calculating the Height and Width](https://debuggercafe.com/wp-content/uploads/2019/11/Capture.png)  
*Calculating the Height and Width*

- where $H_{in}$ = input height (32 here), $W_{in}$ = input width (32 here), $H_{out}$ = output height, and $W_{out}$ = output width. We have not given any values for padding, dilation, or stride, so the defaults are used: padding is 0, dilation is 1, and stride is 1. With a 3x3 kernel, each convolution therefore shrinks the height and width by 2 (for example, 32 to 30).
- Next, `nn.Linear(input features, output features)` is used for the fully connected layers.
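The shape arithmetic above can be checked by hand. The sketch below implements the output-size formula from the PyTorch `Conv2d` documentation and traces a 32x32 input through the two convolution and pooling stages of `Net`. The helper names `conv2d_out` and `max_pool2d_out` are made up for this example and are not part of PyTorch.

```python
def conv2d_out(size, kernel, stride=1, padding=0, dilation=1):
    """Output height (or width) of a convolution, per the Conv2d formula."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

def max_pool2d_out(size, kernel):
    """Output size of max pooling; the stride defaults to the kernel size."""
    return conv2d_out(size, kernel, stride=kernel)

size = 32                                  # input height and width
after_conv1 = conv2d_out(size, 3)          # 3x3 convolution
after_pool1 = max_pool2d_out(after_conv1, 2)
after_conv2 = conv2d_out(after_pool1, 3)   # 3x3 convolution
after_pool2 = max_pool2d_out(after_conv2, 2)

fc1_inputs = 16 * after_pool2 * after_pool2  # channels x height x width
print(after_conv1, after_pool1, after_conv2, after_pool2, fc1_inputs)
# prints: 30 15 13 6 576
```

The final value matches the `16 * 6 * 6` input size of `fc1` and the `in_features=576` shown in the printed model.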

### Define the Optimizer and Loss Function

- Background reading on optimizers: [Article](https://www.deeplearning.ai/ai-notes/optimization/)

In [3]:
# Loss function and optimizer
import torch.optim as optim # optimization algorithms
loss_function = nn.MSELoss() # mean squared error, typically used for regression
optimizer = optim.RMSprop(net.parameters(), lr = 0.001) # RMSprop scales each update by a running average of squared gradients
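As a quick check of what `nn.MSELoss()` computes, the sketch below compares it against the mean of squared differences written out by hand. The prediction and target values are invented for this example.

```python
import torch
import torch.nn as nn

# Made-up prediction and target, for illustration only
pred = torch.tensor([[0.5, 1.0, -1.0]])
target = torch.tensor([[0.0, 1.0, 1.0]])

loss_fn = nn.MSELoss()  # default reduction is 'mean'
loss_builtin = loss_fn(pred, target)

# Same quantity by hand: mean of the squared element-wise differences
loss_manual = ((pred - target) ** 2).mean()

print(loss_builtin.item(), loss_manual.item())
```

Here the squared differences are 0.25, 0, and 4, so both values come out to about 1.4167.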

### Dummy Input and Backpropagation

In [5]:
torch.cuda.get_device_name(0)

'NVIDIA GeForce RTX 3060 Laptop GPU'

### Input(X)

In [8]:
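net.to('cuda') # make sure the model is on the same device as the input
input = torch.randn(1, 1, 32, 32).to('cuda') # (batch size, channels, height, width)
out = net(input)
print(out, out.size())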

tensor([[-0.0909, -0.0805,  0.0964, -0.0140, -0.0992, -0.0023, -0.0797,  0.2174,
         -0.0558,  0.1773]], device='cuda:0', grad_fn=<AddmmBackward>) torch.Size([1, 10])


### Label(y)

In [9]:
# Dummy target with the same shape as the network output
labels = torch.rand(10).to('cuda')
labels = labels.view(1,-1)
print(labels, labels.shape)

tensor([[0.1771, 0.6598, 0.2284, 0.6371, 0.8484, 0.5922, 0.3972, 0.3106, 0.5181,
         0.2712]], device='cuda:0') torch.Size([1, 10])


### Loss Function and Backward Propagation

In [10]:
# Compute the loss and backpropagate the gradients
loss = loss_function(out, labels)
loss.backward()
print(loss)

tensor(0.2887, device='cuda:0', grad_fn=<MseLossBackward>)
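`loss.backward()` only fills in the `.grad` attribute of each parameter; the weights change once the optimizer takes a step. The sketch below shows one complete update on the CPU. A small `nn.Linear` model stands in for `Net`, and the input and target are random, so the names and shapes here are assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)
model = nn.Linear(4, 2)  # hypothetical stand-in for Net
loss_function = nn.MSELoss()
optimizer = optim.RMSprop(model.parameters(), lr=0.001)

x = torch.randn(1, 4)  # dummy input
y = torch.rand(1, 2)   # dummy target

weight_before = model.weight.detach().clone()

optimizer.zero_grad()        # clear gradients left over from earlier steps
out = model(x)               # forward pass
loss = loss_function(out, y)
loss.backward()              # populate .grad on every parameter
grad_shape = model.weight.grad.shape
optimizer.step()             # apply the RMSprop update to the weights

weights_changed = not torch.equal(weight_before, model.weight.detach())
print(grad_shape, weights_changed)
```

In a training loop, these five calls (`zero_grad`, forward, loss, `backward`, `step`) would repeat for each batch.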
