<a href="https://colab.research.google.com/github/sadat1971/PyTorch_practice/blob/main/003_pytorch_Neural_Net.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Neural Network with PyTorch

This notebook is a practice session that builds a small convolutional neural network in PyTorch.

In [2]:
import torch
from torch.nn.functional import max_pool2d, relu

In [49]:
class convnet(torch.nn.Module):

  def __init__(self):
    super(convnet, self).__init__()
    # 1 input image channel, 6 output channels, 3x3 square convolution kernel
    self.conv1 = torch.nn.Conv2d(1, 6, 3)
    # 6 input channels (the outputs of conv1), 16 output channels, 3x3 square convolution kernel
    self.conv2 = torch.nn.Conv2d(6, 16, 3)
    # 16 channels of 6x6 feature maps remain after the two conv + pool stages on a 32x32 input
    self.fc1 = torch.nn.Linear(16*6*6, 120)
    self.fc2 = torch.nn.Linear(120, 84)
    self.fc3 = torch.nn.Linear(84, 10)

  def forward(self, x):
    x = max_pool2d(self.conv1(x), (2,2)) # max pooling over a 2x2 window
    x = max_pool2d(self.conv2(x), (2,2)) # max pooling over a 2x2 window
    x = x.view(-1, self.flattened_features(x))
    x = relu(self.fc1(x))
    x = relu(self.fc2(x))
    x = self.fc3(x)
    return x

  def flattened_features(self, x):
    size = x.size()[1:] # The first dimension is the batch size; keep all the remaining dimensions
    flattened = 1
    for sz in size:
      flattened = flattened*sz
    return flattened



Convnet = convnet()
print(Convnet)



convnet(
  (conv1): Conv2d(1, 6, kernel_size=(3, 3), stride=(1, 1))
  (conv2): Conv2d(6, 16, kernel_size=(3, 3), stride=(1, 1))
  (fc1): Linear(in_features=576, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)
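
The 576 input features of fc1 follow from the layer arithmetic. The sketch below traces the spatial size through both conv + pool stages, assuming a 32x32 input, no padding, stride 1 for the convolutions, and pooling with stride equal to its window (the defaults used above). The helper names `conv_out` and `pool_out` are made up for this illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel):
    """Spatial size after max pooling whose stride equals the window size."""
    return (size - kernel) // kernel + 1


size = 32                               # assumed input height and width
size = pool_out(conv_out(size, 3), 2)   # conv1: 32 -> 30, pool: 30 -> 15
size = pool_out(conv_out(size, 3), 2)   # conv2: 15 -> 13, pool: 13 -> 6
flattened = 16 * size * size            # 16 channels after conv2

print(size, flattened)  # 6 576
```

A different input resolution would change this number, so fc1 would have to be resized to match.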


Now, let's build a random image and feed it to our convolutional net.

In [50]:
image_test = torch.randn(1, 1, 32, 32)
Out = Convnet(image_test)
Out

tensor([[-0.0036,  0.1115, -0.0169, -0.0016,  0.1517,  0.1368, -0.0038,  0.1104,
          0.1421, -0.0708]], grad_fn=<AddmmBackward>)

In [51]:
# Zero the gradient buffers of all parameters, then backpropagate with random gradients:


Convnet.zero_grad()
Out.backward(torch.randn(1, 10))

## From the website
* The entire torch.nn package only supports inputs that are a mini-batch of samples, not a single sample.

* For example, nn.Conv2d will take in a 4D Tensor of nSamples x nChannels x Height x Width.

* If you have a single sample, just use input.unsqueeze(0) to add a fake batch dimension.
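
As a small illustration of the last point, the snippet below adds a batch dimension to a single sample. The tensor `single_image` is a made-up example with shape channels x height x width.

```python
import torch

# Hypothetical single sample: 1 channel, 32x32 pixels, no batch dimension.
single_image = torch.randn(1, 32, 32)

# unsqueeze(0) inserts a new dimension of size 1 at position 0 (the batch axis).
batch = single_image.unsqueeze(0)

print(single_image.shape)  # torch.Size([1, 32, 32])
print(batch.shape)         # torch.Size([1, 1, 32, 32])
```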

### Before proceeding further, let’s recap all the classes you’ve seen so far.

Recap:
* torch.Tensor - A multi-dimensional array with support for autograd operations like backward(). Also holds the gradient w.r.t. the tensor.
* nn.Module - Neural network module. Convenient way of encapsulating parameters, with helpers for moving them to GPU, exporting, loading, etc.
* nn.Parameter - A kind of Tensor, that is automatically registered as a parameter when assigned as an attribute to a Module.
* autograd.Function - Implements forward and backward definitions of an autograd operation. Every Tensor operation creates at least a single Function node that connects to functions that created a Tensor and encodes its history.
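
To make the nn.Parameter point concrete, here is a minimal sketch. `TinyModule` and its attribute names are hypothetical and only serve to contrast a registered parameter with a plain tensor attribute.

```python
import torch


class TinyModule(torch.nn.Module):  # hypothetical module, for illustration only
    def __init__(self):
        super().__init__()
        # Wrapped in nn.Parameter: registered automatically on assignment.
        self.weight = torch.nn.Parameter(torch.zeros(3))
        # Plain tensor attribute: not registered, so optimizers never see it.
        self.scratch = torch.zeros(3)

    def forward(self, x):
        return x * self.weight


module = TinyModule()
names = [name for name, _ in module.named_parameters()]
print(names)  # ['weight']
```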

In [58]:
# Defining the loss

inp = torch.randn(1, 1, 32, 32) # make a random input
out = Convnet(inp) # run it through your defined neural network
Ground_truth = torch.randn(1, 10) # build a random ground truth
Ground_truth = Ground_truth.view(1, -1) # give the ground truth the same shape as the output
LOSS = torch.nn.MSELoss() # make the loss instance
loss = LOSS(out, Ground_truth)
print(loss)

tensor(0.8267, grad_fn=<MseLossBackward>)
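
For reference, MSELoss with its default settings is the mean of the squared element-wise differences. The sketch below checks that against a hand-written version on small made-up tensors (`prediction` and `target` are illustrative values, not taken from the network above).

```python
import torch

prediction = torch.tensor([[1.0, 2.0, 3.0]])
target = torch.tensor([[1.0, 0.0, 5.0]])

builtin = torch.nn.MSELoss()(prediction, target)
# Squared differences are 0, 4 and 4, so the mean is 8/3.
manual = ((prediction - target) ** 2).mean()

print(builtin.item(), manual.item())
```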


## Important:

So, when we call loss.backward(), the loss is differentiated through the whole graph, and every Tensor in the graph that has requires_grad=True will have the resulting gradient accumulated into its .grad Tensor.
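
The word "accumulated" matters: gradients add up across backward calls until they are zeroed. A minimal sketch with a made-up two-element tensor `w`:

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)

(w * 3).sum().backward()
first = w.grad.clone()    # d/dw of sum(3 * w) is [3, 3]

(w * 3).sum().backward()  # second backward pass without zeroing in between
second = w.grad.clone()   # the new gradient is added on top: [6, 6]

w.grad.zero_()            # reset the buffer, as zero_grad() does for a whole module
print(first, second, w.grad)
```

This is why the gradient buffers are zeroed before each backward pass in the cells that follow.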

In [69]:
print(loss.grad_fn) #MSE Loss
print(loss.grad_fn.next_functions[0][0]) #Linear
print(loss.grad_fn.next_functions[0][0].next_functions[0][0]) # AccumulateGrad of a leaf parameter (here the fc3 bias), not ReLU

<MseLossBackward object at 0x7f94d40e56a0>
<AddmmBackward object at 0x7f94d4a10080>
<AccumulateGrad object at 0x7f94d4a10a58>
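
Instead of chaining next_functions by hand, the graph can be walked recursively. The helper `walk` below is a hypothetical sketch, demonstrated on a small stand-in Linear layer rather than on Convnet; the exact node class names can vary between PyTorch versions, so no output is shown.

```python
import torch


def walk(fn, depth=0, max_depth=3):
    """Collect (depth, node class name) pairs by following next_functions."""
    if fn is None or depth > max_depth:
        return []
    nodes = [(depth, type(fn).__name__)]
    for next_fn, _ in fn.next_functions:
        # Entries are None for inputs that do not require gradients.
        nodes += walk(next_fn, depth + 1, max_depth)
    return nodes


layer = torch.nn.Linear(4, 2)  # stand-in model for illustration
demo_loss = torch.nn.MSELoss()(layer(torch.randn(1, 4)), torch.randn(1, 2))

graph_nodes = walk(demo_loss.grad_fn)
for depth, name in graph_nodes:
    print("  " * depth + name)
```

The AccumulateGrad nodes at the leaves correspond to the layer's parameters, which is where gradients end up being stored.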


## Backpropagation

In [70]:
# The first step of backpropagation is to zero the existing gradient values
Convnet.zero_grad()     # zeroes the gradient buffers of all parameters

print('conv1.bias.grad before backward')
print(Convnet.conv1.bias.grad)

loss.backward()

print('conv1.bias.grad after backward')
print(Convnet.conv1.bias.grad)


conv1.bias.grad before backward
tensor([0., 0., 0., 0., 0., 0.])
conv1.bias.grad after backward
tensor([-0.0028,  0.0024, -0.0189,  0.0066,  0.0020,  0.0076])


Great! Now that we are done with backpropagation, we need to update the weights.

In [71]:
learning_rate = 0.01 #setting the learning rate 
for f in Convnet.parameters():
    f.data.sub_(f.grad.data * learning_rate) 

# Convnet.parameters() yields all the learnable parameters (weights and biases). The in-place sub_ method subtracts the gradient scaled by the learning rate from each one:
# weight = weight - learning_rate * gradient
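
An alternative way to write the same update, without going through .data, is to do it inside torch.no_grad(). This is a sketch under that assumption, using a small stand-in Linear model (`model`) instead of Convnet so it runs on its own.

```python
import torch

model = torch.nn.Linear(3, 1)          # hypothetical stand-in for Convnet
demo_loss = model(torch.ones(1, 3)).sum()
demo_loss.backward()

learning_rate = 0.01
before = model.weight.detach().clone()

with torch.no_grad():                  # keep the update itself out of the autograd graph
    for param in model.parameters():
        param -= learning_rate * param.grad   # weight = weight - learning_rate * gradient

expected = before - learning_rate * model.weight.grad
print(torch.allclose(model.weight, expected))  # True
```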

## Putting all the backprop and update together:

In [74]:
#create an optimizer instance
optim = torch.optim.SGD(Convnet.parameters(), lr=0.01)


#The following portion will go to the training loop
optim.zero_grad()
output = Convnet(inp)
loss = LOSS(output, Ground_truth)
loss.backward()
optim.step() #will do the update
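
Wrapped in a loop, those five lines become a training loop. The sketch below repeats them for a fixed number of steps on made-up random data with a small stand-in Linear model; the names `model`, `inputs`, `targets`, `criterion` and `optimizer`, the step count, and the learning rate are all assumptions for illustration.

```python
import torch

torch.manual_seed(0)                       # make the random data repeatable

model = torch.nn.Linear(4, 2)              # hypothetical stand-in for Convnet
inputs = torch.randn(8, 4)                 # a mini-batch of 8 random samples
targets = torch.randn(8, 2)                # random ground truth
criterion = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)

losses = []
for step in range(20):
    optimizer.zero_grad()                  # clear gradients from the previous step
    loss_value = criterion(model(inputs), targets)
    loss_value.backward()                  # compute fresh gradients
    optimizer.step()                       # apply the SGD update
    losses.append(loss_value.item())

print(losses[0], losses[-1])
```

On this toy problem the recorded loss should go down over the steps; with real data the loop would iterate over mini-batches from a dataset instead of reusing one batch.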