<a href="https://colab.research.google.com/github/pavanraja753/PyTorch_Learning/blob/main/Writing_a_Pytorch_Module.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [2]:
import torch
import torchvision
import torch.nn as nn
from torchvision.datasets import MNIST
import torch.nn.functional as F


- We now have all the bricks needed to build our first convolutional network from scratch. The last technical point is the tensor shape between layers.

- Both the convolutional and pooling layers take as input batches of samples, each sample being itself a 3D tensor of shape `C × H × W`.
The output has the same structure, so tensors have to be explicitly reshaped before being forwarded to a fully connected layer.

In [None]:
mnist = MNIST('./data/mnist/', train = True, download = True)
d = mnist.data  # `train_data` is a deprecated alias of `data` in recent torchvision
d.size()

In [4]:
x = d.view(d.size(0),1,d.size(1),d.size(2))
x2 = d.unsqueeze(1)

In [5]:
print(x.shape)
print(x2.shape)

torch.Size([60000, 1, 28, 28])
torch.Size([60000, 1, 28, 28])


In [6]:
x = x.view(x.size(0),-1)
x.size()

torch.Size([60000, 784])


Passing `-1` as one of the arguments of `Tensor.view()` makes PyTorch compute the proper value for that dimension automatically, given the original tensor size and the other dimensions.
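
The inference done by `-1` can be checked on a small tensor. The following is a minimal sketch; the tensor `t` and its shape are made up for illustration and are not part of the MNIST data above.

```python
import torch

# A hypothetical batch of 6 single-channel 4x4 "images" (96 elements in total).
t = torch.arange(6 * 1 * 4 * 4).view(6, 1, 4, 4)

# -1 asks view() to infer that dimension from the total number of elements.
flat = t.view(t.size(0), -1)   # 96 elements / 6 rows -> 16 columns
pairs = t.view(-1, 2)          # 96 elements / 2 columns -> 48 rows

print(flat.shape)   # torch.Size([6, 16])
print(pairs.shape)  # torch.Size([48, 2])

# Only one dimension can be inferred: two -1 entries are ambiguous.
ambiguous_rejected = False
try:
    t.view(-1, -1)
except RuntimeError:
    ambiguous_rejected = True
print(ambiguous_rejected)  # True
```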

- PyTorch offers a sequential container module, `torch.nn.Sequential`, to build simple architectures.

- For instance, an MLP with a `10`-dimensional input, a `2`-dimensional output, `ReLU` activations, and two hidden layers of dimensions `100` and `50` can be written as:



In [7]:
model = nn.Sequential(
    nn.Linear(10,100), nn.ReLU(),
    nn.Linear(100,50), nn.ReLU(),
    nn.Linear(50,2)
)
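
As a quick check of the container, the sketch below rebuilds the same MLP and forwards a random batch through it. The batch size of `8` is an arbitrary choice for illustration.

```python
import torch
import torch.nn as nn

# Same MLP as above: 10 -> 100 -> 50 -> 2 with ReLU activations.
model = nn.Sequential(
    nn.Linear(10, 100), nn.ReLU(),
    nn.Linear(100, 50), nn.ReLU(),
    nn.Linear(50, 2)
)

# A batch of 8 samples, each of dimension 10 (batch size chosen arbitrarily).
batch = torch.randn(8, 10)
out = model(batch)
print(out.shape)   # torch.Size([8, 2])

# The sub-modules are applied in order and can be accessed by index.
print(len(model))  # 5
print(model[0])    # Linear(in_features=10, out_features=100, bias=True)
```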

However, for any model of reasonable complexity, it is best to write a subclass of `torch.nn.Module`.

To create a `Module`, one has to inherit from the base class and implement the constructor `__init__(self, ...)` and the forward pass `forward(self, x)`.

In [8]:
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=5)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=5)
        self.fc1 = nn.Linear(256, 200)
        self.fc2 = nn.Linear(200, 10)

    def forward(self,x):
        x =  F.relu(F.max_pool2d(self.conv1(x),kernel_size=3, stride=3))
        x =  F.relu(F.max_pool2d(self.conv2(x),kernel_size=2, stride=2))
        x =  x.view(-1,256)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x
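
The input size `256` of `fc1` is not arbitrary: it is the number of values left per sample after the two convolution and pooling stages on a `28 × 28` input. A small sketch that traces the shapes layer by layer, assuming a single MNIST-sized input:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Same layers as in Net, applied step by step to track the shapes.
conv1 = nn.Conv2d(1, 32, kernel_size=5)
conv2 = nn.Conv2d(32, 64, kernel_size=5)

x = torch.zeros(1, 1, 28, 28)                  # one 28x28 grayscale image
x = conv1(x)                                   # 28 - 5 + 1 = 24
print(x.shape)                                 # torch.Size([1, 32, 24, 24])
x = F.max_pool2d(x, kernel_size=3, stride=3)   # 24 / 3 = 8
print(x.shape)                                 # torch.Size([1, 32, 8, 8])
x = conv2(x)                                   # 8 - 5 + 1 = 4
print(x.shape)                                 # torch.Size([1, 64, 4, 4])
x = F.max_pool2d(x, kernel_size=2, stride=2)   # 4 / 2 = 2
print(x.shape)                                 # torch.Size([1, 64, 2, 2])

# Flattening gives 64 * 2 * 2 = 256 features per sample.
flat = x.view(x.size(0), -1)
print(flat.shape)                              # torch.Size([1, 256])
```

Changing a kernel size, a stride, or the input resolution changes this number, so the first fully connected layer has to be adjusted accordingly.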

Inheriting from `torch.nn.Module` provides many mechanisms implemented in the superclass.

First, the `(...)` operator is redefined to call the `forward(...)` method and run additional operations. The forward pass should be executed through this operator and not by calling `forward` explicitly.

Using the class `Net` we just defined:

In [9]:
model = Net()
input = torch.empty(12,1,28,28).normal_()
output= model(input)
print(output.size())

torch.Size([12, 10])
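
Forward hooks are one example of the additional operations run by the `(...)` operator. The sketch below registers a hook on a small `nn.Linear` layer (chosen only for illustration; `record_shape` and `calls` are hypothetical names) and shows that the hook fires when the module is called, but not when `forward` is invoked directly.

```python
import torch
import torch.nn as nn

layer = nn.Linear(4, 2)
calls = []

def record_shape(module, inputs, output):
    # Forward hooks receive the module, its inputs (as a tuple) and its output.
    calls.append(tuple(output.shape))

handle = layer.register_forward_hook(record_shape)

x = torch.randn(3, 4)
layer(x)            # goes through __call__: the hook runs
layer.forward(x)    # bypasses __call__: the hook is skipped

print(calls)        # [(3, 2)]
handle.remove()     # unregister the hook when it is no longer needed
```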


Also, the `Parameter`s added as class attributes, or belonging to modules added as class attributes, are seen by `Module.parameters()`.

In [10]:
for n, k in model.named_parameters():
    print(n,k.size())

conv1.weight torch.Size([32, 1, 5, 5])
conv1.bias torch.Size([32])
conv2.weight torch.Size([64, 32, 5, 5])
conv2.bias torch.Size([64])
fc1.weight torch.Size([200, 256])
fc1.bias torch.Size([200])
fc2.weight torch.Size([10, 200])
fc2.bias torch.Size([10])


*Parameters and modules stored in plain Python lists or dictionaries are not seen.*


In [13]:
class Buggy(nn.Module): 
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 32, kernel_size=5) 
        self.param = nn.Parameter(torch.zeros(123, 456)) 
        self.other_stuff = [ nn.Linear(543, 21) ]
model = Buggy()
for n, k in model.named_parameters(): 
    print(n, k.size())

param torch.Size([123, 456])
conv.weight torch.Size([32, 1, 5, 5])
conv.bias torch.Size([32])
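
The practical consequence is that unregistered parameters are silently left out of training and checkpoints. A minimal sketch with a hypothetical `Hidden` module (not part of the original notebook) that hides one layer in a plain list:

```python
import torch
import torch.nn as nn

class Hidden(nn.Module):
    def __init__(self):
        super().__init__()
        self.visible = nn.Linear(4, 3)
        self.hidden = [nn.Linear(3, 2)]   # plain list: not registered

    def forward(self, x):
        return self.hidden[0](self.visible(x))

model = Hidden()

# Only the registered layer appears in the state dict.
keys = sorted(model.state_dict().keys())
print(keys)         # ['visible.bias', 'visible.weight']

# The optimizer only receives the registered parameters ...
opt = torch.optim.SGD(model.parameters(), lr=0.1)
before = model.hidden[0].weight.detach().clone()

loss = model(torch.randn(5, 4)).sum()
loss.backward()
opt.step()

# ... so the hidden layer gets a gradient but is never updated.
has_grad = model.hidden[0].weight.grad is not None
unchanged = torch.equal(before, model.hidden[0].weight)
print(has_grad, unchanged)   # True True
```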


A simple option is to add modules to a `torch.nn.ModuleList`, which is a list of modules properly dealt with by PyTorch's machinery.

In [14]:
class NotBuggy(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1,32,kernel_size = 5)
        self.param = nn.Parameter(torch.zeros(123,456))
        self.other_stuff = nn.ModuleList()
        self.other_stuff.append(nn.Linear(543,21))

model = NotBuggy()


for n, k in model.named_parameters():
    print(n,k.size())

param torch.Size([123, 456])
conv.weight torch.Size([32, 1, 5, 5])
conv.bias torch.Size([32])
other_stuff.0.weight torch.Size([21, 543])
other_stuff.0.bias torch.Size([21])
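
For modules indexed by name rather than by position, `torch.nn.ModuleDict` plays the same role as `ModuleList`. A short sketch, where the class `WithDict` and the keys `"small"` and `"large"` are made up for illustration:

```python
import torch
import torch.nn as nn

class WithDict(nn.Module):
    def __init__(self):
        super().__init__()
        # ModuleDict registers every value as a sub-module under its key.
        self.heads = nn.ModuleDict({
            "small": nn.Linear(8, 2),
            "large": nn.Linear(8, 5),
        })

    def forward(self, x, head):
        return self.heads[head](x)

model = WithDict()
names = [n for n, _ in model.named_parameters()]
for n in names:
    print(n)

out = model(torch.randn(3, 8), "large")
print(out.shape)   # torch.Size([3, 5])
```

The parameter names are prefixed with the attribute name and the dictionary key, for example `heads.small.weight`.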


- As long as you use autograd-compliant operations, the backward pass is implemented automatically.

- This is crucial to allow the optimization of the `Parameter`s with gradient descent.
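
A minimal sketch of what this looks like in practice: one manual gradient-descent step on the MLP defined earlier. The random data, the cross-entropy loss, and the learning rate `0.1` are assumptions made for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

model = nn.Sequential(
    nn.Linear(10, 100), nn.ReLU(),
    nn.Linear(100, 50), nn.ReLU(),
    nn.Linear(50, 2)
)

# Random inputs and class labels standing in for a real dataset.
x = torch.randn(16, 10)
y = torch.randint(0, 2, (16,))

# Forward pass and loss; autograd records the operations.
loss = F.cross_entropy(model(x), y)

# Backward pass: fills p.grad for every registered parameter.
loss.backward()
all_have_grad = all(p.grad is not None for p in model.parameters())
print(all_have_grad)   # True

# One plain gradient-descent step, done by hand.
lr = 0.1
with torch.no_grad():
    for p in model.parameters():
        p -= lr * p.grad
        p.grad.zero_()   # reset gradients before the next step
```

In real training code, the update loop is usually delegated to an optimizer such as `torch.optim.SGD(model.parameters(), lr=lr)`.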