# OOP Networks

Let's build deep networks by subclassing ```nn.Module``` from PyTorch.

In [2]:
import torch
import numpy as np

from torch import nn as nn

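Let's first make a network prototype. The class must initialise the network's layers and have a method that executes a forward pass. Once we inherit from ```nn.Module```, it keeps track of the layers and their parameters, and PyTorch's autograd computes the gradients during backpropagation.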

In [3]:
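class PrototypeNetwork:
    def __init__(self):
        # Placeholder: a real network would assign a callable layer here
        self.layer = None

    def forward(self, x):
        # Pass the input through the layer and return the result
        x = self.layer(x)
        return x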

Now let's inherit from the ```nn.Module``` class and initialise the parent class through ```super()```. We will define a convolutional network with:
- Convolutional layer with 6 output channels
- Convolutional layer with 12 output channels
- Dense layer with 120 output nodes
- Dense layer with 60 output nodes
- Dense layer with 10 output nodes

In [4]:
class ConvNetwork(nn.Module):
    
    def __init__(self):
        super(ConvNetwork, self).__init__()
        
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)
        
        self.dense1 = nn.Linear(in_features=12*4*4, out_features=120)
        self.dense2 = nn.Linear(in_features=120, out_features=60)
        self.out = nn.Linear(in_features=60, out_features=10)
        
    def forward(self, x):
        # Placeholder: the forward pass is revisited later; for now the input is returned unchanged
        return x

In [5]:
network = ConvNetwork()
network

ConvNetwork(
  (conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(6, 12, kernel_size=(5, 5), stride=(1, 1))
  (dense1): Linear(in_features=192, out_features=120, bias=True)
  (dense2): Linear(in_features=120, out_features=60, bias=True)
  (out): Linear(in_features=60, out_features=10, bias=True)
)
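The ```in_features=12*4*4``` of ```dense1``` is the flattened size of the output of ```conv2```. The sketch below traces that size under two assumptions that are not stated in this section: the input is a single-channel 28x28 image, and each convolution is followed by 2x2 max pooling with stride 2. With a different input size or pooling scheme the number would differ.

```python
import torch
import torch.nn.functional as F
from torch import nn

# Assumption: a batch of one single-channel 28x28 image
x = torch.zeros(1, 1, 28, 28)

# Same convolutional layers as in ConvNetwork
conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)

# Assumption: 2x2 max pooling (stride 2) after each convolution
h = F.max_pool2d(conv1(x), kernel_size=2, stride=2)  # 28 -> 24 -> 12
h = F.max_pool2d(conv2(h), kernel_size=2, stride=2)  # 12 -> 8 -> 4

print(h.shape)     # torch.Size([1, 12, 4, 4])

# Flatten everything except the batch dimension before the dense layers
flat = h.flatten(start_dim=1)
print(flat.shape)  # torch.Size([1, 192])
```

A 5x5 kernel without padding shrinks each spatial dimension by 4, and each pooling step halves it, which gives 12 channels of 4x4 values, or 192 features.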

We can inspect the weights of each layer, which are randomly initialised when the layer is created.

In [6]:
print(network.conv1.weight.shape)
print(network.conv1.weight)

torch.Size([6, 1, 5, 5])
Parameter containing:
tensor([[[[ 0.1889, -0.1850, -0.0487, -0.0466,  0.1180],
          [-0.1314, -0.1771,  0.0952,  0.0582,  0.0288],
          [ 0.0774,  0.0185, -0.1531,  0.0924,  0.0270],
          [ 0.1145, -0.0294, -0.1455,  0.0755, -0.1431],
          [-0.0671,  0.0907,  0.0849, -0.0041,  0.1573]]],


        [[[ 0.1938, -0.0258, -0.0250, -0.1817,  0.0395],
          [-0.0681, -0.0891,  0.1766,  0.0535, -0.1912],
          [-0.0355, -0.0394, -0.1643, -0.1877, -0.0632],
          [ 0.1463,  0.1823,  0.0186,  0.1890,  0.0853],
          [-0.1790, -0.0165,  0.1380,  0.1172, -0.1004]]],


        [[[-0.1974, -0.0381, -0.1312, -0.1662, -0.1308],
          [-0.0140,  0.0047,  0.0836,  0.0262, -0.0043],
          [ 0.0654, -0.1568, -0.1692, -0.1703,  0.1845],
          [ 0.0156, -0.0285,  0.1867, -0.1354, -0.0268],
          [ 0.1510, -0.0729,  0.1017, -0.0110,  0.0435]]],


        [[[ 0.1860, -0.0351,  0.1267,  0.0503,  0.0356],
          [-0.1416, -0.0794, 

Notice that the shape of the weight tensor matches the layer's specification: 6 output channels, 1 input channel, and a 5x5 kernel. The weight is also wrapped in a ```Parameter```, whose ```requires_grad``` attribute flags it for inclusion in the computational graph used to compute derivatives during backpropagation. Such a tensor is called a **learnable parameter**.
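As a small illustration of the ```requires_grad``` flag, the sketch below builds a standalone convolutional layer with the same settings as ```conv1``` and runs one backward pass. The 28x28 input and the use of ```sum()``` as a stand-in for a loss are assumptions made only for this example.

```python
import torch
from torch import nn

# Standalone layer with the same settings as conv1
conv = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)

print(isinstance(conv.weight, nn.Parameter))  # True
print(conv.weight.requires_grad)              # True
print(conv.weight.grad)                       # None (no backward pass yet)

# Assumption: a dummy 28x28 input and a scalar "loss" obtained by summing the output
loss = conv(torch.ones(1, 1, 28, 28)).sum()
loss.backward()

# After backpropagation, the gradient has the same shape as the weight
print(conv.weight.grad.shape)                 # torch.Size([6, 1, 5, 5])
```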

We can check all the learnable parameters and their dimensions.

In [7]:
for name, param in network.named_parameters():
    print(name, '\t\t', param.shape)

conv1.weight 		 torch.Size([6, 1, 5, 5])
conv1.bias 		 torch.Size([6])
conv2.weight 		 torch.Size([12, 6, 5, 5])
conv2.bias 		 torch.Size([12])
dense1.weight 		 torch.Size([120, 192])
dense1.bias 		 torch.Size([120])
dense2.weight 		 torch.Size([60, 120])
dense2.bias 		 torch.Size([60])
out.weight 		 torch.Size([10, 60])
out.bias 		 torch.Size([10])
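The shapes listed above also determine how many learnable values the network holds. As a rough sketch, the snippet below rebuilds the same five layers in an ```nn.ModuleDict``` so that it runs on its own, and sums the element counts; ```count_parameters``` is a hypothetical helper name, not part of PyTorch.

```python
from torch import nn

# Same layers as ConvNetwork, rebuilt here so the snippet is self-contained
layers = nn.ModuleDict({
    "conv1": nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5),
    "conv2": nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5),
    "dense1": nn.Linear(in_features=12 * 4 * 4, out_features=120),
    "dense2": nn.Linear(in_features=120, out_features=60),
    "out": nn.Linear(in_features=60, out_features=10),
})

def count_parameters(module):
    """Hypothetical helper: total number of learnable values in a module."""
    return sum(p.numel() for p in module.parameters() if p.requires_grad)

# Per-tensor counts, e.g. conv1.weight holds 6 * 1 * 5 * 5 = 150 values
for name, param in layers.named_parameters():
    print(name, param.numel())

total = count_parameters(layers)
print(total)  # 32998
```

Most of the parameters sit in ```dense1```, whose weight matrix alone holds 120 * 192 = 23040 values.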
