In [6]:
import torch.nn as nn

Within the nn package, there is a class called Module, and it is the base class for all neural network modules, which includes layers.

This means that all of the layers in PyTorch extend the nn.Module class and inherit all of PyTorch’s built-in functionality within the nn.Module class. In OOP this concept is known as inheritance.
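
As a quick check of this inheritance relationship, the sketch below asks Python whether two common layer classes are subclasses of nn.Module. The variable names (`layer_classes`, `is_module`, `conv`) are only illustrative and are not part of the original notebook.

```python
import torch.nn as nn

# Two layer classes that are used later in this section
layer_classes = [nn.Linear, nn.Conv2d]

# issubclass() reports whether each layer class extends nn.Module
is_module = [issubclass(cls, nn.Module) for cls in layer_classes]
print(is_module)

# An instance of a layer is therefore also an instance of nn.Module
conv = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
print(isinstance(conv, nn.Module))
```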

## PyTorch nn.Modules Have A forward() Method
When we pass a tensor to our network as input, the tensor flows forward through each layer transformation until it reaches the output layer. This process of a tensor flowing forward through the network is known as a forward pass.
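
A minimal sketch of a forward pass, assuming two small linear layers chosen only for illustration (the sizes 4, 3, and 2 and the names `first` and `second` are hypothetical). Calling a layer like a function runs that layer's forward() method.

```python
import torch
import torch.nn as nn

# Two illustrative layers; the feature sizes are arbitrary assumptions
first = nn.Linear(in_features=4, out_features=3)
second = nn.Linear(in_features=3, out_features=2)

t = torch.rand(1, 4)   # a batch containing one sample with 4 features

# The tensor flows forward through each layer in turn
t = first(t)           # calling the layer invokes first.forward(t)
t = second(t)

print(t.shape)         # torch.Size([1, 2])
```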

## Extending PyTorch’s nn.Module Class

In [7]:
class Network:
    def __init__(self):
        self.layer = None

    def forward(self, t):
        t = self.layer(t)
        return t

This is a good start, but the class hasn’t yet extended the nn.Module class. To make our Network class extend nn.Module, we must do two additional things:

Specify the nn.Module class in parentheses on line 1.

Insert a call to the super class constructor on line 3 inside the constructor.

In [8]:
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer = None

    def forward(self, t):
        t = self.layer(t)
        return t
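
One way to see why the super class constructor call matters is to leave it out and assign a real layer. The sketch below uses two hypothetical classes, `WithoutSuper` and `WithSuper`, and assumes that PyTorch raises an AttributeError when a module is assigned before nn.Module has been initialized.

```python
import torch.nn as nn

class WithoutSuper(nn.Module):
    def __init__(self):
        # super().__init__() is intentionally omitted here
        self.layer = nn.Linear(in_features=2, out_features=2)

class WithSuper(nn.Module):
    def __init__(self):
        super().__init__()  # sets up nn.Module's internal bookkeeping
        self.layer = nn.Linear(in_features=2, out_features=2)

try:
    WithoutSuper()
    raised = False
except AttributeError as err:
    raised = True
    print("Assignment failed:", err)

# With the constructor call in place, the layer is registered as a submodule
registered = [name for name, _ in WithSuper().named_children()]
print(registered)  # ['layer']
```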

## Define The Network’s Layers As Class Attributes

In [None]:
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)

        self.fc1 = nn.Linear(in_features=12 * 4 * 4, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=60)
        self.out = nn.Linear(in_features=60, out_features=10)

    def forward(self, t):
        # implement the forward pass
        return t

## Learnable Parameters

In [10]:
network = Network()

In [11]:
print(network)

Network(
  (conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(6, 12, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=192, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=60, bias=True)
  (out): Linear(in_features=60, out_features=10, bias=True)
)


### Accessing the network's layers

In [13]:
network.conv1

Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))

In [14]:
network.conv2

Conv2d(6, 12, kernel_size=(5, 5), stride=(1, 1))

In [15]:
network.fc1

Linear(in_features=192, out_features=120, bias=True)

In [16]:
network.fc2

Linear(in_features=120, out_features=60, bias=True)

In [17]:
network.out

Linear(in_features=60, out_features=10, bias=True)

### Accessing the network's weights

In [18]:
network.conv1.weight

Parameter containing:
tensor([[[[ 0.1701, -0.0298,  0.0305, -0.1445,  0.1099],
          [ 0.0904, -0.1352,  0.0419, -0.1243,  0.1221],
          [-0.1063, -0.1332, -0.1518,  0.1483, -0.0891],
          [-0.0921,  0.1977, -0.1594,  0.1403, -0.1695],
          [ 0.0449, -0.1578, -0.0671,  0.0751, -0.0445]]],


        [[[-0.0573,  0.1034, -0.1148, -0.1492,  0.0107],
          [ 0.1031,  0.0257,  0.0414, -0.0014, -0.0297],
          [-0.0612, -0.1360,  0.1444, -0.1269,  0.0047],
          [ 0.0316, -0.0198, -0.0838,  0.1712,  0.0063],
          [ 0.1424, -0.1660,  0.0058, -0.0915,  0.0710]]],


        [[[ 0.0878,  0.1660, -0.1729,  0.1717, -0.1847],
          [-0.1135,  0.1537, -0.1290,  0.0266,  0.0762],
          [-0.0104,  0.0590,  0.1039, -0.1328, -0.0697],
          [-0.0017,  0.0927, -0.1520,  0.0626,  0.1625],
          [-0.1531,  0.1145,  0.0451,  0.0878, -0.0158]]],


        [[[ 0.0484, -0.1029, -0.1885, -0.0767,  0.0413],
          [ 0.1535, -0.1588,  0.1933,  0.0780,  0.0369

## Weight tensor shape

### Convolutional Layers

In [20]:
network.conv1

Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))

In [19]:
network.conv1.weight.shape

torch.Size([6, 1, 5, 5])

In [21]:
network.conv2

Conv2d(6, 12, kernel_size=(5, 5), stride=(1, 1))

In [22]:
network.conv2.weight.shape

torch.Size([12, 6, 5, 5])

Both weight tensors are rank-4 tensors. The first axis represents the number of filters. The second axis represents the depth of each filter, which corresponds to the number of input channels being convolved.

The last two axes represent the height and width of each filter. We can pull out any single filter by indexing into the weight tensor’s first axis.
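
The indexing described above can be sketched as follows. To keep the example self-contained, the two convolutional layers are created directly with the same arguments used in the Network class; `first_filter` is just an illustrative name.

```python
import torch.nn as nn

# Same layer definitions as conv1 and conv2 in the Network class
conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)

# Indexing the first axis selects one filter
first_filter = conv2.weight[0]

# The remaining axes are (depth, height, width)
print(first_filter.shape)     # torch.Size([6, 5, 5])
print(conv1.weight[0].shape)  # torch.Size([1, 5, 5])
```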

### Linear Layers

In [23]:
network.fc1

Linear(in_features=192, out_features=120, bias=True)

In [24]:
network.fc1.weight.shape

torch.Size([120, 192])

In [26]:
network.fc2

Linear(in_features=120, out_features=60, bias=True)

In [27]:
network.fc2.weight.shape

torch.Size([60, 120])
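
The weight shapes above follow the pattern [out_features, in_features]. A minimal sketch of why, assuming a random input `x` with 192 features: nn.Linear multiplies the input by the transpose of its weight matrix and adds the bias, so the manual computation below should match the layer's own output.

```python
import torch
import torch.nn as nn

# Same definition as fc1 in the Network class
fc1 = nn.Linear(in_features=192, out_features=120)

x = torch.rand(1, 192)  # one flattened sample with 192 features (illustrative)

# Manual version of the layer: x times the transposed weight matrix, plus bias
manual = x @ fc1.weight.t() + fc1.bias

print(manual.shape)                               # torch.Size([1, 120])
print(torch.allclose(manual, fc1(x), atol=1e-5))  # compare with the layer's output
```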