# Layers and Modules

While neurons, layers, and models give us enough abstractions to implement a neural network, we often find it convenient to speak about components that are larger than an individual layer but smaller than the entire model. These components are modules. A huge network may be composed of a large number of layers arranged in repeating patterns, and we can define each such pattern as a module.

In [1]:
import torch
from torch import nn
from torch.nn import functional as F

# nn.Sequential is a special container of modules
# Define a simple module using nn.Sequential
net = nn.Sequential(nn.LazyLinear(256), nn.ReLU(), nn.LazyLinear(10))

X = torch.rand(2, 20)
net(X).shape



torch.Size([2, 10])
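
The idea of a repeating pattern packaged as a module can be sketched as follows. This is only an illustration: the helper name `make_block` and the width of 64 units are assumptions made for the example, not part of the original notebook.

```python
import torch
from torch import nn

def make_block(hidden_units):
    # Hypothetical helper: one repeating unit (linear layer followed by ReLU)
    return nn.Sequential(nn.LazyLinear(hidden_units), nn.ReLU())

# Stack three identical blocks, then add an output layer
deep_net = nn.Sequential(
    make_block(64),
    make_block(64),
    make_block(64),
    nn.LazyLinear(10),
)

X = torch.rand(2, 20)
out = deep_net(X)
print(out.shape)      # shape of the output for a batch of 2 inputs
print(len(deep_net))  # number of top-level children
```

Each block is itself an nn.Sequential, so modules nest: the outer container sees four children, three of which contain further layers.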

## 1. A Custom Module

In [2]:
# Every custom module must subclass nn.Module
class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden = nn.LazyLinear(256)
        self.out = nn.LazyLinear(10)
    
    # override the forward method
    def forward(self, X):
        return self.out(F.relu(self.hidden(X)))

In [3]:
net = MLP()
net(X).shape



torch.Size([2, 10])
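
Assigning a layer to an attribute in `__init__` is what registers it with the module. As a small sketch of that mechanism, the snippet below repeats the MLP class so it runs on its own and then lists the registered parameters; the dictionary name `param_shapes` is introduced here only for illustration.

```python
import torch
from torch import nn
from torch.nn import functional as F

class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden = nn.LazyLinear(256)
        self.out = nn.LazyLinear(10)

    def forward(self, X):
        return self.out(F.relu(self.hidden(X)))

net = MLP()
# The first forward pass lets the lazy layers infer their input sizes
net(torch.rand(2, 20))

# Parameter names are built from the attribute names used in __init__
param_shapes = {name: tuple(p.shape) for name, p in net.named_parameters()}
for name, shape in param_shapes.items():
    print(name, shape)
```

The lazy layers have no concrete weight shapes until data has passed through them, which is why the forward pass comes before the inspection.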

## 2. The Sequential Module
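We use nn.Sequential a lot. It is essentially a special module: it stores the modules it is given in order and, in its forward pass, calls them one after another, feeding each output into the next module. The simplified version below reimplements this behavior.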

In [4]:
class MySequential(nn.Module):
    def __init__(self, *args):
        super().__init__()
        for idx, module in enumerate(args):
            self.add_module(str(idx), module)
    
    def forward(self, X):
        for module in self.children():
            X = module(X)
        return X

In [5]:
net = MySequential(
    nn.LazyLinear(256),
    nn.ReLU(),
    nn.LazyLinear(10)
)
net(X).shape



torch.Size([2, 10])
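
To see why MySequential calls add_module instead of keeping the layers in an ordinary Python list, the sketch below compares the two approaches. The class names `ListHolder` and `Registered` are hypothetical and exist only for this comparison; fixed-size nn.Linear layers are used so that no forward pass is needed.

```python
import torch
from torch import nn

class ListHolder(nn.Module):
    # Hypothetical counter-example: layers kept in a plain Python list
    def __init__(self, *layers):
        super().__init__()
        self.layers = list(layers)  # not registered as submodules

class Registered(nn.Module):
    # Same registration pattern as MySequential
    def __init__(self, *layers):
        super().__init__()
        for idx, layer in enumerate(layers):
            self.add_module(str(idx), layer)

plain = ListHolder(nn.Linear(20, 256), nn.ReLU(), nn.Linear(256, 10))
registered = Registered(nn.Linear(20, 256), nn.ReLU(), nn.Linear(256, 10))

# Count the parameter tensors each module reports
n_plain = len(list(plain.parameters()))
n_registered = len(list(registered.parameters()))
child_names = [name for name, _ in registered.named_children()]
print(n_plain, n_registered, child_names)
```

Layers held in a plain list are invisible to parameters(), so an optimizer built from that module would have nothing to update. Registered children are reported under the string names passed to add_module.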

## 3. Executing Code in Forward Propagation

In [6]:
class FixedHiddenMLP(nn.Module):
    def __init__(self):
        super().__init__()
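        # Random weights stored as a plain tensor, not as a parameter,
        # so they are never updated during training
        self.rand_weight = torch.rand((20, 20))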
        self.linear = nn.LazyLinear(20)
    
    def forward(self, X):
        X = self.linear(X)
        # Some computation
        X = F.relu(X @ self.rand_weight + 1)
        # Reuse the fully connected layer. This is equivalent to sharing
        # parameters between two fully connected layers
        X = self.linear(X)
        # Control flow
        while X.abs().sum() > 1:
            X /= 2
        return X.sum()

In [7]:
net = FixedHiddenMLP()
net(X)



tensor(0.0479, grad_fn=<SumBackward0>)
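
The constant weight and the reused layer above have consequences for what the module reports as trainable. The following sketch explores this under some assumptions: the class name `FixedWeight` and the attribute names `plain_weight` and `buffer_weight` are hypothetical, and `register_buffer` is shown as one possible alternative to a plain tensor attribute, not as what the original cell does.

```python
import torch
from torch import nn

class FixedWeight(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(20, 20)
        # Plain tensor attribute: not reported by parameters() or state_dict()
        self.plain_weight = torch.rand(20, 20)
        # Registered buffer: also not trained, but included in state_dict()
        self.register_buffer("buffer_weight", torch.rand(20, 20))

    def forward(self, X):
        X = self.linear(X)
        X = torch.relu(X @ self.buffer_weight + 1)
        return self.linear(X)  # the same layer is applied a second time

net = FixedWeight()
param_names = [name for name, _ in net.named_parameters()]
state_keys = list(net.state_dict().keys())

# Gradients from both uses of the shared layer accumulate in one weight tensor
loss = net(torch.rand(2, 20)).sum()
loss.backward()
print(param_names)
print(sorted(state_keys))
```

Only the shared linear layer contributes parameters, even though it is applied twice. The buffer appears in the state dict while the plain attribute does not, which matters when saving a model or moving it to another device.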