# Builders' Guide

In [23]:
import torch
from torch import nn
from torch.nn import functional as F

In [24]:
# Use PyTorch to build a network with one fully connected hidden layer (256 units, ReLU activation), followed by a fully connected output layer with 10 units
net = nn.Sequential(nn.LazyLinear(256), nn.ReLU(), nn.LazyLinear(10))
X = torch.rand(2, 20)
net(X).shape



torch.Size([2, 10])

`nn.Sequential` defines a special kind of `Module`. It maintains an ordered list of constituent `Module`s. Forward propagation chains the modules in the list together, passing the output of each as input to the next. Calling `net(X)` is shorthand for `net.__call__(X)`, which in turn runs the module's forward propagation method.
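The chaining described above can be checked by hand. The following is a minimal sketch, not part of the original notebook: it uses `nn.Linear` with explicit input sizes instead of `nn.LazyLinear` to keep the example simple, and the names `seq`, `inputs`, and `manual` are illustrative.

```python
import torch
from torch import nn

# Same architecture as above, but with explicit input sizes (an assumption
# made for this sketch; the notebook itself uses nn.LazyLinear).
seq = nn.Sequential(nn.Linear(20, 256), nn.ReLU(), nn.Linear(256, 10))
inputs = torch.rand(2, 20)

# nn.Sequential is iterable, so we can pass the input through each
# stored module in order, exactly as its forward propagation does.
manual = inputs
for layer in seq:
    manual = layer(manual)

# Calling the container directly should give the same result.
matches = torch.allclose(manual, seq(inputs))
print(matches, manual.shape)
```

If the chaining works as described, `matches` is `True` and the output has one row per input example and 10 columns.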

### A Custom Module

The basic functionalities that each module must provide are:

1. Ingest input data as arguments to its forward propagation method.
2. Generate an output by having the forward propagation method return a value.
3. Calculate the gradient of its output with respect to its input, via the backpropagation method.
4. Store and provide access to those parameters necessary to execute the forward propagation computation.
5. Initialize model parameters as needed.
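Several of these responsibilities can be observed on a built-in layer. The sketch below is illustrative only: it takes a single `nn.Linear` layer (with sizes chosen for the example), lists the parameters it stores, and lets automatic differentiation fill in the gradients after a backward pass.

```python
import torch
from torch import nn

layer = nn.Linear(20, 10)   # parameters are initialized on construction
inputs = torch.rand(2, 20)

# Item 4: the module stores its parameters and exposes them by name.
param_shapes = {name: tuple(p.shape) for name, p in layer.named_parameters()}

# Items 1-3: forward propagation produces an output; calling backward()
# on a scalar derived from it lets autograd compute the gradients.
loss = layer(inputs).sum()
loss.backward()
grad_shape = tuple(layer.weight.grad.shape)

print(param_shapes, grad_shape)
```

The gradient of each parameter has the same shape as the parameter itself, which is why we rarely need to write the backpropagation method by hand.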

We code up a module from scratch corresponding to a multilayer perceptron with one hidden layer of 256 hidden units and a 10-dimensional output layer. We inherit most methods from `nn.Module` and define only our own constructor and the forward propagation method.

In [25]:
class MLP(nn.Module):

    def __init__(self):
        # Call the constructor of the parent class nn.Module to perform the necessary initialization.
        # Unless we implement a new layer, we do not need to worry about the backpropagation method or parameter initialization.
        super().__init__()
        self.hidden = nn.LazyLinear(256)
        self.out = nn.LazyLinear(10)
    
    # define the forward propagation of the model
    def forward(self, X):
        return self.out(F.relu(self.hidden(X)))

In [26]:
net = MLP()
net(X).shape

torch.Size([2, 10])

### The Sequential Module

`Sequential` is designed to daisy-chain other modules together. To build our own `MySequential`, we just need to define two key methods:
1. A method to append modules one by one to a list
2. A forward propagation method to pass an input through the chain of modules, in the same order as they were appended.

In [27]:
# our implementation of Sequential
class MySequential(nn.Module):

    def __init__(self, *args):
        super().__init__()
        for idx, module in enumerate(args):
            self.add_module(str(idx), module)
    
    def forward(self, X):
        for module in self.children():
            X = module(X)
        return X

In [28]:
# Implementing the MLP using our MySequential class
net = MySequential(nn.LazyLinear(256), nn.ReLU(), nn.LazyLinear(10))
net(X).shape

torch.Size([2, 10])
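One reason `MySequential` calls `add_module` rather than keeping the layers in a plain Python list is that only registered submodules contribute their parameters to the parent module. The sketch below illustrates the difference; the class names `ListHolder` and `Registered` are hypothetical and exist only for this comparison.

```python
from torch import nn

class ListHolder(nn.Module):
    """Hypothetical container that keeps layers in a plain Python list."""
    def __init__(self, *layers):
        super().__init__()
        self.layers = list(layers)  # a plain list is NOT registered

class Registered(nn.Module):
    """Hypothetical container that registers layers, as MySequential does."""
    def __init__(self, *layers):
        super().__init__()
        for idx, layer in enumerate(layers):
            self.add_module(str(idx), layer)

plain = ListHolder(nn.Linear(4, 3), nn.Linear(3, 2))
registered = Registered(nn.Linear(4, 3), nn.Linear(3, 2))

# Count the parameter tensors each container reports.
n_plain = len(list(plain.parameters()))
n_registered = len(list(registered.parameters()))
print(n_plain, n_registered)
```

The plain list hides the layers from `parameters()`, so an optimizer built from that container would have nothing to update, while the registered version reports a weight and a bias for each layer.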

The `Sequential` class is not always enough. When we want greater flexibility, such as executing Python control flow within forward propagation or performing arbitrary mathematical operations rather than relying on predefined neural network layers, we need to define our own module.

We might also want to incorporate terms that are neither the result of previous layers nor updated during training. These are called constant parameters. Here is an MLP with constant parameters:

In [29]:
class FixedHiddenMLP(nn.Module):

    def __init__(self):
        super().__init__()
        # Random weight parameters that will not compute gradients and therefore keep constant during training
        self.rand_weight = torch.rand((20, 20))
        self.linear = nn.LazyLinear(20)

    def forward(self, X):
        X = self.linear(X)
        X = F.relu(X @ self.rand_weight + 1)
        # Reuse the fully connected layer.
        X = self.linear(X)
        # control flow
        while X.abs().sum() > 1:
            X /= 2
        return X.sum()

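In `FixedHiddenMLP`, we implement a hidden layer whose weights (`self.rand_weight`) are initialized randomly at instantiation and kept constant thereafter. The forward propagation also runs a while loop that tests whether the L1 norm of the output (`X.abs().sum()`) is larger than 1, halving the output until the condition no longer holds. This shows that we can integrate arbitrary code into the flow of neural network computations.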
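To confirm that a constant weight really stays out of training, we can check that it is not listed among the module's parameters and receives no gradient. The sketch below is a variant, not the notebook's own code: it stores the constant with `register_buffer`, which additionally includes the tensor in `state_dict()`, whereas the class above keeps it as a plain tensor attribute. The class name `FixedWeight` and the use of `nn.Linear(20, 20)` are assumptions made for the example.

```python
import torch
from torch import nn

class FixedWeight(nn.Module):
    """Hypothetical module with one trainable layer and one constant weight."""
    def __init__(self):
        super().__init__()
        # A buffer is saved with the module but is not a trainable parameter.
        self.register_buffer("rand_weight", torch.rand(20, 20))
        self.linear = nn.Linear(20, 20)

    def forward(self, X):
        return torch.relu(self.linear(X) @ self.rand_weight + 1)

fixed = FixedWeight()
param_names = [name for name, _ in fixed.named_parameters()]
buffer_names = [name for name, _ in fixed.named_buffers()]

# Backpropagate through the module and inspect the gradients.
fixed(torch.rand(2, 20)).sum().backward()
constant_has_grad = fixed.rand_weight.grad is not None
linear_has_grad = fixed.linear.weight.grad is not None
print(param_names, buffer_names, constant_has_grad, linear_has_grad)
```

Only the linear layer's weight and bias appear as parameters and accumulate gradients; the constant weight takes part in the computation but is never updated.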

In [30]:
net = FixedHiddenMLP()
net(X)

tensor(-0.2376, grad_fn=<SumBackward0>)

We can mix and match various ways of assembling modules. The following example nests modules inside one another:

In [32]:
class NestMLP(nn.Module):

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.LazyLinear(64), nn.ReLU(), nn.LazyLinear(32), nn.ReLU())
        self.linear = nn.LazyLinear(16)

    def forward(self, X):
        return self.linear(self.net(X))
    
chimera = nn.Sequential(NestMLP(), nn.LazyLinear(20), FixedHiddenMLP())
chimera(X)



tensor(-0.0022, grad_fn=<SumBackward0>)
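Nested modules can be inspected by name, which helps when a composite model grows. The following is a small sketch with illustrative layer sizes and variable names (`inner`, `outer`), using `nn.Linear` rather than the lazy layers above: it lists the direct children of a nested `nn.Sequential` and then every module in the hierarchy.

```python
from torch import nn

# A Sequential nested inside another Sequential.
inner = nn.Sequential(nn.Linear(20, 64), nn.ReLU())
outer = nn.Sequential(inner, nn.Linear(64, 16))

# Direct children only: the inner container and the final linear layer.
child_names = [name for name, _ in outer.named_children()]

# All modules, visited recursively; nested names are joined with dots
# and the root module itself is reported under the empty name.
all_names = [name for name, _ in outer.named_modules()]
print(child_names, all_names)
```

The dotted names returned by `named_modules` mirror the nesting, so a layer inside `inner` appears with the prefix of its parent container.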