# Managing network

`torch.nn.Module` is the base class for defining neural networks in PyTorch. To build your own network, create a class that inherits from it.

In [2]:
import torch
from torch import nn

## Sequential

You can use `torch.nn.Sequential` to combine multiple network layers into a sequential chain, where the output of each layer becomes the input of the next. Find out more on the [dedicated page](managing_network/sequential.ipynb).

---

The following cell demonstrates a basic example where a linear transformation is applied to the input, followed by a ReLU activation function.

In [None]:
size = 3

sequential = torch.nn.Sequential(
    torch.nn.Linear(size, size, bias=False),
    torch.nn.ReLU()
)

X = torch.randn([3, 3])
sequential(X)

tensor([[0.0000, 0.0000, 0.8781],
        [0.4362, 0.0000, 0.7350],
        [0.0000, 0.0000, 1.1225]], grad_fn=<ReluBackward0>)
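As a small sanity check, the sketch below compares the output of the chain with the result of applying its layers one by one. The seed value and the variable names `manual` and `chained` are illustrative choices, not part of the original example.

```python
import torch
from torch import nn

torch.manual_seed(0)  # arbitrary seed, only for reproducibility
size = 3

sequential = nn.Sequential(
    nn.Linear(size, size, bias=False),
    nn.ReLU(),
)
X = torch.randn(3, 3)

# Layers inside a Sequential can be accessed by index.
linear, relu = sequential[0], sequential[1]

# Applying the layers one after another should match calling the chain.
manual = relu(linear(X))
chained = sequential(X)
print(torch.allclose(manual, chained))
```

Because ReLU is the last layer, every value in the output is non-negative, which matches the zeros in the output shown above.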

## Separate class

You can define a neural network as a separate class, which allows you to add custom logic for initialization or network-specific procedures. To create a network class, follow these rules:

- **Inherit from `torch.nn.Module`:** This establishes your class as a PyTorch module, providing access to its functionality.
- **Call `super().__init__()` in the constructor:** This initializes the base `nn.Module` class, ensuring proper setup.
- **Define a `forward` method:** This method implements the computational procedure of your network. It defines how input data flows through your layers to produce output. 

---

The following cell defines a stack of `Linear` layers whose number and width are set when the class is instantiated. The `forward` method standardizes the data column-wise before passing it through the network.

In [None]:
class ExampleNetwork(torch.nn.Module):
    def __init__(self, layers_number: int, neurons: int):
        super().__init__()

        # Stack `layers_number` square Linear layers of width `neurons`.
        self.network = torch.nn.Sequential(*[
            torch.nn.Linear(neurons, neurons)
            for _ in range(layers_number)
        ])

    def forward(self, X: torch.Tensor) -> torch.Tensor:
        # Standardize each column to zero mean and unit standard deviation.
        X = (X - X.mean(dim=0, keepdim=True)) / X.std(dim=0, keepdim=True)
        return self.network(X)

Let's check if the network we've defined works as expected. 

In [None]:
ExampleNetwork(layers_number=10, neurons=3)(X = torch.randn([5, 3]))

tensor([[-0.2482,  0.0882,  0.4507],
        [-0.2465,  0.0897,  0.4466],
        [-0.2531,  0.0827,  0.4587],
        [-0.2463,  0.0899,  0.4459],
        [-0.2461,  0.0892,  0.4429]], grad_fn=<AddmmBackward0>)
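The standardization step used in `forward` can also be checked in isolation. The following sketch applies the same formula to a tensor with a deliberately shifted and scaled distribution and then inspects the column statistics. The seed, the shift and the scale are arbitrary values chosen for illustration.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only for reproducibility

# Random data with mean around 10 and standard deviation around 4.
X = torch.randn(5, 3) * 4 + 10

# Same column-wise standardization as in the forward method.
X_std = (X - X.mean(dim=0, keepdim=True)) / X.std(dim=0, keepdim=True)

col_means = X_std.mean(dim=0)
col_stds = X_std.std(dim=0)
print(col_means)  # each value should be close to 0
print(col_stds)   # each value should be close to 1
```

Note that the statistics are computed over the batch dimension, so the output for one row depends on the other rows in the same batch.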

## Parameters

To optimize a network properly, you need tools that let you access its parameters and manage them. This section covers the typical methods for managing model parameters in Torch.

### Access parameters

To access model parameters, use the `torch.nn.Module.parameters` method. This method returns a generator that iterates over the parameters of all layers in the network.

Check the official [documentation on the `parameters` method](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.parameters).

---

The following cell defines an empty `nn.Module`, so collecting the output of its `parameters` generator into a list gives an empty list:

In [6]:
class EmptyNetwork(nn.Module):
    pass
empty_network = EmptyNetwork()
[i for i in empty_network.parameters()]

[]

The next cell implements a descendant of `nn.Module` whose parameters come from its attributes. Specifically, it defines two fully connected layers, so we end up with four tensors: a weight matrix and a bias vector for each layer.

In [7]:
class ParametersNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.foo = nn.Linear(3, 3)
        self.bar = nn.Linear(5, 5)

network = ParametersNetwork()
for i in network.parameters():
    print(i.data)

tensor([[-0.4889, -0.2448, -0.1750],
        [ 0.0770, -0.0333,  0.2421],
        [-0.0755, -0.2302, -0.4851]])
tensor([ 0.1668, -0.5771,  0.4508])
tensor([[ 0.1828, -0.3526, -0.3598,  0.4468, -0.2286],
        [-0.0492,  0.3426,  0.2613,  0.2133, -0.2792],
        [-0.2052,  0.2514,  0.0616,  0.4382, -0.2944],
        [-0.2796, -0.0471, -0.4185,  0.4359,  0.2697],
        [ 0.3577,  0.4372, -0.0179,  0.1575,  0.2003]])
tensor([-0.4275,  0.0130,  0.4131, -0.2934,  0.3826])
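To see which tensor belongs to which layer, the parameters can be listed together with their names and shapes. The sketch below redefines the same `ParametersNetwork` so that it runs on its own, then uses `named_parameters` and `numel` to summarize the parameters. The variable names `shapes` and `total` are illustrative.

```python
from torch import nn

class ParametersNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.foo = nn.Linear(3, 3)
        self.bar = nn.Linear(5, 5)

network = ParametersNetwork()

# named_parameters yields (name, tensor) pairs; the names are built
# from the attribute names of the layers.
shapes = {name: tuple(p.shape) for name, p in network.named_parameters()}
for name, shape in shapes.items():
    print(name, shape)

# Total number of scalar values across all parameter tensors:
# 3*3 + 3 + 5*5 + 5 = 42
total = sum(p.numel() for p in network.parameters())
print(total)
```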


### Requires grad

You can control which parameters of a neural network have gradients computed. One way is to access the parameters directly and set their `requires_grad` attribute to `False`. Alternatively, the `requires_grad_()` method sets this attribute for all parameters of an `nn.Module` at once.
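The direct approach can be sketched as follows: loop over the parameters of one layer and set the attribute on each tensor. The model architecture here is only an illustration, and the `flags` dictionary is a helper introduced for this example.

```python
from torch import nn

model = nn.Sequential(
    nn.Linear(10, 10),
    nn.ReLU(),
    nn.Linear(10, 1),
)

# Switch off gradients for the first layer, parameter by parameter.
for parameter in model[0].parameters():
    parameter.requires_grad = False

# Collect the flag of every parameter to inspect the result.
flags = {name: p.requires_grad for name, p in model.named_parameters()}
print(flags)
```

Only the first layer is affected; the parameters of the last layer keep the default value `True`.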

---

The following cell defines a network and prints the `requires_grad` attribute of its weights.

In [86]:
torch.manual_seed(10)
model = torch.nn.Sequential(
    torch.nn.Linear(in_features=10, out_features=10),
    torch.nn.ReLU(),
    torch.nn.Linear(in_features=10, out_features=1),
)

for name, parameters in model.named_parameters():
    print(name, parameters.requires_grad)

0.weight True
0.bias True
2.weight True
2.bias True


By default, all parameters require gradients. The following cell applies `requires_grad_(False)` to the entire network and then calls `requires_grad_(True)` on the last layer only.

In [92]:
model.requires_grad_(False)
model[2].requires_grad_(True)

for name, parameters in model.named_parameters():
    print(name, parameters.requires_grad)

0.weight False
0.bias False
2.weight True
2.bias True


As a result, only the parameters of the last layer require gradients. During optimization, only parameters that require gradients are updated.
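The effect on optimization can be checked with a single training step. This is a minimal sketch under assumed settings: the random data, the SGD optimizer, the learning rate and the MSE loss are all illustrative choices and do not come from the original example.

```python
import torch
from torch import nn

torch.manual_seed(10)
model = nn.Sequential(
    nn.Linear(in_features=10, out_features=10),
    nn.ReLU(),
    nn.Linear(in_features=10, out_features=1),
)

# Freeze everything, then re-enable gradients for the last layer only.
model.requires_grad_(False)
model[2].requires_grad_(True)

# Snapshot the weights before the update.
frozen_before = model[0].weight.detach().clone()
trainable_before = model[2].weight.detach().clone()

# Pass only the trainable parameters to the optimizer (assumed: plain SGD).
optimizer = torch.optim.SGD(
    [p for p in model.parameters() if p.requires_grad], lr=0.1
)

# One optimization step on random data.
X = torch.randn(8, 10)
y = torch.randn(8, 1)
loss = nn.functional.mse_loss(model(X), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()

frozen_unchanged = torch.equal(model[0].weight, frozen_before)
trainable_changed = not torch.equal(model[2].weight, trainable_before)
print(frozen_unchanged, trainable_changed)
```

The frozen layer never receives a gradient, so its `grad` attribute stays `None` and its weights keep their initial values.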