In [None]:
%pip install torch

# Understanding PyTorch `nn.Module` 

This code shows how to define a custom neural network model using PyTorch’s `nn.Module` class. It also demonstrates various model-related methods such as parameter registration, buffer registration, and mode switching.

---


## Base Classes
The base classes form the foundation of PyTorch's neural network building blocks and utilities.

---

## Module
### Code: `nn.Module`
### Common Attributes and Methods:

| Attribute/Method                   | Type   | Description                                                                       |
|------------------------------------|--------|-----------------------------------------------------------------------------------|
| `.training`                        | bool   | `True` when the module is in training mode, `False` in evaluation mode.           |
| `.forward()`                       | method | Defines the forward computation of the model.                                     |
| `.parameters()`                    | method | Returns an iterator over the module's parameters, including those of submodules.  |
| `.children()`                      | method | Yields immediate child modules.                                                   |
| `.modules()`                       | method | Recursively yields all modules, starting with the module itself.                  |
| `.state_dict()`                    | method | Returns a dictionary of the model's parameters and persistent buffers.            |
| `.load_state_dict()`               | method | Loads a state dict into the model.                                                |
| `.register_buffer(name, tensor)`   | method | Registers a buffer (a non-trainable tensor, such as running statistics).          |
| `.register_parameter(name, param)` | method | Registers a parameter manually.                                                   |
| `.add_module(name, module)`        | method | Adds a submodule manually.                                                        |
| `.to(device)`                      | method | Moves the model's parameters and buffers to a device.                             |
| `.eval()` / `.train()`             | method | Switches between evaluation and training mode.                                    |


## 1. Importing Libraries

```python
import torch
import torch.nn as nn
```

---


## 2. Defining a Custom Model Class

```
class MyModel(nn.Module):
```
Here, we are creating a new neural network class that inherits from `nn.Module`, the base class for all PyTorch models.

### Inside `__init__`: Defining the Layers

```
def __init__(self, input_size, hidden_size, output_size):
    super(MyModel, self).__init__()
```
- `input_size`: Number of features in the input.
- `hidden_size`: Number of neurons in the hidden layer.
- `output_size`: Number of output units.

### Adding Three Layers Manually:
```
self.add_module('fc1', nn.Linear(input_size, hidden_size))
self.add_module('relu', nn.ReLU())
self.add_module('fc2', nn.Linear(hidden_size, output_size))
```

- `fc1`: Fully connected layer from input to hidden.
- `relu`: ReLU activation function.
- `fc2`: Fully connected layer from hidden to output.

Using `.add_module()` lets you register layers under explicit names.

### Alternative:
Assigning a module to an attribute registers it under that name, just as `.add_module()` does:
```
self.fc1 = nn.Linear(...)
```
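A minimal sketch comparing the two registration styles. The class names `ViaAddModule` and `ViaAttribute` and the layer sizes are hypothetical, chosen only for illustration. Both modules should end up with a child registered under the name `fc1`:

```python
import torch.nn as nn


class ViaAddModule(nn.Module):
    def __init__(self):
        super().__init__()
        # Explicit registration under the name 'fc1'
        self.add_module("fc1", nn.Linear(4, 8))


class ViaAttribute(nn.Module):
    def __init__(self):
        super().__init__()
        # Attribute assignment registers the submodule as well
        self.fc1 = nn.Linear(4, 8)


names_a = [name for name, _ in ViaAddModule().named_children()]
names_b = [name for name, _ in ViaAttribute().named_children()]
print(names_a, names_b)
```

`.add_module()` is mostly useful when the name is only known at runtime, for example when building layers in a loop.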

### Registering a Buffer:

```
self.register_buffer('running_mean', torch.zeros(output_size))
```
- Buffers are non-trainable tensors: they are not returned by `.parameters()`, so an optimizer does not update them.
- Common use: storing running statistics (like in BatchNorm).
- Here we create a `running_mean` buffer filled with zeros.
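The following sketch illustrates where a buffer shows up and where it does not. The class `WithBuffers` and the extra `scratch` buffer are assumptions added for illustration; `persistent=False` is the `register_buffer` option that keeps a buffer out of the state dict:

```python
import torch
import torch.nn as nn


class WithBuffers(nn.Module):  # hypothetical example class
    def __init__(self, size):
        super().__init__()
        self.fc = nn.Linear(size, size)
        # Persistent buffer: saved in state_dict, moved by .to(device)
        self.register_buffer("running_mean", torch.zeros(size))
        # Non-persistent buffer: moved by .to(device), but not saved
        self.register_buffer("scratch", torch.zeros(size), persistent=False)


m = WithBuffers(2)
param_names = [name for name, _ in m.named_parameters()]
buffer_names = [name for name, _ in m.named_buffers()]
state_keys = list(m.state_dict().keys())

print("parameters:", param_names)
print("buffers:   ", buffer_names)
print("state_dict:", state_keys)
```

Neither buffer appears among the parameters, and only `running_mean` appears in the state dict.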


### Registering a Custom Parameter

```
param = nn.Parameter(torch.randn(output_size), requires_grad=True)
self.register_parameter('custom_param', param)
```

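- This creates a trainable tensor that is not tied to a specific layer.
- Wrapping the tensor in `nn.Parameter` marks it as a parameter (`requires_grad=True` is the default), so it is included in `.parameters()` once registered.
- `register_parameter()` adds it to the model manually, under the name `custom_param`.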
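As a rough illustration of what gets registered, the sketch below compares three ways of attaching a tensor to a module. The class `WithParams` and its attribute names are hypothetical:

```python
import torch
import torch.nn as nn


class WithParams(nn.Module):  # hypothetical example class
    def __init__(self, size):
        super().__init__()
        # Explicit registration
        self.register_parameter("explicit", nn.Parameter(torch.randn(size)))
        # Assigning an nn.Parameter to an attribute registers it automatically
        self.implicit = nn.Parameter(torch.randn(size))
        # A plain tensor attribute is NOT registered as a parameter
        self.plain = torch.randn(size)


m = WithParams(2)
registered = [name for name, _ in m.named_parameters()]
print(registered)  # ['explicit', 'implicit']
```

The plain tensor is invisible to `.parameters()`, `.state_dict()`, and `.to(device)`, which is a common source of bugs.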

---

## 3. Defining the Forward Pass

```
def forward(self, x):
    x = self.fc1(x)
    x = self.relu(x)
    x = self.fc2(x)
    return x
```

- This defines how the input `x` flows through the model.
- Layers are applied in sequence: Linear -> ReLU -> Linear.
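The sketch below runs a small batch through an equivalent network and checks the output shape. It also suggests why the model is normally called as `net(batch)` rather than `net.forward(batch)`: only the former goes through `__call__`, which runs registered hooks. `TinyNet` is a hypothetical stand-in for `MyModel`:

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):  # hypothetical stand-in for MyModel
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        return self.fc2(self.relu(self.fc1(x)))


net = TinyNet(4, 8, 2)
batch = torch.randn(5, 4)  # 5 samples, 4 features each

# Record the output shape every time the hook fires
seen_shapes = []
handle = net.register_forward_hook(
    lambda module, inputs, output: seen_shapes.append(tuple(output.shape))
)

out = net(batch)    # goes through __call__, so the hook fires
net.forward(batch)  # bypasses __call__, so the hook does not fire
handle.remove()

print(out.shape)    # torch.Size([5, 2])
print(seen_shapes)  # [(5, 2)]
```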




## 4. Using the Model

### Instantiate the Model

```
model = MyModel(input_size=4, hidden_size=8, output_size=2)
```
Creates an instance of your model with specified dimensions.

---

### Move to GPU or CPU

```
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
```
- Moves the model's parameters and buffers to a GPU (if available) for faster computation.
- Otherwise, the model stays on the CPU.
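One detail worth checking: `Module.to()` modifies the module in place and returns the same object, whereas `Tensor.to()` returns a tensor that must be reassigned. A small sketch, using a single `nn.Linear` layer with an added `running_mean` buffer as an illustrative assumption:

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

layer = nn.Linear(4, 2)
layer.register_buffer("running_mean", torch.zeros(2))

# Module.to() moves parameters and buffers in place and returns the module
returned = layer.to(device)
same_object = returned is layer

# Tensor.to() is not in place: keep the returned tensor
x = torch.randn(1, 4)
x = x.to(device)

print(same_object)
print(layer.weight.device, layer.running_mean.device, x.device)
```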

---

### Set to Training Mode

```
model.train()
print(f"In training mode? {model.training}")
```

- `.train()` enables training-specific behavior in layers such as Dropout and BatchNorm. This model contains neither, so the mode does not change its output.
- `model.training` is `True` when the model is in training mode.
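To see the mode actually change behavior, the sketch below uses a standalone `nn.Dropout` layer, which is not part of `MyModel` and is introduced here only for illustration. In training mode roughly half the elements are zeroed and the rest are scaled by `1 / (1 - p)`; in evaluation mode the layer is the identity:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.5)
x = torch.ones(1000)

drop.train()
train_out = drop(x)  # some elements zeroed, survivors scaled to 2.0

drop.eval()
eval_out = drop(x)   # unchanged input

print("zeros in train mode:", int((train_out == 0).sum()))
print("identity in eval mode:", torch.equal(eval_out, x))
```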

---

### Dummy Input + Forward Pass

```
x = torch.randn(1, 4).to(device)
output = model(x)
print("Output:", output)
```
- Creates a random input tensor of shape `(1, 4)`: a batch of one sample with 4 features.
- Passes it through the model.

---

### Inspect Parameters

```
for name, param in model.named_parameters():
    print(f"{name}: {param.shape}")
```

- Lists all registered parameters (all of them trainable here), including manually registered ones.
- Each parameter’s name and shape is printed.
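A common follow-up is counting parameters. The sketch below rebuilds the same model and sums `numel()` over its parameters; freezing `fc1` afterwards is an illustrative assumption, not something the original example does:

```python
import torch
import torch.nn as nn


class MyModel(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.add_module("fc1", nn.Linear(input_size, hidden_size))
        self.add_module("relu", nn.ReLU())
        self.add_module("fc2", nn.Linear(hidden_size, output_size))
        self.register_parameter("custom_param", nn.Parameter(torch.randn(output_size)))

    def forward(self, x):
        return self.fc2(self.relu(self.fc1(x)))


model = MyModel(input_size=4, hidden_size=8, output_size=2)

# fc1: 4*8 + 8 = 40, fc2: 8*2 + 2 = 18, custom_param: 2
total = sum(p.numel() for p in model.parameters())

# Freeze fc1 so that it is no longer trainable
for p in model.fc1.parameters():
    p.requires_grad_(False)

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)

print(total)      # 60
print(trainable)  # 20
```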

---

### List Layers (Children)

```
for child in model.children():
    print(child)
```
- `.children()` returns the immediate child modules (not nested ones).

---

### List All Modules (Recursively)

```
for module in model.modules():
    print(module)
```
Lists all submodules, including nested ones and the model itself.
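In `MyModel` every layer is a direct child, so the two methods differ only by the model itself. The difference is clearer with nesting. The sketch below uses a hypothetical nested `nn.Sequential` to show it:

```python
import torch.nn as nn

# Hypothetical nested model: the inner Sequential is a single child
nested = nn.Sequential(
    nn.Linear(4, 8),
    nn.Sequential(nn.ReLU(), nn.Linear(8, 2)),
)

child_types = [type(m).__name__ for m in nested.children()]
module_types = [type(m).__name__ for m in nested.modules()]

print(child_types)   # ['Linear', 'Sequential']
print(module_types)  # ['Sequential', 'Linear', 'Sequential', 'ReLU', 'Linear']
```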

---

### Save Model State

```
state = model.state_dict()
print(state.keys())
```
- `.state_dict()` returns a dictionary of all learnable parameters and buffers.
- You can pass this dictionary to `torch.save()` to store the model's state on disk.
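A minimal sketch of writing a state dict to disk and reading it back. The file name `model_state.pt`, the temporary directory, and the single `nn.Linear` model are assumptions made for illustration:

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Linear(4, 2)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model_state.pt")  # hypothetical file name
    torch.save(model.state_dict(), path)
    # map_location lets a checkpoint saved on GPU load on a CPU-only machine
    loaded_state = torch.load(path, map_location="cpu")

# A fresh model with the same architecture receives the saved values
restored = nn.Linear(4, 2)
restored.load_state_dict(loaded_state)

print(torch.equal(restored.weight, model.weight))
```

Saving the state dict rather than the whole model object keeps the checkpoint independent of the class definition's location in your code.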

---

### Load Model State

```
model.load_state_dict(state)
```
Loads the saved parameters and buffers back into the model.
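By default `load_state_dict()` requires the keys to match exactly. When they do not, for instance when loading a partial checkpoint, `strict=False` loads what matches and reports the rest. The two `nn.Sequential` models below are hypothetical:

```python
import torch.nn as nn

# Hypothetical models: 'target' has one more layer than 'source'
source = nn.Sequential(nn.Linear(4, 8), nn.ReLU())
target = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

# strict=False loads the matching keys and reports the mismatches
result = target.load_state_dict(source.state_dict(), strict=False)

print(result.missing_keys)     # ['2.weight', '2.bias']
print(result.unexpected_keys)  # []
```

With the default `strict=True`, the same call would raise a `RuntimeError` listing the missing keys.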

---

### Switch to Evaluation Mode

```
model.eval()
print(model.training)
```
- `.eval()` switches the model to evaluation mode (e.g., disables Dropout).
- `model.training` is now `False`.
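Note that `.eval()` does not turn off gradient tracking. For inference it is usually combined with `torch.no_grad()`. A short sketch, using a hypothetical `nn.Sequential` model in place of `MyModel`:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
model.eval()

x = torch.randn(1, 4)

# eval() alone: autograd still records the computation
out_tracked = model(x)

# no_grad(): no graph is built, which saves memory during inference
with torch.no_grad():
    out_untracked = model(x)

print(out_tracked.requires_grad)    # True
print(out_untracked.requires_grad)  # False
```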

---

### Access Buffer & Custom Parameter

```
print(model.running_mean)
print(model.custom_param)
```
- `running_mean` is the registered buffer.
- `custom_param` is the manually added trainable parameter.
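One consequence of the model as written: `custom_param` is registered but never used in `forward()`, so it receives no gradient. The sketch below rebuilds the same model and checks this after a backward pass; using `out.sum()` as a stand-in loss is an assumption for illustration:

```python
import torch
import torch.nn as nn


class MyModel(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.add_module("fc1", nn.Linear(input_size, hidden_size))
        self.add_module("relu", nn.ReLU())
        self.add_module("fc2", nn.Linear(hidden_size, output_size))
        self.register_buffer("running_mean", torch.zeros(output_size))
        self.register_parameter("custom_param", nn.Parameter(torch.randn(output_size)))

    def forward(self, x):
        return self.fc2(self.relu(self.fc1(x)))


model = MyModel(input_size=4, hidden_size=8, output_size=2)
out = model(torch.randn(1, 4))
out.sum().backward()  # stand-in loss, for illustration only

fc2_has_grad = model.fc2.weight.grad is not None
custom_has_grad = model.custom_param.grad is not None

print(fc2_has_grad)     # True
print(custom_has_grad)  # False
```

For the parameter to be trained, it would have to take part in the forward computation, for example by being added to the output.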

---


### Example

In [3]:
import torch
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(MyModel, self).__init__()
        
        # Add modules manually
        self.add_module('fc1', nn.Linear(input_size, hidden_size))
        self.add_module('relu', nn.ReLU())
        self.add_module('fc2', nn.Linear(hidden_size, output_size))

        # Register a buffer (e.g. running mean – not a parameter)
        self.register_buffer('running_mean', torch.zeros(output_size))

        # Register a custom parameter (manually)
        param = nn.Parameter(torch.randn(output_size), requires_grad=True)
        self.register_parameter('custom_param', param)

    def forward(self, x):
        # Forward pass using manually added modules
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return x


In [4]:

model = MyModel(input_size=4, hidden_size=8, output_size=2)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

model.train()
print(f"In training mode? {model.training}")

x = torch.randn(1, 4).to(device)

output = model(x)

In training mode? True


In [5]:

print("Output:", output)

print("\nTrainable Parameters:")
for name, param in model.named_parameters():
    print(f"{name}: {param.shape}")

print("\nImmediate Children:")
for child in model.children():
    print(child)

print("\nAll Modules:")
for module in model.modules():
    print(module)

state = model.state_dict()
print("\nState Dict Keys:")
print(state.keys())

model.load_state_dict(state)
model.eval()
print(f"\nIn training mode after .eval()? {model.training}")

print("\nRegistered Buffer:")
print(model.running_mean)

print("\nCustom Registered Parameter:")
print(model.custom_param)


Output: tensor([[-0.1096, -0.0187]], grad_fn=<AddmmBackward0>)

Trainable Parameters:
custom_param: torch.Size([2])
fc1.weight: torch.Size([8, 4])
fc1.bias: torch.Size([8])
fc2.weight: torch.Size([2, 8])
fc2.bias: torch.Size([2])

Immediate Children:
Linear(in_features=4, out_features=8, bias=True)
ReLU()
Linear(in_features=8, out_features=2, bias=True)

All Modules:
MyModel(
  (fc1): Linear(in_features=4, out_features=8, bias=True)
  (relu): ReLU()
  (fc2): Linear(in_features=8, out_features=2, bias=True)
)
Linear(in_features=4, out_features=8, bias=True)
ReLU()
Linear(in_features=8, out_features=2, bias=True)

State Dict Keys:
odict_keys(['custom_param', 'running_mean', 'fc1.weight', 'fc1.bias', 'fc2.weight', 'fc2.bias'])

In training mode after .eval()? False

Registered Buffer:
tensor([0., 0.])

Custom Registered Parameter:
Parameter containing:
tensor([-0.1846, -0.4407], requires_grad=True)
