<div style="text-align:left;">
  <a href="https://code213.tech/" target="_blank">
    <img src="../code213.PNG" alt="code213">
  </a>
  <p><em>prepared by Latreche Sara</em></p>
</div>

# Neural Network Modules  

PyTorch provides the **`torch.nn`** module, which contains:  
- **Predefined layers** (linear, convolution, dropout, etc.).  
- **Activation functions** (ReLU, Sigmoid, Tanh, etc.).  
- **Loss functions** (MSE, CrossEntropy, etc.).  
- **Model containers** (e.g., `nn.Sequential`, custom `nn.Module` classes).  

The `torch.nn.Module` class is the **base class** for all neural network models in PyTorch.  
Every custom neural network you build will extend this class and implement the `forward()` method.  

This notebook shows, step by step, how to use **`nn.Module`** to build neural networks.  


## Table of Contents  

- [1 - Introduction to `nn.Module`](#1)  
- [2 - Linear Layers](#2)  
- [3 - Activation Functions](#3)  
- [4 - Building a Model with `nn.Sequential`](#4)  
- [5 - Defining Custom Models with `nn.Module`](#5)  
- [6 - Loss Functions](#6)  
- [7 - Practice Exercises](#7)  


<a name='1'></a>
## 1 - Introduction to `nn.Module`  

In PyTorch, all neural networks are built using the base class **`torch.nn.Module`**.  
- A **module** can be a single layer (e.g., `nn.Linear`) or a complete model.  
- When you create your own network, you subclass `nn.Module` and define the layers inside `__init__()`.  
- The forward pass of the network is defined inside the `forward()` method. 

In [1]:
import torch
import torch.nn as nn

class SimpleModel(nn.Module):
    def __init__(self):
        super(SimpleModel, self).__init__()
        self.linear = nn.Linear(2, 1)  # Linear layer with 2 inputs and 1 output

    def forward(self, x):
        return self.linear(x)

# Create an instance
model = SimpleModel()

# Input: batch of 3 samples, each with 2 features
x = torch.tensor([[1.0, 2.0],
                  [2.0, 3.0],
                  [3.0, 4.0]])

# Forward pass
output = model(x)
print("Model output:\n", output)

Model output:
 tensor([[-0.0127],
        [-0.0882],
        [-0.1637]], grad_fn=<AddmmBackward0>)
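
Because the layer is assigned as an attribute inside `__init__()`, the module registers it and tracks its weights. The sketch below repeats the same `SimpleModel` definition so that it runs on its own, then lists the registered parameters with `named_parameters()`. The variable names `demo_model` and `num_params` are chosen for illustration only.

```python
import torch
import torch.nn as nn

class SimpleModel(nn.Module):
    def __init__(self):
        super().__init__()
        # Assigning a layer as an attribute registers it as a submodule
        self.linear = nn.Linear(2, 1)

    def forward(self, x):
        return self.linear(x)

demo_model = SimpleModel()  # illustrative name, not from the notebook

# named_parameters() yields (name, tensor) pairs for every registered parameter
for name, param in demo_model.named_parameters():
    print(name, tuple(param.shape), "requires_grad =", param.requires_grad)

# 2 weights + 1 bias for a Linear(2, 1) layer
num_params = sum(p.numel() for p in demo_model.parameters())
print("Total parameters:", num_params)
```

The parameter names are prefixed with the attribute name (`linear.`), which is how nested modules stay distinguishable in larger models.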


<a name='2'></a>
## 2 - Linear Layers  

The **linear layer** (also called a fully connected or dense layer) is one of the most fundamental building blocks in neural networks.  

In PyTorch, it is implemented as:  

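```python
nn.Linear(in_features, out_features)
```

The layer applies an affine transformation to each input row:

$$
y = xW^{T} + b
$$

where the weight matrix $W$ has shape `(out_features, in_features)` and the bias $b$ has shape `(out_features,)`.
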
In [2]:
import torch
import torch.nn as nn

# Linear layer with 3 input features → 2 output features
linear = nn.Linear(3, 2)

# Random input: batch of 4 samples, each with 3 features
x = torch.randn(4, 3)

# Forward pass
y = linear(x)

print("Input:\n", x)
print("Output:\n", y)
print("Weights:\n", linear.weight)
print("Bias:\n", linear.bias)

Input:
 tensor([[-0.4059,  0.2126,  0.0979],
        [-0.2269, -0.1264, -0.8216],
        [ 1.0402, -0.0248,  1.8817],
        [-0.7757,  0.3093, -0.5460]])
Output:
 tensor([[-0.4505, -0.2330],
        [-0.5220,  0.1017],
        [-0.2077, -0.8706],
        [-0.4882,  0.0284]], grad_fn=<AddmmBackward0>)
Weights:
 Parameter containing:
tensor([[ 0.2937,  0.4725, -0.0392],
        [ 0.1800,  0.3476, -0.4571]], requires_grad=True)
Bias:
 Parameter containing:
tensor([-0.4279, -0.1891], requires_grad=True)
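
To connect the printed weights and bias to the output, the following sketch recomputes the layer's result by hand as `x @ W.T + b` and compares it with the layer's own output. The fixed seed and the names `y_layer` and `y_manual` are illustrative assumptions, so the numbers will not match the run above.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed so repeated runs give the same values

linear = nn.Linear(3, 2)
x = torch.randn(4, 3)  # batch of 4 samples, 3 features each

y_layer = linear(x)                            # output from the layer
y_manual = x @ linear.weight.T + linear.bias   # same computation by hand

print("weight shape:", tuple(linear.weight.shape))  # (out_features, in_features)
print("bias shape:", tuple(linear.bias.shape))      # (out_features,)
print("outputs match:", torch.allclose(y_layer, y_manual, atol=1e-6))
```

Note that the weight is stored as `(out_features, in_features)`, which is why the manual version needs the transpose.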


<a name='3'></a>
## 3 - Activation Functions  

Activation functions introduce **non-linearity** into neural networks, which allows them to model complex relationships.  
In PyTorch, many activations are available in `torch.nn`.  

### Common Activation Functions  

1. **ReLU (Rectified Linear Unit)**  
$$
f(x) = \max(0, x)
$$  

2. **Sigmoid**  
$$
f(x) = \frac{1}{1 + e^{-x}}
$$  

3. **Tanh (Hyperbolic Tangent)**  
$$
f(x) = \tanh(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}
$$  


In [3]:
# Sample input
x = torch.tensor([[-1.0, 0.0, 1.0, 2.0]])

# Define activations
relu = nn.ReLU()
sigmoid = nn.Sigmoid()
tanh = nn.Tanh()

print("Input:", x)
print("ReLU:", relu(x))
print("Sigmoid:", sigmoid(x))
print("Tanh:", tanh(x))

Input: tensor([[-1.,  0.,  1.,  2.]])
ReLU: tensor([[0., 0., 1., 2.]])
Sigmoid: tensor([[0.2689, 0.5000, 0.7311, 0.8808]])
Tanh: tensor([[-0.7616,  0.0000,  0.7616,  0.9640]])
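
As a quick check of the formulas above, this sketch evaluates each one directly with tensor operations and compares the result against PyTorch's built-in functions. The `*_manual` names are introduced here for illustration only.

```python
import torch
import torch.nn.functional as F

x = torch.tensor([[-1.0, 0.0, 1.0, 2.0]])

# ReLU: max(0, x), applied elementwise
relu_manual = torch.maximum(x, torch.zeros_like(x))

# Sigmoid: 1 / (1 + e^{-x})
sigmoid_manual = 1 / (1 + torch.exp(-x))

# Tanh: (e^x - e^{-x}) / (e^x + e^{-x})
tanh_manual = (torch.exp(x) - torch.exp(-x)) / (torch.exp(x) + torch.exp(-x))

print("ReLU matches:", torch.allclose(relu_manual, F.relu(x)))
print("Sigmoid matches:", torch.allclose(sigmoid_manual, torch.sigmoid(x), atol=1e-6))
print("Tanh matches:", torch.allclose(tanh_manual, torch.tanh(x), atol=1e-6))
```

In practice the built-in versions are preferable, since writing the formulas out with `torch.exp` can overflow for inputs of large magnitude.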


<a name='4'></a>
## 4 - Building a Model with `nn.Sequential`

In PyTorch, the **`nn.Sequential`** container allows us to build a model by stacking layers in order.  
This is useful for **simple feed-forward neural networks**.

---

### Example: A Small Neural Network

Suppose we want a model:

- Input size = 4  
- Hidden layer with 8 neurons (ReLU activation)  
- Output size = 2 (e.g., binary classification)

In [4]:
import torch.nn as nn

# Define the model
model = nn.Sequential(
    nn.Linear(4, 8),   # Layer 1: Linear transformation (4 → 8)
    nn.ReLU(),         # Activation function
    nn.Linear(8, 2)    # Layer 2: Linear transformation (8 → 2)
)

print(model)

Sequential(
  (0): Linear(in_features=4, out_features=8, bias=True)
  (1): ReLU()
  (2): Linear(in_features=8, out_features=2, bias=True)
)


In [5]:
# Example: forward pass

# Sample input: batch of 3 samples, each with 4 features
x = torch.randn(3, 4)

# Pass through the model
output = model(x)

print("Input:", x)
print("Output:", output)

Input: tensor([[-0.6535,  1.1033, -0.4188,  0.6314],
        [-0.1038, -0.0235, -0.6025,  1.2412],
        [-0.5209,  1.6439, -0.9290,  1.0729]])
Output: tensor([[0.3381, 0.5965],
        [0.3853, 0.5629],
        [0.4075, 0.7071]], grad_fn=<AddmmBackward0>)
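
Since `nn.Sequential` simply applies its layers in order, the same result can be obtained by calling each layer in turn. The sketch below also passes an `OrderedDict` so the layers get readable names instead of `0`, `1`, `2`. The layer names `hidden`, `act`, and `out`, and the variable `named_model`, are illustrative choices.

```python
from collections import OrderedDict

import torch
import torch.nn as nn

# Same architecture as above, but with named layers
named_model = nn.Sequential(OrderedDict([
    ("hidden", nn.Linear(4, 8)),
    ("act", nn.ReLU()),
    ("out", nn.Linear(8, 2)),
]))

x = torch.randn(3, 4)

# Apply the layers one at a time and watch the shape change
h = x
for name, layer in named_model.named_children():
    h = layer(h)
    print(f"after {name}: {tuple(h.shape)}")

# The step-by-step result agrees with calling the container directly
print("same result:", torch.allclose(h, named_model(x)))

# Layers can be accessed by position or by name
print(named_model[0])
print(named_model.out)
```

Inspecting intermediate shapes this way is a convenient first step when a stacked model raises a shape mismatch error.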


<a name='5'></a>
## 5 - Defining Custom Models with `nn.Module`

While `nn.Sequential` is convenient for simple networks, it becomes **limiting** when we need:  
- Multiple inputs/outputs  
- Custom operations inside the forward pass  
- More complex architectures  

In such cases, we subclass **`nn.Module`**.



In [6]:
import torch.nn.functional as F

# Define a custom model by subclassing nn.Module
class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.fc1 = nn.Linear(4, 8)   # First layer
        self.fc2 = nn.Linear(8, 2)   # Second layer
    
    def forward(self, x):
        # Define forward pass
        x = F.relu(self.fc1(x))   # Apply ReLU after first layer
        x = self.fc2(x)           # Output layer
        return x

# Instantiate the model
model = MyModel()
print(model)

MyModel(
  (fc1): Linear(in_features=4, out_features=8, bias=True)
  (fc2): Linear(in_features=8, out_features=2, bias=True)
)


In [7]:
# Example: forward pass
x = torch.randn(3, 4)  # 3 samples, 4 features each
output = model(x)

print("Input:", x)
print("Output:", output)


Input: tensor([[-0.2067,  0.6353,  0.7016,  0.5031],
        [ 0.7267, -0.0069,  2.0131, -0.2274],
        [-0.9513, -1.0209, -0.0299, -0.2787]])
Output: tensor([[ 0.0930, -0.3943],
        [ 0.0132, -0.5313],
        [ 0.3193, -0.2644]], grad_fn=<AddmmBackward0>)
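
To illustrate the "multiple inputs" case mentioned at the start of this section, here is a minimal sketch of a model whose `forward()` takes two tensors, processes each in its own branch, and concatenates the results. The class name `TwoInputModel`, the branch sizes, and the input shapes are all hypothetical and chosen only for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoInputModel(nn.Module):  # hypothetical model for illustration
    def __init__(self):
        super().__init__()
        self.branch_a = nn.Linear(4, 8)   # processes the first input (4 features)
        self.branch_b = nn.Linear(3, 8)   # processes the second input (3 features)
        self.head = nn.Linear(16, 2)      # combines both branches (8 + 8 = 16)

    def forward(self, a, b):
        h_a = F.relu(self.branch_a(a))
        h_b = F.relu(self.branch_b(b))
        h = torch.cat([h_a, h_b], dim=1)  # concatenate along the feature dimension
        return self.head(h)

two_input_model = TwoInputModel()
a = torch.randn(3, 4)  # 3 samples, 4 features
b = torch.randn(3, 3)  # 3 samples, 3 features
out = two_input_model(a, b)
print("Output shape:", tuple(out.shape))
```

A plain `nn.Sequential` passes a single value from one layer to the next, so a branching forward pass like this one is more naturally written as a custom module.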


<a name='6'></a>
## 6 - Loss Functions

Loss functions measure how well the model's predictions match the target labels.  
In PyTorch, loss functions are available in **`torch.nn`**.

---

### Common Loss Functions

1. **Mean Squared Error (MSE)** – for regression:

$$
\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
$$


In [8]:
import torch
import torch.nn as nn

# Sample predictions and targets
y_pred = torch.tensor([2.0, 3.0, 4.0])
y_true = torch.tensor([1.5, 2.5, 4.0])

# Define MSE loss
mse_loss = nn.MSELoss()
loss = mse_loss(y_pred, y_true)
print("MSE Loss:", loss.item())

MSE Loss: 0.1666666716337204
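
The value above can be reproduced directly from the MSE formula. The sketch below computes it by hand for the same tensors and also shows the `reduction="sum"` option, which returns the summed squared error instead of the mean. The names `mse_manual`, `mse_builtin`, and `sse` are illustrative.

```python
import torch
import torch.nn as nn

y_pred = torch.tensor([2.0, 3.0, 4.0])
y_true = torch.tensor([1.5, 2.5, 4.0])

# Squared errors are 0.25, 0.25 and 0.0, so the mean is 0.5 / 3
mse_manual = ((y_true - y_pred) ** 2).mean()
mse_builtin = nn.MSELoss()(y_pred, y_true)

# reduction="sum" skips the division by n
sse = nn.MSELoss(reduction="sum")(y_pred, y_true)

print("Manual MSE: ", mse_manual.item())
print("Built-in MSE:", mse_builtin.item())
print("Sum of squared errors:", sse.item())
```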


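2. **Cross-Entropy Loss** – for classification:

$$
\text{CE} = -\frac{1}{n} \sum_{i=1}^{n} \log \frac{e^{z_{i, y_i}}}{\sum_{j} e^{z_{i, j}}}
$$

Here $z_{i,j}$ is the logit (raw score) for sample $i$ and class $j$, and $y_i$ is the target class index. `nn.CrossEntropyLoss` applies the softmax internally, so the model should output raw logits rather than probabilities.

In [9]: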
# Sample logits (not probabilities) for 3 classes and batch of 2
logits = torch.tensor([[2.0, 0.5, 1.0],
                       [1.0, 3.0, 0.2]])
labels = torch.tensor([0, 1])  # Target class indices

# Define cross-entropy loss
ce_loss = nn.CrossEntropyLoss()
loss = ce_loss(logits, labels)
print("Cross-Entropy Loss:", loss.item())


Cross-Entropy Loss: 0.32173651456832886
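
To see what `nn.CrossEntropyLoss` does with the logits, this sketch reproduces the value by hand: take the log-softmax of each row, pick the entry belonging to the target class, and average the negatives. The names `log_probs`, `picked`, `ce_manual`, and `ce_builtin` are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

logits = torch.tensor([[2.0, 0.5, 1.0],
                       [1.0, 3.0, 0.2]])
labels = torch.tensor([0, 1])  # target class indices

# Log-probabilities for every class, computed row by row
log_probs = F.log_softmax(logits, dim=1)

# Select the log-probability of the correct class for each sample
picked = log_probs[torch.arange(len(labels)), labels]

# Cross-entropy is the mean negative log-probability of the correct class
ce_manual = -picked.mean()
ce_builtin = nn.CrossEntropyLoss()(logits, labels)

print("Manual cross-entropy: ", ce_manual.item())
print("Built-in cross-entropy:", ce_builtin.item())
```

Both values should agree with the result printed above, up to floating-point rounding.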


<a name='7'></a>
## 7 - Practice Exercises

Try the following exercises to reinforce your understanding of **nn.Module, layers, and loss functions**.



### **Exercise 1: Build a Simple Model**
- Create a custom model using `nn.Module` with:  
  - Input size = 3  
  - Hidden layer = 5 neurons, ReLU activation  
  - Output layer = 2 neurons  
- Print the model summary.  



### **Exercise 2: Forward Pass**
- Create a random input tensor with shape `[4, 3]` (batch of 4 samples, 3 features).  
- Pass it through your model and print the output.  



### **Exercise 3: Compute Loss**
- Assume the target labels for your batch are `[0, 1, 1, 0]`.  
- Use `nn.CrossEntropyLoss()` to compute the loss between model output and target labels.  



### **Exercise 4 (Optional Challenge)**
- Modify your model to add a second hidden layer, so the network becomes 3 → 5 → 4 → 2.  
- Use ReLU after each hidden layer.  
- Perform a forward pass and compute the loss.


In [10]:
import torch
import torch.nn as nn
import torch.nn.functional as F

# ----------------------------
# Exercise 1: Build a Simple Model
# ----------------------------
class SimpleModel(nn.Module):
    def __init__(self):
        super(SimpleModel, self).__init__()
        self.fc1 = nn.Linear(3, 5)   # Input 3 → Hidden 5
        self.fc2 = nn.Linear(5, 2)   # Hidden 5 → Output 2

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x

model = SimpleModel()
print("Exercise 1 - Model Summary:\n", model)

# ----------------------------
# Exercise 2: Forward Pass
# ----------------------------
x = torch.randn(4, 3)  # Batch of 4 samples, 3 features
output = model(x)
print("\nExercise 2 - Forward Pass Output:\n", output)

# ----------------------------
# Exercise 3: Compute Loss
# ----------------------------
# Target labels for the batch
labels = torch.tensor([0, 1, 1, 0])

# Define CrossEntropyLoss
loss_fn = nn.CrossEntropyLoss()
loss = loss_fn(output, labels)
print("\nExercise 3 - CrossEntropy Loss:", loss.item())

# ----------------------------
# Exercise 4 (Optional Challenge)
# ----------------------------
class ExtendedModel(nn.Module):
    def __init__(self):
        super(ExtendedModel, self).__init__()
        self.fc1 = nn.Linear(3, 5)
        self.fc2 = nn.Linear(5, 4)
        self.fc3 = nn.Linear(4, 2)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

extended_model = ExtendedModel()
output_extended = extended_model(x)
loss_extended = loss_fn(output_extended, labels)

print("\nExercise 4 - Extended Model Output:\n", output_extended)
print("Exercise 4 - CrossEntropy Loss:", loss_extended.item())


Exercise 1 - Model Summary:
 SimpleModel(
  (fc1): Linear(in_features=3, out_features=5, bias=True)
  (fc2): Linear(in_features=5, out_features=2, bias=True)
)

Exercise 2 - Forward Pass Output:
 tensor([[ 0.1645, -0.0315],
        [ 0.2513, -0.0101],
        [ 0.2480,  0.0928],
        [-0.5656, -0.8524]], grad_fn=<AddmmBackward0>)

Exercise 3 - CrossEntropy Loss: 0.6915081143379211

Exercise 4 - Extended Model Output:
 tensor([[ 0.1505, -0.0812],
        [ 0.1052, -0.0647],
        [ 0.1829, -0.0658],
        [-0.0662,  0.0961]], grad_fn=<AddmmBackward0>)
Exercise 4 - CrossEntropy Loss: 0.742132842540741
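
As an alternative take on Exercise 4, the same 3 → 5 → 4 → 2 architecture can also be written with `nn.Sequential`, since the forward pass is a straight chain of layers. This is a sketch rather than the reference solution. The name `extended_seq` is illustrative, and because the weights are randomly initialized the printed loss will differ from the values above.

```python
import torch
import torch.nn as nn

# Same layer sizes as ExtendedModel, expressed as a sequential stack
extended_seq = nn.Sequential(
    nn.Linear(3, 5),
    nn.ReLU(),
    nn.Linear(5, 4),
    nn.ReLU(),
    nn.Linear(4, 2),
)

x = torch.randn(4, 3)                 # batch of 4 samples, 3 features
labels = torch.tensor([0, 1, 1, 0])   # target class indices

logits = extended_seq(x)
loss = nn.CrossEntropyLoss()(logits, labels)

print("Output shape:", tuple(logits.shape))
print("CrossEntropy Loss:", loss.item())
```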
