## **Creating Models with `torch.nn`**

The `torch.nn` module provides a flexible way to define and train neural networks. In this section, we will cover:
- Subclassing `nn.Module`
- Understanding `__init__` and `forward`
- Using the model for inference
- Model attributes and methods
- Exploring other useful classes in `torch.nn`

### **Subclassing `nn.Module`**
All PyTorch models should subclass `torch.nn.Module`. Subclassing lets PyTorch register the model's parameters and submodules so that autograd can compute their gradients, and it provides built-in methods such as `.train()` and `.eval()`.

#### **Basic Structure of a PyTorch Model**

In [1]:
import torch
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()  # Call nn.Module's constructor
        self.layer1 = nn.Linear(4, 5)  # Fully connected layer
        self.activation = nn.ReLU()  # Activation function
        self.layer2 = nn.Linear(5, 2)  # Output layer

    def forward(self, x):
        x = self.layer1(x)
        x = self.activation(x)
        x = self.layer2(x)
        return x

# Create an instance of the model
model = MyModel()
print(model)

MyModel(
  (layer1): Linear(in_features=4, out_features=5, bias=True)
  (activation): ReLU()
  (layer2): Linear(in_features=5, out_features=2, bias=True)
)


#### **Key Parts:**
1. **`__init__(self)`**: 
   - Defines layers as instance variables.
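   - Calls `super().__init__()` first so that `nn.Module` can set up the internal state it uses to register layers and parameters (`super(MyModel, self).__init__()` in the code above is the equivalent older spelling).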

2. **`forward(self, x)`**:
   - Defines how input data flows through the layers.
   - Each layer is applied in order, and PyTorch runs this method whenever the model is called, as in `model(x)`.
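
As a quick check of what `__init__` registers, the sketch below rebuilds the same two-layer architecture and counts its trainable parameters. The class name `TinyModel` and the variables `child_names` and `total` are illustrative assumptions, not part of the original notebook.

```python
import torch
import torch.nn as nn

class TinyModel(nn.Module):  # hypothetical name; mirrors MyModel above
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(4, 5)
        self.activation = nn.ReLU()
        self.layer2 = nn.Linear(5, 2)

    def forward(self, x):
        return self.layer2(self.activation(self.layer1(x)))

tiny = TinyModel()

# Modules assigned as attributes in __init__ are registered as submodules.
child_names = [name for name, _ in tiny.named_children()]
print(child_names)  # ['layer1', 'activation', 'layer2']

# 4*5 weights + 5 biases + 5*2 weights + 2 biases = 37
total = sum(p.numel() for p in tiny.parameters())
print(total)  # 37
```

Because registration happens through attribute assignment, `model.parameters()` finds every weight and bias without any extra bookkeeping.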


### **Using the Model**
Once the model is defined, we can:
- **Pass input tensors** through it.
- **Check model parameters**.
- **Use different modes (training/evaluation)**.

In [2]:
# Sample input tensor (batch size of 1, 4 features)
x = torch.randn(1, 4)

# Forward pass
output = model(x)
print("Model Output:", output)

Model Output: tensor([[-0.0013, -0.4343]], grad_fn=<AddmmBackward0>)
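
The same call works for a whole batch of inputs, since the first dimension is treated as the batch dimension. The sketch below assumes an `nn.Sequential` stand-in (`net`) with the same layer sizes as `MyModel`, so that the block runs on its own.

```python
import torch
import torch.nn as nn

# Stand-in for MyModel above, written with nn.Sequential for brevity.
net = nn.Sequential(nn.Linear(4, 5), nn.ReLU(), nn.Linear(5, 2))

batch = torch.randn(8, 4)  # 8 samples, 4 features each
out = net(batch)           # calling the module runs its forward()
print(out.shape)           # torch.Size([8, 2])
```

Each of the 8 rows is transformed independently, so the output has one 2-element row per input sample.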


### **Model Attributes and Methods**
Every `nn.Module` exposes its parameters and its training/evaluation state through built-in attributes and methods.

#### **1. Checking Model Parameters**

In [3]:
# Print model parameters
for name, param in model.named_parameters():
    print(f"Layer: {name} | Size: {param.shape} | Requires Grad: {param.requires_grad}")

Layer: layer1.weight | Size: torch.Size([5, 4]) | Requires Grad: True
Layer: layer1.bias | Size: torch.Size([5]) | Requires Grad: True
Layer: layer2.weight | Size: torch.Size([2, 5]) | Requires Grad: True
Layer: layer2.bias | Size: torch.Size([2]) | Requires Grad: True


#### **2. Switching Between Training and Evaluation Modes**

In [5]:
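model.train()  # Training mode: layers such as Dropout and BatchNorm use their training behavior
model.eval()   # Evaluation mode: those layers switch to inference behavior; returns the model itself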

MyModel(
  (layer1): Linear(in_features=4, out_features=5, bias=True)
  (activation): ReLU()
  (layer2): Linear(in_features=5, out_features=2, bias=True)
)
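
`MyModel` has no mode-dependent layers, so switching modes does not change its output. To see the effect, the following sketch uses a standalone `nn.Dropout` layer (an assumption for illustration; it is not part of `MyModel`) and compares its behavior in both modes.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.5)
x = torch.ones(1000)

drop.train()
train_out = drop(x)  # roughly half the entries are zeroed, the rest scaled by 1/(1-p) = 2

drop.eval()
eval_out = drop(x)   # dropout passes the input through unchanged in evaluation mode

print(train_out.unique())        # tensor([0., 2.])
print(torch.equal(eval_out, x))  # True
```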

### **Saving and Loading Models**

In [7]:
# Save model weights
torch.save(model.state_dict(), "model.pth")

# Load model weights
model.load_state_dict(torch.load("model.pth", weights_only=True))

<All keys matched successfully>
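
A `state_dict` is an ordered mapping from parameter names to tensors, which is why loading reports on matching keys. The sketch below inspects one for a small model; `TwoLayer` is a hypothetical class with the same layer names as `MyModel`.

```python
import torch
import torch.nn as nn

class TwoLayer(nn.Module):  # hypothetical; same layer names as MyModel
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(4, 5)
        self.layer2 = nn.Linear(5, 2)

    def forward(self, x):
        return self.layer2(torch.relu(self.layer1(x)))

state = TwoLayer().state_dict()
for key, tensor in state.items():
    print(key, tuple(tensor.shape))
# layer1.weight (5, 4)
# layer1.bias (5,)
# layer2.weight (2, 5)
# layer2.bias (2,)
```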

### **Other Classes in `torch.nn`**
The `torch.nn` module contains many useful classes for building deep learning models. Here are some key ones:

#### **1. Linear Layers**
- `nn.Linear(in_features, out_features)`: Fully connected layer.

#### **2. Convolutional Layers**
- `nn.Conv1d, nn.Conv2d, nn.Conv3d`: 1D, 2D, and 3D convolutions.
- `nn.ConvTranspose2d`: Transposed convolution for upsampling.

#### **3. Recurrent Layers**
- `nn.RNN, nn.LSTM, nn.GRU`: Recurrent layers for sequential data.

#### **4. Normalization Layers**
- `nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d`: Batch normalization.
- `nn.LayerNorm, nn.GroupNorm`: Other normalization techniques.

#### **5. Dropout**
- `nn.Dropout(p)`: Zeroes each input element with probability `p` during training and scales the rest by `1/(1-p)`; inactive in evaluation mode.

#### **6. Embedding**
- `nn.Embedding(num_embeddings, embedding_dim)`: Lookup table for embeddings.

#### **7. Loss Functions**
- `nn.CrossEntropyLoss()`: Classification loss; expects raw logits, since it applies log-softmax internally.
- `nn.MSELoss()`: Mean squared error for regression.
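
As a small illustration of the classification loss listed above, the sketch below feeds random logits and integer class targets to `nn.CrossEntropyLoss` and compares the result with a manual log-softmax plus negative log-likelihood. The tensors are made-up example data.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(3, 4)         # 3 samples, 4 classes, unnormalized scores
targets = torch.tensor([0, 2, 1])  # one class index per sample (dtype int64)

loss = nn.CrossEntropyLoss()(logits, targets)

# Equivalent manual computation: log-softmax followed by negative log-likelihood
manual = F.nll_loss(F.log_softmax(logits, dim=1), targets)
print(torch.allclose(loss, manual))  # True
```

Since the softmax is applied inside the loss, a classifier trained this way should output logits directly rather than probabilities.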

### **Example: Defining a CNN**

In [8]:
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=3, stride=1, padding=1)
        self.relu = nn.ReLU()
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.fc1 = nn.Linear(16 * 14 * 14, 10)  # Assuming input is 28x28

    def forward(self, x):
        x = self.conv1(x)
        x = self.relu(x)
        x = self.pool(x)
        x = torch.flatten(x, start_dim=1)  # Flatten for fully connected layer
        x = self.fc1(x)
        return x

# Create CNN instance
cnn_model = CNN()
print(cnn_model)

CNN(
  (conv1): Conv2d(1, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (relu): ReLU()
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (fc1): Linear(in_features=3136, out_features=10, bias=True)
)
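
The `16 * 14 * 14` input size of `fc1` follows from the tensor shapes at each stage. The sketch below traces them for a batch of two 28x28 single-channel images, using standalone layers configured like those in `CNN`; the variable names are illustrative.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(1, 16, kernel_size=3, stride=1, padding=1)
pool = nn.MaxPool2d(kernel_size=2, stride=2)
fc = nn.Linear(16 * 14 * 14, 10)

x = torch.randn(2, 1, 28, 28)                  # (batch, channels, height, width)
after_conv = conv(x)                           # padding=1 keeps 28x28
after_pool = pool(torch.relu(after_conv))      # 2x2 pooling halves height and width
flat = torch.flatten(after_pool, start_dim=1)  # keep the batch dimension
logits = fc(flat)

print(tuple(after_conv.shape))  # (2, 16, 28, 28)
print(tuple(after_pool.shape))  # (2, 16, 14, 14)
print(tuple(flat.shape))        # (2, 3136)
print(tuple(logits.shape))      # (2, 10)
```

If the input resolution changes, the flattened size changes with it, and `fc1` has to be resized accordingly.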


## **Training a Model in PyTorch**

Training a neural network in PyTorch involves:
1. **Defining an optimizer** (`torch.optim`)
2. **Choosing a loss function** (`torch.nn`, for example `nn.MSELoss`)
3. **Writing a training loop** (forward, backward, and update steps)
4. **Evaluating the model**
5. **Saving and loading the model**

### **Defining an Optimizer**
An **optimizer** updates model parameters to minimize the loss.

#### **Common PyTorch Optimizers**
- `torch.optim.SGD` - Stochastic Gradient Descent
- `torch.optim.Adam` - Adaptive Moment Estimation
- `torch.optim.RMSprop` - Root Mean Square Propagation

#### **Creating an Optimizer**

In [9]:
import torch
import torch.nn as nn
import torch.optim as optim

# Define a simple model
class SimpleModel(nn.Module):
    def __init__(self):
        super(SimpleModel, self).__init__()
        self.fc = nn.Linear(2, 1)  # Single layer

    def forward(self, x):
        return self.fc(x)

# Instantiate model
model = SimpleModel()

# Define optimizer
optimizer = optim.Adam(model.parameters(), lr=0.01)

# Print optimizer details
print(optimizer)

Adam (
Parameter Group 0
    amsgrad: False
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.01
    maximize: False
    weight_decay: 0
)
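
To make concrete what an optimizer does to the parameters, the sketch below performs one update with plain `optim.SGD` and checks it against the rule `w <- w - lr * grad`. SGD is used here instead of Adam only because its update is easy to verify by hand; the data and variable names are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)
layer = nn.Linear(2, 1)
sgd = optim.SGD(layer.parameters(), lr=0.1)  # no momentum, no weight decay

x = torch.tensor([[1.0, 2.0]])
target = torch.tensor([[3.0]])

loss = ((layer(x) - target) ** 2).mean()
loss.backward()  # fills layer.weight.grad and layer.bias.grad

# Expected result of one plain SGD step: w <- w - lr * grad
expected_weight = layer.weight.detach() - 0.1 * layer.weight.grad

sgd.step()
print(torch.allclose(layer.weight.detach(), expected_weight))  # True
```

Adam follows the same `zero_grad` / `backward` / `step` pattern but rescales each update using running estimates of the gradient's first and second moments.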


### **Choosing a Loss Function**
A **loss function** measures how well the model’s predictions match the target values.

#### **Common Loss Functions**
- `nn.MSELoss()` → Mean Squared Error (Regression)
- `nn.CrossEntropyLoss()` → Classification
- `nn.BCELoss()` → Binary Cross-Entropy; expects probabilities in [0, 1], such as sigmoid output

#### **Example: Defining a Loss Function**

In [10]:
# Mean Squared Error Loss (for regression)
loss_fn = nn.MSELoss()
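
As a small worked example, the sketch below evaluates `nn.MSELoss` on two made-up predictions and reproduces the value by hand as the mean of the squared differences.

```python
import torch
import torch.nn as nn

pred = torch.tensor([[2.5], [0.0]])
target = torch.tensor([[3.0], [-1.0]])

loss_fn = nn.MSELoss()  # default reduction="mean"
loss = loss_fn(pred, target)
manual = ((pred - target) ** 2).mean()  # (0.25 + 1.0) / 2

print(loss.item())    # 0.625
print(manual.item())  # 0.625
```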

### **Writing a Training Loop**
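Because `backward()` accumulates gradients, each iteration of the training loop first clears the gradients left over from the previous iteration, then performs **three key steps**: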
1. **Forward Pass:** Compute model predictions.
2. **Backward Pass:** Compute gradients via backpropagation.
3. **Optimization Step:** Update model parameters.

#### **Training Loop Example**

In [11]:
# Sample dataset (inputs and targets)
X_train = torch.tensor([[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]], dtype=torch.float32)
y_train = torch.tensor([[3.0], [5.0], [7.0]], dtype=torch.float32)

# Training loop
epochs = 100
for epoch in range(epochs):
    optimizer.zero_grad()  # Clear previous gradients
    outputs = model(X_train)  # Forward pass
    loss = loss_fn(outputs, y_train)  # Compute loss
    loss.backward()  # Backpropagation
    optimizer.step()  # Update weights

    if epoch % 10 == 0:
        print(f"Epoch {epoch}, Loss: {loss.item()}")

Epoch 0, Loss: 19.230506896972656
Epoch 10, Loss: 14.225132942199707
Epoch 20, Loss: 10.100868225097656
Epoch 30, Loss: 6.879495620727539
Epoch 40, Loss: 4.49819803237915
Epoch 50, Loss: 2.8341054916381836
Epoch 60, Loss: 1.7368441820144653
Epoch 70, Loss: 1.0561864376068115
Epoch 80, Loss: 0.6604399085044861
Epoch 90, Loss: 0.4456111490726471
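
The loop above uses the full dataset in every step. For larger datasets the same pattern is usually run over mini-batches; the sketch below shows one way to do that with `TensorDataset` and `DataLoader`. The synthetic data, the choice of SGD, and all variable names are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic regression data (assumed for illustration): y = 2*x1 - x2 + 0.5
X = torch.randn(64, 2)
y = X @ torch.tensor([[2.0], [-1.0]]) + 0.5

loader = DataLoader(TensorDataset(X, y), batch_size=16, shuffle=True)
net = nn.Linear(2, 1)
opt = optim.SGD(net.parameters(), lr=0.1)
mse = nn.MSELoss()

epoch_losses = []
for epoch in range(20):
    running = 0.0
    for xb, yb in loader:        # one mini-batch per iteration
        opt.zero_grad()          # clear gradients from the previous batch
        loss = mse(net(xb), yb)  # forward pass and loss
        loss.backward()          # backward pass
        opt.step()               # parameter update
        running += loss.item() * xb.size(0)
    epoch_losses.append(running / len(X))  # sample-weighted mean loss for the epoch

print(f"first epoch: {epoch_losses[0]:.4f}, last epoch: {epoch_losses[-1]:.4f}")
```

The inner loop is the same forward, backward, and update sequence as before; only the source of the batches changes.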


### **Evaluating the Model**
Switch the model to **evaluation mode** and make predictions.

In [12]:
model.eval()  # Set to evaluation mode
X_test = torch.tensor([[4.0, 5.0]], dtype=torch.float32)
prediction = model(X_test)
print("Model Prediction:", prediction.item())

Model Prediction: 7.479829788208008
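
`model.eval()` changes layer behavior but does not turn off gradient tracking. For inference it is common to also wrap the forward pass in `torch.no_grad()`, as in the sketch below; `net` is an untrained stand-in for the model above.

```python
import torch
import torch.nn as nn

net = nn.Linear(2, 1)  # stand-in for the trained model above
net.eval()

x_new = torch.tensor([[4.0, 5.0]])

with_grad = net(x_new)         # autograd still records this operation
with torch.no_grad():
    without_grad = net(x_new)  # no computation graph is built inside this block

print(with_grad.requires_grad)     # True
print(without_grad.requires_grad)  # False
```

Skipping graph construction saves memory and time when no backward pass is needed.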


### **Saving and Loading a Model**
PyTorch provides two ways to save a model:

1. **Save only weights (recommended for inference):**

In [13]:
torch.save(model.state_dict(), "model_weights.pth")

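2. **Save the entire model (structure + weights; less recommended because the file is a pickle that depends on the original class definition being importable at load time):**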

In [14]:
torch.save(model, "full_model.pth")

#### **Loading the Model**

In [16]:
# Load weights
model.load_state_dict(torch.load("model_weights.pth", weights_only=True))
model.eval()  # Always switch to eval mode for inference

SimpleModel(
  (fc): Linear(in_features=2, out_features=1, bias=True)
)
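
In practice the weights are usually loaded into a freshly constructed model rather than the instance that saved them. The sketch below does a full round trip through a temporary file and checks that the restored model reproduces the original output. `SmallModel` and the file name are hypothetical.

```python
import os
import tempfile

import torch
import torch.nn as nn

class SmallModel(nn.Module):  # hypothetical; same architecture as SimpleModel above
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(2, 1)

    def forward(self, x):
        return self.fc(x)

trained = SmallModel()
x = torch.tensor([[4.0, 5.0]])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "weights.pth")
    torch.save(trained.state_dict(), path)

    restored = SmallModel()  # fresh instance with different random weights
    restored.load_state_dict(torch.load(path, weights_only=True))
    restored.eval()

with torch.no_grad():
    same = torch.equal(trained(x), restored(x))
print(same)  # True
```

The class definition must be available when loading, because the file contains only tensors, not the model's code.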