# Lab 2: Training Loop

In this notebook, we'll implement the training loop — the core algorithm that makes our model learn. By the end, our model will discover the correct `weight = 0.4` and `bias = 0.1` values from the data.

## Install Dependencies

Run this cell to install the required libraries.

In [None]:
!pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
!pip install matplotlib

## Setup from Lab 1

Before we start training, we need to recreate everything from Lab 1: the data, model, and plotting function. Run this cell to set up the environment.

![Linear Model](images/linear-model.svg)

The diagram shows how our `LinearRegressionModel` works:
- **Input X** → The model receives input features
- **Multiply (X × weights)** → Input is multiplied by the learnable weight parameter
- **Add (+ bias)** → The learnable bias parameter is added to the result
- **Output y_pred** → The final prediction

The model computes: `forward(X) = X × weights + bias`
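
The formula above can be traced by hand without PyTorch. The sketch below is a plain-Python stand-in, not the lab's `LinearRegressionModel`: the `forward` function, the sample inputs, and the "untrained" parameter values are all assumptions chosen for illustration.

```python
def forward(xs, weight, bias):
    """Apply the linear model to each input: weight * x + bias."""
    return [weight * x + bias for x in xs]


inputs = [0.0, 0.5, 1.0]  # hypothetical inputs

# Predictions with the target parameters from this lab
target_preds = forward(inputs, weight=0.4, bias=0.1)

# Predictions with arbitrary stand-in values, mimicking an untrained model
untrained_preds = forward(inputs, weight=-1.2, bias=0.7)

for x, good, bad in zip(inputs, target_preds, untrained_preds):
    print(f"x={x:.1f} | target params: {good:.2f} | untrained params: {bad:.2f}")
```

The gap between the two columns is what training is meant to close.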

Currently, the model starts with **random values** for weights and bias. Our goal in this lab is to train it to discover the correct values: `weight = 0.4` and `bias = 0.1`.

In [None]:
import torch
import torch.nn as nn
import matplotlib.pyplot as plt

torch.manual_seed(42)

# Target parameters (what we want the model to learn)
weight = 0.4
bias = 0.1

# Create data
X = torch.arange(0, 1, 0.02).unsqueeze(dim=1)
y = weight * X + bias

# Train/test split (80/20)
train_split = int(0.8 * len(X))
X_train, y_train = X[:train_split], y[:train_split]
X_test, y_test = X[train_split:], y[train_split:]

# Model definition
class LinearRegressionModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(1), requires_grad=True)
        self.bias = nn.Parameter(torch.randn(1), requires_grad=True)

    def forward(self, x):
        return self.weight * x + self.bias

# Plotting function
def plot_predictions(train_data, train_labels, test_data, test_labels, predictions=None):
    plt.figure(figsize=(10, 7))
    plt.scatter(train_data, train_labels, c="b", s=4, label="Training data")
    plt.scatter(test_data, test_labels, c="r", s=4, label="Test data")
    if predictions is not None:
        plt.scatter(test_data, predictions, c="g", s=4, label="Predictions")
    plt.legend()
    plt.xlabel("X")
    plt.ylabel("y")
    plt.show()

# Create model instance
model = LinearRegressionModel()

print(f"Training samples: {len(X_train)}, Test samples: {len(X_test)}")
print(f"Initial parameters: weight={model.weight.item():.4f}, bias={model.bias.item():.4f}")
print(f"Target parameters:  weight={weight}, bias={bias}")

## 1. Create Loss Function

A **loss function** measures how wrong our predictions are. We use Mean Absolute Error (MAE), also called L1 Loss. It calculates the average absolute difference between predictions and actual values.

Lower loss = better predictions.
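
To make the definition concrete, here is a minimal plain-Python sketch of the same calculation. The helper name `mean_absolute_error` is an assumption made for illustration and is not part of PyTorch; the numbers match the example values used in the next cell.

```python
def mean_absolute_error(preds, targets):
    """Average of |prediction - target| over all pairs."""
    if len(preds) != len(targets):
        raise ValueError("preds and targets must have the same length")
    return sum(abs(p - t) for p, t in zip(preds, targets)) / len(preds)


example_preds = [0.5, 0.6, 0.7]
example_targets = [0.4, 0.5, 0.8]

# Each prediction is off by about 0.1, so the average error is about 0.1
manual_loss = mean_absolute_error(example_preds, example_targets)
print(f"Manual MAE: {manual_loss:.4f}")
```

Note that the third prediction is too low while the first two are too high; the absolute value keeps these errors from cancelling out.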

In [None]:
# Create loss function
loss_fn = nn.L1Loss()

# Example: calculate loss between some predictions and targets
example_preds = torch.tensor([0.5, 0.6, 0.7])
example_targets = torch.tensor([0.4, 0.5, 0.8])
example_loss = loss_fn(example_preds, example_targets)
print(f"Example loss: {example_loss}")

## 2. Create Optimizer

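An **optimizer** updates the parameters based on their gradients to reduce the loss. We use SGD (Stochastic Gradient Descent) with a learning rate of 0.01. Because every step in this lab uses the whole training set rather than a random mini-batch, the updates here amount to plain (full-batch) gradient descent.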

The learning rate controls how big each update step is: too high and the updates overshoot the minimum (and may diverge), too low and training needs many more epochs to converge.
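
The effect of the learning rate is easiest to see on a toy problem. The sketch below is an illustration only: it minimizes the one-parameter function `(w - 0.4)**2` rather than the lab's MAE loss, and the `minimize` helper and the three learning rates are assumptions chosen to exaggerate the behavior.

```python
def minimize(lr, steps=50, start=0.0, target=0.4):
    """Run gradient descent on f(w) = (w - target)**2 and return the final w."""
    w = start
    for _ in range(steps):
        grad = 2 * (w - target)  # derivative of (w - target)**2
        w -= lr * grad           # the update an optimizer step would apply
    return w


for lr in (1.1, 0.1, 0.0001):
    final_w = minimize(lr)
    print(f"lr={lr}: final w={final_w:.4f}, distance from 0.4 = {abs(final_w - 0.4):.4f}")
```

With the largest rate each step jumps past the minimum and lands further away than it started, with the smallest rate the parameter barely moves in 50 steps, and the middle rate ends up close to 0.4.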

In [None]:
# Create optimizer
optimizer = torch.optim.SGD(params=model.parameters(), lr=0.01)

print(f"Optimizer: {optimizer}")

## 3. The Training Loop

Now we implement the 5-step training loop. This is the core algorithm that makes learning happen:

1. **Zero gradients** — Clear old gradients (PyTorch accumulates them by default)
2. **Forward pass** — Make predictions with current parameters
3. **Calculate loss** — Measure how wrong predictions are
4. **Backward pass** — Compute gradients (how to adjust parameters)
5. **Update parameters** — Adjust weights using gradients

We repeat this for 100 epochs (complete passes through the data).
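
Step 1 exists because gradients add up across calls to `backward()`. The short sketch below shows this on a single standalone tensor rather than on the lab's model; the tensor `w` and the expression `3 * w` are assumptions chosen so the gradient is easy to check by hand.

```python
import torch

w = torch.tensor(2.0, requires_grad=True)

# First backward pass: the derivative of 3 * w with respect to w is 3
(3 * w).backward()
first = w.grad.item()

# Second backward pass without zeroing: the new gradient is added to the old one
(3 * w).backward()
accumulated = w.grad.item()

# Zeroing the stored gradient gives a clean value on the next backward pass
w.grad.zero_()
(3 * w).backward()
after_zero = w.grad.item()

print(f"first={first}, accumulated={accumulated}, after_zero={after_zero}")
```

In the training loop, `optimizer.zero_grad()` plays the role of the manual reset here, clearing the gradients of every parameter the optimizer manages.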

**Note on model modes:**
- `model.train()`: puts the model in training mode, so layers such as dropout and batch normalization use their training behavior
- `model.eval()`: puts the model in evaluation mode, so dropout is turned off and batch normalization uses its stored running statistics

For our simple linear model, these don't change behavior, but it's good practice to use them consistently.
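
To see what the backward pass and the parameter update compute, the loop can also be written out by hand. The following is a plain-Python sketch under stated assumptions, not the lab's implementation: the gradients of the MAE loss are derived manually (the derivative of `|r|` is the sign of `r`), the starting values `w = 0.0` and `b = 0.0` are hypothetical, and the data mimics the 40 training points used here.

```python
# Toy training data: y = 0.4 * x + 0.1 for x = 0.00, 0.02, ..., 0.78
xs = [i * 0.02 for i in range(40)]
ys = [0.4 * x + 0.1 for x in xs]


def sign(r):
    """Return 1, 0, or -1 depending on the sign of r."""
    return (r > 0) - (r < 0)


def mae(w, b):
    """Mean absolute error of the line w * x + b on the toy data."""
    return sum(abs(w * x + b - y) for x, y in zip(xs, ys)) / len(xs)


w, b = 0.0, 0.0  # hypothetical starting values
lr = 0.01
initial_loss = mae(w, b)

for epoch in range(100):
    # Step 1 is implicit: the gradients below are recomputed from scratch
    # Step 2: forward pass
    preds = [w * x + b for x in xs]
    # Step 3: calculate loss
    loss = sum(abs(p - y) for p, y in zip(preds, ys)) / len(xs)
    # Step 4: backward pass, done by hand using d|r|/dr = sign(r)
    grad_w = sum(sign(p - y) * x for p, y, x in zip(preds, ys, xs)) / len(xs)
    grad_b = sum(sign(p - y) for p, y in zip(preds, ys)) / len(xs)
    # Step 5: update parameters
    w -= lr * grad_w
    b -= lr * grad_b

final_loss = mae(w, b)
print(f"loss: {initial_loss:.4f} -> {final_loss:.4f} | w={w:.4f}, b={b:.4f}")
```

In the PyTorch version, `loss.backward()` replaces the hand-written gradient lines and `optimizer.step()` replaces the two update lines. Starting from zero, 100 epochs lower the loss considerably but are not enough to reach the targets, which is why the starting point and the number of epochs both matter.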

In [None]:
epochs = 100

epoch_count = []
train_loss_values = []
test_loss_values = []

for epoch in range(epochs):
    model.train()

    # 1. Zero gradients
    optimizer.zero_grad()

    # 2. Forward pass
    y_pred = model(X_train)

    # 3. Calculate loss
    loss = loss_fn(y_pred, y_train)

    # 4. Backward pass
    loss.backward()

    # 5. Update parameters
    optimizer.step()

    # Evaluate on test set
    model.eval()
    with torch.inference_mode():
        test_pred = model(X_test)
        test_loss = loss_fn(test_pred, y_test)

    # Record losses every 10 epochs to keep the log and plot compact
    if epoch % 10 == 0:
        epoch_count.append(epoch)
        train_loss_values.append(loss.item())
        test_loss_values.append(test_loss.item())
        print(f"Epoch {epoch:3d} | Train Loss: {loss:.4f} | Test Loss: {test_loss:.4f}")

## Visualizing Loss Curves

Loss curves show how the model improves over time. If training is working, both the train and test loss should decrease as the epochs progress.

In [None]:
plt.figure(figsize=(10, 5))
plt.plot(epoch_count, train_loss_values, label="Train Loss")
plt.plot(epoch_count, test_loss_values, label="Test Loss")
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.title("Training and Test Loss Curves")
plt.legend()
plt.show()

## Comparing Learned vs Target Parameters

Let's see how close our model got to the true weight and bias values.

In [None]:
print("Learned parameters:")
print(model.state_dict())

print("\nComparison:")
print(f"  Learned weight: {model.state_dict()['weight'].item():.4f} | Target: {weight}")
print(f"  Learned bias:   {model.state_dict()['bias'].item():.4f} | Target: {bias}")

The learned parameters should be close to the target values (weight=0.4, bias=0.1), though after only 100 small updates they will not match exactly.
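
One way to make "close" precise is to compare against an explicit tolerance. The sketch below is illustrative only: the `close_to_target` helper, the tolerance of 0.05, and the "learned" values are assumptions, not results from this lab.

```python
import math


def close_to_target(learned, target, abs_tol=0.05):
    """Return True if learned is within abs_tol of target."""
    return math.isclose(learned, target, abs_tol=abs_tol)


# Hypothetical learned values; substitute the numbers printed by your own run
learned = {"weight": 0.41, "bias": 0.09}
targets = {"weight": 0.4, "bias": 0.1}

for name, target in targets.items():
    status = "close" if close_to_target(learned[name], target) else "far"
    print(f"{name}: learned={learned[name]:.4f}, target={target} -> {status}")
```

If a parameter comes out as "far", training for more epochs or adjusting the learning rate are the first things to try.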

## Visualizing Final Predictions

Now let's see how well our trained model predicts on the test data.

In [None]:
# Make final predictions
model.eval()
with torch.inference_mode():
    final_preds = model(X_test)

# Plot predictions
plot_predictions(X_train, y_train, X_test, y_test, predictions=final_preds)

The green dots (predictions) should now align closely with the test data. Our model has learned the pattern!

## Summary

In this lab, we:
1. Created a **loss function** (L1Loss/MAE) to measure prediction errors
2. Created an **optimizer** (SGD with lr=0.01) to update parameters
3. Implemented the **5-step training loop**
4. Trained for **100 epochs** and watched the loss decrease
5. Verified our model learned `weight ≈ 0.4` and `bias ≈ 0.1`

In **Lab 3**, we'll save this trained model to disk and load it back for future use.