# PyTorch Tutorial: Training Your First Model

Now that you can build neural networks, it's time to train them! This notebook covers the complete training process.

## Learning Objectives

By the end of this notebook, you will:
- Understand the training loop
- Learn about loss functions and optimizers
- Implement a complete training loop
- Understand validation and evaluation
- Visualize training progress

---

## What is Training?

**Training** is the process of teaching a neural network to make good predictions by:
1. Making predictions on data
2. Measuring how wrong they are (loss)
3. Computing gradients
4. Updating parameters to reduce the loss
5. Repeating until the model learns

These steps are gradient descent, applied to the parameters of a neural network.
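
The five steps above can be sketched by hand before any `nn.Module` or optimizer is involved. This is a minimal illustration, not the training code used later in this notebook. The toy data (`x_demo`, `y_demo`), the single parameter `w`, and the learning rate are assumptions chosen so the result is easy to check.

```python
import torch

# Toy data (hypothetical, for illustration only): y = 3x exactly
x_demo = torch.linspace(-1, 1, 20).unsqueeze(1)
y_demo = 3 * x_demo

# One trainable parameter, initialised at zero
w = torch.zeros(1, requires_grad=True)
lr = 0.5
demo_losses = []

for step in range(50):                      # 5. repeat
    pred = x_demo * w                       # 1. make predictions
    loss = ((pred - y_demo) ** 2).mean()    # 2. measure how wrong they are
    loss.backward()                         # 3. compute gradients
    with torch.no_grad():
        w -= lr * w.grad                    # 4. update the parameter
    w.grad.zero_()                          # clear the gradient for the next step
    demo_losses.append(loss.item())

print(f"learned w = {w.item():.3f}, final loss = {demo_losses[-1]:.6f}")
```

The learned `w` should end up close to 3, the slope used to generate the toy data. The rest of the notebook replaces the manual update with `torch.optim` and the hand-written loss with `nn.MSELoss`.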

## Setting Up

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
import matplotlib.pyplot as plt
import numpy as np

torch.manual_seed(42)
np.random.seed(42)

print("PyTorch version:", torch.__version__)

## Creating a Simple Dataset

Let's create a simple dataset to train on. We'll predict y from x where y = 2x + 1 (with some noise):

In [None]:
# Generate synthetic data: y = 2x + 1 + noise
n_samples = 100
x = torch.randn(n_samples, 1) * 5
y_true = 2 * x + 1 + torch.randn(n_samples, 1) * 0.5

print(f"Dataset size: {n_samples} samples")
print(f"X shape: {x.shape}, Y shape: {y_true.shape}")

# Visualize
plt.figure(figsize=(8, 6))
plt.scatter(x.numpy(), y_true.numpy(), alpha=0.6)
plt.xlabel('x', fontsize=12)
plt.ylabel('y', fontsize=12)
plt.title('Training Data: y = 2x + 1 + noise', fontsize=14)
plt.grid(True, alpha=0.3)
plt.show()
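
Because this dataset is a straight line plus noise, it also has a closed-form least-squares fit, which gives a useful reference point for what gradient descent should converge to. The sketch below is an optional sanity check, not part of the training workflow. It regenerates data of the same form under its own names (`x_ls`, `y_ls`, which are assumptions), so the values will not match the cell above exactly.

```python
import torch

torch.manual_seed(42)

# Data of the same form as above: y = 2x + 1 + noise
n = 100
x_ls = torch.randn(n, 1) * 5
y_ls = 2 * x_ls + 1 + torch.randn(n, 1) * 0.5

# Design matrix with a column of ones so the bias is fitted too
A = torch.cat([x_ls, torch.ones_like(x_ls)], dim=1)   # shape (n, 2)

# Least-squares solution of A @ [weight, bias] ≈ y
solution = torch.linalg.lstsq(A, y_ls).solution        # shape (2, 1)
w_ls, b_ls = solution[0].item(), solution[1].item()

print(f"Closed-form fit: weight = {w_ls:.4f}, bias = {b_ls:.4f}")
```

The fitted weight and bias should land near 2 and 1. A trained model that ends up far from these values most likely needs more epochs or a different learning rate.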

## Building a Simple Model

In [None]:
class SimpleLinearModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(1, 1)
    
    def forward(self, x):
        return self.linear(x)

model = SimpleLinearModel()
print("Model:", model)
print("\nInitial parameters:")
for name, param in model.named_parameters():
    print(f"{name}: {param.data}")

## Loss Functions and Optimizers

A **loss function** measures how far the predictions are from the targets. An **optimizer** uses the gradients of that loss to update the model's parameters.

In [None]:
# Mean Squared Error for regression
criterion = nn.MSELoss()

# Stochastic Gradient Descent optimizer
learning_rate = 0.01
optimizer = optim.SGD(model.parameters(), lr=learning_rate)

print("Loss function: MSE")
print(f"Optimizer: SGD with lr={learning_rate}")
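
To make these two objects less of a black box, the sketch below checks them against hand-computed values. The tensors `pred` and `target` and the scalar parameter `p` are made up for illustration. The update rule shown is plain SGD without momentum or weight decay, which is what `optim.SGD` does with its default arguments.

```python
import torch
import torch.nn as nn
import torch.optim as optim

pred = torch.tensor([[2.0], [4.0], [6.0]])
target = torch.tensor([[1.0], [4.0], [8.0]])

# nn.MSELoss averages the squared differences by default (reduction='mean')
mse_builtin = nn.MSELoss()(pred, target)
mse_manual = ((pred - target) ** 2).mean()   # (1 + 0 + 4) / 3
print(f"built-in: {mse_builtin.item():.4f}, manual: {mse_manual.item():.4f}")

# One plain SGD step is: param <- param - lr * grad
p = torch.tensor([1.0], requires_grad=True)
opt = optim.SGD([p], lr=0.1)
loss = (p ** 2).sum()                        # d(loss)/dp = 2p = 2.0
loss.backward()
expected = p.item() - 0.1 * p.grad.item()    # 1.0 - 0.1 * 2.0
opt.step()
print(f"after step: {p.item():.4f}, expected: {expected:.4f}")
```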

## The Complete Training Loop

This is the core pattern behind most neural network training in PyTorch:

In [None]:
# Reset model
model = SimpleLinearModel()
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

num_epochs = 100
losses = []

print("Training...")
for epoch in range(num_epochs):
    # Forward pass
    predictions = model(x)
    
    # Compute loss
    loss = criterion(predictions, y_true)
    
    # Backward pass
    optimizer.zero_grad()  # Zero gradients
    loss.backward()         # Compute gradients
    optimizer.step()        # Update parameters
    
    losses.append(loss.item())
    
    if (epoch + 1) % 20 == 0:
        print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}")

print(f"\nFinal loss: {losses[-1]:.4f}")
print("\nLearned parameters:")
for name, param in model.named_parameters():
    print(f"{name}: {param.data.item():.4f}")
print("Expected: weight ≈ 2.0, bias ≈ 1.0")
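
The `optimizer.zero_grad()` call in the loop matters because PyTorch adds new gradients to whatever is already stored in `.grad` instead of overwriting it. The small sketch below shows this with a single tensor. The tensor `w` and the expression `3 * w` are arbitrary choices for illustration.

```python
import torch

w = torch.tensor([1.0], requires_grad=True)

# First backward pass: d(3w)/dw = 3
(3 * w).sum().backward()
grad_first = w.grad.clone()

# Second backward pass WITHOUT zeroing: the new gradient is added on top
(3 * w).sum().backward()
grad_accumulated = w.grad.clone()

# After zeroing, a fresh backward pass gives 3 again
w.grad.zero_()
(3 * w).sum().backward()
grad_after_reset = w.grad.clone()

print(grad_first.item(), grad_accumulated.item(), grad_after_reset.item())
```

Without the reset, each epoch would update the parameters using the sum of all gradients computed so far, which typically makes training unstable.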

## Visualizing Results

In [None]:
plt.figure(figsize=(12, 5))

plt.subplot(1, 2, 1)
plt.plot(losses)
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training Loss')
plt.grid(True, alpha=0.3)

plt.subplot(1, 2, 2)
with torch.no_grad():
    predictions = model(x)
plt.scatter(x.numpy(), y_true.numpy(), alpha=0.6, label='Actual')
plt.scatter(x.numpy(), predictions.numpy(), alpha=0.6, label='Predicted')
plt.xlabel('x')
plt.ylabel('y')
plt.title('Predictions vs Actual')
plt.legend()
plt.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

## Validation Split

In practice, we hold out part of the data as a validation set so we can measure the loss on examples the model was not trained on:

In [None]:
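# Split data: first 80% for training, last 20% for validation
# (the samples were generated randomly, so slicing in order is fine here)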
train_size = int(0.8 * len(x))
x_train, y_train = x[:train_size], y_true[:train_size]
x_val, y_val = x[train_size:], y_true[train_size:]

# New model
model = SimpleLinearModel()
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

train_losses, val_losses = [], []

for epoch in range(100):
    # Training
    model.train()
    pred = model(x_train)
    train_loss = criterion(pred, y_train)
    optimizer.zero_grad()
    train_loss.backward()
    optimizer.step()
    
    # Validation
    model.eval()
    with torch.no_grad():
        val_pred = model(x_val)
        val_loss = criterion(val_pred, y_val)
    
    train_losses.append(train_loss.item())
    val_losses.append(val_loss.item())

# Plot
plt.figure(figsize=(10, 5))
plt.plot(train_losses, label='Train')
plt.plot(val_losses, label='Val')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training vs Validation Loss')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()
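
The slice-based split above works because the synthetic samples are already in random order. For data that is sorted or grouped, a shuffled split is usually safer. The sketch below shows one way to do that with `torch.randperm`. The names `x_all`, `y_all`, and the index variables are assumptions for illustration and are not used elsewhere in this notebook.

```python
import torch

torch.manual_seed(42)

# Stand-in data of the same shape as the notebook's dataset
x_all = torch.randn(100, 1) * 5
y_all = 2 * x_all + 1

# Shuffle the indices once, then cut them 80/20
perm = torch.randperm(len(x_all))
n_train = int(0.8 * len(x_all))
train_idx, val_idx = perm[:n_train], perm[n_train:]

# Index x and y with the same indices so the pairs stay aligned
x_tr, y_tr = x_all[train_idx], y_all[train_idx]
x_va, y_va = x_all[val_idx], y_all[val_idx]

print(f"train: {x_tr.shape[0]} samples, val: {x_va.shape[0]} samples")
```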

## Key Takeaways

1. **Training Loop**: Forward → Loss → Backward → Update
2. **Loss Functions**: MSE for regression, CrossEntropy for classification
3. **Optimizers**: SGD, Adam, etc. - update parameters using gradients
4. **Epochs**: One complete pass through the dataset
5. **Validation**: Test on unseen data to check generalization
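6. **model.train()** / **model.eval()**: Switch the model between training and evaluation behavior (this matters for layers such as dropout and batch normalization)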
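
The linear model in this notebook has no layers that behave differently between modes, so `model.train()` and `model.eval()` do not change its outputs. The effect becomes visible with a layer such as `nn.Dropout`. The sketch below uses a standalone dropout layer (`drop`) and an input of ones, both chosen only for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

drop = nn.Dropout(p=0.5)
ones = torch.ones(1000)

# Training mode: each element is zeroed with probability p,
# and the survivors are scaled by 1 / (1 - p) = 2
drop.train()
out_train = drop(ones)

# Evaluation mode: dropout is disabled and the input passes through unchanged
drop.eval()
out_eval = drop(ones)

print("train-mode values:", out_train.unique().tolist())
print("eval-mode unchanged:", torch.equal(out_eval, ones))
```

Calling `model.train()` or `model.eval()` on a full model sets the same flag on every submodule, which is why it is good practice to call them even when the current model does not need them.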

## What's Next?

Next notebooks: Regression and Classification examples using this training loop!

---

**Great job! You can now train neural networks! 🎉**