# TRAINING LOOP

The training loop is a crucial part of training a machine learning model. It iterates over the data, computes predictions, compares them to ground truth using a loss function, and updates model parameters through backpropagation.

On every iteration of the loop, the dataloader supplies one batch from the training set. We run the batch through the model and compute the loss between the predictions and the expected output. We then backpropagate the loss to compute the gradients, and the optimizer uses those gradients to adjust the weights.

## Prerequisites

You need a model, a loss, a dataloader and an optimizer.

### Model

The model makes predictions from the input data. These predictions are then compared to the ground truth to calculate the loss.
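As a rough illustration of what "making predictions" looks like in code, here is a minimal sketch of a forward pass. The model `toy_model`, its layer sizes, and the random batch are assumptions made for this example only and are not the model used later in this notebook.

```python
import torch
import torch.nn as nn

# Hypothetical toy model: 4 input features -> 3 output scores
toy_model = nn.Sequential(
    nn.Linear(4, 8),
    nn.ReLU(),
    nn.Linear(8, 3),
)

batch = torch.randn(5, 4)        # 5 samples with 4 features each
predictions = toy_model(batch)   # forward pass: one row of scores per sample
print(predictions.shape)         # torch.Size([5, 3])
```

The first dimension of the output matches the batch size, and the second matches the output size of the last layer.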

### Loss

The loss function (also called the cost function or objective function) computes how far the model's predictions are from the true labels.
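To make "how far" concrete, the sketch below evaluates two common PyTorch losses on hand-picked values. The tensors are made up for illustration, and the choice of `nn.MSELoss` and `nn.CrossEntropyLoss` is an assumption about which losses are of interest here.

```python
import torch
import torch.nn as nn

# Mean squared error: average of the squared differences
mse = nn.MSELoss()
preds = torch.tensor([2.0, 4.0])
targets = torch.tensor([1.0, 2.0])
mse_value = mse(preds, targets)    # (1^2 + 2^2) / 2 = 2.5

# Cross-entropy: takes raw scores (logits) and integer class labels
ce = nn.CrossEntropyLoss()
logits = torch.tensor([[0.0, 0.0]])  # equal scores for two classes
label = torch.tensor([0])
ce_value = ce(logits, label)       # -log(0.5), roughly 0.6931

print(mse_value.item(), ce_value.item())
```

Mean squared error is typically used for regression targets, while cross-entropy is typically used when the labels are class indices.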

### DataLoader

The DataLoader loads the dataset in batches, which allows the data to be processed efficiently.
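A small sketch of batching, assuming an in-memory `TensorDataset` with ten made-up samples (the names `toy_dataset` and `toy_loader` are illustrative only):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 10 samples with 2 features each, plus one integer label per sample
features = torch.arange(20, dtype=torch.float32).reshape(10, 2)
labels = torch.arange(10)
toy_dataset = TensorDataset(features, labels)

# batch_size=4 splits the 10 samples into batches of 4, 4 and 2
toy_loader = DataLoader(toy_dataset, batch_size=4, shuffle=False)

batch_sizes = [x.shape[0] for x, y in toy_loader]
print(batch_sizes)       # [4, 4, 2]
print(len(toy_loader))   # 3 batches per epoch
```

The last batch is smaller because 10 is not divisible by 4. Passing `drop_last=True` to the `DataLoader` would discard it instead.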

### Optimizer

The optimizer adjusts the model's weights based on the gradients computed during backpropagation. Common choices include `Adam`, `SGD`, and `RMSprop`.
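The following sketch shows a single plain SGD step on one hand-made parameter, so the update rule `w <- w - lr * grad` can be checked by hand. The parameter `w`, the toy loss, and the learning rate are assumptions chosen for this example.

```python
import torch

# A single trainable parameter vector
w = torch.tensor([1.0, -2.0], requires_grad=True)
sgd = torch.optim.SGD([w], lr=0.1)

toy_loss = (w ** 2).sum()   # gradient of sum(w^2) is 2 * w = [2, -4]
toy_loss.backward()         # fills w.grad
sgd.step()                  # w <- w - 0.1 * [2, -4]

print(w.detach())           # approximately [0.8, -1.6]
```

Optimizers such as Adam and RMSprop follow the same `backward()` then `step()` pattern but scale the update using running statistics of the gradients.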

### Learning rate scheduling (optional)

Learning rate schedulers adjust the learning rate during training based on predefined rules (e.g., reducing the learning rate after certain epochs).
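As a sketch of such a rule, the example below records the learning rate that `StepLR` produces over six epochs. The values `step_size=2` and `gamma=0.5` and the dummy parameter are assumptions picked to keep the output short.

```python
import torch

# A dummy parameter so the optimizer has something to manage
param = torch.zeros(1, requires_grad=True)
opt = torch.optim.SGD([param], lr=0.1)

# Halve the learning rate every 2 epochs
sched = torch.optim.lr_scheduler.StepLR(opt, step_size=2, gamma=0.5)

lrs = []
for epoch in range(6):
    lrs.append(opt.param_groups[0]["lr"])  # learning rate used in this epoch
    opt.step()     # optimizer step comes before the scheduler step
    sched.step()   # advance the schedule once per epoch

print(lrs)  # roughly [0.1, 0.1, 0.05, 0.05, 0.025, 0.025]
```

The scheduler is stepped once per epoch here, after the optimizer, which is the order PyTorch expects.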

In [None]:
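import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Dataset: CIFAR-10 training split, with images converted to tensors
# so that the DataLoader can stack them into batches
train_dataset = datasets.CIFAR10(
    root = "data",
    train = True,
    download = True,
    transform = transforms.ToTensor(),
)


# Data loader (shuffling is configured here, not on the dataset)
train_loader = DataLoader(train_dataset, batch_size = 32, shuffle = True)


# Model definition: flatten each 3x32x32 image and map it to 10 class scores
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 32 * 32, 50),
    nn.ReLU(),
    nn.Linear(50, 10)
)

# Loss: cross-entropy, since CIFAR-10 labels are integer class indices
criterion = nn.CrossEntropyLoss()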

# Optimizer
optimizer = torch.optim.SGD(model.parameters(), lr = 0.01)

# Scheduler stepping
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)

## Training loop

Now, in the training loop, we bring all of the above prerequisites together into one loop/function.


In [None]:
num_epochs = 100
for epoch in range(num_epochs):
    model.train() # sets the model into training mode
    running_loss = 0.0
    for inputs, labels in train_loader:
        # Forward pass
        predictions = model(inputs)
        loss = criterion(predictions, labels)

        # Backward pass (backprop)
        optimizer.zero_grad() # clears the previous gradients so the update uses only this batch's gradients, not accumulated ones
        loss.backward() # computes the gradients via backprop
        optimizer.step() # updates the model params using the gradients computed in loss.backward()

        running_loss += loss.item()

    # Optional lr scheduling (once per epoch)
    scheduler.step()

    # End of an epoch: report the average loss over all batches
    print(f"Epoch {epoch+1}, loss: {running_loss / len(train_loader):.4f}")
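
The comment on `optimizer.zero_grad()` in the loop says that gradients accumulate unless they are cleared. The sketch below shows that behaviour on a single made-up parameter. Calling `w.grad.zero_()` directly is an assumption used here to imitate the effect of `optimizer.zero_grad()` without building an optimizer.

```python
import torch

w = torch.tensor(1.0, requires_grad=True)

(3 * w).backward()
first = w.grad.item()          # 3.0

(3 * w).backward()             # no reset: the new gradient is added to the old one
accumulated = w.grad.item()    # 6.0

w.grad.zero_()                 # reset, similar in effect to optimizer.zero_grad()
(3 * w).backward()
after_reset = w.grad.item()    # 3.0

print(first, accumulated, after_reset)
```

Without the reset, each update would be based on the sum of gradients from all previous batches rather than the current one.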