# Lab 6: Optimization Methods in PyTorch

The goal of this lab is to improve the performance of a deep learning model by adding regularization and normalization techniques in PyTorch.

**What You'll Do:**
- Apply **L1/L2 regularization (weight decay)**.
- Implement **Dropout**.
- Implement **Normalization (BatchNorm, LayerNorm, etc.)**.
- Use **Early Stopping**.
- Experiment with **Data Augmentation (CutMix, MixUp)**.

Each challenge leaves parts of the code blank (marked `XXXX`). Use the **PyTorch documentation** to complete them!

## Part 1: Imports

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision.transforms as transforms
import torchvision.datasets as datasets
from torch.utils.data import DataLoader
import matplotlib.pyplot as plt
import numpy as np

torch.manual_seed(24)  # For reproducibility

## Part 2: Load the Dataset
We'll use CIFAR-10, an image classification dataset of 60,000 32×32 color images in 10 classes (50,000 for training and 10,000 for testing).

In [None]:
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))  # one mean/std per RGB channel
])

train_dataset = datasets.CIFAR10(root='./data', train=True, transform=transform, download=True)
test_dataset = datasets.CIFAR10(root='./data', train=False, transform=transform, download=True)

train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=64, shuffle=False)

## Part 3: Define a Simple CNN
We'll start with a basic CNN model and improve it throughout the lab.

In [None]:
class SimpleCNN(nn.Module):
    def __init__(self):
        super(SimpleCNN, self).__init__()
        XXXX
        ...

    def forward(self, x):
        x = XXXX
        x = x.view(x.size(0), -1)
        x = XXXX
        return x

model = SimpleCNN()
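
If you are unsure where to start, here is a minimal sketch of one possible baseline. The class name `ExampleCNN`, the layer sizes, and the `features`/`classifier` split are all illustrative assumptions, not the required solution. The comments track how the tensor shape changes for a 3×32×32 CIFAR-10 input.

```python
import torch
import torch.nn as nn

class ExampleCNN(nn.Module):
    """Hypothetical baseline: two conv blocks followed by a linear classifier."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),   # 3x32x32 -> 16x32x32
            nn.ReLU(),
            nn.MaxPool2d(2),                              # -> 16x16x16
            nn.Conv2d(16, 32, kernel_size=3, padding=1),  # -> 32x16x16
            nn.ReLU(),
            nn.MaxPool2d(2),                              # -> 32x8x8
        )
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size(0), -1)   # flatten to (batch, 32 * 8 * 8)
        x = self.classifier(x)      # raw class scores (logits)
        return x

example_model = ExampleCNN()
dummy_batch = torch.randn(4, 3, 32, 32)   # four fake images
logits = example_model(dummy_batch)
print(logits.shape)
```

Passing a dummy batch through the model like this is a cheap way to confirm that the flattened size matches the input size of the first linear layer.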

## Part 5: Apply L2 Regularization (Weight Decay)
Modify the optimizer to use **L2 regularization**. Check out the PyTorch documentation for the Adam optimizer [here](https://pytorch.org/docs/stable/generated/torch.optim.Adam.html).

In [None]:
optimizer = optim.Adam(model.parameters(), lr=0.001)
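
For reference, the sketch below shows both flavors of penalty on a tiny stand-in model (`tiny_model`, `l1_lambda`, and the coefficient values are illustrative assumptions). `optim.Adam` accepts a `weight_decay` argument that adds an L2 penalty to the gradients. It has no L1 argument, so an L1 penalty is usually added to the loss by hand.

```python
import torch
import torch.nn as nn
import torch.optim as optim

tiny_model = nn.Linear(4, 2)  # stand-in for the CNN

# L2: weight_decay adds weight_decay * param to each parameter's gradient.
l2_optimizer = optim.Adam(tiny_model.parameters(), lr=0.001, weight_decay=1e-4)

# L1: sum the absolute values of all parameters and add them to the loss.
l1_lambda = 1e-5
inputs = torch.randn(8, 4)
targets = torch.randint(0, 2, (8,))
loss = nn.functional.cross_entropy(tiny_model(inputs), targets)
l1_penalty = sum(p.abs().sum() for p in tiny_model.parameters())
total_loss = loss + l1_lambda * l1_penalty

l2_optimizer.zero_grad()
total_loss.backward()
l2_optimizer.step()
```

Note that `optim.AdamW` applies weight decay in a decoupled way, directly to the weights rather than through the gradients, so the same coefficient can behave differently between the two optimizers.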

## Part 6: Add Dropout to the Model
Modify the **SimpleCNN** class you created above to include **Dropout layers**. Check out the documentation on dropout layers [here](https://pytorch.org/docs/stable/nn.html#dropout-layers).

In [None]:
class DropoutCNN(nn.Module):
    def __init__(self):
        super(DropoutCNN, self).__init__()
        XXXX
        ...

    def forward(self, x):
        x = XXXX
        x = XXXX
        x = x.view(x.size(0), -1)
        x = XXXX
        return x

model = DropoutCNN()
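
Dropout behaves differently during training and evaluation, which is easy to miss. The sketch below is a standalone demonstration on a tensor of ones rather than part of the model. In training mode about a fraction `p` of the entries are zeroed and the survivors are scaled by `1 / (1 - p)`. In evaluation mode the layer is the identity.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
dropout = nn.Dropout(p=0.5)
x = torch.ones(1, 1000)

dropout.train()
train_out = dropout(x)   # entries are either 0 or 1 / (1 - 0.5) = 2

dropout.eval()
eval_out = dropout(x)    # unchanged in eval mode

print((train_out == 0).float().mean().item())  # fraction dropped, close to 0.5
print(torch.equal(eval_out, x))
```

This is why you should call `model.train()` before each training epoch and `model.eval()` before computing validation or test metrics.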

## Part 7: Add Batch Normalization
Modify the CNN to include **BatchNorm** after each convolutional layer.

**Extra challenge**: Try to find the documentation for this on your own!

In [None]:
class BatchNormCNN(nn.Module):
    def __init__(self):
        super(BatchNormCNN, self).__init__()
        XXXX
        ...

    def forward(self, x):
        x = XXXX
        x = x.view(x.size(0), -1)
        x = XXXX
        return x

model = BatchNormCNN()
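
To see what the layer does, the sketch below runs a random batch through one convolution followed by `nn.BatchNorm2d` (the layer sizes are illustrative assumptions). In training mode each output channel is normalized across the batch and spatial dimensions, so the per-channel mean is close to 0 and the standard deviation close to 1.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)
bn = nn.BatchNorm2d(16)   # num_features must equal the conv layer's out_channels

bn.train()                # use batch statistics, as during training
features = bn(conv(torch.randn(8, 3, 32, 32)))

# Statistics per channel, taken over the batch, height, and width dimensions.
channel_mean = features.mean(dim=(0, 2, 3))
channel_std = features.std(dim=(0, 2, 3), unbiased=False)
print(channel_mean.abs().max().item(), channel_std.mean().item())
```

In evaluation mode the layer switches to its running estimates of the mean and variance, so `model.eval()` matters here just as it does for Dropout.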

## Part 8: Implement Early Stopping
**Challenge 1**: Write the training code inside the loop.

**Challenge 2**: Modify the loop to stop training once the validation loss has not improved for **N consecutive epochs** (the `patience` value below).

In [None]:
best_loss = float('inf')
patience = 5  # Training stops after this many epochs with no improvement
counter = 0  # Count the consecutive epochs without improvement

for epoch in range(50):

    # Insert model training code here

    val_loss = XXXX  # compute validation loss

    if XXXX:
        XXXX = XXXX  # update 'best_loss'
        counter = 0
    else:
        counter += 1
        if XXXX:
            XXXX  # stop training
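
To check your stopping logic without training a model, you can replay it on a fixed list of losses. The sketch below is one possible version under stated assumptions: `simulated_val_losses` is made-up data, `patience` is set to 3 to keep the trace short, and `stopped_epoch` is a hypothetical variable added for inspection.

```python
# Made-up validation losses: improve for four epochs, then plateau.
simulated_val_losses = [1.00, 0.80, 0.70, 0.65, 0.66, 0.67, 0.66, 0.68, 0.69, 0.70]

best_loss = float('inf')
patience = 3          # stop after this many consecutive epochs without improvement
counter = 0
stopped_epoch = None

for epoch, val_loss in enumerate(simulated_val_losses):
    if val_loss < best_loss:
        best_loss = val_loss   # new best: reset the counter
        counter = 0
    else:
        counter += 1
        if counter >= patience:
            stopped_epoch = epoch
            break              # stop training

print(best_loss, stopped_epoch)  # 0.65 6
```

The best loss is reached at epoch 3, and epochs 4, 5, and 6 bring no improvement, so the loop stops at epoch 6. In a real run you would typically also save the model's `state_dict()` whenever `best_loss` is updated, so the best weights can be restored afterwards.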

## Part 9: Data Augmentation

**Here's the big challenge for today**: Apply both ***CutMix*** and ***MixUp*** to your data!

1.   Augment your data with both CutMix and MixUp. In other words, randomly select images and apply one or the other method. Do not apply both methods to the same image.
2.   Use your new augmented data to train a simple CNN with Dropout, BatchNorm, and Early Stopping.
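
As a starting point for step 1, here is a minimal sketch of both methods. It rests on several simplifying assumptions: the function names are hypothetical, the method is chosen once per batch (so no image receives both), every batch is augmented, and CutMix uses the same rectangle for all images in a batch. Each function returns both sets of labels plus the mixing weight `lam`, because the loss must be weighted the same way as the pixels.

```python
import numpy as np
import torch

def mixup(images, labels, alpha=1.0):
    """Blend each image with a randomly chosen partner from the same batch."""
    lam = float(np.random.beta(alpha, alpha))
    perm = torch.randperm(images.size(0))
    mixed = lam * images + (1.0 - lam) * images[perm]
    return mixed, labels, labels[perm], lam

def cutmix(images, labels, alpha=1.0):
    """Paste a random rectangle from a partner image into each image."""
    lam = float(np.random.beta(alpha, alpha))
    perm = torch.randperm(images.size(0))
    _, _, height, width = images.shape
    # Rectangle covering roughly (1 - lam) of the image area.
    cut_h = int(height * np.sqrt(1.0 - lam))
    cut_w = int(width * np.sqrt(1.0 - lam))
    cy, cx = np.random.randint(height), np.random.randint(width)
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, height)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, width)
    mixed = images.clone()
    mixed[:, :, y1:y2, x1:x2] = images[perm][:, :, y1:y2, x1:x2]
    # Recompute lam from the pasted area, since clipping can shrink the box.
    lam = 1.0 - (y2 - y1) * (x2 - x1) / (height * width)
    return mixed, labels, labels[perm], lam

def augment_batch(images, labels, alpha=1.0):
    """Apply exactly one of MixUp or CutMix, chosen with equal probability."""
    if np.random.rand() < 0.5:
        return mixup(images, labels, alpha)
    return cutmix(images, labels, alpha)

def mixed_loss(criterion, logits, labels_a, labels_b, lam):
    """Weight the loss for each label set by its share of the mixed image."""
    return lam * criterion(logits, labels_a) + (1.0 - lam) * criterion(logits, labels_b)

# Usage on a fake batch of eight CIFAR-10-sized images.
images = torch.rand(8, 3, 32, 32)
labels = torch.randint(0, 10, (8,))
mixed, labels_a, labels_b, lam = augment_batch(images, labels)
print(mixed.shape, lam)
```

Inside the training loop you would call `augment_batch` on each batch, run the model on `mixed`, and compute the loss with `mixed_loss` instead of calling the criterion directly. Choosing the method per image rather than per batch, or leaving some images unaugmented, are variations worth trying.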

