# **Week 6 Exercise: Noise Handling & Regularization**

Welcome to **Week 6**! We’re focusing on **noise handling** and **regularization** techniques. This exercise demonstrates:

1. **Data Augmentation** (random crops, flips) to handle noise / robustify training.  
2. **Dropout** in a small CNN architecture.  
3. **Weight Decay** (L2 regularization) in the optimiser.

We’ll use **CIFAR-10**, a moderate-sized dataset of 60,000 32×32 color images in 10 classes (airplane, automobile, bird, cat, etc.), split into 50,000 training and 10,000 test images. By applying these techniques, we aim to reduce overfitting and improve accuracy on the held-out test set.

---

## 1. Imports & Setup

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import torchvision
import torchvision.transforms as transforms

import numpy as np
import matplotlib.pyplot as plt

print("Torch version:", torch.__version__)
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print("Using device:", device)

### Utility functions

Defining the utility function to plot metrics below.

In [None]:
'''
Description:
Utility function to plot line plots in a single graph

Params:
dataList: List - A list of (y, paramDict) tuples containing the data to be plotted and the corresponding plot params as a dictionary.
           The dictionary may be empty; every key below is optional.
           Expected key names in paramDict:
           -> label: str - String label to be given to the plot. 
                           Only necessary if the 'legend' function param is set to True (False by default).
           -> ha: str - Specifies the horizontal alignment ('left', 'right' or 'center') of text above each point in the plot. Set to 'center' by default.
           -> fontSize: int - Sets the font size of text displayed above each point in the plot. Set to 8 by default.
           -> marker: str - Sets the style of marker to be displayed for each data point on the plot. Set to 'o' by default.
           -> decimalPlaces: int - Sets the number of decimal places to display for each data point. Set to 2 by default.
           -> displayPercent: bool - Boolean to decide whether to display numbers in percentage format. Set to False by default.
           -> displayOffset: float - positive or negative float value that determines the display offset of text above data point. Set to 0.005 by default.
X: Sequence - The x-axis values shared by every line, one per data point (e.g. range(1, epochs + 1)).
title: str - [optional] Title to be set for the graph.
xlabel: str - [optional] Label for the x-axis to be set for the graph.
ylabel: str - [optional] Label for the y-axis to be set for the graph.
figSize: Tuple - [optional] Sets a custom figure size for the plot based on the width and height values passed as a tuple pair.
legend: bool - [optional] Boolean to decide whether to show the legend or not. Set to False by default
'''
def plotMetrics(dataList, X, title='', xlabel='', ylabel='', figSize=None, legend=False):
    if figSize:
        plt.figure(figsize=figSize)
    for data in dataList:
        y, paramDict = data
        # Getting plot params
        label = paramDict.get('label', '')
        marker = paramDict.get('marker', 'o')
        ha = paramDict.get('ha', 'center')
        fontSize = paramDict.get('fontSize', 8)
        decimalPlaces = paramDict.get('decimalPlaces', 2)
        displayPercent = paramDict.get('displayPercent', False)
        displayOffset = paramDict.get('displayOffset', 0.005)
        
        plt.plot(X, y, label=label, marker=marker)
        
        # Getting the data values to show on the plotted points along the line
        # Use the actual x values so the labels stay aligned even if X does not start at 1
        for x_val, v in zip(X, y):
            percentMultiplier = 100 if displayPercent else 1
            v_str = f'{v * percentMultiplier:.{decimalPlaces}f}{"%" if displayPercent else ""}'
            plt.text(x_val, v + displayOffset, v_str, ha=ha, fontsize=fontSize)
        
    plt.xlabel(xlabel)
    plt.ylabel(ylabel)
    plt.title(title)
    if legend:
        plt.legend()
    plt.show()


## 2. Load & Augment CIFAR-10

**Task**:  
1. Use **transforms** for data augmentation: random crop, horizontal flip.  
2. Convert to `Tensor`, normalize.  
3. Create `DataLoader` for train/test sets.
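
To make step 1 concrete, here is a minimal pure-Python sketch of what a padded random crop followed by a horizontal flip does to an image. It is an illustration only, not the torchvision implementation: the helper names (`zero_pad`, `random_crop`, `random_hflip`) are hypothetical, and a tiny 4×4 single-channel grid with `pad=1` stands in for a 32×32 image with `padding=4`.

```python
import random

def zero_pad(img, pad):
    """Surround a 2-D grid (list of rows) with `pad` rows/columns of zeros."""
    width = len(img[0]) + 2 * pad
    top = [[0] * width for _ in range(pad)]
    bottom = [[0] * width for _ in range(pad)]
    middle = [[0] * pad + list(row) + [0] * pad for row in img]
    return top + middle + bottom

def random_crop(img, size, pad, rng):
    """Zero-pad the grid, then cut out a size x size window at a random offset."""
    padded = zero_pad(img, pad)
    top = rng.randint(0, len(padded) - size)      # inclusive on both ends
    left = rng.randint(0, len(padded[0]) - size)
    return [row[left:left + size] for row in padded[top:top + size]]

def random_hflip(img, rng, p=0.5):
    """Mirror the grid left-to-right with probability p."""
    return [row[::-1] for row in img] if rng.random() < p else img

rng = random.Random(0)
image = [[r * 4 + c + 1 for c in range(4)] for r in range(4)]  # values 1..16

augmented = random_hflip(random_crop(image, size=4, pad=1, rng=rng), rng)
for row in augmented:
    print(row)
```

The output keeps the original size, but the content is shifted by up to `pad` pixels in each direction, with zeros filling the border, and is mirrored about half of the time. Because the offsets are redrawn every time a sample is loaded, the network rarely sees exactly the same image twice.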

In [None]:
def getTransforms(augmentation=False):
    # mean,std for CIFAR10
    cifar_mean = (0.4914, 0.4822, 0.4465)
    cifar_std  = (0.2470, 0.2435, 0.2616)
    
    test_transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize(cifar_mean, cifar_std)
    ])

    if augmentation:
        train_transform = transforms.Compose([
            transforms.RandomCrop(32, padding=4),  # basic data aug
            transforms.RandomHorizontalFlip(),
            transforms.ToTensor(),
            transforms.Normalize(cifar_mean, cifar_std)
        ])
        return train_transform, test_transform    

    return test_transform, test_transform
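
The normalization step in both transforms is per-channel arithmetic: `(value - mean) / std`. As a small worked sketch (the helper `normalize_pixel` is hypothetical and operates on a single RGB pixel already scaled to `[0, 1]`, as `ToTensor` produces):

```python
CIFAR_MEAN = (0.4914, 0.4822, 0.4465)
CIFAR_STD = (0.2470, 0.2435, 0.2616)

def normalize_pixel(rgb, mean=CIFAR_MEAN, std=CIFAR_STD):
    """Apply (value - mean) / std independently to each colour channel."""
    return tuple((v - m) / s for v, m, s in zip(rgb, mean, std))

black = normalize_pixel((0.0, 0.0, 0.0))
white = normalize_pixel((1.0, 1.0, 1.0))
average = normalize_pixel(CIFAR_MEAN)

print("black  :", [round(v, 3) for v in black])
print("white  :", [round(v, 3) for v in white])
print("average:", [round(v, 3) for v in average])
```

A pixel equal to the dataset mean maps to zero, while pure black and pure white land at roughly two standard deviations on either side, so the inputs to the first convolution are approximately zero-centred with unit scale.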

In [None]:
def load_cifar10_data(batch_size=64, augmentation=False):
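    """
    Build the CIFAR-10 train/test DataLoaders.

    1. Get the transforms: random crop + horizontal flip + normalization for the
       training set when augmentation=True, otherwise ToTensor + normalization only.
    2. The test transform is always ToTensor + normalization (no augmentation).
    3. Load the CIFAR-10 train/test datasets.
    4. Create a DataLoader for each.

    Returns: train_loader, test_loader
    """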
    
    train_transform, test_transform = getTransforms(augmentation)

    # TODO: set download=True if first time
    train_dataset = torchvision.datasets.CIFAR10(root='./data', train=True, download=False, transform=train_transform)
    test_dataset  = torchvision.datasets.CIFAR10(root='./data', train=False, download=False, transform=test_transform)
    
    train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True, num_workers=2)
    test_loader  = torch.utils.data.DataLoader(test_dataset, batch_size=batch_size, shuffle=False, num_workers=2)
    
    return train_loader, test_loader

train_loader, test_loader = load_cifar10_data(batch_size=64)
print("Train set size:", len(train_loader.dataset))
print("Test set size:", len(test_loader.dataset))

## 3. Define a CNN with Dropout

**Task**:  
- Build a small **CNN** with a few convolution layers, each followed by ReLU, with max pooling after every second convolution.  
- Insert **dropout** layers (e.g., `nn.Dropout(0.3)`) to help reduce overfitting.  
- The final linear layer outputs 10 logits, one per class.  
- We'll call it `NetCIFAR`.
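
Before wiring the layers together it helps to check the spatial sizes by hand, since the first linear layer must match the flattened feature map. The sketch below uses the standard output-size formula `floor((in + 2*padding - kernel) / stride) + 1` and assumes four 3×3 convolutions with `padding=1`, a 2×2 max pool with stride 2 after every second convolution, and 128 output channels in the last convolution. The helper `conv_out` is hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

size = 32                                  # CIFAR-10 images are 32x32
size = conv_out(size, 3, padding=1)        # conv 3x3, padding 1 keeps the size
size = conv_out(size, 3, padding=1)        # second conv, still unchanged
size = conv_out(size, 2, stride=2)         # 2x2 max pool halves it
after_first_pool = size

size = conv_out(size, 3, padding=1)        # third conv
size = conv_out(size, 3, padding=1)        # fourth conv
size = conv_out(size, 2, stride=2)         # second max pool halves it again
after_second_pool = size

channels = 128                             # assumed channels of the last conv
flattened = channels * size * size
print(after_first_pool, after_second_pool, flattened)
```

If the flattened size does not equal the `in_features` of the first `nn.Linear`, the forward pass fails with a shape mismatch, so this quick calculation is worth repeating whenever the architecture changes.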

In [None]:
class NetCIFAR(nn.Module):
    def __init__(self, dropout_prob=0.3, useRegularization=False):
        super().__init__()
        self.useRegularization = useRegularization
        # example architecture
        self.conv1 = nn.Conv2d(3, 32, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
        self.pool  = nn.MaxPool2d(2,2)
        
        if self.useRegularization:
            self.dropout = nn.Dropout(dropout_prob)
        
        self.conv3 = nn.Conv2d(64, 128, kernel_size=3, padding=1)
        self.conv4 = nn.Conv2d(128, 128, kernel_size=3, padding=1)
        
        self.fc1 = nn.Linear(128*8*8, 256)  # after 2 pools => 8x8 size
        self.fc2 = nn.Linear(256, 10)
    
    def forward(self, x):
        # shape x: (batch,3,32,32)
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = self.pool(x)  # (batch,64,16,16)
        
        if self.useRegularization:
            x = self.dropout(x)
        
        x = F.relu(self.conv3(x))
        x = F.relu(self.conv4(x))
        x = self.pool(x)  # (batch,128,8,8)
        
        x = x.view(x.size(0), -1)  # flatten => (batch, 128*8*8)
        if self.useRegularization:
            x = self.dropout(x)
        
        x = F.relu(self.fc1(x))
        if self.useRegularization:
            x = self.dropout(x)
        
        out = self.fc2(x)
        # no softmax, we use CrossEntropyLoss
        return out
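
Dropout behaves differently in training and evaluation mode, which is why the training and evaluation code later switches modes with `model.train()` and `model.eval()`. The sketch below is a plain-Python illustration of the "inverted dropout" rule that `nn.Dropout` follows: in training each element is zeroed with probability `p` and the survivors are scaled by `1 / (1 - p)`; in evaluation the input passes through unchanged. The function `dropout` here is a hypothetical stand-in, not the PyTorch implementation.

```python
import random

def dropout(values, p, training, rng):
    """Inverted dropout on a flat list of activations."""
    if not training or p == 0.0:
        return list(values)                 # evaluation mode: identity
    keep_scale = 1.0 / (1.0 - p)            # rescale survivors to keep the expected value
    return [0.0 if rng.random() < p else v * keep_scale for v in values]

rng = random.Random(0)
activations = [1.0] * 10_000

train_out = dropout(activations, p=0.3, training=True, rng=rng)
eval_out = dropout(activations, p=0.3, training=False, rng=rng)

zero_fraction = train_out.count(0.0) / len(train_out)
train_mean = sum(train_out) / len(train_out)
print(f"zeroed: {zero_fraction:.3f}, mean in training: {train_mean:.3f}")
print("eval output unchanged:", eval_out == activations)
```

About 30% of the activations are dropped during training, yet the mean stays close to the original value thanks to the rescaling, so no extra correction is needed at evaluation time.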

## 4. Training Loop with Weight Decay

**Task**:  
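- We define `train(...)`, `test(...)` and `train_loop(...)`, which together implement a standard PyTorch training loop over several epochs. They take:
  - An optimiser (Adam or SGD) created with the **weight_decay** parameter (e.g. `1e-4`).  
  - A loss function, here `nn.CrossEntropyLoss()`.  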
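
To see what the `weight_decay` parameter does, here is a minimal sketch of a single plain SGD update in pure Python. With L2 weight decay, the term `weight_decay * w` is added to the gradient before the step, so every weight is pulled slightly towards zero even when the loss gradient is zero. The function `sgd_step` is a hypothetical illustration, not the optimiser's actual code.

```python
def sgd_step(weights, grads, lr, weight_decay=0.0):
    """One plain SGD update with L2 weight decay folded into the gradient."""
    return [w - lr * (g + weight_decay * w) for w, g in zip(weights, grads)]

weights = [1.0, -2.0]
zero_grads = [0.0, 0.0]   # pretend the loss gradient is zero to isolate the decay

no_decay = sgd_step(weights, zero_grads, lr=0.1)
with_decay = sgd_step(weights, zero_grads, lr=0.1, weight_decay=1e-4)

print("without decay:", no_decay)
print("with decay   :", with_decay)
```

`torch.optim.Adam` also adds `weight_decay * w` to the gradient, but the result then passes through Adam's adaptive scaling, so the effective shrinkage differs per parameter. `torch.optim.AdamW` applies the decay separately from the gradient instead, which is worth keeping in mind when comparing values of `weight_decay` across optimisers.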

In [None]:
def train(dataloader, model, loss_fn, optimiser):
    size = len(dataloader.dataset)
    model.train()

    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)
        
        optimiser.zero_grad()
        
        # Compute prediction error
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backpropagation step. Exercise: explain what each of the two calls below does
        loss.backward()
        optimiser.step()

        if batch % 100 == 0:
            loss, current = loss.item(), (batch + 1) * len(X)
            print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")

def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    model.eval()
    loss, accuracy = 0, 0
    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            pred = model(X)
            loss += loss_fn(pred, y).item()
            accuracy += (pred.argmax(1) == y).type(torch.float).sum().item()
    loss /= num_batches
    accuracy /= size

    return loss, accuracy

def train_loop(train_dataloader, test_dataloader, model, loss_fn, optimiser, epochs):
    train_loss = []
    train_accuracy = []
    test_loss = []
    test_accuracy = []

    # Iterate over each epoch
    for t in range(epochs):
        print(f"Epoch {t+1}:\n")
        train(train_dataloader, model, loss_fn, optimiser)

        # Get the overall loss and accuracy for both train and test datasets
        tr_loss, tr_acc = test(train_dataloader, model, loss_fn)
        ts_loss, ts_acc = test(test_dataloader, model, loss_fn)

        print(f"Train Error: \n Accuracy: {(100*tr_acc):>0.1f}%, Avg loss: {tr_loss:>8f} \n")
        print(f"Test Error: \n Accuracy: {(100*ts_acc):>0.1f}%, Avg loss: {ts_loss:>8f} \n")

        # Store and return the losses and accuracies. We can graph these later
        train_loss.append(tr_loss)
        train_accuracy.append(tr_acc)
        test_loss.append(ts_loss)
        test_accuracy.append(ts_acc)

    print("Done training!")
    return train_loss, train_accuracy, test_loss, test_accuracy
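
A quick sanity check for the loss values printed during training: `nn.CrossEntropyLoss` takes raw logits and computes `-log(softmax(logits)[target])`, averaged over the batch by default. The pure-Python sketch below (the helper `cross_entropy` is hypothetical and handles a single sample) shows the three regimes you are likely to see.

```python
import math

def cross_entropy(logits, target):
    """Cross-entropy for one sample: -log(softmax(logits)[target])."""
    m = max(logits)                                        # subtract the max for stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

# All 10 classes equally likely, as for a model that has learned nothing yet
uniform_loss = cross_entropy([0.0] * 10, target=3)

# High logit on the correct class
confident_loss = cross_entropy([0.0] * 3 + [5.0] + [0.0] * 6, target=3)

# High logit on a wrong class
wrong_loss = cross_entropy([5.0] + [0.0] * 9, target=3)

print(f"uniform: {uniform_loss:.4f}")
print(f"confident and right: {confident_loss:.4f}")
print(f"confident and wrong: {wrong_loss:.4f}")
```

With 10 classes, a loss near `ln(10) ≈ 2.3026` in the first few batches means the model is still guessing uniformly. Values well above that indicate confidently wrong predictions, which often shows up in the test loss once a model starts to overfit.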

## 5. Evaluate & Check Test Accuracy

**Task**:  
1. Measure train and test metrics for non-regularised model with non-augmented data.  
2. Do the same train and test loop for your regularised model on augmented data with weight decay in the optimiser and observe the difference.
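
One simple way to "observe the difference" is to track the gap between train and test accuracy per epoch; a gap that widens over time is a typical sign of overfitting. The helper below is a hypothetical sketch, and the numbers are placeholder values chosen for illustration only, not results from this notebook.

```python
def generalization_gap(train_accuracy, test_accuracy):
    """Per-epoch difference between train and test accuracy."""
    return [tr - ts for tr, ts in zip(train_accuracy, test_accuracy)]

# Placeholder values for illustration only, NOT measured results
toy_train_accuracy = [0.60, 0.75, 0.90]
toy_test_accuracy = [0.58, 0.68, 0.70]

gaps = generalization_gap(toy_train_accuracy, toy_test_accuracy)
for epoch, gap in enumerate(gaps, start=1):
    print(f"epoch {epoch}: gap = {gap * 100:.1f} percentage points")
```

Since both runs in this section store their metrics in the same variable names, copy the first run's lists into separate variables before starting the second run if you want to compare the two gaps side by side.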

### Non-regularised model with non-augmented data

In [None]:
model = NetCIFAR().to(device)  # NetCIFAR without dropout, moved to the same device as the data

# Define the loss function and the optimiser
loss_fn = nn.CrossEntropyLoss()
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)

In [None]:
epochs = 10
train_loss, train_accuracy, test_loss, test_accuracy = train_loop(train_loader, test_loader, model, loss_fn, optimiser, epochs)

Plotting metrics

In [None]:
epochRange = range(1, epochs+1)
# Defining data and plot params
lossDataList = [(train_loss, {'label': 'Train loss', 'decimalPlaces': 4, 'displayOffset': -0.009}), 
                (test_loss, {'label': 'Test loss', 'decimalPlaces': 4, 'displayOffset': -0.012})]
plotTitle = 'Train and Test Loss for NetCIFAR (no regularisation)'

# Calling my custom util function to plot loss data
plotMetrics(lossDataList, epochRange, xlabel='Epochs', ylabel='Loss', title=plotTitle, legend=True)

In [None]:
# Defining data and plot params
accuracyDataList = [(train_accuracy, {'label': 'Train accuracy', 'displayOffset': 0.0015,
                                      'decimalPlaces': 2, 'displayPercent': True}), 
                (test_accuracy, {'label': 'Test accuracy', 'ha': 'left', 'displayOffset': -0.003,
                                 'decimalPlaces': 2, 'displayPercent': True})]
plotTitle = 'Train and Test Accuracy for NetCIFAR (no regularisation)'

# Calling my custom util function to plot accuracy data
plotMetrics(accuracyDataList, epochRange, xlabel='Epochs', ylabel='Accuracy', title=plotTitle, legend=True)

### With regularisation and data augmentation applied

In [None]:
train_loader, test_loader = load_cifar10_data(batch_size=64, augmentation=True)

model = NetCIFAR(useRegularization=True).to(device)  # dropout enabled, moved to the same device as the data

optimiser = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4) # L2 regularization with weight_decay
train_loss, train_accuracy, test_loss, test_accuracy = train_loop(train_loader, test_loader, model, loss_fn, optimiser, epochs)

Plotting metrics

In [None]:
# Defining data and plot params
lossDataList = [(train_loss, {'label': 'Train loss', 'decimalPlaces': 4, 'displayOffset': -0.009}), 
                (test_loss, {'label': 'Test loss', 'decimalPlaces': 4, 'displayOffset': -0.012})]
plotTitle = 'Train and Test Loss for NetCIFAR (regularised, augmented data)'

# Calling my custom util function to plot loss data
plotMetrics(lossDataList, epochRange, xlabel='Epochs', ylabel='Loss', title=plotTitle, legend=True)

In [None]:
# Defining data and plot params
accuracyDataList = [(train_accuracy, {'label': 'Train accuracy', 'displayOffset': 0.0015,
                                      'decimalPlaces': 2, 'displayPercent': True}), 
                (test_accuracy, {'label': 'Test accuracy', 'ha': 'left', 'displayOffset': -0.003,
                                 'decimalPlaces': 2, 'displayPercent': True})]
plotTitle = 'Train and Test Accuracy for NetCIFAR (regularised, augmented data)'

# Calling my custom util function to plot accuracy data
plotMetrics(accuracyDataList, epochRange, xlabel='Epochs', ylabel='Accuracy', title=plotTitle, legend=True)

## Conclusion

**Week 6**: We applied **data augmentation** (random crop, flip) to handle noise or robustify input data, used **Dropout** in the CNN, and included **Weight Decay** in the optimiser. These regularization techniques help reduce overfitting and handle possible noise, aiming for more stable performance.

**End of Week 6 Exercise** – Tweak hyperparams, see if you can push test accuracy further while avoiding overfitting.