# Assignment 3 - Deep Learning

Machine Learning (BBWL), Michael Mommert, FS2023, University of St. Gallen

The **goal** of this assignment is to implement and train a neural network to perform image classification. While a good performance of the resulting trained model is desirable, it is more important to follow the task setup carefully and implement your code following best practices.

The dataset used is [CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html), which consists of 32x32 RGB images, each showing an object from one of 10 different classes.

Your **objectives** are the following:
* Implement a neural network architecture with at least 6 layers for the task of image classification. You can use any architecture you like.
* For each training epoch, output the loss on the training dataset and the loss on the validation dataset. Tune the learning rate using this setup (only use full and half decimal powers, e.g., 0.001, 0.005, 0.01, 0.05, ...) to maximize the accuracy on the validation dataset and prevent overfitting. Visualize the training and validation loss as a function of epoch for the best-performing learning rate in the same plot.
* Evaluate your final trained and tuned model on the test dataset by computing accuracy, precision and recall, visualize the confusion matrix and discuss implications.
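
The learning-rate grid named in the objectives (full and half decimal powers) can be generated rather than typed by hand. The following is a minimal sketch: `learning_rate_grid`, `pick_best_learning_rate`, and the `validation_accuracy` callback are hypothetical names introduced for illustration, and the exponent range is an assumption.

```python
def learning_rate_grid(min_exponent=-4, max_exponent=-1):
    """Return 1e-k and 5e-k values, e.g. 0.0001, 0.0005, 0.001, 0.005, ..."""
    grid = []
    for exponent in range(min_exponent, max_exponent + 1):
        for mantissa in (1, 5):
            # Parsing the literal avoids floating-point drift from 5 * 10**exponent
            grid.append(float(f"{mantissa}e{exponent}"))
    return grid


def pick_best_learning_rate(candidates, validation_accuracy):
    """Return the candidate with the highest validation accuracy.

    `validation_accuracy` is a hypothetical callback: it should train a fresh
    model with the given learning rate and return its validation accuracy.
    """
    return max(candidates, key=validation_accuracy)


candidate_lrs = learning_rate_grid()

# Toy stand-in for a real training run: pretends accuracy peaks at lr = 0.005
best_lr = pick_best_learning_rate(candidate_lrs, lambda lr: -abs(lr - 0.005))
print(candidate_lrs, best_lr)
```

In a real run the callback would have to re-initialize the model and optimizer for every candidate, so that the learning rates are compared from the same starting point.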

This assignment will be **graded** based on:
* whether these objectives have been achieved;
* whether the solution follows best practices;
* how well the approach is documented (e.g., using text cells, plots, etc.);
* how clean the code is.

There are no restrictions on the resources that you can use -- collaborating on assignments is allowed -- but students are not allowed to submit identical code.

There will be a leaderboard comparing the accuracies evaluated on the test dataset; the winner will receive a [grand prize](https://en.wikipedia.org/wiki/Mars_(chocolate_bar))!

Please submit your runnable Notebook to [michael.mommert@unisg.ch](mailto:michael.mommert@unisg.ch) **before 17 May 2023, 23:59**. Please include your name in the Notebook filename.

-----

## 1. Basic CNN Model
Disclaimer: I ran the code locally and therefore changed the first cell, including it in the first basic model.
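
The first fully connected layer of the model in the next cell expects `16 * 5 * 5` inputs. That number follows from the spatial size of a 32x32 image after two 5x5 convolutions (no padding, stride 1) and two 2x2 max-pooling steps. A minimal sketch of the arithmetic, with hypothetical helper names:

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_output_size(size, kernel, stride):
    """Spatial output size of a max-pooling layer along one dimension."""
    return (size - kernel) // stride + 1


size = 32                             # CIFAR-10 images are 32x32
size = conv_output_size(size, 5)      # conv1: 32 -> 28
size = pool_output_size(size, 2, 2)   # pool:  28 -> 14
size = conv_output_size(size, 5)      # conv2: 14 -> 10
size = pool_output_size(size, 2, 2)   # pool:  10 -> 5

flattened_features = 16 * size * size  # 16 output channels from conv2
print(flattened_features)  # 400
```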

In [8]:
import numpy as np
import os
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
import torchvision
from torchvision import transforms

# Set seed for reproducibility
seed_value = 42
np.random.seed(seed_value)
torch.manual_seed(seed_value)

# Check for CUDA availability
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
torch.cuda.manual_seed(seed_value)
print('[LOG] notebook with {} computation enabled'.format(str(device)))

# Download and preprocess the CIFAR-10 dataset
data_directory = './data_cifar10'
os.makedirs(data_directory, exist_ok=True)

# Data augmentation and normalization for training
transform_train = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomCrop(32, padding=4),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

# Just normalization for testing
transform_test = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

# Load the datasets
trainset = torchvision.datasets.CIFAR10(root=data_directory, train=True, download=True, transform=transform_train)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=128, shuffle=True)

testset = torchvision.datasets.CIFAR10(root=data_directory, train=False, download=True, transform=transform_test)
testloader = torch.utils.data.DataLoader(testset, batch_size=128, shuffle=False)

# Define the model
class Classifier(nn.Module):
    def __init__(self):
        super(Classifier, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

model = Classifier().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

# Training function
def train(model, loader, criterion, optimizer):
    model.train()
    running_loss = 0
    for X_batch, y_batch in loader:
        X_batch, y_batch = X_batch.to(device), y_batch.to(device)
        optimizer.zero_grad()
        y_pred = model(X_batch)
        loss = criterion(y_pred, y_batch)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    return running_loss / len(loader)

# Evaluation function
def evaluate(model, loader, criterion):
    model.eval()
    running_loss = 0
    with torch.no_grad():
        for X_batch, y_batch in loader:
            X_batch, y_batch = X_batch.to(device), y_batch.to(device)
            y_pred = model(X_batch)
            loss = criterion(y_pred, y_batch)
            running_loss += loss.item()
    return running_loss / len(loader)

# Prediction function
def predict(model, loader):
    model.eval()
    y_pred = []
    y_true = []
    with torch.no_grad():
        for X_batch, y_batch in loader:
            # Move inputs to the model's device (required when running on CUDA)
            y_pred_batch = model(X_batch.to(device))
            _, y_pred_batch = torch.max(y_pred_batch, dim=1)
            y_pred.extend(y_pred_batch.tolist())
            y_true.extend(y_batch.tolist())
    return y_true, y_pred

# Evaluation metric function
def print_evaluation_scores(y_true, y_pred):
    print('Accuracy:', accuracy_score(y_true, y_pred))
    print('F1-score:', f1_score(y_true, y_pred, average='weighted'))
    print('Precision:', precision_score(y_true, y_pred, average='weighted'))
    print('Recall:', recall_score(y_true, y_pred, average='weighted'))

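# Train and evaluate the model
# Note: the test set serves as the validation set here; Section 2.1 splits it
# into separate validation and test sets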
num_epochs = 10
for epoch in range(num_epochs):
    train_loss = train(model, trainloader, criterion, optimizer)
    valid_loss = evaluate(model, testloader, criterion)
    print(f"Epoch: {epoch+1}/{num_epochs}.. Training Loss: {train_loss:.3f}.. Validation Loss: {valid_loss:.3f}")

# Evaluate the model
y_test_true, y_test_pred = predict(model, testloader)
print("Model evaluation:")
print_evaluation_scores(y_test_true, y_test_pred)

[LOG] notebook with cpu computation enabled
Files already downloaded and verified
Files already downloaded and verified
Epoch: 1/10.. Training Loss: 2.303.. Validation Loss: 2.300
Epoch: 2/10.. Training Loss: 2.293.. Validation Loss: 2.280
Epoch: 3/10.. Training Loss: 2.233.. Validation Loss: 2.135
Epoch: 4/10.. Training Loss: 2.069.. Validation Loss: 1.966
Epoch: 5/10.. Training Loss: 1.950.. Validation Loss: 1.846
Epoch: 6/10.. Training Loss: 1.862.. Validation Loss: 1.754
Epoch: 7/10.. Training Loss: 1.784.. Validation Loss: 1.677
Epoch: 8/10.. Training Loss: 1.724.. Validation Loss: 1.642
Epoch: 9/10.. Training Loss: 1.680.. Validation Loss: 1.574
Epoch: 10/10.. Training Loss: 1.650.. Validation Loss: 1.551
Model evaluation:
Accuracy: 0.4317
F1-score: 0.41881326396450286
Precision: 0.43054134206219496
Recall: 0.4317
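
The weighted scores above average over all classes; a confusion matrix shows which classes are confused with which. The following is a minimal pure-Python sketch on toy labels, not on the CIFAR-10 predictions. The helper names are hypothetical; in the notebook, `sklearn.metrics.confusion_matrix(y_test_true, y_test_pred)` would produce the matrix directly.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[true_label][predicted_label] += 1
    return matrix


def per_class_precision_recall(matrix):
    """Return a list of (precision, recall) tuples, one per class."""
    num_classes = len(matrix)
    scores = []
    for c in range(num_classes):
        true_positives = matrix[c][c]
        predicted_as_c = sum(matrix[row][c] for row in range(num_classes))
        actually_c = sum(matrix[c])
        precision = true_positives / predicted_as_c if predicted_as_c else 0.0
        recall = true_positives / actually_c if actually_c else 0.0
        scores.append((precision, recall))
    return scores


# Toy example with three classes (illustrative values only)
y_true_toy = [0, 0, 1, 1, 2, 2]
y_pred_toy = [0, 1, 1, 1, 2, 0]

cm = confusion_matrix(y_true_toy, y_pred_toy, num_classes=3)
scores = per_class_precision_recall(cm)
for class_index, (precision, recall) in enumerate(scores):
    print(f"class {class_index}: precision={precision:.2f}, recall={recall:.2f}")
```

Off-diagonal entries in a row show where samples of that class end up, which the aggregated accuracy cannot reveal.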


----

## 2. Different Improvements
### 2.1 Data augmentation
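Data augmentation reduces overfitting by training a model on several slightly modified copies of the existing data (Wikipedia). The random flips and crops below are the same as in Section 1; what this cell adds is a stratified 50/50 split of the CIFAR-10 test images into separate validation and test sets.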
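
To make the two augmentations concrete, here is a minimal pure-Python sketch of a horizontal flip and a padded random crop on a tiny single-channel "image" stored as a nested list. It only illustrates the idea behind `RandomHorizontalFlip` and `RandomCrop(32, padding=4)`; it is not how torchvision implements them, and the helper names and the toy padding of 1 are assumptions.

```python
import random


def horizontal_flip(image):
    """Mirror every row of the image."""
    return [row[::-1] for row in image]


def pad(image, padding, fill=0):
    """Surround the image with `padding` pixels of `fill` on every side."""
    width = len(image[0]) + 2 * padding
    border = [[fill] * width for _ in range(padding)]
    rows = [[fill] * padding + list(row) + [fill] * padding for row in image]
    return border + rows + [list(r) for r in border]


def random_crop(image, size, padding, rng):
    """Pad the image, then cut out a random size x size window."""
    padded = pad(image, padding)
    top = rng.randint(0, len(padded) - size)
    left = rng.randint(0, len(padded[0]) - size)
    return [row[left:left + size] for row in padded[top:top + size]]


def augment(image, rng, flip_probability=0.5, padding=1):
    """Flip with the given probability, then crop back to the original size."""
    if rng.random() < flip_probability:
        image = horizontal_flip(image)
    return random_crop(image, len(image), padding, rng)


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
augmented = augment(image, random.Random(42))
for row in augmented:
    print(row)
```

Because the crop window moves by at most `padding` pixels, each epoch sees a slightly shifted and possibly mirrored version of every training image, while the image size stays unchanged.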

In [9]:
import torchvision.transforms as transforms
from torch.utils.data import DataLoader, Subset
from sklearn.model_selection import train_test_split

seed_value = 42
np.random.seed(seed_value)
torch.manual_seed(seed_value)

data_directory = './data_cifar10'

train_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomCrop(32, padding=4),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

val_test_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

train_path = data_directory + '/train_cifar10'
cifar10_train = torchvision.datasets.CIFAR10(root=train_path, train=True, download=True, transform=train_transform)
train_loader = DataLoader(cifar10_train, batch_size=128, shuffle=True)

eval_path = data_directory + '/eval_cifar10'
cifar10_eval = torchvision.datasets.CIFAR10(root=eval_path, train=False, download=True, transform=val_test_transform)

indices = list(range(len(cifar10_eval)))
val_indices, test_indices = train_test_split(indices, test_size=0.5, stratify=cifar10_eval.targets, random_state=seed_value)

val_dataset = Subset(cifar10_eval, val_indices)
test_dataset = Subset(cifar10_eval, test_indices)

val_loader = torch.utils.data.DataLoader(val_dataset, batch_size=128)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=128)

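# Train and evaluate the model
# Note: `model` and `optimizer` are reused from Section 1, so this run
# continues training the same weights rather than starting from scratch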
num_epochs = 10
for epoch in range(num_epochs):
    train_loss = train(model, train_loader, criterion, optimizer)
    valid_loss = evaluate(model, val_loader, criterion)
    print(f"Epoch: {epoch+1}/{num_epochs}.. Training Loss: {train_loss:.3f}.. Validation Loss: {valid_loss:.3f}")

# Evaluate the model
y_test_true, y_test_pred = predict(model, test_loader)
print("Model evaluation:")
print_evaluation_scores(y_test_true, y_test_pred)

Files already downloaded and verified
Files already downloaded and verified
Epoch: 1/10.. Training Loss: 1.622.. Validation Loss: 1.526
Epoch: 2/10.. Training Loss: 1.594.. Validation Loss: 1.502
Epoch: 3/10.. Training Loss: 1.575.. Validation Loss: 1.478
Epoch: 4/10.. Training Loss: 1.552.. Validation Loss: 1.450
Epoch: 5/10.. Training Loss: 1.536.. Validation Loss: 1.447
Epoch: 6/10.. Training Loss: 1.516.. Validation Loss: 1.416
Epoch: 7/10.. Training Loss: 1.501.. Validation Loss: 1.395
Epoch: 8/10.. Training Loss: 1.482.. Validation Loss: 1.377
Epoch: 9/10.. Training Loss: 1.471.. Validation Loss: 1.369
Epoch: 10/10.. Training Loss: 1.458.. Validation Loss: 1.368
Model evaluation:
Accuracy: 0.5012
F1-score: 0.4871487751660596
Precision: 0.5021603326261208
Recall: 0.5012
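
The validation and test sets used above come from a stratified 50/50 split, meaning each class contributes equally to both halves. The following sketch shows the idea in pure Python on toy labels; `stratified_halves` is a hypothetical helper and does not reproduce the exact shuffling of `train_test_split`.

```python
import random
from collections import defaultdict


def stratified_halves(labels, seed=42):
    """Split indices into two halves with the same class proportions."""
    rng = random.Random(seed)
    indices_by_class = defaultdict(list)
    for index, label in enumerate(labels):
        indices_by_class[label].append(index)

    first_half, second_half = [], []
    for class_indices in indices_by_class.values():
        rng.shuffle(class_indices)
        middle = len(class_indices) // 2
        first_half.extend(class_indices[:middle])
        second_half.extend(class_indices[middle:])
    return sorted(first_half), sorted(second_half)


# Toy labels: four samples of class 0 and six samples of class 1
labels_toy = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
val_idx, test_idx = stratified_halves(labels_toy)

print("validation:", [labels_toy[i] for i in val_idx])
print("test:      ", [labels_toy[i] for i in test_idx])
```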


### 2.2 Different Architecture
We can use the ResNet-18 architecture with weights pre-trained on ImageNet, which is available in the `torchvision.models` module.
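
One detail to keep in mind with pre-trained weights: the ImageNet models in torchvision were trained on inputs normalized with the ImageNet channel statistics, whereas the transforms in this notebook use a mean and standard deviation of 0.5. The sketch below only shows the arithmetic of both normalizations on single pixels; whether switching statistics improves results on CIFAR-10 is not tested here, and `normalize` is a hypothetical helper.

```python
def normalize(pixel, mean, std):
    """Apply (value - mean) / std per channel, as transforms.Normalize does."""
    return tuple((value - m) / s for value, m, s in zip(pixel, mean, std))


# Statistics used in this notebook: maps the range [0, 1] to [-1, 1]
NOTEBOOK_MEAN = (0.5, 0.5, 0.5)
NOTEBOOK_STD = (0.5, 0.5, 0.5)

# Channel statistics commonly used with torchvision's ImageNet weights
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)

white = (1.0, 1.0, 1.0)
black = (0.0, 0.0, 0.0)

print(normalize(white, NOTEBOOK_MEAN, NOTEBOOK_STD))  # (1.0, 1.0, 1.0)
print(normalize(black, NOTEBOOK_MEAN, NOTEBOOK_STD))  # (-1.0, -1.0, -1.0)
print(normalize(white, IMAGENET_MEAN, IMAGENET_STD))
```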

In [10]:
import torchvision.models as models

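# Load the pre-trained ResNet-18 model
# (torchvision >= 0.13 deprecates `pretrained=True` in favor of
# `weights=models.ResNet18_Weights.IMAGENET1K_V1`)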
model_resnet18 = models.resnet18(pretrained=True)

# Adjust the last layer to match the number of CIFAR-10 classes
num_classes = 10
model_resnet18.fc = nn.Linear(model_resnet18.fc.in_features, num_classes)

# Transfer the model to the appropriate device
model_resnet18.to(device)

# Set the loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model_resnet18.parameters(), lr=0.001)

# Train and evaluate the model
num_epochs = 10
for epoch in range(num_epochs):
    train_loss = train(model_resnet18, train_loader, criterion, optimizer)
    valid_loss = evaluate(model_resnet18, val_loader, criterion)
    print(f"Epoch: {epoch+1}/{num_epochs}.. Training Loss: {train_loss:.3f}.. Validation Loss: {valid_loss:.3f}")

# Evaluate the model
y_test_true, y_test_pred = predict(model_resnet18, test_loader)
print("Model evaluation:")
print_evaluation_scores(y_test_true, y_test_pred)

Downloading: "https://download.pytorch.org/models/resnet18-f37072fd.pth" to C:\Users\lukas/.cache\torch\hub\checkpoints\resnet18-f37072fd.pth
100%|██████████| 44.7M/44.7M [00:07<00:00, 6.46MB/s]


KeyboardInterrupt: 