# Assignment 3 - Deep Learning

Machine Learning (BBWL), Michael Mommert, FS2023, University of St. Gallen

The **goal** of this assignment is to implement and train a neural network to perform image classification. While a good performance of the resulting trained model is desirable, it is more important to follow the task setup carefully and implement your code following best practices.

The dataset used is [CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html), which consists of 32x32 RGB images, each showing an object from one of 10 different classes.

Your **objectives** are the following:
* Implement a neural network architecture with at least 6 layers for the task of image classification. You can use any architecture you like.
* For each training epoch, output the loss on the training dataset and the loss on the validation dataset. Tune the learning rate using this setup (only use full and half decimal powers, e.g., 0.001, 0.005, 0.01, 0.05, ...) to maximize the accuracy on the validation dataset and prevent overfitting. Visualize the training and validation loss as a function of epoch for the best-performing learning rate in the same plot.
* Evaluate your final trained and tuned model on the test dataset by computing accuracy, precision, and recall; visualize the confusion matrix and discuss the implications.
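
The learning-rate rule in the second objective can be made concrete with a small helper. The sketch below assumes that "full and half decimal powers" means 1 and 5 times each power of ten, as the listed examples suggest; the function name `learning_rate_grid` and its exponent arguments are hypothetical and not part of the assignment.

```python
from decimal import Decimal


def learning_rate_grid(min_exp=-4, max_exp=-1):
    """Return 1x and 5x each power of ten from 10**min_exp to 10**max_exp.

    Assumption: "half decimal powers" is read as 5 * 10**k, matching the
    examples 0.005 and 0.05 given in the task description.
    """
    grid = []
    for exp in range(min_exp, max_exp + 1):
        for mantissa in (1, 5):
            # Decimal scaling avoids float artefacts from repeated multiplication
            grid.append(float(Decimal(mantissa).scaleb(exp)))
    return grid


print(learning_rate_grid(-3, -2))  # [0.001, 0.005, 0.01, 0.05]
```

Widening the exponent range adds candidates at both ends of the grid, which may help if the best validation result sits at the smallest or largest value tried.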

This assignment will be **graded** based on:
* whether these objectives have been achieved;
* whether the solution follows best practices;
* how well the approach is documented (e.g., using text cells, plots, etc.);
* how clean the code is.

There are no restrictions on the resources that you can use -- collaborating on assignments is allowed -- but students are not allowed to submit identical code.

There will be a leaderboard comparing the accuracies evaluated on the test dataset; the winner will receive a [grand prize](https://en.wikipedia.org/wiki/Mars_(chocolate_bar))!

Please submit your runnable Notebook to [michael.mommert@unisg.ch](mailto:michael.mommert@unisg.ch) **before 17 May 2023, 23:59**. Please include your name in the Notebook filename.

-----

The following code cells will set up the environment and download and prepare the data for you. Please do not modify these code cells.

In [2]:
# import standard python libraries
from datetime import datetime
import numpy as np
import os

# import the PyTorch deep learning library
import torch, torchvision
import torch.nn.functional as F
from torch import nn, optim

# import sklearn classification evaluation library
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
from sklearn.model_selection import train_test_split

# import plotting capabilities
import matplotlib.pyplot as plt
import seaborn as sns

# init deterministic seed
seed_value = 42
np.random.seed(seed_value) # set numpy seed
torch.manual_seed(seed_value) # set pytorch seed CPU

# set cpu or gpu enabled device
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu').type
torch.cuda.manual_seed(seed_value)
print('[LOG] notebook with {} computation enabled'.format(str(device)))

# create data sub-directory in your local directory
data_directory = './data_cifar10'
if not os.path.exists(data_directory): os.makedirs(data_directory)

# download training images and split data (X) from labels (y)
train_path = data_directory + '/train_cifar10'
cifar10_train = torchvision.datasets.CIFAR10(root=train_path, train=True, download=True)
X_train = cifar10_train.data
y_train = cifar10_train.targets

# download evaluation images and split into val and test datasets
eval_path = data_directory + '/eval_cifar10'
cifar10_eval = torchvision.datasets.CIFAR10(root=eval_path, train=False, download=True)
X_val, X_test, y_val, y_test = train_test_split(cifar10_eval.data, cifar10_eval.targets, test_size=0.5, stratify=cifar10_eval.targets, random_state=seed_value)

# define class names
cifar10_classes = cifar10_train.classes

print('Train: {}, Val: {}, Test: {}'.format(len(X_train), len(X_val), len(X_test)))

[LOG] notebook with cpu computation enabled
Files already downloaded and verified
Files already downloaded and verified
Train: 50000, Val: 5000, Test: 5000


----

In [3]:
# Prepare the data
def preprocess(data):
    data = data.astype(np.float32) / 255.0
    data = np.transpose(data, (0, 3, 1, 2))
    return torch.from_numpy(data)

X_train, X_val, X_test = preprocess(X_train), preprocess(X_val), preprocess(X_test)
y_train, y_val, y_test = torch.tensor(y_train), torch.tensor(y_val), torch.tensor(y_test)

train_dataset = torch.utils.data.TensorDataset(X_train, y_train)
val_dataset = torch.utils.data.TensorDataset(X_val, y_val)
test_dataset = torch.utils.data.TensorDataset(X_test, y_test)

train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=128, shuffle=True)
val_loader = torch.utils.data.DataLoader(val_dataset, batch_size=128)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=128)


# Define the neural network architecture
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 32, 3, padding=1)
        self.conv2 = nn.Conv2d(32, 64, 3, padding=1)
        self.conv3 = nn.Conv2d(64, 128, 3, padding=1)
        self.pool = nn.MaxPool2d(2, 2)
        self.fc1 = nn.Linear(128 * 4 * 4, 256)
        self.fc2 = nn.Linear(256, 128)
        self.fc3 = nn.Linear(128, 10)
        self.dropout = nn.Dropout(0.5)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = self.pool(F.relu(self.conv3(x)))
        x = x.view(-1, 128 * 4 * 4)
        x = self.dropout(F.relu(self.fc1(x)))
        x = self.dropout(F.relu(self.fc2(x)))
        x = self.fc3(x)
        return x

net = Net().to(device)


# Train and evaluate the model
def train_model(model, criterion, optimizer, train_loader, val_loader, epochs=10, verbose=True):
    train_losses, val_losses = [], []
    
    for epoch in range(epochs):
        model.train()
        running_loss = 0
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            
            optimizer.zero_grad()
            output = model(images)
            loss = criterion(output, labels)
            loss.backward()
            optimizer.step()
            
            running_loss += loss.item()
        
        model.eval()
        validation_loss = 0
        with torch.no_grad():
            for images, labels in val_loader:
                images, labels = images.to(device), labels.to(device)
                
                output = model(images)
                loss = criterion(output, labels)
                validation_loss += loss.item()
        
        train_losses.append(running_loss/len(train_loader))
        val_losses.append(validation_loss/len(val_loader))
        
        if verbose:
            print("Epoch: {}/{}.. ".format(epoch+1, epochs),
                  "Training Loss: {:.3f}.. ".format(train_losses[-1]),
                  "Validation Loss: {:.3f}.. ".format(val_losses[-1]))
    
    return train_losses, val_losses
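
The `128 * 4 * 4` input size of `fc1` in the cell above follows from the spatial size of the feature map after three convolution and pooling stages. The sketch below reproduces that arithmetic with the standard output-size formula for convolution and pooling windows; the helper name `window_out` is hypothetical and is only used for illustration.

```python
def window_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling window (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1


size = 32  # CIFAR-10 images are 32x32 pixels
for _ in range(3):
    size = window_out(size, kernel=3, padding=1)  # 3x3 conv, padding 1: size unchanged
    size = window_out(size, kernel=2, stride=2)   # 2x2 max pool, stride 2: size halved

flat_features = 128 * size * size  # 128 channels after the last convolution
print(size, flat_features)  # 4 2048
```

If a convolution or pooling layer is added or removed, the same calculation gives the new input size required by the first linear layer.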

In [4]:
from tqdm.auto import tqdm as tqdm_auto

# Update the train_model function to show a progress bar
def train_model(model, criterion, optimizer, train_loader, val_loader, epochs=10, verbose=True):
    train_losses, val_losses = [], []
    
    for epoch in range(epochs):
        model.train()
        running_loss = 0
        
        loop = tqdm_auto(train_loader, leave=False) if verbose else train_loader
        for images, labels in loop:
            images, labels = images.to(device), labels.to(device)
            
            optimizer.zero_grad()
            output = model(images)
            loss = criterion(output, labels)
            loss.backward()
            optimizer.step()
            
            running_loss += loss.item()
            if verbose:
                loop.set_description(f"Epoch {epoch+1}/{epochs}")
                loop.set_postfix(loss=loss.item())
        
        model.eval()
        validation_loss = 0
        with torch.no_grad():
            for images, labels in val_loader:
                images, labels = images.to(device), labels.to(device)
                
                output = model(images)
                loss = criterion(output, labels)
                validation_loss += loss.item()
        
        train_losses.append(running_loss/len(train_loader))
        val_losses.append(validation_loss/len(val_loader))
        
        if verbose:
            print("Epoch: {}/{}.. ".format(epoch+1, epochs),
                  "Training Loss: {:.3f}.. ".format(train_losses[-1]),
                  "Validation Loss: {:.3f}.. ".format(val_losses[-1]))
    
    return train_losses, val_losses

# Now re-run the model training and evaluation code blocks after this update
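
The evaluation in the next cell reports accuracy, precision, and recall and plots a confusion matrix. As a rough illustration of how these quantities relate, the sketch below derives all three from a confusion matrix in plain Python. It assumes the scikit-learn convention of true classes in rows and predicted classes in columns; the function `metrics_from_confusion` and the 3-class matrix `toy_cm` are made up for this example and are not results from the model.

```python
def metrics_from_confusion(cm):
    """Compute accuracy and per-class precision and recall.

    Assumes cm[i][j] counts samples of true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[k][k] for k in range(n)) / total

    precision, recall = [], []
    for k in range(n):
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum: predicted as k
        actual_k = sum(cm[k])                          # row sum: truly class k
        precision.append(cm[k][k] / predicted_k if predicted_k else 0.0)
        recall.append(cm[k][k] / actual_k if actual_k else 0.0)
    return accuracy, precision, recall


# Made-up 3-class matrix, for illustration only
toy_cm = [[8, 2, 0],
          [1, 7, 2],
          [0, 1, 9]]

accuracy, precision, recall = metrics_from_confusion(toy_cm)
print(accuracy)  # 0.8
print(recall)    # [0.8, 0.7, 0.9]
```

Reading the matrix this way also helps with the discussion: a large off-diagonal entry `cm[i][j]` lowers the recall of class `i` and the precision of class `j` at the same time.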

In [5]:
# Tune learning rate
learning_rates = [0.001, 0.005, 0.01, 0.05]
best_val_loss = float('inf')
best_lr = 0.001

for lr in learning_rates:
    print(f"Trying learning rate: {lr}")
    net = Net().to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(net.parameters(), lr=lr)
    
    train_losses, val_losses = train_model(net, criterion, optimizer, train_loader, val_loader, epochs=10, verbose=False)
    if val_losses[-1] < best_val_loss:
        best_val_loss = val_losses[-1]
        best_lr = lr

print(f"Best learning rate: {best_lr}")

# Train the model with the best learning rate
net = Net().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(net.parameters(), lr=best_lr)
train_losses, val_losses = train_model(net, criterion, optimizer, train_loader, val_loader, epochs=10)

# Plot the training and validation losses
plt.plot(train_losses, label='Training Loss')
plt.plot(val_losses, label='Validation Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()

# Evaluate the model on the test dataset
net.eval()
y_pred = []
y_true = []

with torch.no_grad():
    for images, labels in test_loader:
        images, labels = images.to(device), labels.to(device)
        output = net(images)
        predicted = output.argmax(dim=1)  # index of the highest logit per sample
        y_pred.extend(predicted.cpu().numpy())
        y_true.extend(labels.cpu().numpy())

print("Accuracy: {:.2f}%".format(accuracy_score(y_true, y_pred) * 100))
print("\nClassification Report:\n", classification_report(y_true, y_pred, target_names=cifar10_classes))

# Visualize the confusion matrix
cm = confusion_matrix(y_true, y_pred)
plt.figure(figsize=(10, 10))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', xticklabels=cifar10_classes, yticklabels=cifar10_classes)
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.show()

Trying learning rate: 0.001
