# IN3310 Mandatory 1
### Håkon Ganes Kornstad - haakongk

#### General note for Tasks 1 & 2:
The code for these tasks can be found in `ResNetData.py`, `ResNetTrain.py` and `plotting.py`. Running `haakongk.py` will generate all the data, plots and CSV files described in this report, provided that the `mandatory1_data` folder is located next to these files.

#### Task 1: Dataset loading

**a)** After downloading the data from the server, a `ResNetData` class is implemented to organize the data and make a stratified split. After some testing, I chose **65% training data**, **16% validation data** and **19% test data**. The *training set* needs to be large enough for the model to learn from the data and generalize to unseen data. The *validation set* is used to monitor training across the epochs, and possibly to adjust hyperparameters along the way. It can be smaller, but a validation set that is too small makes the evaluation unstable. Finally, the *test set* should be about the same size as the validation set, but I chose to make it slightly larger.
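A stratified split with these proportions can be sketched as follows. This is a minimal illustration using only the standard library, not the code in `ResNetData.py`: the function name `stratified_split`, the rounding rule and the toy file names are assumptions made for the example.

```python
import random
from collections import defaultdict

def stratified_split(samples, fractions=(0.65, 0.16, 0.19), seed=0):
    """Split (path, label) pairs into train/val/test, class by class.

    Hypothetical helper for illustration; the fractions follow the report.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append(path)

    splits = {"train": [], "val": [], "test": []}
    for label, paths in sorted(by_label.items()):
        rng.shuffle(paths)
        n_train = round(fractions[0] * len(paths))
        n_val = round(fractions[1] * len(paths))
        splits["train"] += [(p, label) for p in paths[:n_train]]
        splits["val"] += [(p, label) for p in paths[n_train:n_train + n_val]]
        # the remainder goes to test, so no sample is dropped
        splits["test"] += [(p, label) for p in paths[n_train + n_val:]]
    return splits

# toy data: 3 classes with 100 images each
samples = [(f"class{c}/img{i}.jpg", c) for c in range(3) for i in range(100)]
splits = stratified_split(samples)
print({name: len(items) for name, items in splits.items()})
```

Because the split is done separately within each class, every class keeps roughly the same 65/16/19 proportions in all three sets.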

Instead of creating a folder structure `train, val, test` and copying the files into it, `ResNetData` creates the file `annotations.csv`, containing the file path of each image, along with the class label and stratified split info:

In [None]:
import pandas as pd

df = pd.read_csv('annotations.csv')
print(df.head())

   split                                         image_path  label
0  train  /mnt/e/ml_projects/IN3310/2025/IN3310/Mandator...      0
1  train  /mnt/e/ml_projects/IN3310/2025/IN3310/Mandator...      0
2  train  /mnt/e/ml_projects/IN3310/2025/IN3310/Mandator...      2
3  train  /mnt/e/ml_projects/IN3310/2025/IN3310/Mandator...      1
4  train  /mnt/e/ml_projects/IN3310/2025/IN3310/Mandator...      5


**b)** We now want to assert that no image leaks into more than one split. I implement this as a helper function in the `ResNetData` class, namely `_verify_disjoint_splits()`. The function simply reads `annotations.csv` and builds three **sets** of file names, one per split. Then `intersection()` is applied to each pair of sets: it should return an empty set when the splits are disjoint. I call this function from the constructor of `ResNetData`, so it runs automatically when the data is created.
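The idea behind the check can be sketched like this. It is a simplified stand-in for `_verify_disjoint_splits()`, not the actual method: the function name and the in-memory toy CSV are assumptions, while the column names follow `annotations.csv` as printed above.

```python
import csv
import io

def verify_disjoint_splits(csv_file):
    """Return True when no image path occurs in more than one split."""
    paths = {"train": set(), "val": set(), "test": set()}
    for row in csv.DictReader(csv_file):
        paths[row["split"]].add(row["image_path"])

    # pairwise intersections; each must be empty for disjoint splits
    overlaps = (
        paths["train"] & paths["val"],
        paths["train"] & paths["test"],
        paths["val"] & paths["test"],
    )
    return all(len(overlap) == 0 for overlap in overlaps)

# toy annotations with the same columns as annotations.csv
toy_csv = io.StringIO(
    "split,image_path,label\n"
    "train,a.jpg,0\n"
    "val,b.jpg,0\n"
    "test,c.jpg,1\n"
)
print(verify_disjoint_splits(toy_csv))
```

Note that a check of this kind compares file paths only, so it would not detect two different files with identical image content.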

**c)** A standard PyTorch `Dataset` is then implemented, to be wrapped in a `DataLoader`. The constructor again reads `annotations.csv`, and creates an array of the images we want to include, based on a given split. It also takes in the transforms, and determines whether an *augmented transform* is present. Then, in `__getitem__()`, we extract the `label` information, and `Image.open()` is used on a per-image basis. The transformed image is returned along with the label.

#### Task 2: Implementing ResNets

**a)** Please refer to the file `ResNet.py`, where the alterations were done according to the task.

**b-c)** We can now run a test training on the default ResNet-18:

In [5]:
from pathlib import Path
from ResNet import ResNet
from ResNetData import ResNetDataPreprocessor, ResNetDataset
from torchvision import transforms
from torch.utils.data import DataLoader
import torch.nn as nn
import torch.optim as optim
from ResNetTrain import train_model

BASE_PATH = Path.cwd()
DATASET_PATH = BASE_PATH / 'mandatory1_data'

# first we instantiate a preprocessor
preprocessor = ResNetDataPreprocessor(base_path=BASE_PATH, dataset_path=DATASET_PATH)

# we set up the transforms
transform = transforms.Compose([
    transforms.Resize((150, 150)),  # adjusting size
    transforms.ToTensor(),  # converting to PyTorch tensor
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))  # normalizing the RGB channels
])

# getting the datasets and -loaders
train_dataset = ResNetDataset(preprocessor.annotations_file, BASE_PATH, split='train', transform=transform)
val_dataset = ResNetDataset(preprocessor.annotations_file, BASE_PATH, split='val', transform=transform)

train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=32, shuffle=False)

class_names = preprocessor.get_class_names()

# instantiating a model
model = ResNet(img_channels=3, num_layers=18, num_classes=len(class_names))

# setting up the loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# we can now train the model
file_path1 = BASE_PATH / 'resnet1.pth'
train_accs1, val_acc1, map_scores1, class_accs1, train_losses1, val_losses1 = train_model(model, train_loader, val_loader, criterion, optimizer, file_path1, num_epochs=20)


Success! Train, validation and test sets are disjoint
Epoch 1/20, Train Loss: 0.9660, Val Loss: 0.9094, Train Acc: 0.6242, Val Acc: 0.6842, mAP: 0.7843 - Model Saved
Epoch 2/20, Train Loss: 0.6782, Val Loss: 0.6374, Train Acc: 0.7561, Val Acc: 0.7778, mAP: 0.8603 - Model Saved
Epoch 3/20, Train Loss: 0.5666, Val Loss: 0.5663, Train Acc: 0.7943, Val Acc: 0.7901, mAP: 0.8744 - Model Saved
Epoch 4/20, Train Loss: 0.5231, Val Loss: 0.4544, Train Acc: 0.8133, Val Acc: 0.8311, mAP: 0.9161 - Model Saved
Epoch 5/20, Train Loss: 0.4641, Val Loss: 0.5414, Train Acc: 0.8337, Val Acc: 0.8046, mAP: 0.8950
Epoch 6/20, Train Loss: 0.4250, Val Loss: 0.4871, Train Acc: 0.8464, Val Acc: 0.8233, mAP: 0.9162 - Model Saved
Epoch 7/20, Train Loss: 0.3966, Val Loss: 0.4273, Train Acc: 0.8574, Val Acc: 0.8486, mAP: 0.9304 - Model Saved
Epoch 8/20, Train Loss: 0.3883, Val Loss: 0.4088, Train Acc: 0.8638, Val Acc: 0.8617, mAP: 0.9284
Epoch 9/20, Train Loss: 0.3513, Val Loss: 0.4496, Train Acc: 0.8792, Val Acc: 

**d)** After this "sneak peek", we are ready to do the full training. After some initial runs to test performance, the following three models were chosen for this task (`batch_size=32` used for all):
1) **ResNet34** with CrossEntropyLoss, Adam optimizer, learning rate 0.001, using the basic transforms
2) **ResNet34** with CrossEntropyLoss, Adam optimizer, learning rate 0.001, using a set of augmented transforms (see below)
3) **ResNet34** with CrossEntropyLoss, stochastic gradient descent, learning rate 0.005, using the basic transforms

For the augmentation, I had limited intuition about which transforms to use, but in order to experiment a bit, the following `transforms` were set up:

In [None]:
# data augmentation
augm_transform = transforms.Compose([
    transforms.RandomResizedCrop(150, scale=(0.8, 1.0)), # scaling and cropping the image
    transforms.RandomHorizontalFlip(p=0.5), # performing a horizontal flip of 50% of the images
    transforms.RandomRotation(15), # random rotation within 15 degrees
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.1), # some color augmentation
    # the basics:
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

Throughout the training, the command `watch -n 1 nvidia-smi` was used to monitor the GPU, showing less than 2 GB of memory consumption at all times for these models.

A "brute force in main" functionality was then implemented to select the best of these three models after running them all, based on the mAP values. Also, `early stopping` was introduced with a threshold of 3 epochs: training stops once the mAP value has not improved for three consecutive epochs. Quite consistently, the best-performing model turned out to be **model 1**, with `mAP = 0.9402`.
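The early stopping rule can be sketched as a small helper. This is a minimal illustration, not the code in `ResNetTrain.py`: the class name `EarlyStopping` and the mAP values in the example are made up.

```python
class EarlyStopping:
    """Stop when the monitored score has not improved for `patience` epochs."""

    def __init__(self, patience=3):
        self.patience = patience
        self.best_score = float("-inf")
        self.epochs_without_improvement = 0

    def step(self, score):
        """Record one epoch's score; return True if training should stop."""
        if score > self.best_score:
            self.best_score = score
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience

# made-up mAP values for illustration, not results from the report
map_scores = [0.78, 0.86, 0.91, 0.90, 0.905, 0.89, 0.92]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, score in enumerate(map_scores, start=1):
    if stopper.step(score):
        stopped_at = epoch
        break
print(stopped_at, stopper.best_score)
```

In this toy run the best score is reached at epoch 3 and training stops at epoch 6, so the later value of 0.92 is never seen. This is the usual trade-off with a short patience.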

Mean accuracy per class was recorded for each epoch during training, and a plot shows how the results vary over the course of training. Initially the classes start from a low score, and are then quite unstable during the first 2-6 epochs. At epoch 9-10, the accuracy is perhaps most stable across all of the classes. From a visual inspection, this would be a good place to stop training. However, the model manages to find a better mAP score at epoch 14, before stopping early at epoch 17 after three epochs without improvement.

![mAP and mean accuracy per class for each epoch](plots/plot_map_scores1.png)
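Accuracy per class, and its mean over the classes, can be computed along these lines. The function name `per_class_accuracy` and the toy labels are assumptions for the example. The actual training code may compute this differently.

```python
from collections import Counter

def per_class_accuracy(y_true, y_pred):
    """Fraction of correctly classified samples, separately for each true class."""
    correct = Counter()
    total = Counter()
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return {c: correct[c] / total[c] for c in sorted(total)}

# toy labels for illustration
y_true = [0, 0, 0, 0, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 2, 0, 0]

acc = per_class_accuracy(y_true, y_pred)
mean_acc = sum(acc.values()) / len(acc)
overall_acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(acc, mean_acc, overall_acc)
```

The mean over classes weights every class equally, so it can differ from the overall accuracy when the classes have different sizes.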



The train/validation loss plot has a classic shape, with the validation loss starting out slightly above the training loss. For this particular training, the validation loss decreases quite quickly to join the training loss values. Then, around epoch 9-10, the validation loss starts to increase, which is a sign that the model has begun to overfit. Interestingly, the overfitting becomes evident at the point where we concluded that the model was most stable in the previous plot: perhaps we should have stopped training at epoch 10.

![train val plot](plots/plot_train_val_loss1.png)


#TODO: Accuracy per class

**e)** The model was then evaluated on the test set, giving the following output:

```
Test Accuracy: 0.8695
Test Loss: 0.3945
Test mAP: 0.9373
Test Mean Accuracy per Class: 0.8690
```
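To make the mAP value easier to interpret, here is a minimal sketch of one common definition: for each class, the samples are ranked by the predicted score for that class, the precision is recorded at the rank of every true positive, and the mean of these precisions is the average precision (AP). The mAP is the mean of the AP over all classes. The function names and toy scores are assumptions, and ties between scores are simply broken by sort order, which library implementations may handle differently.

```python
def average_precision(scores, is_positive):
    """AP for one class: mean of precision@k over the ranks k of the positives."""
    ranked = sorted(zip(scores, is_positive), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, positive) in enumerate(ranked, start=1):
        if positive:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0

def mean_average_precision(score_rows, labels, num_classes):
    """One-vs-rest AP per class, averaged over the classes."""
    aps = []
    for c in range(num_classes):
        class_scores = [row[c] for row in score_rows]
        class_targets = [label == c for label in labels]
        aps.append(average_precision(class_scores, class_targets))
    return sum(aps) / num_classes

# toy scores (one row per sample, one column per class), for illustration only
score_rows = [
    [0.9, 0.1],
    [0.6, 0.4],
    [0.7, 0.3],
    [0.2, 0.8],
    [0.5, 0.5],
]
labels = [0, 1, 0, 1, 0]
map_score = mean_average_precision(score_rows, labels, num_classes=2)
print(round(map_score, 4))
```

Unlike accuracy, this measure depends on the ranking of the scores rather than on the predicted class alone, which is why mAP and accuracy can differ noticeably for the same model.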

The assignment then asked for a plot of mAP scores on a per-epoch basis for the test run. However, following a discussion on Mattermost, this was deemed unnecessary. This is in line with the practice that the test set should generally be kept in a vault until it is time to do a final test on a promising model. Evaluating on the test set after each epoch might be good for illustration, but it is "dangerous": we risk the mistake of letting the test data influence training and model selection.

**f)** A pre-trained ResNet was now imported:

*