# Computer Vision Assignment
In this assignment you will create a computer vision classifier for the [CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html) dataset. To do so, you will build and compare several classifiers. More precisely, you will be asked to:

*   Create the right data-loading pipeline (e.g., Data augmentation, batch size, loading strategy, etc.)
*   Select the proper transfer learning strategy (e.g., fine-tuning, transfer learning, training from scratch) 
*   Select the right hyperparameters (e.g., learning rate, optimizer)

More importantly, you are required to **explain** the choices that you make. To do so, you can also run different experiments (e.g., comparing two learning rates or two transfer learning strategies) and comment, at the end of the notebook, on why one performs better than the other.
You can re-use the notebook that we have been using in the classroom.

You are only provided with:


*   The dataset (directly provided by Torchvision)
*   The neural architecture: a ResNet18, `network = torchvision.models.resnet18()`






In [1]:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
import torch.backends.cudnn as cudnn
import numpy as np
import torchvision
from torchvision import datasets, models, transforms
import matplotlib.pyplot as plt
import copy
from assignment_utils import *
%reload_ext autoreload
%autoreload 2

cudnn.benchmark = True
plt.ion()   # interactive mode

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print("Device", device)

Device cuda:0


## Loading Data

We will use the `torchvision` and `torch.utils.data` packages for loading the data.

The problem we're going to solve today is to train a model to classify ten different objects: _airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck_. We have 50000 training images and 10000 validation images.

We can directly load the dataset from its torchvision class.

---

When loading data, we perform data augmentation. It does not add new images, but it increases the variety of what the model sees during training by:

- Performing `RandomResizedCrop`, which is defined as follows by PyTorch:\
  \
  _The `RandomResizedCrop` transform crops an image at a random location, and then resizes the crop to a given size._\
  Illustration:
  ![img](https://pytorch.org/vision/main/_images/sphx_glr_plot_transforms_013.png)

- Performing a horizontal flip with a given probability. Our images should still be classifiable when flipped horizontally. Depending on the problem, this is not always acceptable.\
  ![img](https://pytorch.org/vision/main/_images/sphx_glr_plot_transforms_024.png)
- Converting to a tensor (a type formality rather than a real data augmentation)
- Normalizing (center & scale) the images to prevent scale differences

On the validation set, we don't apply a `RandomResizedCrop`, but a `CenterCrop`, which doesn't perform the resize. This keeps the image sharper and may facilitate classification. Compare the following illustration with the one of `RandomResizedCrop`:

![img](https://pytorch.org/vision/main/_images/sphx_glr_plot_transforms_003.png)

By cropping towards the center, we also try not to remove part of the object to be classified (which may lie on the side of the image). Losing part of the object may be acceptable in the training set, so that the algorithm learns to classify from a part of the object, but we don't want the algorithm to report low accuracy because a data augmentation technique removed too much of the object from the image.\
Furthermore, [some sources](https://stackoverflow.com/a/61637688/11680331) indicate that random transformations are only allowed on the training set (*i.e.* not on the validation set), as they should be undoable. 

In [2]:
data = Data(dl_batch_size=64, use_subset=True, subset_n_samples=2048)
pred = Prediction()

Files already downloaded and verified
Files already downloaded and verified
Train size: 50000
Val size: 10000
Class names: ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']


Now we have to create the dataloader, as we did in the previous lab for logistic regression. Notice, however, that we are using one more parameter:

*   ``num_workers`` is used to *parallelize* the loading from disk. 

Indeed, image datasets are normally too big to fit in memory, so images are loaded from disk for every batch. With `num_workers=4`, for example, four worker processes (not threads) prepare batches in the background, each worker loading whole batches, while the model trains on the current one.
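
As a rough sketch of where `num_workers` fits, the snippet below builds a `DataLoader` over a synthetic dataset. The helper `make_loader` and the fake tensors are hypothetical and are not part of `assignment_utils`. The demo call uses `num_workers=0` (loading in the main process) so that it runs anywhere; on a machine with spare CPU cores you would pass a positive value such as 4.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

def make_loader(dataset, batch_size=64, num_workers=4, train=True):
    """Hypothetical helper: wrap a dataset in a DataLoader."""
    return DataLoader(
        dataset,
        batch_size=batch_size,
        shuffle=train,                         # shuffle only the training data
        num_workers=num_workers,               # number of loading subprocesses
        pin_memory=torch.cuda.is_available(),  # faster host-to-GPU copies
    )

# Synthetic stand-in for CIFAR-10: 256 random "images" with random labels
fake_ds = TensorDataset(
    torch.randn(256, 3, 32, 32),
    torch.randint(0, 10, (256,)),
)

loader = make_loader(fake_ds, batch_size=64, num_workers=0)
images, labels = next(iter(loader))
print(images.shape, labels.shape, len(loader))
```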


### Visualize a few images
Let's visualize a few training images so as to understand the data
augmentations.



In [3]:
data.imshow_train_val(num_img=4)

## Select and compare different transfer-learning strategies

You need to compare the three learning strategies and comment on the results obtained:

- Transfer Learning (remember to directly extract the features!)
- Fine tuning
- Training from scratch

Start simple! Training from scratch may be very expensive with this dataset since we have 50000 images.

Also, to check that your code is working, you can use `torch.utils.data.Subset(dataset, indices)` to work on a smaller version of the dataset.
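
A minimal sketch of building such a subset, using a synthetic dataset in place of CIFAR-10 (the sizes 1000 and 128 are arbitrary choices for illustration):

```python
import torch
from torch.utils.data import Subset, TensorDataset

# Synthetic stand-in for the full dataset
full_ds = TensorDataset(
    torch.randn(1000, 3, 32, 32),
    torch.randint(0, 10, (1000,)),
)

# Draw 128 distinct random indices with a fixed seed for reproducibility
generator = torch.Generator().manual_seed(0)
indices = torch.randperm(len(full_ds), generator=generator)[:128].tolist()

small_ds = Subset(full_ds, indices)
print(len(small_ds))  # 128
```

Sampling the indices at random, rather than taking the first ones, avoids a subset dominated by a few classes if the dataset happens to be ordered.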


In [4]:
num_classes = 10
lr = 1e-4
weight_decay = 1e-4

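# Model: ResNet18 with randomly initialised weights (training from scratch)
model = torchvision.models.resnet18(num_classes=num_classes).to(device)

# Loss: cross-entropy, taken from torch.nn so no extra import is needed
loss = nn.functional.cross_entropy

# Optimizer
opt = optim.AdamW(params=model.parameters(), lr=lr, weight_decay=weight_decay)

# Training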
model = pred.train_model(
  model=model,
  train_dl=data.train_dl,
  val_dl=data.val_dl,
  loss=loss,
  optim=opt,
  num_epochs=10,
)


Epoch 0/9
----------
Training complete in 0m 4s
Best val Acc: 0.000000


RuntimeError: CUDA out of memory. Tried to allocate 196.00 MiB (GPU 0; 3.95 GiB total capacity; 3.25 GiB already allocated; 79.44 MiB free; 3.38 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


## Select and compare hyperparameters
Once you have found the best transfer learning strategy, find the best hyperparameters:


*   Learning Rate
*   Data augmentation
*   Optional: optimizer, batch size, etc.
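
A comparison of learning rates could be organised along these lines. This is a toy sketch under stated assumptions: the helper `run_trial`, the synthetic data and the linear model are all hypothetical and only show the structure of the experiment (same seed, same data, one hyperparameter varied at a time). In the assignment, the trial would train the ResNet18 and report validation accuracy.

```python
import torch
import torch.nn as nn

def run_trial(lr, steps=50, seed=0):
    """Train a tiny linear classifier and return its final training loss."""
    torch.manual_seed(seed)  # same data and initial weights for every trial
    x = torch.randn(256, 20)
    y = (x[:, 0] > 0).long()  # label depends on the sign of the first feature
    model = nn.Linear(20, 2)
    opt = torch.optim.AdamW(model.parameters(), lr=lr, weight_decay=1e-4)
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.cross_entropy(model(x), y)
        loss.backward()
        opt.step()
    return loss.item()

# Vary only the learning rate; everything else is held fixed
results = {lr: run_trial(lr) for lr in (1e-4, 1e-3, 1e-2)}
for lr, final_loss in results.items():
    print(f"lr={lr:g}  final loss={final_loss:.4f}")
```

Fixing the seed means any difference between trials comes from the hyperparameter under study rather than from a lucky initialisation.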




In [83]:
## FILL IT YOURSELF

## Evaluation of the assignment
The evaluation of the assignment will be based on 3 different aspects:

*   Percentage of assignment completed (50% of the grade)
*   Correctness of the comments used to explain the result (40% of the grade)
*   Validation accuracy of the final model provided (10% of the grade)

