# Exercise Solutions

These are some solutions to the exercises in Workshop_5. They are not the only way to solve the exercises and are intended as a guide.
***
1. Can we use a different backbone? A very simple change is to try a larger ResNet. The 18 in the one we have used refers to the number of layers, and there are also versions with 34, 50, 101 and 152 layers. Have a look at the cell defining the model, specifically the lines:
- `from torchvision.models import resnet18, resnet`
- `self.backbone = nn.Sequential(*list(resnet18(weights=resnet.ResNet18_Weights.DEFAULT).children())[:-1])`

In [None]:
# 1. SOLUTION
# In the CVModel definition we just need to import a larger ResNet and change the model name and weights to match.

import torch
from torch import nn
from torchvision.models import resnet34, resnet

class CVModel(nn.Module):

    def __init__(self, n_classes):
        super().__init__()
        self.backbone = nn.Sequential(*list(resnet34(weights=resnet.ResNet34_Weights.DEFAULT).children())[:-1])
        self.flatten = nn.Flatten(start_dim=1)
        # resnet34 outputs 512 features, the same as resnet18, so the classifier is unchanged
        self.classifier = nn.Linear(512, n_classes)

    def forward(self, x):
        x = self.backbone(x)
        x = self.flatten(x)
        return self.classifier(x)
        

model = CVModel(n_classes=3)

***
2. We have evaluated the model on the validation set and returned an overall accuracy score, but is this the best way to validate the performance of the model? Are there any other metrics we could calculate on this dataset?

Use this code to obtain all predictions and labels for the validation set and think about what else you could calculate:
```python
preds = []
labels = []
for x, y in val_dl:
    logits, softmax, argmax = predict(model, x)
    preds.extend(argmax.tolist())
    labels.extend(y.tolist())
```

In [None]:
# 2. SOLUTION
# Using the preds and labels collected above, we can look at the precision and recall for each class.
from sklearn.metrics import classification_report

print(classification_report(labels, preds))
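
A confusion matrix is another useful view, because it shows which classes get mistaken for which. The sketch below builds one in plain Python so the counting is visible; `sklearn.metrics.confusion_matrix(labels, preds)` produces the same table. The `labels` and `preds` values here are made-up toy data, not results from the workshop model.

```python
# Hypothetical toy values standing in for the `labels` and `preds` lists collected above.
labels = [0, 0, 1, 1, 2, 2, 2]
preds = [0, 1, 1, 1, 2, 0, 2]

n_classes = 3
# confusion[i][j] counts samples whose true class is i and predicted class is j.
confusion = [[0] * n_classes for _ in range(n_classes)]
for true_class, pred_class in zip(labels, preds):
    confusion[true_class][pred_class] += 1

for row in confusion:
    print(row)

# Per-class recall: correct predictions for a class divided by its true count.
recall = [confusion[c][c] / sum(confusion[c]) for c in range(n_classes)]
print(recall)
```

Off-diagonal entries are the mistakes, so a class with low recall can be traced to the specific class it is being confused with.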

***
3. What happens if the normalization steps are removed from the transform pipeline? How does this affect the values of `x` in the batches from the training dataloader? How does this affect the model training?

To do this, just comment out the `v2.Normalize` step in the transform pipelines. Without it, the values of `x` stay in the [0, 1] range produced by `v2.ConvertImageDtype`, instead of being shifted and scaled per channel. You will likely find that the accuracy goes down, but probably not by a huge amount: because we are fine-tuning the whole model, the input layer weights are also adapted during training, so they should be able to compensate.
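
To make the effect on `x` concrete, the arithmetic below works out the range each channel can take after normalization, assuming the image has already been converted to floats in [0, 1]. It is a worked example in plain Python using the mean and std values from the transform pipeline, not code from the notebook; the `ranges` dictionary is just an illustrative name.

```python
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

# Normalization maps each channel value x to (x - mean) / std,
# so the [0, 1] input range becomes a different range per channel.
ranges = {}
for channel, m, s in zip("RGB", mean, std):
    low = (0.0 - m) / s
    high = (1.0 - m) / s
    ranges[channel] = (low, high)
    print(f"{channel}: [{low:.3f}, {high:.3f}]")
```

Each channel ends up spanning roughly -2 to +2.6 and centred near zero, which is the input distribution the pretrained weights were trained on.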

***
4. Are there any other transforms that could be added to the training transform pipeline - have a look [here](https://pytorch.org/vision/stable/transforms.html#v2-api-reference-recommended) and try a few!

Really, this can be any transforms that you find interesting. Just add them to the following cell in the notebook:
```python
from torchvision.transforms import v2

train_transforms = v2.Compose(
    [
        SquareImage(),
        v2.Resize(224),
        v2.RandomHorizontalFlip(),
        v2.ToImageTensor(),
        v2.ConvertImageDtype(),
        v2.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    ]
)

val_transforms = v2.Compose(
    [
        SquareImage(),
        v2.Resize(224),
        v2.ToImageTensor(),
        v2.ConvertImageDtype(),
        v2.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    ]
)
```



***
5. When we use the pretrained model we are 'cheating' a little bit: it has been trained on ImageNet, which includes many animals, including dogs and cats, so the model already knows how to extract relevant features. What happens if we don't use a pretrained model? Take a look at this line in the model definition and modify it so we start with a completely fresh model:
- `self.backbone = nn.Sequential(*list(resnet18(weights=resnet.ResNet18_Weights.DEFAULT).children())[:-1])`

How does this change the accuracy achieved in 5 epochs?

Here you just need to change the `weights=...` argument to `weights=None` in the above line of the `CVModel` definition.

```python
import torch
from torch import nn
from torchvision.models import resnet18  # the `resnet` module is no longer needed without pretrained weights

class CVModel(nn.Module):

    def __init__(self, n_classes):
        super().__init__()
        # weights=None gives a randomly initialised ResNet instead of the ImageNet weights
        self.backbone = nn.Sequential(*list(resnet18(weights=None).children())[:-1])
        self.flatten = nn.Flatten(start_dim=1)
        self.classifier = nn.Linear(512, n_classes)

    def forward(self, x):
        x = self.backbone(x)
        x = self.flatten(x)
        return self.classifier(x)
        

model = CVModel(n_classes=3)
```

You are now training a completely fresh neural network. It will likely not achieve very good accuracy in just 5 epochs, and will need much longer training and/or much more data.