# Deepcamp: Codelab 3

**In this tutorial we will cover**:

- Pytorch: a Deep learning framework
- Fine-tuning a pretrained Neural Network on a 
- Using a Web service for fine-tuning


**Author**:
- Alessio Devoto (alessio.devoto@uniroma1.it)


**Duration**: 50 mins 


**Warning**: Make sure your colab has the GPU option enabled!

# German Traffic Sign Recognition Benchmark 

We need a model to classify photos of traffic signs.

What are we going to do? 🤔

1. Data import & preprocessing 💾
2. Download of a pretrained neural network
3. Fine-tuning & evaluation of the neural network 📉

We first import the necessary libraries as usual

In [24]:
import torch
from torch.nn import Linear
import torchvision
from torchvision.transforms import Compose, ToTensor, Normalize, Resize, ToPILImage

from torchvision.datasets import GTSRB # 43 classes

## 1. Data import and preprocessing
There is plenty of [torchvision datasets](https://pytorch.org/vision/stable/datasets.html) to use off-the-shelf if you ever want to try yourself.

## 1.1 Data import
In this case, we can leverage torchvision's built-in dataset collection. The torchvision.datasets modules makes several famouse datasets available. 

In general, we can pick between train and validation set via some flags.

In [25]:
trainset = GTSRB(root='./data', split='train', download=True, transform=None) # only download train split
testset = GTSRB(root='./data', split='test', download=True, transform=None)

image, label = trainset[0]
# image will be the input datum to our model
# label is the class corresponding to that image

In [26]:
# this is a <PIL Image>
# PIL is a library for imaging 
image.show()

In [27]:
# the class corresponding to this image
label

0

We have 43 different classes in this dataset. This would be too much for our poor Colab free GPU 

Let us create new datasets that contain only 3 of the initial 43 classes.

In [None]:
"""labels_to_keep = {0,1,2}
keep_train_idx = [idx for idx, elem in enumerate(trainset) if elem[1] in labels_to_keep]
keep_test_idx = [idx for idx, elem in enumerate(testset) if elem[1] in labels_to_keep]

trainset = torch.utils.data.Subset(trainset, keep_train_idx)
testset = torch.utils.data.Subset(trainset, keep_test_idx)"""


### 1.1 Transforms

Let me introduce you torchvision transforms!
Transforms are a powerful concept that allows us to preprocess each sample in the dataset right after it is retrieved. They can be used for a lot of tasks, like resizing images or performing data augmentation.

IMAGE of torchvision transforms

Thanks to the Resize transform, we can resize the image to a bigger format and have a better look at it

In [4]:
Resize((128,128))(image).show()

In [28]:
# for later
transform = Compose(
    [
    ToTensor(),
    Resize((32,32)),
    Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
    ])

You can combine transforms using transforms.Compose. 
You have a lot of options found in [torchvision.transforms.](https://pytorch.org/vision/stable/transforms.html)

In [23]:
# Exercise: Apply a sequence of transforms a random image in the dataset and display it

im = trainset[100][0]
im.show()

transformed_image = Compose([torchvision.transforms.RandomHorizontalFlip(), Resize((128,128))])(im)
transformed_image.show()

### 1.2 Preprocessing and dataloaders

[`torch.utils.data.DataLoader`](https://pytorch.org/docs/stable/data.html?highlight=dataloader#torch.utils.data.DataLoader) is an iterator which provides:

- Data batching
- Data shuffling
- Parallel data loading using `multiprocessing` workers. Meaning that while the `GPU` is performing some computation on a batch, in parallel you can load the next batch.


Creating a dataloader from a dataset is straightforward, this is one example that highlights some of the most used parameters:

In [29]:
batch_size = 4

trainset = GTSRB(root='./data', split='train', download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size,
                                          shuffle=True, num_workers=2)

testset = GTSRB(root='./data', split='test', download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size,
                                         shuffle=False, num_workers=2)



## 2. Finally: hands on a Neural Network

In Pytorch you can build your own neural networks with custom architectures by using the `nn.Module` class. Unfortunately, we don't have time for that here :( 

Good news though: Pytorch gives us the chance to download pre-designed and in some cases even pre-trained architectures. 

IMAGE of pretraining


Let us download the pre-trained version of ResNet-18, a very famous Convolutional Neural Network. 
This net was trained on thousands of data already, so we expect it to be quite good at a variety of tasks.

In [30]:
from torchvision.models import resnet18

res = resnet18(weights=True)




Let us have a look at ResNet under the hood

In [31]:
res

ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
    (1): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
  

Cool! The model is composed by a number of blocks and a final fully connected layer, that had 1000 output neurons... 

Our dataset only had 47 classes though! What shall we do? 
We can simplt replace the fully connected linear layer with a new (randomly initialized one).

In [32]:
# in case we want to freeze the feature extractor
for param_name, param in res.named_parameters():
    param.requires_grad = False

In [33]:
# Exercise: replace the last fully connected layer of the network with a linear layer
# Hint: the linear layer must have the same number of input neurons but a only 47 output neurons

res.fc = Linear(512, 47)

### 2.1 Training

The training process of a neural network can be long and even expesive in terms of energy and time required. 

Based on the hardware you have, it can take a lot to train even small models. This is the reason why a lot of the "big-tech" companies invest a lot of capital for buying GPUs. 


The general scheme for the training is usually the following:

IMAGE of training

Because this would take some time to code, we make use of a library which makes the training easier, by managing the whole training and validation routing in a transparent way.

Pytorch-Lightning makes the training more efficient and just requires us to define what the network should do in the training and validation steps.

In [38]:
from pytorch_lightning import LightningModule, Trainer
from torch.nn.functional import cross_entropy
from torchmetrics import Accuracy

class LightningResNet(LightningModule):
    def __init__(self, model):
        super().__init__()
        self.model = model


    def forward(self, x):
        return self.model(x)
    

    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.model.fc.parameters(), lr=1e-3)
        return optimizer


    def training_step(self, train_batch, batch_idx):
        image, label = train_batch
        prediction = self.model(image)
        loss = cross_entropy(prediction, label)
        self.log('train_loss', loss)
        return loss

    def validation_step(self, val_batch, batch_idx):
        image, label = val_batch
        prediction = self.model(image)
        loss = cross_entropy(prediction, label)
        self.log('val_loss', loss)

        accuracy = Accuracy('multiclass', num_classes=47)
        acc = accuracy(prediction, label)
        self.log('accuracy', acc, on_epoch=True)
    
    def test_step(self, val_batch, batch_idx):
        image, label = val_batch
        prediction = self.model(image)
        loss = cross_entropy(prediction, label)
        self.log('val_loss', loss)

        accuracy = Accuracy('multiclass', num_classes=47)
        acc = accuracy(prediction, label)
        self.log('accuracy', acc, on_epoch=True)

lit_resnet = LightningResNet(res)


In [36]:
trainer = Trainer(accelerator='cpu', max_epochs=1) # change accelerator to gpu # max_time="00:00:02:00"
trainer.fit(lit_resnet, train_dataloaders=trainloader, val_dataloaders=testloader) 

GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

  | Name  | Type   | Params
---------------------------------
0 | model | ResNet | 11.2 M
---------------------------------
24.1 K    Trainable params
11.2 M    Non-trainable params
11.2 M    Total params
44.802    Total estimated model params size (MB)


Epoch 0: 100%|██████████| 6660/6660 [05:18<00:00, 20.93it/s, v_num=1]      

`Trainer.fit` stopped: `max_epochs=1` reached.


Epoch 0: 100%|██████████| 6660/6660 [05:18<00:00, 20.92it/s, v_num=1]


In [37]:
trainer.test(lit_resnet,dataloaders=testloader)

MisconfigurationException: No `test_step()` method defined to run `Trainer.test`.