## Introduction

In [None]:
from google.colab import drive
drive.mount("/content/drive")


In [None]:
pip install pytorch-lightning==0.7.6 > /dev/null

In [None]:
cd '/content/drive/My Drive/i2dl_exercises/exercise_11'

Transfer learning is a research problem in machine learning that focuses on storing knowledge gained while solving one problem and applying it to a different but related problem. For example, knowledge gained while learning to recognize cars could apply when trying to recognize trucks. Transfer learning allows one innovation to contribute to many others, and makes cross-applications possible. 

<img src="https://miro.medium.com/max/1050/1*Z11P-CjNYWBofEbmGQrptA.png"
     alt="transfer learning"
     width="600"/>

Transfer learning can also compensate for a lack of training data. Imagine you’re trying to build a deep learning model but don’t have much training data. Maybe you’re trying to identify a rare skin disease and only have 100 images. Meanwhile, someone else has trained an image recognition model on 100,000 labeled photos of dogs and has managed to get 96 percent accuracy at classifying different breeds. These tasks don’t seem related, but that doesn’t mean the dog breed classifier is irrelevant: the low-level visual features it has learned, such as edges and curves, can be reused for the skin disease images.

## Preparation

### Imports

Here we import some necessary packages which you are already familiar with.

In [None]:
import os
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
import torchvision
from torch.utils.data import DataLoader
from torchvision import transforms
from PIL import Image
import pytorch_lightning as pl
from pytorch_lightning import Trainer
import numpy as np

%matplotlib inline
plt.rcParams['figure.figsize'] = (4.8, 4.8) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

# for auto-reloading external modules
# see http://stackoverflow.com/questions/1907993/autoreload-of-modules-in-ipython
%load_ext autoreload
%autoreload 2

### Get Device

The default device for this notebook is the CPU. You may also use a GPU if one is available, e.g. on Google Colab.

In [None]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

print('Using device:', device)

### Load Data

In this exercise, we use the CIFAR10 dataset for transfer learning.  
However, we resize the input images so that they better fit the pretrained model.

In [None]:
# Set training image size. CIFAR10 images are 32x32, ImageNet images are 224x224
image_size = 64

transform = torchvision.transforms.Compose(
    [torchvision.transforms.Resize((image_size, image_size), interpolation=2),
     torchvision.transforms.ToTensor(),
     torchvision.transforms.Normalize(mean=[0.49191375, 0.48235852, 0.44673872],
                                      std=[0.24706447, 0.24346213, 0.26147554])])

CIFAR_ROOT = "../datasets/cifar10"

data = torchvision.datasets.ImageFolder(root=CIFAR_ROOT, transform=transform)
print("Dataset contains %d images." % len(data))

train_set, val_set, test_set = torch.utils.data.random_split(data, [40000, 5000, 5000])

dataset = (train_set, val_set, test_set,)

### Visualize Some Data

In [None]:
sample_num = 5
classes = os.listdir(CIFAR_ROOT)
print(classes)

for idx, c in enumerate(classes):
    sampled_idx = torch.randint(1, 4000, (sample_num,))
    for i in range(sample_num):
        plt.subplot(1, sample_num, i + 1)
        image = Image.open(CIFAR_ROOT + "/" + c + "/" + str(sampled_idx[i].item()).zfill(4) + '.png')
        plt.imshow(image)
    plt.show()

### Setup Tensorboard

In [None]:
%load_ext tensorboard
%tensorboard --logdir lightning_logs --port 6008

## Load the Pretrained Model

In this notebook we use a pretrained mobilenet_v2 model [(Mark Sandler et al.)](https://arxiv.org/abs/1801.04381). mobilenet_v2 is a lightweight convolutional network designed for mobile and other resource-constrained devices. 
The mobilenet_v2 model contains 2 components: 
- __Conv feature extractor:__ Convolutional layers, ReLU-activations and BatchNorm to encode the visual features of an image
- __Classification head:__ Dropout and fully-connected layer for classification.

Here, the mobilenet_v2 provided by torchvision is pretrained on [ImageNet](http://www.image-net.org/) with input images of size 224x224. 

<img src="https://pytorch.org/assets/images/mobilenet_v2_2.png"
     alt="mobilenet_v2"
     width="300"/>

The main idea behind transfer learning for image classification is that we can use the pretrained convolutional network to encode the CIFAR10 images, since it was trained on a much larger dataset. We hope that the feature extraction generalizes well to CIFAR10. However, we have to re-train our task-specific classifier and a few high-level convolutional layers, because CIFAR10 has fewer and different classes than the ImageNet dataset. 

Luckily, PyTorch already provides a range of pretrained networks for various tasks like classification and object detection. You can check out the different models here: https://pytorch.org/docs/stable/torchvision/models.html

In [None]:
# We first load the model and have a look at the different parts of the network

# load the pretrained model
# move it to the device selected above (calling .cuda() directly fails on CPU-only machines)
pretrained_model = torchvision.models.mobilenet_v2(pretrained=True).to(device)

# features
print('Conv feature extractor:')
print(pretrained_model.features)
# classifier
print('Classification head:')
print(pretrained_model.classifier)

Here we take a look at the network architecture with ```torchsummary```. In case you haven't used it before, install it with ```pip install torchsummary```.

In [None]:
# Print a more thorough summary with the pretrained ImageNet input size 224x224
import torchsummary

torchsummary.summary(pretrained_model, (3, 224, 224))

In [None]:
# In the following, we only make use of the pretrained convolutional layers. However, to save computation time,
# we are not training our network on 224x224 images but 64x64. That is the beauty of convolution layers,
# as they can be applied on any resolution.
feature_extractor = pretrained_model.features

torchsummary.summary(feature_extractor, (3, 64, 64))

We build a new classifier ```PretrainedClassifier``` that incorporates the pretrained feature extractor. To tailor the model to our specific task, we have to build and re-train a new classifier as well as the high-level convolutional layers. However, the low-level convolutional layers extract simple features like edges and curves and should not change during training. Therefore, we freeze their weights by disabling the ```requires_grad``` property of their parameters. In this way the frozen feature extraction layers will not be updated!

Note that our self-defined classification head uses only a single fully-connected layer. No dropout layer is needed, and we can still get good results.

In [None]:
class PretrainedClassifier(nn.Module): 
    def __init__(self, pretrained=True):
        super(PretrainedClassifier, self).__init__()

        if pretrained:
            self.feature_extractor = feature_extractor
            # We only freeze the first 10 low-level feature blocks
            # (the initial convolution and the first 9 bottlenecks).
            for param in self.feature_extractor[0:10].parameters():
                param.requires_grad = False
        else:
            self.feature_extractor = torchvision.models.mobilenet_v2(pretrained=False).features
        
        # The feature extractor downsamples by a factor of 32, so the pooling
        # window depends on the input image size, e.g. for size 64 => (2, 2).
        self.avg_pool = nn.AvgPool2d((2, 2))
        
        self.classifier = nn.Linear(in_features=1280, out_features=10, bias=True)
        
    def forward(self, x): 
        x = self.feature_extractor(x)
        x = self.avg_pool(x)
        x = x.reshape(x.shape[0], -1)
        x = self.classifier(x)
        return x
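
Since the fixed `(2, 2)` pooling window ties the classifier to 64x64 inputs, one possible alternative is `nn.AdaptiveAvgPool2d(1)`, which always pools down to a 1x1 map. This is a swapped-in technique, not what the class above uses. The sketch compares both on dummy feature maps whose shapes are assumed to match the 64x64 and 224x224 cases.

```python
import torch
import torch.nn as nn

fixed_pool = nn.AvgPool2d((2, 2))        # as in PretrainedClassifier
adaptive_pool = nn.AdaptiveAvgPool2d(1)  # alternative: always outputs 1x1

small = torch.randn(4, 1280, 2, 2)  # feature map shape for 64x64 inputs
large = torch.randn(4, 1280, 7, 7)  # feature map shape for 224x224 inputs

print(tuple(fixed_pool(small).shape))     # (4, 1280, 1, 1)
print(tuple(adaptive_pool(small).shape))  # (4, 1280, 1, 1)
print(tuple(fixed_pool(large).shape))     # (4, 1280, 3, 3)
print(tuple(adaptive_pool(large).shape))  # (4, 1280, 1, 1)
```

For 64x64 inputs both layers compute the same average. For larger inputs the fixed window no longer produces the 1280 features that the linear layer expects, while the adaptive version still does.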

## Training with PyTorch Lightning

In this exercise we train the model with the PyTorch Lightning framework.

### Define PyTorch Lightning Model

In [None]:
from exercise_code.transfer_learning.MyPytorchModel import MyPytorchModel

# Feel free to change the hyperparameters here!
hparams = {
    "batch_size": 64,
    "learning_rate": 3e-4,
}

finetune_model = PretrainedClassifier(pretrained=True)

model = MyPytorchModel(hparams, dataset, finetune_model)

### Start Training

Training may take 5-10 minutes per epoch on a CPU, depending on your hardware, and 1-2 minutes on a GPU. 

In [None]:
trainer = None

trainer = pl.Trainer(
    max_epochs=1,
    gpus=1 if torch.cuda.is_available() else None
)

trainer.fit(model)

### Model Evaluation

In [None]:
from exercise_code.transfer_learning.Util import test_model
test_model(model)

## Conclusion
We can reach a high test accuracy of about 85% after fine-tuning the pretrained mobilenet_v2 model. 
If you want, you can experiment with other CNN backbones for your classifier. Feel free to add other networks from here: https://pytorch.org/docs/stable/torchvision/models.html