<a href="https://colab.research.google.com/github/mancinimassimiliano/DeepLearningLab/blob/master/Lab3/finetune_alexnet.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introduction

### A quick guide to Transfer-learning

Transfer learning allows you to "*transfer*" the knowledge gained while solving a task to solve another **related** task at hand. Fine-tuning a network, previously trained on a big dataset (e.g., ImageNet), to classify a given dataset containing limited annotated training samples is one of the simplest *transfer-learning* approaches.

Specifically, in this lab we are going to see how to use a pre-trained Alexnet for the task of object recognition/classification. This will get you acquainted with the technique for fine-tuning any given pre-trained network.

### Mount your Google drive folder on Colab

First things first: we are going to put the data in Google Drive and then copy it from Google Drive to the local Colab disk.

In [0]:
from google.colab import drive
drive.mount('/content/gdrive')

### Download dataset

After mounting the drive we will download the OfficeHome dataset.

Add [this](https://drive.google.com/file/d/10xMJi5uD8kVh9xkksq1pXngpUl-ckRBa/view?usp=sharing) .tar file to your Unitn Google Drive. The copy command below expects it in a `datasets` folder inside `My Drive`.

Copy the .tar file from Google Drive to the local Colab disk. This will make data loading faster.

First create a directory in your current path with `!mkdir dataset`:

In [0]:
!mkdir dataset

In [0]:
!cp "gdrive/My Drive/datasets/OfficeHomeDataset.tar" dataset/

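Extract the .tar file inside the `dataset` directory, where it was just copied:

```
!tar -xf dataset/OfficeHomeDataset.tar -C dataset/
```
Wait until the extraction has finished.

In [0]:
# extract here
!tar -xf dataset/OfficeHomeDataset.tar -C dataset/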

In [0]:
# import necessary libraries
import torch
import torchvision
import torch.nn.functional as F
import torchvision.transforms as T

# Library needed for visualization purposes
from tensorboardcolab import TensorBoardColab

# Instantiate visualizer
tb = TensorBoardColab(graph_path='./log')

### Alexnet 

![Alexnet architecture](https://www.oreilly.com/library/view/tensorflow-for-deep/9781491980446/assets/tfdl_0106.png)

This is the network architecture of [Alexnet](https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf). In this tutorial we are going to fine-tune a pre-trained Alexnet for classifying Office-Home images. Alexnet has been pre-trained on the ILSVRC-2012 dataset, which has more than **1 million** images in 1000 classes.

***A little bit about the Office-Home dataset.***

Office-Home has images from 4 different domains (Art, Clipart, Product and Real World), and each domain covers the same 65 categories. In this lab session we train on a single domain: the final cell points `img_root` at the `Real World` folder.

![Office-Home](http://hemanthdv.github.io/profile/images/DataCollage.jpg)

***Steps for Fine tuning Alexnet***

1.   Discard the final layer (or the output layer) of Alexnet, which contains 1000 output neurons.
2.   Randomly initialize a new final layer, using `torch.nn.Linear`, with as many outputs as there are categories in the dataset. In this case it's 65 because OfficeHome has 65 classes. Keep all the other layers intact.
3.   Train the network with a low learning rate for the pre-trained layers and a higher learning rate for the newly initialized layer.
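
As a minimal sketch of steps 1 and 2, the snippet below swaps the output layer of a small stand-in classifier. The `toy_net` model and its layer sizes are hypothetical, chosen only so the example runs without downloading any weights; the same pattern should apply to the final `torch.nn.Linear` of a pre-trained network.

```python
import torch

# Hypothetical stand-in for a pre-trained classifier (not the real Alexnet)
toy_net = torch.nn.Sequential(
    torch.nn.Linear(16, 8),
    torch.nn.ReLU(),
    torch.nn.Linear(8, 1000),   # "old" output layer with 1000 neurons
)

# Keep a copy of the first layer's weights to check that they stay intact
first_layer_before = toy_net[0].weight.clone()

# Steps 1 and 2: discard the old output layer and replace it with a
# randomly initialized one sized for the new task (65 classes)
in_features = toy_net[2].in_features
toy_net[2] = torch.nn.Linear(in_features, 65)

# The new head produces 65 scores per sample; the other layers are untouched
outputs = toy_net(torch.randn(4, 16))
print(outputs.shape)  # torch.Size([4, 65])
```

Only the replaced layer starts from random weights; everything before it keeps whatever it had learned.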



### How to load datasets from folders containing images

We have already seen that PyTorch's `torchvision` package provides some very basic dataloaders like `torchvision.datasets.MNIST`, `torchvision.datasets.SVHN`, etc. But it might happen (and is frequently the case) that we need to load our own datasets from folders.

PyTorch is kind enough to provide `torchvision.datasets.ImageFolder` for loading datasets from folders with just one line of code. But we need to ensure that the images in the folders are arranged in a certain way. A sample format is shown below:

    OfficeHomeDataset
        |
        |--- Alarm_Clock
        |      |--- 00046.jpg
        |      |--- 00050.jpg
        |
        |--- Couch
               |--- 00007.jpg
               |--- 00023.jpg
 
In other words, the parent folder (*OfficeHomeDataset*) contains the sub-folders (e.g., *Alarm_Clock*, *Couch*). These sub-folders are the *classes*. Further, each sub-folder contains a bunch of images (e.g., 00046.jpg, 00050.jpg for the class *Alarm_Clock*). Internally, PyTorch will assign a class label to each sub-folder and will also load the corresponding images.

More details are available [here](https://pytorch.org/docs/stable/torchvision/datasets.html#torchvision.datasets.ImageFolder).

N.B. Your own folder structure might look different, so provide the path to your parent folder accordingly.
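
To make the label assignment concrete, here is a small sketch that mimics it with the standard library only: sub-folder names are sorted alphabetically and numbered from 0. The helper `folder_to_labels` is a hypothetical name used for illustration, not part of `torchvision`; `ImageFolder` exposes the resulting mapping as its `class_to_idx` attribute.

```python
import os
import tempfile

def folder_to_labels(root):
    """Map each class sub-folder of `root` to an integer label (sorted order)."""
    classes = sorted(d for d in os.listdir(root)
                     if os.path.isdir(os.path.join(root, d)))
    return {name: idx for idx, name in enumerate(classes)}

# Build a tiny throwaway folder tree that follows the layout above
root = tempfile.mkdtemp()
for class_name, files in {'Couch': ['00007.jpg', '00023.jpg'],
                          'Alarm_Clock': ['00046.jpg', '00050.jpg']}.items():
    os.makedirs(os.path.join(root, class_name))
    for f in files:
        # Empty placeholder files; real images are not needed for the mapping
        open(os.path.join(root, class_name, f), 'w').close()

class_to_idx = folder_to_labels(root)
print(class_to_idx)  # {'Alarm_Clock': 0, 'Couch': 1}
```

The label of a class therefore depends on the folder names, not on the order in which the folders were created.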

### Define the function that fetches a data loader that is then used during iterative training.

We are going to see some more PyTorch data transformations, applied while loading the data.
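
One of these transformations is per-channel normalization, which computes `(x - mean) / std` for every pixel of each channel. The sketch below reproduces that arithmetic in plain Python with the ImageNet statistics listed in the code that follows; the helper `normalize_pixel` and the sample pixels are hypothetical and serve only as an illustration.

```python
# ImageNet channel statistics (RGB order)
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]

def normalize_pixel(rgb):
    """Apply (x - mean) / std to one RGB pixel with values in [0, 1]."""
    return [(x - m) / s for x, m, s in zip(rgb, MEAN, STD)]

# A pixel equal to the channel means maps to zero in every channel
print(normalize_pixel([0.485, 0.456, 0.406]))  # [0.0, 0.0, 0.0]

# A pure white pixel ends up more than two standard deviations above the mean
white = normalize_pixel([1.0, 1.0, 1.0])
print([round(v, 3) for v in white])
```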

In [0]:
'''
Input arguments:
  batch_size: mini batch size used during training
  img_root: path to the dataset parent folder. 
            The folder just above the sub-folders or class folders
'''

def get_data(batch_size, img_root):
  
  # Prepare data transformations for the train loader
  transform = list()
  transform.append(T.Resize((256, 256)))                      # Resize each PIL image to 256 x 256
  transform.append(T.RandomCrop((224, 224)))                  # Randomly crop a 224 x 224 patch
  transform.append(T.ToTensor())                              # Convert the PIL image to a PyTorch Tensor
  transform.append(T.Normalize(mean=[0.485, 0.456, 0.406],
                               std=[0.229, 0.224, 0.225]))    # Normalize with ImageNet mean and std
  transform = T.Compose(transform)                            # Compose the above transformations into one
    
  # Load data: each sub-folder of img_root becomes one class
  officehome_dataset = torchvision.datasets.ImageFolder(root=img_root, transform=transform)
  
  # Create train and test splits
  # We will create a 80:20 % train:test split
  num_samples = len(officehome_dataset)
  training_samples = int(num_samples * 0.8 + 1)
  test_samples = num_samples - training_samples

  training_data, test_data = torch.utils.data.random_split(officehome_dataset, 
                                                           [training_samples, test_samples])

  # Initialize dataloaders
  train_loader = torch.utils.data.DataLoader(training_data, batch_size, shuffle=True)
  test_loader = torch.utils.data.DataLoader(test_data, batch_size, shuffle=False)
  
  return train_loader, test_loader

### Define Alexnet model

PyTorch provides a bunch of pre-trained models whose parameters have been learned on the ImageNet dataset. All these models can be found [here](https://pytorch.org/docs/stable/torchvision/models.html).

We will load a pre-trained Alexnet using PyTorch's `torchvision` package. Then we will re-initialize the output layer to suit our classification task.

Before that, let's have a look at the [code](https://github.com/pytorch/vision/blob/master/torchvision/models/alexnet.py) of Alexnet.

In [0]:
'''
Input arguments
  num_classes: number of classes in the dataset.
               This is equal to the number of output neurons.
'''

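def initialize_alexnet(num_classes):
  # load the pre-trained Alexnet
  # (torchvision >= 0.13 prefers weights='IMAGENET1K_V1' over pretrained=True)
  alexnet = torchvision.models.alexnet(pretrained=True)
  
  # get the number of neurons in the penultimate layer
  in_features = alexnet.classifier[6].in_features
  
  # re-initialize the output layer (classifier[6] is the last Linear layer)
  alexnet.classifier[6] = torch.nn.Linear(in_features=in_features,
                                          out_features=num_classes)
  
  return alexnet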

### Visualize our custom Alexnet model

In [0]:
print(initialize_alexnet(65))

### Define cost function

In [0]:
def get_cost_function():
  cost_function = torch.nn.CrossEntropyLoss()
  return cost_function

### Define the optimizer

The optimizer will be different from the previous experiments.

Previously the *learning rate* of all the layers of the network was the same.

Now, different layers will learn at different rates. The pre-trained layers need to be updated with a smaller learning rate than the newly initialized layer. Details on per-parameter options are available [here](https://pytorch.org/docs/stable/optim.html).
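
A minimal sketch of per-group learning rates with `torch.optim.SGD`, which accepts a list of dictionaries, each with its own `params` and optional overrides such as `lr`. The two-layer `toy_net` and the hyperparameter values are hypothetical stand-ins, not the lab's Alexnet setup.

```python
import torch

# Hypothetical two-layer model: layer 0 plays the "pre-trained" part,
# layer 1 plays the newly initialized output layer
toy_net = torch.nn.Sequential(torch.nn.Linear(8, 4), torch.nn.Linear(4, 2))

base_lr = 0.001   # learning rate for the "pre-trained" layer
head_lr = 0.01    # larger learning rate for the new output layer

optimizer = torch.optim.SGD(
    [
        {'params': toy_net[0].parameters()},                 # uses the default lr below
        {'params': toy_net[1].parameters(), 'lr': head_lr},  # overrides the default
    ],
    lr=base_lr, momentum=0.9, weight_decay=1e-6,
)

# Each parameter group keeps its own learning rate
group_lrs = [group['lr'] for group in optimizer.param_groups]
print(group_lrs)  # [0.001, 0.01]
```

Options that are not overridden in a group, such as `momentum` and `weight_decay` here, are shared by all groups.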

In [0]:
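def get_optimizer(model, lr, wd, momentum):
  
  # we will create two groups of weights, one for the newly initialized layer
  # and the other for rest of the layers of the network
  
  final_layer_weights = []
  rest_of_the_net_weights = []
  
  # we will iterate through the layers of the network
  # ('classifier.6' is the name of the output layer in torchvision's Alexnet)
  for name, param in model.named_parameters():
    if name.startswith('classifier.6'):
      final_layer_weights.append(param)
    else:
      rest_of_the_net_weights.append(param)
  
  # so now we have divided the network weights into two groups.
  # We will train the final_layer_weights with learning_rate = lr
  # and rest_of_the_net_weights with learning_rate = lr / 10
  
  optimizer = torch.optim.SGD([
      {'params': rest_of_the_net_weights},
      {'params': final_layer_weights, 'lr': lr}
  ], lr=lr / 10, weight_decay=wd, momentum=momentum)
  
  return optimizer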

### Train and test functions

In [0]:
def test(net, data_loader, cost_function, device='cuda:0'):
  samples = 0.
  cumulative_loss = 0.
  cumulative_accuracy = 0.

  net.eval() # Needed when the network has layers that behave differently at train and test time (e.g., Dropout)
  with torch.no_grad():
    for batch_idx, (inputs, targets) in enumerate(data_loader):
      # Load data into GPU
      inputs = inputs.to(device)
      targets = targets.to(device)
        
      # Forward pass
      outputs = net(inputs)

      # Apply the loss
      loss = cost_function(outputs, targets)

      # Better print something
      samples+=inputs.shape[0]
      cumulative_loss += loss.item() # Note: the .item() is needed to extract scalars from tensors
      _, predicted = outputs.max(1)
      cumulative_accuracy += predicted.eq(targets).sum().item()

  return cumulative_loss/samples, cumulative_accuracy/samples*100


def train(net,data_loader,optimizer,cost_function, device='cuda:0'):
  samples = 0.
  cumulative_loss = 0.
  cumulative_accuracy = 0.

  
  net.train() # Needed when the network has layers that behave differently at train and test time (e.g., Dropout)
  for batch_idx, (inputs, targets) in enumerate(data_loader):
    # Load data into GPU
    inputs = inputs.to(device)
    targets = targets.to(device)
      
    # Forward pass
    outputs = net(inputs)

    # Apply the loss
    loss = cost_function(outputs,targets)

    # Backward pass
    loss.backward()
    
    # Update parameters
    optimizer.step()
    
    # Reset the gradients so they do not accumulate across iterations
    optimizer.zero_grad()

    # Better print something, no?
    samples+=inputs.shape[0]
    cumulative_loss += loss.item()
    _, predicted = outputs.max(1)
    cumulative_accuracy += predicted.eq(targets).sum().item()

  return cumulative_loss/samples, cumulative_accuracy/samples*100

### Wrapping everything up

Finally, we need a main function that initializes everything, sets the needed hyperparameters, and loops over multiple epochs (printing the results).

In [0]:
'''
Input arguments
  batch_size: Size of a mini-batch
  device: Device on which to train your network (e.g., 'cuda:0')
  learning_rate: Learning rate of the newly initialized output layer
                 (the pre-trained layers use learning_rate / 10)
  weight_decay: Weight decay coefficient for regularization of weights
  momentum: Momentum for SGD optimizer
  epochs: Number of epochs for training the network
  num_classes: Number of classes in your dataset
  visualization_name: Name of the visualization folder
  img_root: The root folder of images
'''

def main(batch_size=128, 
         device='cuda:0', 
         learning_rate=0.001, 
         weight_decay=0.000001, 
         momentum=0.9, 
         epochs=50, 
         num_classes=65, 
         visualization_name='alexnet_sgd', 
         img_root=None):
  
  train_loader, test_loader = get_data(batch_size=batch_size, 
                                       img_root=img_root)
  
  # Initialize the network and move it to the chosen device
  net = initialize_alexnet(num_classes=num_classes).to(device)
  
  optimizer = get_optimizer(net, learning_rate, weight_decay, momentum)
  
  cost_function = get_cost_function()

  print('Before training:')
  train_loss, train_accuracy = test(net, train_loader, cost_function, device)
  test_loss, test_accuracy = test(net, test_loader, cost_function, device)

  print('\t Training loss {:.5f}, Training accuracy {:.2f}'.format(train_loss, train_accuracy))
  print('\t Test loss {:.5f}, Test accuracy {:.2f}'.format(test_loss, test_accuracy))
  print('-----------------------------------------------------')
  
  # Add values to plots
  tb.save_value('Loss/train_loss', visualization_name, 0, train_loss)
  tb.save_value('Loss/test_loss', visualization_name, 0, test_loss)
  tb.save_value('Accuracy/train_accuracy', visualization_name, 0, train_accuracy)
  tb.save_value('Accuracy/test_accuracy', visualization_name, 0, test_accuracy)
    
  # Update plots 
  tb.flush_line(visualization_name)

  for e in range(epochs):
    train_loss, train_accuracy = train(net, train_loader, optimizer, cost_function, device)
    test_loss, test_accuracy = test(net, test_loader, cost_function, device)
    print('Epoch: {:d}'.format(e+1))
    print('\t Training loss {:.5f}, Training accuracy {:.2f}'.format(train_loss, train_accuracy))
    print('\t Test loss {:.5f}, Test accuracy {:.2f}'.format(test_loss, test_accuracy))
    print('-----------------------------------------------------')
    
    # Add values to plots
    tb.save_value('Loss/train_loss', visualization_name, e + 1, train_loss)
    tb.save_value('Loss/test_loss', visualization_name, e + 1, test_loss)
    tb.save_value('Accuracy/train_accuracy', visualization_name, e + 1, train_accuracy)
    tb.save_value('Accuracy/test_accuracy', visualization_name, e + 1, test_accuracy)
    
    # Update plots 
    tb.flush_line(visualization_name)

  print('After training:')
  train_loss, train_accuracy = test(net, train_loader, cost_function, device)
  test_loss, test_accuracy = test(net, test_loader, cost_function, device)

  print('\t Training loss {:.5f}, Training accuracy {:.2f}'.format(train_loss, train_accuracy))
  print('\t Test loss {:.5f}, Test accuracy {:.2f}'.format(test_loss, test_accuracy))
  print('-----------------------------------------------------')

Let's train!

In [0]:
main(visualization_name='alexnet_sgd_0.01_RW', 
     img_root = '/content/dataset/OfficeHomeDataset_10072016/Real World')