<a href="https://colab.research.google.com/github/Steve-YJ/Assignment_Standalone_DL/blob/master/%5BPytorch_Tutorial%5D_Finetuning_Torchvision_Models.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Fine-Tuning Torchvision Models
- How to fine-tune and feature extract the torchvision models, all of which have been pretrained on the 1000-class ImageNet dataset
- How to work with several modern CNN architectures
- Build an intuition for fine-tuning any PyTorch model
- Since each model architecture is different, there is no boilerplate fine-tuning code that works in all scenarios
- Rather, the researcher must look at the existing architecture and make custom adjustments for each model

## In this document

- we'll perform two types of transfer learning
    - fine-tuning
    - feature extraction
- In fine-tuning
    - we start with a pre-trained model and update all of the model's parameters for our new task, in essence retraining the whole model
- In feature extraction
    - we start with a pre-trained model and only update the final layer weights from which we derive predictions.
    - It is called feature extraction because we use the pre-trained CNN as a fixed feature-extractor, and only change the output layer

✅ Fine-Tune: <br>
* re-training the whole model<br>
* update all of the model's parameters for our new task

✅ Feature Extraction: <br>
* start with a pre-trained model and only update the final layer weights from which we derive predictions<br>
* It is called feature extraction because we use the pre-trained CNN as a fixed feature-extractor and only change the output layer

In general both transfer learning methods follow the same few steps:

- Initialize the pre-trained model
- Reshape the final layer(s) to have the same number of outputs as the number of classes in the new dataset
- Define for the optimization algorithm which parameters we want to update during training
- Run the training step
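
The four steps above can be sketched end to end as follows. This is a minimal illustration, not the tutorial's actual code: the tiny `nn.Sequential` network is a hypothetical stand-in for a pretrained torchvision model (so the sketch runs without downloading weights), and the layer sizes, learning rate, and random inputs are assumptions chosen only for the demonstration.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# 1. Initialize the model. A real run would load pretrained torchvision weights;
#    this small network is a hypothetical stand-in so the sketch runs offline.
model = nn.Sequential(
    nn.Linear(16, 8),    # plays the role of the pretrained backbone
    nn.ReLU(),
    nn.Linear(8, 1000),  # plays the role of the 1000-class ImageNet head
)

# Feature extraction: freeze every existing parameter.
for param in model.parameters():
    param.requires_grad = False

# 2. Reshape the final layer for the new dataset.
#    A freshly constructed layer has requires_grad=True by default.
num_classes = 2
model[2] = nn.Linear(8, num_classes)

# 3. Tell the optimizer which parameters to update.
params_to_update = [p for p in model.parameters() if p.requires_grad]
optimizer = optim.SGD(params_to_update, lr=0.01, momentum=0.9)

# 4. Run one training step on random data.
criterion = nn.CrossEntropyLoss()
inputs = torch.randn(4, 16)
labels = torch.tensor([0, 1, 0, 1])
backbone_before = model[0].weight.clone()
head_before = model[2].weight.clone()

optimizer.zero_grad()
loss = criterion(model(inputs), labels)
loss.backward()
optimizer.step()

# The frozen backbone is untouched; only the new head has moved.
backbone_unchanged = torch.equal(backbone_before, model[0].weight)
head_changed = not torch.equal(head_before, model[2].weight)
```

For fine-tuning, the freezing loop would simply be skipped, so every parameter would reach the optimizer.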

In [1]:
"""
    Just do it!
"""

'\n    Just do it!\n'

## Reference
* <a href='https://pytorch.org/tutorials/beginner/finetuning_torchvision_models_tutorial.html'>Pytorch.org - 'Fine-Tuning Torchvision Models'</a>

# Drive Mount
* Mount Google Drive to access the files stored there

In [6]:
from google.colab import drive

# mount drive
drive.mount('/content/drive', force_remount=True)

Mounted at /content/drive


In [7]:
! ls

drive  sample_data


In [9]:
%cd drive/My\ Drive/TIL(Today-I-learned)/[Pytorch-Tutorial]

/content/drive/My Drive/TIL(Today-I-learned)/[Pytorch-Tutorial]


In [10]:
! ls

 hymenoptera_data  '[Pytorch-Tutorial] Finetuning Torchvision Models.ipynb'


# Import Library

In [5]:
from __future__ import print_function
from __future__ import division

import numpy as np

import torch
import torch.nn as nn
import torch.optim as optim

import torchvision
from torchvision import datasets, models, transforms

import matplotlib.pyplot as plt

import time
import os
import copy

print("Pytorch Version: ", torch.__version__)
print("Torchvision Version: ", torchvision.__version__)

Pytorch Version:  1.7.0+cu101
Torchvision Version:  0.8.1+cu101


## Inputs

* Dataset
    * We will use the hymenoptera_data dataset
    * This dataset contains two classes, bees and ants, and is structured such that we can use the ImageFolder dataset.

* Download the data and set the <code>data_dir</code> input to the root directory of the dataset. 
* The <code>model_name</code> input is the name of the model you wish to use and must be selected from this list:
    
        > [resnet, alexnet, vgg, squeezenet, densenet, inception]


* If you want to find more, click <a>this</a>
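
As a rough illustration of the folder layout that ImageFolder expects, the sketch below builds an empty directory tree with one subfolder per class and derives the class-to-index mapping from the sorted folder names, which is how ImageFolder assigns labels. The `train`/`val` split names are an assumption based on the phases used later in the training loop, and `find_classes` here is a hypothetical standalone helper, not a torchvision call.

```python
import os
import tempfile


def find_classes(split_dir):
    """Return sorted class names and a name -> index mapping (one class per subfolder)."""
    with os.scandir(split_dir) as entries:
        classes = sorted(entry.name for entry in entries if entry.is_dir())
    class_to_idx = {name: idx for idx, name in enumerate(classes)}
    return classes, class_to_idx


# Build an empty copy of the assumed layout in a temporary directory:
#   hymenoptera_data/{train,val}/{ants,bees}/
root = tempfile.mkdtemp()
for split in ("train", "val"):
    for cls in ("bees", "ants"):
        os.makedirs(os.path.join(root, "hymenoptera_data", split, cls))

classes, class_to_idx = find_classes(os.path.join(root, "hymenoptera_data", "train"))
print(class_to_idx)  # {'ants': 0, 'bees': 1}
```

Because the labels come from alphabetical order, `ants` maps to 0 and `bees` to 1 regardless of the order in which the folders were created.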

✅ Summary<Br>
* <code>data_dir</code>
* <code>model_name</code>
* <code>num_classes</code>
* <code>batch_size</code>
* <code>num_epochs</code>
* <code>feature_extract</code> is a boolean that defines if we are fine-tuning or feature extracting
    * if <code>feature_extract = False</code>: the model is fine-tuned and all model parameters are updated
    * if <code>feature_extract = True</code>: only the last layer parameters are updated, the others remain fixed

In [11]:
# Top-level data directory. Here we assume the format of the directory conforms to the ImageFolder structure
data_dir = "./data/hymenoptera_data"

# Models to choose from [resnet, alexnet, vgg, squeezenet, densenet, inception]
model_name = "squeezenet"

# Number of classes in the dataset
num_classes = 2

# Batch size for training (change depending on how much memory you have)
batch_size = 8

# Number of epochs to train for
num_epochs = 15

# Flag for feature extracting. When False, we finetune the whole model,
#   when True we only update the reshaped layer params
# i.e. are we feature extracting, or fine-tuning the whole model?

feature_extract = True
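
To make the effect of the `feature_extract` flag concrete, the sketch below counts how many parameters remain trainable in each mode. It is an illustration under stated assumptions: the small `nn.Sequential` network is a hypothetical stand-in for a pretrained model, and `set_parameter_requires_grad`, `build_stand_in`, and `count_trainable` are helper names introduced here for the example.

```python
import torch.nn as nn


def set_parameter_requires_grad(model, feature_extracting):
    """Freeze all current parameters when feature extracting; otherwise leave them trainable."""
    if feature_extracting:
        for param in model.parameters():
            param.requires_grad = False


def build_stand_in(feature_extract, num_classes=2):
    # Hypothetical stand-in for a pretrained network with a 1000-class head.
    model = nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 1000))
    set_parameter_requires_grad(model, feature_extract)
    # The replacement head is created after freezing, so it stays trainable.
    model[2] = nn.Linear(8, num_classes)
    return model


def count_trainable(model):
    """Total number of scalar parameters that will receive gradients."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


extract_count = count_trainable(build_stand_in(feature_extract=True))
finetune_count = count_trainable(build_stand_in(feature_extract=False))
print(extract_count, finetune_count)  # 18 154
```

With the flag set to `True` only the new head (8 x 2 weights plus 2 biases) is trainable; with `False` the backbone's 136 parameters are trainable as well.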

## Helper Functions
* Q. What are helper functions?
    * Before we write the code for adjusting the models, let's define a few helper functions.

## Model Training and Validation Code

* The <code>train_model</code> function handles the training and validation of a given model.
* As input, it takes a PyTorch model, a dictionary of dataloaders, a loss function, an optimizer, a specified number of epochs to train and validate for, and a boolean flag for when the model is an Inception model.
* ❓ The <code>is_inception</code> flag is used to accommodate the Inception v3 model, as that architecture uses an auxiliary output and the overall model loss respects both the auxiliary output and the final output, as described here.
* ❓ The function trains for the specified number of epochs and after each epoch runs a full validation step. 
* ❓ It also keeps track of the best performing model (in terms of validation accuracy), and at the end of training returns the best performing model. After each epoch, the training and validation accuracies are printed.



In [13]:
# Define train_model function

def train_model(model, dataloaders, criterion, optimizer, num_epochs, is_inception=False):
    since = time.time()

    val_acc_history = []

    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.0

    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch, num_epochs-1))
        print('-'*10)

        # Each epoch has a training and validation phase
        for phase in ['train', 'val']:
            if phase == 'train':
                model.train()  # Set model to training mode
            else:
                model.eval()   # Set model to evaluate mode

            running_loss = 0.0
            running_corrects = 0

            # Iterate over data
            for inputs, labels in dataloaders[phase]:
                inputs = inputs.to(device)  # cuda
                labels = labels.to(device)

                # zero the parameter gradients
                optimizer.zero_grad()

                # forward
                # track history if only in train
                with torch.set_grad_enabled(phase == 'train'):
                    # Get model outputs and calculate loss
                    # Special case for inception because in training it has an auxiliary output.
                    # In train mode, we calculate the loss by summing the final output and the auxiliary output
                    # but in testing we only consider the final output

                    if is_inception and phase == 'train':
                        # From https://discuss.pytorch.org/t/how-to-optimize-inception-model-with-auxiliary-classifiers/7958
                        outputs, aux_outputs = model(inputs)
                        loss1 = criterion(outputs, labels)
                        loss2 = criterion(aux_outputs, labels)
                        loss = loss1 + 0.4*loss2  # the auxiliary loss is down-weighted by 0.4

                    else:
                        outputs = model(inputs)  # If not Inception Model, outputs is just model's output
                        loss = criterion(outputs, labels)
                    
                    _, preds = torch.max(outputs, 1)

                    # backward + optimize only if in training phase 
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

                # statistics
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)
            # loss.item() is the mean loss over a batch (assuming the criterion's default mean reduction),
            # so it was multiplied by the batch size above; dividing the running total by the number of
            # samples gives the average per-sample loss for the epoch
            epoch_loss = running_loss / len(dataloaders[phase].dataset)
            epoch_acc = running_corrects.double() / len(dataloaders[phase].dataset)

            print('{} Loss: {:.4f} Acc: {:.4f}'.format(phase, epoch_loss, epoch_acc))

            # deep copy the model
            if phase == 'val' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())
            if phase == 'val':
                val_acc_history.append(epoch_acc)

        print()
    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(time_elapsed//60, time_elapsed % 60))
    print('Best val Acc: {:.4f}'.format(best_acc))

    # load best model weights
    model.load_state_dict(best_model_wts)
    return model, val_acc_history
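
The epoch loss in `train_model` is a sample-weighted average rather than a plain average of batch losses. The pure-Python sketch below shows why that matters when the last batch is smaller than the others. The batch losses and sizes are made-up numbers for illustration, and `epoch_loss_from_batches` is a hypothetical helper that mirrors the `running_loss` bookkeeping.

```python
def epoch_loss_from_batches(batch_mean_losses, batch_sizes):
    """Mirror train_model: accumulate mean_loss * batch_size, then divide by the sample count."""
    running_loss = 0.0
    for mean_loss, size in zip(batch_mean_losses, batch_sizes):
        running_loss += mean_loss * size
    return running_loss / sum(batch_sizes)


# Illustrative values: 20 samples with batch_size = 8 leaves a final batch of 4.
batch_mean_losses = [0.5, 0.5, 1.0]
batch_sizes = [8, 8, 4]

weighted = epoch_loss_from_batches(batch_mean_losses, batch_sizes)
naive = sum(batch_mean_losses) / len(batch_mean_losses)

print(round(weighted, 4))  # 0.6
print(round(naive, 4))     # 0.6667
```

The plain average over batches overweights the small final batch, while the sample-weighted version gives every image the same influence on the reported loss.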

## Initialize and Reshape the Networks

## ResNet
* Hi :)

## Alexnet

## VGG

## Squeezenet

## Inception v3

# Load Data

# Run Training and Validation Step

# Comparison with Model Trained from Scratch

# Final Thoughts and Where to Go Next 