<a href="https://colab.research.google.com/github/ricglz/CE888_activities/blob/main/assignment/Project_1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
! [ ! -z "$COLAB_GPU" ] && pip install torch torchvision skorch efficientnet_pytorch



## Preparations

Before we begin, let's mount Google Drive so that we can read data from it later:

---



In [None]:
from google.colab import drive

drive_path = '/content/gdrive'
drive.mount(drive_path, force_remount=False)
drive_path += '/MyDrive'

Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).
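
As a rough illustration of how the paths used later in this notebook resolve, the sketch below joins path fragments onto `drive_path`. It uses `posixpath` on the assumption that the notebook runs on Colab's Linux runtime, and the `Models/example.pt` file name is hypothetical. Because `drive_path` already ends in `/MyDrive`, repeating that folder in a later fragment would point at a nested `MyDrive/MyDrive` directory.

```python
import posixpath  # assumption: POSIX paths, as on Colab's Linux runtime

drive_path = '/content/gdrive'
drive_path += '/MyDrive'  # same suffix as in the cell above

# Fragments are joined relative to the Drive root.
data_dir = posixpath.join(drive_path, 'Flame')
good = posixpath.join(drive_path, 'Models/example.pt')  # hypothetical file

# Pitfall: repeating 'MyDrive' in the fragment duplicates the folder.
bad = posixpath.join(drive_path, 'MyDrive/Models/example.pt')

print(data_dir)
print(good)
print(bad)
```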


## The Problem

We are going to train a neural network as a binary image classifier on the `Flame` dataset stored on Google Drive, which is split into a `Training` and a `Test` folder. First we create the training and test datasets:

In [None]:
import torchvision.transforms as T
import torchvision.datasets as datasets
from os import path

data_dir = path.join(drive_path, 'Flame')
resize = T.Resize(254)
normalize = T.Normalize([0.485, 0.456, 0.406], 
                         [0.229, 0.224, 0.225])
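# Training pipeline: resize plus random flips for data augmentation.
train_transforms = T.Compose([
  resize,
  T.RandomHorizontalFlip(),
  T.RandomVerticalFlip(),
  T.ToTensor(),
  normalize
])
# Evaluation pipeline: no augmentation.
transforms = T.Compose([
  resize,
  T.ToTensor(),
  normalize
])

# Use the augmented pipeline for training and the plain one for testing.
train_ds = datasets.ImageFolder(
    path.join(data_dir, 'Training'), train_transforms)
test_ds = datasets.ImageFolder(
    path.join(data_dir, 'Test'), transforms)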

`train_transforms` adds data augmentation in the form of random horizontal and vertical flips, while `transforms` applies no augmentation. Both pipelines normalize the images with mean `[0.485, 0.456, 0.406]` and standard deviation `[0.229, 0.224, 0.225]`. These values are the per-channel means and standard deviations of the ImageNet images. We use them because the pretrained model was trained on ImageNet.
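
To make the normalization step concrete, here is a minimal sketch of the per-channel arithmetic, `(value - mean) / std`, applied to a single RGB pixel. The helper `normalize_pixel` and the sample pixel are hypothetical and only illustrate the formula; they do not replace the `torchvision` transform used in the cell above.

```python
# ImageNet channel statistics quoted above (RGB order).
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

def normalize_pixel(rgb):
    """Apply (value - mean) / std per channel to one RGB pixel in [0, 1]."""
    return [(v - m) / s for v, m, s in zip(rgb, mean, std)]

# Hypothetical pixel: mid-gray after scaling to the [0, 1] range.
pixel = [0.5, 0.5, 0.5]
normalized = normalize_pixel(pixel)
print([round(c, 3) for c in normalized])

# A pixel equal to the channel means maps to zero in every channel.
print(normalize_pixel(mean))
```

After normalization, each channel is expressed in units of ImageNet standard deviations from the ImageNet mean, which is the input scale the pretrained weights were trained on.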

## Loading pretrained model

We use a pretrained `ResNeXt-101 32x8d` network (or, alternatively, `EfficientNet-B7`) with its final layer replaced by a fully connected layer with a single output:

In [None]:
from torch import load, FloatTensor
from torch.nn import Linear, Module
from torchvision.models import resnext101_32x8d
from efficientnet_pytorch import EfficientNet

module_model = 'resnext'
model_path = f'Models/best_{module_model}.pt'
f_params = path.join(drive_path, model_path)

class PretrainedModel(Module):
    # use_pretrained=True loads our own fine-tuned weights from f_params;
    # use_pretrained=False starts from the library's ImageNet weights.
    def __init__(self, model='resnext', use_pretrained=True):
        super().__init__()
        if model == 'resnext':
            self.build_resnext_model(use_pretrained)
        elif model == 'efficientnet':
            self.build_efficientnet(use_pretrained)
        else:
            raise ValueError(f'Unknown model: {model}')

    def get_state_dict(self):
        # Checkpoint keys are assumed to start with 'model.' (6 characters);
        # strip that prefix so the keys match those of the wrapped network.
        remove_model_prefix = lambda string: string[6:]
        return { remove_model_prefix(k): v for k, v in load(f_params).items() }

    def build_resnext_model(self, use_pretrained):
        # Only fetch the ImageNet weights when no checkpoint will be loaded.
        self.model = resnext101_32x8d(pretrained=(not use_pretrained))
        num_ftrs = self.model.fc.in_features
        # Single output logit for binary classification.
        self.model.fc = Linear(num_ftrs, 1)
        if use_pretrained:
            self.model.load_state_dict(self.get_state_dict())

    def build_efficientnet(self, use_pretrained):
        model_name = 'efficientnet-b7'
        if use_pretrained:
            self.model = EfficientNet.from_name(model_name, num_classes=1)
            self.model.load_state_dict(self.get_state_dict())
        else:
            self.model = EfficientNet.from_pretrained(model_name, num_classes=1)
        
    def forward(self, x):
        return self.model(x).squeeze(-1).type(FloatTensor)
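
The `get_state_dict` method above renames checkpoint keys by dropping their first six characters. The sketch below shows the same idea on a plain dictionary, assuming the saved keys carry a `model.` prefix (which is six characters long). The key names and placeholder values are hypothetical, and the explicit prefix check is an addition for illustration, not part of the class above.

```python
# Hypothetical checkpoint keys, as saved from a wrapper whose submodule
# is stored in an attribute named `model`.
saved = {
    'model.conv1.weight': 'w0',
    'model.fc.weight': 'w1',
    'model.fc.bias': 'b1',
}

PREFIX = 'model.'

def strip_prefix(key, prefix=PREFIX):
    """Drop the wrapper prefix; fail loudly if a key does not carry it."""
    if not key.startswith(prefix):
        raise KeyError(f'key without {prefix!r} prefix: {key}')
    return key[len(prefix):]

inner = {strip_prefix(k): v for k, v in saved.items()}
print(sorted(inner))  # ['conv1.weight', 'fc.bias', 'fc.weight']
```

Checking the prefix explicitly, instead of slicing blindly, would surface a mismatched checkpoint as a clear error rather than as silently mangled key names.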

Since we are training a binary classifier with `BCEWithLogitsLoss`, the final fully connected layer has a single output: one logit per image.
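
As a small worked example of how a single raw output is interpreted, the sketch below converts a logit to a probability with the sigmoid function and computes the per-sample binary cross-entropy on logits. The function names and the sample logit are hypothetical, and the 0.5 decision threshold is an assumption. The loss uses the standard numerically stable formulation, which should match the per-element value that `BCEWithLogitsLoss` averages over a batch.

```python
import math

def sigmoid(z):
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def bce_with_logits(z, y):
    """Binary cross-entropy for a raw logit z and a target y in {0, 1}.

    Stable form: max(z, 0) - z * y + log(1 + exp(-|z|)).
    """
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))

logit = 2.0                 # hypothetical network output for one image
prob = sigmoid(logit)       # probability of the positive class
label = int(prob >= 0.5)    # assumed decision threshold of 0.5

print(round(prob, 4), label)
print(round(bce_with_logits(logit, 1), 4))  # small loss: prediction agrees
print(round(bce_with_logits(logit, 0), 4))  # large loss: prediction disagrees
```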

## Using skorch's API

In this section, we will create a `skorch.NeuralNetClassifier` to solve our classification problem. 

### Callbacks

First, we create a `Checkpoint` callback which saves the best model by monitoring the validation accuracy.

In [None]:
from skorch.callbacks import Checkpoint

# drive_path already ends in '/MyDrive', so it is not repeated here.
f_params = path.join(drive_path, 'Models/best_model.pt')
checkpoint = Checkpoint(f_params=f_params, monitor='valid_acc_best')

Next, we create a `Freezer` that freezes all weights except those of the final layer, named `model.fc`:

In [None]:
from skorch.callbacks import Freezer

freezer = Freezer(lambda x: not x.startswith('model.fc'))
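
To see which parameters that predicate selects, the sketch below applies the same lambda to a handful of parameter names. The names are hypothetical examples in the style of a wrapped ResNeXt; the real list would come from the module's `named_parameters()`.

```python
# Same predicate as above: True means "freeze this parameter".
should_freeze = lambda name: not name.startswith('model.fc')

# Hypothetical parameter names of the wrapped network.
names = [
    'model.conv1.weight',
    'model.layer4.2.bn3.bias',
    'model.fc.weight',
    'model.fc.bias',
]

frozen = [n for n in names if should_freeze(n)]
trainable = [n for n in names if not should_freeze(n)]

print(frozen)     # backbone parameters
print(trainable)  # final layer parameters
```

Only names that start with `model.fc` stay trainable, so a network whose final layer has a different attribute name would need a different prefix in the predicate.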

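Finally, we create an `EarlyStopping` callback, which stops training once the monitored validation loss has not improved for 3 consecutive epochs:

In [None]: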
from skorch.callbacks import EarlyStopping

early_stopping = EarlyStopping(patience=3)
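
The patience rule can be sketched in a few lines of plain Python. This is a simplified illustration, not skorch's implementation: it ignores the improvement threshold that the real callback supports, and the helper name and loss values are hypothetical.

```python
def stopping_epoch(valid_losses, patience=3):
    """Return the 1-based epoch at which training would stop, or None.

    Simplified rule: stop once the loss has failed to beat the best
    value seen so far for `patience` consecutive epochs.
    """
    best = float('inf')
    bad_epochs = 0
    for epoch, loss in enumerate(valid_losses, start=1):
        if loss < best:
            best = loss
            bad_epochs = 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch
    return None

# Hypothetical validation losses: the best value is reached at epoch 3.
losses = [0.70, 0.52, 0.48, 0.49, 0.50, 0.51, 0.47]
print(stopping_epoch(losses))  # 6
```

In this example the later improvement at epoch 7 is never reached, which is the usual trade-off of a small patience value.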

### skorch.NeuralNetClassifier

With all the preparations out of the way, we can now define our `NeuralNetClassifier`:

In [None]:
from torch import float64
from skorch.classifier import NeuralNetClassifier
from skorch.utils import to_tensor

class MyClassifier(NeuralNetClassifier):
    def infer(self, x, **fit_params):
        x = to_tensor(x, device=self.device)
        if isinstance(x, dict):
            x_dict = self._merge_x_and_fit_params(x, fit_params)
            return self.module_(**x_dict).to(device=self.device, dtype=float64)
        return self.module_(x, **fit_params).to(device=self.device, dtype=float64)

    def train_step_single(self, Xi, yi, **fit_params):
        self.module_.train()
        y_pred = self.infer(Xi, **fit_params)
        yi = yi.to(device=self.device, dtype=float64)
        loss = self.get_loss(y_pred, yi, X=Xi, training=True)
        loss.backward()
        return { 'loss': loss,'y_pred': y_pred }

    def validation_step(self, Xi, yi, **fit_params):
        self.module_.eval()
        y_pred = self.infer(Xi, **fit_params)
        yi = yi.to(device=self.device, dtype=float64)
        loss = self.get_loss(y_pred, yi, X=Xi, training=False)
        return { 'loss': loss,'y_pred': y_pred }

In [None]:
from torch.optim import Adam
from torch.nn import BCEWithLogitsLoss
from skorch.dataset import CVSplit

lr = 1e-2
net = MyClassifier(
    PretrainedModel,
    module__use_pretrained=False,
    module__model=module_model,
    optimizer=Adam,
    lr=lr,
    criterion=BCEWithLogitsLoss,
    batch_size=8,
    max_epochs=50,
    iterator_train__shuffle=True,
    iterator_train__num_workers=16,
    iterator_valid__shuffle=False,  # keep order stable so predictions align with labels
    iterator_valid__num_workers=16,
    train_split=CVSplit(0.2),
    callbacks=[early_stopping, checkpoint],
    device='cuda'
)

That is quite a few parameters! Let's walk through each one:

1. `PretrainedModel`: Our neural network module, wrapping `ResNeXt-101` or `EfficientNet-B7`.
2. `module__use_pretrained`, `module__model`: Passed to `__init__` of our `PretrainedModel` class. With `use_pretrained=False` the network starts from ImageNet weights instead of a saved checkpoint.
3. `optimizer=Adam`: Our optimizer
4. `lr`: Initial learning rate
5. `criterion=BCEWithLogitsLoss`: loss function
6. `batch_size`: Size of a batch
7. `max_epochs`: Maximum number of epochs to train
8. `iterator_{train,valid}__{shuffle,num_workers}`: Parameters that are passed to the dataloader.
9. `train_split=CVSplit(0.2)`: Holds out 20% of the training data as a validation set.
10. `callbacks`: Our callbacks, here `early_stopping` and `checkpoint`. The `freezer` defined above is not included, so all layers are trained.
11. `device`: Set to `cuda` to train on gpu.
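
The double-underscore names follow a simple routing convention: everything before `__` names the component, and the rest is the argument forwarded to it. The sketch below mimics that idea with a hypothetical helper, `route_params`; it only illustrates the naming scheme and is not skorch's own code.

```python
def route_params(prefix, kwargs):
    """Collect `prefix__name=value` entries and strip the prefix."""
    marker = prefix + '__'
    return {
        key[len(marker):]: value
        for key, value in kwargs.items()
        if key.startswith(marker)
    }

# A subset of the keyword arguments passed to the classifier above.
net_kwargs = {
    'module__use_pretrained': False,
    'module__model': 'resnext',
    'iterator_train__shuffle': True,
    'iterator_train__num_workers': 16,
    'lr': 1e-2,
}

print(route_params('module', net_kwargs))
print(route_params('iterator_train', net_kwargs))
```

Under this convention, `module__model='resnext'` ends up as `model='resnext'` in the constructor of `PretrainedModel`, while un-prefixed arguments such as `lr` stay with the net itself.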

Now we are ready to train our neural network:

In [None]:
net.fit(train_ds, y=None)
print()

RuntimeError: ignored

In [None]:
# net.initialize()
# print()

In [None]:
from sklearn.metrics import accuracy_score
import numpy as np
from skorch.utils import to_numpy

def score(net, X, y=None):
    y_true = y
    if y_true is None:
        # No labels passed: collect them from the dataset in iteration order.
        # This only lines up with net.predict(X) if the validation iterator
        # does not shuffle.
        ds = net.get_dataset(X)
        target_iterator = net.get_iterator(ds, training=False)
        y_true = np.concatenate([to_numpy(y) for _, y in target_iterator])

    if y_true is None:
        return 1

    y_pred = net.predict(X)
    if y_pred is None:
        return 0
    return accuracy_score(y_true, y_pred)

score(net, test_ds, y=None)
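
For reference, the accuracy reported by `score` is simply the fraction of predictions that match the labels. The sketch below computes it by hand on hypothetical labels and predictions; the helper `accuracy` is illustrative and stands in for `accuracy_score`.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the label."""
    if len(y_true) != len(y_pred):
        raise ValueError('y_true and y_pred must have the same length')
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Hypothetical labels and predictions for five test images.
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]

print(accuracy(y_true, y_pred))  # 0.8
```

The comparison is positional, so labels and predictions must be in the same order for the result to be meaningful.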

The best model is stored at `best_model.pt`, with a validation accuracy of roughly 0.96.

Congratulations! You now know how to fine-tune a neural network using `skorch`. Feel free to explore the other tutorials to learn more about using `skorch`.