**NOTE: This notebook is written for the Google Colab platform, which provides free hardware acceleration. However, it can also be run (possibly with minor modifications) as a standard Jupyter notebook, using a local GPU.** 



In [None]:
#@title -- Installation of Packages -- { display-mode: "form" }
import sys
!{sys.executable} -m pip install torchinfo
!{sys.executable} -m pip install git+https://github.com/michalgregor/class_utils.git

In [None]:
#@title -- Import of Necessary Packages -- { display-mode: "form" }
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.metrics import accuracy_score
from class_utils.pytorch_utils import BestModelCheckpointer, freeze_except_last
from torch.optim.lr_scheduler import ExponentialLR
from torchvision import models, transforms
from torch.utils.data import DataLoader, TensorDataset
from torchvision.datasets import ImageFolder
import torchinfo
import torch.nn as nn
import torch

In [None]:
#@title -- Downloading Data -- { display-mode: "form" }
from class_utils.download import download_file_maybe_extract
download_file_maybe_extract("https://www.dropbox.com/s/w4pg809npvatye0/food5v2.zip?dl=1", directory="data/food5v2")

# also create a directory for storing any outputs
import os
os.makedirs("output", exist_ok=True)

## Transfer Learning: Using the Pre-trained Network as a Feature Extractor

In a previous notebook we explored the standard approach to transfer learning: freezing the pretrained weights, training a new final layer, and then fine-tuning. There is an alternative, in which the pretrained network is used as a feature extractor. Under that approach, you first strip off the network's classification layer so that the network returns the feature vector from the penultimate layer rather than the logits from the final layer. You then run the entire dataset through the network once and store the resulting features. This makes the dataset much smaller: unless it was very large to begin with, it should fit in memory all at once.

You can then train just your new layers on the preprocessed data. Training will be much faster, because you will not be loading the images and running them through the large network again and again. One downside is that you cannot use on-the-fly data augmentation, since each image passes through the network only once and its features are then fixed. That might not be too high a price to pay, though.
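To get a feel for the size reduction, here is a rough back-of-the-envelope calculation. It assumes float32 storage, preprocessed inputs of shape 3×224×224 and 2048-dimensional penultimate-layer features (as produced by ResNet50); the helper `tensor_bytes` is made up for this illustration. The comparison is against decoded input tensors, not against compressed image files on disk.

```python
# Rough per-image storage before and after feature extraction (float32).
# Assumption: 3x224x224 inputs and 2048-dimensional ResNet50 features.
BYTES_PER_FLOAT32 = 4


def tensor_bytes(*shape):
    """Bytes needed to store a float32 tensor of the given shape."""
    n = 1
    for dim in shape:
        n *= dim
    return n * BYTES_PER_FLOAT32


image_bytes = tensor_bytes(3, 224, 224)  # one preprocessed input image
feature_bytes = tensor_bytes(2048)       # one penultimate-layer feature vector
ratio = image_bytes / feature_bytes

print(f"image: {image_bytes} B, features: {feature_bytes} B, ratio: {ratio:.1f}x")
```

Under these assumptions, each extracted feature vector takes roughly 1/70 of the memory of the input tensor it was computed from.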

### Setting up the Data Loaders

Since we are not going to be using data augmentation, we can just use the default image transforms provided with the pretrained weights directly. Other than that, the datasets and data loaders will be constructed the same way we constructed them in the previous example.



In [None]:
device = "cuda" if torch.cuda.is_available() else "cpu"
weights = models.ResNet50_Weights.IMAGENET1K_V2
image_transforms = weights.transforms()

In [None]:
train_dataset = ImageFolder(
    "data/food5v2/training",
    image_transforms
)

train_dataloader = DataLoader(
    train_dataset,
    batch_size=32,
    shuffle=True,
    num_workers=4
)

valid_dataset = ImageFolder(
    "data/food5v2/validation",
    image_transforms
)

valid_dataloader = DataLoader(
    valid_dataset,
    batch_size=32,
    shuffle=True,
    num_workers=4
)

test_dataset = ImageFolder(
    "data/food5v2/testing",
    image_transforms
)

test_dataloader = DataLoader(
    test_dataset,
    batch_size=32,
    shuffle=True,
    num_workers=4
)

### Loading the Pre-Trained Network

Next, we are going to load our pretrained ResNet50. To strip away the final layer (`.fc`), we are going to replace it with an empty `nn.Sequential` module.



In [None]:
pretrained_net = models.resnet50(weights=weights).to(device)
pretrained_net.fc = nn.Sequential()
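The following sketch illustrates why this replacement works. To keep it fast and free of downloads, it uses a tiny made-up network (`TinyNet`, not part of torchvision) in place of ResNet50; the layer sizes are arbitrary. An empty `nn.Sequential` simply returns its input, so the network's output becomes whatever was fed into the original `.fc` layer.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for a pretrained network with an `.fc` head."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(8, 16), nn.ReLU())
        self.fc = nn.Linear(16, 3)

    def forward(self, x):
        return self.fc(self.backbone(x))


net = TinyNet().eval()
x = torch.randn(4, 8)

with torch.no_grad():
    logits_shape = net(x).shape  # (4, 3): logits from the original head
    net.fc = nn.Sequential()     # an empty Sequential passes its input through
    features = net(x)            # (4, 16): penultimate-layer features
    same = torch.equal(features, net.backbone(x))

print(logits_shape, features.shape, same)
```

`nn.Identity()` would have the same effect and arguably states the intent more directly.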

### Preprocessing the Data

It's quite easy to use the network to preprocess the data. We'll just iterate over the data loader for each split (train, valid, test) and collect the extracted features and the corresponding labels into two tensors. Then we are going to set up a `TensorDataset` object and another corresponding data loader for each split.



In [None]:
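def extract_features(feature_extractor, data_loader):
    """Run all batches through the extractor; return stacked features and labels."""
    # inference mode: batch norm layers use their running statistics
    feature_extractor.eval()
    X = []; Y = []

    for X_batch, Y_batch in data_loader:
        X_batch = X_batch.to(device)

        # no gradients are needed, since the extractor is not being trained
        with torch.no_grad():
            X_batch = feature_extractor(X_batch)

        # store the per-sample tensors on the CPU
        X.extend(X_batch.cpu())
        Y.extend(Y_batch.cpu())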
  
    return torch.stack(X), torch.stack(Y)
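The collection pattern used in `extract_features` relies on the fact that iterating over a tensor yields slices along its first dimension. The small sketch below, with made-up random batches in place of real features, shows that extending a list batch by batch and stacking at the end gives the same result as concatenating the batches.

```python
import torch

# Two toy "batches" of 5-dimensional features with their integer labels.
feature_batches = [torch.randn(3, 5), torch.randn(2, 5)]
label_batches = [torch.tensor([0, 1, 2]), torch.tensor([1, 0])]

rows, labels = [], []
for features, targets in zip(feature_batches, label_batches):
    rows.extend(features)   # iterating a 2-D tensor yields its rows
    labels.extend(targets)  # iterating a 1-D tensor yields 0-d tensors

stacked_features = torch.stack(rows)   # shape (5, 5)
stacked_labels = torch.stack(labels)   # shape (5,)

print(stacked_features.shape, stacked_labels)
```

Appending whole batches to the lists and calling `torch.cat` once at the end would produce the same tensors while creating fewer Python objects.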

In [None]:
X_train, Y_train = extract_features(pretrained_net, train_dataloader)
X_valid, Y_valid = extract_features(pretrained_net, valid_dataloader)
X_test, Y_test = extract_features(pretrained_net, test_dataloader)

train_tensor_dataset = TensorDataset(X_train, Y_train)
train_tensor_dataloader = DataLoader(train_tensor_dataset, batch_size=32, shuffle=True)
valid_tensor_dataset = TensorDataset(X_valid, Y_valid)
valid_tensor_dataloader = DataLoader(valid_tensor_dataset, batch_size=32, shuffle=True)
test_tensor_dataset = TensorDataset(X_test, Y_test)
test_tensor_dataloader = DataLoader(test_tensor_dataset, batch_size=32, shuffle=True)

### Training New Layers

The new top of our network will look the same as it did in the previous example. The training loop will also be quite standard – except we'll now be using our `train_tensor_dataloader` and we'll be training our top separately, i.e. it will not be attached to the pretrained network. Since training is going to be much faster now, we can afford to increase the number of epochs as well.



In [None]:
class ModelTop(nn.Module):
    def __init__(self, num_features, num_outputs):
        super().__init__()
        self.dropout = nn.Dropout(0.5)
        self.fc = nn.Linear(num_features, num_outputs)

    def set_dropout(self, p):
        self.dropout.p = p
    
    def forward(self, x):
        y = torch.flatten(x, 1)
        y = self.dropout(y)
        y = self.fc(y)
        return y

In [None]:
# one output per class folder found by ImageFolder
model = ModelTop(X_train.shape[1], len(train_dataset.classes)).to(device)

In [None]:
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
checkpointer = BestModelCheckpointer(checkpoint_path="output/best_model.pt")
loss_train = []
loss_valid = []

for epoch in range(200):
    epoch_train_loss = []
    epoch_valid_loss = []

    model.train()
    for X_batch, Y_batch in train_tensor_dataloader:
        X_batch = X_batch.to(device)
        Y_batch = Y_batch.to(device)
        
        y_batch = model(X_batch)
        loss = criterion(y_batch, Y_batch)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        epoch_train_loss.append(loss.item())

    loss_train.append(np.mean(epoch_train_loss))

    model.eval()
    for X_batch, Y_batch in valid_tensor_dataloader:
        X_batch = X_batch.to(device)
        Y_batch = Y_batch.to(device)
        
        with torch.no_grad():
            y_batch = model(X_batch)
            loss = criterion(y_batch, Y_batch)

        epoch_valid_loss.append(loss.item())

    loss_valid.append(np.mean(epoch_valid_loss))
    checkpointer(loss_valid[-1], model)

    if epoch % 5 == 0:
        print(f"epoch {epoch}, train loss: {np.mean(loss_train[-5:])}, valid loss: {np.mean(loss_valid[-5:])}")

print(f"epoch {epoch}, loss: {loss_train[-1]}")

In [None]:
plt.plot(loss_train, label="train")
plt.plot(loss_valid, label="valid")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.grid(ls='--')
plt.legend()

### Evaluation

Once training is done, we can load back the weights with the best validation loss and evaluate the model on the test set.



In [None]:
model.load_state_dict(torch.load("output/best_model.pt"));
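The idea behind best-model checkpointing can be sketched as follows. This is not the `class_utils` implementation, whose internals are not shown here; `SimpleBestCheckpointer` is a hypothetical helper that is assumed to save the model's state dict whenever the monitored loss improves. The toy model, the loss values and the temporary path are made up for the demonstration.

```python
import os
import tempfile

import torch
import torch.nn as nn


class SimpleBestCheckpointer:
    """Hypothetical helper: save the state dict whenever the loss improves."""

    def __init__(self, checkpoint_path):
        self.checkpoint_path = checkpoint_path
        self.best_loss = float("inf")

    def __call__(self, loss, model):
        if loss < self.best_loss:
            self.best_loss = loss
            torch.save(model.state_dict(), self.checkpoint_path)
            return True
        return False


toy_model = nn.Linear(2, 1)
path = os.path.join(tempfile.mkdtemp(), "best_model.pt")
demo_checkpointer = SimpleBestCheckpointer(path)

saved = []
for step, loss in enumerate([0.9, 0.5, 0.7]):
    with torch.no_grad():
        toy_model.weight.fill_(float(step))  # pretend training changed the weights
    saved.append(demo_checkpointer(loss, toy_model))

# Restore the checkpoint into a fresh model: it holds the weights from step 1,
# the step with the lowest loss, not the weights from the final step.
restored = nn.Linear(2, 1)
restored.load_state_dict(torch.load(path))
print(saved, restored.weight)
```

Loading the checkpoint therefore rolls the model back to the epoch with the lowest validation loss, even if later epochs overfit.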

In [None]:
eval_Y = []
eval_y = []

model.eval()
for X_batch, Y_batch in test_tensor_dataloader:
    eval_Y.extend(Y_batch.numpy())
    X_batch = X_batch.to(device)
    Y_batch = Y_batch.to(device)
    
    with torch.no_grad():
        y_batch = model(X_batch)

    eval_y.extend(y_batch.argmax(dim=1).cpu().numpy())

eval_Y = np.array(eval_Y)
eval_y = np.array(eval_y)

cm = pd.crosstab(
    eval_Y, eval_y,
    rownames=['actual'],
    colnames=['predicted']
)
print(cm, '\n')

acc = accuracy_score(eval_Y, eval_y)
print("Accuracy = {}".format(acc))
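The accuracy printed above can also be read off the confusion matrix: it is the sum of the diagonal divided by the total count, and dividing the diagonal by the row sums gives the per-class recall. The sketch below uses a handful of made-up toy labels (not results from this notebook) and plain NumPy.

```python
import numpy as np

# Toy labels for illustration only.
actual = np.array([0, 0, 1, 1, 2, 2, 2])
predicted = np.array([0, 1, 1, 1, 2, 0, 2])
num_classes = 3

# Rows index the actual class, columns the predicted class.
toy_cm = np.zeros((num_classes, num_classes), dtype=int)
np.add.at(toy_cm, (actual, predicted), 1)

accuracy = np.trace(toy_cm) / toy_cm.sum()     # correct predictions / all samples
recall = np.diag(toy_cm) / toy_cm.sum(axis=1)  # per-class fraction recovered

print(toy_cm)
print(accuracy, recall)
```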

### Putting the Model Back Together

Finally, we can put the model back together so that we are able to run it on the original data. This is very simple – we'll only need to assign our `model` to `pretrained_net.fc`.



In [None]:
pretrained_net.fc = model
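One way to convince yourself that the reassembled network behaves like the two separate stages is to compare their outputs. The sketch below does this with toy stand-ins: `ToyBackboneNet` and the linear `head` are made up for the illustration and are much smaller than ResNet50 and `ModelTop`.

```python
import torch
import torch.nn as nn


class ToyBackboneNet(nn.Module):
    """Hypothetical stand-in for a network with a replaceable `.fc` head."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(8, 16), nn.ReLU())
        self.fc = nn.Sequential()  # head stripped: the network returns features

    def forward(self, x):
        return self.fc(self.backbone(x))


torch.manual_seed(0)
net = ToyBackboneNet().eval()
head = nn.Linear(16, 3).eval()  # plays the role of the separately trained top
x = torch.randn(4, 8)

with torch.no_grad():
    features = net(x)                 # stage 1: feature extraction
    separate_logits = head(features)  # stage 2: top applied to the features
    net.fc = head                     # put the model back together
    joined_logits = net(x)            # both stages in a single forward pass

match = torch.allclose(separate_logits, joined_logits)
print(match)
```

In evaluation mode the two paths perform the same computation, so test accuracy on the original images should match the accuracy obtained on the precomputed features.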

In [None]:
eval_Y = []
eval_y = []

pretrained_net.eval()
for X_batch, Y_batch in test_dataloader:
    eval_Y.extend(Y_batch.numpy())
    X_batch = X_batch.to(device)
    Y_batch = Y_batch.to(device)
    
    with torch.no_grad():
        y_batch = pretrained_net(X_batch)

    eval_y.extend(y_batch.argmax(dim=1).cpu().numpy())

eval_Y = np.array(eval_Y)
eval_y = np.array(eval_y)

cm = pd.crosstab(
    eval_Y, eval_y,
    rownames=['actual'],
    colnames=['predicted']
)
print(cm, '\n')

acc = accuracy_score(eval_Y, eval_y)
print("Accuracy = {}".format(acc))