
Transfer Learning for Art Classification
==========================
**Authors**: Jasper van Tilburg, Martijn Bosma, Thomas Barendse

In this notebook, we aim to reproduce the results of the paper by Sabatelli et al. []. We apply two Transfer Learning (TL) procedures to the Rijksmuseum dataset used in the paper, as well as to a subset of the iMet dataset (see [link to kaggle]).

The code below is mainly based on two PyTorch tutorials on TL:
- https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html
- https://pytorch.org/tutorials/beginner/finetuning_torchvision_models_tutorial.html

The two TL procedures are:

- **Finetuning**: Taking the parameters of the model trained on ImageNet as a starting point, we train all parameters of the model further on our dataset.
- **Off-the-shelf**: Taking the parameters of the model trained on ImageNet as a starting point, we retrain only the final (softmax) classification layer of the network. All other parameters stay fixed.

The aim of the reproduction is to verify the results presented in the paper.



Importing data from Kaggle
--------------------------

The following link describes how to obtain an API token, which is needed to download the data from Kaggle: https://www.kaggle.com/general/74235

In [0]:
%matplotlib inline
from google.colab import files

In [0]:
!pip install -q kaggle

In [3]:
files.upload()

Saving kaggle.json to kaggle.json


{'kaggle.json': b'{"username":"thomasbarendse96","key":"<redacted>"}'}

In [0]:
! mkdir -p ~/.kaggle
! cp kaggle.json ~/.kaggle/
! chmod 600 ~/.kaggle/kaggle.json
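
The token setup performed by the shell commands in this cell can also be written in Python. The sketch below is only an illustration: `install_kaggle_token` is a hypothetical helper name, and the demo writes a dummy token into a temporary directory instead of the real home directory so that it is safe to run.

```python
import os
import shutil
import stat
import tempfile


def install_kaggle_token(token_path, home_dir):
    """Copy a Kaggle API token to <home_dir>/.kaggle with owner-only access.

    Hypothetical helper that mirrors the mkdir / cp / chmod 600 commands.
    """
    kaggle_dir = os.path.join(home_dir, ".kaggle")
    os.makedirs(kaggle_dir, exist_ok=True)   # create the directory if missing
    target = os.path.join(kaggle_dir, "kaggle.json")
    shutil.copyfile(token_path, target)      # copy the token file
    os.chmod(target, 0o600)                  # read/write for the owner only
    return target


# Demo in a throwaway directory with a dummy token (not a real credential).
with tempfile.TemporaryDirectory() as tmp:
    dummy = os.path.join(tmp, "kaggle.json")
    with open(dummy, "w") as fh:
        fh.write('{"username": "example", "key": "dummy"}')
    installed = install_kaggle_token(dummy, home_dir=tmp)
    mode = stat.S_IMODE(os.stat(installed).st_mode)
    print(oct(mode))
```

Restricting the file to its owner matters because the token grants access to the Kaggle account it belongs to.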

In [7]:
!kaggle datasets download -d 'martybosma/rijksbymaterialfiltered' -p /content
!unzip \*.zip

Streaming output truncated to the last 5000 lines.
  inflating: rijks_by_material_filtered/rijks_jpg/val/papier/0058439_RP-P-1911-526.jpg  
  inflating: rijks_by_material_filtered/rijks_jpg/val/papier/0058454_RP-P-OB-24.354.jpg  
  inflating: rijks_by_material_filtered/rijks_jpg/val/papier/0058518_RP-P-1882-A-5325.jpg  
  inflating: rijks_by_material_filtered/rijks_jpg/val/papier/0058563_RP-P-1878-A-1867.jpg  
  inflating: rijks_by_material_filtered/rijks_jpg/val/papier/0058581_RP-P-OB-24.382.jpg  
  inflating: rijks_by_material_filtered/rijks_jpg/val/papier/0058582_RP-P-1895-A-18678.jpg  
  inflating: rijks_by_material_filtered/rijks_jpg/val/papier/0058602_RP-P-OB-24.439.jpg  
  inflating: rijks_by_material_filtered/rijks_jpg/val/papier/0058608_RP-P-OB-24.432.jpg  
  inflating: rijks_by_material_filtered/rijks_jpg/val/papier/0058611_RP-P-OB-24.429.jpg  
  inflating: rijks_by_material_filtered/rijks_jpg/val/papier/0058615_RP-P-OB-24.426.jpg  
  inflating: rijks_by_materia

Python imports
--------------

In [0]:
from __future__ import print_function, division

import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
import numpy as np
import torchvision
from torchvision import datasets, models, transforms
import matplotlib.pyplot as plt
import time
import os
import copy

plt.ion()   # interactive mode

Preprocessing the Data
---------

We use torch.utils.data.DataLoader to load the data in batches for training. Only minimal data transformation is applied; most importantly, transforms.ToTensor() maps pixel values to the [0,1] interval.
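
To make the [0,1] mapping concrete, the sketch below imitates with plain NumPy what transforms.ToTensor() does to an 8-bit RGB image: it divides by 255 and moves the channel axis to the front. The helper name `to_tensor_like` and the 2x2 image are made up for illustration, and the real transform returns a PyTorch tensor rather than a NumPy array.

```python
import numpy as np


def to_tensor_like(image_hwc_uint8):
    """NumPy imitation of ToTensor for a uint8 image of shape (H, W, C)."""
    scaled = image_hwc_uint8.astype(np.float32) / 255.0  # 0..255 -> 0.0..1.0
    return np.transpose(scaled, (2, 0, 1))               # (H, W, C) -> (C, H, W)


# A made-up 2x2 RGB image that contains both extreme pixel values.
image = np.array([[[0, 128, 255], [255, 255, 255]],
                  [[0, 0, 0], [64, 32, 16]]], dtype=np.uint8)

tensor_like = to_tensor_like(image)
print(tensor_like.shape, tensor_like.min(), tensor_like.max())
```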

We also select Colab's GPU as the training device when one is available, and fall back to the CPU otherwise.


In [0]:
data_transforms = {
    'train': transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.ToTensor(),
    ]),
    'val': transforms.Compose([
        transforms.Resize(224),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
    ]),
}

data_dir = './rijks_by_material_filtered/rijks_jpg/'

image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x),
                                          data_transforms[x])
                  for x in ['train', 'val']}
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=32,
                                             shuffle=True, num_workers=4)
              for x in ['train', 'val']}
dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'val']}
class_names = image_datasets['train'].classes

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

Training the model
------------------

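The following function trains the initialised model on the provided dataset, evaluates it on the validation set after every epoch, and returns the model with the weights that achieved the best validation accuracy.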



In [0]:
def train_model(model, criterion, optimizer, num_epochs=25):
    since = time.time()
    time_elapsed = 0

    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.0

    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch+1, num_epochs))
        print('-' * 10)

        # Each epoch has a training and validation phase
        for phase in ['train', 'val']:
            if phase == 'train':
                model.train()  # Set model to training mode
            else:
                model.eval()   # Set model to evaluate mode

            running_loss = 0.0
            running_corrects = 0

            # Iterate over data.
            for inputs, labels in dataloaders[phase]:
                inputs = inputs.to(device)
                labels = labels.to(device)

                # zero the parameter gradients
                optimizer.zero_grad()

                # forward
                # track history if only in train
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    _, preds = torch.max(outputs, 1)
                    loss = criterion(outputs, labels)

                    # backward + optimize only if in training phase
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

                # statistics
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)

            epoch_loss = running_loss / dataset_sizes[phase]
            epoch_acc = running_corrects.double() / dataset_sizes[phase]

            print('{} Loss: {:.4f} Acc: {:.4f}'.format(
                phase, epoch_loss, epoch_acc))

            # deep copy the model
            if phase == 'val' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())
        
        # Duration of this epoch: total elapsed time now minus the total before it
        epoch_time = time.time() - time_elapsed - since
        time_elapsed = time.time() - since
        print('Epoch complete in {:.0f}m {:.0f}s'.format(
              epoch_time // 60, epoch_time % 60))
        print('Total time {:.0f}m {:.0f}s'.format(
              time_elapsed // 60, time_elapsed % 60))

    print('Best val Acc: {:.4f}'.format(best_acc))

    # load best model weights
    model.load_state_dict(best_model_wts)
    return model
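
The loss bookkeeping in train_model multiplies each batch loss by the batch size before summing. This is needed because nn.CrossEntropyLoss returns the mean over a batch by default, and the last batch of an epoch is usually smaller than the others. A minimal pure-Python sketch, with made-up batch sizes and losses, shows how this differs from naively averaging the batch means:

```python
# Made-up (batch_size, mean_batch_loss) pairs; the last batch is smaller,
# as happens when the dataset size is not a multiple of the batch size.
batches = [(32, 0.50), (32, 0.40), (4, 2.00)]

dataset_size = sum(size for size, _ in batches)

# Same accumulation pattern as running_loss in train_model.
running_loss = 0.0
for size, mean_loss in batches:
    running_loss += mean_loss * size
epoch_loss = running_loss / dataset_size

# Naive alternative: average the per-batch means and ignore the batch sizes.
naive_loss = sum(mean_loss for _, mean_loss in batches) / len(batches)

print(round(epoch_loss, 4), round(naive_loss, 4))
```

The weighted version gives every image the same influence on the epoch loss, whereas the naive average lets the four images of the small batch count as much as a full batch of 32.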

Importing a pretrained model
----------------------------

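The functions below import a pretrained model and specify whether to freeze its pretrained parameters. Training with frozen parameters is referred to as feature extraction (feature_extract = True freezes them). In every case the final classification layer is replaced by a new layer with num_classes outputs.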

In [0]:
def set_parameter_requires_grad(model, feature_extracting):
    if feature_extracting:
        for param in model.parameters():
            param.requires_grad = False

def initialize_model(model_name, num_classes, feature_extract, use_pretrained=True):
    # Initialize these variables which will be set in this if statement. Each of these
    #   variables is model specific.
    model_ft = None

    if model_name == "resnet":
        """ Resnet50
        """
        model_ft = models.resnet50(pretrained=use_pretrained)
        set_parameter_requires_grad(model_ft, feature_extract)
        num_ftrs = model_ft.fc.in_features
        model_ft.fc = nn.Linear(num_ftrs, num_classes)

    elif model_name == "alexnet":
        """ Alexnet
        """
        model_ft = models.alexnet(pretrained=use_pretrained)
        set_parameter_requires_grad(model_ft, feature_extract)
        num_ftrs = model_ft.classifier[6].in_features
        model_ft.classifier[6] = nn.Linear(num_ftrs,num_classes)

    elif model_name == "vgg":
        """ VGG11_bn
        """
        model_ft = models.vgg11_bn(pretrained=use_pretrained)
        set_parameter_requires_grad(model_ft, feature_extract)
        num_ftrs = model_ft.classifier[6].in_features
        model_ft.classifier[6] = nn.Linear(num_ftrs,num_classes)

    return model_ft
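
The order of operations in initialize_model matters: the parameters are frozen first and the final layer is replaced afterwards, so the new layer keeps PyTorch's default requires_grad=True. The sketch below reproduces that pattern on a tiny stand-in network rather than one of the torchvision models, so nothing is downloaded. `TinyNet`, `build` and `count_trainable` are hypothetical names used only for this illustration.

```python
import torch.nn as nn


class TinyNet(nn.Module):
    """Stand-in for a pretrained model: one feature layer plus a head `fc`."""

    def __init__(self):
        super().__init__()
        self.features = nn.Linear(8, 4)   # 8*4 weights + 4 biases = 36 parameters
        self.fc = nn.Linear(4, 10)        # original head, replaced in build()

    def forward(self, x):
        return self.fc(self.features(x).relu())


def count_trainable(model):
    """Number of scalar parameters that will receive gradients."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


def build(feature_extract, num_classes=3):
    model = TinyNet()
    if feature_extract:  # same logic as set_parameter_requires_grad
        for param in model.parameters():
            param.requires_grad = False
    # A freshly constructed layer is trainable regardless of the freezing above.
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model


finetune = build(feature_extract=False)
off_the_shelf = build(feature_extract=True)
print(count_trainable(finetune), count_trainable(off_the_shelf))
```

With finetuning, all 51 parameters of the stand-in remain trainable; in the off-the-shelf setting only the 15 parameters of the new head do.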

Finetuning TL
----------------------

Load a pretrained model and allow for retraining all parameters (feature_extract = False). We use the cross entropy loss, and stochastic gradient descent with the parameters as specified in the paper.




In [13]:
model_name = 'alexnet' 
num_classes = len(class_names)

model_ft = initialize_model(model_name, num_classes, feature_extract=False, use_pretrained=True)

model_ft = model_ft.to(device)

criterion = nn.CrossEntropyLoss()

# Observe that all parameters are being optimized
optimizer_ft = optim.SGD(model_ft.parameters(), lr=0.001, momentum=0.9, nesterov=True)

Downloading: "https://download.pytorch.org/models/alexnet-owt-4df8aa71.pth" to /root/.cache/torch/checkpoints/alexnet-owt-4df8aa71.pth






In [17]:
model_ft = train_model(model_ft, criterion, optimizer_ft, num_epochs=25)

Epoch 1/25
----------
train Loss: 0.3988 Acc: 0.9035
val Loss: 0.3174 Acc: 0.9173
Epoch complete in 11m 59s
Total time 11m 59s
Epoch 2/25
----------


KeyboardInterrupt: ignored

"Off-the-shelf" TL
----------------------------------

We now set feature_extract = True, so that only the final softmax layer is retrained. We use the same loss function, but for optimisation we now use RMSprop, as the authors of the paper did.

In [0]:
model_name = 'resnet' 
num_classes = len(class_names)

model_ots = initialize_model(model_name, num_classes, feature_extract=True, use_pretrained=True)

model_ots = model_ots.to(device)

criterion = nn.CrossEntropyLoss()

optimizer_ots = optim.RMSprop([param for param in model_ots.parameters() if param.requires_grad],
                               lr=0.001, eps=1e-08, momentum=0.9)
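
Passing only the parameters with requires_grad=True to the optimizer means that an update step can change nothing but the new final layer. The following sketch checks this on a small stand-in model with random data. The layer sizes, the random batch and the variable names are assumptions made for illustration; the RMSprop settings are the ones used in this cell.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Stand-in for a frozen backbone followed by a new, trainable classifier.
backbone = nn.Linear(8, 4)
for param in backbone.parameters():
    param.requires_grad = False
head = nn.Linear(4, 3)
model = nn.Sequential(backbone, nn.ReLU(), head)

# Only the head's weight and bias are handed to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = optim.RMSprop(trainable, lr=0.001, eps=1e-08, momentum=0.9)

backbone_before = backbone.weight.clone()
head_before = head.weight.clone()

# One optimisation step on a random batch of 16 samples with 3 classes.
inputs = torch.randn(16, 8)
labels = torch.randint(0, 3, (16,))
loss = nn.CrossEntropyLoss()(model(inputs), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()

backbone_unchanged = torch.equal(backbone.weight, backbone_before)
head_changed = not torch.equal(head.weight, head_before)
print(backbone_unchanged, head_changed)
```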

In [0]:
model_ots = train_model(model_ots, criterion, optimizer_ots, num_epochs=25)

Epoch 1/25
----------
train Loss: 2.0319 Acc: 0.8811
val Loss: 2.2615 Acc: 0.8976
Epoch complete in 12m 58s
Total time 12m 58s
Epoch 2/25
----------
