This notebook reuses the code structure of the *cnn_classification* notebook to compare well-known models that were pretrained on other datasets. In particular, we test the following networks on the CIFAR10 dataset: VGG16, ResNet50 and DenseNet121. These networks won some challenges and were state-of-the-art models in image classification for a time. We will see which one best fits this task.

## Libraries and data import

In [7]:
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms, models
import torch.nn.functional as F
from datasets import load_dataset
import matplotlib.pyplot as plt
import pandas as pd

### CIFAR10 dataset 

We check that the CIFAR10 dataset has been loaded correctly by printing the size of the first training image.

In [8]:
cifar = load_dataset("cifar10")
# Indexing the row first decodes only one image instead of the whole column
first_image = cifar["train"][0]["img"]
print(first_image.size)

(32, 32)


We extract the class labels from the dataset object.

In [9]:
labels = cifar["train"].features["label"].names
label2id, id2label = dict(), dict()
for i, label in enumerate(labels):
    label2id[label] = str(i)
    id2label[str(i)] = label
    
print(labels)

['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
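
As a small illustration of how these mappings can be used, the sketch below converts a class index back to its name. The `predicted_index` value is made up for the example; note that the keys of `id2label` are strings, as in the cell above.

```python
# Class names as printed above
labels = ['airplane', 'automobile', 'bird', 'cat', 'deer',
          'dog', 'frog', 'horse', 'ship', 'truck']

# Same mappings as in the notebook: ids are stored as strings
label2id = {label: str(i) for i, label in enumerate(labels)}
id2label = {str(i): label for i, label in enumerate(labels)}

predicted_index = 3  # hypothetical model prediction
predicted_name = id2label[str(predicted_index)]
print(predicted_name)  # cat
```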


### Dataset Class

Before converting the data to a Dataset object, we define some transformations (this is where data augmentation could be added).

In [10]:
transform = transforms.Compose([
    # you can add other transformations in this list
    transforms.Resize((32, 32)),  # CIFAR10 images are already 32x32
    transforms.ToTensor()         # PIL image -> float tensor in [0, 1]
])


The Dataset class is defined below.

In [11]:
class Dataset(Dataset):
    
    # Constructor 
    def __init__(self, X_data, Y_data, transform=transform):
        self.len = len(X_data)
        self.x = X_data
        self.y = Y_data
        self.transform = transform
             
    # Getter
    def __getitem__(self, index):
        x = self.x[index] 
        y = self.y[index]
        if self.transform:
            x = self.transform(x)     
        return x, y
    
    # Get Length
    def __len__(self):
        return self.len


We create the DataLoader instances that will be used in the training, validation and testing of the model.

In [12]:
batch_size = 256

# The first 10,000 training images are held out for validation
dataset_val = Dataset(X_data=cifar["train"]["img"][:10000], Y_data=cifar["train"]["label"][:10000])
val_loader = DataLoader(dataset=dataset_val, batch_size=batch_size, shuffle=True)
# Remaining training images; [10000:] keeps the last sample, which [10000:-1] would drop
dataset_train = Dataset(X_data=cifar["train"]["img"][10000:], Y_data=cifar["train"]["label"][10000:])
train_loader = DataLoader(dataset=dataset_train, batch_size=batch_size, shuffle=True)
dataset_test = Dataset(X_data=cifar["test"]["img"], Y_data=cifar["test"]["label"])
test_loader = DataLoader(dataset=dataset_test, batch_size=batch_size, shuffle=True)
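
A quick way to sanity-check a loader is to pull one batch and look at its shape. The sketch below does this on random stand-in tensors wrapped in a `TensorDataset` (an assumption made to keep the example self-contained), not on the CIFAR10 loaders themselves.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Random stand-in data: 600 fake RGB images of size 32x32 with labels in [0, 9]
images = torch.rand(600, 3, 32, 32)
targets = torch.randint(0, 10, (600,))
loader = DataLoader(TensorDataset(images, targets), batch_size=256, shuffle=True)

x, y = next(iter(loader))
print(x.shape, y.shape)  # torch.Size([256, 3, 32, 32]) torch.Size([256])
print(len(loader))       # 3 batches: 256 + 256 + 88
```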

## Loading trained models

We use the *torchvision* models interface to load these networks; more networks could be added to the list. Each model is loaded with its pretrained weights, so that only the final layers will have to be trained.

In [13]:
model_names = ['resnet50', 'vgg16', 'densenet121']
models_to_compare = {}

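for name in model_names:
    # pretrained=True loads ImageNet weights; torchvision 0.13+ deprecates it in favor of weights="DEFAULT"
    model = getattr(models, name)(pretrained=True)
    models_to_compare[name] = model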



We use the following function to freeze the pretrained layers and replace the final layer(s), so that the output is given by 10 neurons corresponding to the 10 classes of the dataset.

In [14]:
def modify_model(model, num_classes):
    # Freeze all layers
    for param in model.parameters():
        param.requires_grad = False

    # Modify the final layer based on model architecture
    if isinstance(model, models.ResNet):
        num_ftrs = model.fc.in_features
        model.fc = nn.Linear(num_ftrs, num_classes)

    elif isinstance(model, models.VGG):
        # For VGG models, classifier is a Sequential module
        num_ftrs = model.classifier[0].in_features
        model.classifier = nn.Sequential(
            nn.Linear(num_ftrs, 256),
            nn.ReLU(inplace=True),
            nn.Dropout(0.4),
            nn.Linear(256, num_classes)
        )

    elif isinstance(model, models.DenseNet):
        # For DenseNet models, classifier is a Linear layer
        num_ftrs = model.classifier.in_features
        model.classifier = nn.Linear(num_ftrs, num_classes)
        
    else:
        raise NotImplementedError(f"Model architecture {type(model)} not supported")

    return model

for name in models_to_compare:
    models_to_compare[name] = modify_model(models_to_compare[name], 10)
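
To confirm that freezing worked, one can count trainable versus total parameters. The helper `count_parameters` below is a hypothetical addition, demonstrated on a tiny stand-in network rather than on the torchvision models, so that it runs without downloading weights.

```python
import torch.nn as nn

def count_parameters(model):
    """Return (trainable, total) parameter counts for a model."""
    total = sum(p.numel() for p in model.parameters())
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    return trainable, total

# Tiny stand-in network (not one of the torchvision models)
toy = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 10))

# Freeze everything, then replace the last layer with a fresh trainable head
for param in toy.parameters():
    param.requires_grad = False
toy[2] = nn.Linear(4, 10)

trainable, total = count_parameters(toy)
print(trainable, total)  # 50 86
```

Calling the same helper on each entry of `models_to_compare` should show that only a small fraction of the parameters remains trainable.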

Below we define the loss, an accuracy metric, and a function that returns an optimizer acting only on the trainable (final) layers.

In [15]:
import torch.optim as optim
criterion = nn.CrossEntropyLoss()

def accuracy(outputs, labels):
    _, preds = torch.max(outputs, dim=1)
    return torch.tensor(torch.sum(preds == labels).item() / len(preds))

def get_optimizer(model):
    # Only parameters of final layers are being optimized
    return optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=0.001)
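
As a quick check of the metric, the sketch below applies the same `accuracy` function to a few hand-written logits; the values are made up for illustration.

```python
import torch

def accuracy(outputs, labels):
    # Same definition as in the notebook: fraction of correct argmax predictions
    _, preds = torch.max(outputs, dim=1)
    return torch.tensor(torch.sum(preds == labels).item() / len(preds))

# Four samples, three classes; the predicted classes are 0, 1, 2, 0
logits = torch.tensor([[2.0, 0.1, 0.3],
                       [0.2, 1.5, 0.1],
                       [0.3, 0.2, 0.9],
                       [1.2, 0.4, 0.8]])
targets = torch.tensor([0, 1, 2, 1])

result = accuracy(logits, targets).item()
print(result)  # 0.75
```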


## Training models

In [18]:
import time
import copy

def train_model(model, criterion, optimizer, num_epochs=25):
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")  # Use GPU if available
    model.to(device)
    best_model_wts = copy.deepcopy(model.state_dict())
    best_loss = float('inf')
    patience = 3  # epochs without improvement tolerated before stopping early

    epoch_train_loss = []
    epoch_train_acc = []
    epoch_val_loss = []
    epoch_val_acc = []

    for epoch in range(num_epochs):
        print(f'Epoch {epoch}/{num_epochs - 1}')
        print('-' * 10)

        ###### TRAINING #######
        print("Training")
        model.train()

        train_losses = []
        train_acc = []

        for x,y in train_loader:
            x, y = x.to(device), y.to(device)

            # Zero the parameter gradients
            optimizer.zero_grad()

            # Forward pass
            output = model(x)
            loss = criterion(output, y)
            acc = accuracy(output, y)
            loss.backward()
            optimizer.step()

            # Statistics
            train_losses.append(loss.item())
            train_acc.append(acc.item())
        
        epoch_train_loss.append(sum(train_losses)/len(train_losses))
        epoch_train_acc.append(sum(train_acc)/len(train_acc))

        ### VALIDATION ###
        print("Validation")
        val_losses = []
        val_acc = []
        val_loss_total = 0

        model.eval()
        with torch.no_grad():
            for x,y in val_loader:

                x = x.to(device)
                y = y.to(device)
                output = model(x)
                loss = criterion(output, y)
                acc = accuracy(output, y)
                
                val_losses.append(loss.item())
                val_acc.append(acc.item())
                
        epoch_val_loss.append(sum(val_losses)/len(val_losses))
        epoch_val_acc.append(sum(val_acc)/len(val_acc))

        print(f"Valid epoch {epoch} loss:", "{:.4f}".format(sum(val_losses)/len(val_losses)), "acc", "{:.4f}".format(sum(val_acc)/len(val_acc)),"\n")
        print()
        val_loss_total = sum(val_losses)/len(val_losses)
        if val_loss_total < best_loss:
            best_loss = val_loss_total
            best_model_wts = copy.deepcopy(model.state_dict())
            patience = 3  
        else:
            patience -= 1
            if patience == 0:
                break   
            
    model.load_state_dict(best_model_wts)
    return model
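
The early-stopping rule used at the end of each epoch can be isolated in a small helper. The `EarlyStopping` class below is a hypothetical refactoring of that logic (the counter is reset on every improvement and training stops when it reaches zero), shown on made-up validation losses.

```python
class EarlyStopping:
    """Track the best validation loss and signal when to stop."""

    def __init__(self, patience=3):
        self.patience = patience
        self.best_loss = float("inf")
        self.remaining = patience

    def step(self, val_loss):
        """Return True when training should stop."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.remaining = self.patience  # improvement: reset the counter
            return False
        self.remaining -= 1
        return self.remaining == 0

stopper = EarlyStopping(patience=3)
val_losses = [1.0, 0.8, 0.9, 0.85, 0.81, 0.7]  # made-up values
stopped_at = None
for epoch, loss in enumerate(val_losses):
    if stopper.step(loss):
        stopped_at = epoch
        break
print(stopped_at, stopper.best_loss)  # 4 0.8
```

Note that the patience counts epochs without a new best loss: the last value (0.7) is never reached because training stops after three epochs above 0.8.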


Once the training function has been defined, we can call it on the different models and compare the results.

In [19]:
# Train each model
trained_models = {}
for name, model in models_to_compare.items():
    print(f'\nTraining {name} model...')
    optimizer = get_optimizer(model)
    trained_models[name] = train_model(model, criterion, optimizer, num_epochs=5)



Training resnet50 model...
Epoch 0/4
----------
Training
Validation
Valid epoch 0 loss: 1.5826 acc 0.4654 


Epoch 1/4
----------
Training
Validation
Valid epoch 1 loss: 1.5396 acc 0.4714 


Epoch 2/4
----------
Training
Validation
Valid epoch 2 loss: 1.4832 acc 0.4981 


Epoch 3/4
----------
Training
Validation
Valid epoch 3 loss: 1.4787 acc 0.4972 


Epoch 4/4
----------
Training
Validation
Valid epoch 4 loss: 1.4515 acc 0.5051 



Training vgg16 model...
Epoch 0/4
----------
Training
Validation
Valid epoch 0 loss: 1.1994 acc 0.5813 


Epoch 1/4
----------
Training
Validation
Valid epoch 1 loss: 1.1609 acc 0.5977 


Epoch 2/4
----------
Training
Validation
Valid epoch 2 loss: 1.1561 acc 0.6015 


Epoch 3/4
----------
Training
Validation
Valid epoch 3 loss: 1.1533 acc 0.6009 


Epoch 4/4
----------
Training
Validation
Valid epoch 4 loss: 1.1363 acc 0.6025 



Training densenet121 model...
Epoch 0/4
----------
Training
Validation
Valid epoch 0 loss: 1.4708 acc 0.5069 


Epoch 1/4
----

## Evaluating the model on the test data

In [20]:
def evaluate_model(model):
    model.eval()
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    model.to(device)

    test_losses = []
    test_acc = []

    with torch.no_grad():
        for x,y in test_loader:
            x, y = x.to(device), y.to(device)
            output = model(x)
            loss = criterion(output, y)
            acc = accuracy(output, y)
            
            test_losses.append(loss.item())
            test_acc.append(acc.item())

    acc = sum(test_acc)/len(test_acc)
    return acc
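
`evaluate_model` averages the per-batch accuracies. With 10,000 test images and a batch size of 256, the last batch holds only 16 images yet counts as much as a full batch. The sketch below, on made-up counts, contrasts this with a sample-weighted accuracy; both function names are hypothetical.

```python
def mean_of_batch_accuracies(batches):
    """Unweighted mean of per-batch accuracies, as in evaluate_model."""
    return sum(correct / size for correct, size in batches) / len(batches)

def sample_weighted_accuracy(batches):
    """Total correct predictions divided by total number of samples."""
    total_correct = sum(correct for correct, _ in batches)
    total_samples = sum(size for _, size in batches)
    return total_correct / total_samples

# Made-up (correct, batch_size) pairs: 39 full batches and a final batch of 16
batches = [(128, 256)] * 39 + [(16, 16)]
unweighted = mean_of_batch_accuracies(batches)
weighted = sample_weighted_accuracy(batches)
print(unweighted)  # 0.5125
print(weighted)    # 0.5008
```

The difference is small here, but the weighted version does not depend on how samples fall into batches.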


The oldest model (VGG16) seems to perform best: it reaches about 0.61 test accuracy, against roughly 0.52 for ResNet50 and 0.53 for DenseNet121.

In [21]:

# Evaluate each model
model_performances = {}
for name, model in trained_models.items():
    acc = evaluate_model(model)
    model_performances[name] = acc
    print(f'Model {name} Accuracy: {acc:.4f}')

Model resnet50 Accuracy: 0.5161
Model vgg16 Accuracy: 0.6095
Model densenet121 Accuracy: 0.5320


In [22]:
import pandas as pd

performance_df = pd.DataFrame(list(model_performances.items()), columns=['Model', 'Accuracy'])
print(performance_df)

         Model  Accuracy
0     resnet50  0.516113
1        vgg16  0.609473
2  densenet121  0.532031
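
To make the comparison easier to read, the table can be ranked by accuracy. The sketch below rebuilds the DataFrame from the accuracies reported above and sorts it; the variable names `ranked` and `best_model` are assumptions for this example.

```python
import pandas as pd

# Test accuracies reported above
model_performances = {"resnet50": 0.516113, "vgg16": 0.609473, "densenet121": 0.532031}

performance_df = pd.DataFrame(list(model_performances.items()), columns=["Model", "Accuracy"])

# Sort from best to worst and renumber the rows
ranked = performance_df.sort_values("Accuracy", ascending=False).reset_index(drop=True)
best_model = ranked.loc[0, "Model"]
print(ranked)
print(best_model)  # vgg16
```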
