## A ResNet Model to Classify BCN20000 Dataset

This is the second model built to classify the BCN20000 dataset. More details about the dataset can be found in the [first notebook](https://github.com/ngpraveen/Deep-Learning-models-for-BCN20000-dataset-classification/blob/main/CNN_Model_for_BCN20000_dataset.ipynb).

In this notebook, we apply transfer learning with a ResNet-18 model initialized with its default pretrained weights. All layers except the last fully connected layer are frozen, so only that layer is trained.


### 1. Import Libraries

In [1]:
import os
import time
import pandas as pd
from PIL import Image

import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
from torchvision import transforms
from torch.utils.data import DataLoader, Dataset, random_split
from torchvision import models

In [2]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
device

device(type='cpu')

In [3]:
root_dir = '/mnt/c/Users/prave/data/bcn20k_figshare/'

target_map = {'NV': 0, 'MEL': 1, 'BCC': 2, 'BKL': 3, 'AK': 4, 'SCC': 5, 'DF': 6, 'VASC': 7}


### 2. Create a Custom Dataset

In [4]:
class BCNDataset(Dataset):
    """
    Creates a custom dataset class that inherits from PyTorch's Dataset class.
    The metadata content (bcn_20k_train.csv) is added to an attribute `metadata`.
    The output class is defined in the column `diagnosis` which is coded to 0, 1, 2, ...
    as defined in `target_map`.

    Args:
        root_dir (str): The root directory where the dataset is stored. 
        transform (callable, optional): Optional transform to be applied to the input data.
    """
    
    def __init__(self, root_dir: str, transform=None):
        self.root_dir = root_dir
        self.image_dir = os.path.join(root_dir, "train")
        self.transform = transform
        self.metadata = self.load_metadata()


    def __len__(self):
        return len(self.metadata)


    def __getitem__(self, idx:int):
        image, label = self.retrieve_image(idx)
        if self.transform:
            image = self.transform(image)        

        return image, label


    def load_metadata(self):
        metadata = pd.read_csv(
            os.path.join(self.root_dir, "bcn_20k_train.csv")
        )
        
        target_map = {'NV': 0, 'MEL': 1, 'BCC': 2, 'BKL': 3, 'AK': 4, 'SCC': 5, 'DF': 6, 'VASC': 7}
        metadata['target'] = metadata['diagnosis'].map(target_map)
        return metadata
    

    def retrieve_image(self, idx: int):
        image_name = self.metadata["bcn_filename"].iloc[idx]
        image_path = os.path.join(self.image_dir, image_name) 
        label = self.metadata["target"].iloc[idx]
        with Image.open(image_path) as img:
            image = img.convert("RGB")
        return image, label

In [5]:
dataset_row = BCNDataset(root_dir)

In [6]:
dataset_row.metadata

Unnamed: 0,bcn_filename,age_approx,anatom_site_general,diagnosis,lesion_id,capture_date,sex,split,target
0,BCN_0000000001.jpg,55.0,anterior torso,MEL,BCN_0003884,2012-05-16,male,train,1
1,BCN_0000000003.jpg,50.0,anterior torso,MEL,BCN_0000019,2015-07-09,female,train,1
2,BCN_0000000004.jpg,85.0,head/neck,SCC,BCN_0003499,2015-11-23,male,train,5
3,BCN_0000000006.jpg,60.0,anterior torso,NV,BCN_0003316,2015-06-16,male,train,0
4,BCN_0000000010.jpg,30.0,anterior torso,BCC,BCN_0004874,2014-02-18,female,train,2
...,...,...,...,...,...,...,...,...,...
12408,BCN_0000020348.jpg,85.0,head/neck,BCC,BCN_0003925,2013-03-05,female,train,2
12409,BCN_0000020349.jpg,65.0,anterior torso,BKL,BCN_0001819,2016-05-05,male,train,3
12410,BCN_0000020350.jpg,70.0,lower extremity,MEL,BCN_0001085,2015-01-29,male,train,1
12411,BCN_0000020352.jpg,55.0,palms/soles,NV,BCN_0002083,2016-05-08,female,train,0


### 3. Split Dataset and Create a Subset

In [7]:
def split_dataset(dataset, val_fraction=0.15, test_fraction=0.15):
    """
    Randomly split dataset into train, validation and test datasets.
    By default, splits into 70%, 15% and 15%.

    Args:
        dataset: The dataset object which is split into train, validation and test datasets.
        val_fraction (float, optional): Fraction of the dataset to be included in the validation set.
        test_fraction (float, optional): Fraction of the dataset to be included in the test set.  
    """
    
    total_size = len(dataset)
    val_size = int(total_size * val_fraction)
    test_size = int(total_size * test_fraction)
    train_size = total_size - val_size - test_size

    train_dataset, val_dataset, test_dataset = random_split(dataset, 
                                                            [train_size, val_size, test_size])
    return train_dataset, val_dataset, test_dataset

In [8]:
train_dataset1, val_dataset1, test_dataset1 = split_dataset(dataset_row, 0.15, 0.0)
len(train_dataset1), len(val_dataset1), len(test_dataset1)

(10552, 1861, 0)
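
Note that `random_split` draws from PyTorch's global random state unless a generator is passed, and this notebook only calls `torch.manual_seed(32)` in a later cell, so the split above may differ between runs. A minimal sketch of a reproducible split follows. The helper name `seeded_split` and the toy `TensorDataset` are assumptions made for illustration and are not part of the notebook.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Toy stand-in for BCNDataset: 100 samples with dummy features and labels.
toy_dataset = TensorDataset(
    torch.arange(100).float().unsqueeze(1),
    torch.zeros(100, dtype=torch.long),
)


def seeded_split(dataset, val_fraction=0.15, seed=32):
    """Split into train/validation subsets using a dedicated, seeded generator."""
    val_size = int(len(dataset) * val_fraction)
    train_size = len(dataset) - val_size
    generator = torch.Generator().manual_seed(seed)
    return random_split(dataset, [train_size, val_size], generator=generator)


train_a, val_a = seeded_split(toy_dataset)
train_b, val_b = seeded_split(toy_dataset)

# The same seed gives the same assignment of samples to subsets.
same_split = list(train_a.indices) == list(train_b.indices)
print(len(train_a), len(val_a), same_split)
```

Passing a dedicated generator keeps the split independent of any other random calls made earlier in the session.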

In [9]:
class SubsetWithTransform(Dataset):
    """
    Creates a subset class from BCNDataset objects. 
    Helps with applying different transforms to the train, validation 
    and test datasets. 
    """
    
    def __init__(self, subset, transform=None):
        print(subset)
        self.subset = subset
        self.transform = transform

    def __len__(self):
        return len(self.subset)

    def __getitem__(self, idx):
        image, label = self.subset[idx]
        if self.transform:
            image = self.transform(image)

        return image, label

    

In [10]:
# define different transforms for train and validation+test sets. 
# Train dataset is augmented with horizontal flip, random rotation etc. 
# But validation and test sets are not augmented. 
mean, std = [0.6125, 0.5277, 0.5061], [0.4241, 0.3242, 0.3054]
train_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomRotation(degrees=10),
    transforms.ColorJitter(brightness=0.2),
    transforms.Resize((256, 256)),  # Resize images to 256x256 pixels
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=mean, std=std)
])

# for validation and test sets.
val_transform = transforms.Compose([
    transforms.Resize((256, 256)),  # Resize images to 256x256 pixels
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=mean, std=std)
])
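
The notebook does not show how the `mean` and `std` values above were obtained. One common approach, sketched here under the assumption that the images are available as a float tensor scaled to [0, 1], is to reduce over the batch and spatial dimensions. The function name `channel_mean_std` and the random stand-in images are illustrative only.

```python
import torch


def channel_mean_std(images):
    """Per-channel mean and std of a float tensor shaped (N, C, H, W)."""
    channel_mean = images.mean(dim=(0, 2, 3))
    channel_std = images.std(dim=(0, 2, 3))
    return channel_mean, channel_std


torch.manual_seed(0)
fake_images = torch.rand(16, 3, 32, 32)  # random stand-in, not BCN20000 data
est_mean, est_std = channel_mean_std(fake_images)
print(est_mean, est_std)
```

For a dataset that does not fit in memory, the same statistics would have to be accumulated batch by batch, ideally over the training split only.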

In [11]:
train_dataset = SubsetWithTransform(train_dataset1, transform=train_transform)
val_dataset = SubsetWithTransform(val_dataset1, transform=val_transform)

<torch.utils.data.dataset.Subset object at 0x7f939dd02350>
<torch.utils.data.dataset.Subset object at 0x7f939dd021d0>


### 4. Import ResNet Model

The pretrained ResNet-18 model is imported here. All layers except the last fully connected layer are frozen in Section 5, just before training.

In [12]:
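# `pretrained=True` is deprecated in recent torchvision releases; request the default weights explicitly.
model = torchvision.models.resnet18(weights=models.ResNet18_Weights.DEFAULT)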



In [13]:
# get the number of input features of the fully connected layer
fc_in_features_count = model.fc.in_features
fc_in_features_count

512

In [14]:
# Redesigning the fully connected layer so that the number of outputs matches the number of classes.
num_classes = dataset_row.metadata.target.nunique()
print(num_classes)
model.fc = nn.Linear(fc_in_features_count, num_classes)

8


### 5. Train the Model

In [15]:
model = model.to(device)  # keep the model on the same device as the batches
loss_function = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

In [16]:
torch.manual_seed(32)

<torch._C.Generator at 0x7f926484d010>

In [17]:
train_loader = DataLoader(train_dataset, shuffle=True, batch_size=64, num_workers=4)
val_loader = DataLoader(val_dataset, shuffle=False, batch_size=64)

In [18]:
# freeze all layers except the fully connected layer
ct = 0

for child in model.children():
    ct += 1
    print(ct)
    if ct < 10:
        for param in child.parameters():
            param.requires_grad = False

print("---")
print(ct)

1
2
3
4
5
6
7
8
9
10
---
10


In [20]:
train_loss = []
train_accuracy = []
val_loss = []
val_accuracy = []

epochs = 10
for epoch in range(epochs):
    start_time = time.time()

    print(f"Epoch {epoch}:")
    print("Training...")
    model.train()
    running_loss = 0.0
    running_correct_counts = 0


    for i, (inputs, targets) in enumerate(train_loader):
        print(f"[{i} / {len(train_loader)}]", end="\r")
        inputs, targets = inputs.to(device), targets.to(device)
        optimizer.zero_grad()
        outputs = model(inputs)
        _, preds = torch.max(outputs, 1)
        loss = loss_function(outputs, targets)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
        running_correct_counts += preds.eq(targets).sum().item()


    epoch_loss = running_loss / len(train_dataset)
    epoch_acc = running_correct_counts / len(train_dataset) * 100.0
    train_loss.append(epoch_loss)
    train_accuracy.append(epoch_acc)
    print(f"  Loss: {epoch_loss:.3f} Acc: {epoch_acc:.3f} Time: {time.time()-start_time}")

    model.eval()
    with torch.no_grad():
        running_loss = 0.0
        running_correct_counts = 0

        for inputs, targets in val_loader:
            inputs, targets = inputs.to(device), targets.to(device)
            outputs = model(inputs)
            _, preds = torch.max(outputs, 1)
            loss = loss_function(outputs, targets)
            running_loss += loss.item()
            running_correct_counts += preds.eq(targets).sum().item()

        epoch_loss = running_loss / len(val_dataset)
        epoch_acc = running_correct_counts / len(val_dataset) * 100.0

        val_loss.append(epoch_loss)
        val_accuracy.append(epoch_acc)

        print(f"  Val Loss: {epoch_loss:.3f} Acc: {epoch_acc:.3f} Time: {time.time()-start_time}")
        print("")
        
        

Epoch 0:
Training...
  Loss: 0.021 Acc: 51.857 Time: 648.7401292324066
  Val Loss: 0.020 Acc: 56.905 Time: 790.1569657325745

Epoch 1:
Training...
  Loss: 0.018 Acc: 58.908 Time: 620.8193662166595
  Val Loss: 0.019 Acc: 57.227 Time: 739.585294008255

Epoch 2:
Training...
  Loss: 0.017 Acc: 59.647 Time: 613.6119961738586
  Val Loss: 0.019 Acc: 57.603 Time: 726.1651830673218

Epoch 3:
Training...
  Loss: 0.017 Acc: 60.358 Time: 606.1249845027924
  Val Loss: 0.018 Acc: 59.807 Time: 715.8717143535614

Epoch 4:
Training...
  Loss: 0.017 Acc: 60.624 Time: 601.303985118866
  Val Loss: 0.018 Acc: 60.290 Time: 709.8070316314697

Epoch 5:
Training...
  Loss: 0.017 Acc: 61.041 Time: 600.4843907356262
  Val Loss: 0.018 Acc: 58.571 Time: 709.4336223602295

Epoch 6:
Training...
  Loss: 0.017 Acc: 61.458 Time: 600.9018428325653
  Val Loss: 0.018 Acc: 60.021 Time: 709.6395988464355

Epoch 7:
Training...
  Loss: 0.016 Acc: 61.410 Time: 607.8289475440979
  Val Loss: 0.018 Acc: 60.559 Time: 716.511197090
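
A note on reading the loss values above: `nn.CrossEntropyLoss()` returns the mean loss of each batch by default, while the loop sums those batch means and divides by the number of samples. The printed loss is therefore roughly the per-sample loss divided by the batch size. The sketch below rescales the first reported training loss and shows a sample-weighted average as an alternative. The helper name `mean_loss_per_sample` is an assumption, and the rescaling is approximate because the last batch is smaller than the others.

```python
import math


def mean_loss_per_sample(batch_losses, batch_sizes):
    """Sample-weighted average of per-batch mean losses."""
    weighted_sum = sum(loss * size for loss, size in zip(batch_losses, batch_sizes))
    return weighted_sum / sum(batch_sizes)


# Rescale the value printed for epoch 0 of the training set.
num_samples = 10552  # size of the training split
batch_size = 64
num_batches = math.ceil(num_samples / batch_size)
reported_loss = 0.021
approx_mean_loss = reported_loss * num_samples / num_batches
print(num_batches, round(approx_mean_loss, 3))

# Two hypothetical batches of unequal size.
print(mean_loss_per_sample([1.0, 2.0], [64, 32]))
```

Inside the loop, the equivalent change would be to accumulate `loss.item() * inputs.size(0)` before dividing by the dataset length.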

#### 5.1 Train for Another 40 Epochs

In [22]:
train_loss = []
train_accuracy = []
val_loss = []
val_accuracy = []

epochs = 40
for epoch in range(epochs):
    start_time = time.time()

    print(f"Epoch {epoch}:")
    print("Training...")
    model.train()
    running_loss = 0.0
    running_correct_counts = 0


    for i, (inputs, targets) in enumerate(train_loader):
        print(f"[{i} / {len(train_loader)}]", end="\r")
        inputs, targets = inputs.to(device), targets.to(device)
        optimizer.zero_grad()
        outputs = model(inputs)
        _, preds = torch.max(outputs, 1)
        loss = loss_function(outputs, targets)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
        running_correct_counts += preds.eq(targets).sum().item()


    epoch_loss = running_loss / len(train_dataset)
    epoch_acc = running_correct_counts / len(train_dataset) * 100.0
    train_loss.append(epoch_loss)
    train_accuracy.append(epoch_acc)
    print(f"  Loss: {epoch_loss:.3f} Acc: {epoch_acc:.3f} Time: {time.time()-start_time}")

    model.eval()
    with torch.no_grad():
        running_loss = 0.0
        running_correct_counts = 0

        for inputs, targets in val_loader:
            inputs, targets = inputs.to(device), targets.to(device)
            outputs = model(inputs)
            _, preds = torch.max(outputs, 1)
            loss = loss_function(outputs, targets)
            running_loss += loss.item()
            running_correct_counts += preds.eq(targets).sum().item()

        epoch_loss = running_loss / len(val_dataset)
        epoch_acc = running_correct_counts / len(val_dataset) * 100.0

        val_loss.append(epoch_loss)
        val_accuracy.append(epoch_acc)

        print(f"  Val Loss: {epoch_loss:.3f} Acc: {epoch_acc:.3f} Time: {time.time()-start_time}")
        print("")
        
        

Epoch 0:
Training...
  Loss: 0.016 Acc: 61.296 Time: 603.9469404220581
  Val Loss: 0.018 Acc: 60.183 Time: 711.6400022506714

Epoch 1:
Training...
  Loss: 0.016 Acc: 61.960 Time: 599.7438161373138
  Val Loss: 0.018 Acc: 60.613 Time: 708.285281419754

Epoch 2:
Training...
  Loss: 0.016 Acc: 62.292 Time: 625.2070240974426
  Val Loss: 0.017 Acc: 61.042 Time: 734.9142837524414

Epoch 3:
Training...
  Loss: 0.016 Acc: 62.424 Time: 602.7308971881866
  Val Loss: 0.018 Acc: 60.129 Time: 710.8291759490967

Epoch 4:
Training...
  Loss: 0.016 Acc: 62.007 Time: 597.9945247173309
  Val Loss: 0.018 Acc: 60.559 Time: 706.3828616142273

Epoch 5:
Training...
  Loss: 0.016 Acc: 62.055 Time: 599.4905428886414
  Val Loss: 0.018 Acc: 60.828 Time: 707.587085723877

Epoch 6:
Training...
  Loss: 0.016 Acc: 62.945 Time: 598.2860956192017
  Val Loss: 0.018 Acc: 60.935 Time: 707.0263533592224

Epoch 7:
Training...
  Loss: 0.016 Acc: 62.301 Time: 596.8837099075317
  Val Loss: 0.018 Acc: 58.517 Time: 705.437270879
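
The two training cells repeat the same loop. As a sketch of one possible refactoring (not the code that produced the results above), a single helper can run either a training or an evaluation pass. The name `run_epoch`, the toy linear model, and the random data are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


def run_epoch(model, loader, loss_function, device, optimizer=None):
    """Run one pass over `loader`; train if an optimizer is given, else evaluate."""
    training = optimizer is not None
    model.train(training)
    total_loss, total_correct, total_seen = 0.0, 0, 0
    with torch.set_grad_enabled(training):
        for inputs, targets in loader:
            inputs, targets = inputs.to(device), targets.to(device)
            outputs = model(inputs)
            loss = loss_function(outputs, targets)
            if training:
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
            # Weight the batch-mean loss by the batch size.
            total_loss += loss.item() * inputs.size(0)
            total_correct += (outputs.argmax(dim=1) == targets).sum().item()
            total_seen += inputs.size(0)
    return total_loss / total_seen, 100.0 * total_correct / total_seen


# Toy usage with random data and a linear classifier.
torch.manual_seed(0)
toy_x, toy_y = torch.randn(32, 4), torch.randint(0, 3, (32,))
toy_loader = DataLoader(TensorDataset(toy_x, toy_y), batch_size=8)
toy_model = nn.Linear(4, 3)
toy_optimizer = torch.optim.SGD(toy_model.parameters(), lr=0.1)
cpu = torch.device("cpu")

train_loss_value, train_acc_value = run_epoch(
    toy_model, toy_loader, nn.CrossEntropyLoss(), cpu, toy_optimizer
)
val_loss_value, val_acc_value = run_epoch(toy_model, toy_loader, nn.CrossEntropyLoss(), cpu)
print(train_loss_value, train_acc_value, val_loss_value, val_acc_value)
```

With such a helper, continuing training for more epochs only requires calling it again in a loop, and the history lists can be extended instead of reset.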

### 6. Conclusion

We applied transfer learning with a ResNet-18 model, keeping all layers except the final fully connected layer frozen. Its validation accuracy (about 60%) is comparable to that of the custom CNN model trained in a separate notebook. Note, however, that this dataset is imbalanced. Future work will examine the effectiveness of sampling methods, hierarchical classification, and related techniques.
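
As an illustration of one sampling method that could be tried, the sketch below draws samples with probability inversely proportional to class frequency using `WeightedRandomSampler`. The toy label list and all variable names are assumptions. This has not been run on BCN20000, and no claim is made about its effect on accuracy.

```python
from collections import Counter

import torch
from torch.utils.data import WeightedRandomSampler

# Hypothetical imbalanced labels: 80 of class 0, 15 of class 1, 5 of class 2.
toy_labels = [0] * 80 + [1] * 15 + [2] * 5
class_counts = Counter(toy_labels)

# Each sample is weighted by the inverse frequency of its class.
sample_weights = torch.tensor(
    [1.0 / class_counts[label] for label in toy_labels], dtype=torch.double
)

sampler = WeightedRandomSampler(
    sample_weights,
    num_samples=len(toy_labels),
    replacement=True,
    generator=torch.Generator().manual_seed(0),
)

# Inspect which classes get drawn in one pass of the sampler.
drawn_labels = [toy_labels[i] for i in sampler]
print(Counter(drawn_labels))
```

In a training setup, such a sampler would be passed to the `DataLoader` through its `sampler` argument in place of `shuffle=True`.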