## 6.1 Transfer Learning ResNet-34 using Custom Dataset
- Prepare Dataset
- Run Transfer Learning with ResNet-34
- Evaluate Model Performance

⚠️⚠️⚠️ *Please open this notebook in Google Colab* by clicking the link below ⚠️⚠️⚠️<br><br>
<a href="https://colab.research.google.com/github/Muhammad-Yunus/Belajar-Image-Classification/blob/main/Pertemuan%206/6.1%20transfer_learning_resnet_34.ipynb" target="_blank"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a><br><br><br>
- Click the `Connect` button in the top-right corner of the Google Colab notebook,<br>
<img src="https://github.com/Muhammad-Yunus/Belajar-Image-Classification/blob/main/Pertemuan%206/resource/cl-connect-gpu.png?raw=1" width="250px">
- Once the connection is established, the button will look something like this<br>
<img src="https://github.com/Muhammad-Yunus/Belajar-Image-Classification/blob/main/Pertemuan%206/resource/cl-connect-gpu-success.png?raw=1" width="250px">

- Check that the GPU attached to the Colab environment is active

In [None]:
!nvidia-smi

<br><br><br><br><br>
#### 6.1.1 Transfer Learning ResNet-34

In [66]:
!pip install gdown

import os
import cv2
import gdown
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader, random_split

import numpy as np

from IPython import display

# clear output cell
display.clear_output()

In [56]:
DATASET_NAME = 'apple2orange' # the dataset name
DATASET_LABELS = ["apple", "orange"] # define dataset labels
DATASET_NUM_CLASS = len(DATASET_LABELS) # number of classes in the dataset

In [10]:
# By default this uses the GDrive ID of the dataset `apple2orange_dataset.zip` (1rHN19c2DmqPKcTNyxjbENIePMVONXAYI)
gdrive_id = '1rHN19c2DmqPKcTNyxjbENIePMVONXAYI' # <-----  ⚠️⚠️⚠️ USE YOUR OWN GDrive ID FOR CUSTOM DATASET ⚠️⚠️⚠️

# download zip from GDrive
url = f'https://drive.google.com/uc?id={gdrive_id}'
gdown.download(url, DATASET_NAME + ".zip", quiet=False)

# unzip dataset
!unzip {DATASET_NAME}.zip -d {DATASET_NAME}

# clear output cell
display.clear_output()
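
The `CustomDataset` class defined in the next cell reads its samples from `{DATASET_NAME}/dataset/train` and `{DATASET_NAME}/dataset/test`, and expects every `.png`/`.jpg` image to sit next to a `.txt` file with the same base name that contains the label text. When using your own archive, a quick sanity check along the following lines may help. This is only a sketch under that layout assumption; `find_unlabeled_images` is a hypothetical helper, not part of the notebook.

```python
import os
import tempfile

def find_unlabeled_images(root_dir, valid_labels):
    """Return image files whose .txt label is missing or not in valid_labels.

    Hypothetical helper: it mirrors the image/label pairing used by CustomDataset.
    """
    problems = []
    for name in sorted(os.listdir(root_dir)):
        if not name.lower().endswith(('.png', '.jpg')):
            continue  # skip label files and anything else
        label_path = os.path.splitext(os.path.join(root_dir, name))[0] + '.txt'
        if not os.path.isfile(label_path):
            problems.append(name)  # no label file next to the image
            continue
        with open(label_path, 'r') as label_file:
            if label_file.read().strip() not in valid_labels:
                problems.append(name)  # label text is not a known class
    return problems

# Tiny demo on a throwaway folder (empty files stand in for real images)
demo_files = [('a.jpg', ''), ('a.txt', 'apple\n'),
              ('b.png', ''),                      # missing label file
              ('c.jpg', ''), ('c.txt', 'banana')]  # unknown label
with tempfile.TemporaryDirectory() as demo_dir:
    for file_name, content in demo_files:
        with open(os.path.join(demo_dir, file_name), 'w') as f:
            f.write(content)
    demo_problems = find_unlabeled_images(demo_dir, ['apple', 'orange'])

print(demo_problems)
```

In Colab, the same check could be run as `find_unlabeled_images(f'{DATASET_NAME}/dataset/train', DATASET_LABELS)`; an empty list means every image has a usable label.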

In [76]:
# Define the custom Dataset class
# It is a helper that loads each image with OpenCV and converts it to a PyTorch tensor
# It also encodes the label with one-hot encoding
class CustomDataset(Dataset):
    def __init__(self, root_dir):
        self.root_dir = root_dir
        self.image_files = sorted([file for file in os.listdir(root_dir) if file.lower().endswith('.png') or file.lower().endswith('.jpg')])

    def __len__(self):
        return len(self.image_files)

    def __getitem__(self, idx):
        # Read the image file (.png or .jpg)
        image_path = os.path.join(self.root_dir, self.image_files[idx])
        image = cv2.imread(image_path)  # Load image using OpenCV
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # Convert BGR to RGB
        image = cv2.resize(image, (224, 224))  # resize to the 3x224x224 input size expected by ResNet-34
        image = torch.from_numpy(image).to(torch.float32)  # Convert NumPy array to PyTorch tensor
        image = image.permute(2, 0, 1)  # Change the order of dimensions from (H, W, C) to (C, H, W)
        image = image / 255.0  # Normalize pixel values to [0, 1]

        # Read label from corresponding .txt file
        label_path = os.path.splitext(image_path)[0] + ".txt"
        with open(label_path, 'r') as label_file:
            label = label_file.read().strip()  # read label from .txt

        # Apply one-hot encoding to the label
        labels_tensor = torch.tensor(DATASET_LABELS.index(label))
        one_hot_encoded = F.one_hot(labels_tensor, num_classes=DATASET_NUM_CLASS).to(torch.float32)

        return image, one_hot_encoded

# instantiate the datasets
# at this point no image is loaded yet;
# we only list the image file names in the dataset folder
all_train_dataset = CustomDataset(root_dir=f'{DATASET_NAME}/dataset/train')
test_dataset = CustomDataset(root_dir=f'{DATASET_NAME}/dataset/test')
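
To make the label encoding in `__getitem__` concrete, here is a plain-Python sketch of what it amounts to: the label text is looked up in `DATASET_LABELS` to get a class index, and the index is turned into a vector with a single 1. The notebook itself uses `F.one_hot`; the `one_hot` and `decode` functions below are illustrative stand-ins, not part of the notebook.

```python
DATASET_LABELS = ["apple", "orange"]

def one_hot(label, labels=DATASET_LABELS):
    """Illustrative stand-in for F.one_hot: 1.0 at the class index, 0.0 elsewhere."""
    index = labels.index(label)  # raises ValueError for an unknown label
    return [1.0 if i == index else 0.0 for i in range(len(labels))]

def decode(vector, labels=DATASET_LABELS):
    """Recover the class name from a one-hot (or score) vector via argmax."""
    return labels[vector.index(max(vector))]

encoded = one_hot("orange")
print(encoded)          # [0.0, 1.0]
print(decode(encoded))  # orange
```

The argmax step in `decode` is the same idea the training code relies on later when it compares `predicted` with `labels.argmax(1)`.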

In [None]:
print(f"All Train Dataset : {len(all_train_dataset)} data")
print(f"Test Dataset : {len(test_dataset)} data")

num_all_train = len(all_train_dataset)

In [None]:
# Split 'all_train_dataset' into 'train' and 'validation' set using `random_split()` function

num_train = int(num_all_train * 0.75)
num_val = num_all_train - num_train
train_dataset, validation_dataset = random_split(all_train_dataset, [num_train, num_val])

print(f"Train Dataset : {len(train_dataset)} data")
print(f"Validation Dataset : {len(validation_dataset)} data")
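
As a worked example of the split arithmetic above: the experiment summary at the end of this section lists 1510 training and 504 validation images, which corresponds to 2014 images in `dataset/train`. The sketch below reproduces those numbers; `split_sizes` is a hypothetical helper that only mirrors the two lines computing `num_train` and `num_val`.

```python
def split_sizes(total, train_fraction=0.75):
    """Mirror the notebook's arithmetic: truncate the train share, give the rest to validation."""
    num_train = int(total * train_fraction)  # int() truncates 1510.5 down to 1510
    num_val = total - num_train              # the remainder goes to validation
    return num_train, num_val

sizes = split_sizes(2014)
print(sizes)  # (1510, 504)
```

Because the validation size is computed as the remainder, the two parts always add up to the full dataset, which is what `random_split` requires. Note that `random_split` draws a different partition on every run unless a seeded `generator` is passed to it.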

In [79]:
# Create data loaders
BATCH_SIZE = 128

train_loader = DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True)
validation_loader = DataLoader(validation_dataset, batch_size=BATCH_SIZE, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=BATCH_SIZE, shuffle=False)
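
With `BATCH_SIZE = 128` and the `DataLoader` default of `drop_last=False`, each loader yields `ceil(N / 128)` batches per epoch, and the last batch can be smaller than the rest. A small worked example of that arithmetic follows, assuming the dataset sizes listed in the experiment summary at the end of this section (1510 train, 504 validation, 514 test). `batch_layout` is a hypothetical helper, not a PyTorch function.

```python
import math

def batch_layout(num_samples, batch_size):
    """Return (number of batches, size of the last batch) when no samples are dropped."""
    num_batches = math.ceil(num_samples / batch_size)
    last_batch_size = num_samples - (num_batches - 1) * batch_size
    return num_batches, last_batch_size

BATCH_SIZE = 128
layouts = {name: batch_layout(n, BATCH_SIZE)
           for name, n in [("train", 1510), ("validation", 504), ("test", 514)]}
print(layouts)
# {'train': (12, 102), 'validation': (4, 120), 'test': (5, 2)}
```

The uneven last batch matters when averaging per-batch losses, because a batch of 2 images then carries the same weight as a batch of 128.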

#### ❄️❄️❄️ Freezing the Pretrained Model in Transfer Learning ❄️❄️❄️
- In transfer learning, when using a pre-trained model like ResNet for image classification, <font color="orange">freezing</font> all layers except the last fully connected layer is a common practice for the following reasons:

  1. <font color="orange">Preserve Learned Features</font> :
      - ResNet and similar models are pre-trained on large datasets like ImageNet, which contains millions of labeled images.
      - Through this training, the model learns general features (like edges, textures, shapes) in the early and middle layers, which are applicable across a wide variety of tasks.
      - These features are useful even for your custom dataset, so you want to preserve them instead of retraining those layers from scratch.
  2. <font color="orange">Avoid Overfitting</font> :
      - If you retrain the entire network on a small, domain-specific dataset, it might overfit to that dataset.
      - Freezing the majority of the layers helps prevent this by keeping the pre-learned, general features intact and only adjusting the last layer to make the model specific to your classification task.
  3. <font color="orange">Reduce Training Time</font> :
      - Training all layers in a deep model like ResNet from scratch is computationally expensive and time-consuming.
      - By freezing most layers, you significantly reduce the training time because the gradients do not need to be computed for the frozen layers.<br><br><br>

      <img src="resource/Freeze-Resnet-18-Architecture.png" width="95%">
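
The effect of freezing can be checked by counting how many parameters still have `requires_grad=True`. The sketch below does this on a toy two-layer network rather than on ResNet-34, so it runs without downloading any weights. `toy_model` and `count_parameters` are illustrative names, not part of the notebook.

```python
import torch.nn as nn

# Toy stand-in for a pre-trained backbone plus classifier head (NOT ResNet-34)
toy_model = nn.Sequential(
    nn.Linear(8, 16),   # "backbone": 8*16 weights + 16 biases = 144 parameters
    nn.ReLU(),
    nn.Linear(16, 4),   # original "head", replaced below
)

# Freeze every existing parameter
for param in toy_model.parameters():
    param.requires_grad = False

# Swap in a new head for 2 classes; newly created layers are trainable by default
toy_model[2] = nn.Linear(16, 2)  # 16*2 weights + 2 biases = 34 parameters

def count_parameters(model):
    """Return (trainable, total) parameter counts."""
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    return trainable, total

trainable, total = count_parameters(toy_model)
print(trainable, total)  # 34 178
```

Once `model` has been set up in the next cell, `count_parameters(model)` should report only the new head as trainable. Since the `fc` layer of ResNet-34 takes 512 input features, that would be 512 × 2 + 2 = 1026 parameters for two classes.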

In [92]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load the pre-trained ResNet-34 model from torch.hub
model = torch.hub.load('pytorch/vision:v0.10.0', 'resnet34', pretrained=True)


# ❄️❄️❄️ Freeze all layers
for param in model.parameters():
    param.requires_grad = False

# Replace the final fully connected layer so its output size matches DATASET_NUM_CLASS
model.fc = nn.Linear(model.fc.in_features, DATASET_NUM_CLASS)

# Ensure only the last layer's parameters are trainable
for param in model.fc.parameters():
    param.requires_grad = True


# Move the model to the appropriate device
model = model.to(device)


# set up the optimizer and loss function
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
loss_function = nn.CrossEntropyLoss()

- To run the training process, we can use the following code

In [None]:
!pip install tqdm

from tqdm import tqdm

In [None]:
def train(model, train_loader, optimizer, loss_function):
    model.train()
    running_loss = 0.0
    correct_predictions = 0
    total_predictions = 0

    # Add progress bar for training loop
    progress_bar = tqdm(train_loader, desc='Training', leave=False)

    for inputs, labels in progress_bar:
        inputs = inputs.to(device) # move inputs to device
        labels = labels.to(device) # move labels to device

        # resets the gradients of all the model's parameters before the backward pass
        optimizer.zero_grad()
        # pass the 4D BATCH_SIZEx3x224x224 input tensor to the CNN model
        outputs = model(inputs)
        # calc loss value
        loss = loss_function(outputs, labels)
        # computes the gradient of the loss with respect to each parameter in model
        loss.backward()
        # adjust model parameter
        optimizer.step()
        # sum loss value
        running_loss += loss.item()

        # Calculate correct & total prediction
        _, predicted = torch.max(outputs, 1)
        correct_predictions += (predicted == labels.argmax(1)).sum().item()
        total_predictions += labels.size(0)

        # Update progress bar description with current loss
        progress_bar.set_postfix(loss=loss.item())

    # Calculate average training loss
    # (`loss.item()` is already the mean over one batch, so average over the number of batches)
    average_train_loss = running_loss / len(train_loader)
    # Calculate training accuracy
    train_accuracy = correct_predictions / total_predictions
    return average_train_loss, train_accuracy

def validate(model, val_loader, loss_function):
    model.eval()
    running_loss = 0.0
    correct_predictions = 0
    total_predictions = 0

    # Add progress bar for validation loop
    progress_bar = tqdm(val_loader, desc='Validating', leave=False)

    with torch.no_grad():
        for inputs, labels in progress_bar:
            inputs = inputs.to(device) # move inputs to device
            labels = labels.to(device) # move labels to device

            # pass the 4D BATCH_SIZEx3x224x224 input tensor to the CNN model
            outputs = model(inputs)
            # calc loss value
            loss = loss_function(outputs, labels)
            # sum loss value
            running_loss += loss.item()

            # Calculate correct & total prediction
            _, predicted = torch.max(outputs, 1)
            correct_predictions += (predicted == labels.argmax(1)).sum().item()
            total_predictions += labels.size(0)

            # Update progress bar description with loss
            progress_bar.set_postfix(loss=loss.item())

    # Calculate average validation loss
    # (`loss.item()` is already the mean over one batch, so average over the number of batches)
    average_val_loss = running_loss / len(val_loader)
    # Calculate validation accuracy
    val_accuracy = correct_predictions / total_predictions
    return average_val_loss, val_accuracy





# Training loop over the selected number of epochs
# each epoch processes the whole training and validation set in small batches,
# then records the loss & accuracy of the training and validation set
NUM_EPOCH = 50      # you can change this value

train_losses = []
val_losses = []
train_accuracies = []
val_accuracies = []

for epoch in range(NUM_EPOCH):
    print(f"Epoch {epoch+1}/{NUM_EPOCH}")

    train_loss, train_accuracy = train(model, train_loader, optimizer, loss_function)
    val_loss, val_accuracy = validate(model, validation_loader, loss_function)

    train_losses.append(train_loss)
    val_losses.append(val_loss)
    train_accuracies.append(train_accuracy * 100)  # convert to percentage
    val_accuracies.append(val_accuracy * 100)  # convert to percentage

    print(f"Train Loss = {train_loss:.4f}, Val Loss = {val_loss:.4f}, Train Accuracy = {train_accuracy:.4f}, Val Accuracy = {val_accuracy:.4f}\n")
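
A note on how per-batch losses combine into one epoch value: `nn.CrossEntropyLoss` uses `reduction='mean'` by default, so each `loss.item()` is already an average over the samples of one batch. The sketch below compares two ways of aggregating such batch means, using made-up loss values purely for illustration. `epoch_average_loss` is a hypothetical helper, not part of the notebook.

```python
def epoch_average_loss(batch_mean_losses, batch_sizes):
    """Sample-weighted average: exact per-sample mean even if the last batch is smaller."""
    total_samples = sum(batch_sizes)
    weighted_sum = sum(loss * n for loss, n in zip(batch_mean_losses, batch_sizes))
    return weighted_sum / total_samples

# Made-up numbers: three batches, the last one smaller than the others
batch_mean_losses = [0.5, 0.25, 1.0]
batch_sizes = [4, 4, 2]

per_sample = epoch_average_loss(batch_mean_losses, batch_sizes)  # (2.0 + 1.0 + 2.0) / 10
per_batch = sum(batch_mean_losses) / len(batch_mean_losses)      # 1.75 / 3

print(per_sample)
print(per_batch)
```

The simple mean over batches slightly over-weights a small final batch. With only one short batch per epoch the difference is usually minor, but the weighted form is the one that matches a true per-sample average.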

- Plot Loss and Accuracy of Training vs Validation Set

In [None]:
# visualize Loss & Accuracy
import matplotlib.pyplot as plt

epochs = list(range(1, NUM_EPOCH + 1))

# Plotting loss
plt.figure(figsize=(12, 5))

plt.subplot(1, 2, 1)
plt.plot(epochs, train_losses, 'b', label='Training Loss')
plt.plot(epochs, val_losses, 'r', label='Validation Loss')
plt.title('Training and Validation Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.grid(True)

# Plotting accuracy
plt.subplot(1, 2, 2)
plt.plot(epochs, train_accuracies, 'b', label='Training Accuracy')
plt.plot(epochs, val_accuracies, 'r', label='Validation Accuracy')
plt.title('Training and Validation Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy (%)')
plt.legend()
plt.grid(True)

plt.tight_layout()
plt.show()


- Evaluate the model: compute precision and recall for each class, measure accuracy, and plot the confusion matrix

In [None]:
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report
import seaborn as sns
import numpy as np

# define evaluate function for test set
def evaluate(model, test_loader):
    model.eval()
    all_labels = []
    all_preds = []

    # Add progress bar for evaluation loop
    progress_bar = tqdm(test_loader, desc='Evaluating', leave=False)

    with torch.no_grad():
        # iterate over the test set batch by batch
        for inputs, labels in progress_bar:
            inputs = inputs.to(device) # move inputs to device
            labels = labels.to(device) # move labels to device

            # pass the 4D BATCH_SIZEx3x224x224 input tensor to the CNN model
            outputs = model(inputs)
            # get prediction
            _, preds = torch.max(outputs, 1)
            # collect all labels & preds
            all_labels.extend(labels.cpu().numpy())
            all_preds.extend(preds.cpu().numpy())

    return all_labels, all_preds

# Evaluation on test set
all_labels, all_preds = evaluate(model, test_loader)
all_labels = np.argmax(all_labels, axis=1)

# Calculate classification report (class names taken from DATASET_LABELS, in index order)
print(classification_report(all_labels, all_preds, target_names=DATASET_LABELS))

# Confusion Matrix
conf_matrix = confusion_matrix(all_labels, all_preds)

# Plotting the confusion matrix
plt.figure(figsize=(10, 7))
sns.heatmap(conf_matrix, annot=True, fmt="d", cmap="Blues")
plt.xlabel('Predicted Class')
plt.ylabel('Actual Class')
plt.title('Confusion Matrix')
plt.show()
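
To read the confusion matrix and the classification report together, it helps to see how precision and recall fall out of the matrix. In scikit-learn's convention, row `i` is the actual class and column `j` the predicted class. The sketch below recomputes the metrics by hand on a toy matrix; the counts are made up for illustration and are not results from this notebook, and `per_class_metrics` and `accuracy` are hypothetical helpers.

```python
def per_class_metrics(conf_matrix):
    """Return a list of (precision, recall) per class from a square confusion matrix."""
    num_classes = len(conf_matrix)
    metrics = []
    for k in range(num_classes):
        true_positive = conf_matrix[k][k]
        predicted_as_k = sum(conf_matrix[i][k] for i in range(num_classes))  # column sum
        actually_k = sum(conf_matrix[k])                                     # row sum
        precision = true_positive / predicted_as_k if predicted_as_k else 0.0
        recall = true_positive / actually_k if actually_k else 0.0
        metrics.append((precision, recall))
    return metrics

def accuracy(conf_matrix):
    """Correct predictions (diagonal) divided by all predictions."""
    correct = sum(conf_matrix[k][k] for k in range(len(conf_matrix)))
    total = sum(sum(row) for row in conf_matrix)
    return correct / total

# Toy counts (made up): rows = actual class, columns = predicted class
toy_matrix = [[8, 2],
              [1, 9]]

toy_metrics = per_class_metrics(toy_matrix)
toy_accuracy = accuracy(toy_matrix)
print(toy_metrics)
print(toy_accuracy)
```

Here class 0 has precision 8/9 (one sample of class 1 was wrongly predicted as class 0) and recall 8/10 (two class-0 samples were missed), which is the same arithmetic `classification_report` performs per class.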

- Download Model

In [None]:
# Save the model
torch.save(model.state_dict(), 'trained_resnet34_model.pt')

# Download the model file
from google.colab import files
files.download('trained_resnet34_model.pt')

<br><br><br><br>
#### 🧪🧪🧪 Experiment Results: ResNet-18 with 50 Epochs
- Dataset : Apple2Orange
    - Train : 1510 images
    - Validation : 504 images
    - Test : 514 images
    - Number of classes : 2
- Train vs Validation Accuracy & Loss<br>
<img src="resource/Resnet-18-50epoch-apple2orange.png" width="900px"><br><br>
- Classification Report<br>
<img src="resource/Resnet-18-50epoch-apple2orange-report.png" width="500px">
- Confusion Matrix<br>
<img src="resource/Resnet-18-50epoch-apple2orange-eval.png" width="500px">