![MLU Logo](../data/MLU_Logo.png)

# <a name="0">Machine Learning Accelerator - Computer Vision - Lecture 2</a>


## Fine-Tuning with Pre-trained AlexNet 

In this notebook, we take an AlexNet pre-trained on ImageNet and fine-tune it on the [MINC](http://opensurfaces.cs.cornell.edu/publications/minc/) dataset. This notebook is similar to our previous notebook `MLA-CV-DAY1-CNN.ipynb`, so we skip some details to stay concise. We will cover the following topics:

1. <a href="#1">Loading and Transforming Dataset</a>      
2. <a href="#2">Fine-tuning Pretrained AlexNet</a>
3. <a href="#3">Testing and Visualizations</a>


First, let's install and import the necessary libraries.

In [1]:
! pip install -q -r ../requirements.txt

ERROR: Cannot install autogluon-core==1.1.1 and scikit-learn>=1.5.0 because these package versions have conflicting dependencies.
ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts

In [2]:
import os
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import transforms, datasets, models
import numpy as np
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt

## 1. <a name="1">Loading and Transforming Dataset</a>
(<a href="#0">Go to top</a>)

To load the dataset properly, we need to preprocess the image data with some `transforms` functions. The `torchvision.transforms` module provides a wide range of transforms, many of which can also be used for data augmentation.

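We apply a few simple transformations in this example. First, we resize each image so that its shorter side is 224 pixels; `transforms.Resize(224)` preserves the aspect ratio, so a square image becomes 224×224. Next, we convert the image to a tensor. Last, we normalize each channel with the ImageNet mean and standard deviation, which are the statistics of the data AlexNet was pre-trained on.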

In [3]:
transform_train = transforms.Compose([
    transforms.Resize(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

transform_test = transforms.Compose([
    transforms.Resize(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

Now apply the predefined transform functions and load the train, validation and test sets.

In practice, reading data can be a significant performance bottleneck, especially when the model is simple or the hardware is fast. To make reading from the datasets easier, we use PyTorch's `DataLoader`, which returns one minibatch of `batch_size` samples at a time.

In [4]:
batch_size = 16

path = '../data/minc-2500'
train_path = os.path.join(path, 'train')
val_path = os.path.join(path, 'val')
test_path = os.path.join(path, 'test')

train_dataset = datasets.ImageFolder(train_path, transform=transform_train)
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

validation_dataset = datasets.ImageFolder(val_path, transform=transform_test)
validation_loader = DataLoader(validation_dataset, batch_size=batch_size, shuffle=False)

test_dataset = datasets.ImageFolder(test_path, transform=transform_test)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)
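
To see how a `DataLoader` splits a dataset into minibatches, here is a minimal sketch on synthetic data. The `TensorDataset` and its sizes are assumptions chosen for illustration; they stand in for the `ImageFolder` datasets above.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for an image dataset: 50 samples, 6 classes.
features = torch.rand(50, 3, 8, 8)
targets = torch.randint(0, 6, (50,))
toy_dataset = TensorDataset(features, targets)

toy_loader = DataLoader(toy_dataset, batch_size=16, shuffle=False)

# Each iteration yields a (data, labels) pair with up to batch_size samples.
batch_sizes = [data.size(0) for data, labels in toy_loader]
print(len(toy_loader), batch_sizes)  # 4 [16, 16, 16, 2]
```

Note that the last batch is smaller whenever the dataset size is not a multiple of `batch_size`.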

## 2. <a name="2">Fine-tuning Pretrained AlexNet</a>
(<a href="#0">Go to top</a>)

To fine-tune a pretrained model, we need the following steps:
1. Load a pretrained AlexNet model.
2. Modify the last fully connected layer to match our number of classes.
3. Set up the optimizer and loss function.
4. Train the model.

In [5]:
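def FineTuneAlexnet(num_classes):
    # Load AlexNet with ImageNet weights (the `weights` argument needs
    # torchvision >= 0.13; older versions use `pretrained=True` instead)
    model = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
    # Replace the last fully connected layer so it outputs `num_classes` scores
    num_ftrs = model.classifier[6].in_features
    model.classifier[6] = nn.Linear(num_ftrs, num_classes)
    return model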

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
num_outputs = 6  # 6 output classes
net = FineTuneAlexnet(num_outputs)
net = net.to(device)
print(net)

Downloading: "https://download.pytorch.org/models/alexnet-owt-7be5be79.pth" to /home/ec2-user/.cache/torch/hub/checkpoints/alexnet-owt-7be5be79.pth
100%|██████████| 233M/233M [00:00<00:00, 262MB/s] 


AlexNet(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
    (1): ReLU(inplace=True)
    (2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
    (3): Conv2d(64, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
    (4): ReLU(inplace=True)
    (5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
    (6): Conv2d(192, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (7): ReLU(inplace=True)
    (8): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (9): ReLU(inplace=True)
    (10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (11): ReLU(inplace=True)
    (12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (avgpool): AdaptiveAvgPool2d(output_size=(6, 6))
  (classifier): Sequential(
    (0): Dropout(p=0.5, inplace=False)
    (1): Linear(in_features=9216, out_features=4096, bias=True)
 

Next, we set up the hyperparameters for training, such as the optimizer's learning rate. We'll use the Adam optimizer and cross-entropy loss.

In [6]:
learning_rate = 0.001
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(net.parameters(), lr=learning_rate)
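
`nn.CrossEntropyLoss` expects raw, unnormalized scores (logits) and integer class labels; it applies a log-softmax internally and averages the negative log-likelihood over the batch by default. The sketch below checks this against a manual computation on made-up logits and labels.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical logits for 2 samples and 3 classes, with their true labels.
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])
labels = torch.tensor([0, 2])

loss = nn.CrossEntropyLoss()(logits, labels)

# Manual version: log-softmax, pick the true-class entry, negate, average.
log_probs = F.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(len(labels)), labels].mean()

print(loss.item(), manual.item())  # the two values agree
```

This is why the network's last layer is a plain `nn.Linear` with no softmax after it.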

Now it's training time! We'll train for 10 epochs. Each optimizer step updates the weights using the gradient of the loss averaged over one mini-batch.

In [7]:
epochs = 10

for epoch in range(epochs):
    net.train()
    train_loss, train_correct = 0.0, 0

    for data, labels in train_loader:
        data, labels = data.to(device), labels.to(device)
        optimizer.zero_grad()
        outputs = net(data)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # loss is a batch mean, so weight it by the batch size
        train_loss += loss.item() * data.size(0)
        _, predictions = torch.max(outputs, 1)
        train_correct += (predictions == labels).sum().item()

    # Evaluate on the validation set without tracking gradients
    net.eval()
    val_loss, val_correct = 0.0, 0

    with torch.no_grad():
        for data, labels in validation_loader:
            data, labels = data.to(device), labels.to(device)
            outputs = net(data)
            loss = criterion(outputs, labels)

            val_loss += loss.item() * data.size(0)
            _, predictions = torch.max(outputs, 1)
            val_correct += (predictions == labels).sum().item()

    # Average over the number of samples in each split
    train_loss = train_loss / len(train_dataset)
    train_acc = train_correct / len(train_dataset)
    val_loss = val_loss / len(validation_dataset)
    val_acc = val_correct / len(validation_dataset)
    
    print(f'Epoch {epoch+1}/{epochs}:')
    print(f'Train Loss: {train_loss:.4f} Train Acc: {train_acc:.4f}')
    print(f'Val Loss: {val_loss:.4f} Val Acc: {val_acc:.4f}')

Could not load library libcudnn_cnn_train.so.8. Error: /usr/local/cuda-11.8/lib/libcudnn_cnn_train.so.8: symbol _ZTIN10cask_cudnn14BaseKernelInfoE, version libcudnn_cnn_infer.so.8 not defined in file libcudnn_cnn_infer.so.8 with link time reference

RuntimeError: GET was unable to find an engine to execute this computation
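
The training loop multiplies each batch's mean loss by `data.size(0)` and divides by the dataset size at the end of the epoch. The plain-Python sketch below, using made-up per-sample losses, shows why: this recovers the true per-sample average even when the last batch is smaller, whereas averaging the batch means directly does not.

```python
# Hypothetical per-sample losses, split into mini-batches of sizes 4, 4 and 2.
per_sample = [0.9, 1.1, 1.0, 1.0, 0.5, 0.7, 0.6, 0.6, 0.2, 0.4]
batches = [per_sample[0:4], per_sample[4:8], per_sample[8:10]]

running = 0.0
for batch in batches:
    batch_mean = sum(batch) / len(batch)  # what the criterion returns
    running += batch_mean * len(batch)    # loss.item() * data.size(0)

epoch_loss = running / len(per_sample)    # divide by the dataset size

# Averaging the batch means gives the small last batch too much weight.
naive = sum(sum(b) / len(b) for b in batches) / len(batches)

print(round(epoch_loss, 4), round(naive, 4))  # 0.7 0.6333
```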

If you would like to save the trained model, you can use `torch.save`.

In [None]:
torch.save(net.state_dict(), "my_model.pth")
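
A saved `state_dict` holds only the parameters, so loading it back requires building the same architecture first. Here is a minimal round-trip sketch with a tiny stand-in model; `toy_model`, `restored`, and the temporary file name are assumptions for illustration.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Tiny stand-in model; the same calls apply to a fine-tuned network.
toy_model = nn.Linear(4, 2)

with tempfile.TemporaryDirectory() as tmp_dir:
    checkpoint_path = os.path.join(tmp_dir, "toy_model.pth")
    torch.save(toy_model.state_dict(), checkpoint_path)

    # Rebuild the same architecture, then load the saved parameters into it.
    restored = nn.Linear(4, 2)
    state_dict = torch.load(checkpoint_path, map_location="cpu")
    restored.load_state_dict(state_dict)

same_weights = torch.equal(toy_model.weight, restored.weight)
same_bias = torch.equal(toy_model.bias, restored.bias)
print(same_weights, same_bias)  # True True
```

Passing `map_location="cpu"` lets a checkpoint saved on a GPU machine load on a CPU-only one.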

## 3. <a name="3">Testing and Visualizations</a>
(<a href="#0">Go to top</a>)

Let's check the model's predictions on a random batch of test images and display the images themselves.

In [None]:
def show_images(imgs, num_rows, num_cols, titles=None, scale=1.5):
    """Plot a list of images."""
    figsize = (num_cols * scale, num_rows * scale)
    _, axes = plt.subplots(num_rows, num_cols, figsize=figsize)
    axes = axes.flatten()
    for i, (ax, img) in enumerate(zip(axes, imgs)):
        ax.imshow(img.permute(1, 2, 0))
        ax.axis('off')
        if titles:
            ax.set_title(titles[i])
    plt.tight_layout()
    plt.show()
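
Because the test images were normalized, their pixel values fall outside the [0, 1] range that `imshow` expects for float data, so colors can look distorted. One possible fix, sketched below, is to undo the normalization before plotting. The helper name `denormalize` is hypothetical, and the constants are the ones passed to `transforms.Normalize` above.

```python
import torch

# Same per-channel statistics as in transforms.Normalize, shaped to broadcast
# over (3, H, W) images and (N, 3, H, W) batches.
mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)

def denormalize(img):
    """Undo Normalize(mean, std) and clip to the displayable range [0, 1]."""
    return (img * std + mean).clamp(0, 1)

# Round trip on a made-up image with values in [0, 1).
original = torch.rand(3, 4, 4)
normalized = (original - mean) / std
recovered = denormalize(normalized)

print(torch.allclose(recovered, original, atol=1e-6))  # True
```

With this helper, a batch could be displayed with `show_images(denormalize(data), 2, 8)`.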

In [None]:
net.eval()
random_test_sample = DataLoader(test_dataset, batch_size=16, shuffle=True)

data, labels = next(iter(random_test_sample))
show_images(data, 2, 8)

with torch.no_grad():
    outputs = net(data.to(device))
    _, predicted = torch.max(outputs, 1)
    print("Predicted classes:", predicted.cpu().numpy())
    print("Actual classes:   ", labels.numpy())
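
The printed class indices are easier to read as names, and the same tensors give the batch accuracy directly. The sketch below uses made-up predictions and a hypothetical list of class names; with `ImageFolder`, the real names are available as `test_dataset.classes`.

```python
import torch

# Hypothetical class names; ImageFolder exposes the real ones as `.classes`.
class_names = ["brick", "carpet", "food", "mirror", "sky", "water"]

# Made-up predictions and ground-truth labels for a batch of 8 images.
predicted = torch.tensor([0, 2, 2, 5, 1, 3, 4, 4])
labels = torch.tensor([0, 2, 1, 5, 1, 3, 4, 0])

# Fraction of positions where prediction and label agree.
accuracy = (predicted == labels).float().mean().item()

# Map each predicted index to its class name.
predicted_names = [class_names[i] for i in predicted.tolist()]

print(accuracy)         # 0.75
print(predicted_names)
```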