# Lab 6B - Custom Dataset and Scheduler

In this lab, we shall learn to implement the following two things:
1. Build a custom dataset with your own data
2. Perform learning rate scheduling

In [0]:
from google.colab import drive
drive.mount('/content/gdrive')

In [0]:
cd '/content/gdrive/My Drive/UCCD3074_Lab6'

In [0]:
import os
import numpy as np

import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import torchvision.models as models
from torch.utils.data import Dataset

from PIL import Image

---
# 1. The Hymenoptera Dataset

The problem we are going to solve today is to train a model to classify **ants** and **bees**. We have about 120 training images and 75 validation images for each class. A dataset this small is usually not enough to generalize well if the model is trained from scratch. Since we are using transfer learning, we should still be able to generalize reasonably well. This dataset is a very small subset of ImageNet.

<img src="https://pytorch.org/tutorials/_images/sphx_glr_transfer_learning_tutorial_001.png" width=40%>

In [0]:
!wget https://download.pytorch.org/tutorial/hymenoptera_data.zip
!unzip -q hymenoptera_data.zip
rm 'hymenoptera_data/train/ants/imageNotFound.gif'

Take a look at the folder `hymenoptera_data`. It has the following directory structure:
```
hymenoptera_data\
   train\
      ants\
      bees\
   val\
      ants\
      bees\
``` 

# 2. Writing custom dataset

## 2.1 The Abstract `Dataset` Class

PyTorch provides `torch.utils.data.Dataset` to allow you to create your own custom dataset. `Dataset` is an abstract class representing a dataset. Your custom dataset should inherit from `Dataset` and override the following methods:

* **`__len__`** so that `len(dataset)` returns the size of the dataset.
* **`__getitem__`** to support indexing, such that `dataset[i]` returns the i-th sample.
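As a minimal illustration of these two methods, the sketch below defines a toy dataset that is unrelated to the hymenoptera data. The class name `SquaresDataset` and its contents are hypothetical and chosen only to show how `len()` and indexing map onto `__len__` and `__getitem__`.

```python
import torch
from torch.utils.data import Dataset


class SquaresDataset(Dataset):
    """Hypothetical toy dataset: sample i is the pair (i, i**2)."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        # Called by len(dataset)
        return self.n

    def __getitem__(self, idx):
        # Called by dataset[idx]
        if not 0 <= idx < self.n:
            raise IndexError(idx)
        x = torch.tensor(float(idx))
        return x, x ** 2


toy = SquaresDataset(5)
print(len(toy))  # number of samples
print(toy[3])    # the (input, target) pair at index 3
```

Any object that implements these two methods can be passed to a `DataLoader`, which is how the custom dataset is consumed later in this lab.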

The following code creates a dataset class for the hymenoptera dataset.

In [0]:
#  ... your code here

* **`__init__`**: Get the filenames of all training samples (`self.data`) and their corresponding labels (`self.labels`)
    * Line 10: 
    <br> If `transform` is passed by the user, all images would be transformed using this pipeline when they are read in `__getitem__` later. 
    * Line 11: 
    <br> There are 2 classes in the dataset (0: ants, 1: bees)
    * Line 14-21: 
    <br> For each of the class (line 14), get the names of all the files in their class directories (line 19) and update `self.data` (line 20) and `self.labels` (line 21).

* **`__getitem__`**: Read the image and label. Transform the image if required. Return the transformed image and label.

<br>

**Notes**: *While it is possible to load all images in `__init__`, we have chosen to read the images only when requested by the user in `__getitem__`. This is more memory efficient because the images are not all stored in memory at once but read as required. This is the normal setup when the dataset is huge.*
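The lazy-loading pattern described above can be sketched as follows. This is one possible layout, not the lab's reference solution: the class name `FolderDataset` and its `classes` argument are assumptions, and its line numbers do not correspond to the "Line 10" / "Line 14-21" references above.

```python
import os

from PIL import Image
from torch.utils.data import Dataset


class FolderDataset(Dataset):
    """Sketch of a lazily loaded image-folder dataset (hypothetical name)."""

    def __init__(self, root, classes=("ants", "bees"), transform=None):
        self.transform = transform
        self.classes = list(classes)
        self.data, self.labels = [], []
        # Only file paths and labels are collected here; no image is read yet.
        for label, cls in enumerate(self.classes):
            cls_dir = os.path.join(root, cls)
            for fname in sorted(os.listdir(cls_dir)):
                self.data.append(os.path.join(cls_dir, fname))
                self.labels.append(label)

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        # The image is read from disk only when the sample is requested.
        with Image.open(self.data[idx]) as img:
            image = img.convert("RGB")
        if self.transform is not None:
            image = self.transform(image)
        return image, self.labels[idx]
```

Because `__init__` stores only strings and integers, memory use stays small no matter how many images the folders contain.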


## 2.2 Instantiating `HymenopteraDataset`

Let's instantiate `HymenopteraDataset` and look at one of its samples.

In [0]:
#  ... your code here

* `Line 1`: When creating `trainset`, the function `__init__` will be called to populate `trainset.data` and `trainset.labels`.

Next, we look at the first sample in the dataset. Since we did not pass a transform, the image is returned untransformed and can be displayed directly without undoing any transformation.

In [0]:
#  ... your code here

---

# 3. Customizing ResNet18 for Binary Classification

Now, customize ResNet18 (`torchvision.models.resnet18`) to build a classifier to differentiate between *ants* vs *bees*.  

<br>
<center>
<img src="https://www.researchgate.net/profile/Paolo_Napoletano/publication/322476121/figure/tbl1/AS:668726449946625@1536448218498/ResNet-18-Architecture_W640.jpg" width=40%>
</center>

ResNet18 expects RGB input images of size 224x224 (the 18 in its name refers to the number of weight layers, not the input size). It outputs a vector of 1000 dimensions because it was pretrained on ImageNet, which has 1000 object categories.

In PyTorch implementation, the layers are defined as:

|**Layer Name**| **Name in `torchvision.models.resnet18`** |
|:---:|:---:|
|conv1 | conv1, bn1, relu, maxpool |
|conv2_x | layer 1 |
|conv3_x | layer 2 |
|conv4_x | layer 3 |
|conv5_x | layer 4 |
|average pool | avgpool |
|fully connected  | fc |
|softmax          |  Not implemented. Softmax is applied <br> inside `nn.CrossEntropyLoss`|

#### Exercise:
<font color=blue>
Customize <i>resnet18</i> for a binary classification task. Replace the <i>fc</i> layer with the following two layers:
<br>
<br>
<li> nn.Linear(512, 1) 
<li> nn.Sigmoid()
<br>
<br>
You may group them into a `nn.Sequential` module.

Expected result:
</font>
```
    ResNet(
        (conv1): ...
        (bn1): ...
        (relu): ...
        (maxpool): ...
        (layer1): ...
        (layer2): ...
        (layer3): ...
        (layer4): ...
        (avgpool): AdaptiveAvgPool2d(output_size=(1, 1))
        (fc): Sequential(
            (0): Linear(in_features=512, out_features=1, bias=True)
            (1): Sigmoid()
        )
    )
```

In [0]:
################################################################################
# START OF CODE: Customize resnet18 for a binary classification task
################################################################################
#  ... your code here ...
################################################################################
#      END OF CODE         
################################################################################

print(model)
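The general technique of swapping a model's head can be sketched on a small stand-in network. `TinyNet` below is hypothetical and much smaller than ResNet18; it only mirrors the fact that the final layer is stored in an attribute named `fc` with 512 input features.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for a backbone that ends in an `fc` layer."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(nn.Flatten(), nn.Linear(8, 512), nn.ReLU())
        self.fc = nn.Linear(512, 1000)  # original 1000-way head

    def forward(self, x):
        return self.fc(self.features(x))


net = TinyNet()

# Assigning a new module to the attribute replaces the registered submodule.
net.fc = nn.Sequential(nn.Linear(512, 1), nn.Sigmoid())

out = net(torch.randn(4, 8))
print(out.shape)  # one sigmoid output per sample
```

The new head must keep the same number of input features (512 here) as the layer it replaces, since the rest of the network is unchanged.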

To train the model, we shall fine-tune only the top layers, namely `layer4` and `fc`. Freeze all other layers.

In [0]:
freeze_layers = ["layer4", "fc"]

#################################################################
# Freeze all the layers except for the layers defined in 
# freeze_layers  (5 lines)
#################################################################
#  ... your code here ...
#################################################################
#               END OF CODE
#################################################################
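Freezing is done by setting `requires_grad` to `False` on the parameters that should not be updated. The sketch below shows the idea on a hypothetical three-block model; the block names and the variable `trainable` are illustrative and this is not necessarily the five-line solution expected above.

```python
import torch.nn as nn

# Hypothetical three-block model; the block names are illustrative only.
net = nn.Sequential()
net.add_module("layer3", nn.Linear(4, 4))
net.add_module("layer4", nn.Linear(4, 4))
net.add_module("fc", nn.Linear(4, 1))

trainable = ["layer4", "fc"]

for name, param in net.named_parameters():
    # Parameter names look like "layer4.weight"; the first component
    # is the name of the top-level block the parameter belongs to.
    param.requires_grad = name.split(".")[0] in trainable

status = {name: p.requires_grad for name, p in net.named_parameters()}
print(status)
```

Matching on the first name component also works for nested names such as `layer4.0.conv1.weight`, which is the form `named_parameters()` produces for deeper models.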

Use the following code to check that you have set up your model correctly. If the layers are configured correctly, `requires_grad` will be `True` for all the weights and biases of `layer4` and `fc`, and `False` for those of the remaining layers.

In [None]:
for name, param in model.named_parameters():
    print(name, ":", param.requires_grad)

---

# 4. Training the Model

Now we are ready to train the model. In the following, we define the transformation, set up our optimizer and scheduler, and then define the training function before training the model.

Define the transformation function to augment the dataset.

In [0]:
import torchvision.transforms as transforms
from torch.utils.data import DataLoader

# preprocessing and augmentation applied to the training images
train_transform = transforms.Compose([
    transforms.Resize(256),
    transforms.RandomCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

Load the dataset with the defined transformation

In [0]:
trainset = HymenopteraDataset("./hymenoptera_data/train", transform=train_transform)
trainloader = DataLoader(trainset, batch_size=4, shuffle=True, num_workers=0)

Set up the optimizer

In [0]:
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

Set up the scheduler. In the following, we are going to use the **step decay schedule**. We shall multiply the learning rate by a factor of 0.1 every 7 epochs.

In [0]:
# ... your code here ...
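To see what a step decay schedule does to the learning rate, the sketch below uses PyTorch's built-in `torch.optim.lr_scheduler.StepLR` on a single dummy parameter and records the learning rate at the start of each epoch. The names `params`, `opt`, `sched` and `lrs` are illustrative; in the lab, the scheduler would wrap the `optimizer` defined above.

```python
import torch
import torch.optim as optim
from torch.optim.lr_scheduler import StepLR

# A single dummy parameter stands in for the model's parameters.
params = [torch.nn.Parameter(torch.zeros(1))]
opt = optim.SGD(params, lr=0.001, momentum=0.9)
sched = StepLR(opt, step_size=7, gamma=0.1)

lrs = []
for epoch in range(15):
    lrs.append(opt.param_groups[0]["lr"])
    opt.step()    # a real loop would run the training batches here
    sched.step()  # advance the schedule once per epoch

# Learning rate at the start of epochs 0, 7 and 14
print(lrs[0], lrs[7], lrs[14])
```

With these settings the learning rate at epoch `e` (counting from 0) is `0.001 * 0.1 ** (e // 7)`, so it stays constant for 7 epochs and then drops by a factor of 10.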

Train the model. We pass the dataloader, optimizer and scheduler into the function. In order to reduce the learning rate according to the schedule, you must call **`scheduler.step()`** at the end of every epoch.

In [0]:
def train(net, trainloader, optimizer, scheduler, num_epochs):
    
    history = []
    
    # transfer model to GPU
    if torch.cuda.is_available():
        net = net.cuda()
    
    # set to training mode
    net.train()

    # train the network
    for e in range(num_epochs):    

        running_loss = 0.0
        running_count = 0.0

        for i, (inputs, labels) in enumerate(trainloader):

            labels = labels.reshape(-1, 1).float()
            
            # Clear all the gradient to 0
            optimizer.zero_grad()

            # transfer data to GPU
            if torch.cuda.is_available():
                inputs = inputs.cuda()
                labels = labels.cuda()

            # forward propagation to get h
            outs = net(inputs)
        
            # compute loss 
            loss = F.binary_cross_entropy(outs, labels)

            # backpropagation to get dw
            loss.backward()

            # update the parameters
            optimizer.step()

            # get the loss
            running_loss += loss.item()
            running_count += 1

        # compute the averaged loss in each epoch
        train_loss = running_loss / running_count
        running_loss = 0. 
        running_count = 0.
        print(f'[Epoch {e+1:2d}/{num_epochs:d} Iter {i+1:5d}/{len(trainloader)}]: train_loss = {train_loss:.4f}')       

        # Update the scheduler's counter at the end of each epoch
        # ... your code here ...

    return 
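The training loop reshapes the labels to `(N, 1)` and casts them to float because `F.binary_cross_entropy` expects targets with the same shape and floating-point type as the sigmoid outputs. The small sketch below, using made-up values, shows this and checks the result against the binary cross-entropy formula computed by hand.

```python
import torch
import torch.nn.functional as F

outs = torch.tensor([[0.9], [0.2], [0.6]])  # sigmoid outputs, shape (3, 1)
labels = torch.tensor([1, 0, 1])            # integer class labels, shape (3,)

# Match the shape and dtype of the model outputs.
targets = labels.reshape(-1, 1).float()
loss = F.binary_cross_entropy(outs, targets)

# Same quantity by hand: mean of -[y*log(p) + (1-y)*log(1-p)]
manual = -(targets * outs.log() + (1 - targets) * (1 - outs).log()).mean()
print(loss.item(), manual.item())
```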

Now we are ready to train our model. We should expect training loss of about 0.2.

In [0]:
train(model, trainloader, optimizer, scheduler, num_epochs=25)

# 5. Evaluating the Model

The following code then evaluates the model. The expected accuracy is around 93.4%.

In [0]:
def evaluate(model, testloader):
    # set to evaluation mode
    model.eval()
    
    # counters for correct predictions and total samples
    running_corrects = 0
    running_count = 0

    for inputs, targets in testloader:
        
        # transfer to the GPU
        if torch.cuda.is_available():
            inputs = inputs.cuda()
            targets = targets.cuda()
        
        # perform prediction (no need to compute gradient)
        with torch.no_grad():
            outputs = model(inputs)
            predicted = outputs > 0.5
            running_corrects += (predicted.view(-1) == targets).sum().double()
            running_count += len(inputs)
            print('.', end='')

    print('\nAccuracy = {:.2f}%'.format(100*running_corrects/running_count))
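The thresholding step inside `evaluate` can be checked on a handful of made-up values. The sketch below assumes, as in the model above, that each output is a sigmoid probability of shape `(N, 1)` and that class 1 is predicted when the probability exceeds 0.5.

```python
import torch

outputs = torch.tensor([[0.8], [0.3], [0.6], [0.1]])  # sigmoid probabilities
targets = torch.tensor([1, 0, 0, 0])                  # ground-truth labels

predicted = outputs > 0.5  # boolean tensor, shape (4, 1)

# Flatten to shape (4,) so the comparison with targets is element-wise.
correct = (predicted.view(-1) == targets).sum().item()
accuracy = 100 * correct / len(targets)
print(f"Accuracy = {accuracy:.2f}%")  # Accuracy = 75.00%
```

Without the `view(-1)`, comparing a `(4, 1)` tensor with a `(4,)` tensor would broadcast to a `(4, 4)` result and produce a wrong count.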

In [0]:
import torchvision.transforms as transforms
from torch.utils.data import DataLoader

# deterministic preprocessing applied to the validation images
val_transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

testset = HymenopteraDataset("./hymenoptera_data/val", transform=val_transform)
testloader = DataLoader(testset, batch_size=4, shuffle=True, num_workers=0)

evaluate(model, testloader)