### HDS-M05: Deep Learning (Exercise II)

**Practical leader and prepared by:** Sharib Ali, PhD

[Setup instructions](https://github.com/sharibox/tutorial/blob/master/HDS-CDT_DL.md)

### Required packages

[1] [matplotlib](http://matplotlib.org) is used for plotting graphs in Python

[2] [PyTorch](https://pytorch.org/docs/stable/index.html) is a library widely used for building deep-learning models

[3] [TensorBoard](https://pytorch.org/docs/stable/tensorboard.html) is used to visualise how your training and validation loss and accuracy develop. It is important to watch these curves!

[4] [TorchVision](https://pytorch.org/vision/stable/index.html) provides ready-made datasets, including the one required in this tutorial (CIFAR10), and augmentations (which increase data variability for improved accuracy and model generalisation)

[5] [scikit-learn](https://scikit-learn.org/stable/modules/model_evaluation.html) provides metrics such as the accuracy score and the confusion matrix

**Reference:** https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf

<img src="https://upload.wikimedia.org/wikipedia/commons/c/cc/Comparison_image_neural_networks.svg" style="width:800px;height:200px;">
<caption><center> <u>Figure</u>: LeNet and AlexNet for image classification.</center></caption>

    What will you learn here?
    
    - You will implement a more complex CNN --> AlexNet 
    - You will learn to perform a validation split 
    - You will learn to plot loss and accuracy
    - You will use scikit-learn to compute the confusion matrix and other metrics directly, as you learnt in the ML03 module
    

In [None]:
import torch
import torchvision 
from torch import nn
import numpy as np
# always check your version
print(torch.__version__)

### Data loading and pre-processing
**Steps**

[1] Load data - use torchvision if your dataset is available there ([available torchvision datasets](https://pytorch.org/vision/stable/datasets.html))

[2] Transform --> Normalise your data with a mean and std (e.g., for colour images normalise all three channels)
e.g., torchvision.transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)). In the transform pipeline this is applied after ToTensor, because it operates on tensors

[3] Transform --> Always convert your data to tensors with ToTensor (you can do **steps 1, 2 and 3** in one line, as done in this tutorial)

[4] New: split your data into **train and validation sets**

[5] Make [DataLoaders](https://pytorch.org/docs/stable/data.html): a DataLoader represents a Python iterable over a dataset

[6] Identify labels 
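
As a quick check of what `Normalize` does in step 2, the sketch below applies the per-channel formula `(x - mean) / std` to a few pixel values in plain Python. It assumes pixel values are already in [0, 1], as `ToTensor` produces; the helper names `normalise` and `unnormalise` are illustrative only and are not part of torchvision.

```python
# Per-channel normalisation as applied by transforms.Normalize: (x - mean) / std.
# `normalise` and `unnormalise` are hypothetical helper names for illustration.
mean, std = 0.5, 0.5

def normalise(x, mean=mean, std=std):
    return (x - mean) / std

def unnormalise(y, mean=mean, std=std):
    return y * std + mean

pixels = [0.0, 0.25, 0.5, 1.0]               # values in [0, 1], as ToTensor produces
normalised = [normalise(p) for p in pixels]
print(normalised)                             # [-1.0, -0.5, 0.0, 1.0]

restored = [unnormalise(v) for v in normalised]
print(restored)                               # [0.0, 0.25, 0.5, 1.0]
```

With mean and std both 0.5, values in [0, 1] map to [-1, 1], and the inverse is `y * 0.5 + 0.5`, which is the same as `y / 2 + 0.5`.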


In [2]:
from torchvision import transforms 
from torchvision.datasets import CIFAR10
from torch.utils.data import DataLoader

In [4]:
# Preparing transform for step 2 and step 3
mean = (0.5, 0.5, 0.5)
std = (0.5, 0.5, 0.5)

# Add data augmentation --> training
transform = transforms.Compose([
    transforms.Resize(224),   # Note: resize images to 224, the input size AlexNet expects
    transforms.ToTensor(),
    transforms.ColorJitter(brightness=0.5),
    transforms.ColorJitter(hue=0.25),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.Normalize(mean, std)
    ])

testtransform = transforms.Compose([
    transforms.Resize(224),   # Note: resize images to 224, the input size AlexNet expects
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
    ])

# Load data and include the prepared transform (remember to apply the same resize and
# normalisation to train, val and test data)
trainset = CIFAR10("data", download=True, train=True, transform=transform)

# valset and testset need to have the same transform (remember --> no augmentation here!)
valset = CIFAR10("data", download=True, train=True, transform=testtransform) 
testset = CIFAR10("data", download=True, train=False, transform=testtransform)

# labels of CIFAR10 dataset
classes = ('plane', 'car', 'bird', 'cat','deer', 'dog', 'frog', 'horse', 'ship', 'truck')

Files already downloaded and verified
Files already downloaded and verified
Files already downloaded and verified


#### New: You will split the train data into train and val sets (say 90-10) 
- This step is crucial to identify under- and over-fitting problems 
- Later, we will visualise performance on both train and validation sets live during training (using TensorBoard)

In [None]:
# Step: Split between train and valset from the overall trainset
from torch.utils.data.sampler import SubsetRandomSampler
val_percentage = 0.1
num_train = len(trainset)

indices = list(range(num_train))
split = int(np.floor(val_percentage * num_train))

train_idx, valid_idx = indices[split:], indices[:split]
train_sampler = SubsetRandomSampler(train_idx)
valid_sampler = SubsetRandomSampler(valid_idx)


# Now create the DataLoaders that let you iterate over batches during training
batch_size = 16 # choose the batch size based on how much memory you have and your data size

traindataloader = DataLoader(trainset, batch_size=batch_size, sampler=train_sampler, num_workers=2)
# use valset (no augmentation) for validation; it shares its indices with trainset
valdataloader = DataLoader(valset, batch_size=batch_size, sampler=valid_sampler,
            num_workers=2,)

testdataloader = DataLoader(testset, batch_size=4, shuffle=False, num_workers=2)
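
The split above takes the first 10% of the indices as the validation set. A possible variant, sketched here in plain Python, shuffles the indices with a fixed seed before splitting so that the split is random but reproducible. The seed value and the dataset size of 50,000 are assumptions made for illustration; the resulting index lists could be passed to `SubsetRandomSampler` in the same way as above.

```python
import random

num_train = 50000          # size of the CIFAR10 training set (assumed)
val_percentage = 0.1
seed = 0                   # hypothetical seed; any fixed value works

indices = list(range(num_train))
random.Random(seed).shuffle(indices)   # shuffle the index list in place, reproducibly

split = int(val_percentage * num_train)
valid_idx, train_idx = indices[:split], indices[split:]
print(len(train_idx), len(valid_idx))
```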

In [None]:
# len() of a DataLoader is the number of batches, not the number of samples
print('Number of training batches:', len(traindataloader))
print('Number of validation batches:', len(valdataloader))
print('Number of testing batches:', len(testdataloader))
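
Note that `len()` of a `DataLoader` counts batches rather than samples. The arithmetic below is a small sketch of the counts to expect, assuming the standard CIFAR10 sizes (50,000 training and 10,000 test images), the 90-10 split and batch sizes used above, and the default `drop_last=False` (so a final partial batch still counts). The helper `num_batches` is a hypothetical name.

```python
import math

num_train_total = 50000   # CIFAR10 training images (assumed standard size)
num_test = 10000          # CIFAR10 test images (assumed standard size)
val_percentage = 0.1
batch_size = 16
test_batch_size = 4

num_val = int(math.floor(val_percentage * num_train_total))
num_train = num_train_total - num_val

def num_batches(n_samples, batch_size):
    # a last, smaller batch is kept when drop_last=False, hence the ceiling
    return math.ceil(n_samples / batch_size)

train_batches = num_batches(num_train, batch_size)
val_batches = num_batches(num_val, batch_size)
test_batches = num_batches(num_test, test_batch_size)
print(train_batches, val_batches, test_batches)   # 2813 313 2500
```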

### Look into data

In [None]:
# function to unnormalise images and using transpose to change order to [H, W, Channel] 
def imshow(img):
    img = img / 2 + 0.5     # unnormalize
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))
    plt.show()

In [None]:
import matplotlib.pyplot as plt 
# always check the shape of your training data
dataiter = iter(traindataloader)
images, labels = next(dataiter)  # the .next() method is not available in recent PyTorch versions

print(images.shape) # batchsize , channel, Height, Width 
print(labels.shape)  # array with label in batchsize 

# show images 
imshow(torchvision.utils.make_grid(images))
# print labels
print(' '.join('%5s' % classes[labels[j]] for j in range(batch_size)))


### Create your AlexNet model
- Please note that your input image size determines the hard-coded feature size in the first FC layer
- Always be aware of the input size used; here, for convenience, we follow the original paper and resize images to 224x224x3 

<img src="https://raw.githubusercontent.com/sharibox/HDS-CDT2020/main/images/AlexNet.png" style="width:800px;height:200px;">
<caption><center> <u>Figure</u>: AlexNet for image classification.</center></caption>
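
To see why the input size fixes the FC-layer feature size, the sketch below walks a 224x224 input through a stack of convolution and pooling layers using the usual output-size formula `floor((size + 2*padding - kernel) / stride) + 1`. The layer settings are an assumption: they follow the torchvision-style AlexNet variant and may differ from the figure above or from your own implementation, so treat the numbers as illustrative.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a conv or pooling layer (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1

# (name, kernel, stride, padding, out_channels): assumed, torchvision-style AlexNet.
# Pooling layers keep the channel count, so out_channels is None for them.
layers = [
    ("conv1", 11, 4, 2, 64),
    ("pool1", 3, 2, 0, None),
    ("conv2", 5, 1, 2, 192),
    ("pool2", 3, 2, 0, None),
    ("conv3", 3, 1, 1, 384),
    ("conv4", 3, 1, 1, 256),
    ("conv5", 3, 1, 1, 256),
    ("pool5", 3, 2, 0, None),
]

def feature_shape(input_size, in_channels=3):
    size, channels = input_size, in_channels
    for name, kernel, stride, padding, out_channels in layers:
        size = conv_out(size, kernel, stride, padding)
        if out_channels is not None:
            channels = out_channels
    return channels, size, size

channels, height, width = feature_shape(224)
flat_features = channels * height * width
print(channels, height, width, flat_features)   # 256 6 6 9216
```

Under these assumptions the first fully connected layer would take 256 * 6 * 6 = 9216 input features. Changing the input size or any layer setting changes this number, which is why it has to be recomputed whenever the input resolution changes.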

In [None]:
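# create your model 
class AlexNet(nn.Module):
    def __init__(self):
        super(AlexNet, self).__init__()

        # TODO: define the convolutional feature extractor
        # self.features = nn.Sequential(
        #
        # )
        # (optional) self.avgpool = nn.AdaptiveAvgPool2d((6, 6))

        # TODO: define the fully connected classifier
        # self.classifier = nn.Sequential(
        #
        # )

    def forward(self, x):
        # TODO: pass x through features, flatten it, then pass it through classifier
        return x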

### Training your model

### Prepare an optimizer, set the learning rate, and your loss function
- Here you will use model.train() and compute gradients
- You will use criterion to compute the loss 
- Compute a metric to know how well the model is performing
- Save the loss and metric values to display their mean for each epoch
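
To make the loss values easier to interpret, here is a plain-Python sketch of what a cross-entropy criterion computes for a single sample: `-log(softmax(logits)[target])`. It mirrors the definition used by `nn.CrossEntropyLoss` on raw scores, but the function name `cross_entropy` and the example logits are made up for illustration.

```python
import math

def cross_entropy(logits, target):
    """-log(softmax(logits)[target]) for one sample, computed stably."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

# 10 classes with identical scores: the model is guessing uniformly
uniform_loss = cross_entropy([0.0] * 10, target=3)

# a high score on class 0: low loss if class 0 is correct, high loss otherwise
confident_loss = cross_entropy([5.0] + [0.0] * 9, target=0)
wrong_loss = cross_entropy([5.0] + [0.0] * 9, target=1)

print(round(uniform_loss, 4))   # 2.3026, i.e. log(10)
print(confident_loss < uniform_loss < wrong_loss)   # True
```

A loss close to log(10), about 2.30, on a 10-class problem such as CIFAR10 therefore suggests the model is still predicting at chance level.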

In [None]:
# create your model 
from torch import optim
model = AlexNet()
lr = 0.0001
optimiser = optim.Adam(model.parameters(), lr=lr) # optionally pass weight_decay=... for L2 regularisation
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
criterion = nn.CrossEntropyLoss()
model = model.to(device)
criterion = criterion.to(device)

### Prepare accuracy computation to know how your training is going 

[1] Loss function is important to keep track (mostly you minimise it, i.e. it should go down)

[2] Accuracy in classification is important and you want higher accuracy

[3] You can use TensorBoard to visualise both; in a new terminal, run the following

```shell
# forward port 8889 from the server to your local machine
ssh -L 8889:127.0.0.1:8889 $user@bdicdtvm01.bmrc.ox.ac.uk 
ml Anaconda3
source activate base

# start your training below, then run this while waiting:
tensorboard --logdir runs/ --port 8889
# then open http://127.0.0.1:8889/ in your local browser
```

In [None]:
# define accuracy
def topk_accuracy(output, target, topk=(1,)):
    """Computes the precision@k for the specified values of k"""
    maxk = max(topk)
    batch_size = target.size(0)
    _, pred = output.topk(maxk, 1, True, True)
    pred = pred.t()
    correct = pred.eq(target.view(1, -1).expand_as(pred))

    res = []
    for k in topk:
        correct_k = correct[:k].reshape(-1).float().sum(0)  # reshape also works for k > 1, where view can fail on a non-contiguous tensor
        res.append(correct_k.mul_(100.0 / batch_size))
    return res
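
To see what top-k accuracy measures, here is a plain-Python sketch on a toy batch of four samples and three classes. The function name `topk_accuracy_py` and the scores are invented for illustration; the idea is the same as above: a sample counts as correct if its true label is among the k highest-scoring classes.

```python
def topk_accuracy_py(scores, targets, k=1):
    """Percentage of samples whose true label is among the k highest scores."""
    hits = 0
    for row, target in zip(scores, targets):
        # indices of the k largest scores in this row
        topk = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += target in topk
    return 100.0 * hits / len(targets)

scores = [
    [0.1, 0.7, 0.2],   # highest score: class 1
    [0.5, 0.3, 0.2],   # highest score: class 0
    [0.2, 0.3, 0.5],   # highest score: class 2
    [0.6, 0.3, 0.1],   # highest score: class 0
]
targets = [1, 1, 2, 2]

top1 = topk_accuracy_py(scores, targets, k=1)
top2 = topk_accuracy_py(scores, targets, k=2)
print(top1, top2)   # 50.0 75.0
```

Top-1 accuracy is the usual classification accuracy; top-2 is more forgiving because the second sample's true class has the second-highest score.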

In [None]:
# [4] Run your training loop with the optimiser minimising your cost/loss; don't forget to backpropagate your loss
model.to(device)
model.train()
# Tensorboard
from torch.utils.tensorboard import SummaryWriter
# Writer will output to ./runs/ directory by default
writer = SummaryWriter()

# define no. of epochs you want to loop 
epochs = 10
log_interval = 100 # for visualising your iterations 

# New: saving your model depending on your best val score
best_valid_loss = float('inf')
ckptFileName = 'alexNet_CKPT_best.pt'
for epoch in range(epochs):
    train_loss, valid_loss, train_top1,val_top1  = [], [], [], []
  
    
    for batch_idx, (data, label) in enumerate(traindataloader):
        # initialise all your gradients to zero
        optimiser.zero_grad()
        out = model(data.to(device))
        loss = criterion(out, label.to(device))
        loss.backward()
        optimiser.step()
        
        # append
        train_loss.append(loss.item())
        acc_1 = topk_accuracy(out, label.to(device),topk=(1,))
        train_top1.append(acc_1[0].item())
        
        if (batch_idx % log_interval) == 0:
            print('Train Epoch is: {}, train loss is: {:.6f}, train accuracy top1% is {}'.format(epoch, np.mean(train_loss),
                                                                                           np.mean(train_top1)))
            
            ### --------> New -----> validation code!!! ###
            # **New:** compute validation loss and accuracy. Use no_grad (same as test) and
            # eval mode so that layers such as dropout behave as at test time.
            # Note: this runs a full validation pass every log_interval batches; move it to
            # the end of the epoch if that is too slow.
            # TODO: make this a function and call it for both test and val
            model.eval()
            with torch.no_grad():
                for i, (val_data, val_label) in enumerate(valdataloader):
                    val_data, val_label = val_data.to(device), val_label.to(device)
                    val_out = model(val_data)
                    val_loss = criterion(val_out, val_label)

                    # append
                    valid_loss.append(val_loss.item())
                    acc_1 = topk_accuracy(val_out, val_label, topk=(1,))
                    val_top1.append(acc_1[0].item())
            model.train()  # switch back to training mode
    
            print('Val Epoch is: {}, val loss is: {:.6f}, val accuracy top1% is {}'.format(epoch, np.mean(valid_loss),
                                                                                           np.mean(val_top1)))
    
    #New --> save your checkpoint every epoch if best
    if np.mean(valid_loss) < best_valid_loss:
        best_valid_loss = np.mean(valid_loss)
        print('saving my model, improvement in validation loss achieved...')
        torch.save(model.state_dict(), ckptFileName)
        
        
    # every epoch write the loss and accuracy (these you can see plots on tensorboard)        
    writer.add_scalar('AlexNet/train_loss', np.mean(train_loss), epoch)
    writer.add_scalar('AlexNet/train_accuracy', np.mean(train_top1), epoch)
    
    # New --> add plot for your val loss and val accuracy
    writer.add_scalar('AlexNet/val_loss', np.mean(valid_loss), epoch)
    writer.add_scalar('AlexNet/val_accuracy', np.mean(val_top1), epoch)
    
writer.close()

In [None]:
### Test predictions in a CNN (TODO: please fill this by yourself, you can take reference from above)
# Compute total test accuracy
total = 0
model.eval()
with torch.no_grad():
    for data in testdataloader:
        image, labels = data
        output = model(image.to(device))
        _, preds_tensor = torch.max(output, 1)
        acc_1_test = topk_accuracy(output, labels.to(device),topk=(1,) )
        total +=np.mean(acc_1_test[0].detach().cpu().numpy())
print('test accuracy is {}% on 10000 samples of CIFAR10 test dataset'.format(total/len(testdataloader)))

In [None]:
# Alternatively, you can compute the accuracy for each class separately as well 
#(taken from: https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html)
correct_pred = {classname: 0 for classname in classes}
total_pred = {classname: 0 for classname in classes}

# again no gradients needed
with torch.no_grad():
    for data in testdataloader:
        images, labels = data
        outputs = model(images.to(device))
        _, predictions = torch.max(outputs, 1)
        # collect the correct predictions for each class
        for label, prediction in zip(labels, predictions):
            if label.to(device) == prediction:
                correct_pred[classes[label]] += 1
            total_pred[classes[label]] += 1


# print accuracy for each class
for classname, correct_count in correct_pred.items():
    accuracy = 100 * float(correct_count) / total_pred[classname]
    print("Accuracy for class {:5s} is: {:.1f} %".format(classname,
                                                   accuracy))

#### New => Inference/testing on saved checkpoint
- Load your trained model and run it on testdataloader 
- If you have a checkpoint file (trained weights), you can simply run the code below to test

In [None]:
ckptFileName = 'alexNet_CKPT_best.pt'
# load the saved weights (map_location lets a GPU-trained checkpoint load on a CPU-only machine)
model.load_state_dict(torch.load(ckptFileName, map_location=device))

# Apply testing (same as validation above)
test_preds = []
labels = []
with torch.no_grad():
    model.eval()
    for i, batch in enumerate(testdataloader):
        img, label = batch
        img, label = img.to(device, dtype = torch.float), label.to(device, dtype = torch.long)
        output = model(img)
        output = output.detach().cpu().numpy()
        test_preds.extend(np.argmax(output, 1))
        labels.extend(label.detach().cpu().numpy())

In [None]:
# --- New => Use scikit-learn to see the confusion matrix/compute accuracy (recall ML-03)
from sklearn.metrics import accuracy_score, confusion_matrix
print('Confusion matrix:')
print(confusion_matrix(labels, test_preds))
print('Accuracy score: %f' % accuracy_score(labels, test_preds))
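
To help read the confusion matrix, the plain-Python sketch below builds one by hand for a toy 3-class problem, using the same convention as scikit-learn (rows are true labels, columns are predicted labels). The labels and the helper name `confusion_matrix_py` are made up for illustration. The diagonal holds the correct predictions, so per-class accuracy is the diagonal entry divided by its row sum.

```python
def confusion_matrix_py(y_true, y_pred, num_classes):
    """Rows index the true class, columns the predicted class."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix

y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0, 2]

cm = confusion_matrix_py(y_true, y_pred, num_classes=3)
for row in cm:
    print(row)

# per-class accuracy: correct predictions for a class / samples of that class
per_class = [100.0 * cm[c][c] / sum(cm[c]) for c in range(3)]
# overall accuracy: sum of the diagonal / total number of samples
overall = sum(cm[c][c] for c in range(3)) / len(y_true)
print(per_class, overall)
```

Off-diagonal entries show which classes are confused with each other; here one sample of class 0 is predicted as class 1 and one sample of class 2 as class 0.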

### Improving your network performance

- Train with a larger batch size and for more epochs (longer)
- Add data augmentation provided in transforms (https://pytorch.org/vision/stable/transforms.html) -- ask your tutor if you are confused 
- Save your training with augmentation as ``my_AlexNet_withAug.pt``
- Tune your hyperparameters and decay rate
- Add your TensorBoard outputs for training and validation (loss and accuracy) as image files


#### Exercise: Perform above improvements on CIFAR10 
You can also refer to: https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html
- You can use this IPython notebook to do this (**Assignment to be submitted**)
- Due on **Wednesday 3rd November, 2021 (11:59 PM)** (*You will be graded for this exercise*)

<h3>Thanks for completing this lesson!</h3>

Please send any comments, feedback and your solution to the exercise to [Sharib Ali](mailto:sharib.ali@eng.ox.ac.uk)