# Fast Experiments with PyTorch

This notebook explores how to structure PyTorch code for faster experimentation.
The data is a custom yoga poses dataset. More information can be found [here](https://fastpages.fast.ai/).

<font color='red'>NOTE</font> 
- There seems to be a small bug in the _training_ of the model: neither the loss nor the accuracy improves. This notebook will be updated once the bug is fixed.
- Only the default kernel sizes have been tried for the convolutions. This will be updated after more experimentation.
- Only a very basic fit method is used. Later versions will make it more comprehensive.
- Model saving will be added once the bugs are fixed.

This notebook is heavily based on the code structure used in [JovianML's](https://jovian.ml/) PyTorch Zero2Gans course. Check out the course forum for more information.

## General Imports

In [None]:
import os
import torch
import pandas as pd
import numpy as np
from torch.utils.data import Dataset, random_split, DataLoader
from torchvision.utils import make_grid
from tqdm.notebook import tqdm
from PIL import Image
from torchvision import models, datasets
import matplotlib.pyplot as plt
import torchvision.transforms as T
from sklearn.metrics import f1_score
import torch.nn.functional as F
import torch.nn as nn
%matplotlib inline

## Preparing the Data  

It goes without saying that the folder structure expected by `datasets.ImageFolder` has to be maintained: one subfolder per class inside each of the train and val directories.
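As a rough sketch of that layout, the snippet below builds empty class folders in a temporary directory and derives the class list the way `ImageFolder` does, from the sorted subfolder names. The pose names are hypothetical placeholders, not the actual classes of the dataset.

```python
import tempfile
from pathlib import Path

# Hypothetical class names, used only to illustrate the layout.
pose_names = ["tree", "cobra", "warrior"]

# Build <root>/train/<class>/ and <root>/val/<class>/ folders.
root = Path(tempfile.mkdtemp())
for split in ("train", "val"):
    for pose in pose_names:
        (root / split / pose).mkdir(parents=True)

def list_classes(split_dir):
    """Mimic how ImageFolder derives classes: sorted subfolder names."""
    return sorted(p.name for p in Path(split_dir).iterdir() if p.is_dir())

classes = list_classes(root / "train")
class_to_idx = {name: idx for idx, name in enumerate(classes)}
print(classes)
print(class_to_idx)
```

Because the indices come from the sorted names, the train and val folders need the same set of class subfolders for the labels to line up.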

In [None]:
DATA_DIR = 'data'

# os.path.join avoids the doubled slash of 'data/' + '/train'
TRAIN_DIR = os.path.join(DATA_DIR, 'train')
VALID_DIR = os.path.join(DATA_DIR, 'val')

### Data augmentations

In [None]:
# Resize to the CIFAR image size (32x32) for faster experimentation

imagenet_stats = ([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])

train_tfms = T.Compose([
    # T.Resize((512, 512)),
    T.Resize((32, 32)),
    T.RandomHorizontalFlip(),
    T.ToTensor(),
    T.Normalize(*imagenet_stats)
])

valid_tfms = T.Compose([
    # T.Resize((512, 512)),
    T.Resize((32, 32)),
    T.ToTensor(),
    T.Normalize(*imagenet_stats)
])
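For reference, `T.Normalize(mean, std)` computes `(x - mean) / std` per channel, so it can be undone with `x * std + mean` before plotting. A minimal NumPy sketch of that round trip, using a random array as a stand-in for an image tensor (the array and the seed are assumptions made for illustration):

```python
import numpy as np

# ImageNet channel statistics, reshaped to broadcast over (C, H, W).
mean = np.array([0.485, 0.456, 0.406]).reshape(3, 1, 1)
std = np.array([0.229, 0.224, 0.225]).reshape(3, 1, 1)

rng = np.random.default_rng(0)
image = rng.random((3, 32, 32))        # stand-in for a ToTensor() output in [0, 1)

normalized = (image - mean) / std      # what Normalize does per channel
restored = normalized * std + mean     # the inverse, needed before plotting

print(normalized.shape)
print(np.allclose(restored, image))
```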

PyTorch-specific data loading

In [None]:
train_set = datasets.ImageFolder(TRAIN_DIR, transform=train_tfms)
valid_set = datasets.ImageFolder(VALID_DIR, transform=valid_tfms)

In [None]:
batch_size = 64

In [None]:
train_dl = DataLoader(train_set, batch_size=batch_size, shuffle=True, num_workers=4, pin_memory=True)
# Shuffling is not needed for validation
valid_dl = DataLoader(valid_set, batch_size=batch_size, shuffle=False, num_workers=4, pin_memory=True)

In [None]:
class_names = train_set.classes

A quick check on a single image. Note the heavy pixelation caused by resizing to 32x32.

In [None]:
def imshow(inp, title=None):
    """Imshow for Tensor."""
    inp = inp.numpy().transpose((1, 2, 0))   # CHW -> HWC, as matplotlib expects
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    inp = std * inp + mean                   # undo T.Normalize
    inp = np.clip(inp, 0, 1)
    plt.imshow(inp)
    if title is not None:
        plt.title(title)

# Plotting only one image for easier visualisation
inputs, classes = next(iter(valid_dl))
out = make_grid(inputs[0])

imshow(out, title=class_names[classes[0].item()])

## Architecting Blocks

###### This is the main purpose of this notebook!

Defining a PyTorch architecture is seamless thanks to the _Sequential_ API. However, most deep learning architectures are built from small repetitive blocks. For example, the CIFAR classification model used Conv-ReLU-MaxPool layers grouped into blocks like the one below:
```
nn.Conv2d(in_filter, out_filter, kernel_size, padding=1...)
nn.ReLU()
nn.Conv2d(in_filter, out_filter, kernel_size, padding=1...)
nn.ReLU()
nn.MaxPool2d(kernel_size, stride,...)
```

This can be defined as a function _conv_block_ for easier additions and more compact code. A similar approach can be followed to create _linear blocks_.
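As a minimal standalone sketch of the idea, the block pattern can be wrapped in a plain function and checked on a random batch. This simplified version leaves out the experimental filter doubling used in the class defined next; the 16 output filters and the batch size of 4 are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn

def conv_block(in_f, out_f, pool_size=2, pool_stride=2, **kwargs):
    """Conv-ReLU-Conv-ReLU-MaxPool; kwargs are passed to both Conv2d layers."""
    return nn.Sequential(
        nn.Conv2d(in_f, out_f, **kwargs),
        nn.ReLU(),
        nn.Conv2d(out_f, out_f, **kwargs),
        nn.ReLU(),
        nn.MaxPool2d(pool_size, pool_stride),
    )

block = conv_block(3, 16, kernel_size=3, padding=1)
x = torch.rand(4, 3, 32, 32)            # a random batch of four 32x32 RGB images
out_shape = tuple(block(x).shape)
print(out_shape)
```

With a 3x3 kernel and padding of 1 the convolutions keep the spatial size, so only the pooling layer halves it, from 32 to 16.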

In [None]:
class ImageClassificationBase(nn.Module):
    ## The conv block - Adjustments of filters is just experimental ##
    def conv_block(self, in_f, out_f, pool_params, **kwargs):
        pool_size, pool_stride = pool_params
        if in_f != 3:
            in_f = in_f*2
            out_f1 = out_f2 = out_f*2
        else:
            out_f1 = out_f
            out_f2 = out_f*2
            
        return nn.Sequential(
            nn.Conv2d(in_f, out_f1,**kwargs),
            nn.ReLU(),
            nn.Conv2d(out_f1, out_f2,**kwargs),
            nn.ReLU(),
            nn.MaxPool2d(pool_size, pool_stride)
        )
    
    ## The linear block ##
    def linear_block(self, in_f, out_f):
        return nn.Sequential(
                nn.Linear(in_f, out_f),
                nn.ReLU()
           )
    
    def training_step(self, batch):
        images, labels = batch 
        out = self(images)                  # Generate predictions
        loss = F.cross_entropy(out, labels) # Calculate loss
        return loss
    
    def validation_step(self, batch):
        images, labels = batch 
        out = self(images)                    # Generate predictions
        loss = F.cross_entropy(out, labels)   # Calculate loss
        acc = accuracy(out, labels)           # Calculate accuracy
        return {'val_loss': loss.detach(), 'val_acc': acc}
        
    def validation_epoch_end(self, outputs):
        batch_losses = [x['val_loss'] for x in outputs]
        epoch_loss = torch.stack(batch_losses).mean()   # Combine losses
        batch_accs = [x['val_acc'] for x in outputs]
        epoch_acc = torch.stack(batch_accs).mean()      # Combine accuracies
        return {'val_loss': epoch_loss.item(), 'val_acc': epoch_acc.item()}
    
    def epoch_end(self, epoch, result):
        print("Epoch [{}], val_loss: {:.4f}, val_acc: {:.4f}".format(epoch, result['val_loss'], result['val_acc']))

### Custom network using defined blocks

Constructing the network becomes a breeze, since it is only a matter of arranging the conv blocks and fc blocks. The code is also very compact!

In [None]:
class CustomAsanaNet(ImageClassificationBase):
    def __init__(self, out_class, input_size, conv_channel_list, ip_channel_list, pool_params, **kwargs):
        super().__init__()
        filter_sizes = [input_size[0], *conv_channel_list]
            
        #Iterate over the filter sizes and create some sequential blocks
        conv_blocks = [self.conv_block(in_f, out_f, pool_params, **kwargs) 
                       for in_f, out_f in zip(filter_sizes, filter_sizes[1:])]
        #Stitch them together via an unroll
        self.conv_network = nn.Sequential(*conv_blocks)
        
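        # For automatic reshaping, pass a dummy variable through the conv network
        # to obtain the flattened size that feeds the fully connected layers
        with torch.no_grad():
            dummy_in = torch.rand(1, *input_size)
            self.neurons = int(np.prod(self.conv_network(dummy_in).shape))

        # Similar to conv blocks, but the output layer is a plain Linear:
        # a ReLU on the logits would clamp them at zero before cross_entropy,
        # which is a likely cause of a loss that does not improve
        filter_sizes = [self.neurons, *ip_channel_list]
        fc_blocks = [self.linear_block(in_f, out_f) for in_f, out_f in zip(filter_sizes, filter_sizes[1:])]
        fc_blocks.append(nn.Linear(filter_sizes[-1], out_class))
        self.fc_network = nn.Sequential(*fc_blocks)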
    
    
    def forward(self, xb):
        conv_out = self.conv_network(xb)
        conv_out = conv_out.view(-1, self.neurons)
        #Flattening the output to feed to the FC block
        fc_out = self.fc_network(conv_out)
        return fc_out
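The flattened size found by the dummy forward pass can be cross-checked by hand with the usual output-size formula, `(n + 2*padding - kernel) // stride + 1`. The sketch below assumes the settings used later in this notebook (3x3 kernels, stride 1, padding 1, 2x2 pooling with stride 2, and a conv channel list of 32, 64, 128) together with the filter doubling in `conv_block`.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of a conv or pooling layer along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1

height = width = 32
channels = 3
for out_f in [32, 64, 128]:
    for _ in range(2):                      # two convolutions per block
        height = conv_out(height, kernel=3, stride=1, padding=1)
        width = conv_out(width, kernel=3, stride=1, padding=1)
    height = conv_out(height, kernel=2, stride=2)   # 2x2 max pooling
    width = conv_out(width, kernel=2, stride=2)
    channels = out_f * 2                    # conv_block doubles the final filter count

flat_features = channels * height * width
print(channels, height, width, flat_features)
```

Under these assumptions the three blocks shrink the image from 32x32 to 4x4 with 256 channels, so the first linear layer should receive 4096 features.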

### Creating a transfer learning network using defined blocks

The beauty of breaking the architecture into blocks is that they can be reused seamlessly in transfer learning. Below is an example of a custom architecture built on ResNet. The code takes a _model_name_ as input, so experimenting with other models supported by torchvision also becomes a lot easier. Note that the head is replaced through the `fc` attribute, so this works as written only for models that expose their last layer as `fc`, as the ResNets do.

[Learn about ResNets.](https://towardsdatascience.com/an-overview-of-resnet-and-its-variants-5281e2f56035)
Check out torchvision models: https://pytorch.org/docs/stable/torchvision/models.html

In [None]:
class TransferAsanaNet(ImageClassificationBase):
    def __init__(self, model_name, out_class, ip_channel_list):
        super().__init__()
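        # Use a pretrained model, looked up by name in torchvision.models
        model_fn = getattr(models, model_name, None) if model_name is not None else None
        if model_fn is None:
            raise ValueError("No model called {} found! Please verify the name!".format(model_name))
        # Newer torchvision releases prefer the `weights=` argument over `pretrained=True`
        self.network = model_fn(pretrained=True)
        
        # Replace last layer
        num_ftrs = self.network.fc.in_features      
        filter_sizes = [num_ftrs, *ip_channel_list]
        
        # Hidden linear blocks keep their ReLU; the output layer is a plain Linear
        # so that cross_entropy receives unclamped logits
        fc_blocks = [self.linear_block(in_f, out_f) for in_f, out_f in zip(filter_sizes, filter_sizes[1:])]
        fc_blocks.append(nn.Linear(filter_sizes[-1], out_class))
        self.network.fc = nn.Sequential(*fc_blocks)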
    
    def forward(self, xb):
        return self.network(xb)
    
    def freeze(self):
        # To freeze the residual layers (the attribute is `requires_grad`;
        # assigning to `require_grad` silently has no effect)
        for param in self.network.parameters():
            param.requires_grad = False
        for param in self.network.fc.parameters():
            param.requires_grad = True
    
    def unfreeze(self):
        # Unfreeze all layers
        for param in self.network.parameters():
            param.requires_grad = True

## Creating the models

### Custom model using the architecture defined earlier

In [None]:
kwargs = {'kernel_size':3, 'stride':1, 'padding':1}
input_image_dim = [3, 32, 32]
conv_filter_list = [32, 64, 128]
fc_filter_list = [1024,512]
pool_params = [2,2]
out_class = 10
model = CustomAsanaNet(out_class, input_image_dim, conv_filter_list, fc_filter_list, pool_params, **kwargs)

### Transfer learning model using Resnet-18

In [None]:
model_name = 'resnet18'
# Note: this rebinds `model`, replacing the custom model created above
model = TransferAsanaNet(model_name, out_class, fc_filter_list)

In [None]:
# model

## Model Learning

#### Utility functions 

In [None]:
def get_default_device():
    """Pick GPU if available, else CPU"""
    if torch.cuda.is_available():
        return torch.device('cuda')
    else:
        return torch.device('cpu')
    
def to_device(data, device):
    """Move tensor(s) to chosen device"""
    if isinstance(data, (list,tuple)):
        return [to_device(x, device) for x in data]
    return data.to(device, non_blocking=True)

class DeviceDataLoader():
    """Wrap a dataloader to move data to a device"""
    def __init__(self, dl, device):
        self.dl = dl
        self.device = device
        
    def __iter__(self):
        """Yield a batch of data after moving it to device"""
        for b in self.dl: 
            yield to_device(b, self.device)

    def __len__(self):
        """Number of batches"""
        return len(self.dl)

In [None]:
device = get_default_device()
device

### Training  

<font color='red'>--- Work in progress ---</font> 


In [None]:
def accuracy(outputs, labels):
    _, preds = torch.max(outputs, dim=1)
    return torch.tensor(torch.sum(preds == labels).item() / len(preds))

In [None]:
@torch.no_grad()   # no gradients are needed during validation
def evaluate(model, val_loader):
    model.eval()   # put BatchNorm/Dropout layers into inference mode
    outputs = [model.validation_step(batch) for batch in val_loader]
    return model.validation_epoch_end(outputs)

def fit(epochs, model, train_loader, val_loader, optimizer, scheduler):
    history = []
    for epoch in range(epochs):
        # Training Phase 
        model.train()   # evaluate() leaves the model in eval mode
        for batch in tqdm(train_loader):
            loss = model.training_step(batch)
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
        # Validation phase
        result = evaluate(model, val_loader)
        model.epoch_end(epoch, result)
        scheduler.step()
        history.append(result)
    return history

In [None]:
train_dl = DeviceDataLoader(train_dl, device)
val_dl = DeviceDataLoader(valid_dl, device)

In [None]:
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
# Decay LR by a factor of 0.1 every 7 epochs
exp_lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.1)
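With `step_size=7` and `gamma=0.1`, the schedule above amounts to `lr = 0.001 * 0.1 ** (epoch // 7)`. A plain-Python sketch of the values this produces over the first 15 epochs (it reproduces the formula rather than calling the scheduler, and `step_lr` is a hypothetical helper name):

```python
base_lr, step_size, gamma = 0.001, 7, 0.1

def step_lr(epoch):
    """Learning rate in effect during the given 0-indexed epoch."""
    return base_lr * gamma ** (epoch // step_size)

schedule = [step_lr(epoch) for epoch in range(15)]
for epoch in (0, 6, 7, 13, 14):
    print(epoch, f"{schedule[epoch]:.0e}")
```

For a 10-epoch run this means the learning rate drops only once, at the start of epoch 7.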

In [None]:
model = to_device(model, device)

In [None]:
history = [evaluate(model, val_dl)]

In [None]:
# Append to the initial evaluation instead of overwriting it
history += fit(10, model, train_dl, val_dl, optimizer, exp_lr_scheduler)

In [None]:
history

## Plotting

In [None]:
def plot_losses(history):
    losses = [x['val_loss'] for x in history]
    plt.plot(losses, '-x')
    plt.xlabel('epoch')
    plt.ylabel('loss')
    plt.title('Loss vs. No. of epochs')

In [None]:
def plot_accuracies(history):
    accuracies = [x['val_acc'] for x in history]
    plt.plot(accuracies, '-x')
    plt.xlabel('epoch')
    plt.ylabel('accuracy')
    plt.title('Accuracy vs. No. of epochs')

In [None]:
plot_losses(history)

In [None]:
history