## Lab4: Introduction to Convolutional Layers

Authors: Aakash Kaku, Lee Tanenbaum

The goal of this lab is to understand how to train a convolutional neural network using PyTorch. Much of the starter code is provided; you are expected to fill in the network details and the sections marked TO DO.

The dataset we will analyze is a small sample of the NIH Chest X-rays dataset, found here: https://www.kaggle.com/nih-chest-xrays/sample . Please create a Kaggle account to download the data (or access it in a class folder if we can get server use set up). The original images have a resolution of 1024x1024, but to reduce the computational cost we resize them to 64x64.

The task at hand is to treat the dataset as a binary/multiclass classification problem with image inputs. We propose to build a model that is a series of spatial convolutional layers, activation functions, and pooling layers.

Before we get into the code, let's think for a bit about model selection. What are the necessary choices?

Number of hidden layers?

For each layer:

    Number of filters?

    Size of kernel?

    Size of padding? (maybe (kernel - 1) / 2)
    
    Stride of layer?
    
    Activation after layer:
    
        Some type of relu, tanh, sigmoid?
        
        Maybe add Batch normalization before the activation function?
    
    Maybe a pooling layer instead of a convolutional layer to decrease spatial dimension?

    Learning Rate?

    Momentum parameters for optimizers such as Adam?

Other training techniques such as adding noise to input or hidden layers?

Image specific techniques such as random rotations or blurring of the image?

Other optional enhancements:

    Let us know if you have any ideas; there is a practically unlimited number of enhancements that can help these types of models learn.
    
Before you start to write code, try to have a choice of these hyperparameters in mind so that you can implement them.
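
When choosing kernel size, padding, and stride, it helps to know how each choice changes the spatial size of the feature map. The sketch below uses the standard output-size formula for a convolution or pooling layer with dilation 1; the helper name `conv_output_size` is ours and is not part of the starter code.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer (assumes dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1

kernel = 3
# For an odd kernel at stride 1, padding = (kernel - 1) / 2 keeps the size unchanged
padding = (kernel - 1) // 2

same_size = conv_output_size(64, kernel, stride=1, padding=padding)
halved = conv_output_size(64, kernel, stride=2, padding=padding)
pooled = conv_output_size(64, kernel, stride=2, padding=0)
print(same_size, halved, pooled)  # 64 32 31
```

Tracing the sizes layer by layer like this is a quick way to check that a 64x64 input does not shrink to nothing before the final layer.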

In [None]:
#Import common dependencies
import torch
import pandas as pd, numpy as np, matplotlib, matplotlib.pyplot as plt
from PIL import Image 
from torch import nn
from torch.utils.data import Dataset, DataLoader
import torch.optim as optim
import time
import torch.nn.functional as F

In [None]:
# is_available is a function; without the call parentheses the condition is always true
if torch.cuda.is_available():
    device = torch.device('cuda')
else:
    device = torch.device('cpu')

## Dataset Selection

### First, read in the sample labels which we will use as a lookup table while loading the data

In [None]:
label_df = pd.read_csv('sample_labels.csv').iloc[:, :2]
label_df.head()

In [None]:
label_df['Disease']=(label_df['Finding Labels'] != 'No Finding').astype(int)
print(label_df.head())
num_rows = 1000
label_df = label_df.iloc[:num_rows,:]

# define train, val and test idx
idx = np.arange(num_rows)
np.random.shuffle(idx)
train_size = 600
val_size = 200
test_size = 200
train_idx = idx[:train_size]
val_idx = idx[train_size:train_size+val_size]
test_idx = idx[train_size+val_size:]

# get train, val and test dataframes
train_df = label_df.iloc[train_idx,:]
val_df = label_df.iloc[val_idx,:]
test_df = label_df.iloc[test_idx,:]

# save the dataframes
train_df.to_csv('train.csv', index = False)
val_df.to_csv('val.csv', index = False)
test_df.to_csv('test.csv', index = False)

### We build a data loader to load the images efficiently and optionally apply data augmentation on the fly

In [None]:
class Xray_dataset(Dataset):
    '''X-ray Dataset'''
    def __init__(self, df_path, train = False):
        self.df = pd.read_csv(df_path)
        self.train = train
        
    def __len__(self):
        return len(self.df)
    
    def __getitem__(self,idx):
        
        file_name = self.df.iloc[idx,0]
        label = self.df.iloc[idx,-1]
        img = Image.open('./images/'+file_name)
        img = img.resize((64,64))
        
        if self.train:
            # rotate the image (data augmentation)
            # TO DO
            #Step 1: generate a random number
            #Step 2: Check if the random number is greater than some threshold (0.7)
            #Step 3: If yes for Step 2, then generate a random rotation angle between -10 and 10 degrees
            #Step 4: Rotate the image using the rotation angle (check the PIL library for the command to rotate imgs)
                
        img = np.asarray(img)
        min_image = np.min(img)
        max_image = np.max(img)
        img = (img - min_image)/(max_image - min_image + 1e-4)
        
        img = torch.tensor(img).unsqueeze(0).float()
        label = torch.tensor(label).long()
        if img.dim() != 3:
            img = img[:,:,:,0]
        
        return img, label
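
The rotation steps listed in `__getitem__` could be sketched as a small standalone helper. This is one possible approach, not the required solution: the function name `maybe_rotate` and its parameters are hypothetical, and it relies on `Image.rotate` from PIL, which keeps the image size unchanged by default.

```python
import random
from PIL import Image

def maybe_rotate(img, threshold=0.7, max_angle=10.0):
    """Rotate a PIL image by a random angle when a random draw exceeds threshold."""
    if random.random() > threshold:
        # Random angle in degrees, between -max_angle and +max_angle
        angle = random.uniform(-max_angle, max_angle)
        img = img.rotate(angle)
    return img

# Demo on a blank grayscale image standing in for an X-ray
demo = Image.new('L', (64, 64), color=128)
rotated = maybe_rotate(demo)
print(rotated.size)  # (64, 64)
```

With a threshold of 0.7, roughly 30% of training images are rotated on each pass through the data.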


In [None]:
train_df_path = './train.csv'
val_df_path = './val.csv'
test_df_path = './test.csv'
transformed_dataset = {'train': Xray_dataset(train_df_path, train = True),
                       'validate':Xray_dataset(val_df_path),
                       'test':Xray_dataset(test_df_path),
                                          }
bs = 16
dataloader = {x: DataLoader(transformed_dataset[x], batch_size=bs,
                        shuffle=True, num_workers=0) for x in ['train', 'validate','test']}
data_sizes ={x: len(transformed_dataset[x]) for x in ['train', 'validate','test']}

### Check the data loader

In [None]:
sample = next(iter(dataloader['train']))

In [None]:
sample[0].size()

In [None]:
sample[1]

### Visualize the data

In [None]:
sample_img = sample[0][1].squeeze().numpy()
plt.imshow(sample_img, cmap = 'gray')

### Train loop

In [None]:
def train_model(model, dataloader, optimizer, loss_fn, num_epochs = 10, verbose = False):
    acc_dict = {'train':[],'validate':[]}
    loss_dict = {'train':[],'validate':[]}
    best_acc = 0
    phases = ['train','validate']
    since = time.time()
    for i in range(num_epochs):
        print('Epoch: {}/{}'.format(i, num_epochs-1))
        print('-'*10)
        for p in phases:
            running_correct = 0
            running_loss = 0
            running_total = 0
            if p == 'train':
                model.train(True)
            else:
                model.train(False)
                
            for data in dataloader[p]:
                num_imgs = data[0].size(0)  # batch size, read from the images in the batch
                # TO DO
                # Step 1: zero the grad
                # Step 2: get the input and label data (make sure they are on the same device as the model)
                # Step 3: get the output
                # Step 4: Compute the loss
                # Step 5: Get the pred values
                # Step 6: compute the number of corrects for this batch and increment the running_correct
                running_loss += loss.item()*num_imgs
                running_total += num_imgs
                if p == 'train':
                    # TO DO
                    # Step 1: Perform the backward pass
                    # Step 2: Take a gradient step
            epoch_acc = float(running_correct/running_total)
            epoch_loss = float(running_loss/running_total)
            if verbose or (i%10 == 0):
                print('Phase:{}, epoch loss: {:.4f} Acc: {:.4f}'.format(p, epoch_loss, epoch_acc))
            
            acc_dict[p].append(epoch_acc)
            loss_dict[p].append(epoch_loss)
            if p == 'validate':
                # TO DO
                # Save the model's state dict as best_model_wts if epoch_acc is greater than best_acc
                
    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60))
    print('Best val acc: {:.4f}'.format(best_acc))
    
    model.load_state_dict(best_model_wts)
    
    return model, acc_dict, loss_dict
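
The TO DO steps in the training loop follow the usual PyTorch pattern. The sketch below runs a single training step on random data with a hypothetical stand-in model (`toy_model`), so the names and the learning rate are assumptions rather than part of the lab solution. It also shows one way to snapshot the best weights with `copy.deepcopy`.

```python
import copy
import torch
from torch import nn
import torch.optim as optim

torch.manual_seed(0)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Hypothetical stand-in model and random batch, only to make the step runnable
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 2)).to(device)
optimizer = optim.Adam(toy_model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
images = torch.rand(16, 1, 64, 64)
labels = torch.randint(0, 2, (16,))

toy_model.train()
optimizer.zero_grad()                                   # zero the gradients
images, labels = images.to(device), labels.to(device)   # move batch to the model's device
outputs = toy_model(images)                             # forward pass, shape (16, 2)
loss = loss_fn(outputs, labels)                         # compute the loss
preds = outputs.argmax(dim=1)                           # predicted class per image
batch_correct = (preds == labels).sum().item()          # correct predictions in this batch
loss.backward()                                         # backward pass
optimizer.step()                                        # gradient step

# Snapshot the weights; deepcopy prevents later updates from changing the saved copy
best_acc = 0.0
best_model_wts = copy.deepcopy(toy_model.state_dict())
epoch_acc = batch_correct / labels.size(0)
if epoch_acc > best_acc:
    best_acc = epoch_acc
    best_model_wts = copy.deepcopy(toy_model.state_dict())
```

In `train_model`, initializing `best_model_wts` before the epoch loop (as above) avoids an undefined name at `model.load_state_dict(best_model_wts)` if validation accuracy never exceeds the initial `best_acc`.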

### Evaluate Loop

In [None]:
def evaluate_model(model, dataloader,loss_fn, phase = 'validate'):
    model.eval()
    running_correct = 0
    running_loss = 0
    running_total = 0
    for data in dataloader[phase]:
        num_imgs = data[0].size(0)  # batch size, read from the images in the batch
        # TO DO
        # Step 1: get the input and label data (make sure they are on the same device as the model)
        # Step 2: get the output
        # Step 3: Compute the loss
        # Step 4: Get the pred values
        # Step 5: compute the number of corrects for this batch and increment the running_correct
        running_loss += loss.item()*num_imgs
        running_total += num_imgs
    accuracy = float(running_correct/running_total)
    loss = float(running_loss/running_total)
    
    return accuracy, loss
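
The evaluation steps mirror the training steps without the backward pass. The sketch below is a self-contained illustration under assumptions: `evaluate_batches` and `toy_model` are hypothetical names, and the batches are random tensors rather than X-ray images. Wrapping the loop in `torch.no_grad()` is optional for correctness but avoids storing gradients during evaluation.

```python
import torch
from torch import nn

def evaluate_batches(model, batches, loss_fn, device):
    """Return (accuracy, mean loss) over an iterable of (images, labels) batches."""
    model.eval()
    running_correct, running_loss, running_total = 0, 0.0, 0
    with torch.no_grad():  # no gradients are needed for evaluation
        for images, labels in batches:
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            loss = loss_fn(outputs, labels)
            preds = outputs.argmax(dim=1)
            running_correct += (preds == labels).sum().item()
            running_loss += loss.item() * images.size(0)
            running_total += images.size(0)
    return running_correct / running_total, running_loss / running_total

# Hypothetical stand-in model and random data
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 2))
batches = [(torch.rand(4, 1, 64, 64), torch.randint(0, 2, (4,))) for _ in range(3)]
acc, avg_loss = evaluate_batches(toy_model, batches, nn.CrossEntropyLoss(), torch.device('cpu'))
print('accuracy: {:.4f}, loss: {:.4f}'.format(acc, avg_loss))
```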

### Now let's build some models

### First some common functions that could be useful to build the model

In [None]:
help(nn.LeakyReLU)

In [None]:
help(nn.MaxPool2d)

In [None]:
help(nn.Conv2d)

### Define the model class

In [None]:
class Conv_model(nn.Module):
    def __init__(self, kernel_size = 3):
        super(Conv_model,self).__init__()
        # TO DO
        # Step 1: Instantiate Layer 1:
            # Conv layer: Convolution with input channels (i) = 1, output channels (o) = 16, padding (p) = 1,
            #          stride (s) = 1, kernel_size (k) = kernel_size
            # Relu
            
        # Step 2: Instantiate Layer 2:
            # Conv layer: Convolution(i = previous output channels, o = 16, p = 1,s = 1, k = kernel_size)
            # Relu
            
        # Step 3: Instantiate Layer 3:
            # Maxpooling layer (k = kernel_size, s = 2)
        
        # Step 4: Instantiate Layer 4:
            # Conv layer: Convolution(i = previous output channels, o = 32, p = 1,s = 2, k = kernel_size)
            # Relu
            
        # Step 5: Instantiate Layer 5:
            # Conv layer: Convolution(i = previous output channels, o = 32, p = 1,s = 2, k = kernel_size)
            # Relu
        
        # Step 6: Instantiate Layer 6:
            # Maxpooling layer (k = kernel_size, s = 2)
        
        # Step 7: Instantiate Layer 7:
            # Conv layer: Convolution(i = previous output channels, o = 32, p = 1,s = 1, k = kernel_size)
            # Relu
        
        # Step 8: Instantiate Layer 8:
            # Conv layer: Convolution(i = previous output channels, o = 2, p = 1,s = 1, k = kernel_size) 
            
        
    def forward(self,x):
        # TO DO
        # Write the forward pass
        x = F.adaptive_avg_pool2d(x, (1,1))
    
        return x.view(-1,2)
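
To see how convolution, pooling, and the final global average pooling fit together, here is a deliberately smaller network than the one specified above. It is a sketch for illustration only: the class name `TinyConvNet` and its layer choices are assumptions and do not match the eight-layer architecture you are asked to build. The shape comments assume a 64x64 single-channel input.

```python
import torch
from torch import nn
import torch.nn.functional as F

class TinyConvNet(nn.Module):
    def __init__(self, kernel_size=3):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 16, kernel_size=kernel_size, stride=1, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=kernel_size, stride=2)
        # The last conv layer outputs one channel per class
        self.conv2 = nn.Conv2d(16, 2, kernel_size=kernel_size, stride=1, padding=1)

    def forward(self, x):
        x = F.relu(self.conv1(x))             # (N, 16, 64, 64)
        x = self.pool(x)                      # (N, 16, 31, 31)
        x = self.conv2(x)                     # (N, 2, 31, 31)
        x = F.adaptive_avg_pool2d(x, (1, 1))  # (N, 2, 1, 1)
        return x.view(-1, 2)                  # (N, 2) class scores

net = TinyConvNet()
out = net(torch.rand(16, 1, 64, 64))
print(out.shape)  # torch.Size([16, 2])
```

Because the global average pooling collapses whatever spatial size remains, the final `view(-1, 2)` works regardless of how many pooling or strided layers come before it, as long as the last convolution has 2 output channels.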

### Let's train the model

In [None]:
# TO DO
# Step 1: Call the model class. Make sure to send the model to the desired device 
# Step 2: instantiate the Adam optimizer
# Step 3: Use the train_model function to train the model

### Now let's evaluate the model

In [None]:
# TO DO
# Use the evaluate_model function to evaluate the model