
# Advanced Certification in AIML
## A Program by IIIT-H and TalentSprint

## Learning Objectives

At the end of the experiment, you will be able to:

* reduce overfitting using regularization methods

In [None]:
#@title Experiment Explanation Video
from IPython.display import HTML

HTML("""<video width="850" height="480" controls>
  <source src="https://cdn.iiith.talentsprint.com/aiml/Experiment_related_data/Walkthrough/Walkthrough_Overfitting_Ants_Bees.mp4" type="video/mp4">
</video>
""")

## Dataset

### Description

For this experiment we have chosen a dataset that is a subset of ImageNet, containing images of ants and bees. The dataset contains 244 training images and 153 validation images.

![alt text]( https://cdn.talentsprint.com/aiml/Experiment_related_data/IMAGES/15.png)



In [None]:
! wget https://cdn.iiith.talentsprint.com/aiml/Experiment_related_data/hymenoptera_data.zip
! unzip /content/hymenoptera_data.zip

### Importing the required packages

In [None]:
import torch
from torch import nn
import torchvision
from torchvision import datasets, transforms
from torch import optim
import matplotlib.pyplot as plt

### Defining Transformation


In [None]:
image_size = (128,128)
# Define Transformation for an image
transformations = transforms.Compose([
                                transforms.Resize(image_size), 
                                transforms.Grayscale(),
                                transforms.ToTensor(), 
                                transforms.Normalize((0.5,), (0.5,))
                                ])
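As a rough illustration of the last step in the pipeline above, `Normalize((0.5,), (0.5,))` subtracts the mean and divides by the standard deviation for each channel. The helper `normalize` below is a hypothetical stand-in written for this sketch, not the torchvision implementation. It assumes the pixel values have already been scaled to [0, 1] by `ToTensor()`.

```python
# Hypothetical stand-in for Normalize(mean, std): computes (x - mean) / std.
def normalize(x, mean=0.5, std=0.5):
    return (x - mean) / std

# ToTensor() scales 8-bit pixels to [0, 1]; check the two extremes and the midpoint.
for pixel in (0.0, 0.5, 1.0):
    print(pixel, "->", normalize(pixel))
# 0.0 -> -1.0
# 0.5 -> 0.0
# 1.0 -> 1.0
```

With a mean and standard deviation of 0.5, the inputs end up in the range [-1, 1], centred on zero.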

### Data Loading


The **torch.utils.data.DataLoader** class represents a Python iterable over a dataset, with the following features:

1. Batching the data
2. Shuffling the data


The batches of training and validation data are provided by data loaders, which supply iterators over the datasets used to train and evaluate our models.
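The batching behaviour can be sketched on a stand-in dataset. The `TensorDataset` of 244 dummy items below is an assumption made for illustration (it mirrors the size of the training set but contains no images). It shows how a batch size of 100 splits the data into three batches, the last one smaller.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in dataset: 244 dummy items (one feature each) with dummy labels.
features = torch.arange(244, dtype=torch.float32).unsqueeze(1)
labels = torch.zeros(244, dtype=torch.long)
toy_set = TensorDataset(features, labels)

# Same settings as the loaders in this experiment: batches of 100, reshuffled every epoch.
toy_loader = DataLoader(toy_set, batch_size=100, shuffle=True)

batch_sizes = [len(batch_labels) for _, batch_labels in toy_loader]
print(batch_sizes)      # [100, 100, 44]
print(len(toy_loader))  # 3
```

Because `drop_last` defaults to `False`, the final, smaller batch of 44 items is kept.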

In [None]:
train_set = datasets.ImageFolder('/content/hymenoptera_data/train', transform = transformations)
trainloader = torch.utils.data.DataLoader(train_set, batch_size=100, shuffle=True)

val_set = datasets.ImageFolder('/content/hymenoptera_data/val',transform=transformations)
val_loader = torch.utils.data.DataLoader(val_set, batch_size=100, shuffle=True)

In [None]:
val_set

Dataset ImageFolder
    Number of datapoints: 153
    Root location: /content/hymenoptera_data/val
    StandardTransform
Transform: Compose(
               Resize(size=(128, 128), interpolation=bilinear, max_size=None, antialias=None)
               Grayscale(num_output_channels=1)
               ToTensor()
               Normalize(mean=(0.5,), std=(0.5,))
           )

### Defining the Architecture

In PyTorch, neural networks are defined as classes that inherit from nn.Module.

Now let us define a neural network. The class has two methods: \__init__ and forward.

In \__init__, we define the layers using the modules provided by the nn package. The forward method is invoked when the network is called on a batch of inputs, and it passes that input through the layers that have been defined.
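A small shape check may help connect the image size to the first layer's input size. The batch of four blank images and the single `nn.Linear` layer below are assumptions made for illustration; the layer merely stands in for the full network.

```python
import torch
from torch import nn

# Hypothetical batch of 4 grayscale images: (batch, channels, height, width).
batch = torch.zeros(4, 1, 128, 128)

# Flatten everything except the batch dimension, as forward() does with view().
flat = batch.view(batch.shape[0], -1)
print(flat.shape)  # torch.Size([4, 16384])

# One linear layer standing in for the network: each row becomes 2 class scores.
layer = nn.Linear(16384, 2)
scores = layer(flat)
print(scores.shape)  # torch.Size([4, 2])
```

A 128 x 128 single-channel image flattens to 16384 values, which is why the first linear layer expects 16384 input features.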




In [None]:
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()

        self.linear1 = nn.Linear(16384,4096)
        self.linear2 = nn.Linear(4096,1024)
        self.linear3 = nn.Linear(1024,256)
        self.linear4 = nn.Linear(256,10)
        self.linear5 = nn.Linear(10,2)
    
    def forward(self, x):
        out = x.view(x.shape[0],-1)
        out = self.linear1(out)
        out = self.linear2(out)
        out = self.linear3(out)
        out = self.linear4(out)
        out = self.linear5(out)
        return out

### Creating an instance of the network

Let us create an instance of the Model class and move it to the GPU if CUDA is available:

In [None]:
# Instantiate the model on the GPU if CUDA is available, otherwise on the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = Model().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr = 0.001)

### Training and Testing the model

In the training phase, we iterate over batches of images from trainloader. For each batch, we perform the following steps:

* First we zero out the gradients using zero_grad()

* We pass the data to the model, i.e. we perform the forward pass by calling the model, which invokes forward()

* We calculate the loss using the actual and predicted labels

* We perform the backward pass using backward() to compute the gradients, and then call optimizer.step() to update the weights
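The steps above can be sketched on a toy problem. The tiny `nn.Linear` model and the random inputs below are assumptions made for illustration, not the network used in this experiment. The sketch checks that `backward()` only fills in the gradients, while the weights change once `optimizer.step()` is called.

```python
import torch
from torch import nn, optim

torch.manual_seed(0)

# Toy stand-ins for the real model and data (hypothetical sizes).
toy_model = nn.Linear(4, 2)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(toy_model.parameters(), lr=0.001)
inputs = torch.randn(8, 4)
targets = torch.randint(0, 2, (8,))

weights_before = toy_model.weight.detach().clone()

optimizer.zero_grad()               # 1. clear gradients from the previous step
outputs = toy_model(inputs)         # 2. forward pass
loss = criterion(outputs, targets)  # 3. loss from predictions and labels
loss.backward()                     # 4. backward pass: computes gradients only

unchanged_after_backward = torch.equal(toy_model.weight, weights_before)
has_gradients = toy_model.weight.grad is not None

optimizer.step()                    # 5. the optimizer applies the update
changed_after_step = not torch.equal(toy_model.weight, weights_before)

print(unchanged_after_backward, has_gradients, changed_after_step)
```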

In [None]:
# No of Epochs
epoch = 20

# keeping the network in train mode
model.train()
train_losses,  train_accuracy = [], []
val_losses , val_accuracy = [], []

# Loop for no of epochs
for e in range(epoch):
    train_loss = 0
    correct = 0

    # Iterate through all the batches in each epoch
    for images, labels in trainloader:

      # Move the images and labels to the selected device
      images = images.to(device)
      labels = labels.to(device)

      # Zero the parameter gradients
      optimizer.zero_grad()

      # Passing the data to the model (Forward Pass)
      outputs = model(images)

      # Calculating the loss
      loss = criterion(outputs, labels)
      train_loss += loss.item()

      # Performing backward pass (Backpropagation)
      loss.backward()

      # optimizer.step() updates the weights accordingly
      optimizer.step()

      # Accuracy calculation
      _, predicted = torch.max(outputs, 1)
      correct += (predicted == labels).sum().item()
      
    val_loss = 0
    val_correct = 0
    with torch.no_grad():
        # Loop through all of the validation set
        for images, labels in val_loader:
            images = images.to(device)
            labels = labels.to(device)
            val_output = model(images)
            # .item() converts the result to a Python number so the lists hold plain numbers, not tensors on the device
            val_loss += criterion(val_output, labels).item()
            _, predicted = torch.max(val_output, 1)
            val_correct += (predicted == labels).sum().item()

    train_losses.append(train_loss/len(train_set))
    val_losses.append(val_loss/len(val_set))
    train_accuracy.append(100 * correct/len(train_set))
    val_accuracy.append(100 * val_correct/len(val_set))
    print('epoch: {}, Train Loss:{:.6f} Validation Loss {:.6f} Train Accuracy: {:.2f}, Validation accuracy {:.2f} '.format(e+1,train_losses[-1], val_losses[-1], train_accuracy[-1], val_accuracy[-1]))

epoch: 1, Train Loss:0.192346 Validation Loss 0.049841 Train Accuracy: 56.56, Validation accuracy 59.48 
epoch: 2, Train Loss:0.070110 Validation Loss 0.103071 Train Accuracy: 52.87, Validation accuracy 45.10 
epoch: 3, Train Loss:0.051702 Validation Loss 0.032211 Train Accuracy: 50.41, Validation accuracy 49.67 
epoch: 4, Train Loss:0.022001 Validation Loss 0.018249 Train Accuracy: 47.54, Validation accuracy 64.71 
epoch: 5, Train Loss:0.026950 Validation Loss 0.027135 Train Accuracy: 63.52, Validation accuracy 55.56 
epoch: 6, Train Loss:0.020776 Validation Loss 0.023206 Train Accuracy: 62.70, Validation accuracy 53.59 
epoch: 7, Train Loss:0.019316 Validation Loss 0.017162 Train Accuracy: 59.43, Validation accuracy 64.05 
epoch: 8, Train Loss:0.015065 Validation Loss 0.015178 Train Accuracy: 65.16, Validation accuracy 45.10 
epoch: 9, Train Loss:0.010211 Validation Loss 0.012411 Train Accuracy: 60.25, Validation accuracy 60.13 
epoch: 10, Train Loss:0.007451 Validation Loss 0.011305

### Data Augmentation



A larger, more diverse dataset is one of the most effective ways to reduce overfitting. Data augmentation increases the effective size of your dataset by applying operations such as flipping, cropping, rotation, scaling and translation to the existing images. Besides increasing the dataset size, it exposes the model to different angles and lighting conditions and reduces bias in the dataset, which lowers the chance of overfitting.

Two more transformations are added to the original pipeline:


*   A random rotation by an angle drawn from $[-45^\circ, 45^\circ]$ using **`transforms.RandomRotation(45)`**
*   A random vertical flip, applied with the default probability of 0.5, using **`transforms.RandomVerticalFlip()`**




In [None]:
image_size = (128,128)
transformations = transforms.Compose([
                                transforms.Resize(image_size), 
                                transforms.Grayscale(),
                                transforms.RandomRotation(45),
                                transforms.RandomVerticalFlip(),
                                transforms.ToTensor(), 
                                transforms.Normalize((0.5,), (0.5,)),
                                ])

In [None]:
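train_set = datasets.ImageFolder('/content/hymenoptera_data/train', transform = transformations)
trainloader = torch.utils.data.DataLoader(train_set, batch_size=100, shuffle=True, num_workers=8)

# Validation images get only the deterministic steps, so random augmentation does not distort the evaluation
val_transformations = transforms.Compose([transforms.Resize(image_size), transforms.Grayscale(),
                                          transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])
val_set = datasets.ImageFolder('/content/hymenoptera_data/val',transform=val_transformations)
val_loader = torch.utils.data.DataLoader(val_set, batch_size=100, shuffle=True, num_workers=8)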

#### Regularization

Dropout: Regularization techniques help prevent the model from overfitting. Dropout does this by modifying the network itself: during training, every neuron apart from the ones in the output layer is temporarily ignored in the calculations with probability p. p is called the dropout rate and is set to 0.2 here. In PyTorch, nn.Dropout randomly zeroes elements of its input tensor with probability p during training.
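The behaviour of `nn.Dropout` can be sketched on a vector of ones. The input of 1000 ones is an assumption made for illustration. In training mode roughly 20% of the elements are zeroed and the survivors are scaled by 1 / (1 - p) = 1.25, while in evaluation mode the layer passes its input through unchanged.

```python
import torch
from torch import nn

torch.manual_seed(0)

dropout = nn.Dropout(p=0.2)
x = torch.ones(1000)

# Training mode: elements are zeroed at random, the rest are scaled by 1 / (1 - p).
dropout.train()
train_out = dropout(x)
kept = train_out[train_out != 0]
dropped_fraction = (train_out == 0).float().mean().item()
print(kept.unique())       # the surviving values, all scaled to 1.25
print(dropped_fraction)    # close to 0.2

# Evaluation mode: dropout is disabled and the input is returned as is.
dropout.eval()
eval_out = dropout(x)
print(torch.equal(eval_out, x))  # True
```

This is why a network containing dropout is usually switched to evaluation mode with `model.eval()` before computing validation metrics and back to `model.train()` before training resumes.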

### Optimize the Architecture

In [None]:
class Optimized_Model(nn.Module):
    def __init__(self):
        super(Optimized_Model, self).__init__()

        self.linear1 = nn.Linear(16384,4096)
        self.linear2 = nn.Linear(4096,1024)
        self.linear3 = nn.Linear(1024,256)
        self.linear4 = nn.Linear(256,10)
        self.linear5 = nn.Linear(10,2)
        self.dropout = nn.Dropout(0.2)

    def forward(self, x):
        out = x.view(x.shape[0],-1)
        out = self.linear1(out)
        out = self.linear2(out)
        out = self.linear3(out)
        # Apply dropout to the last hidden layer rather than to the output layer
        out = self.dropout(self.linear4(out))
        out = self.linear5(out)
        return out

#### Initialize the optimized model

In [None]:
# Instantiate the model on the GPU if CUDA is available, otherwise on the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model2 = Optimized_Model().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model2.parameters(), lr = 0.001)

### Training the optimized model

In the training phase, we iterate over batches of images from trainloader. For each batch, we perform the following steps:

* First we zero out the gradients using zero_grad()

* We pass the data to the model, i.e. we perform the forward pass by calling the model, which invokes forward()

* We calculate the loss using the actual and predicted labels

* We perform the backward pass using backward() to compute the gradients, and then call optimizer.step() to update the weights

In [None]:
# No of Epochs
epoch = 20

model2.train()

train_losses_opt,  train_accuracy_opt = [], []
val_losses_opt , val_accuracy_opt = [], []
    
for e in range(epoch):
    otrain_loss = 0
    ocorrect = 0
    # Iterate through all the batches in each epoch
    for images, labels in trainloader:
      
      # Move the images and labels to the selected device
      images = images.to(device)
      labels = labels.to(device)
      
      # Zero the parameter gradients
      optimizer.zero_grad()
      
      # Passing the data to the model (Forward Pass)
      outputs = model2(images)
      
      # Calculating the loss
      loss = criterion(outputs, labels)
      otrain_loss += loss.item()

      # Performing backward pass (Backpropagation)
      loss.backward()

      # optimizer.step() updates the weights accordingly
      optimizer.step()

      # Accuracy calculation
      _, predicted = torch.max(outputs, 1)
      ocorrect += (predicted == labels).sum().item()
      
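    oval_loss = 0
    oval_correct = 0
    # Switch to evaluation mode so that dropout is disabled during validation
    model2.eval()
    with torch.no_grad():
        # Loop through all of the validation set
        for images, labels in val_loader:
            images = images.to(device)
            labels = labels.to(device)
            val_output = model2(images)
            # .item() converts the result to a Python number so the lists hold plain numbers, not tensors on the device
            oval_loss += criterion(val_output, labels).item()
            _, predicted = torch.max(val_output, 1)
            oval_correct += (predicted == labels).sum().item()
    # Switch back to training mode for the next epoch
    model2.train()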

    train_losses_opt.append(otrain_loss/len(train_set))
    val_losses_opt.append(oval_loss/len(val_set))
    train_accuracy_opt.append(100 * ocorrect/len(train_set))
    val_accuracy_opt.append(100 * oval_correct/len(val_set))
    print('epoch: {}, Train Loss:{:.6f} Test Loss {:.6f} Train Accuracy: {:.2f}, Test accuracy {:.2f} '.format(e+1,train_losses_opt[-1], val_losses_opt[-1], train_accuracy_opt[-1], val_accuracy_opt[-1]))

epoch: 1, Train Loss:0.116887 Test Loss 0.316489 Train Accuracy: 51.64, Test accuracy 45.75 
epoch: 2, Train Loss:0.166032 Test Loss 0.086978 Train Accuracy: 42.21, Test accuracy 60.78 
epoch: 3, Train Loss:0.086636 Test Loss 0.031068 Train Accuracy: 52.87, Test accuracy 46.41 
epoch: 4, Train Loss:0.036917 Test Loss 0.039632 Train Accuracy: 53.28, Test accuracy 44.44 
epoch: 5, Train Loss:0.042593 Test Loss 0.033842 Train Accuracy: 54.51, Test accuracy 50.33 
epoch: 6, Train Loss:0.034824 Test Loss 0.023810 Train Accuracy: 47.13, Test accuracy 47.71 
epoch: 7, Train Loss:0.032700 Test Loss 0.030458 Train Accuracy: 51.23, Test accuracy 56.21 
epoch: 8, Train Loss:0.027614 Test Loss 0.030973 Train Accuracy: 48.77, Test accuracy 52.29 
epoch: 9, Train Loss:0.031312 Test Loss 0.012686 Train Accuracy: 52.05, Test accuracy 43.14 
epoch: 10, Train Loss:0.018431 Test Loss 0.015030 Train Accuracy: 47.13, Test accuracy 45.75 
epoch: 11, Train Loss:0.011291 Test Loss 0.013979 Train Accuracy: 52.