<a href="https://colab.research.google.com/github/Nagaraj-gt/applications-artificial-intelligence/blob/main/q2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

*italicized text*# PART B (3) : Image Classification

***Finetune the vgg19 model on this dataset, train a 15-class classification model and report 
per-class classification accuracy in terms of precision and recall. 
 Submit q2.py. [10 marks]***



**TEAM MEMBERS:**

 Nagaraj G T	 12120095

 Yashaswi Singh	 12120064

 Madhab Chakraborty	 12120045

 Rama Gangadhar Durvasula	 12120087

 Parmarth matta	 12120077



In [1]:
from torchvision import datasets, models, transforms

In [2]:
# Pre=processing the input images to match images with what was presented during training period

preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(
            mean=[0.485, 0.456, 0.406],
            std=[0.229, 0.224, 0.225]
        )])

In [None]:
# Import Train and Test Images. The Assignment data is split into train (1 to 40) and validation (rest) sets manually and uploaded in GIT

!wget 'https://github.com/Nagaraj-gt/applications-artificial-intelligence/raw/main/dataset.zip'
!unzip dataset.zip


In [4]:
# Importing required libraries for training

import os
import torch

In [5]:
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
import torch.backends.cudnn as cudnn
cudnn.benchmark = True

In [6]:

data_transforms = {
    'train': transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'val': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
}

data_dir = '/content/dataset'
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x),
                                          data_transforms[x])
                  for x in ['train', 'val']}
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=4,
                                             shuffle=True, num_workers=4)
              for x in ['train', 'val']}
dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'val']}
class_names = image_datasets['train'].classes

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")



In [7]:
# Method to train pre-trained model with domain specific data
import copy
import time

def train_model(model, criterion, optimizer, scheduler, num_epochs=25):
    since = time.time()

    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.0

    for epoch in range(num_epochs):
        print(f'Epoch {epoch}/{num_epochs - 1}')
        print('-' * 10)

        # Each epoch has a training and validation phase
        for phase in ['train', 'val']:
            if phase == 'train':
                model.train()  # Set model to training mode
            else:
                model.eval()   # Set model to evaluate mode

            running_loss = 0.0
            running_corrects = 0

            # Iterate over data.
            for inputs, labels in dataloaders[phase]:
                inputs = inputs.to(device)
                labels = labels.to(device)

                # zero the parameter gradients
                optimizer.zero_grad()

                # forward
                # track history if only in train
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    _, preds = torch.max(outputs, 1)
                    loss = criterion(outputs, labels)

                    # backward + optimize only if in training phase
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

                # statistics
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)
            if phase == 'train':
                scheduler.step()

            epoch_loss = running_loss / dataset_sizes[phase]
            epoch_acc = running_corrects.double() / dataset_sizes[phase]

            print(f'{phase} Loss: {epoch_loss:.4f} Acc: {epoch_acc:.4f}')

            # deep copy the model
            if phase == 'val' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())

        print()

    time_elapsed = time.time() - since
    print(f'Training complete in {time_elapsed // 60:.0f}m {time_elapsed % 60:.0f}s')
    print(f'Best val Acc: {best_acc:4f}')

    # load best model weights
    model.load_state_dict(best_model_wts)
    return model

In [8]:
# Method to visualize Model Predictions

# Visualize few images
import torchvision
import matplotlib.pyplot as plt
import numpy as np

plt.ion()

def visualize_model(model, num_images=6):
    was_training = model.training
    model.eval()
    images_so_far = 0
    fig = plt.figure()

    with torch.no_grad():
        for i, (inputs, labels) in enumerate(dataloaders['val']):
            inputs = inputs.to(device)
            labels = labels.to(device)

            outputs = model(inputs)
            _, preds = torch.max(outputs, 1)

            for j in range(inputs.size()[0]):
                images_so_far += 1
                ax = plt.subplot(num_images//2, 2, images_so_far)
                ax.axis('off')
                ax.set_title(f'predicted: {class_names[preds[j]]}')
                imshow(inputs.cpu().data[j])

                if images_so_far == num_images:
                    model.train(mode=was_training)
                    return
        model.train(mode=was_training)

## VGG 19 Model


In [9]:
# Load Pretrained VGG Model and reset final fully connected Layer for training

vgg_model = models.vgg19(pretrained=True)
num_ftrs = vgg_model.classifier[0].in_features
# Alternatively, it can be generalized to nn.Linear(num_ftrs, len(class_names)).
vgg_model.fc = nn.Linear(num_ftrs, len(class_names))

model_ft = vgg_model.to(device)

criterion = nn.CrossEntropyLoss()

# Observe that all parameters are being optimized
optimizer_ft = optim.SGD(model_ft.parameters(), lr=0.001, momentum=0.9)

# Decay LR by a factor of 0.1 every 7 epochs
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)

Downloading: "https://download.pytorch.org/models/vgg19-dcbb9e9d.pth" to /root/.cache/torch/hub/checkpoints/vgg19-dcbb9e9d.pth


  0%|          | 0.00/548M [00:00<?, ?B/s]

In [10]:
# Train and Evaluate

model_ft = train_model(model_ft, criterion, optimizer_ft, exp_lr_scheduler,num_epochs=5)

Epoch 0/4
----------
train Loss: 3.7005 Acc: 0.0650
val Loss: 3.0080 Acc: 0.1122

Epoch 1/4
----------
train Loss: 2.7745 Acc: 0.1567
val Loss: 2.3781 Acc: 0.2390

Epoch 2/4
----------
train Loss: 2.3941 Acc: 0.2567
val Loss: 2.1521 Acc: 0.2732

Epoch 3/4
----------
train Loss: 1.8525 Acc: 0.4133
val Loss: 1.8537 Acc: 0.4293

Epoch 4/4
----------
train Loss: 1.6684 Acc: 0.4983
val Loss: 1.2780 Acc: 0.5951

Training complete in 1m 53s
Best val Acc: 0.595122


In [11]:
# Per class classification accuracy report in terms of precision and recall

import pandas as pd
def print_accuracy_matrix(model):
  nb_classes = 15

  confusion_matrix = torch.zeros(nb_classes, nb_classes)
  with torch.no_grad():
      for i, (inputs, classes) in enumerate(dataloaders['val']):
          inputs = inputs.to(device)
          classes = classes.to(device)
          
          outputs = model(inputs)
          _, preds = torch.max(outputs, 1)
          for t, p in zip(classes.view(-1), preds.view(-1)): 
                  confusion_matrix[t.long(), p.long()] += 1

  precision = confusion_matrix.diag()/confusion_matrix.sum(1)
  recall = confusion_matrix.diag()/confusion_matrix.sum(0)
  res_accuracy = pd.DataFrame(list(zip(class_names,precision.tolist(), recall.tolist())), columns=['Class', 'Precision', 'Recall'])

  print(res_accuracy)

In [12]:
print_accuracy_matrix(model_ft)



             Class  Precision    Recall
0        accordion   1.000000  1.000000
1             bass   0.571429  0.533333
2           camera   1.000000  0.833333
3        crocodile   0.700000  0.388889
4   crocodile_head   0.000000       NaN
5              cup   0.176471  1.000000
6      dollar_bill   1.000000  0.375000
7              emu   0.692308  0.600000
8       gramophone   1.000000  0.333333
9         hedgehog   0.785714  0.611111
10        nautilus   0.266667  1.000000
11           pizza   0.384615  1.000000
12         pyramid   0.294118  1.000000
13       sea_horse   0.352941  0.545455
14   windsor_chair   1.000000  0.842105


CONCLUSION : The accuracy of VGG19 is about 72 - 75%. 

The accuracy is around 100% for precision and recall for certain classes like accordion and windsor_chair. However crocodile , bass its pretty bad !

THis model needs to be fine tuned !


# FINETUNING OF VGG 19 MODEL FOR BETTER ACCURACIES

**Finetuning by Gradual unfreeze of the VGG layers to increase accuracy cannot be adopted. We have initially adopted strategy to train all layers and not just the final FC layer. The pretrained value is derived only to initialize the weights from pre-trained model rather than random**

### Solution 1 : Increase Epochs during training time

In [13]:
# Train and Evaluate

model_ft = train_model(model_ft, criterion, optimizer_ft, exp_lr_scheduler,num_epochs=15)

Epoch 0/14
----------




train Loss: 1.2583 Acc: 0.6050
val Loss: 0.7328 Acc: 0.7463

Epoch 1/14
----------
train Loss: 0.9617 Acc: 0.7067
val Loss: 0.4727 Acc: 0.8390

Epoch 2/14
----------
train Loss: 0.5877 Acc: 0.8250
val Loss: 0.4099 Acc: 0.8683

Epoch 3/14
----------
train Loss: 0.4973 Acc: 0.8233
val Loss: 0.3463 Acc: 0.8878

Epoch 4/14
----------
train Loss: 0.4914 Acc: 0.8333
val Loss: 0.3267 Acc: 0.9024

Epoch 5/14
----------
train Loss: 0.4027 Acc: 0.8533
val Loss: 0.3190 Acc: 0.8976

Epoch 6/14
----------
train Loss: 0.3919 Acc: 0.8567
val Loss: 0.3029 Acc: 0.9024

Epoch 7/14
----------
train Loss: 0.3799 Acc: 0.8883
val Loss: 0.3066 Acc: 0.8927

Epoch 8/14
----------
train Loss: 0.3723 Acc: 0.8683
val Loss: 0.2916 Acc: 0.9073

Epoch 9/14
----------
train Loss: 0.3578 Acc: 0.8817
val Loss: 0.2783 Acc: 0.9073

Epoch 10/14
----------
train Loss: 0.3375 Acc: 0.8850
val Loss: 0.2734 Acc: 0.9073

Epoch 11/14
----------
train Loss: 0.2549 Acc: 0.9167
val Loss: 0.2716 Acc: 0.9122

Epoch 12/14
----------
t

So the best value accuracy with increased Epochs os 0.93. This is indeed good. 

In [14]:
print_accuracy_matrix(model_ft)

             Class  Precision    Recall
0        accordion   1.000000  0.937500
1             bass   0.928571  0.866667
2           camera   1.000000  1.000000
3        crocodile   0.700000  0.777778
4   crocodile_head   0.727273  0.727273
5              cup   1.000000  1.000000
6      dollar_bill   1.000000  1.000000
7              emu   1.000000  0.928571
8       gramophone   1.000000  0.733333
9         hedgehog   0.928571  0.928571
10        nautilus   0.933333  1.000000
11           pizza   1.000000  0.928571
12         pyramid   0.823529  1.000000
13       sea_horse   0.823529  0.933333
14   windsor_chair   0.875000  0.933333


The accuracy matrix for precision and recall has significantly increased for per class accuracies.

### Solution 2 : Decrease learning rate

In [15]:
## Optimize at 1/10th slow learning rate

In [18]:
# Observe that all parameters are being optimized
slow_optimizer_ft = optim.SGD(model_ft.parameters(), lr=0.0001, momentum=0.9)

# Decay LR by a factor of 0.1 every 7 epochs
slow_exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)

In [19]:
slow_model_ft = train_model(model_ft, criterion, slow_optimizer_ft, slow_exp_lr_scheduler,num_epochs=5)

Epoch 0/4
----------




train Loss: 0.3146 Acc: 0.8933




val Loss: 0.2448 Acc: 0.9268

Epoch 1/4
----------
train Loss: 0.2831 Acc: 0.8950
val Loss: 0.2562 Acc: 0.9024

Epoch 2/4
----------
train Loss: 0.2781 Acc: 0.9100
val Loss: 0.2285 Acc: 0.9317

Epoch 3/4
----------
train Loss: 0.3608 Acc: 0.8850
val Loss: 0.2156 Acc: 0.9317

Epoch 4/4
----------
train Loss: 0.2661 Acc: 0.8967
val Loss: 0.2173 Acc: 0.9122

Training complete in 1m 45s
Best val Acc: 0.931707


In [20]:
print_accuracy_matrix(slow_model_ft)

             Class  Precision    Recall
0        accordion   1.000000  1.000000
1             bass   0.928571  0.928571
2           camera   1.000000  1.000000
3        crocodile   0.800000  0.727273
4   crocodile_head   0.727273  0.800000
5              cup   1.000000  1.000000
6      dollar_bill   1.000000  1.000000
7              emu   1.000000  0.928571
8       gramophone   1.000000  0.733333
9         hedgehog   1.000000  0.933333
10        nautilus   1.000000  1.000000
11           pizza   0.923077  1.000000
12         pyramid   0.941176  0.941176
13       sea_horse   0.823529  1.000000
14   windsor_chair   0.812500  0.928571


## CONCLUSION

The fine tuning with either increased Epochs or slow learning rates results in better accuracy about 93%.

The accuracy matrix of precision and Recall is marginally better with slow running rate, especially crocodile.

Next step of improvement would be to further reduce learning rate with higher epochs , may be, 25.