<a href="https://colab.research.google.com/github/aditya0811/-100daysofMLcode/blob/master/VGG19_102_classification.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

A 102-class flower classifier built on a pretrained VGG19, using 6552 training images and 818 validation images.

**Outcomes of this project**


*   Trained the model using the VGG19 architecture
*   Saved model parameters for further training
*   About VGG19
*   Choosing a loss function
*   Adding a dropout layer
*   Using transfer learning, where the model is built on a pretrained network such as VGG19 or ResNet
*   Tuning hyperparameters such as the learning rate, optimizer, and loss function






**[VGG19](https://forums.fast.ai/t/vgg-strength-and-limitations/1218)**

*   Only 3x3 convolution and 2x2 pooling are used throughout the whole network. 

*   VGG also shows that the depth of the network plays an important role: in the original experiments, deeper configurations gave better results.

*   The 19 refers to the number of layers with weights in the network: 16 convolutional layers and 3 fully connected layers.
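
The layer count can be checked by walking a VGG19-style layer configuration. This is a minimal sketch: the `VGG19_CFG` list is written out by hand for illustration (integers are convolution output channels, `'M'` is a 2x2 max-pool), and the 224x224 input size is the crop size used in the transforms later in this notebook.

```python
# Hand-written VGG19 layout: integers are 3x3 conv output channels, 'M' is a 2x2 max-pool.
VGG19_CFG = [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M',
             512, 512, 512, 512, 'M', 512, 512, 512, 512, 'M']
NUM_FC_LAYERS = 3  # the original VGG classifier has three fully connected layers


def count_weight_layers(cfg, num_fc):
    """Count layers that carry weights (conv + fully connected); pooling has none."""
    num_conv = sum(1 for item in cfg if item != 'M')
    return num_conv, num_conv + num_fc


def flattened_feature_size(cfg, input_size=224):
    """Size of the flattened feature map fed to the classifier.

    3x3 convs with padding 1 keep the spatial size; each 2x2 pool halves it.
    """
    size = input_size
    channels = 3
    for item in cfg:
        if item == 'M':
            size //= 2
        else:
            channels = item
    return channels * size * size


num_conv, num_weight_layers = count_weight_layers(VGG19_CFG, NUM_FC_LAYERS)
print(num_conv, num_weight_layers)         # 16 19
print(flattened_feature_size(VGG19_CFG))   # 25088
```

The 25088 is the same value used as `in_features` of the `fc1` layer defined later in the notebook.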




## DEALING WITH OVERFITTING



*   **Adding dropout**
 Overfitting occurs when a ConvNet with a large number of weights is trained on a small training set and learns the noise in that data instead of patterns that generalize.
 Dropout counters this by randomly zeroing activations during training: for example, dropout=0.2 zeroes each activation with probability 0.2.
*   **Freezing weights** of the pretrained layers, so that only the new layers are trained
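
As a rough illustration of what dropout does to a layer's activations, here is a pure-Python sketch of inverted dropout, the variant `nn.Dropout` uses: during training each activation is zeroed with probability `p` and the survivors are scaled by `1 / (1 - p)`, while at evaluation time the input passes through unchanged. The function name and the sample values are made up for the example.

```python
import random


def dropout(activations, p=0.2, training=True, rng=random):
    """Inverted dropout on a list of floats (illustrative, not the PyTorch implementation)."""
    if not training or p == 0.0:
        return list(activations)
    keep_scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else a * keep_scale for a in activations]


rng = random.Random(0)            # fixed seed so the run is repeatable
acts = [1.0] * 10000              # hypothetical activations
dropped = dropout(acts, p=0.2, training=True, rng=rng)

zero_fraction = dropped.count(0.0) / len(dropped)
print(zero_fraction)              # roughly 0.2
print(dropout(acts[:3], p=0.2, training=False))  # unchanged at evaluation time
```

The scaling keeps the expected value of each activation the same in training and evaluation, which is why nothing needs to change at inference time.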
 



## MAKING CHANGES IN OUR MODEL MANUALLY



**Removing layers from the network, here the last 3 classifier layers (either snippet works)**
```
model.classifier = nn.Sequential(*[model.classifier[i] for i in range(4)])
print(model.classifier)
```


```
model.classifier = nn.Sequential(*list(model.classifier.children())[:-3])
```




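**When is a model overfitting or underfitting**

*   Overfitting if: training loss << validation loss
*   Underfitting if: training loss and validation loss both stay high
*   Just right if: training loss ~ validation loss, and both are low

A training loss somewhat above the validation loss, as in the runs below, can happen when augmentation is applied only to the training data.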

In [0]:
!pip install pillow==4.1.1
%reload_ext autoreload
%autoreload

In [0]:
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
import time
import json
import copy
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
from PIL import Image
from collections import OrderedDict
import torch
from torch import nn, optim
from torch.optim import lr_scheduler
from torch.autograd import Variable
from torchvision import datasets, models, transforms

In [0]:
from google.colab import drive
drive.mount('/content/gdrive')

In [0]:
!wget "https://s3.amazonaws.com/content.udacity-data.com/courses/nd188/flower_data.zip" 

In [0]:
!unzip "flower_data.zip"

In [0]:
data_transforms = {
    'train': transforms.Compose([
        transforms.RandomRotation(45),
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], 
                             [0.229, 0.224, 0.225])
    ]),
    'valid': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], 
                             [0.229, 0.224, 0.225])
    ])
}

In [0]:
data_dir = "/content/flower_data"

In [0]:
cd flower_data

/content/flower_data


In [0]:
train_dir = 'train'
valid_dir = 'valid'
dirs = {'train': train_dir, 
        'valid': valid_dir}
image_datasets = {x: datasets.ImageFolder(dirs[x],   transform=data_transforms[x]) for x in ['train', 'valid']}
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=32, shuffle=True) for x in ['train', 'valid']}
dataset_sizes = {x: len(image_datasets[x]) 
                              for x in ['train', 'valid']}
class_names = image_datasets['train'].classes

In [0]:
model = models.vgg19(pretrained=True)

In [0]:
model

VGG(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU(inplace)
    (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (3): ReLU(inplace)
    (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (6): ReLU(inplace)
    (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (8): ReLU(inplace)
    (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (11): ReLU(inplace)
    (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (13): ReLU(inplace)
    (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (15): ReLU(inplace)
    (16): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (17): ReLU(inplace)

**Replacing the classifier so the architecture matches our data: the input size (25088) comes from the printed architecture, and the output size is the 102 classes**

In [0]:
classifier = nn.Sequential(OrderedDict([
                          ('fc1', nn.Linear(25088, 4096)),
                          ('relu', nn.ReLU()),
                          ('fc2', nn.Linear(4096, 102)),
                          ('output', nn.LogSoftmax(dim=1))
                          ]))
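
The classifier ends in `LogSoftmax`, so the model outputs log-probabilities. The following pure-Python sketch (helper names and logits are made up) shows how a negative log-likelihood loss is read off those outputs. It also shows that taking log-softmax a second time, which is what a cross-entropy loss does internally, leaves log-probabilities unchanged.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a list of scores."""
    m = max(scores)
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_sum for s in scores]


def nll(log_probs, target):
    """Negative log-likelihood of the target class."""
    return -log_probs[target]


def cross_entropy(scores, target):
    """Cross-entropy = log-softmax followed by negative log-likelihood."""
    return nll(log_softmax(scores), target)


logits = [2.0, 0.5, -1.0]            # hypothetical raw scores for 3 classes
target = 0

log_probs = log_softmax(logits)      # what a LogSoftmax output layer produces
loss_nll = nll(log_probs, target)
loss_ce_on_log_probs = cross_entropy(log_probs, target)

print(abs(loss_nll - loss_ce_on_log_probs) < 1e-12)               # True
print(abs(sum(math.exp(lp) for lp in log_probs) - 1.0) < 1e-12)   # True
```

In other words, a negative log-likelihood loss is the direct match for a `LogSoftmax` output, and a cross-entropy loss on the same output gives the same value at the cost of a redundant log-softmax.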

**Freezing the pretrained weights so they are not updated during training**

In [0]:
for param in model.parameters():
    param.requires_grad = False

In [0]:
model.classifier = classifier

In [0]:
def train_model(model, criteria, optimizer, scheduler,    
                                      num_epochs, device='cuda'):
    since = time.time()

    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.0

    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch, num_epochs - 1))
        print('-' * 10)

        # Each epoch has a training and validation phase
        for phase in ['train', 'valid']:
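            if phase == 'train':
                # Note: on PyTorch >= 1.1, call scheduler.step() after the
                # epoch's optimizer.step() calls instead of before them.
                scheduler.step()
                model.train()  # Set model to training mode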
            else:
                model.eval()   # Set model to evaluate mode

            running_loss = 0.0
            running_corrects = 0

            # Iterate over data.
            for inputs, labels in dataloaders[phase]:
                inputs = inputs.to(device)
                labels = labels.to(device)

                # zero the parameter gradients
                optimizer.zero_grad()

                # forward
                # track history if only in train
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    _, preds = torch.max(outputs, 1)
                    loss = criteria(outputs, labels)

                    # backward + optimize only if in training phase
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

                # statistics
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)

            epoch_loss = running_loss / dataset_sizes[phase]
            epoch_acc = running_corrects.double() / dataset_sizes[phase]

            print('{} Loss: {:.4f} Acc: {:.4f}'.format(
                phase, epoch_loss, epoch_acc))

            # deep copy the model
            if phase == 'valid' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())

        print()

    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(
        time_elapsed // 60, time_elapsed % 60))
    print('Best val Acc: {:4f}'.format(best_acc))

    # load best model weights
    model.load_state_dict(best_model_wts)
    return model

In [0]:
model.cuda()

In [0]:
import torch.optim as optim

# Criterion: NLLLoss, which pairs with the LogSoftmax output layer
criteria = nn.NLLLoss()
# Only the parameters of the new classifier are optimized
optimizer1 = optim.Adam(model.classifier.parameters(), lr=0.001)
# Decay LR by a factor of 0.1 every 4 epochs
sched = lr_scheduler.StepLR(optimizer1, step_size=4, gamma=0.1)
# Number of epochs
eps=10
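
To see what `StepLR(step_size=4, gamma=0.1)` does to the learning rate, the closed form `lr = base_lr * gamma ** (epoch // step_size)` can be tabulated. This is a plain-Python sketch of that formula, not the scheduler itself; `epoch` here counts how many times the scheduler has been stepped, and `step_lr` is a name made up for the example.

```python
def step_lr(base_lr, epoch, step_size=4, gamma=0.1):
    """Learning rate after `epoch` scheduler steps under a step-decay schedule."""
    return base_lr * gamma ** (epoch // step_size)


for epoch in range(10):
    print(epoch, '{:.0e}'.format(step_lr(0.001, epoch)))
```

With the settings above, the rate is 1e-3 for steps 0 to 3, 1e-4 for steps 4 to 7, and 1e-5 for steps 8 and 9.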

**10 epochs**

In [0]:
model_ft = train_model(model, criteria, optimizer1, sched, eps, 'cuda')

Epoch 0/9
----------
train Loss: 2.3860 Acc: 0.4730
valid Loss: 0.8379 Acc: 0.7628

Epoch 1/9
----------
train Loss: 1.1422 Acc: 0.6815
valid Loss: 0.7204 Acc: 0.7910

Epoch 2/9
----------
train Loss: 0.9667 Acc: 0.7378
valid Loss: 0.6810 Acc: 0.8191

Epoch 3/9
----------
train Loss: 0.6831 Acc: 0.8112
valid Loss: 0.4402 Acc: 0.8814

Epoch 4/9
----------
train Loss: 0.5744 Acc: 0.8373
valid Loss: 0.3991 Acc: 0.8973

Epoch 5/9
----------
train Loss: 0.5604 Acc: 0.8443
valid Loss: 0.3885 Acc: 0.8961

Epoch 6/9
----------
train Loss: 0.5131 Acc: 0.8568
valid Loss: 0.3905 Acc: 0.8924

Epoch 7/9
----------
train Loss: 0.4989 Acc: 0.8622
valid Loss: 0.3796 Acc: 0.8924

Epoch 8/9
----------
train Loss: 0.4997 Acc: 0.8591
valid Loss: 0.3702 Acc: 0.8998

Epoch 9/9
----------
train Loss: 0.4886 Acc: 0.8655
valid Loss: 0.3701 Acc: 0.8985

Training complete in 38m 4s
Best val Acc: 0.899756


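**5 + 5 epochs**

In [0]:
# First 5 epochs (the log below implies eps = 5 and a freshly initialized classifier).
# Pass the optimizer instance (optimizer1), not the torch.optim module.
model_ft = train_model(model, criteria, optimizer1, sched, eps, 'cuda')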

Epoch 0/4
----------
train Loss: 2.5754 Acc: 0.4588
valid Loss: 0.8900 Acc: 0.7579

Epoch 1/4
----------
train Loss: 1.1595 Acc: 0.6841
valid Loss: 0.7107 Acc: 0.8056

Epoch 2/4
----------
train Loss: 0.9255 Acc: 0.7428
valid Loss: 0.5939 Acc: 0.8472

Epoch 3/4
----------
train Loss: 0.8626 Acc: 0.7639
valid Loss: 0.6637 Acc: 0.8374

Epoch 4/4
----------
train Loss: 0.8000 Acc: 0.7779
valid Loss: 0.5750 Acc: 0.8411

Training complete in 19m 49s
Best val Acc: 0.847188


**Saving model after 5 epochs**

In [0]:
model.class_to_idx = image_datasets['train'].class_to_idx
model.cpu()
torch.save({'arch': 'vgg19',
            'state_dict': model.state_dict(), 
            'class_to_idx': model.class_to_idx}, 
            'classifier.pth')
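
The checkpoint stores `class_to_idx` because `ImageFolder` assigns indices by sorting the class folder names as strings, so a predicted index is not the same as the folder label. A small sketch, assuming the class folders are named `'1'` to `'102'` (an assumption about the dataset layout, not shown in this notebook):

```python
# Assumed folder names '1'..'102'; ImageFolder sorts class names as strings.
folder_names = [str(i) for i in range(1, 103)]
class_to_idx = {name: idx for idx, name in enumerate(sorted(folder_names))}

# Invert the mapping to turn a predicted index back into a folder label.
idx_to_class = {idx: name for name, idx in class_to_idx.items()}

print(class_to_idx['10'])   # 1, because '10' sorts right after '1'
print(idx_to_class[2])      # 100
```

Keeping the mapping next to the weights means predictions can still be decoded after the model is reloaded without the dataset at hand.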


In [0]:
def calc_accuracy(model, data, cuda=False):
    model.eval()
    # Move the model to the GPU only when requested, so inputs and weights share a device
    model.to('cuda' if cuda else 'cpu')

    with torch.no_grad():
        for idx, (inputs, labels) in enumerate(dataloaders[data]):
            if cuda:
                inputs, labels = inputs.cuda(), labels.cuda()
            # obtain the outputs (log-probabilities) from the model
            outputs = model(inputs)
            # max returns (maximum log-probability, index of the predicted class)
            log_probs, predicted = outputs.max(dim=1)
            # show details for the first batch only
            if idx == 0:
                print(predicted)  # the predicted class
                print(torch.exp(log_probs))  # the predicted probability
            equals = predicted == labels
            if idx == 0:
                print(equals)
            # accuracy of the current batch
            print(equals.float().mean())
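
`calc_accuracy` prints one accuracy per batch. To get a single figure for the whole split, the per-batch correct counts have to be weighted by batch size, because the last batch is usually smaller (818 validation images with a batch size of 32 give 25 full batches and one of 18). A sketch with made-up per-batch counts; `overall_accuracy` is a hypothetical helper, not part of the notebook:

```python
def overall_accuracy(batches):
    """batches: list of (num_correct, batch_size) pairs."""
    total_correct = sum(correct for correct, _ in batches)
    total_seen = sum(size for _, size in batches)
    return total_correct / total_seen


# Hypothetical results: 25 full batches of 32 and a final batch of 18.
batches = [(28, 32)] * 25 + [(9, 18)]

mean_of_batch_accs = sum(c / s for c, s in batches) / len(batches)
print(round(overall_accuracy(batches), 4))   # 709 correct out of 818
print(round(mean_of_batch_accs, 4))          # differs: the small batch counts as much as a full one
```

Averaging the printed per-batch values gives a slightly different number from the true split accuracy whenever the batches are not all the same size.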

**Checking accuracy**

In [0]:
calc_accuracy(model,'valid',True)

tensor([ 94,  77,  48, 100,   2,  14,  41,  45,  76,  67,  19,  48,  40,  77,
          0,  45,  72,  82,  49,  53,  56,  35,  49,  73,  61,  49,  59,  66,
         70,  77,  76,  19], device='cuda:0')
tensor([0.9997, 1.0000, 0.9814, 0.8381, 0.6187, 1.0000, 0.6639, 0.9277, 0.4184,
        1.0000, 0.8104, 0.9691, 0.9822, 0.9999, 0.8130, 0.9539, 0.7750, 0.9999,
        0.6759, 0.9990, 0.9978, 0.6216, 0.9555, 0.8973, 0.9999, 0.9994, 0.9888,
        0.9126, 0.9967, 1.0000, 0.9998, 0.8765], device='cuda:0')
tensor([1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1], device='cuda:0', dtype=torch.uint8)
tensor(0.9375, device='cuda:0')
tensor(0.7812, device='cuda:0')
tensor(0.7188, device='cuda:0')
tensor(0.9062, device='cuda:0')
tensor(0.7812, device='cuda:0')
tensor(0.8438, device='cuda:0')
tensor(0.6875, device='cuda:0')
tensor(0.7812, device='cuda:0')
tensor(0.8125, device='cuda:0')
tensor(0.9062, device='cuda:0')
tensor(0.8750, device='

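**Loading model**

In [0]:
# load_model is not defined in the cells shown here; it is assumed to rebuild
# the VGG19 model and restore the state saved in 'classifier.pth'.
model = load_model('classifier.pth')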
calc_accuracy(model, 'valid', True)



tensor([ 32,   1,  77,   5,  91,  54,  57,   0,  84,  33,  40,  73,  14,  24,
         71,  74,  75,   5,  91, 100,  96,  82,   9,  89,  90,  89,  55,  78,
         48,  90,  55,  59], device='cuda:0')
tensor([0.5693, 0.9999, 1.0000, 0.8904, 0.9893, 1.0000, 0.9911, 0.9889, 0.9868,
        0.9816, 0.4665, 0.9317, 1.0000, 1.0000, 1.0000, 0.8170, 0.9997, 0.7088,
        0.7182, 0.8381, 0.9972, 0.9993, 0.9006, 0.9998, 0.9990, 0.9999, 0.9959,
        0.9983, 0.9950, 0.7380, 0.8078, 0.9995], device='cuda:0')
tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 0, 1, 1], device='cuda:0', dtype=torch.uint8)
tensor(0.9062, device='cuda:0')
tensor(0.9062, device='cuda:0')
tensor(0.8750, device='cuda:0')
tensor(0.9688, device='cuda:0')
tensor(0.8750, device='cuda:0')
tensor(0.7188, device='cuda:0')
tensor(0.7812, device='cuda:0')
tensor(0.7812, device='cuda:0')
tensor(0.8438, device='cuda:0')
tensor(0.9375, device='cuda:0')
tensor(0.8125, device='

**Running next 5 epochs**

In [0]:
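# Pass an optimizer instance, not the torch.optim module. If load_model returned a
# new model object, optimizer1 and sched must be re-created from its parameters,
# since an optimizer built for the earlier model object will not update this one.
model_ft = train_model(model, criteria, optimizer1, sched, eps, 'cuda')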

Epoch 0/4
----------
train Loss: 0.7720 Acc: 0.7821
valid Loss: 0.5939 Acc: 0.8472

Epoch 1/4
----------
train Loss: 0.7821 Acc: 0.7842
valid Loss: 0.5939 Acc: 0.8472

Epoch 2/4
----------
train Loss: 0.7628 Acc: 0.7885
valid Loss: 0.5939 Acc: 0.8472

Epoch 3/4
----------
train Loss: 0.7816 Acc: 0.7836
valid Loss: 0.5939 Acc: 0.8472

Epoch 4/4
----------
train Loss: 0.7662 Acc: 0.7804
valid Loss: 0.5939 Acc: 0.8472

Training complete in 19m 50s
Best val Acc: 0.847188


**Saving weights after 10 epochs**

In [0]:
model.class_to_idx = image_datasets['train'].class_to_idx
model.cpu()
torch.save({'arch': 'vgg19',
            'state_dict': model.state_dict(), 
            'class_to_idx': model.class_to_idx}, 
            'classifier2.pth')

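There was a significant difference between training for 10 epochs in one run and training for 5 + 5 epochs with a save and reload in between: the single run reached a best validation accuracy of 0.8998, while the 5 + 5 run stayed at 0.8472. In the second 5 epochs the validation loss and accuracy are identical in every epoch (0.5939 and 0.8472), which suggests the weights were not being updated after the reload. One likely cause is that the optimizer was still tied to the parameters of the earlier model object.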