<a href="https://colab.research.google.com/github/pabvald/Vision-Project-Image-Segmentation/blob/master/VGG-16.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### In the lecture you have been introduced to VGG16. For this problem, your task is to implement a VGG-like CNN architecture for classification on the CIFAR10 dataset.

In [1]:
import torch
import torchvision
import torch.optim as optim
import torchvision.transforms as transforms
import torch.nn as nn
import torch.nn.functional as F

In [30]:
num_workers = 2
batch_size = 16
%config Completer.use_jedi = False


In [31]:
torch.cuda.init()

### 1. Load the dataset (0.5 point)
To load the dataset, you can use the built-in CIFAR10 dataset class provided in the torchvision package. Load the training set and the test set separately. Define the transformations you need to load the data appropriately.

In [32]:
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]
)

trainset = torchvision.datasets.CIFAR10(root='./CIFAR10', train = True,
                                        download = True, transform = transform)

trainloader = torch.utils.data.DataLoader(trainset, batch_size = batch_size, 
                                            shuffle = True, num_workers = num_workers)

testset = torchvision.datasets.CIFAR10(root='./CIFAR10', train = False,
                                        download = True, transform = transform)

# Shuffling is not needed for evaluation, so the test loader keeps a fixed order
testloader = torch.utils.data.DataLoader(testset, batch_size = batch_size, 
                                            shuffle = False, num_workers = num_workers)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

Files already downloaded and verified
Files already downloaded and verified
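
To make the effect of the transformations above concrete, here is a minimal sketch in plain Python. `transforms.ToTensor()` scales 8-bit pixel values to the range [0, 1], and `transforms.Normalize` then computes `(x - mean) / std` per channel, so a mean and standard deviation of 0.5 map the values to [-1, 1]. The helper name `normalize` is only illustrative and is not part of torchvision.

```python
# Per-channel normalization as computed by transforms.Normalize: (x - mean) / std.
# `normalize` is a hypothetical helper used for illustration only.
def normalize(value, mean=0.5, std=0.5):
    return (value - mean) / std

# Pixel values after ToTensor() lie in [0, 1]; check the two ends and the middle.
for pixel in (0.0, 0.5, 1.0):
    print(pixel, "->", normalize(pixel))
```

The darkest pixel maps to -1.0, mid-gray to 0.0 and the brightest pixel to 1.0.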


In [33]:
print(len(trainset))
print(trainset[0][0].size())

50000
torch.Size([3, 32, 32])


### Create the model architecture (1.0 point)
Implement the class below such that the final architecture follows the same pattern of layers as VGG16.
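
Before writing the class, it can help to check which feature-map size the fully connected layers will receive. The sketch below assumes the layout used in the class below: three blocks of two 3x3 convolutions (stride 1, padding 1), each block followed by a 2x2 max-pool with stride 2, and 256 channels in the last block. The helper names `conv2d_out` and `pool_out` are illustrative.

```python
# Output size of a convolution or pooling layer along one spatial dimension.
def conv2d_out(size, kernel=3, stride=1, padding=1):
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, kernel=2, stride=2):
    return (size - kernel) // stride + 1

size = 32  # CIFAR10 images are 32x32
for block in range(3):
    size = conv2d_out(conv2d_out(size))  # two 3x3 convolutions keep the size
    size = pool_out(size)                # each 2x2 max-pool halves it
    print("after block", block + 1, ":", size, "x", size)

flat_features = 256 * size * size  # input width of the first linear layer
print(flat_features)
```

The spatial size goes 32, 16, 8, 4, which is where the `256*4*4` input width of the first linear layer comes from.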

In [45]:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1_1 = nn.Conv2d(3, 64, kernel_size = 3, stride = 1, padding = 1)
        self.conv1_2 = nn.Conv2d(64, 64, kernel_size = 3, stride = 1, padding = 1)
        
        self.conv2_1 = nn.Conv2d(64, 128, kernel_size = 3, stride = 1, padding = 1)
        self.conv2_2 = nn.Conv2d(128, 128, kernel_size = 3, stride = 1, padding = 1)

        self.conv3_1 = nn.Conv2d(128, 256, kernel_size = 3, stride = 1, padding = 1)
        self.conv3_2 = nn.Conv2d(256, 256, kernel_size = 3, stride = 1, padding = 1)

        self.max_pool = nn.MaxPool2d(2, stride = 2, padding = 0)

        self.fc1 = nn.Linear(256*4*4, 2048)
        self.fc2 = nn.Linear(2048, 512)
        self.fc3 = nn.Linear(512, 10)

        self.dropout = nn.Dropout(p=0.5)

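    def forward(self, x):
        # Note: the convolution outputs are passed on without an activation,
        # whereas VGG16 applies a ReLU after every convolution.
        # Block 1: 3x32x32 -> 64x16x16
        x = self.conv1_1(x)
        x = self.max_pool(self.conv1_2(x))
        # Block 2: 64x16x16 -> 128x8x8
        x = self.conv2_1(x) 
        x = self.max_pool(self.conv2_2(x))
        # Block 3: 128x8x8 -> 256x4x4
        x = self.conv3_1(x)
        x = self.max_pool(self.conv3_2(x))
        # Flatten to (batch, 256*4*4) for the fully connected layers
        x = x.view(-1, 256*4*4)
        x = F.relu(self.fc1(x))
        x = self.dropout(x)
        x = F.relu(self.fc2(x))
        x = self.dropout(x)
        # Raw class scores (logits); nn.CrossEntropyLoss applies the softmax itself
        out = self.fc3(x)
        return out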


net = Net().cuda()
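
VGG16 applies a ReLU after every convolution, while the `forward` method above feeds the convolution outputs onward without an activation. As a hedged sketch, a variant with those activations could look like the following. The class name `NetWithReLU` is hypothetical, the layer sizes are copied from `Net`, and this variant was not trained for this notebook, so the results reported here do not apply to it.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NetWithReLU(nn.Module):  # hypothetical name, same layers as Net
    def __init__(self):
        super().__init__()
        self.conv1_1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1)
        self.conv1_2 = nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1)
        self.conv2_1 = nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1)
        self.conv2_2 = nn.Conv2d(128, 128, kernel_size=3, stride=1, padding=1)
        self.conv3_1 = nn.Conv2d(128, 256, kernel_size=3, stride=1, padding=1)
        self.conv3_2 = nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1)
        self.max_pool = nn.MaxPool2d(2, stride=2, padding=0)
        self.fc1 = nn.Linear(256 * 4 * 4, 2048)
        self.fc2 = nn.Linear(2048, 512)
        self.fc3 = nn.Linear(512, 10)
        self.dropout = nn.Dropout(p=0.5)

    def forward(self, x):
        # ReLU after every convolution, max-pool at the end of each block
        x = F.relu(self.conv1_1(x))
        x = self.max_pool(F.relu(self.conv1_2(x)))
        x = F.relu(self.conv2_1(x))
        x = self.max_pool(F.relu(self.conv2_2(x)))
        x = F.relu(self.conv3_1(x))
        x = self.max_pool(F.relu(self.conv3_2(x)))
        x = torch.flatten(x, 1)  # (batch, 256*4*4)
        x = self.dropout(F.relu(self.fc1(x)))
        x = self.dropout(F.relu(self.fc2(x)))
        return self.fc3(x)  # logits for the 10 classes

# Shape check on the CPU with a dummy batch of two CIFAR10-sized images
model = NetWithReLU().eval()
with torch.no_grad():
    logits = model(torch.zeros(2, 3, 32, 32))
print(logits.shape)
```

Without the activations, consecutive convolutions compose into a linear map, so the max-pooling layers are the only nonlinearity in the convolutional part of `Net`.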

### Loss function and optimizer (0.5 point)
Define the loss function and optimizer to train the model.

In [46]:
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
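
To illustrate what these two objects compute, here is a minimal plain-Python sketch for a single sample and a single scalar parameter. It assumes the default settings: `nn.CrossEntropyLoss` takes raw logits and averages the per-sample losses over the batch, and `optim.SGD` with momentum keeps a velocity buffer per parameter. The function names are illustrative and not part of PyTorch.

```python
import math

def cross_entropy(logits, label):
    # -log(softmax(logits)[label]), written with log-sum-exp for numerical stability
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def sgd_momentum_step(param, grad, buf, lr=0.001, momentum=0.9):
    # Velocity update followed by the parameter update (no dampening, no Nesterov)
    buf = momentum * buf + grad
    param = param - lr * buf
    return param, buf

# A model that scores all 10 classes equally has a loss of ln(10)
uniform_loss = cross_entropy([0.0] * 10, label=3)
print(round(uniform_loss, 3))  # 2.303

# Two updates with a constant gradient: the second step is larger than the first
p, v = 1.0, 0.0
p, v = sgd_momentum_step(p, grad=2.0, buf=v)
p, v = sgd_momentum_step(p, grad=2.0, buf=v)
print(p, v)
```

The value ln(10) ≈ 2.303 is a useful reference point: a 10-class model that has not learned anything yet should report a loss close to it.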

### Train the model (1.0 point)

In [47]:
epochs = 10

net.train()  # make sure dropout is active, e.g. if this cell is re-run after evaluation

for epoch in range(epochs): 
    running_loss = 0.0

    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        inputs = inputs.cuda()
        labels = labels.cuda()

        # Reset gradient
        optimizer.zero_grad()

        # Forward pass (calling the module invokes forward())
        outputs = net(inputs)
        loss = criterion(outputs, labels)

        # Backpropagation 
        loss.backward()
        optimizer.step()

        # Print statistics 
        running_loss += loss.item()
        if i % 500 == 499:    # print every 500 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 500))
            running_loss = 0.0
        

print('Finished Training')

[1,   500] loss: 2.301
[1,  1000] loss: 2.254
[1,  1500] loss: 2.074
[1,  2000] loss: 1.987
[1,  2500] loss: 1.872
[1,  3000] loss: 1.734
[2,   500] loss: 1.644
[2,  1000] loss: 1.560
[2,  1500] loss: 1.517
[2,  2000] loss: 1.461
[2,  2500] loss: 1.438
[2,  3000] loss: 1.399
[3,   500] loss: 1.298
[3,  1000] loss: 1.283
[3,  1500] loss: 1.230
[3,  2000] loss: 1.199
[3,  2500] loss: 1.168
[3,  3000] loss: 1.137
[4,   500] loss: 1.036
[4,  1000] loss: 1.042
[4,  1500] loss: 1.039
[4,  2000] loss: 0.995
[4,  2500] loss: 0.974
[4,  3000] loss: 0.959
[5,   500] loss: 0.874
[5,  1000] loss: 0.860
[5,  1500] loss: 0.868
[5,  2000] loss: 0.850
[5,  2500] loss: 0.866
[5,  3000] loss: 0.822
[6,   500] loss: 0.752
[6,  1000] loss: 0.724
[6,  1500] loss: 0.765
[6,  2000] loss: 0.757
[6,  2500] loss: 0.768
[6,  3000] loss: 0.738
[7,   500] loss: 0.655
[7,  1000] loss: 0.670
[7,  1500] loss: 0.678
[7,  2000] loss: 0.664
[7,  2500] loss: 0.627
[7,  3000] loss: 0.670
[8,   500] loss: 0.557
[8,  1000] 
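
The layout of the log follows from the dataset size and the batch size. The short sketch below uses the 50000 training images and `batch_size = 16` from above, and assumes the `DataLoader` default of keeping a final partial batch; the variable names are illustrative.

```python
import math

num_train = 50000   # len(trainset)
batch_size = 16
print_every = 500   # reporting interval used in the training loop

batches_per_epoch = math.ceil(num_train / batch_size)
log_lines_per_epoch = batches_per_epoch // print_every
unreported_batches = batches_per_epoch % print_every

print(batches_per_epoch, log_lines_per_epoch, unreported_batches)
```

Each epoch therefore consists of 3125 mini-batches and produces six log lines. The last 125 mini-batches of an epoch are added to `running_loss` but never printed, because `running_loss` is reset at the start of the next epoch.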

The code below computes the class-wise accuracy of the model. You can use its results to decide the values of the hyperparameters.

In [49]:
class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))
global_accuracy = 0.0

net.eval()

with torch.no_grad():
    for data in testloader:
        images, labels = data
        images = images.cuda()
        labels = labels.cuda()
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels)
        # Iterate over the actual batch size; the last batch may be smaller than batch_size
        for i in range(labels.size(0)):
            label = labels[i].item()
            class_correct[label] += c[i].item()
            class_total[label] += 1


for i in range(10):
    print('Accuracy of %5s : %2d %%' % (
        classes[i], 100 * class_correct[i] / class_total[i]))
    # Each class contributes its share of correctly classified test images
    global_accuracy += 100 * class_correct[i] / len(testset)

print('\nGlobal accuracy   : %2d %%' % global_accuracy)

Accuracy of plane : 83 %
Accuracy of   car : 90 %
Accuracy of  bird : 65 %
Accuracy of   cat : 63 %
Accuracy of  deer : 79 %
Accuracy of   dog : 73 %
Accuracy of  frog : 84 %
Accuracy of horse : 74 %
Accuracy of  ship : 83 %
Accuracy of truck : 80 %

Global accuracy   : 77 %
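
The global accuracy above is the number of correct predictions divided by the number of test images, which is not in general the same as the plain average of the per-class accuracies. The sketch below shows the difference on a small, made-up set of labels and predictions; the data and the helper name `per_class_counts` are hypothetical.

```python
def per_class_counts(predicted, labels, num_classes):
    # Tally correct predictions and sample counts per true class
    correct = [0] * num_classes
    total = [0] * num_classes
    for p, y in zip(predicted, labels):
        total[y] += 1
        correct[y] += int(p == y)
    return correct, total

# Hypothetical toy data with unbalanced classes (3, 2 and 1 samples)
labels    = [0, 0, 0, 1, 1, 2]
predicted = [0, 0, 1, 1, 2, 2]

correct, total = per_class_counts(predicted, labels, num_classes=3)
class_acc = [c / t for c, t in zip(correct, total)]
global_acc = sum(correct) / sum(total)       # weights every sample equally
macro_acc = sum(class_acc) / len(class_acc)  # weights every class equally

print(correct, total)
print(global_acc, macro_acc)
```

The CIFAR10 test set contains 1000 images per class, so for this notebook both quantities coincide up to rounding.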
