# VGG16 CIFAR-10 Classifier

This notebook implements the classic VGG16 convolutional network [1] and applies it to the CIFAR-10 object classification dataset. The basic architecture is shown in the figure below:

![](./images/vgg16/vgg16-arch.png)

The input to the conv1 layer is a fixed-size 224×224 RGB image. The image is passed through a stack of convolutional (conv.) layers whose filters have a very small receptive field: 3×3 (the smallest size that captures the notion of left/right, up/down, center). One of the configurations also uses 1×1 convolution filters, which can be seen as a linear transformation of the input channels (followed by a non-linearity). The convolution stride is fixed to 1 pixel; the spatial padding of the conv. layer input is chosen so that the spatial resolution is preserved after convolution, i.e. the padding is 1 pixel for 3×3 conv. layers. Spatial pooling is carried out by five max-pooling layers, which follow some of the conv. layers (not all conv. layers are followed by max-pooling). Max-pooling is performed over a 2×2 pixel window, with stride 2.
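
The spatial bookkeeping described above can be checked with a few lines of arithmetic. The following is a minimal sketch in plain Python; the helper name `conv_out_size` is hypothetical and simply applies the standard output-size formula for convolution and pooling windows.

```python
def conv_out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling window (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1


size = 224
sizes = [size]
for _ in range(5):
    # a 3x3 convolution with stride 1 and padding 1 preserves the resolution
    size = conv_out_size(size, kernel=3, stride=1, padding=1)
    # a 2x2 max-pooling with stride 2 halves the resolution
    size = conv_out_size(size, kernel=2, stride=2)
    sizes.append(size)

print(sizes)  # [224, 112, 56, 28, 14, 7]
```

After the five pooling stages a 224×224 input is reduced to a 7×7 feature map.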

Three fully connected (FC) layers follow the stack of convolutional layers (which has a different depth in different architectures): the first two have 4096 channels each, and the third performs 1000-way ILSVRC classification and thus contains 1000 channels (one for each class). The final layer is the soft-max layer. The configuration of the fully connected layers is the same in all networks. In this notebook the last FC layer outputs `NUM_CLASSES = 10` values, one per CIFAR-10 class.
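
To get a feeling for the size of this classifier head, the sketch below counts the parameters of the three FC layers in the original 1000-way configuration. It assumes the first FC layer receives the flattened 512×7×7 feature map of the original VGG16; `linear_params` is a hypothetical helper, not part of any library.

```python
def linear_params(in_features, out_features):
    """Number of weights plus biases in a fully connected layer."""
    return in_features * out_features + out_features


# (in_features, out_features) of the three FC layers, original ILSVRC setup
fc_shapes = [(512 * 7 * 7, 4096), (4096, 4096), (4096, 1000)]
fc_params = [linear_params(i, o) for i, o in fc_shapes]

print(fc_params)       # [102764544, 16781312, 4097000]
print(sum(fc_params))  # 123642856
```

Under these assumptions the first FC layer alone accounts for roughly 103 million parameters.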

All hidden layers are equipped with the rectification (ReLU) non-linearity. It is also noted that none of the networks (except for one) contain Local Response Normalization (LRN); such normalization does not improve performance on the ILSVRC dataset, but leads to increased memory consumption and computation time.

In [82]:
import sys
sys.path.append("/Users/ZRC")
sys.path

['/Users/ZRC/miniconda3/envs/tryit/lib/python36.zip',
 '/Users/ZRC/miniconda3/envs/tryit/lib/python3.6',
 '/Users/ZRC/miniconda3/envs/tryit/lib/python3.6/lib-dynload',
 '',
 '/Users/ZRC/miniconda3/envs/tryit/lib/python3.6/site-packages',
 '/Users/ZRC/miniconda3/envs/tryit/lib/python3.6/site-packages/IPython/extensions',
 '/Users/ZRC/.ipython',
 '/Users/ZRC',
 '/Users/ZRC']

In [83]:
%load_ext autoreload
%autoreload 2

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [84]:
import matplotlib.pyplot as plt
import numpy as np

import torch
import torch.nn.functional as F

from torch.utils.data import DataLoader
from torch.utils.data import RandomSampler


from torchvision import datasets
from torchvision import transforms

from torchsummary import summary

In [85]:
from fastai.utils import mem

In [86]:
mem.gpu_mem_get()

GPUMemory(total=0, free=0, used=0)

In [87]:
torch.cuda.empty_cache()

In [88]:
from coke.visualization.image import show_batch

## Model Settings

In [89]:
# Hyperparameters

BATCH_SIZE = 32
NUM_EPOCHS = 3
LEARNING_RATE = 0.001
RANDOM_SEED = 7

# Architecture
NUM_CLASSES = 10
GRAYSCALE = False

# other
# note: the device string must not contain a space ("cuda:0", not "cuda: 0")
DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

In [90]:
# Upscale the CIFAR-10 images to the 224 x 224 input size expected by VGG16
data_transforms = {
    "train": transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor()]),
    "val": transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor()]),
}
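
The cost of the resize step can be estimated with simple arithmetic. This is a rough sketch that assumes the native CIFAR-10 resolution of 32×32 pixels and the float32 tensors produced by `transforms.ToTensor()`; the constant names are chosen for illustration only.

```python
BATCH_SIZE = 32        # same value as in the settings cell
CHANNELS = 3           # RGB
NATIVE_SIZE = 32       # CIFAR-10 images are 32 x 32 pixels
TARGET_SIZE = 224      # size after transforms.Resize
BYTES_PER_FLOAT32 = 4

# how many times more pixels each image has after resizing
upscale_factor = TARGET_SIZE ** 2 // NATIVE_SIZE ** 2
# memory taken by one input batch of float32 values
batch_bytes = BATCH_SIZE * CHANNELS * TARGET_SIZE ** 2 * BYTES_PER_FLOAT32

print(upscale_factor)         # 49
print(batch_bytes / 2 ** 20)  # 18.375 (MiB)
```

Each resized image carries 49 times as many pixels as the original, which is the main reason this setup is slow compared to training on the native resolution.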

In [91]:
train_dataset = datasets.CIFAR10(root = "data",
                                train = True,
                                transform = data_transforms["train"],
                                download=True)

test_dataset = datasets.CIFAR10(root = "data",
                                train = False,
                                transform = data_transforms["val"],
                                download=False)

train_dataloader = DataLoader(dataset = train_dataset,
                             batch_size=BATCH_SIZE,
                             shuffle=True,
                             num_workers=4)

test_dataloader = DataLoader(dataset = test_dataset,
                             batch_size=BATCH_SIZE,
                             shuffle=False,
                             num_workers=4)

data_loader = {"train": train_dataloader, "val": test_dataloader}

Files already downloaded and verified
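
With these loaders, the number of batches per epoch follows directly from the dataset sizes. The sketch below assumes the standard CIFAR-10 split (50,000 training and 10,000 test images) and the `DataLoader` default of keeping the last, smaller batch.

```python
import math

TRAIN_SIZE, TEST_SIZE = 50_000, 10_000  # standard CIFAR-10 split
BATCH_SIZE = 32                         # same value as in the settings cell

# the last incomplete batch is kept, so round up
train_batches = math.ceil(TRAIN_SIZE / BATCH_SIZE)
test_batches = math.ceil(TEST_SIZE / BATCH_SIZE)
# number of samples in the final training batch
last_train_batch = TRAIN_SIZE - (train_batches - 1) * BATCH_SIZE

print(train_batches, test_batches, last_train_batch)  # 1563 313 16
```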


In [92]:
batch_samples,labels = next(iter(train_dataloader))
batch_samples.shape,labels.shape

(torch.Size([32, 3, 224, 224]), torch.Size([32]))

In [None]:
help(show_batch)

In [None]:
show_batch(batch_samples.permute(0,2,3,1), labels.numpy(), (4,4))

## Model

In [13]:
class Vgg16Zrc(torch.nn.Module):
    def __init__(self, num_classes, grayscale = False):
        super(Vgg16Zrc, self).__init__()
        # grayscale images have a single input channel, RGB images have three
        if grayscale:
            in_channels = 1
        else:
            in_channels = 3
            
        #[3*224*224] -> [64*224*224]
        self.block_1 = torch.nn.Sequential(
        
            torch.nn.Conv2d(in_channels = in_channels, 
                            out_channels = 64,
                            kernel_size = 3,
                            stride = 1,
                            padding = 1),
            torch.nn.BatchNorm2d(64),
            torch.nn.ReLU(inplace = True),
            torch.nn.Dropout(0.5),
            
            torch.nn.Conv2d(64, 
                            out_channels = 64,
                            kernel_size = 3,
                            stride = 1,
                            padding = 1),
            torch.nn.BatchNorm2d(64),
            torch.nn.ReLU(inplace = True),
            torch.nn.Dropout(0.5),
        )
        
        
        #[64*224*224] -> [128*112*112]
        self.block_2 = torch.nn.Sequential(
            torch.nn.MaxPool2d(kernel_size = 2,
                              stride = 2),
            torch.nn.Conv2d(in_channels = 64, 
                            out_channels = 128,
                            kernel_size = 3,
                            stride = 1,
                            padding = 1),
            torch.nn.BatchNorm2d(128),
            torch.nn.ReLU(inplace = True),
            torch.nn.Dropout(0.5),
            
            torch.nn.Conv2d(128, 
                            out_channels = 128,
                            kernel_size = 3,
                            stride = 1,
                            padding = 1),
            torch.nn.BatchNorm2d(128),
            torch.nn.ReLU(inplace = True),
            torch.nn.Dropout(0.5),
        )
        
        # [128*112*112] -> [256*56*56]
        self.block_3 = self.__block_helper(128,256)
        # [256*56*56] -> [512*28*28]
        self.block_4 = self.__block_helper(256,512)
        # [512*28*28] -> [512*14*14]
        self.block_5 = self.__block_helper(512,512)
        
        
        self.classifier = torch.nn.Sequential(
            # [512*14*14] -> [512*7*7]
            torch.nn.MaxPool2d(kernel_size = 2,
                              stride = 2),
#             torch.nn.AdaptiveAvgPool2d((7,7)),
            # global average pooling: [512*7*7] -> [512*1*1], flattened to [512]
            torch.nn.AdaptiveAvgPool2d(1),
            torch.nn.Flatten(),
            
            # [512] -> [4096]
            torch.nn.Linear(in_features = 512, out_features = 4096),
            torch.nn.BatchNorm1d(4096),
            torch.nn.ReLU(inplace = True),
            torch.nn.Dropout(0.5),
            
            # [4096] -> [4096]
            torch.nn.Linear(in_features = 4096, out_features = 4096),
            torch.nn.BatchNorm1d(4096),
            torch.nn.ReLU(inplace = True),
            torch.nn.Dropout(0.5),
            
            # [4096] -> [num_classes]
            torch.nn.Linear(in_features = 4096, out_features = num_classes)
        )
        
        self.layers = torch.nn.ModuleList([self.block_1,self.block_2,self.block_3,self.block_4,self.block_5])
        
        
    def forward(self,x):
        for layer in self.layers:
            x = layer(x)
        
        logits = self.classifier(x)
        probas = F.softmax(logits, dim=1)
        return logits,probas
    
    
    
    def __block_helper(self,in_channels, out_channels):
        return torch.nn.Sequential(
            torch.nn.MaxPool2d(kernel_size = 2,
                              stride = 2),
            torch.nn.Conv2d(in_channels = in_channels, 
                            out_channels = out_channels,
                            kernel_size = 3,
                            stride = 1,
                            padding = 1),
            torch.nn.ReLU(inplace = True),
            torch.nn.Dropout(),
            
            torch.nn.Conv2d(in_channels = out_channels, 
                            out_channels = out_channels,
                            kernel_size = 3,
                            stride = 1,
                            padding = 1),
            
            torch.nn.ReLU(inplace = True),
            torch.nn.Dropout(),
            
            torch.nn.Conv2d(in_channels= out_channels, 
                            out_channels = out_channels,
                            kernel_size = 3,
                            stride = 1,
                            padding = 1),
            torch.nn.ReLU(inplace = True),
            torch.nn.Dropout()
            
        )

In [93]:
def init_weights(layer):

    if isinstance(layer, torch.nn.Conv2d):
        torch.nn.init.kaiming_normal_(layer.weight, mode='fan_out', nonlinearity='relu')
        if layer.bias is not None:
            torch.nn.init.constant_(layer.bias, 0)
    elif isinstance(layer, torch.nn.BatchNorm2d):
        torch.nn.init.constant_(layer.weight, 1)
        torch.nn.init.constant_(layer.bias, 0)
    elif isinstance(layer, torch.nn.Linear):
        torch.nn.init.normal_(layer.weight, 0, 0.01)
        torch.nn.init.constant_(layer.bias, 0)

            
model = Vgg16Zrc(num_classes = NUM_CLASSES)
model.apply(init_weights)
model = model.to(DEVICE)
optimizer = torch.optim.Adam(model.parameters(), lr = LEARNING_RATE)
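
For intuition about the convolution initialization used in `init_weights`: with `mode='fan_out'` and `nonlinearity='relu'`, `kaiming_normal_` draws weights from a zero-mean normal distribution with standard deviation `sqrt(2 / fan_out)`, where `fan_out` is the number of output channels times the kernel area. The following plain-Python sketch reproduces that number; `kaiming_normal_std` is a hypothetical helper written for this illustration.

```python
import math


def kaiming_normal_std(out_channels, kernel_size, gain=math.sqrt(2.0)):
    """Standard deviation used by Kaiming normal init with mode='fan_out'."""
    fan_out = out_channels * kernel_size * kernel_size
    return gain / math.sqrt(fan_out)


# standard deviation for the 3x3 convolutions of each channel width
for out_channels in (64, 128, 256, 512):
    print(out_channels, round(kaiming_normal_std(out_channels, 3), 4))
```

Wider layers are initialized with smaller weights, which keeps the scale of the signal roughly constant from layer to layer.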

In [94]:
summary(model, (3,224,224))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1         [-1, 64, 224, 224]           1,792
            Conv2d-2         [-1, 64, 224, 224]           1,792
       BatchNorm2d-3         [-1, 64, 224, 224]             128
       BatchNorm2d-4         [-1, 64, 224, 224]             128
              ReLU-5         [-1, 64, 224, 224]               0
              ReLU-6         [-1, 64, 224, 224]               0
           Dropout-7         [-1, 64, 224, 224]               0
           Dropout-8         [-1, 64, 224, 224]               0
            Conv2d-9         [-1, 64, 224, 224]          36,928
           Conv2d-10         [-1, 64, 224, 224]          36,928
      BatchNorm2d-11         [-1, 64, 224, 224]             128
      BatchNorm2d-12         [-1, 64, 224, 224]             128
             ReLU-13         [-1, 64, 224, 224]               0
             ReLU-14         [-1, 64, 2
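
The parameter counts in the first rows of the summary can be verified by hand. This is a small sketch using the usual formulas for convolution and batch-norm layers; the helper names are hypothetical.

```python
def conv2d_params(in_channels, out_channels, kernel_size=3):
    """Weights (k * k * in * out) plus one bias per output channel."""
    return kernel_size * kernel_size * in_channels * out_channels + out_channels


def batchnorm_params(num_features):
    """One learnable scale and one learnable shift per feature map."""
    return 2 * num_features


print(conv2d_params(3, 64))   # 1792, first convolution of block_1
print(conv2d_params(64, 64))  # 36928, second convolution of block_1
print(batchnorm_params(64))   # 128
```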

In [95]:
def compute_accuracy(model, data_loader, device):
    """Return the classification accuracy (in percent) of `model` on `data_loader`."""
    model.eval()
    correct_pred, num_examples = 0, 0
    # no gradients are needed for evaluation
    with torch.no_grad():
        for features, targets in data_loader:
            features = features.to(device)
            targets = targets.to(device)

            logits, probas = model(features)
            # index of the highest class probability for each sample
            _, predicted_labels = torch.max(probas, 1)
            num_examples += targets.size(0)
            correct_pred += (predicted_labels == targets).sum().item()
    return correct_pred / num_examples * 100
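
The accuracy function takes the arg-max of the probabilities, but the same predictions would result from the logits, because soft-max preserves the ordering of its inputs. The following is a plain-Python sketch of that property; `softmax` and `argmax` are illustrative helpers and the logit values are made up.

```python
import math


def softmax(logits):
    """Numerically stable soft-max over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Index of the largest element."""
    return max(range(len(values)), key=values.__getitem__)


logits = [1.5, -0.3, 4.2, 0.0]  # made-up scores for four classes
probas = softmax(logits)

print(argmax(logits), argmax(probas))  # 2 2
```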

In [96]:
def train_model(model, data_loader, optimizer, num_epochs,batch_size, device,metric_func, random_seed = 7):
    # Manual seed for deterministic data loader
    torch.manual_seed(random_seed)
    for epoch in range(num_epochs):
        # set training mode
        model.train() 
        for batch_idx, (features, targets) in enumerate(data_loader["train"]):
            features = features.to(device)
            targets = targets.to(device)


            ## forward pass
            logits, probas = model(features)
            loss = F.cross_entropy(logits,targets)

            # backward pass
            # clear the gradients of all tensors being optimized
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            ### Logging
            if not batch_idx % 50:
                # len(data_loader["train"]) is the number of batches per epoch
                print ('Epoch: {0:03d}/{1:03d} | Batch {2:03d}/{3:03d} | Loss: {4:.2f}'.format(
                    epoch+1, num_epochs, batch_idx, 
                         len(data_loader["train"]), loss.item()))

        with torch.set_grad_enabled(False):
            print('Epoch: {0:03d}/{1:03d} training accuracy: {2:.2f}'.format(
                  epoch+1, num_epochs, 
                  metric_func(model, data_loader["train"], device)))
            
            print('Epoch: {0:03d}/{1:03d} validation accuracy: {2:.2f}'.format(
                  epoch+1, num_epochs, 
                  metric_func(model, data_loader["val"], device)))
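
The training loop passes the raw logits, not the probabilities, to `F.cross_entropy`, since that function applies log-soft-max internally. The sketch below spells out the same computation for a single sample in plain Python; the helper names and example values are illustrative only.

```python
import math


def log_softmax(logits):
    """Numerically stable log-soft-max over a list of floats."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_sum for x in logits]


def cross_entropy(logits, target):
    """Negative log-probability assigned to the target class."""
    return -log_softmax(logits)[target]


logits = [2.0, 0.5, -1.0]  # made-up scores for three classes
loss_correct = cross_entropy(logits, target=0)  # target has the top score
loss_wrong = cross_entropy(logits, target=2)    # target has the lowest score

# a classifier that scores all 10 classes equally has loss log(10)
uniform_loss = cross_entropy([0.0] * 10, target=3)

print(loss_correct < loss_wrong)  # True
print(round(uniform_loss, 4))     # 2.3026
```

A loss near 2.30 in the first logged batches therefore corresponds to chance-level predictions over the 10 CIFAR-10 classes.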

In [98]:
train_model(model, 
            data_loader, 
            optimizer, 
            NUM_EPOCHS, 
            device = DEVICE, 
            batch_size = BATCH_SIZE,
            metric_func = compute_accuracy)