<a href="https://colab.research.google.com/github/EggPudding/Deep-Learning-Practice-with-Codes/blob/main/VGGNet.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## **Very Deep Convolutional Networks for Large-Scale Image Recognition (ICLR 2015) Tutorial**

*   Practice for the VGGNet architecture
*   Original paper: https://arxiv.org/abs/1409.1556
*   Before running, change the Colab **Runtime** type to **GPU**
*   The **CIFAR-10** dataset is used for simplicity
*   Parts of the code are adapted from https://github.com/kuangliu/pytorch-cifar

In [15]:
!nvidia-smi # Make Sure you are using GPU

Mon Jan 25 02:05:02 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   37C    P8    10W /  70W |     10MiB / 15079MiB |      0%      Default |
|                               |                      |                 ERR! |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

### **VGGNet Model Definition**
* This **tutorial** defines a model object for the **CIFAR-10** dataset.
* **CIFAR-10** is a dataset for classifying images into **10** categories.
* The **10** classes are **airplanes**, **cars**, **birds**, **cats**, **deer**, **dogs**, **frogs**, **horses**, **ships**, and **trucks**.
* There are **6,000** images per class.

In [18]:
import torch
import torch.nn as nn
import torch.nn.functional as F

import torch.backends.cudnn as cudnn
import torch.optim as optim

import os

# VGGNet comes in several depths, such as VGG-11, VGG-13, VGG-16, and VGG-19.
# This tutorial uses a VGG-11 style layout because CIFAR-10 is a simple dataset.
# All conv layers use kernel_size 3.
# Note: unlike the paper, only the first conv layer below is followed by a ReLU.
class VGGNet(nn.Module):
    def __init__(self, num_classes=10, init_weight=True):
        super(VGGNet, self).__init__()
        self.feature_extract = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(128, 256, kernel_size=3, stride=1, padding=1),
            nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(256, 512, kernel_size=3, stride=1, padding=1),
            nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1),
            nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1),
            nn.MaxPool2d(kernel_size=2, stride=2),
        )

        self.classifier = nn.Sequential(
            nn.Dropout(),
            nn.Linear(512, 512),
            nn.ReLU(True),
            nn.Dropout(),
            nn.Linear(512, 512),
            nn.ReLU(True),
            nn.Linear(512, num_classes),
        )

        if init_weight:
            self._initialize_weights()

    def forward(self, x):
        x = self.feature_extract(x)
        x = torch.flatten(x, 1)
        x = self.classifier(x)

        return x

    # Weight Initialization
    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
                if m.bias is not None:
                    nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.BatchNorm2d):
                nn.init.constant_(m.weight, 1)
                nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, 0, 0.01)
                nn.init.constant_(m.bias, 0)
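
The input width of the first `nn.Linear(512, 512)` follows from the spatial arithmetic of `feature_extract`: each 3x3 convolution with stride 1 and padding 1 keeps height and width unchanged, and each 2x2 max-pool with stride 2 halves them. The following is a minimal sketch in plain Python that traces a 32 x 32 input through that layer layout. The helper names `conv_out` and `trace_spatial_size` are illustrative and are not part of the notebook.

```python
def conv_out(size, kernel_size, stride=1, padding=0):
    """Output size along one spatial dim for a conv or pooling layer (dilation 1)."""
    return (size + 2 * padding - kernel_size) // stride + 1

# Layout of feature_extract: a number is a 3x3 conv with that many output
# channels, "M" is a 2x2 max-pool with stride 2.
LAYOUT = [64, "M", 128, "M", 256, 256, "M", 512, 512, "M", 512, 512, "M"]

def trace_spatial_size(size, layout=LAYOUT):
    """Return (channels, spatial size) after running the whole layout."""
    channels = 3  # RGB input
    for item in layout:
        if item == "M":
            size = conv_out(size, kernel_size=2, stride=2)
        else:
            size = conv_out(size, kernel_size=3, stride=1, padding=1)
            channels = item
    return channels, size

channels, size = trace_spatial_size(32)
flat_features = channels * size * size
print(channels, size, flat_features)  # 512 1 512
```

Five poolings take 32 down to 1, so `torch.flatten(x, 1)` yields 512 features per image. A larger input resolution would change this number and the first `nn.Linear` would need to match.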

### **Hyper Parameter Setting**

In [19]:
device = "cuda" if torch.cuda.is_available() else "cpu" # use the GPU when available, otherwise the CPU

model = VGGNet() # model assignment
model.to(device) # move model weights and biases to the selected device
model = torch.nn.DataParallel(model) # replicates the model across available GPUs

cudnn.benchmark = True # let cuDNN auto-tune convolution algorithms for the fixed input size

learning_rate = 0.01
batch_size = 128
max_epoch = 10

model_path = 'vggnet_cifar10.pt'

criterion = nn.CrossEntropyLoss() # simple cross-entropy is utilized
optimizer = optim.Adam(model.parameters(), lr=learning_rate) # Adam optimizer utilized

* **torchsummary** is a package for summarizing a PyTorch model.
* It lists each layer with its output shape and number of parameters.

In [20]:
import torchsummary

torchsummary.summary(model.cuda(), (3, 32, 32)) # input size is (channels, height, width) for a 32 x 32 RGB CIFAR-10 image

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1           [-1, 64, 32, 32]           1,792
              ReLU-2           [-1, 64, 32, 32]               0
         MaxPool2d-3           [-1, 64, 16, 16]               0
            Conv2d-4          [-1, 128, 16, 16]          73,856
         MaxPool2d-5            [-1, 128, 8, 8]               0
            Conv2d-6            [-1, 256, 8, 8]         295,168
            Conv2d-7            [-1, 256, 8, 8]         590,080
         MaxPool2d-8            [-1, 256, 4, 4]               0
            Conv2d-9            [-1, 512, 4, 4]       1,180,160
           Conv2d-10            [-1, 512, 4, 4]       2,359,808
        MaxPool2d-11            [-1, 512, 2, 2]               0
           Conv2d-12            [-1, 512, 2, 2]       2,359,808
           Conv2d-13            [-1, 512, 2, 2]       2,359,808
        MaxPool2d-14            [-1, 51
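
The `Param #` column above can be reproduced by hand: a convolution with a square kernel has `in_channels * out_channels * k * k` weights plus one bias per output channel. The sketch below checks this in plain Python for the eight conv layers in the summary. The helper name `conv2d_params` is illustrative and is not part of torchsummary.

```python
def conv2d_params(in_channels, out_channels, kernel_size=3, bias=True):
    """Learnable parameters in a 2D convolution with a square kernel."""
    weights = in_channels * out_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)

# (in_channels, out_channels) for the conv layers listed in the summary
conv_shapes = [(3, 64), (64, 128), (128, 256), (256, 256),
               (256, 512), (512, 512), (512, 512), (512, 512)]

counts = [conv2d_params(i, o) for i, o in conv_shapes]
print(counts)
# [1792, 73856, 295168, 590080, 1180160, 2359808, 2359808, 2359808]
print(sum(counts))  # 9220480
```

Pooling and ReLU layers have no learnable parameters, which is why they show 0 in the summary.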

### **Training and Evaluation function definition**

In [21]:
def train(epoch, max_epoch):
    print(f"Train Epoch [{epoch}/{max_epoch}]")
    model.train() # Model to train mode

    train_loss = 0
    correct = 0
    total = 0
    acc = 0

    for idx, (x, y) in enumerate(train_dataloader):
        x = x.to(device) # maps data into GPU memory
        y = y.to(device)

        optimizer.zero_grad() # reset gradients in optimizer before calculating the loss

        y_pred = model(x) # model inference
        loss = criterion(y_pred, y) # calculating the loss

        loss.backward() # back-propagation to get cumulative gradients

        optimizer.step() # update model parameters
        train_loss += loss.item() # loss.item() is the mean loss over this batch
        _, inference = y_pred.max(1) # predicted class index for each sample

        total += x.size(0)
        correct += inference.eq(y).sum().item()

        if idx % 100 == 0:
            print(f"Epoch [{epoch}/{max_epoch}] Batch [{idx}] Train Loss: {loss.item()}")

    acc = 100*correct/total
    # train_loss sums batch-mean losses, so train_loss/total is roughly (mean loss / batch size)
    print(f"Epoch [{epoch}/{max_epoch}] Train Loss: {train_loss/total} Train Accuracy: {acc}")

def valid(epoch, max_epoch):
    print(f"Valid Epoch [{epoch}/{max_epoch}]")
    model.eval() # Model to evaluation mode

    valid_loss = 0
    correct = 0
    total = 0

    for idx, (x, y) in enumerate(valid_dataloader):
        x = x.to(device) # maps data into GPU memory
        y = y.to(device)

        with torch.no_grad():
            y_pred = model(x) # model inference
            valid_loss += criterion(y_pred, y).item()

            _, inference = y_pred.max(1)

            total += x.size(0)
            correct += inference.eq(y).sum().item()

    acc = 100*correct/total
    # valid_loss sums batch-mean losses, so valid_loss/total is roughly (mean loss / batch size)
    print(f"Epoch [{epoch}/{max_epoch}] Valid Loss: {valid_loss/total} Valid Accuracy: {acc}")

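    # Save after every epoch, overwriting the previous checkpoint.
    # The model is wrapped in DataParallel, so state_dict keys carry a "module." prefix.
    os.makedirs('checkpoint', exist_ok=True)

    torch.save(model.state_dict(), f'checkpoint/{model_path}')
    print(f"Epoch [{epoch}/{max_epoch}] Valid Model Saved: checkpoint/{model_path}")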

# A custom learning rate schedule is used:
# divide the learning rate by 10 at epoch 5.
def lr_schedule(optimizer, epoch):
    if epoch == 5:
        lr = learning_rate / 10
        for param_group in optimizer.param_groups:
            param_group['lr'] = lr
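
A note on the epoch-level loss printed by `train` and `valid`: both add up `loss.item()`, which `nn.CrossEntropyLoss` returns as a per-batch mean by default, and then divide by the number of samples. The printed value is therefore roughly the per-sample loss divided by the batch size. The sketch below, in plain Python with made-up batch losses (every number in it is illustrative, not taken from a run), contrasts that with a sample-weighted average.

```python
# Hypothetical per-batch mean losses and batch sizes (the last batch is smaller).
batch_mean_losses = [2.0, 1.0, 4.0]
batch_sizes = [128, 128, 64]
total_samples = sum(batch_sizes)

# What the functions above print: sum of batch means / number of samples.
printed = sum(batch_mean_losses) / total_samples

# Sample-weighted mean: turn each batch mean back into a batch sum first.
weighted = sum(l * n for l, n in zip(batch_mean_losses, batch_sizes)) / total_samples

print(printed)   # 0.021875
print(weighted)  # 2.0
```

In the notebook, the weighted form would correspond to accumulating `loss.item() * x.size(0)` instead of `loss.item()`. That is a suggested variant, not what the logged run used.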

### **Data Preparation**
* This **tutorial** uses **torchvision**, which bundles well-known vision datasets, to load **CIFAR-10**.
* **CIFAR-10** is a dataset for classifying images into **10** categories.
* The **10** classes are **airplanes**, **cars**, **birds**, **cats**, **deer**, **dogs**, **frogs**, **horses**, **ships**, and **trucks**.
* There are **6,000** images per class: 50,000 in the training split and 10,000 in the test split, which is used here for validation.


In [22]:
import torchvision
import torchvision.transforms as transforms

from torch.utils.data import DataLoader

transform = transforms.Compose([
    transforms.ToTensor(), # PIL image -> float tensor with values in [0, 1]
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)), # per channel: (x - 0.5) / 0.5
])

train_dataset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
valid_dataset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)

train_dataloader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, num_workers=4)
valid_dataloader = DataLoader(valid_dataset, batch_size=batch_size, shuffle=False, num_workers=4)

Files already downloaded and verified
Files already downloaded and verified
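
The transform pipeline above first scales pixel values to [0, 1] with `ToTensor`, then applies `Normalize(mean, std)`, which computes `(x - mean) / std` per channel. With mean and std both set to 0.5, the result lies in [-1, 1]. A plain-Python sketch of that arithmetic follows. The helper `normalize` is illustrative and only stands in for the torchvision transform.

```python
def normalize(value, mean=0.5, std=0.5):
    """Per-channel normalization, the same formula transforms.Normalize applies."""
    return (value - mean) / std

# Raw 8-bit pixel values, scaled to [0, 1] the way ToTensor does (divide by 255).
for pixel in (0, 255):
    scaled = pixel / 255
    print(pixel, normalize(scaled))
# 0 -1.0
# 255 1.0
```

These constants are a generic choice that centers the data around zero. They are not the per-channel statistics of CIFAR-10.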


### **Training**

In [23]:
for epoch in range(0, max_epoch):
    lr_schedule(optimizer, epoch)
    train(epoch, max_epoch)
    valid(epoch, max_epoch)

Train Epoch [0/10]
Epoch [0/10] Batch [0] Train Loss: 2.302567958831787
Epoch [0/10] Batch [100] Train Loss: 2.302792549133301
Epoch [0/10] Batch [200] Train Loss: 2.30121111869812
Epoch [0/10] Batch [300] Train Loss: 2.2992587089538574
Epoch [0/10] Train Loss: 10294622.855603388 Train Accuracy: 9.984
Valid Epoch [0/10]
Epoch [0/10] Valid Loss: 0.018192924785614014 Valid Accuracy: 10.0
Epoch [0/10] Valid Model Saved: checkpoint/vggnet_cifar10.pt
Train Epoch [1/10]
Epoch [1/10] Batch [0] Train Loss: 2.3001837730407715
Epoch [1/10] Batch [100] Train Loss: 2.304408311843872
Epoch [1/10] Batch [200] Train Loss: 2.305873155593872
Epoch [1/10] Batch [300] Train Loss: 4775998390272.0
Epoch [1/10] Train Loss: 20976610627.678074 Train Accuracy: 9.996
Valid Epoch [1/10]
Epoch [1/10] Valid Loss: 0.01819319784641266 Valid Accuracy: 10.0
Epoch [1/10] Valid Model Saved: checkpoint/vggnet_cifar10.pt
Train Epoch [2/10]
Epoch [2/10] Batch [0] Train Loss: 2.301635980606079
Epoch [2/10] Batch [100] Train