<a href="https://colab.research.google.com/github/EggPudding/Deep-Learning-Practice-with-Codes/blob/main/GoogLeNet.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## **Going Deeper with Convolutions (CVPR 2015) Tutorial**

*   Practice implementation of the GoogLeNet architecture
*   Original paper: https://arxiv.org/abs/1409.4842
*   Before running, change the Colab **Runtime** type to **GPU**
*   The **CIFAR-10** dataset is used to keep the practice simple
*   Parts of the code are adapted from https://github.com/kuangliu/pytorch-cifar

In [None]:
!nvidia-smi # make sure a GPU is attached to the runtime

Fri Jan 22 07:26:18 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   43C    P8    10W /  70W |      0MiB / 15079MiB |      0%      Default |
|                               |                      |                 ERR! |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

### **GoogLeNet Model Definition**
* In this **tutorial**, the model is defined for the **CIFAR-10** dataset.
* **CIFAR-10** is a dataset for classifying images into **10** different categories.
* The **10** classes are **airplanes**, **cars**, **birds**, **cats**, **deer**, **dogs**, **frogs**, **horses**, **ships**, and **trucks**.
* There are **6,000** images per class (**60,000** images in total).
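
As a small illustration of the label space described above, the sketch below maps integer labels to class names and checks the per-class arithmetic. The name order is assumed to match the `classes` attribute of torchvision's `CIFAR10` dataset, which uses singular names such as `automobile` rather than "cars"; `label_to_name` is a hypothetical helper introduced only for this example.

```python
# Class names in label-index order (assumed to match torchvision's CIFAR10.classes).
CIFAR10_CLASSES = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]

IMAGES_PER_CLASS = 6_000
total_images = IMAGES_PER_CLASS * len(CIFAR10_CLASSES)


def label_to_name(label):
    """Hypothetical helper: turn an integer label into a readable class name."""
    return CIFAR10_CLASSES[label]


print(total_images)       # total number of images across all classes
print(label_to_name(3))   # name of the class with label 3
```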

In [1]:
import torch
import torch.nn as nn
import torch.nn.functional as F

import torch.backends.cudnn as cudnn
import torch.optim as optim

import os

# Inception module implementation
# GoogLeNet is built from small building blocks called Inception modules,
# so it is convenient to define the block as its own nn.Module.
class Inception(nn.Module):
  def __init__(self, in_planes, n1x1, n3x3red, n3x3, n5x5red, n5x5, pool_planes):
      super(Inception, self).__init__()
      # 1x1 conv branch
      self.b1 = nn.Sequential(
          nn.Conv2d(in_planes, n1x1, kernel_size=1),
          nn.BatchNorm2d(n1x1),
          nn.ReLU(True),
      )

      # 1x1 conv -> 3x3 conv branch
      self.b2 = nn.Sequential(
          nn.Conv2d(in_planes, n3x3red, kernel_size=1),
          nn.BatchNorm2d(n3x3red),
          nn.ReLU(True),
          nn.Conv2d(n3x3red, n3x3, kernel_size=3, padding=1),
          nn.BatchNorm2d(n3x3),
          nn.ReLU(True),
      )

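      # 1x1 conv -> 5x5 conv branch
      # (the 5x5 conv is implemented here as two stacked 3x3 convs, which cover the same 5x5 receptive field)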
      self.b3 = nn.Sequential(
          nn.Conv2d(in_planes, n5x5red, kernel_size=1),
          nn.BatchNorm2d(n5x5red),
          nn.ReLU(True),
          nn.Conv2d(n5x5red, n5x5, kernel_size=3, padding=1),
          nn.BatchNorm2d(n5x5),
          nn.ReLU(True),
          nn.Conv2d(n5x5, n5x5, kernel_size=3, padding=1),
          nn.BatchNorm2d(n5x5),
          nn.ReLU(True),
      )

      # 3x3 pool -> 1x1 conv branch
      self.b4 = nn.Sequential(
          nn.MaxPool2d(3, stride=1, padding=1),
          nn.Conv2d(in_planes, pool_planes, kernel_size=1),
          nn.BatchNorm2d(pool_planes),
          nn.ReLU(True),
      )

  def forward(self, x):
      y1 = self.b1(x)
      y2 = self.b2(x)
      y3 = self.b3(x)
      y4 = self.b4(x)
      return torch.cat([y1,y2,y3,y4], 1)

# GoogLeNet Implementation
class GoogLeNet(nn.Module):
  def __init__(self):
      super(GoogLeNet, self).__init__()
      self.pre_layers = nn.Sequential(
          nn.Conv2d(3, 192, kernel_size=3, padding=1),
          nn.BatchNorm2d(192),
          nn.ReLU(True),
      )

      self.a3 = Inception(192,  64,  96, 128, 16, 32, 32)
      self.b3 = Inception(256, 128, 128, 192, 32, 96, 64)

      self.maxpool = nn.MaxPool2d(3, stride=2, padding=1)

      self.a4 = Inception(480, 192,  96, 208, 16,  48,  64)
      self.b4 = Inception(512, 160, 112, 224, 24,  64,  64)
      self.c4 = Inception(512, 128, 128, 256, 24,  64,  64)
      self.d4 = Inception(512, 112, 144, 288, 32,  64,  64)
      self.e4 = Inception(528, 256, 160, 320, 32, 128, 128)

      self.a5 = Inception(832, 256, 160, 320, 32, 128, 128)
      self.b5 = Inception(832, 384, 192, 384, 48, 128, 128)

      self.avgpool = nn.AvgPool2d(8, stride=1) # averages the final 8x8 feature map down to 1x1
      self.linear = nn.Linear(1024, 10) # 1024 channels from b5 -> 10 CIFAR-10 classes

  def forward(self, x):
      out = self.pre_layers(x)
      out = self.a3(out)
      out = self.b3(out)
      out = self.maxpool(out)
      out = self.a4(out)
      out = self.b4(out)
      out = self.c4(out)
      out = self.d4(out)
      out = self.e4(out)
      out = self.maxpool(out)
      out = self.a5(out)
      out = self.b5(out)
      out = self.avgpool(out)
      out = out.view(out.size(0), -1)
      out = self.linear(out)
      return out
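
The channel and spatial sizes in the model above can be checked by hand. The following sketch is plain Python arithmetic, not part of the original notebook: it copies the `Inception(...)` arguments from `GoogLeNet.__init__`, confirms that each block's input width equals the previous block's concatenated output, and follows a 32 x 32 input through the two max-pool layers and the average pool. The helper names `inception_out_channels` and `pool_out_size` are hypothetical.

```python
# (in_planes, n1x1, n3x3red, n3x3, n5x5red, n5x5, pool_planes), copied from GoogLeNet.__init__
INCEPTION_CONFIGS = {
    "a3": (192, 64, 96, 128, 16, 32, 32),
    "b3": (256, 128, 128, 192, 32, 96, 64),
    "a4": (480, 192, 96, 208, 16, 48, 64),
    "b4": (512, 160, 112, 224, 24, 64, 64),
    "c4": (512, 128, 128, 256, 24, 64, 64),
    "d4": (512, 112, 144, 288, 32, 64, 64),
    "e4": (528, 256, 160, 320, 32, 128, 128),
    "a5": (832, 256, 160, 320, 32, 128, 128),
    "b5": (832, 384, 192, 384, 48, 128, 128),
}


def inception_out_channels(cfg):
    """Channels after torch.cat over the four branches (the reduce widths do not count)."""
    _, n1x1, _, n3x3, _, n5x5, pool_planes = cfg
    return n1x1 + n3x3 + n5x5 + pool_planes


def pool_out_size(size, kernel, stride, padding):
    """Output side length of a pooling layer with floor rounding."""
    return (size + 2 * padding - kernel) // stride + 1


channels = 192  # output channels of pre_layers
for name, cfg in INCEPTION_CONFIGS.items():
    assert cfg[0] == channels, f"{name} expects {cfg[0]} input channels, got {channels}"
    channels = inception_out_channels(cfg)

size = 32                               # CIFAR-10 images are 32 x 32
size = pool_out_size(size, 3, 2, 1)     # first maxpool (after b3)
size = pool_out_size(size, 3, 2, 1)     # second maxpool (after e4)
size = pool_out_size(size, 8, 1, 0)     # AvgPool2d(8, stride=1)

flattened_features = channels * size * size
print(channels, size, flattened_features)
```

The flattened feature count is what `nn.Linear(1024, 10)` has to accept, which is why the average-pool kernel size is tied to the 32 x 32 input resolution.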

### **Hyper Parameter Setting**

In [2]:
device = "cuda" if torch.cuda.is_available() else "cpu" # use the GPU when available, otherwise the CPU

model = GoogLeNet() # instantiate the model
model.to(device) # move the model's weights and biases to the selected device
model = torch.nn.DataParallel(model) # wrap the model for multi-GPU data parallelism

cudnn.benchmark = True # let cuDNN benchmark and pick the fastest convolution algorithms for the fixed input size

learning_rate = 0.001
batch_size = 128
max_epoch = 5

model_path = 'googlenet_cifar10.pt'

criterion = nn.CrossEntropyLoss() # simple cross-entropy is utilized
optimizer = optim.Adam(model.parameters(), lr=learning_rate) # Adam optimizer utilized
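
One consequence of wrapping the model in `torch.nn.DataParallel` is that the wrapped network lives under an attribute named `module`, so every key in `model.state_dict()` starts with `module.`. The sketch below shows one way to strip that prefix before loading a checkpoint into an unwrapped model. It uses a toy dictionary with placeholder strings instead of real tensors, and `strip_module_prefix` is a hypothetical helper, not something defined in the notebook or in PyTorch.

```python
from collections import OrderedDict


def strip_module_prefix(state_dict, prefix="module."):
    """Hypothetical helper: drop the DataParallel prefix from every key that has it."""
    return OrderedDict(
        (key[len(prefix):] if key.startswith(prefix) else key, value)
        for key, value in state_dict.items()
    )


# Toy stand-in for a state dict saved from the DataParallel-wrapped model.
# The string values are placeholders for tensors.
wrapped_state = OrderedDict([
    ("module.pre_layers.0.weight", "tensor A"),
    ("module.linear.bias", "tensor B"),
])

plain_state = strip_module_prefix(wrapped_state)
print(list(plain_state))
```

The cleaned dictionary could then be passed to `load_state_dict` on a plain `GoogLeNet()` instance. Saving `model.module.state_dict()` instead of `model.state_dict()` is another way to avoid the prefix.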

* **torchsummary** is a package for summarizing a PyTorch model.
* It lists each layer together with its output shape and number of parameters.

In [3]:
import torchsummary

torchsummary.summary(model.cuda(), (3, 32, 32)) # a CIFAR-10 image tensor has shape 3 x 32 x 32 (channels first)

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1          [-1, 192, 32, 32]           5,376
       BatchNorm2d-2          [-1, 192, 32, 32]             384
              ReLU-3          [-1, 192, 32, 32]               0
            Conv2d-4           [-1, 64, 32, 32]          12,352
       BatchNorm2d-5           [-1, 64, 32, 32]             128
              ReLU-6           [-1, 64, 32, 32]               0
            Conv2d-7           [-1, 96, 32, 32]          18,528
       BatchNorm2d-8           [-1, 96, 32, 32]             192
              ReLU-9           [-1, 96, 32, 32]               0
           Conv2d-10          [-1, 128, 32, 32]         110,720
      BatchNorm2d-11          [-1, 128, 32, 32]             256
             ReLU-12          [-1, 128, 32, 32]               0
           Conv2d-13           [-1, 16, 32, 32]           3,088
      BatchNorm2d-14           [-1, 16,
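
The parameter counts in the summary rows above can be reproduced by hand. The sketch below is an illustrative check, assuming the default `bias=True` for `nn.Conv2d` and the default learnable scale and shift for `nn.BatchNorm2d`; the helper names are hypothetical.

```python
def conv2d_params(in_channels, out_channels, kernel_size, bias=True):
    """Weights (out x in x k x k) plus one bias per output channel."""
    weights = out_channels * in_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)


def batchnorm2d_params(num_features):
    """One learnable scale and one learnable shift per channel."""
    return 2 * num_features


print(conv2d_params(3, 192, 3))     # Conv2d-1: pre_layers 3x3 conv
print(batchnorm2d_params(192))      # BatchNorm2d-2
print(conv2d_params(192, 64, 1))    # Conv2d-4: a3 1x1 branch
print(conv2d_params(192, 96, 1))    # Conv2d-7: a3 3x3 reduce
print(conv2d_params(96, 128, 3))    # Conv2d-10: a3 3x3 conv
print(conv2d_params(192, 16, 1))    # Conv2d-13: a3 5x5 reduce
```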

### **Training and Evaluation function definition**

In [4]:
def train(epoch, max_epoch):
    print(f"Train Epoch [{epoch}/{max_epoch}]")
    model.train() # Model to train mode

    train_loss = 0
    correct = 0
    total = 0
    acc = 0

    for idx, (x, y) in enumerate(train_dataloader):
        x = x.to(device) # move the batch to the selected device (GPU if available)
        y = y.to(device)

        optimizer.zero_grad() # clear gradients left over from the previous step

        y_pred = model(x) # forward pass
        loss = criterion(y_pred, y) # cross-entropy loss, averaged over the batch

        loss.backward() # back-propagation: compute gradients of the loss

        optimizer.step() # update model parameters
        train_loss += loss.item()
        _, inference = y_pred.max(1)

        total += x.size(0)
        correct += inference.eq(y).sum().item()

        if idx % 100 == 0:
            print(f"Epoch [{epoch}/{max_epoch}] Batch [{idx}] Train Loss: {loss.item()}")

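    acc = 100*correct/total
    # train_loss sums per-batch mean losses, so train_loss/total is roughly the average batch loss divided by the batch size
    print(f"Epoch [{epoch}/{max_epoch}] Train Loss: {train_loss/total} Train Accuracy: {acc}")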

def valid(epoch, max_epoch):
    print(f"Valid Epoch [{epoch}/{max_epoch}]")
    model.eval() # Model to evaluation mode

    valid_loss = 0
    correct = 0
    total = 0

    for idx, (x, y) in enumerate(valid_dataloader):
        x = x.to(device) # move the batch to the selected device (GPU if available)
        y = y.to(device)

        with torch.no_grad():
            y_pred = model(x) # model inference
            valid_loss += criterion(y_pred, y).item()

            _, inference = y_pred.max(1)

            total += x.size(0)
            correct += inference.eq(y).sum().item()

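    acc = 100*correct/total
    # valid_loss sums per-batch mean losses, so valid_loss/total is roughly the average batch loss divided by the batch size
    print(f"Epoch [{epoch}/{max_epoch}] Valid Loss: {valid_loss/total} Valid Accuracy: {acc}")

    os.makedirs('checkpoint', exist_ok=True) # create the checkpoint directory if it does not exist

    # Saved after every epoch, overwriting the previous checkpoint; keys carry the "module." prefix added by DataParallel
    torch.save(model.state_dict(), f'checkpoint/{model_path}')
    print(f"Epoch [{epoch}/{max_epoch}] Valid Model Saved: checkpoint/{model_path}")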
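
Both functions above print the accumulated loss divided by the number of samples. Since `nn.CrossEntropyLoss` averages over the batch by default, that quotient is much smaller than the per-sample loss. The sketch below uses made-up batch losses and sizes (hypothetical values, not taken from a run) to compare the printed quantity with two common alternatives.

```python
# Hypothetical per-batch mean losses and batch sizes (the last batch is smaller).
batch_mean_losses = [2.0, 1.0, 0.5]
batch_sizes = [128, 128, 64]

total = sum(batch_sizes)
sum_of_batch_means = sum(batch_mean_losses)

# What train() and valid() print: sum of batch means divided by the sample count.
as_printed = sum_of_batch_means / total

# Alternative 1: average of the batch means (every batch weighted equally).
per_batch_average = sum_of_batch_means / len(batch_mean_losses)

# Alternative 2: true per-sample average (each batch weighted by its size).
per_sample_average = sum(
    loss * size for loss, size in zip(batch_mean_losses, batch_sizes)
) / total

print(as_printed, per_batch_average, per_sample_average)
```

In the training loop, the per-sample variant would correspond to accumulating `loss.item() * x.size(0)` instead of `loss.item()`. This is a possible adjustment, not what the notebook does.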

### **Data Preparation**
* In this **tutorial**, we use **torchvision**, which provides well-known vision datasets, including **CIFAR-10**.


In [5]:
import torchvision
import torchvision.transforms as transforms

from torch.utils.data import DataLoader

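transform = transforms.Compose([
  transforms.ToTensor(), # PIL image -> float tensor with values in [0, 1]
  transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)), # per channel: (x - 0.5) / 0.5, giving values in [-1, 1]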
])

train_dataset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
valid_dataset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)

train_dataloader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, num_workers=4)
valid_dataloader = DataLoader(valid_dataset, batch_size=batch_size, shuffle=False, num_workers=4)

Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to ./data/cifar-10-python.tar.gz


Extracting ./data/cifar-10-python.tar.gz to ./data
Files already downloaded and verified
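
Two pieces of arithmetic sit behind the data pipeline above: the value range produced by `Normalize` with mean 0.5 and standard deviation 0.5, and the number of batches each loader yields per epoch. The sketch below works both out in plain Python, assuming the standard CIFAR-10 split of 50,000 training and 10,000 test images and the `DataLoader` default of keeping the last partial batch. The `normalize` helper is hypothetical and mirrors the per-channel formula only.

```python
import math


def normalize(value, mean=0.5, std=0.5):
    """Per-channel formula applied by Normalize: (value - mean) / std."""
    return (value - mean) / std


# ToTensor yields values in [0, 1]; after normalization they lie in [-1, 1].
print(normalize(0.0), normalize(0.5), normalize(1.0))

train_images = 50_000   # assumed CIFAR-10 training split size
valid_images = 10_000   # assumed CIFAR-10 test split size
batch_size = 128        # same value as in the hyperparameter cell

# The last, smaller batch is kept, so the count rounds up.
train_batches = math.ceil(train_images / batch_size)
valid_batches = math.ceil(valid_images / batch_size)
print(train_batches, valid_batches)
```

With this many training batches, a progress line printed every 100 batches appears at indices 0, 100, 200, and 300 in each epoch.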


### **Training**

In [6]:
for epoch in range(0, max_epoch):
    train(epoch, max_epoch)
    valid(epoch, max_epoch)

Train Epoch [0/5]
Epoch [0/5] Batch [0] Train Loss: 2.3278722763061523
Epoch [0/5] Batch [100] Train Loss: 1.4095921516418457
Epoch [0/5] Batch [200] Train Loss: 1.1260926723480225
Epoch [0/5] Batch [300] Train Loss: 0.9474471211433411
Epoch [0/5] Train Loss: 0.009576131782531739 Train Accuracy: 55.25
Valid Epoch [0/5]
Epoch [0/5] Valid Loss: 0.008457013821601867 Valid Accuracy: 62.76
Epoch [0/5] Valid Model Saved: checkpoint/googlenet_cifar10.pt
Train Epoch [1/5]
Epoch [1/5] Batch [0] Train Loss: 0.7021706104278564
Epoch [1/5] Batch [100] Train Loss: 0.5628349184989929
Epoch [1/5] Batch [200] Train Loss: 0.6900032758712769
Epoch [1/5] Batch [300] Train Loss: 0.8339300155639648
Epoch [1/5] Train Loss: 0.005804524007439613 Train Accuracy: 73.86
Valid Epoch [1/5]
Epoch [1/5] Valid Loss: 0.005654138326644897 Valid Accuracy: 75.2
Epoch [1/5] Valid Model Saved: checkpoint/googlenet_cifar10.pt
Train Epoch [2/5]
Epoch [2/5] Batch [0] Train Loss: 0.513065755367279
Epoch [2/5] Batch [100] Train