---
title: VGG Paper Implementation from Scratch
author: Aras Edeş
date: '2023-06-29'
categories:
  - IN PROGRESS
  - Deep Learning
  - PyTorch
  - VGG
image: vgg16.png
execute: 
  enabled: false
editor:
  render-on-save: true
---

# Introduction
In this article I will guide you through an implementation of the [VGG paper](https://arxiv.org/abs/1409.1556). The paper was published in 2015 under the title "Very Deep Convolutional Networks for Large-Scale Image Recognition". I agree, quite a mouthful, but the proposed models managed to place second in Classification and first in Localization in the ImageNet 2014 Challenge. Mainly, the authors investigate the following:

1. Use of deeper networks (Convolutional that is)
2. Small filter sizes (3x3 and 1x1)
3. Systematically decreasing feature size and increasing filter count
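
To make the case for small filters concrete, the paper argues that a stack of three 3x3 convolutions covers the same 7x7 receptive field as a single 7x7 convolution while using fewer weights (27C² versus 49C² for C channels). The snippet below is a minimal sketch that checks this arithmetic; the helper names `conv_stack_params` and `receptive_field` are hypothetical and biases are left out to match the paper's count.

```python
from torch import nn


def conv_stack_params(kernel_size, depth, channels):
    """Weight count of `depth` stacked convs with `channels` in and out (no bias)."""
    layers = [nn.Conv2d(channels, channels, kernel_size, bias=False)
              for _ in range(depth)]
    return sum(p.numel() for layer in layers for p in layer.parameters())


def receptive_field(depth, kernel_size=3):
    """Receptive field of `depth` stacked stride-1 convolutions."""
    return 1 + depth * (kernel_size - 1)


C = 64
three_3x3 = conv_stack_params(kernel_size=3, depth=3, channels=C)  # 27 * C^2
one_7x7 = conv_stack_params(kernel_size=7, depth=1, channels=C)    # 49 * C^2

print(receptive_field(depth=3))  # 7, same as a single 7x7 filter
print(three_3x3, one_7x7)        # 110592 200704
```

The deeper stack also applies three ReLUs instead of one, which the paper credits with making the decision function more discriminative.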

The architectures presented in the paper consist of Conv, MaxPool, and Fully-Connected layers with ReLU activations (apart from the final softmax for multi-class classification). Nowadays BatchNorm is in wide use, but it wasn't on the map at the time of this paper (one of the architectures does use LocalResponseNormalization, but it is reported to bring no benefit). Dropout is applied only to the first two fully-connected layers during training. Layer configurations can be summarized as:

1. ReLU Activation after every parametrized layer
2. MaxPool layers with a (2x2) window and stride 2
3. Common fully-connected layers across the proposed architectures
4. Convolutional layers with (3x3) and (1x1) filters with "same" padding. Filter count ranges from 64 to 512 and is incremented by doubling
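
The layer rules above can be sketched as a single VGG-style block. This is only an illustration, assuming a 224x224 RGB input and the 64-channel first block: with `padding=1` the 3x3 convolutions keep the spatial size, and the 2x2 max-pool with stride 2 halves it.

```python
import torch
from torch import nn

# One block: two 3x3 convs (each followed by ReLU), then a 2x2 max-pool.
block = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, padding=1),   # "same" padding for 3x3
    nn.ReLU(inplace=True),
    nn.Conv2d(64, 64, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=2, stride=2),        # halves height and width
)

x = torch.randn(1, 3, 224, 224)   # dummy batch with one RGB image
conv_out = block[:4](x)           # output before pooling
pooled = block(x)                 # output after pooling

print(conv_out.shape)  # torch.Size([1, 64, 224, 224])
print(pooled.shape)    # torch.Size([1, 64, 112, 112])
```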

# Architecture Summary
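Summary of the architectures as given in the paper (Table 1). `conv3-64` denotes a convolutional layer with 3x3 filters and 64 output channels, and `-` marks an empty cell: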

| A | A-LRN | B | C | D | E |
|---|---|---|---|---|---|
| 11 layers | 11 layers | 13 layers | 16 layers | 16 layers | 19 layers |
| conv3-64 | conv3-64 | conv3-64 | conv3-64 | conv3-64 | conv3-64 |
| - | LRN | conv3-64 | conv3-64 | conv3-64 | conv3-64 |
| maxpool | maxpool | maxpool | maxpool | maxpool | maxpool |
| conv3-128 | conv3-128 | conv3-128 | conv3-128 | conv3-128 | conv3-128 |
| - | - | conv3-128 | conv3-128 | conv3-128 | conv3-128 |
| maxpool | maxpool | maxpool | maxpool | maxpool | maxpool |
| conv3-256 | conv3-256 | conv3-256 | conv3-256 | conv3-256 | conv3-256 |
| conv3-256 | conv3-256 | conv3-256 | conv3-256 | conv3-256 | conv3-256 |
| - | - | - | conv1-256 | conv3-256 | conv3-256 |
| - | - | - | - | - | conv3-256 |
| maxpool | maxpool | maxpool | maxpool | maxpool | maxpool |
| conv3-512 | conv3-512 | conv3-512 | conv3-512 | conv3-512 | conv3-512 |
| conv3-512 | conv3-512 | conv3-512 | conv3-512 | conv3-512 | conv3-512 |
| - | - | - | conv1-512 | conv3-512 | conv3-512 |
| - | - | - | - | - | conv3-512 |
| maxpool | maxpool | maxpool | maxpool | maxpool | maxpool |
| conv3-512 | conv3-512 | conv3-512 | conv3-512 | conv3-512 | conv3-512 |
| conv3-512 | conv3-512 | conv3-512 | conv3-512 | conv3-512 | conv3-512 |
| - | - | - | conv1-512 | conv3-512 | conv3-512 |
| - | - | - | - | - | conv3-512 |
| maxpool | maxpool | maxpool | maxpool | maxpool | maxpool |
| FC-4096 | FC-4096 | FC-4096 | FC-4096 | FC-4096 | FC-4096 |
| FC-4096 | FC-4096 | FC-4096 | FC-4096 | FC-4096 | FC-4096 |
| FC-1000 | FC-1000 | FC-1000 | FC-1000 | FC-1000 | FC-1000 |
| soft-max | soft-max | soft-max | soft-max | soft-max | soft-max |
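
One way to turn the columns of the table into code is a small configuration-driven builder. The sketch below is not the paper's code: the `CONFIGS` encoding and the `make_features` helper are hypothetical, A-LRN is left out for brevity, and only the convolutional part is built (the three fully-connected layers are added to the count by hand).

```python
import torch
from torch import nn

# Assumed encoding: an int is a conv3 layer with that many output channels,
# a tuple (1, n) is a conv1 layer with n channels, "M" is a 2x2 max-pool.
CONFIGS = {
    "A": [64, "M", 128, "M", 256, 256, "M", 512, 512, "M", 512, 512, "M"],
    "B": [64, 64, "M", 128, 128, "M", 256, 256, "M",
          512, 512, "M", 512, 512, "M"],
    "C": [64, 64, "M", 128, 128, "M", 256, 256, (1, 256), "M",
          512, 512, (1, 512), "M", 512, 512, (1, 512), "M"],
    "D": [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
          512, 512, 512, "M", 512, 512, 512, "M"],
    "E": [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
          512, 512, 512, 512, "M", 512, 512, 512, 512, "M"],
}


def make_features(config, in_channels=3):
    """Build the convolutional part of a VGG variant from a config list."""
    layers = []
    for item in config:
        if item == "M":
            layers.append(nn.MaxPool2d(kernel_size=2, stride=2))
            continue
        kernel, out_channels = item if isinstance(item, tuple) else (3, item)
        # padding = kernel // 2 keeps the spatial size for 3x3 and 1x1 filters
        layers.append(nn.Conv2d(in_channels, out_channels,
                                kernel_size=kernel, padding=kernel // 2))
        layers.append(nn.ReLU(inplace=True))
        in_channels = out_channels
    return nn.Sequential(*layers)


# Weight layers = conv layers + 3 fully-connected layers.
weight_layers = {}
for name, config in CONFIGS.items():
    features = make_features(config)
    n_conv = sum(isinstance(m, nn.Conv2d) for m in features)
    weight_layers[name] = n_conv + 3
print(weight_layers)  # {'A': 11, 'B': 13, 'C': 16, 'D': 16, 'E': 19}

# Five max-pools reduce a 224x224 input to 7x7 with 512 channels.
with torch.no_grad():
    out = make_features(CONFIGS["A"])(torch.randn(1, 3, 224, 224))
print(out.shape)  # torch.Size([1, 512, 7, 7])
```

The flattened 512 x 7 x 7 output is what the first FC-4096 layer would take as input.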

# Implementation

In [4]:
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision import transforms

In [None]:
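# Preprocessing shared by all splits. Resizing and cropping to 224x224 is an
# assumption (the paper's input size); it also lets images be batched together.
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),  # must be an instance, not the class itself
])

train_dataset = datasets.Country211("data/",
                                split="train",
                                download=True,
                                transform=transform)
valid_dataset = datasets.Country211("data/",
                                split="valid",
                                download=True,
                                transform=transform)
test_dataset = datasets.Country211("data/",
                                split="test",
                                download=True,
                                transform=transform)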

# Shuffle only the training data; evaluation order does not need to change.
train_dataloader = DataLoader(train_dataset,
                        batch_size=32,
                        shuffle=True,
                        num_workers=2)
valid_dataloader = DataLoader(valid_dataset,
                        batch_size=32,
                        shuffle=False,
                        num_workers=2)
test_dataloader = DataLoader(test_dataset,
                        batch_size=32,
                        shuffle=False,
                        num_workers=2)