📝 **Author:** Amirhossein Heydari - 📧 **Email:** amirhosseinheydari78@gmail.com - 📍 **Linktree:** [linktr.ee/mr_pylin](https://linktr.ee/mr_pylin)

---

# Dependencies
   - torchvision models can be imported in two ways:
      - class (e.g. `VGG`)
         - Imports the model class itself
         - Allows more control and customization because you work with the class directly: you can subclass it, override methods, customize initialization, etc.
      - function (e.g. `vgg11`, `vgg13`, `vgg16`, `vgg19`)
         - Imports a builder function that returns an instance of the model
         - Easier and quicker to use, especially for standard models
   - [pytorch.org/vision/stable/models.html](https://pytorch.org/vision/stable/models.html)

In [None]:
import torch
from torch import nn
from torchinfo import summary
from torchvision.models import VGG, vgg11, vgg13, vgg16, vgg19

# VGGNet
   - Developed in 2014 by [Karen Simonyan](https://dblp.uni-trier.de/search/author?author=Karen%20Simonyan) and [Andrew Zisserman](https://dblp.uni-trier.de/pid/z/AndrewZisserman.html?q=Andrew%20Zisserman) from the Visual Geometry Group ([VGG](https://www.robots.ox.ac.uk/~vgg/index.html)) at the University of [Oxford](https://www.ox.ac.uk/).
   - It is based on the [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556) paper
   - It was trained on the [ImageNet](https://www.image-net.org/) dataset (first resized to 256x256 then center cropped to 224x224) [[ImageNet viewer](https://navigu.net/#imagenet)]
   - Known for its simple and uniform architecture, using small `3x3` convolutional filters consistently throughout the network
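   - It comes in several variants, primarily `VGG11`, `VGG13`, `VGG16` and `VGG19`, where the number is the count of layers with trainable weights (convolutional plus fully connected layers; pooling and activation layers are not counted)
   - The `runner-up` in the classification task of the ImageNet Large Scale Visual Recognition Challenge ([ILSVRC](https://image-net.org/challenges/LSVRC/2014/)) in 2014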
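
As a quick check of the variant names, the sketch below counts only the layers that carry weights: the convolutions in each configuration plus the three fully connected layers of the classifier. The configuration lists mirror the ones used for the custom models in this notebook; the helper name `count_weight_layers` is made up for this example.

```python
# feature-extractor configurations:
# integers are conv output channels, 'M' marks a max-pooling layer
configs = {
    "VGG11": [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    "VGG13": [64, 64, 'M', 128, 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    "VGG16": [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M', 512, 512, 512, 'M', 512, 512, 512, 'M'],
    "VGG19": [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M', 512, 512, 512, 512, 'M', 512, 512, 512, 512, 'M'],
}

NUM_FC_LAYERS = 3  # the classifier has three nn.Linear layers


def count_weight_layers(cfg: list) -> int:
    """Count conv layers in the config and add the fully connected layers."""
    num_conv = sum(isinstance(v, int) for v in cfg)
    return num_conv + NUM_FC_LAYERS


weight_layers = {name: count_weight_layers(cfg) for name, cfg in configs.items()}
print(weight_layers)
```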

<figure style="text-align: center;">
    <img src="../../../assets/images/original/cnn/architectures/vgg16.svg" alt="vgg16-architecture.svg" style="width: 100%;">
    <figcaption>VGG16 Architecture</figcaption>
</figure>

## Custom VGGNet
   - `Softmax` is intentionally omitted: the model returns raw logits, and `nn.CrossEntropyLoss` applies `LogSoftmax` internally.
   - There is also an extension of VGGNet that inserts `nn.BatchNorm2d` between each `nn.Conv2d` and `nn.ReLU` in the `features` section (available in torchvision as `vgg11_bn`, `vgg13_bn`, `vgg16_bn` and `vgg19_bn`).
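
The first point can be checked numerically. In this minimal sketch the logits and targets are random stand-ins for real model outputs and labels (an assumption made for illustration); it compares `nn.CrossEntropyLoss` on raw logits against an explicit `LogSoftmax` followed by `nn.NLLLoss`.

```python
import torch
from torch import nn

torch.manual_seed(0)

# stand-ins for a batch of 4 model outputs and their labels (made-up data)
logits = torch.randn(4, 1000)            # raw scores, no Softmax applied
targets = torch.randint(0, 1000, (4,))   # class indices

# CrossEntropyLoss works directly on logits
loss_ce = nn.CrossEntropyLoss()(logits, targets)

# the same quantity computed in two explicit steps
log_probs = nn.LogSoftmax(dim=1)(logits)
loss_manual = nn.NLLLoss()(log_probs, targets)

print(torch.allclose(loss_ce, loss_manual))

# probabilities are only needed at inference time and can be computed on demand
probs = torch.softmax(logits, dim=1)
```

Adding a `Softmax` layer to the model and then using `nn.CrossEntropyLoss` would therefore normalize the outputs twice.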

In [None]:
class CustomVGGNet(nn.Module):
    def __init__(self, feature_layers: list, num_classes: int = 1000) -> None:
        super().__init__()

        self.features = nn.Sequential(*self._make_layers(feature_layers))

        # 512x7x7 -> 512x7x7 (for 224x224 inputs; other feature-map sizes are pooled to 7x7)
        # trainable params: 0
        self.avgpool = nn.AdaptiveAvgPool2d(output_size=(7, 7))

        # flatten : 512x7x7 -> 25088
        # 25088 -> 1000
        self.classifier = nn.Sequential(

            # 25088 -> 4096
            # trainable params: (25088 + 1) * 4096 = 102,764,544
            nn.Linear(25088, 4096),

            # 4096 -> 4096
            # trainable params: 0
            nn.ReLU(inplace=True),

            # 4096 -> 4096
            # trainable params: 0
            nn.Dropout(),

            # 4096 -> 4096
            # trainable params: (4096 + 1) * 4096 = 16,781,312
            nn.Linear(4096, 4096),

            # 4096 -> 4096
            # trainable params: 0
            nn.ReLU(inplace=True),

            # 4096 -> 4096
            # trainable params: 0
            nn.Dropout(),

            # 4096 -> num_classes (1000 by default)
            # trainable params (num_classes=1000): (4096 + 1) * 1000 = 4,097,000
            nn.Linear(4096, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:

        # feature extractor
        x = self.features(x)

        # adaptive average pooling
        x = self.avgpool(x)

        # flatten : 512x7x7 -> 25088
        x = torch.flatten(x, start_dim=1)

        # classifier
        x = self.classifier(x)

        return x

    def _make_layers(self, cfg: list) -> list:
        layers = []
        in_channels = 3

        for x in cfg:
            if x == 'M':
                layers += [nn.MaxPool2d(kernel_size=2, stride=2)]
            else:
                layers += [nn.Conv2d(in_channels, x, kernel_size=3, padding=1), nn.ReLU(inplace=True)]
                in_channels = x

        return layers
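
The batch-normalized extension mentioned in the notes above can be sketched as a variant of `_make_layers`. The standalone function name `make_layers_bn` and the short two-block configuration are hypothetical, chosen only to keep the example small; the layer order (convolution, batch norm, ReLU) follows the description in the notes.

```python
import torch
from torch import nn


def make_layers_bn(cfg: list, in_channels: int = 3) -> list:
    """Build VGG-style feature layers with BatchNorm2d between Conv2d and ReLU."""
    layers = []
    for v in cfg:
        if v == 'M':
            layers.append(nn.MaxPool2d(kernel_size=2, stride=2))
        else:
            layers += [
                nn.Conv2d(in_channels, v, kernel_size=3, padding=1),
                nn.BatchNorm2d(v),
                nn.ReLU(inplace=True),
            ]
            in_channels = v
    return layers


# hypothetical shortened config: two conv blocks, each followed by max-pooling
features_bn = nn.Sequential(*make_layers_bn([64, 'M', 128, 'M']))

# each 'M' halves the spatial size: 32x32 -> 16x16 -> 8x8
out = features_bn(torch.randn(2, 3, 32, 32))
print(out.shape)
```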

### VGG11

In [None]:
vgg11_1 = CustomVGGNet(
    feature_layers=[64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    num_classes=1000
)

vgg11_1

In [None]:
summary(vgg11_1, (1, 3, 224, 224), device='cpu')

### VGG13

In [None]:
vgg13_1 = CustomVGGNet(
    feature_layers=[64, 64, 'M', 128, 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    num_classes=1000
)

vgg13_1

In [None]:
summary(vgg13_1, (1, 3, 224, 224), device='cpu')

### VGG16

In [None]:
vgg16_1 = CustomVGGNet(
    feature_layers=[64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M', 512, 512, 512, 'M', 512, 512, 512, 'M'],
    num_classes=1000
)

vgg16_1

In [None]:
summary(vgg16_1, (1, 3, 224, 224), device='cpu')

### VGG19

In [None]:
vgg19_1 = CustomVGGNet(
    feature_layers=[64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M', 512, 512, 512, 512, 'M', 512, 512, 512, 512, 'M'],
    num_classes=1000
)

vgg19_1

In [None]:
summary(vgg19_1, (1, 3, 224, 224), device='cpu')
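
The per-layer formulas written in the `CustomVGGNet` comments can be extended to the whole network. The sketch below is plain Python (no PyTorch needed) and assumes the same setup as above: `3x3` convolutions with bias, a `512x7x7` feature map entering the classifier, and 1000 classes. The helper name `count_vgg_params` is made up for this example.

```python
configs = {
    "VGG11": [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    "VGG13": [64, 64, 'M', 128, 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    "VGG16": [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M', 512, 512, 512, 'M', 512, 512, 512, 'M'],
    "VGG19": [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M', 512, 512, 512, 512, 'M', 512, 512, 512, 512, 'M'],
}


def count_vgg_params(cfg: list, num_classes: int = 1000, in_channels: int = 3) -> int:
    """Sum trainable parameters of the conv layers and the three linear layers."""
    total = 0

    # conv layer: (3 * 3 * in_channels weights + 1 bias) per output channel
    for v in cfg:
        if v == 'M':
            continue  # max-pooling has no trainable parameters
        total += (3 * 3 * in_channels + 1) * v
        in_channels = v

    # linear layer: (in_features weights + 1 bias) per output feature
    fc_sizes = [(512 * 7 * 7, 4096), (4096, 4096), (4096, num_classes)]
    total += sum((n_in + 1) * n_out for n_in, n_out in fc_sizes)

    return total


param_counts = {name: count_vgg_params(cfg) for name, cfg in configs.items()}
for name, count in param_counts.items():
    print(f"{name}: {count:,}")
```

These totals should agree with the `Total params` line that `summary` reports for the corresponding model.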

## PyTorch VGGNet
   - All VGGNet variants available in PyTorch: [pytorch.org/vision/main/models/vgg.html](https://pytorch.org/vision/main/models/vgg.html)

### VGG11

In [None]:
vgg11_2 = vgg11()
vgg11_2

In [None]:
summary(vgg11_2, (1, 3, 224, 224), device='cpu')

### VGG13

In [None]:
vgg13_2 = vgg13()
vgg13_2

In [None]:
summary(vgg13_2, (1, 3, 224, 224), device='cpu')

### VGG16

In [None]:
vgg16_2 = vgg16()
vgg16_2

In [None]:
summary(vgg16_2, (1, 3, 224, 224), device='cpu')

### VGG19

In [None]:
vgg19_2 = vgg19()
vgg19_2

In [None]:
summary(vgg19_2, (1, 3, 224, 224), device='cpu')