# 📖 TABLE OF CONTENTS

- [1. VGG16 Model: Intro]()
- [2. VGG16 Architecture]()
- [3. VGG16 from scratch using Keras]()
- [4. VGG16 from scratch using TensorFlow]()
- [5. VGG16 from scratch using PyTorch]()
- [6. VGG16 Transfer Learning]()

![rainbow](https://github.com/ancilcleetus/My-Learning-Journey/assets/25684256/839c3524-2a1d-4779-85a0-83c562e1e5e5)

# 1. VGG16 Model: Intro

VGG16 is a CNN (Convolutional Neural Network) architecture proposed by the Visual Geometry Group (VGG) at the University of Oxford. The VGG team's entry to the ILSVRC (ImageNet) 2014 competition took first place in the localization task and second place in the classification task. It is still considered one of the best-known vision model architectures. The pre-trained version of the VGG16 network is trained on over 1 million images from the ImageNet visual database and classifies images into 1,000 different categories with 92.7% top-5 test accuracy.

**Note**

Top-5 accuracy is the percentage of test images for which the correct label is among the model's five highest-scoring predictions, out of the 1,000 labels of the ImageNet challenge dataset.
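
To make the definition concrete, here is a minimal sketch of a top-k accuracy computation in plain Python. The function name `top_k_accuracy` and the toy scores below are illustrative assumptions, not part of any library or of the ImageNet evaluation code. The toy data has only 5 classes, so it uses $k = 1$ and $k = 2$ rather than $k = 5$.

```python
def top_k_accuracy(scores, labels, k=5):
    """Fraction of samples whose true label is among the k highest-scoring classes.

    scores: list of per-sample score lists (one score per class)
    labels: list of true class indices, one per sample
    """
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the top k
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Toy example (hypothetical data): 3 samples, 5 classes
scores = [
    [0.10, 0.50, 0.05, 0.20, 0.15],  # true label 3 is the 2nd-highest score
    [0.40, 0.30, 0.10, 0.15, 0.05],  # true label 2 is only the 4th-highest score
    [0.05, 0.10, 0.60, 0.05, 0.20],  # true label 2 is the highest score
]
labels = [3, 2, 2]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
print(f"top-1: {top1:.3f}, top-2: {top2:.3f}")
```

Only the third sample is correct at $k = 1$, while the first sample also counts as a hit at $k = 2$. This is why top-5 accuracy is always at least as high as top-1 accuracy.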

![rainbow](https://github.com/ancilcleetus/My-Learning-Journey/assets/25684256/839c3524-2a1d-4779-85a0-83c562e1e5e5)

# 2. VGG16 Architecture

![VGG16 Layers](data/images/CV_04_VGG16_Model-01-VGG16-Layers.jpg)

In the above figure, we have

- Convolution Layers
    - Conv 1 has 2 layers with 64 filters each
    - Conv 2 has 2 layers with 128 filters each
    - Conv 3 has 3 layers with 256 filters each
    - Conv 4 has 3 layers with 512 filters each
    - Conv 5 has 3 layers with 512 filters each
    - Total $2 + 2 + 3 + 3 + 3 = 13$ Convolution layers $\implies$ Responsible for feature extraction
    - All Convolution layers use $3 \times 3$ filters with stride $1$ and `padding='same'`, so the resolution does not change before the Pooling operation
    - Each Conv block is followed by a Maxpool layer of $2 \times 2$ with stride $2$, which halves the height and width
- FC (Fully Connected) or Dense layers
    - $3$ Fully Connected layers $\implies$ Responsible for classification
    - FC 1 has 4096 units (neurons)
    - FC 2 has 4096 units
    - FC 3 has 1000 units (one per class, since the ImageNet dataset has 1000 classes)
- Total $13 + 3 = 16$ layers with trainable weights $\implies$ hence the name VGG16
- Final layer $\implies$ Softmax applied to the output of FC 3, giving one probability per class
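
The layer list above can be checked with a little arithmetic. The sketch below traces the output shape and parameter count of every layer in plain Python. It assumes a $224 \times 224 \times 3$ input (the standard ImageNet input size for VGG16, not stated in the list above) and a bias term on every Conv and FC layer. The names `CONV_BLOCKS`, `FC_UNITS`, and `trace_vgg16` are hypothetical helpers for illustration only.

```python
# (block name, number of conv layers, filters per layer), as listed above
CONV_BLOCKS = [
    ("Conv 1", 2, 64),
    ("Conv 2", 2, 128),
    ("Conv 3", 3, 256),
    ("Conv 4", 3, 512),
    ("Conv 5", 3, 512),
]
FC_UNITS = [4096, 4096, 1000]


def trace_vgg16(height=224, width=224, channels=3):
    """Return (rows, total_params); each row is (name, output_shape, params)."""
    rows, total = [], 0
    for name, n_layers, filters in CONV_BLOCKS:
        for i in range(1, n_layers + 1):
            # 3x3 kernel over all input channels, plus one bias per filter
            params = 3 * 3 * channels * filters + filters
            channels = filters  # 'same' padding keeps height and width
            total += params
            rows.append((f"{name}.{i}", (height, width, channels), params))
        # 2x2 max pooling with stride 2 halves height and width, no weights
        height, width = height // 2, width // 2
        rows.append((f"{name} pool", (height, width, channels), 0))
    features = height * width * channels  # flatten before the first FC layer
    for i, units in enumerate(FC_UNITS, start=1):
        params = features * units + units  # weights plus biases
        features = units
        total += params
        rows.append((f"FC {i}", (units,), params))
    return rows, total


rows, total_params = trace_vgg16()
for name, shape, params in rows:
    print(f"{name:<12} {str(shape):<18} {params:>12,}")
print(f"Total trainable parameters: {total_params:,}")
```

Under these assumptions the feature map shrinks from $224 \times 224$ to $7 \times 7$ after the five pooling steps, and the total comes to 138,357,544 parameters. About 102.8 million of them sit in FC 1 alone, because it connects the flattened $7 \times 7 \times 512$ feature map to 4096 units.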

![VGG16 Architecture](data/images/CV_04_VGG16_Model-02-VGG16-Architecture.jpg)

![rainbow](https://github.com/ancilcleetus/My-Learning-Journey/assets/25684256/839c3524-2a1d-4779-85a0-83c562e1e5e5)

# 3. VGG16 from scratch using Keras

![rainbow](https://github.com/ancilcleetus/My-Learning-Journey/assets/25684256/839c3524-2a1d-4779-85a0-83c562e1e5e5)

# 4. VGG16 from scratch using TensorFlow

![rainbow](https://github.com/ancilcleetus/My-Learning-Journey/assets/25684256/839c3524-2a1d-4779-85a0-83c562e1e5e5)

# 5. VGG16 from scratch using PyTorch

![rainbow](https://github.com/ancilcleetus/My-Learning-Journey/assets/25684256/839c3524-2a1d-4779-85a0-83c562e1e5e5)

# 6. VGG16 Transfer Learning

![rainbow](https://github.com/ancilcleetus/My-Learning-Journey/assets/25684256/839c3524-2a1d-4779-85a0-83c562e1e5e5)