<a href="https://colab.research.google.com/github/rahiakela/computer-vision-research-and-practice/blob/main/deep-learning-patterns-and-practices/3-convolutional-and-residual-neural-networks/2_vgg_networks.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

##VGG networks

While AlexNet (and its corresponding ConvNet design pattern) is considered the
granddaddy of convolutional networks, VGGNet (and its corresponding VGG
design pattern) is considered the father of formalizing a design pattern based on
groups of convolutions. Like its predecessor AlexNet, it treats the convolutional
layers as a frontend and retains a large DNN backend for the classification
task. The fundamental principles behind the VGG design pattern are as follows:
- Grouping multiple convolutions into blocks, with the same number of filters
- Progressively doubling the number of filters across blocks
- Delaying pooling to the end of a block

The design follows a handful of principles that are easy to learn. The convolutional
frontend consists of a sequence of pairs (and later triples) of 3x3 convolutions with
the same number of filters, followed by a max pooling layer. Each max pooling layer
halves the height and width of the feature maps, which reduces their size by 75%, and
the next pair (or triple) of convolutional layers then doubles the number of learned
filters (up to 512, after which the count stays fixed). The principle behind this
design is that the early layers learn coarse features and subsequent layers, with
more filters, learn finer and finer features. Max pooling between the blocks limits
the growth in size of the feature maps (and, in turn, the number of parameters to learn).
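
The effect of pooling and filter doubling can be traced with a few lines of plain Python. This is a minimal sketch rather than part of the model code: the helper name `trace_feature_maps` is made up for illustration, and it assumes a 224x224 input, "same" padding (so the convolutions keep the spatial size), and 2x2 max pooling with stride 2, as in the VGG16 code in this notebook.

```python
# VGG16 block configuration: (number of conv layers, number of filters).
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]


def trace_feature_maps(blocks, size=224):
    """Return (block, filters, size_in, size_out, reduction) for each block."""
    rows = []
    for index, (_, n_filters) in enumerate(blocks, start=1):
        pooled = size // 2  # 2x2 max pooling with stride 2 halves each side
        reduction = 1 - (pooled * pooled) / (size * size)
        rows.append((index, n_filters, size, pooled, reduction))
        size = pooled
    return rows


for index, n_filters, size_in, size_out, reduction in trace_feature_maps(VGG16_BLOCKS):
    print(f"block {index}: {n_filters:3d} filters, "
          f"{size_in}x{size_in} -> {size_out}x{size_out} "
          f"({reduction:.0%} fewer positions per feature map)")
```

Each block halves both sides of the feature map, so the 224x224 input ends up as 7x7 after the fifth block, while the filter count grows from 64 to 512.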

Finally, the DNN backend consists of two identically-sized dense
hidden layers of 4096 nodes each, and a final dense output layer of 1000 nodes for classification.
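
To get a feel for how large this backend is, the number of weights in each dense layer can be worked out by hand. The sketch below assumes a 224x224 input, so that the frontend hands a flattened 7x7x512 tensor to the first dense layer; `dense_params` is a hypothetical helper name used only for this illustration.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units


# Flattened output of the last pooling layer (7 x 7 x 512), then the three dense layers.
sizes = [7 * 7 * 512, 4096, 4096, 1000]
backend = [dense_params(n_in, n_out) for n_in, n_out in zip(sizes, sizes[1:])]

print(backend)       # [102764544, 16781312, 4097000]
print(sum(backend))  # 123642856
```

Under these assumptions the backend alone holds roughly 124 million parameters, most of them in the first dense layer.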

<img src='images/2.png?raw=1' width='800'/>

VGG networks have been used frequently in transfer learning: others have kept the
convolutional frontend of an ImageNet-pretrained VGG16 or VGG19, along with its
weights, and attached a new DNN backend that is retrained for new classes of images.
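
One way this could look in Keras is sketched below, using the prebuilt `VGG16` from `tensorflow.keras.applications` with `include_top=False` to drop the original backend. The function name `build_transfer_model`, the 256-node hidden layer, and the 10-class output are assumptions chosen for illustration, not values from the text. Passing `weights="imagenet"` loads the pretrained weights (this triggers a download); the default here is `None` so the sketch runs offline.

```python
from tensorflow.keras import Model
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Flatten


def build_transfer_model(n_classes, weights=None):
    """Frozen VGG16 convolutional frontend with a new, small DNN backend."""
    # include_top=False keeps only the convolutional frontend.
    base = VGG16(include_top=False, weights=weights, input_shape=(224, 224, 3))
    base.trainable = False  # keep the frontend weights fixed during retraining

    x = Flatten()(base.output)
    x = Dense(256, activation="relu")(x)  # hypothetical backend size
    outputs = Dense(n_classes, activation="softmax")(x)
    return Model(base.input, outputs)


# Use weights="imagenet" to start from the pretrained frontend.
transfer_model = build_transfer_model(n_classes=10)
```

Only the two new dense layers are trainable in this sketch, which is what makes retraining on a small dataset feasible.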

<img src='images/3.png?raw=1' width='800'/>

So, let’s go ahead and code a VGG16 in two coding styles: the first as a sequential flow,
and the second procedurally, using a reusable function to duplicate the common blocks
of layers, with parameters for their specific settings.

##Setup

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten
```

##VGG using sequential flow style

```python
model = Sequential()

# First convolutional block
model.add(Conv2D(64, kernel_size=(3, 3), strides=(1, 1), padding="same", activation="relu", input_shape=(224, 224, 3)))
model.add(Conv2D(64, kernel_size=(3, 3), strides=(1, 1), padding="same", activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

# Second convolutional block: double the number of filters
model.add(Conv2D(128, kernel_size=(3, 3), strides=(1, 1), padding="same", activation="relu"))
model.add(Conv2D(128, kernel_size=(3, 3), strides=(1, 1), padding="same", activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

# Third convolutional block: double the number of filters, three convolutions
model.add(Conv2D(256, kernel_size=(3, 3), strides=(1, 1), padding="same", activation="relu"))
model.add(Conv2D(256, kernel_size=(3, 3), strides=(1, 1), padding="same", activation="relu"))
model.add(Conv2D(256, kernel_size=(3, 3), strides=(1, 1), padding="same", activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

# Fourth convolutional block: double the number of filters
model.add(Conv2D(512, kernel_size=(3, 3), strides=(1, 1), padding="same", activation="relu"))
model.add(Conv2D(512, kernel_size=(3, 3), strides=(1, 1), padding="same", activation="relu"))
model.add(Conv2D(512, kernel_size=(3, 3), strides=(1, 1), padding="same", activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

# Fifth (final) convolutional block: the number of filters stays at 512
model.add(Conv2D(512, kernel_size=(3, 3), strides=(1, 1), padding="same", activation="relu"))
model.add(Conv2D(512, kernel_size=(3, 3), strides=(1, 1), padding="same", activation="relu"))
model.add(Conv2D(512, kernel_size=(3, 3), strides=(1, 1), padding="same", activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

# DNN backend
model.add(Flatten())
model.add(Dense(4096, activation="relu"))
model.add(Dense(4096, activation="relu"))

# Output layer for classification (1000 classes)
model.add(Dense(1000, activation="softmax"))

model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
model.summary()
```

```
Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_36 (Conv2D)           (None, 224, 224, 64)      1792      
_________________________________________________________________
conv2d_37 (Conv2D)           (None, 224, 224, 64)      36928     
_________________________________________________________________
max_pooling2d_15 (MaxPooling (None, 112, 112, 64)      0         
_________________________________________________________________
conv2d_38 (Conv2D)           (None, 112, 112, 128)     73856     
_________________________________________________________________
conv2d_39 (Conv2D)           (None, 112, 112, 128)     147584    
_________________________________________________________________
max_pooling2d_16 (MaxPooling (None, 56, 56, 128)       0         
_________________________________________________________________
conv2d_40 (Conv2D)           (None, 56, 56, 256)      
```
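
The `Param #` values in the summary can be reproduced by hand: each filter of a 3x3 convolution has one weight per kernel position and input channel, plus one bias. The sketch below checks the first four convolutional layers; `conv2d_params` is a hypothetical helper name used only for this illustration.

```python
def conv2d_params(kernel_size, in_channels, n_filters):
    """Weights plus biases of a square-kernel convolutional layer."""
    return (kernel_size * kernel_size * in_channels + 1) * n_filters


# (input channels, filters) for the first four convolutional layers of VGG16.
first_layers = [(3, 64), (64, 64), (64, 128), (128, 128)]
params = [conv2d_params(3, c_in, c_out) for c_in, c_out in first_layers]

print(params)  # [1792, 36928, 73856, 147584]
```

Max pooling layers have no learnable weights, which is why they show 0 in the summary.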

##VGG using procedural style

Let’s now code the same network in a procedural reuse
style. In this example, we create a procedure (function) conv_block(), which builds
the convolutional blocks and takes as parameters the number of layers in the block (1, 2, or 3 here) and the number of filters (64, 128, 256, or 512).

Note that we keep the first convolutional
layer outside conv_block because it needs the input_shape parameter; the rest of the
first block is then added with conv_block(1, 64). We
could have coded this as a flag to conv_block, but since it would occur only one time,
that’s not reuse.

```python
def conv_block(n_layers, n_filters):
  """
  Append a convolutional block to the module-level `model`.

  n_layers : number of convolutional layers
  n_filters: number of filters
  """
  for _ in range(n_layers):
    model.add(Conv2D(n_filters, kernel_size=(3, 3), strides=(1, 1), padding="same", activation="relu"))
  model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

model = Sequential()

# First convolutional layer is specified separately since it requires the input_shape parameter
model.add(Conv2D(64, kernel_size=(3, 3), strides=(1, 1), padding="same", activation="relu", input_shape=(224, 224, 3)))

# Remainder of first convolutional block
conv_block(1, 64)
# Second through fifth convolutional blocks
conv_block(2, 128)
conv_block(3, 256)
conv_block(3, 512)
conv_block(3, 512)

# DNN backend
model.add(Flatten())
model.add(Dense(4096, activation="relu"))
model.add(Dense(4096, activation="relu"))

# Output layer for classification (1000 classes)
model.add(Dense(1000, activation="softmax"))

model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
model.summary()
```

```
Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_23 (Conv2D)           (None, 224, 224, 64)      1792      
_________________________________________________________________
conv2d_24 (Conv2D)           (None, 224, 224, 64)      36928     
_________________________________________________________________
max_pooling2d_10 (MaxPooling (None, 112, 112, 64)      0         
_________________________________________________________________
conv2d_25 (Conv2D)           (None, 112, 112, 128)     73856     
_________________________________________________________________
conv2d_26 (Conv2D)           (None, 112, 112, 128)     147584    
_________________________________________________________________
max_pooling2d_11 (MaxPooling (None, 56, 56, 128)       0         
_________________________________________________________________
conv2d_27 (Conv2D)           (None, 56, 56, 256)      
```
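
The procedural style can be pushed one step further by driving the blocks from a configuration list. The following is a sketch under a few assumptions, not the notebook's own code: the names `add_conv_block`, `build_frontend`, `VGG16_BLOCKS`, and `VGG19_BLOCKS` are made up for illustration, the block function takes the model as an explicit argument instead of relying on a global, and an `Input` layer carries the input shape so the first convolution no longer needs special handling. Only the convolutional frontend is built here; the DNN backend can be appended as in the listings above.

```python
from tensorflow.keras import Input
from tensorflow.keras.layers import Conv2D, MaxPooling2D
from tensorflow.keras.models import Sequential

# (number of conv layers, number of filters) per block.
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]
VGG19_BLOCKS = [(2, 64), (2, 128), (4, 256), (4, 512), (4, 512)]


def add_conv_block(model, n_layers, n_filters):
    """Append n_layers 3x3 convolutions followed by one 2x2 max pooling."""
    for _ in range(n_layers):
        model.add(Conv2D(n_filters, kernel_size=(3, 3), strides=(1, 1),
                         padding="same", activation="relu"))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))


def build_frontend(blocks, input_shape=(224, 224, 3)):
    """Build the convolutional frontend described by a list of blocks."""
    model = Sequential()
    model.add(Input(shape=input_shape))  # replaces input_shape on the first Conv2D
    for n_layers, n_filters in blocks:
        add_conv_block(model, n_layers, n_filters)
    return model


vgg16_frontend = build_frontend(VGG16_BLOCKS)
vgg19_frontend = build_frontend(VGG19_BLOCKS)
print(vgg16_frontend.output_shape)  # (None, 7, 7, 512)
```

With this layout, switching between VGG16 and VGG19 only means swapping the configuration list: 13 versus 16 convolutional layers, which together with the three dense layers give the networks their names.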