# Custom Model 

We can build models in multiple ways. Initially we used the `Sequential` model, which lets us stack layers on top of one another; it is simple to use and easy for beginners to understand.

But as our needs became more complicated, we wanted to connect layers in ways that the `Sequential` model can't express, so we moved to the `Functional API`, which allows models to have multiple inputs or multiple outputs.

The `Functional API` is very flexible, but we want its components to be as simple as possible. To do that we can extend the `Model` class, which lets us define reusable models that serve as building blocks for more complex models.

This is done as follows:

```python
from tensorflow.keras import Model
from tensorflow.keras.layers import Dense


class CustomModel(Model):
  def __init__(self, customization_args):
    super(CustomModel, self).__init__()
    # define the layers here
    # for example:
    self.dense_layer = Dense(customization_args['units'])

  def call(self, inputs, training=None, mask=None):
    # define the forward pass here (connect the layers to the input)
    output = self.dense_layer(inputs)

    return output
```

You need to implement at least 2 methods:

1. `__init__`: This method is called when the model is created. It is used to define the layers of the model.
2. `call`: This method is called whenever the model is invoked on inputs (during training, evaluation, and prediction). It is used to define the forward pass of the model.
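
The `training` argument in the `call` signature lets a model behave differently during training and inference. The sketch below is only an illustration: `DropoutModel` is a hypothetical name, and the `Dropout` layer and its rate of `0.5` are assumptions chosen because dropout makes the difference easy to see.

```python
import tensorflow as tf


class DropoutModel(tf.keras.Model):
  """Hypothetical model used only to show the effect of `training`."""

  def __init__(self):
    super().__init__()
    self.dropout = tf.keras.layers.Dropout(0.5)

  def call(self, inputs, training=None):
    # Forward the flag so the layer knows which mode it is in
    return self.dropout(inputs, training=training)


model = DropoutModel()
x = tf.ones((1, 8))

# Inference mode: dropout is disabled and the input passes through unchanged
inference_output = model(x, training=False)

# Training mode: each value is either dropped (0) or scaled by 1 / (1 - rate) = 2
training_output = model(x, training=True)

print(inference_output.numpy())
print(training_output.numpy())
```

When the model is trained with `fit`, Keras sets this flag for you; passing it by hand is mostly useful when calling the model directly.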

The advantage of subclassing `Model` is that you can use the functionality Keras already provides, such as fitting, evaluating, saving, and loading models.


In [1]:
import tensorflow as tf

class MyModel(tf.keras.Model):
  def __init__(self):
    super(MyModel, self).__init__()
    self.dense1 = tf.keras.layers.Dense(10, activation='relu')
    self.dense2 = tf.keras.layers.Dense(1, activation='sigmoid')

  def call(self, inputs):
    x = self.dense1(inputs)
    x = self.dense2(x)
    return x
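
As a rough sketch of that built-in functionality, the snippet below trains, evaluates, saves, and reloads the weights of the same two-layer model. The synthetic data, the number of epochs, and the `my_model.weights.h5` file name are assumptions for illustration only; the class is repeated so the snippet runs on its own.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


class MyModel(tf.keras.Model):
  """Same two-layer model as above, repeated so the snippet is self-contained."""

  def __init__(self):
    super().__init__()
    self.dense1 = tf.keras.layers.Dense(10, activation='relu')
    self.dense2 = tf.keras.layers.Dense(1, activation='sigmoid')

  def call(self, inputs):
    return self.dense2(self.dense1(inputs))


# Synthetic data (assumption): 64 samples, 4 features, binary labels
x = np.random.rand(64, 4).astype('float32')
y = (x.sum(axis=1, keepdims=True) > 2.0).astype('float32')

model = MyModel()
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
history = model.fit(x, y, epochs=2, batch_size=16, verbose=0)
loss, accuracy = model.evaluate(x, y, verbose=0)

# Save the weights, then restore them into a fresh instance
weights_path = os.path.join(tempfile.mkdtemp(), 'my_model.weights.h5')
model.save_weights(weights_path)

restored = MyModel()
restored(x[:1])  # call once so the layers create their variables
restored.load_weights(weights_path)

same_predictions = np.allclose(
    model.predict(x, verbose=0), restored.predict(x, verbose=0))
print(same_predictions)
```

A subclassed model creates its variables on the first call, which is why the fresh instance is called once before its weights are loaded.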

Now we will define some custom models, and demonstrate how to use them.


# Identity Block

ResNet is built on the idea of identity blocks; you can read more about them in the [ResNet paper](https://arxiv.org/pdf/1512.03385.pdf).

The identity block consists of two convolutional layers, each followed by a batch normalization layer. A ReLU activation follows the first pair, and the block's input is added back to the output of the second pair before a final ReLU activation.


In [2]:
from tensorflow.keras import Model

class IdentityBlock(Model):
  def __init__(self, filters, kernel_size, **kwargs):
    super(IdentityBlock, self).__init__(**kwargs)  # forward extra arguments such as `name`
    self.conv1 = tf.keras.layers.Conv2D(filters, kernel_size, padding='same')
    self.bn1 = tf.keras.layers.BatchNormalization()
    
    self.conv2 = tf.keras.layers.Conv2D(filters, kernel_size, padding='same')
    self.bn2 = tf.keras.layers.BatchNormalization()

    self.identityAdd = tf.keras.layers.Add()

    # Activation layers are stateless (much like Lambda layers), so a single instance can be reused
    self.act = tf.keras.layers.Activation('relu')

  def call(self, inputs):
    x = self.conv1(inputs)
    x = self.bn1(x)
    x = self.act(x)

    x = self.conv2(x)
    x = self.bn2(x)

    x = self.identityAdd([x, inputs])
    x = self.act(x)
    return x
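
The `Add` layer in the block only works when the convolution output has the same shape as the block's input, so `filters` has to match the number of input channels. The sketch below illustrates this with plain layers; the tensor sizes are arbitrary assumptions.

```python
import tensorflow as tf

x = tf.random.normal((2, 8, 8, 16))  # batch of 2, 8x8 feature maps, 16 channels

# padding='same' keeps height and width; filters=16 keeps the channel count
same_channels = tf.keras.layers.Conv2D(16, 3, padding='same')(x)
skip_sum = tf.keras.layers.Add()([same_channels, x])
print(skip_sum.shape)  # (2, 8, 8, 16)

# With a different filter count the two tensors no longer line up
more_channels = tf.keras.layers.Conv2D(32, 3, padding='same')(x)
try:
  tf.keras.layers.Add()([more_channels, x])
  add_failed = False
except (ValueError, tf.errors.InvalidArgumentError):
  add_failed = True
print(add_failed)
```

This is why the ResNet model in the next section uses `64` filters both in its first convolution and in its identity blocks.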

# ResNet Model

The model is as follows:

1. Convolutional layer: `64` filters, `7x7`
2. Batch Normalization
3. Activation: ReLU
4. MaxPooling: 2D `3x3`
5. 2 Identity Blocks: `64` filters, `3x3`
6. Global Average Pooling
7. Dense: Classifier layer, `num_classes` units

In [None]:
class ResNet(Model):
  def __init__(self, num_classes=10):
    super(ResNet, self).__init__()
    self.conv1 = tf.keras.layers.Conv2D(64, 7, padding='same')
    self.bn1 = tf.keras.layers.BatchNormalization()
    self.act = tf.keras.layers.Activation('relu')
    self.maxpool = tf.keras.layers.MaxPool2D(3)

    self.identity1 = IdentityBlock(64, 3)
    self.identity2 = IdentityBlock(64, 3)

    self.avgpool = tf.keras.layers.GlobalAveragePooling2D()
    self.fc = tf.keras.layers.Dense(num_classes)

  def call(self, inputs):
    x = self.conv1(inputs)
    x = self.bn1(x)
    x = self.act(x)
    x = self.maxpool(x)

    x = self.identity1(x)
    x = self.identity2(x)

    x = self.avgpool(x)
    x = self.fc(x)
    return x
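
The final `Dense` layer in `ResNet` has no activation, so the model returns raw scores (logits) rather than probabilities. If you compile this model, one option is a loss created with `from_logits=True`. The sketch below uses made-up logits and labels to show that this matches applying softmax first and then using the default loss.

```python
import tensorflow as tf

# Made-up logits for 2 samples and 3 classes (assumption for illustration)
logits = tf.constant([[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]])
labels = tf.constant([0, 2])

loss_on_logits = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
loss_on_probs = tf.keras.losses.SparseCategoricalCrossentropy()

# The first loss applies softmax internally, the second expects probabilities
value_from_logits = float(loss_on_logits(labels, logits))
value_from_probs = float(loss_on_probs(labels, tf.nn.softmax(logits)))

print(value_from_logits, value_from_probs)
```

Using the default loss directly on logits would not raise an error, but the reported loss would not be a proper cross-entropy.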

# VGG-16 Model

Building VGG-16 follows the same principle as ResNet, but with a different architecture.

We first define a recurring block that contains a few convolutional layers followed by a max pooling layer.

The end of the model is simply a flatten layer followed by two dense layers.


In [None]:
class VggBlock(Model):
  def __init__(self, filters_count, kernel_size, strides, pool_size, cnn_count, prefix):
    super(VggBlock, self).__init__()
    
    self.filters_count = filters_count
    self.cnn_count = cnn_count

    self.cnns = []

    for i in range(cnn_count):
      cnn = tf.keras.layers.Conv2D(
        filters=filters_count,
        kernel_size=kernel_size, 
        padding='same',
        activation='relu',
        name=f'{prefix}_cnn_{i}'
      )
      self.cnns.append(cnn)

    # `strides` is the stride of the pooling layer, not of the convolutions
    self.pool = tf.keras.layers.MaxPool2D(pool_size=pool_size, strides=strides)

  def call(self, inputs):
    x = inputs
    for cnn in self.cnns:
      # each Conv2D already applies ReLU through activation='relu'
      x = cnn(x)
    x = self.pool(x)
    return x

# VGG Model

It consists of the following blocks:

1. VggBlock: `2` cnns, `64` filters, `3x3`
2. VggBlock: `2` cnns, `128` filters, `3x3`
3. VggBlock: `3` cnns, `256` filters, `3x3`
4. VggBlock: `3` cnns, `512` filters, `3x3`
5. VggBlock: `3` cnns, `512` filters, `3x3`
6. Flatten layer
7. Dense layer: `256` units, `relu` activation
8. Dense layer: `num_classes` units, `softmax` activation


In [None]:
class VGG16(Model):
  def __init__(self, num_classes=10):
    super(VGG16, self).__init__()
    # pool_size=2 with strides=2 halves the height and width after every block
    self.block1 = VggBlock(filters_count=64, kernel_size=3, strides=2, pool_size=2, cnn_count=2, prefix='block1')
    self.block2 = VggBlock(128, 3, 2, 2, 2, 'block2')
    self.block3 = VggBlock(256, 3, 2, 2, 3, 'block3')
    self.block4 = VggBlock(512, 3, 2, 2, 3, 'block4')
    self.block5 = VggBlock(512, 3, 2, 2, 3, 'block5')

    self.flatten = tf.keras.layers.Flatten()
    self.fc = tf.keras.layers.Dense(256, activation='relu')
    self.classifier = tf.keras.layers.Dense(num_classes, activation='softmax')

  def call(self, inputs):
    x = self.block1(inputs)
    x = self.block2(x)
    x = self.block3(x)
    x = self.block4(x)
    x = self.block5(x)

    x = self.flatten(x)
    x = self.fc(x)
    x = self.classifier(x)
    return x
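
The `strides` value passed to each `VggBlock` decides how quickly the feature maps shrink, and with it how many values reach the `Flatten` layer. The sketch below works this out in plain Python for a `224x224` input, assuming `'same'` convolutions (which keep the size) and `MaxPool2D` with its default `'valid'` padding. The helper names are hypothetical.

```python
def pooled_size(size, pool_size, strides):
  """Output length of a 'valid' max pooling along one spatial axis."""
  return (size - pool_size) // strides + 1


def flatten_features(input_size, pool_size, strides, n_blocks=5, last_filters=512):
  """Number of values reaching Flatten after `n_blocks` pooling steps."""
  size = input_size
  for _ in range(n_blocks):
    size = pooled_size(size, pool_size, strides)
  return size * size * last_filters


print(flatten_features(224, pool_size=2, strides=2))  # 25088
print(flatten_features(224, pool_size=2, strides=1))  # 24556032
```

With a pooling stride of `1` the feature maps barely shrink, so the first dense layer would need billions of weights. A stride of `2` halves the height and width in each block.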

# Danger

**NOTE:** Only run this in Colab with a GPU runtime, because:

1. It downloads a large dataset (the Cats vs Dogs dataset)
2. Training takes a long time


In [None]:
import tensorflow_datasets as tfds

dataset = tfds.load('cats_vs_dogs', split=tfds.Split.TRAIN, data_dir='data/')

# Initialize VGG with the number of classes 
vgg = VGG16(num_classes=2)

# Compile with losses and metrics
vgg.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Define preprocessing function
def preprocess(features):
    # Resize and normalize
    image = tf.image.resize(features['image'], (224, 224))
    return tf.cast(image, tf.float32) / 255., features['label']

# Apply transformations to dataset
dataset = dataset.map(preprocess).batch(32)

# Train the custom VGG model
vgg.fit(dataset, epochs=10)
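
Before downloading the full dataset, it can be worth checking the preprocessing on a few fake samples. The sketch below repeats the `preprocess` function from the cell above and applies it to synthetic images. The image size, the labels, and the `shuffle` and `prefetch` steps are assumptions for illustration, not part of the original pipeline.

```python
import tensorflow as tf


def preprocess(features):
  # Resize and normalize, as in the training cell
  image = tf.image.resize(features['image'], (224, 224))
  return tf.cast(image, tf.float32) / 255., features['label']


# Four fake 300x200 RGB images with the same keys as the real dataset
fake_images = tf.cast(
    tf.random.uniform((4, 300, 200, 3), maxval=256, dtype=tf.int32), tf.uint8)
fake_dataset = tf.data.Dataset.from_tensor_slices({
    'image': fake_images,
    'label': tf.constant([0, 1, 1, 0], dtype=tf.int64),
})

pipeline = (
    fake_dataset
    .map(preprocess, num_parallel_calls=tf.data.AUTOTUNE)
    .shuffle(4)                  # shuffle individual samples before batching
    .batch(2)
    .prefetch(tf.data.AUTOTUNE)  # prepare the next batch while the model trains
)

images, labels = next(iter(pipeline))
print(images.shape, images.dtype, labels.shape)
```

If the shapes and value range look right here, the same chain of `map`, `shuffle`, `batch`, and `prefetch` could be applied to the real dataset.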