### Subclassing
In ``Keras`` there are basically three-way we can define a neural network, namely
1. Sequential API
2. Functional API
3. Model Subclassing API
This Notebook will walk through on all these methods in creating one model of the following `achitecture`.

```python
           Input(input_shape=(32, 32, 3)
              |
           Conv2D(64, 3)
              |
           Conv2D(128, 3)
              |
           Conv2D(64, 3)
              |
           MaxPool2D()
              |
           Conv2D (64, 3)
              |
           Conv2D (128, 3)
              |
           Conv2D (64, 3)
              |
           MaxPool2D(2, 2)
              |
           Flatten ()
              |
            Dense (16)
              |
            Dense (output for 10 classes)
```

In [1]:
import tensorflow as tf
import tensorflow.keras as keras

> **Sequential Model**

In [None]:
seq_model = keras.Sequential([
    keras.layers.Input(shape=(32, 32, 3), name="input_layer"),
    keras.layers.Conv2D(64, 3, activation="relu", padding="same", name="conv_1"),
    keras.layers.Conv2D(128, 3, activation="relu", name="conv_2"),
    keras.layers.Conv2D(64, 3, activation="relu", name="conv_3"),
    keras.layers.MaxPool2D(pool_size=(2, 2), name="pool_2d_1"),
    keras.layers.Conv2D(64, 3, activation="relu", name="conv_4"),
    keras.layers.Conv2D(128, 3, activation="relu", name="conv_5"),
    keras.layers.Conv2D(64, 3, activation="relu", name="conv_6"),
    keras.layers.MaxPool2D(pool_size=(2, 2), name="pool_2d_2"),
    keras.layers.Flatten(name="flatten_layer"),
    keras.layers.Dense(64, name="dense_1", activation='relu'),
    keras.layers.Dense(10, name="output_layer", activation='softmax')
], name="seq_model")
seq_model.summary()

Model: "seq_model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv_1 (Conv2D)              (None, 32, 32, 64)        1792      
_________________________________________________________________
conv_2 (Conv2D)              (None, 30, 30, 128)       73856     
_________________________________________________________________
conv_3 (Conv2D)              (None, 28, 28, 64)        73792     
_________________________________________________________________
pool_2d_1 (MaxPooling2D)     (None, 14, 14, 64)        0         
_________________________________________________________________
conv_4 (Conv2D)              (None, 12, 12, 64)        36928     
_________________________________________________________________
conv_5 (Conv2D)              (None, 10, 10, 128)       73856     
_________________________________________________________________
conv_6 (Conv2D)              (None, 8, 8, 64)          73

> The **Functional Model**

In [None]:
input_layer = keras.layers.Input(shape=(32, 32, 3), name="input_layer")
conv_1 = keras.layers.Conv2D(64, 3, activation="relu", padding="same", name="conv_1")(input_layer)
conv_2 = keras.layers.Conv2D(128, 3, activation="relu", name="conv_2")(conv_1)
conv_3 = keras.layers.Conv2D(64, 3, activation="relu", name="conv_3")(conv_2)
pool_2d_1 =  keras.layers.MaxPool2D(pool_size=(2, 2), name="pool_2d_1")(conv_3)
conv_4 = keras.layers.Conv2D(64, 3, activation="relu", name="conv_4")(pool_2d_1)
conv_5 = keras.layers.Conv2D(128, 3, activation="relu", name="conv_5")(conv_4)
conv_6 = keras.layers.Conv2D(64, 3, activation="relu", name="conv_6")(conv_5)
pool_2d_2 = keras.layers.MaxPool2D(pool_size=(2, 2), name="pool_2d_2")(conv_6)
flatten_layer = keras.layers.Flatten(name="flatten_layer")(pool_2d_2)
dense_1 = keras.layers.Dense(64, name="dense_1", activation='relu')(flatten_layer)
output_layer = keras.layers.Dense(10, name="output_layer", activation='softmax')(dense_1)
fn_model = keras.Model(inputs=input_layer, outputs=output_layer, name="fn_model")
fn_model.summary()

Model: "fn_model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_layer (InputLayer)     [(None, 32, 32, 3)]       0         
_________________________________________________________________
conv_1 (Conv2D)              (None, 32, 32, 64)        1792      
_________________________________________________________________
conv_2 (Conv2D)              (None, 30, 30, 128)       73856     
_________________________________________________________________
conv_3 (Conv2D)              (None, 28, 28, 64)        73792     
_________________________________________________________________
pool_2d_1 (MaxPooling2D)     (None, 14, 14, 64)        0         
_________________________________________________________________
conv_4 (Conv2D)              (None, 12, 12, 64)        36928     
_________________________________________________________________
conv_5 (Conv2D)              (None, 10, 10, 128)       738

### The `Model Sub-Classing API`
In `Model Sub-Classing` there are two most important functions **`__init__`** and call. Basically, we will define all the **`tf.keras layers`** or custom implemented layers inside the **`__init__`** method and **`call`** those layers based on our network design inside the call method which is used to perform a forward propagation. The **`call`** method quite the same as the **`forward`** method that is used to build the model in PyTorch anyway.

> The `Model` class contains the layers in it.

In [None]:
class Model(keras.Model):
    def __init__(self):
        super(Model, self).__init__()
        self.conv_1 = keras.layers.Conv2D(64, 3, activation="relu", padding="same", name="conv_1")
        self.conv_2 =keras.layers.Conv2D(128, 3, activation="relu", name="conv_2")
        self.conv_3 = keras.layers.Conv2D(64, 3, activation="relu", name="conv_3")
        self.pool = keras.layers.MaxPool2D(pool_size=(2, 2), name="pool_2d_1")
        self.conv_4 = keras.layers.Conv2D(64, 3, activation="relu", name="conv_4")
        self.conv_5 = keras.layers.Conv2D(128, 3, activation="relu", name="conv_5")
        self.conv_6  = keras.layers.Conv2D(64, 3, activation="relu", name="conv_6")
        self.flatten = keras.layers.Flatten(name="flatten_layer")
        self.dense_1 = keras.layers.Dense(64, name="dense_1", activation='relu')
        self.output_layer = keras.layers.Dense(10, name="output_layer", activation='softmax')
        
    def call(self, input_tensor):
        # forward  pass the first block
        x = self.conv_1(input_tensor)
        x = self.conv_2(x)
        x = self.conv_3(x)
        # pool
        x = self.pool(x)
        # forward pass block 2
        x = self.conv_4(x)
        x = self.conv_5(x)
        x = self.conv_6(x)
        # pool block2 
        x = self.pool(x)
        # flatten the conv_6 layer
        x = self.flatten(x)
        # forward pass the output blocks
        x = self.dense_1(x)
        x = self.output_layer(x)
        return x
    
sub_class_model = Model()
sub_class_model.build((None, 32, 32, 3))
sub_class_model.summary()

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv_1 (Conv2D)              multiple                  1792      
_________________________________________________________________
conv_2 (Conv2D)              multiple                  73856     
_________________________________________________________________
conv_3 (Conv2D)              multiple                  73792     
_________________________________________________________________
pool_2d_1 (MaxPooling2D)     multiple                  0         
_________________________________________________________________
conv_4 (Conv2D)              multiple                  36928     
_________________________________________________________________
conv_5 (Conv2D)              multiple                  73856     
_________________________________________________________________
conv_6 (Conv2D)              multiple                  73792 

> **We still getting the same `Total params` and `Trainable params` just like the from the previous models.** But the problem is we are not able to see the output shape. Which is the problem that we are going to solve later on.

#### Trainning our `model` on the `cifar10` dataset.

In [None]:
from tensorflow.keras import datasets

In [None]:
(X_train, y_train), (X_test, y_test) = datasets.cifar10.load_data()
X_test_tensors = tf.convert_to_tensor(X_test/255., dtype=tf.float32)
X_train_tensors = tf.convert_to_tensor(X_train/255., dtype=tf.float32)
y_test_tensors = tf.one_hot(tf.squeeze(y_test), depth=10)
y_train_tensors = tf.one_hot(tf.squeeze(y_train), depth=10)

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz


In [None]:
seq_model.compile(
    loss = "categorical_crossentropy",
    metrics=['acc'],
    optimizer = 'adam'
)
fn_model.compile(
    loss = "categorical_crossentropy",
    metrics=['acc'],
    optimizer = 'adam'
)
sub_class_model.compile(
    loss = "categorical_crossentropy",
    metrics=['acc'],
    optimizer = 'adam'
)

In [None]:
seq_model.fit(X_train_tensors, y_train_tensors, batch_size=64, epochs=3, verbose=1,
                  validation_data=(X_test_tensors, y_test_tensors)
                  )

Epoch 1/3
Epoch 2/3
Epoch 3/3


<tensorflow.python.keras.callbacks.History at 0x7fe482020dd0>

In [None]:
fn_model.fit(X_train_tensors, y_train_tensors, batch_size=64, epochs=3, verbose=1,
                  validation_data=(X_test_tensors, y_test_tensors)
                  )

Epoch 1/3
Epoch 2/3
Epoch 3/3


<tensorflow.python.keras.callbacks.History at 0x7fe470127450>

In [None]:
sub_class_model.fit(X_train_tensors, y_train_tensors, batch_size=64, epochs=3, verbose=1,
                  validation_data=(X_test_tensors, y_test_tensors)
                  )

Epoch 1/3
Epoch 2/3
Epoch 3/3


<tensorflow.python.keras.callbacks.History at 0x7fe45cdf5490>

So bassically the models are the same. The only different is how to create them using different api's.

### Getting the **Output shapes** ``using the subclass api``
At this point the output shape of the `sub_class_model` when we call the ``model.summary()`` there is only **`multiple`**. We want to sey if we call the summary we should also get the output shapes.

In [None]:
sub_class_model.summary()

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv_1 (Conv2D)              multiple                  1792      
_________________________________________________________________
conv_2 (Conv2D)              multiple                  73856     
_________________________________________________________________
conv_3 (Conv2D)              multiple                  73792     
_________________________________________________________________
pool_2d_1 (MaxPooling2D)     multiple                  0         
_________________________________________________________________
conv_4 (Conv2D)              multiple                  36928     
_________________________________________________________________
conv_5 (Conv2D)              multiple                  73856     
_________________________________________________________________
conv_6 (Conv2D)              multiple                  73792 

In [None]:
class Model(keras.Model):
    def __init__(self):
        super(Model, self).__init__()
        self.conv_1 = keras.layers.Conv2D(64, 3, activation="relu", padding="same", name="conv_1")
        self.conv_2 =keras.layers.Conv2D(128, 3, activation="relu", name="conv_2")
        self.conv_3 = keras.layers.Conv2D(64, 3, activation="relu", name="conv_3")
        self.pool = keras.layers.MaxPool2D(pool_size=(2, 2), name="pool_2d_1")
        self.conv_4 = keras.layers.Conv2D(64, 3, activation="relu", name="conv_4")
        self.conv_5 = keras.layers.Conv2D(128, 3, activation="relu", name="conv_5")
        self.conv_6  = keras.layers.Conv2D(64, 3, activation="relu", name="conv_6")
        self.flatten = keras.layers.Flatten(name="flatten_layer")
        self.dense_1 = keras.layers.Dense(64, name="dense_1", activation='relu')
        self.output_layer = keras.layers.Dense(10, name="output_layer", activation='softmax')
        
    def call(self, input_tensor):
        # forward  pass the first block
        x = self.conv_1(input_tensor)
        x = self.conv_2(x)
        x = self.conv_3(x)
        # pool
        x = self.pool(x)
        # forward pass block 2
        x = self.conv_4(x)
        x = self.conv_5(x)
        x = self.conv_6(x)
        # pool block2 
        x = self.pool(x)
        # flatten the conv_6 layer
        x = self.flatten(x)
        # forward pass the output blocks
        x = self.dense_1(x)
        x = self.output_layer(x)
        return x

    def model(self):
        x = Input(shape=( None, 32, 32, 3))
        return Model(inputs=[x], outputs =self.call(x))

sub_class_model_01 = Model()
sub_class_model_01.build((None, 32, 32, 3))
sub_class_model_01.summary()


Model: "model_4"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv_1 (Conv2D)              multiple                  1792      
_________________________________________________________________
conv_2 (Conv2D)              multiple                  73856     
_________________________________________________________________
conv_3 (Conv2D)              multiple                  73792     
_________________________________________________________________
pool_2d_1 (MaxPooling2D)     multiple                  0         
_________________________________________________________________
conv_4 (Conv2D)              multiple                  36928     
_________________________________________________________________
conv_5 (Conv2D)              multiple                  73856     
_________________________________________________________________
conv_6 (Conv2D)              multiple                  7379

> The following code was found on the blog post at [towardsdatascience.com](https://towardsdatascience.com/model-sub-classing-and-custom-training-loop-from-scratch-in-tensorflow-2-cc1d4f10fb4e)

In [None]:
class ModelSubClassing(tf.keras.Model):
    def __init__(self, num_classes):
        super(ModelSubClassing, self).__init__()
        # define all layers in init
        # Layer of Block 1
        self.conv1 = tf.keras.layers.Conv2D(32, 3, strides=2, activation="relu")
        self.max1  = tf.keras.layers.MaxPooling2D(3)
        self.bn1   = tf.keras.layers.BatchNormalization()

        # Layer of Block 2
        self.conv2 = tf.keras.layers.Conv2D(64, 3, activation="relu")
        self.bn2   = tf.keras.layers.BatchNormalization()
        self.drop  = tf.keras.layers.Dropout(0.3)

        # GAP, followed by Classifier
        self.gap   = tf.keras.layers.GlobalAveragePooling2D()
        self.dense = tf.keras.layers.Dense(num_classes)

    def call(self, input_tensor, training=False):
        # forward pass: block 1 
        x = self.conv1(input_tensor)
        x = self.max1(x)
        x = self.bn1(x)

        # forward pass: block 2 
        x = self.conv2(x)
        x = self.bn2(x)

        # droput followed by gap and classifier
        x = self.drop(x)
        x = self.gap(x)
        return self.dense(x)
model = ModelSubClassing(10)
model.build((None, 32, 32, 3))
model.summary()

Model: "model_sub_classing"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              multiple                  896       
_________________________________________________________________
max_pooling2d (MaxPooling2D) multiple                  0         
_________________________________________________________________
batch_normalization (BatchNo multiple                  128       
_________________________________________________________________
conv2d_1 (Conv2D)            multiple                  18496     
_________________________________________________________________
batch_normalization_1 (Batch multiple                  256       
_________________________________________________________________
dropout (Dropout)            multiple                  0         
_________________________________________________________________
global_average_pooling2d (Gl multiple           

### What is cool about the `Subclassing` API?
The subclass API allows us to write our own custom blocks and then stack together. For example let's say we have the following achitecture we want to build.

```python

(Conv2D + BatchNorm) -> (Conv2D + BatchNorm) -> (Conv2D + BatchNorm) -> Flatten() -> [Dense(name="hidden") + Dense(name="output")]
```
We can say `(Conv2D + BatchNorm)` is a certain block and, 
``[Dense(name="hidden") + Dense(name="output")]`` is another block which means we can build these blocks and stack them using the `Sequantial` api as follows:

```python
model = keras.Sequential([
  (Conv2D + BatchNorm),
  (Conv2D + BatchNorm),
  (Conv2D + BatchNorm),
  [Dense(name="hidden") + Dense(name="output")]
])
```


In [9]:
# The convulutional Block
class ConvBlock(keras.layers.Layer):
  def __init__(self, in_features, kernel=3):
    super(ConvBlock, self).__init__()
    self.in_features = in_features
    self.kernel = kernel
    self.conv = keras.layers.Conv2D(self.in_features, self.kernel, 1, padding="same")
    self.bn = keras.layers.BatchNormalization()


    def call(self, input_tensor, training=False): # the BatchNorm behaves differently for train and test
      x = self.conv(input_tensor)
      x = self.bn(bn, training=training)
      return keras.activations.relu(x)



In [10]:
model = keras.Sequential([
    # Conv block
    ConvBlock(64),
    ConvBlock(128),
    ConvBlock(64),
    # output block
    keras.layers.Flatten(),
    keras.layers.Dense(64, activation=keras.activations.relu),
    keras.layers.Dense(10, activation = keras.activations.softmax),
])
model.build((None, 32, 32, 1))
model.summary()


Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv_block_9 (ConvBlock)     (None, 32, 32, 1)         0         
_________________________________________________________________
conv_block_10 (ConvBlock)    (None, 32, 32, 1)         0         
_________________________________________________________________
conv_block_11 (ConvBlock)    (None, 32, 32, 1)         0         
_________________________________________________________________
flatten_2 (Flatten)          (None, 1024)              0         
_________________________________________________________________
dense_4 (Dense)              (None, 64)                65600     
_________________________________________________________________
dense_5 (Dense)              (None, 10)                650       
Total params: 66,250
Trainable params: 66,250
Non-trainable params: 0
__________________________________________________

**The adavantage of using subclassing is that it allows us to build increadible large models with few lines of code.**