# Introduction
<hr style="border:2px solid black"> </hr>

<div class="alert alert-warning">
<font color=black>

**What?** Keras functional, sequential and subclassing APIs

</font>
</div>

# Import modules
<hr style="border:2px solid black"> </hr>

In [6]:
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Dense, Input

import numpy as np
import tensorflow as tf

from tensorflow import keras
from tensorflow.keras import layers

# Sequential vs. Functional vs. Subclassing APIs
<hr style="border:2px solid black"> </hr>

<div class="alert alert-info">
<font color=black>

- The Keras Models API offers three ways to build a model:
    - The **sequential API** builds a model layer by layer as a plain stack. This makes it easy to use and to debug, but it costs flexibility: it cannot express models that share layers, have multiple inputs or outputs, or use a non-linear topology such as residual or skip connections.
    - The **functional API** is a more flexible alternative that can express more complex models. It supports non-linear topologies, shared layers, and multiple inputs or outputs. The underlying idea is to build a *graph of layers*.
    - The **subclassing API** offers maximum flexibility, at the price of more complexity and verbosity. The `Layer` class is one of the central abstractions in Keras: a layer holds both state (its weights) and a transformation from inputs to outputs (the forward pass, defined in `call`). Custom layers are especially useful as a model grows more complex, because they give us reusable “blocks” within the architecture.
    

</font>
</div>
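
To make "non-linear topology" concrete, here is a minimal sketch of a skip connection built with the functional API, which a plain `Sequential` stack cannot express. The layer width of 16 and the name `skip_model` are assumptions chosen only for illustration.

```python
from tensorflow import keras
from tensorflow.keras import layers

# hypothetical width, chosen only for illustration
inputs = layers.Input(shape=(16,))
x = layers.Dense(16, activation='relu')(inputs)
x = layers.Dense(16)(x)

# skip connection: add the block's input back onto its output
outputs = layers.Add()([inputs, x])

skip_model = keras.Model(inputs=inputs, outputs=outputs)
```

The `Add` layer takes two tensors, so the graph has a branch that rejoins. That branch is the part a linear stack of layers has no way to describe.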

# Load dataset
<hr style="border:2px solid black"> </hr>

In [3]:
# set seed to reproduce results
seed = 2021
np.random.seed(seed)
tf.random.set_seed(seed)

# load MNIST dataset
(X_train, y_train), (X_test, y_test) = keras.datasets.mnist.load_data()
# pre-process, normalize data
X_train, X_test = X_train / 255.0, X_test / 255.0

# Sequential API
<hr style="border:2px solid black"> </hr>

<div class="alert alert-info">
<font color=black>

- With the Sequential API, you create an instance of the `Sequential` class and add the model's layers to it.
- There are **two ways** to use the sequential API:
    - Pass the layers as a list to the constructor
    - Add the layers one at a time with `add()`
    
</font>
</div>

In [7]:
model = Sequential([Dense(2, input_dim=1),
                    Dense(1)])

model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense (Dense)                (None, 2)                 4         
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 3         
Total params: 7
Trainable params: 7
Non-trainable params: 0
_________________________________________________________________


In [8]:
model = Sequential()
model.add(Dense(2, input_dim=1))
model.add(Dense(1))

model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_2 (Dense)              (None, 2)                 4         
_________________________________________________________________
dense_3 (Dense)              (None, 1)                 3         
Total params: 7
Trainable params: 7
Non-trainable params: 0
_________________________________________________________________


<div class="alert alert-info">
<font color=black>

- Now let's build our model for the MNIST dataset.
- We'll use the second option shown above: adding layers one at a time.

</font>
</div>

In [9]:
# get input dimension 28x28 pixels
input_dim = (28, 28)
# get output dimensions 10 classes
output_dim = len(np.unique(y_train))

# create sequential model
model = keras.Sequential()
model.add(layers.Flatten(input_shape=input_dim))
model.add(layers.Dense(units=128, activation='relu'))
model.add(layers.BatchNormalization())
model.add(layers.Dense(units=128, activation='relu'))
model.add(layers.BatchNormalization())
model.add(layers.Dense(units=output_dim, activation='softmax'))

In [10]:
# compile the model
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),
    loss=keras.losses.SparseCategoricalCrossentropy(),
    metrics=['acc']
)

# train the model
model.fit(
    X_train, y_train,
    epochs=5,
    batch_size=128,
    validation_split=0.1,
    verbose=1
)

# get model summary
model.summary()

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
flatten (Flatten)            (None, 784)               0         
_________________________________________________________________
dense (Dense)                (None, 128)               100480    
_________________________________________________________________
batch_normalization (BatchNo (None, 128)               512       
_________________________________________________________________
dense_1 (Dense)              (None, 128)               16512     
_________________________________________________________________
batch_normalization_1 (Batch (None, 128)               512       
_________________________________________________________________
dense_2 (Dense)              (None, 10)                1290      
Total params: 119,306
Trainable params: 118,794
Non-trainable params: 512

# Functional API
<hr style="border:2px solid black"> </hr>

<div class="alert alert-info">
<font color=black>

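- The functional API lets you define models with multiple inputs or outputs, as well as models that share layers.
- Layers are connected with a double-bracket notation, `Layer(...)(input)`: the first bracket configures the current layer, the second applies it to the output of the previous layer.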
    
</font>
</div>
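
As a hedged illustration of multiple inputs and shared layers, the sketch below reuses one `Dense` instance on two inputs and merges the results. The input width, the layer sizes, and all variable names are assumptions made for this example.

```python
from tensorflow import keras
from tensorflow.keras import layers

# two hypothetical inputs of the same width
input_a = layers.Input(shape=(8,))
input_b = layers.Input(shape=(8,))

# one Dense instance applied twice: both branches use the same weights
shared_dense = layers.Dense(4, activation='relu')
encoded_a = shared_dense(input_a)
encoded_b = shared_dense(input_b)

# merge the two branches and map to a single output
merged = layers.Concatenate()([encoded_a, encoded_b])
output = layers.Dense(1)(merged)

two_input_model = keras.Model(inputs=[input_a, input_b], outputs=output)
```

Because `shared_dense` is a single object, its 36 parameters (8 x 4 weights plus 4 biases) are counted once, even though the layer appears in both branches.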

In [20]:
visible = Input(shape=(2,))
hidden = Dense(2)(visible)
model = Model(inputs=visible, outputs=hidden)
model.summary()

Model: "model_12"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_14 (InputLayer)        [(None, 2)]               0         
_________________________________________________________________
dense_23 (Dense)             (None, 2)                 6         
Total params: 6
Trainable params: 6
Non-trainable params: 0
_________________________________________________________________


<div class="alert alert-info">
<font color=black>

- What is the syntax `hidden = Dense(2)(visible)` doing?
    - `Dense(2)` creates the layer via the class constructor, i.e. via `__init__()`.
    - `(visible)` is the second bracket, “(input)”. It calls the layer object itself, which Python routes to the layer's `__call__()` method; this call is what connects the new layer to the previous one.
    - `__call__()` is a special method that a Python class can define to make its instances callable like functions.
    
</font>
</div>
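
The same mechanism can be seen in plain Python, without Keras. The toy class `Scale` below is a hypothetical stand-in for a layer and is not part of any library.

```python
class Scale:
    """Toy stand-in for a layer; not part of Keras."""

    def __init__(self, factor):
        # configuration is stored when the object is constructed
        self.factor = factor

    def __call__(self, value):
        # runs when the instance is used like a function
        return self.factor * value


double = Scale(2)                      # first bracket: __init__
result_short = double(5)               # second bracket: __call__
result_explicit = double.__call__(5)   # the same call, written out
```

Both `result_short` and `result_explicit` hold the same value, which is why `Scale(2)(5)` can be written in one line, just like `Dense(2)(visible)`.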

In [21]:
# Shorter notation
visible = Input(shape=(2,))
hidden = Dense(2)(visible)

In [22]:
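# Equivalent, written out in two steps
visible = Input(shape=(2,))
# create the layer object (not yet connected to anything)
hidden = Dense(2)
# connect the layer to the previous layer; the call returns the output tensor
hidden.__call__(visible)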

<KerasTensor: shape=(None, 2) dtype=float32 (created by layer 'dense_25')>

In [12]:
# get input dimension 28x28 pixels
input_dim = (28, 28)
# get output dimensions 10 classes
output_dim = len(np.unique(y_train))

# create model with functional api
def create_model(input_dim, output_dim):
    inputs = layers.Input(shape=input_dim)
    x = layers.Flatten()(inputs)
    x = layers.Dense(units=128, activation='relu')(x)
    x = layers.BatchNormalization()(x)
    x = layers.Dense(units=128, activation='relu')(x)
    x = layers.BatchNormalization()(x)
    outputs = layers.Dense(units=output_dim, activation='softmax')(x)
    return keras.Model(inputs=inputs, outputs=outputs)


model = create_model(input_dim, output_dim)

In [13]:
# compile the model
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),
    loss=keras.losses.SparseCategoricalCrossentropy(),
    metrics=['acc']
)

# train the model
model.fit(
    X_train, y_train,
    epochs=5,
    batch_size=128,
    validation_split=0.1,
    verbose=1
)

# get model summary
model.summary()

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Model: "model_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_2 (InputLayer)         [(None, 28, 28)]          0         
_________________________________________________________________
flatten_2 (Flatten)          (None, 784)               0         
_________________________________________________________________
dense_6 (Dense)              (None, 128)               100480    
_________________________________________________________________
batch_normalization_4 (Batch (None, 128)               512       
_________________________________________________________________
dense_7 (Dense)              (None, 128)               16512     
_________________________________________________________________
batch_normalization_5 (Batch (None, 128)               512       
_________________________________________________________________
dense_8 (

# Model Subclassing API
<hr style="border:2px solid black"> </hr>

<div class="alert alert-info">
<font color=black>

- In our model, we can use a custom layer to create a basic building block that occurs multiple times in our architecture. The class below inherits from the `Layer` class and initializes two layers: `Dense` and `BatchNormalization`. Both layers are then called in sequence within the forward pass.

</font>
</div>

In [16]:
# get input dimensions 28x28 pixels
input_dim = (28, 28)
# get output dimensions 10 classes
output_dim = len(np.unique(y_train))
# number of units in each hidden layer of the fully connected NN
hidden_layer = [128, 128]

# define custom layer
class DenseBlock(layers.Layer):
    def __init__(self, units, activation='relu'):
        super().__init__()
        self.dense = layers.Dense(units, activation=activation)
        self.bn = layers.BatchNormalization()

    def call(self, inputs):
        x = self.dense(inputs)
        x = self.bn(x)
        return x
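
A custom layer can also create its own weights instead of wrapping built-in layers. The sketch below is one hedged way to show the "state plus transformation" idea; the class name `Linear`, its sizes, and the initializers are assumptions for illustration.

```python
import tensorflow as tf
from tensorflow.keras import layers


class Linear(layers.Layer):
    """Hypothetical minimal layer computing y = x @ w + b."""

    def __init__(self, units):
        super().__init__()
        self.units = units

    def build(self, input_shape):
        # state: weights are created once the input width is known
        self.w = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer="glorot_uniform",
            trainable=True,
        )
        self.b = self.add_weight(
            shape=(self.units,), initializer="zeros", trainable=True
        )

    def call(self, inputs):
        # transformation: the forward pass
        return tf.matmul(inputs, self.w) + self.b


linear = Linear(units=4)
n_weights_before = len(linear.weights)   # nothing is built yet
y = linear(tf.ones((2, 3)))              # the first call triggers build()
n_weights_after = len(linear.weights)    # kernel and bias now exist
```

Deferring weight creation to `build` means the layer does not need to know its input width up front; it is inferred from the first input it sees.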

<div class="alert alert-info">
<font color=black>

- Next, we create our model using the custom layer from above. Our class inherits from the Keras `Model` class, hence the name subclassing.
- We create one `DenseBlock` per entry in `hidden_layer = [128, 128]`, plus two other layers: `Flatten` and the final output layer. All of the layers are then called in order within the model’s forward pass.

</font>
</div>

In [17]:
# define custom model by subclassing
class FCNN(keras.Model):
    def __init__(self, hidden_layer, output_dim, activation='relu'):
        super().__init__()
        self.hidden_layer = [DenseBlock(units, activation) for units in hidden_layer]
        self.flatten = layers.Flatten()
        self.softmax = layers.Dense(units=output_dim, activation='softmax')
    
    def call(self, inputs):
        x = self.flatten(inputs)
        for layer in self.hidden_layer:
            x = layer(x)
        x = self.softmax(x)
        return x

# instantiate new model
model = FCNN(hidden_layer, output_dim)

In [18]:
# compile the model
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),
    loss=keras.losses.SparseCategoricalCrossentropy(),
    metrics=['acc']
)

# train the model
model.fit(
    X_train, y_train,
    epochs=5,
    batch_size=128,
    validation_split=0.1,
    verbose=1
)

# get model summary
model.summary()

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Model: "fcnn_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_block_2 (DenseBlock)   multiple                  100992    
_________________________________________________________________
dense_block_3 (DenseBlock)   multiple                  17024     
_________________________________________________________________
flatten_4 (Flatten)          multiple                  0         
_________________________________________________________________
dense_14 (Dense)             multiple                  1290      
Total params: 119,306
Trainable params: 118,794
Non-trainable params: 512
_________________________________________________________________
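
The parameter counts in the summary can be checked by hand. The sketch below redoes the arithmetic in plain Python; the helper names `dense_params` and `batchnorm_params` are made up for this example. It relies on a dense layer having one weight per input-output pair plus one bias per unit, and on batch normalization keeping four values per feature, two of them trainable (scale and offset) and two not (moving mean and variance).

```python
def dense_params(n_in, n_out):
    # weight matrix plus one bias per output unit
    return n_in * n_out + n_out


def batchnorm_params(n_features):
    # scale and offset are trainable; moving mean and variance are not
    trainable = 2 * n_features
    non_trainable = 2 * n_features
    return trainable, non_trainable


n_pixels = 28 * 28          # flattened MNIST image
hidden_units = [128, 128]
n_classes = 10

total = 0
non_trainable_total = 0
n_in = n_pixels
for units in hidden_units:
    bn_trainable, bn_frozen = batchnorm_params(units)
    total += dense_params(n_in, units) + bn_trainable + bn_frozen
    non_trainable_total += bn_frozen
    n_in = units
total += dense_params(n_in, n_classes)

print(total, total - non_trainable_total, non_trainable_total)
# prints: 119306 118794 512
```

The same totals apply to all three models in this notebook, since the sequential, functional, and subclassed versions describe the same architecture.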


# Conclusions
<hr style="border:2px solid black"> </hr>

<div class="alert alert-danger">
<font color=black>

- There are three ways to create models in Keras. Depending on the problem we have to solve and the model we need to build, one way might be better suited than another, but there is no single best or correct way in general.

- Simple, linear models can and probably should be built with either the sequential or the functional API. More complex, non-linear, and highly customized models will, however, benefit greatly from subclassing.

</font>
</div>

# References
<hr style="border:2px solid black"> </hr>

<div class="alert alert-warning">
<font color=black>

- https://towardsdatascience.com/build-your-neural-networks-with-keras-in-three-ways-553cea182c6b
- https://machinelearningmastery.com/keras-functional-api-deep-learning/
- [KERAS API](https://keras.io/api/models/)

</font>
</div>