# Keras Model Preparer

This notebook shows how to prepare a Keras model for quantization. Specifically, the preparer converts a Keras model with subclass layers into a Keras model with functional layers. This is required because the AIMET quantization tooling only supports models built with the Keras Functional and Sequential APIs.

To learn more about the Keras Model Preparer, please refer to the API Docs in AIMET.

#### Overall flow
This notebook covers the following:
1. Creating a Keras model with subclass layers
2. Converting the Keras model with subclass layers to a Keras model with functional layers
3. Showing similarities and differences between the original and converted models
4. Discussing the limitations of the Keras Model Preparer

---
## 1. Creating a Keras model with subclass layers

First, we create a Keras model with subclass layers. For this notebook, we use a text classification transformer model from the Keras examples, which can be found [here](https://keras.io/examples/nlp/text_classification_with_transformer/). The subclass layers used in this model are `TokenAndPositionEmbedding` and `TransformerBlock`, defined below.

In [None]:
import tensorflow as tf

class TransformerBlock(tf.keras.layers.Layer):
    def __init__(self, embed_dim, num_heads, ff_dim, rate=0.1):
        super(TransformerBlock, self).__init__()
        self.att = tf.keras.layers.MultiHeadAttention(num_heads=num_heads, key_dim=embed_dim)
        self.ffn = tf.keras.Sequential(
            [tf.keras.layers.Dense(ff_dim, activation="relu"), tf.keras.layers.Dense(embed_dim),]
        )
        self.layernorm1 = tf.keras.layers.LayerNormalization(epsilon=1e-6)
        self.layernorm2 = tf.keras.layers.LayerNormalization(epsilon=1e-6)
        self.dropout1 = tf.keras.layers.Dropout(rate)
        self.dropout2 = tf.keras.layers.Dropout(rate)

    def call(self, inputs, training, **kwargs):
        attn_output = self.att(inputs, inputs)
        attn_output = self.dropout1(attn_output, training=training)
        out1 = self.layernorm1(inputs + attn_output)
        ffn_output = self.ffn(out1)
        ffn_output = self.dropout2(ffn_output, training=training)
        return self.layernorm2(out1 + ffn_output)



class TokenAndPositionEmbedding(tf.keras.layers.Layer):
    def __init__(self, maxlen, vocab_size, embed_dim):
        super(TokenAndPositionEmbedding, self).__init__()
        self.token_emb = tf.keras.layers.Embedding(input_dim=vocab_size, output_dim=embed_dim)
        self.pos_emb = tf.keras.layers.Embedding(input_dim=maxlen, output_dim=embed_dim)

    def call(self, x, **kwargs):
        maxlen = tf.shape(x)[-1]
        positions = tf.range(start=0, limit=maxlen, delta=1)
        positions = self.pos_emb(positions)
        x = self.token_emb(x)
        x = x + positions
        return x
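
The `x + positions` step in `TokenAndPositionEmbedding.call` relies on broadcasting: the token embeddings have shape `(batch, maxlen, embed_dim)`, while the position embeddings have shape `(maxlen, embed_dim)`. The following is a minimal NumPy sketch of that shape arithmetic only. The sizes are made up for illustration, and plain random lookup tables stand in for the Keras `Embedding` layers.

```python
import numpy as np

batch, maxlen, vocab_size, embed_dim = 2, 5, 11, 3  # illustrative sizes only

rng = np.random.default_rng(0)
token_table = rng.normal(size=(vocab_size, embed_dim))   # stands in for token_emb
position_table = rng.normal(size=(maxlen, embed_dim))    # stands in for pos_emb

token_ids = rng.integers(0, vocab_size, size=(batch, maxlen))
token_vectors = token_table[token_ids]                   # (batch, maxlen, embed_dim)
position_vectors = position_table[np.arange(maxlen)]     # (maxlen, embed_dim)

# Broadcasting adds the same position vectors to every sequence in the batch.
embedded = token_vectors + position_vectors
print(embedded.shape)
```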

With those subclass layers defined, we can now define the model. Since we are not training the model, we will use random weights and a random input tensor to build the model.

In [None]:
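import numpy as np

vocab_size = 20000  # Number of distinct token IDs
maxlen = 200  # Sequence length

# Random integer token IDs, used only to build and exercise the model
random_input = np.random.randint(0, vocab_size, size=(10, maxlen))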

embed_dim = 32  # Embedding size for each token
num_heads = 2  # Number of attention heads
ff_dim = 32  # Hidden layer size in feed forward network inside transformer

inputs = tf.keras.layers.Input(shape=(maxlen,))
embedding_layer = TokenAndPositionEmbedding(maxlen, vocab_size, embed_dim)
x = embedding_layer(inputs)
transformer_block = TransformerBlock(embed_dim, num_heads, ff_dim)
x = transformer_block(x)
x = tf.keras.layers.GlobalAveragePooling1D()(x)
x = tf.keras.layers.Dropout(0.1)(x)
x = tf.keras.layers.Dense(20, activation="relu")(x)
x = tf.keras.layers.Dropout(0.1)(x)
outputs = tf.keras.layers.Dense(2, activation="softmax")(x)

model = tf.keras.Model(inputs=inputs, outputs=outputs)
_ = model(random_input)
model.summary()

From the `model.summary()` output, we can see the model's two subclass layers: `token_and_position_embedding` and `transformer_block`. Because these layers use other layers inside their classes, we need to extract those inner layers to create an equivalent functional model.
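
Subclass layers can also be spotted programmatically rather than by reading the summary. The sketch below is one possible heuristic, not part of AIMET: the helper `find_subclass_layers`, the `BUILTIN_PACKAGES` set, and the toy `ToyBlock` layer are all made up for illustration. It assumes that built-in layers live in a package named `keras`, `tf_keras`, or `tensorflow`.

```python
import tensorflow as tf

# Assumption: built-in Keras layers are defined in one of these top-level packages.
BUILTIN_PACKAGES = {"keras", "tf_keras", "tensorflow"}


def find_subclass_layers(model):
    """Return names of top-level layers whose class is defined outside Keras."""
    return [
        layer.name
        for layer in model.layers
        if type(layer).__module__.split(".")[0] not in BUILTIN_PACKAGES
    ]


class ToyBlock(tf.keras.layers.Layer):
    """Toy subclass layer that hides a Dense layer inside its call method."""

    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.dense = tf.keras.layers.Dense(units, activation="relu")

    def call(self, x):
        return self.dense(x)


inputs = tf.keras.Input(shape=(8,))
x = ToyBlock(4, name="toy_block")(inputs)
outputs = tf.keras.layers.Dense(2, name="head")(x)
toy_model = tf.keras.Model(inputs, outputs)

print(find_subclass_layers(toy_model))
```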

---
## 2. Converting the Keras model with subclass layers to a Keras model with functional layers

The Keras Model Preparer converts a Keras model with subclass layers into a Keras model with functional layers. It can be imported from `aimet_tensorflow.keras.model_preparer`; its `prepare_model` function takes the original model and returns the converted one. Note that `prepare_model` also takes an optional `input_layer` parameter, which is required if the model begins with a subclass layer. In this case, the model does not begin with a subclass layer, so we do not need to provide `input_layer`.

In [None]:
from aimet_tensorflow.keras.model_preparer import prepare_model

functional_model = prepare_model(model) 
functional_model.summary()

We can see that the Keras Model Preparer has converted the model with subclass layers to a model with functional layers. Specifically, it has extracted the `call` method of each subclass layer and created functional layers from it.
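
To make the idea of unwrapping concrete, the sketch below does the same thing by hand for a toy layer. It only illustrates the concept and is not how AIMET implements `prepare_model`. The `TwoDenseBlock` class and all variable names are made up for this example.

```python
import numpy as np
import tensorflow as tf


class TwoDenseBlock(tf.keras.layers.Layer):
    """Toy subclass layer wrapping two Dense layers."""

    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.dense_a = tf.keras.layers.Dense(units, activation="relu", name="dense_a")
        self.dense_b = tf.keras.layers.Dense(units, name="dense_b")

    def call(self, x):
        x = self.dense_a(x)
        return self.dense_b(x)


# Original model: the block appears as a single opaque layer.
block = TwoDenseBlock(4, name="two_dense_block")
wrapped_inputs = tf.keras.Input(shape=(8,))
wrapped_model = tf.keras.Model(wrapped_inputs, block(wrapped_inputs))

# Hand-unwrapped model: replay the body of `call` on a fresh Input, reusing
# the block's inner layers so both models share the same weights.
flat_inputs = tf.keras.Input(shape=(8,))
flat_outputs = block.dense_b(block.dense_a(flat_inputs))
flat_model = tf.keras.Model(flat_inputs, flat_outputs)

wrapped_names = [layer.name for layer in wrapped_model.layers]
flat_names = [layer.name for layer in flat_model.layers]
print(wrapped_names)
print(flat_names)

# Both models should produce the same output for the same input.
sample = np.random.random((3, 8)).astype("float32")
np.testing.assert_allclose(
    wrapped_model(sample).numpy(), flat_model(sample).numpy(), rtol=1e-6
)
```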

---
## 3. Showing similarities and differences between the original and converted models

The original and converted models have the same structure; the only difference is that the subclass layers are unwrapped. The converted model should therefore be functionally identical to the original, which we can test in a few ways.

1) We can compare the total number of parameters in the original and converted models. We can see that the total number of parameters is the same.

2) We can compare the weights of the original and converted models. We can see that the weights are the same.
    * Note that `get_weights()` returns the weights in a different order for each model, and the weight names differ as well. We can use an internal AIMET function to get the original model's weights in the same order as the converted model's weights.

3) We can compare the outputs of the original and converted models. We can see that the outputs are the same.

In [None]:
from aimet_tensorflow.keras.model_preparer import _get_original_models_weights_in_functional_model_order

assert functional_model.count_params() == model.count_params()
assert functional_model.input_shape == model.input_shape
assert functional_model.output_shape == model.output_shape

# NOTE: Since TextClassification Model has the internal layers out of order compared to the call method,
# the weights are not in the order of what the actual architecture is (this is a Keras design).
# Therefore, we get the original model's weights and sort them in the order of the actual
# architecture and use those weights to compare to the functional model's weights.
model_weights_in_correct_order = _get_original_models_weights_in_functional_model_order(
    model, functional_model, class_names=["token_and_position_embedding", "transformer_block"])

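functional_model_weights = functional_model.get_weights()
assert len(model_weights_in_correct_order) == len(functional_model_weights)
for original_weight, functional_weight in zip(model_weights_in_correct_order, functional_model_weights):
    np.testing.assert_array_equal(original_weight, functional_weight)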

np.testing.assert_array_equal(functional_model(random_input).numpy(), model(random_input).numpy())
print("Models are equal")
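
The reordering helper is needed because Keras lists a layer's weights in the order its sublayers were assigned in `__init__`, not in the order `call` uses them. The sketch below uses a toy layer (`OutOfOrderBlock` is made up for illustration) that should make this visible under TF 2.x Keras: the weights of `second` are expected to be listed before those of `first`, even though `first` runs first.

```python
import tensorflow as tf


class OutOfOrderBlock(tf.keras.layers.Layer):
    """Toy layer whose sublayers are created in the opposite order to their use."""

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.second = tf.keras.layers.Dense(3, name="second")
        self.first = tf.keras.layers.Dense(5, name="first")

    def call(self, x):
        return self.second(self.first(x))


inputs = tf.keras.Input(shape=(4,))
toy_model = tf.keras.Model(inputs, OutOfOrderBlock()(inputs))

# Kernel shapes reveal which Dense layer each weight belongs to:
# `first` maps 4 -> 5 features and `second` maps 5 -> 3 features.
weight_shapes = [weight.shape for weight in toy_model.get_weights()]
print(weight_shapes)
```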

---
## 4. Discussing the limitations of the Keras Model Preparer

- The AIMET Keras ModelPreparer API is able to convert subclass layers that have arithmetic expressions in their call function.
However, this API, and Keras itself, will convert these operations to TFOpLambda layers, which are not currently supported by the AIMET Keras Quantization API.
If possible, it is recommended that the call function of a subclass layer resemble the Keras Functional API.
For example, if a subclass layer has two convolution layers, its call function should look like
the following:

    ```python
    def call(self, x, **kwargs):
        x = self.conv_1(x)
        x = self.conv_2(x)
        return x
    ```

- If the model starts with a subclassed layer, the AIMET Keras ModelPreparer API will need a Keras Input Layer as input.
This is because the Keras Functional API requires an Input Layer as the first layer in the model. The AIMET Keras ModelPreparer API
will raise an exception if the model starts with a subclassed layer and an Input Layer is not provided as input.
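
The first limitation can be seen in plain Keras, without AIMET. The sketch below builds the same residual addition twice: once with the Python `+` operator and once with the built-in `tf.keras.layers.Add` layer. The model and layer names are made up for illustration. Under TF 2.x Keras the operator version is expected to show up as an op-lambda layer, while the second version keeps the addition as a regular Keras layer.

```python
import numpy as np
import tensorflow as tf

inputs = tf.keras.Input(shape=(4,))
dense = tf.keras.layers.Dense(4, name="dense")
hidden = dense(inputs)

# Raw Python operator: Keras wraps it in an op layer rather than a built-in layer.
operator_model = tf.keras.Model(inputs, inputs + hidden)

# Built-in layer: the same addition expressed as a regular Keras layer.
added = tf.keras.layers.Add(name="residual_add")([inputs, hidden])
layer_model = tf.keras.Model(inputs, added)

operator_layer_types = [type(layer).__name__ for layer in operator_model.layers]
layer_model_types = [type(layer).__name__ for layer in layer_model.layers]
print(operator_layer_types)
print(layer_model_types)

# Both models share the same Dense weights, so their outputs should match.
sample = np.random.random((2, 4)).astype("float32")
np.testing.assert_allclose(
    operator_model(sample).numpy(), layer_model(sample).numpy(), rtol=1e-6
)
```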

---
## Summary

We hope this notebook helped you understand how to use the Keras Model Preparer.

A few additional resources:
- [AIMET API Docs](https://quic.github.io/aimet-pages/releases/latest/user_guide/index.html)