<div class="grid cards" markdown>

- <svg xmlns="http://www.w3.org/2000/svg" height="50" width="50" viewBox="0 0 488 512"><!--!Font Awesome Free 6.6.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free Copyright 2024 Fonticons, Inc.--><path fill="#2094F3" d="M488 261.8C488 403.3 391.1 504 248 504 110.8 504 0 393.2 0 256S110.8 8 248 8c66.8 0 123 24.5 166.3 64.9l-67.5 64.9C258.5 52.6 94.3 116.6 94.3 256c0 86.5 69.1 156.6 153.7 156.6 98.2 0 135-70.4 140.8-106.9H248v-85.3h236.1c2.3 12.7 3.9 24.9 3.9 41.4z"/></svg>
<a href="https://colab.research.google.com/github/AmbiqAI/neuralspot-edge/blob/main/docs/guides/custom-model-architecture.ipynb" class="md-content__button md-icon" style="color: #2094F3;">
    View in Colab
</a>

- <svg xmlns="http://www.w3.org/2000/svg" height="50" width="50" viewBox="0 0 496 512"><!--!Font Awesome Free 6.6.0 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free Copyright 2024 Fonticons, Inc.--><path fill="#2094F3" d="M165.9 397.4c0 2-2.3 3.6-5.2 3.6-3.3 .3-5.6-1.3-5.6-3.6 0-2 2.3-3.6 5.2-3.6 3-.3 5.6 1.3 5.6 3.6zm-31.1-4.5c-.7 2 1.3 4.3 4.3 4.9 2.6 1 5.6 0 6.2-2s-1.3-4.3-4.3-5.2c-2.6-.7-5.5 .3-6.2 2.3zm44.2-1.7c-2.9 .7-4.9 2.6-4.6 4.9 .3 2 2.9 3.3 5.9 2.6 2.9-.7 4.9-2.6 4.6-4.6-.3-1.9-3-3.2-5.9-2.9zM244.8 8C106.1 8 0 113.3 0 252c0 110.9 69.8 205.8 169.5 239.2 12.8 2.3 17.3-5.6 17.3-12.1 0-6.2-.3-40.4-.3-61.4 0 0-70 15-84.7-29.8 0 0-11.4-29.1-27.8-36.6 0 0-22.9-15.7 1.6-15.4 0 0 24.9 2 38.6 25.8 21.9 38.6 58.6 27.5 72.9 20.9 2.3-16 8.8-27.1 16-33.7-55.9-6.2-112.3-14.3-112.3-110.5 0-27.5 7.6-41.3 23.6-58.9-2.6-6.5-11.1-33.3 2.6-67.9 20.9-6.5 69 27 69 27 20-5.6 41.5-8.5 62.8-8.5s42.8 2.9 62.8 8.5c0 0 48.1-33.6 69-27 13.7 34.7 5.2 61.4 2.6 67.9 16 17.7 25.8 31.5 25.8 58.9 0 96.5-58.9 104.2-114.8 110.5 9.2 7.9 17 22.9 17 46.4 0 33.7-.3 75.4-.3 83.6 0 6.5 4.6 14.4 17.3 12.1C428.2 457.8 496 362.9 496 252 496 113.3 383.5 8 244.8 8zM97.2 352.9c-1.3 1-1 3.3 .7 5.2 1.6 1.6 3.9 2.3 5.2 1 1.3-1 1-3.3-.7-5.2-1.6-1.6-3.9-2.3-5.2-1zm-10.8-8.1c-.7 1.3 .3 2.9 2.3 3.9 1.6 1 3.6 .7 4.3-.7 .7-1.3-.3-2.9-2.3-3.9-2-.6-3.6-.3-4.3 .7zm32.4 35.6c-1.6 1.3-1 4.3 1.3 6.2 2.3 2.3 5.2 2.6 6.5 1 1.3-1.3 .7-4.3-1.3-6.2-2.2-2.3-5.2-2.6-6.5-1zm-11.4-14.7c-1.6 1-1.6 3.6 0 5.9 1.6 2.3 4.3 3.3 5.6 2.3 1.6-1.3 1.6-3.9 0-6.2-1.4-2.3-4-3.3-5.6-2z"/></svg>
<a href="https://github.com/AmbiqAI/neuralspot-edge/blob/main/docs/guides/custom-model-architecture.ipynb" class="md-content__button md-icon" style="color: #2094F3;">
    GitHub source
</a>

</div>

# Create Custom Model Architecture

## Introduction

In this notebook, we will create a custom model architecture in a similar fashion to NSE's built-in architectures. For brevity, we will use a very simple fully-connected topology, but the same principles apply to more complex architectures.

Major concepts covered in this notebook:
* Leverage Pydantic to define the parameters of a custom model architecture
* Create a custom model architecture by subclassing `keras.Model`
* Create a functional version of the custom model architecture

In [15]:
import keras
import numpy as np
from pydantic import BaseModel, Field

## Define model parameters

The first step is to define the parameters for the model. The preferred way to do this is to use Pydantic models or dataclasses. For this model, we will need to take the following parameters:

* The number of fully-connected layers
* The number of neurons in each layer
* The activation function for each layer

Rather than passing these parameters as a nested list, we will leverage Pydantic to create a data model. This makes it clear which parameters are required and what their types are, and it validates the values when the model is constructed.

In [3]:
class CustomLayerParams(BaseModel):
    """Fully connected layer parameters

    Attributes:
        units: int: Number of neurons in the layer
        activation: str: Activation function
    """
    units: int = Field(..., ge=1, description="Number of neurons in the layer")
    activation: str = Field("relu", description="Activation function")

class CustomModelParams(BaseModel):
    """Fully connected neural network model parameters

    Attributes:
        layers: list[CustomLayerParams]: List of layers

    """
    layers: list[CustomLayerParams] = Field(..., min_length=1, description="List of layers")

### Let's create an example model definition

In [4]:
params = CustomModelParams(layers=[
    CustomLayerParams(units=64, activation="relu"),
    CustomLayerParams(units=32, activation="relu"),
    CustomLayerParams(units=16, activation="relu"),
])

In [5]:
# Let's dump the model parameters
print(params.model_dump_json(indent=2))

{
  "layers": [
    {
      "units": 64,
      "activation": "relu"
    },
    {
      "units": 32,
      "activation": "relu"
    },
    {
      "units": 16,
      "activation": "relu"
    }
  ]
}
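
To illustrate the validation mentioned above, here is a minimal sketch that round-trips the parameters through JSON and then feeds two invalid definitions to the data model. The two parameter classes are repeated from the cells above only so the block runs on its own; the `bad_definitions` inputs are made up for illustration.

```python
from pydantic import BaseModel, Field, ValidationError


# Repeated from the cells above so this block is self-contained
class CustomLayerParams(BaseModel):
    units: int = Field(..., ge=1, description="Number of neurons in the layer")
    activation: str = Field("relu", description="Activation function")


class CustomModelParams(BaseModel):
    layers: list[CustomLayerParams] = Field(..., min_length=1, description="List of layers")


# Round trip: dump to JSON, then rebuild the parameters from that JSON
params = CustomModelParams(layers=[
    CustomLayerParams(units=64),
    CustomLayerParams(units=32, activation="tanh"),
])
restored = CustomModelParams.model_validate_json(params.model_dump_json())
print("round trip preserved parameters:", restored == params)

# Hypothetical invalid definitions: an empty layer list and a layer with zero units
bad_definitions = [
    {"layers": []},
    {"layers": [{"units": 0}]},
]
error_counts = []
for bad in bad_definitions:
    try:
        CustomModelParams.model_validate(bad)
    except ValidationError as exc:
        # Each definition violates one constraint declared with Field
        error_counts.append(exc.error_count())
        print(f"rejected {bad}: {exc.error_count()} validation error(s)")
```

Because the constraints live in the data model, a bad configuration fails before any Keras layer is created.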


## Creating the model

Next, let's create the custom model generator routines. We will show two ways to create the model:

1. Subclassing `keras.Model` and defining the forward pass in the `call` method
2. Creating a functional version of the model using the Keras Functional API

In [6]:
inputs = keras.Input(shape=(128,), name="inputs")

### 1. Create a keras.Model subclass

In [7]:
class MyCustomModel(keras.Model):
    def __init__(self, params: CustomModelParams, num_classes: int|None = None, **kwargs):
        """Custom model

        Args:
            params (CustomModelParams): Model parameters
            num_classes (int|None): Number of classes for classification
        """
        super().__init__(**kwargs)
        self._dense_layers = [keras.layers.Dense(units=layer.units, activation=layer.activation) for layer in params.layers]
        if num_classes:
            self.output_act = keras.layers.Dense(num_classes, activation="softmax")
        else:
            self.output_act = None

    def call(self, inputs):
        """Forward pass

        Args:
            inputs: Input tensor
        """
        x = inputs
        for layer in self._dense_layers:
            x = layer(x)
        if self.output_act is not None:
            x = self.output_act(x)
        return x
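
To make the forward pass in `call` concrete, the following NumPy-only sketch mirrors what the stacked layers compute, assuming each dense layer evaluates `activation(x @ W + b)`. The helper names (`init_dense`, `forward`) and the random weights are illustrative and are not part of Keras or NSE.

```python
import numpy as np

rng = np.random.default_rng(0)


def relu(x):
    return np.maximum(x, 0.0)


def softmax(x):
    # Subtract the row maximum for numerical stability
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


def init_dense(n_in, n_out):
    # Illustrative random kernel and zero bias for one dense layer
    return rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out)


def forward(x, hidden, head=None):
    # Hidden layers: affine transform followed by ReLU, applied in order
    for w, b in hidden:
        x = relu(x @ w + b)
    # Optional classification head, as when num_classes is given
    if head is not None:
        w, b = head
        x = softmax(x @ w + b)
    return x


widths = [128, 64, 32, 16]  # input size followed by the three hidden widths
hidden = [init_dense(i, o) for i, o in zip(widths[:-1], widths[1:])]
head = init_dense(16, 10)

x = rng.normal(size=(4, 128))   # batch of 4 input vectors
probs = forward(x, hidden, head)
features = forward(x, hidden)   # no head: output of the last hidden layer
print(probs.shape, features.shape)
```

With a head, each row is a probability distribution over the classes; without one, the model returns the activations of the last hidden layer.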

Now let's instantiate the model, call it once on the symbolic input so that its layers are built, and check the summary.

In [8]:
model = MyCustomModel(params, num_classes=10, name="custom_model")
model(inputs)
model.summary()


### 2. Create a functional version of the model 

This is the preferred way to build models, as it allows for more flexibility and reusability.

Notice that `dense_layer` and `custom_model_layer` each return a closure that builds the layers when it is called on a tensor. This is a common pattern in functional programming: the parameters are bound first, and the tensor is passed in later.
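
The closure pattern can be sketched without Keras at all. In this toy example, plain lists of floats stand in for tensors, and all function names (`scale_layer`, `offset_layer`, `stack`) are hypothetical.

```python
from collections.abc import Callable

Vector = list[float]


def scale_layer(factor: float) -> Callable[[Vector], Vector]:
    """Bind `factor` now; apply it to an input later."""
    def layer(x: Vector) -> Vector:
        return [factor * v for v in x]
    return layer


def offset_layer(offset: float) -> Callable[[Vector], Vector]:
    """Bind `offset` now; apply it to an input later."""
    def layer(x: Vector) -> Vector:
        return [v + offset for v in x]
    return layer


def stack(*layers: Callable[[Vector], Vector]) -> Callable[[Vector], Vector]:
    """Compose several closures into one, applied left to right."""
    def layer(x: Vector) -> Vector:
        for fn in layers:
            x = fn(x)
        return x
    return layer


block = stack(scale_layer(2.0), offset_layer(1.0))
result = block([1.0, 2.0, 3.0])
print(result)  # [3.0, 5.0, 7.0]
```

Configuration and application are separate steps, so a configured block can be passed around and applied wherever a tensor is available.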


In [9]:
from collections.abc import Callable

def dense_layer(params: CustomLayerParams) -> Callable[[keras.KerasTensor], keras.KerasTensor]:
    """Create a dense functional layer

    Args:
        params: CustomLayerParams: Layer parameters

    Returns:
        Callable: Closure that applies a new dense layer to a tensor
    """
    def layer(x: keras.KerasTensor) -> keras.KerasTensor:
        return keras.layers.Dense(units=params.units, activation=params.activation)(x)
    return layer

def custom_model_layer(params: CustomModelParams) -> Callable[[keras.KerasTensor], keras.KerasTensor]:
    """Create a custom model layer

    Args:
        params: CustomModelParams: Model parameters

    Returns:
        Callable: Closure that applies the stack of dense layers to a tensor
    """
    def layer(x: keras.KerasTensor) -> keras.KerasTensor:
        # Chain one dense layer per entry in params.layers
        for param in params.layers:
            x = dense_layer(param)(x)
        return x
    return layer


def custom_model(inputs: keras.KerasTensor, params: CustomModelParams, num_classes: int|None = None) -> keras.Model:
    """Create a custom model using functional API

    Args:
        inputs: keras.KerasTensor: Symbolic input tensor created with keras.Input
        params: CustomModelParams: Model parameters
        num_classes: int|None: Number of classes

    Returns:
        keras.Model: Model
    """
    outputs = custom_model_layer(params)(inputs)
    if num_classes is not None:
        outputs = keras.layers.Dense(num_classes, activation="softmax")(outputs)
    return keras.Model(inputs=inputs, outputs=outputs, name="custom_model")

As with the subclassed model, we can instantiate the functional model and check the summary.

In [10]:
model_fn = custom_model(inputs, params, num_classes=10)
model_fn.summary()
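
As a sanity check on the summary, the expected parameter count can be worked out by hand. This sketch assumes every dense layer has a bias (the Keras default), so a layer with `n_in` inputs and `n_out` units contributes `n_in * n_out + n_out` parameters. The helper `dense_param_counts` is hypothetical.

```python
def dense_param_counts(input_dim: int, widths: list[int]) -> list[int]:
    """Return the parameter count of each dense layer in a stack."""
    counts = []
    fan_in = input_dim
    for units in widths:
        counts.append(fan_in * units + units)  # kernel + bias
        fan_in = units
    return counts


hidden_widths = [64, 32, 16]  # from the example model definition
num_classes = 10              # size of the softmax head
per_layer = dense_param_counts(128, hidden_widths + [num_classes])

print(per_layer)       # [8256, 2080, 528, 170]
print(sum(per_layer))  # 11034
```

Both the subclassed and the functional model should report this total, since they contain the same layers.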

### Validate that the two versions match

Finally, we can validate that the two versions of the model are equivalent by comparing their outputs for a random input tensor.

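Since the models are randomly initialized, we will copy the weights from the functional model to the subclassed model to ensure they are the same. This works because both models create their layers in the same order, so the two weight lists line up one-to-one.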

In [19]:
model.set_weights(model_fn.get_weights())

In [20]:
x = keras.random.normal((1, 128))
y = model(x)
y_fn = model_fn(x)

In [21]:
if np.allclose(y, y_fn, rtol=1e-5, atol=1e-5):
    print("The model and the functional model are equivalent")
else:
    print("The model and the functional model differ")


The model and the functional model are equivalent
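
For reference, `np.allclose(a, b, rtol, atol)` passes when `|a - b| <= atol + rtol * |b|` holds for every element. The sketch below spells out that check with made-up values; `is_close` is a hypothetical helper written only to show the formula.

```python
import numpy as np


def is_close(a, b, rtol=1e-5, atol=1e-5) -> bool:
    """Element-wise tolerance test, written out explicitly."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return bool(np.all(np.abs(a - b) <= atol + rtol * np.abs(b)))


y_ref = np.array([0.10, 0.20, 0.70])  # made-up reference outputs
y_small = y_ref + 1e-6                # deviation below the tolerance
y_large = y_ref + 1e-3                # deviation above the tolerance

print(is_close(y_small, y_ref))  # True
print(is_close(y_large, y_ref))  # False
```

A tolerance of `1e-5` is loose enough to absorb float32 rounding differences between the two models while still catching a genuine mismatch in weights.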
