# <font color='#154360'> <b> <center> CUSTOM LAYERS </center> </b> </font>

## <font color='blue'> Table of Contents </font>

1. [Introduction](#1)
2. [Setup](#2)  
3. [Custom Lambda Layers](#3)
4. [Custom Layers with Weights](#4)
5. [Activation in custom layers](#5)
6. [References](#references)


<a name="1"></a>
## <font color='blue'> 1. Introduction </font> 

This notebook demonstrates how to create custom layers in TensorFlow. We’ll cover key concepts, including defining the `call` and `build` methods, handling trainable parameters, and integrating custom layers into models.

<a name="2"></a>
## <font color='blue'> 2. Setup </font> 

In [15]:
import tensorflow as tf
import numpy as np
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Layer

In [3]:
# Set seeds
SEED = 42

tf.random.set_seed(SEED)
np.random.seed(SEED)

<a name="3"></a>
## <font color='blue'> 3. Custom Lambda Layers </font> 

The Lambda layer in TensorFlow allows you to define simple custom operations without creating a full Layer subclass. It's useful for applying custom functions within a model while keeping the code concise.

### Example 1: Absolute value

In the first example, a Lambda layer applies the absolute value to the output of a Dense layer.

In [5]:
# Define a simple model without Lambda layer
model_without_lambda = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(3,)),  # Input shape with 3 features
    tf.keras.layers.Dense(3)  # Simple dense layer
])

# Define a model with Lambda layer applied to output
model_with_lambda = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(3,)),
    tf.keras.layers.Dense(3),
    tf.keras.layers.Lambda(lambda x: tf.abs(x)) # Lambda layer!!!
])

# Create test input
test_input = np.array([[-1.0, 2.0, -3.0]])

# Run inference
output_without_lambda = model_without_lambda.predict(test_input)
output_with_lambda = model_with_lambda.predict(test_input)

print("Input:", test_input)
print("Output without Lambda layer:", output_without_lambda)
print("Output with Lambda layer:", output_with_lambda)


Input: [[-1.  2. -3.]]
Output without Lambda layer: [[-0.07905126  0.7423949  -0.42413902]]
Output with Lambda layer: [[0.813118  1.1552937 1.9624093]]
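
Note that the second output is not the absolute value of the first: the two models are built separately, so their Dense layers start from different random weights. The NumPy sketch below illustrates this outside of Keras. The kernels `w_first` and `w_second` and the seed are illustrative assumptions, not the weights of the models above.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.array([[-1.0, 2.0, -3.0]])

# Two independently initialized 3x3 kernels stand in for the two Dense layers.
w_first = rng.normal(size=(3, 3))
w_second = rng.normal(size=(3, 3))

plain = x @ w_first               # plays the role of the model without Lambda
separate = np.abs(x @ w_second)   # absolute value, but on different weights
shared = np.abs(x @ w_first)      # absolute value on the same weights

print(np.allclose(separate, np.abs(plain)))  # differs for independent weights
print(np.allclose(shared, np.abs(plain)))    # True
```

To compare a model with and without the Lambda layer directly, both would need to share the same Dense weights.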


To see the absolute value at work on a single set of weights, we can wrap the function in a custom layer that prints the values before and after applying it.

In [13]:
# Define a custom layer that wraps a function and prints its input and output
class PrintInputOutput(tf.keras.layers.Layer):
    def __init__(self, lambda_func):
        super(PrintInputOutput, self).__init__()
        self.lambda_func = lambda_func

    def call(self, inputs):
        # tf.print shows the actual tensor values when the layer runs
        tf.print("Input to Lambda layer:", inputs)
        output = self.lambda_func(inputs)
        # Print the values after applying the wrapped function
        tf.print("Output from Lambda layer:", output)
        return output

# Define the model
model_with_lambda = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(3,)),
    tf.keras.layers.Dense(3),
    PrintInputOutput(lambda x: tf.abs(x))  # Apply the Lambda operation
])

# Compile the model
model_with_lambda.compile(optimizer='adam', loss='mse')

# Example dummy data for inference
X = np.random.random((1, 3))  # 1 sample, 3 features

# Perform inference and print the input and output
print("Input:", X)
output = model_with_lambda.predict(X)
print("Final output:", output)


Input: [[0.87977674 0.01321019 0.50682292]]
Input to Lambda layer: [[-0.588795424 -0.215181828 1.25408578]]
Output from Lambda layer: [[0.588795424 0.215181828 1.25408578]]
Final output: [[0.5887954  0.21518183 1.2540858 ]]


### Example 2: Scaling 
    
Now, we are going to implement a lambda layer that multiplies the output by 10.

In [6]:
# Define a simple model without Lambda layer
model_without_lambda = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(3,)),  # Input shape with 3 features
    tf.keras.layers.Dense(3)  # Simple dense layer
])

# Define a model with Lambda layer applied to output
model_with_lambda = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(3,)),
    tf.keras.layers.Dense(3),
    tf.keras.layers.Lambda(lambda x: x * 10)
])

# Create test input
test_input = np.array([[-1.0, 2.0, -3.0]])

# Run inference
output_without_lambda = model_without_lambda.predict(test_input)
output_with_lambda = model_with_lambda.predict(test_input)

print("Input:", test_input)
print("Output without Lambda layer:", output_without_lambda)
print("Output with Lambda layer:", output_with_lambda)

Input: [[-1.  2. -3.]]
Output without Lambda layer: [[-0.93204284  3.467924    2.4794106 ]]
Output with Lambda layer: [[41.085434 40.95478  10.633764]]


### Example 3: Custom function 
    
Finally, we will implement a lambda layer that applies a leaky ReLU. In this case, we will also train the model.
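
Before looking at the Keras version, it may help to see why `maximum(0.1 * x, x)` behaves like a leaky ReLU. The NumPy sketch below uses a hypothetical helper `leaky_relu` and made-up inputs; it is only meant to illustrate the arithmetic.

```python
import numpy as np

def leaky_relu(x, alpha=0.1):
    # For 0 < alpha < 1, the larger of alpha * x and x is x itself when
    # x >= 0, and alpha * x when x < 0.
    return np.maximum(alpha * x, x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(leaky_relu(x))  # values: -0.2, -0.05, 0.0, 1.5
```

Negative inputs are scaled by 0.1 instead of being set to zero, so they still carry a gradient during training.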

In [14]:
# Define a custom Leaky ReLU function
def my_leaky_relu(x):
    return K.maximum(0.1 * x, x)


# Define a model with Lambda layer applying Leaky ReLU
model_with_lambda = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128),
    tf.keras.layers.Lambda(my_leaky_relu),
    tf.keras.layers.Dense(10, activation='softmax')
])

# Compile the model
model_with_lambda.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Create fake training data
x_train = np.random.rand(1000, 28, 28)
y_train = np.random.randint(0, 10, 1000)

# Train the model
model_with_lambda.fit(x_train, y_train, epochs=5, batch_size=32, verbose=1)

# Create test input
test_input = np.random.rand(1, 28, 28)  # Random input

# Run inference
output_with_lambda = model_with_lambda.predict(test_input)

print("Output with Lambda layer:", output_with_lambda)


Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Output with Lambda layer: [[0.04984679 0.07913906 0.06390966 0.05209231 0.19305727 0.07833236
  0.09449093 0.1550185  0.17974283 0.05437021]]


<a name="4"></a>
## <font color='blue'> 4. Custom Layers with Weights </font> 

We'll now create a custom layer by subclassing the Keras `Layer` class. Unlike a simple Lambda layer, such a layer can hold weights that are updated during training.

A trainable custom layer typically defines three methods:

- `__init__()`: stores configuration, such as the number of units.
- `build()`: creates the layer's weights once the input shape is known.
- `call()`: defines the forward computation from inputs to outputs.

Together, these give the layer a state (its weights) and a computation that are used during training and inference.

```
class CustomLayer(tf.keras.layers.Layer):
    def __init__(self, ...):  
        super(CustomLayer, self).__init__(...)
        # Initialize attributes

    def build(self, input_shape):  
        # Define layer's weights
        self.some_weight = self.add_weight(...)

    def call(self, inputs):  
        # Define the forward pass
        return some_transformation(inputs)
```

### Example 1: Custom Dense Layer

In a dense layer the output is computed as:

$$
\mathbf{y} = \sigma(\mathbf{W} \mathbf{x} + \mathbf{b})
$$

Where:

- $\mathbf{x}$ is the input vector (with shape $n$).
- $\mathbf{W}$ is the weight matrix (with shape $m \times n$, where $m$ is the number of neurons in the layer).
- $\mathbf{b}$ is the bias vector (with shape $m$).
- $\sigma$ is the activation function (e.g., ReLU, sigmoid).
- $\mathbf{y}$ is the output vector (with shape $m$).

The equation represents a linear transformation of the input, followed by the application of an activation function to introduce non-linearity.

In the following example, we will not apply the activation function; we will introduce it later.
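
As a rough check of the formula and its shapes, here is a NumPy sketch with made-up values for $\mathbf{W}$, $\mathbf{b}$ and $\mathbf{x}$ (all assumptions for illustration). It also shows the row-vector form `x @ kernel + b`, with the kernel stored as an $n \times m$ matrix, which is the layout used by the `SimpleDense` code in this section.

```python
import numpy as np

W = np.array([[1.0, 0.0, -1.0],
              [2.0, 1.0, 0.5]])   # shape (m, n) = (2, 3)
b = np.array([0.5, -1.0])         # shape (m,)
x = np.array([1.0, 2.0, 3.0])     # shape (n,)

# Column-vector form from the equation (no activation): y = W x + b
y_column = W @ x + b
print(y_column)  # values: -1.5, 4.5

# Row-vector (batched) form: the kernel is the transpose, with shape (n, m)
kernel = W.T
batch = x[np.newaxis, :]          # shape (1, n): one sample
y_batch = batch @ kernel + b      # shape (1, m)
print(y_batch.shape)  # (1, 2)
```

Both forms give the same numbers; the batched form simply processes one sample per row.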





In [16]:
class SimpleDense(Layer):
    def __init__(self, units=32):
        '''Initializes the instance attributes'''
        super(SimpleDense, self).__init__()
        self.units = units

    def build(self, input_shape):
        '''Create the state of the layer (weights)'''
        # Initialize the weights
        self.w = self.add_weight(name="kernel",
                                 shape=(input_shape[-1], self.units),
                                 initializer=tf.random_normal_initializer(),
                                 trainable=True)

        # Initialize the biases
        self.b = self.add_weight(name="bias",
                                 shape=(self.units,),
                                 initializer=tf.zeros_initializer(),
                                 trainable=True)

    def call(self, inputs):
        '''Defines the computation from inputs to outputs'''
        return tf.matmul(inputs, self.w) + self.b

Let's use this layer:

In [17]:
# declare an instance of the class
my_dense = SimpleDense(units=1)

# define an input and feed into the layer
x = tf.ones((1, 1))
y = my_dense(x)

# attributes of the base Layer class, such as `variables`, are available
print(my_dense.variables)

[<tf.Variable 'simple_dense/kernel:0' shape=(1, 1) dtype=float32, numpy=array([[0.01637343]], dtype=float32)>, <tf.Variable 'simple_dense/bias:0' shape=(1,) dtype=float32, numpy=array([0.], dtype=float32)>]


Now, let's use this layer in a model:

In [18]:
# Define dataset (reshape to 2D: (samples, features))
xs = np.array([-1.0,  0.0, 1.0, 2.0, 3.0, 4.0], dtype=float).reshape(-1, 1)
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0], dtype=float).reshape(-1, 1)

# my custom layer
my_layer = SimpleDense(units=1)

# simple model that uses our custom layer
model = tf.keras.Sequential([my_layer])

# compile
model.compile(optimizer='sgd', loss='mean_squared_error')

# train
model.fit(xs, ys, epochs=500, verbose=0)

# perform inference
print(model.predict(np.array([[10.0]])))  # 2D input: 1 sample, 1 feature

[[18.98151]]
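
The training data follows the line $y = 2x - 1$, so the exact answer for $x = 10$ is 19, and the prediction above is close to it. As a sanity check that does not involve Keras, a least-squares fit with `np.polyfit` (our choice here, not part of the original workflow) recovers the same line:

```python
import numpy as np

xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0])
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0])

# A degree-1 polynomial fit returns the coefficients [slope, intercept]
slope, intercept = np.polyfit(xs, ys, 1)
prediction = slope * 10.0 + intercept
print(slope, intercept, prediction)  # approximately 2.0, -1.0, 19.0
```

The custom layer's kernel and bias play the roles of the slope and intercept, which is why 500 epochs of SGD bring the prediction close to 19.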


### Example 2: Quadratic Layer

In [21]:
class SquareLayer(tf.keras.layers.Layer):
    """
    Custom Keras layer that squares the input tensor element-wise.
    This layer applies the square function to each element of the input tensor.

    Methods:
    --------
    call(inputs):
        Computes the element-wise square of the input tensor.
    """
    def __init__(self):
        super(SquareLayer, self).__init__()

    def call(self, inputs):
        # Apply element-wise square to the input tensor
        return tf.square(inputs)


In [22]:
# Create a model with the custom layer
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(3,)),  
    SquareLayer()
])

# Test the layer with sample input
test_input = np.array([[1.0, -2.0, 3.0]])
output = model.predict(test_input)

print("Input:", test_input)
print("Output:", output)

Input: [[ 1. -2.  3.]]
Output: [[1. 4. 9.]]


### Example 3: Scaling

In [23]:
class ScaleLayer(tf.keras.layers.Layer):
    """
    Custom Keras layer that scales the input tensor by a trainable scalar weight.

    This layer multiplies the input tensor by a scalar weight, which is initialized
    with a given value (default is 1.0) and is trainable during model training.

    Attributes:
    -----------
    scale: tf.Variable
        A trainable scalar that scales the input tensor.

    Methods:
    --------
    build(input_shape):
        Initializes the trainable scalar weight.
    
    call(inputs):
        Scales the input tensor by the trainable scalar.
    """
    def __init__(self, initial_scale=1.0):
        super(ScaleLayer, self).__init__()
        self.initial_scale = initial_scale

    def build(self, input_shape):
        # Initialize the trainable scalar weight with a constant initializer
        self.scale = self.add_weight(
            name="scale",
            shape=(1,),
            initializer=tf.constant_initializer(self.initial_scale),
            trainable=True
        )

    def call(self, inputs):
        # Scale the input tensor by the trainable scalar weight
        return inputs * self.scale


In [24]:
# Create fake dataset (y = 3 * x)
x_train = np.random.rand(1000, 1).astype(np.float32) * 10
y_train = 3 * x_train  

# Create model with the custom layer
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(1,)),
    ScaleLayer(initial_scale=1.0)  # Trainable scaling factor
])

# Compile model
model.compile(optimizer='sgd', loss='mse')

# Train the model
model.fit(x_train, y_train, epochs=100, verbose=0)

# Test the model
test_input = np.array([[5.0]], dtype=np.float32)
output = model.predict(test_input)

print("Test Input:", test_input)
print("Predicted Output:", output)
print("Learned Scale Factor:", model.layers[0].get_weights()[0])

Test Input: [[5.]]
Predicted Output: [[15.]]
Learned Scale Factor: [3.]


In [25]:
# Test the model
test_input = np.array([[15.0]], dtype=np.float32)
output = model.predict(test_input)

print("Test Input:", test_input)
print("Predicted Output:", output)
print("Learned Scale Factor:", model.layers[0].get_weights()[0])

Test Input: [[15.]]
Predicted Output: [[45.]]
Learned Scale Factor: [3.]
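
Why the scale converges to 3: for the loss $\text{mean}((s x - y)^2)$, the gradient with respect to $s$ is $2\,\text{mean}(x (s x - y))$, which is zero at $s = 3$ when $y = 3x$. The sketch below runs plain full-batch gradient descent in NumPy. The learning rate, step count and seed are assumptions, and this is not what `model.fit` does internally (Keras trains on mini-batches).

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(1000) * 10.0   # same kind of data as above: x in [0, 10)
y = 3.0 * x

scale = 1.0                   # same starting point as initial_scale=1.0
learning_rate = 0.01          # assumed value for this sketch
for _ in range(50):
    # Gradient of mean((scale * x - y)^2) with respect to scale
    grad = 2.0 * np.mean(x * (scale * x - y))
    scale -= learning_rate * grad

# Closed-form least-squares solution for a single scale factor
closed_form = np.sum(x * y) / np.sum(x * x)
print(scale, closed_form)  # both approximately 3.0
```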


<a name="5"></a>
## <font color='blue'> 5. Activation in custom layers </font> 

To use the built-in Keras activations in a custom layer:

- Add an `activation` parameter to the `__init__()` method of the custom layer class.

- Resolve it with `tf.keras.activations.get()`, which accepts a string identifier for one of the activations available in Keras (or a callable).

- Pass the result of the forward computation through this activation in the `call()` method.
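
The lookup step can be pictured as a small registry that maps identifiers to functions. The sketch below is a NumPy stand-in, not Keras's implementation: `ACTIVATIONS` and `get_activation` are hypothetical names introduced here. It mirrors one detail of `tf.keras.activations.get()` worth knowing: passing `None` yields the identity (linear) activation rather than `None`.

```python
import numpy as np

# Hypothetical registry, for illustration only
ACTIVATIONS = {
    "linear": lambda x: x,
    "relu": lambda x: np.maximum(x, 0.0),
    "sigmoid": lambda x: 1.0 / (1.0 + np.exp(-x)),
}

def get_activation(identifier):
    """Resolve None, a callable, or a string name to an activation function."""
    if identifier is None:
        return ACTIVATIONS["linear"]      # no activation means identity
    if callable(identifier):
        return identifier                 # custom functions pass through
    if identifier in ACTIVATIONS:
        return ACTIVATIONS[identifier]
    raise ValueError(f"Unknown activation: {identifier!r}")

z = np.array([-1.0, 0.0, 2.0])
print(get_activation("relu")(z))   # values: 0.0, 0.0, 2.0
print(get_activation(None)(z))     # unchanged: -1.0, 0.0, 2.0
```

Because the resolved activation is always a callable, the layer can apply it unconditionally in `call()`.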

### Example 1: Custom Dense Layer with Activation Function

In [46]:
class SimpleDense(tf.keras.layers.Layer):
    def __init__(self, units=32, activation=None):
        """
        Custom dense layer with optional activation function.

        Parameters:
        -----------
        units: int
            The number of units (neurons) in the dense layer.
        activation: str or callable, optional
            The activation function to apply after the linear transformation.
            If None, no activation function will be applied.
        """
        super(SimpleDense, self).__init__()
        self.units = units
        # Get the activation function using Keras' utility function.
        # This supports string identifiers (e.g., 'relu', 'sigmoid') or custom functions.
        self.activation = tf.keras.activations.get(activation)

    def build(self, input_shape):
        self.w = self.add_weight(name="kernel",
                                 shape=(input_shape[-1], self.units),
                                 initializer=tf.keras.initializers.RandomNormal(stddev=0.1),
                                 trainable=True)
        self.b = self.add_weight(name="bias",
                                 shape=(self.units,),
                                 initializer=tf.zeros_initializer(),
                                 trainable=True)

    def call(self, inputs):
        # Perform the linear transformation (matmul + bias)
        output = tf.matmul(inputs, self.w) + self.b

        # Apply the activation. tf.keras.activations.get(None) returns the
        # identity (linear) activation, so this also covers activation=None.
        if self.activation is not None:
            output = self.activation(output)

        return output

Let's try it:

In [47]:
# --- Fake Data (Linear Relationship) ---
x_train = np.array([[1], [2], [3], [4], [5]], dtype=np.float32)  # Input
y_train = np.array([[2], [4], [6], [8], [10]], dtype=np.float32) # Output (y = 2x)

First, we train without an activation function (`activation=None` in `SimpleDense`):

In [48]:
# --- Define Model with Custom Layer ---
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(1,)),  # Input layer
    SimpleDense(units=1, activation=None)  # Custom Dense layer
])

In [49]:
# --- Train Model ---
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01), loss='mse')
model.fit(x_train, y_train, epochs=500, verbose=0)

# --- Test Model ---
test_input = np.array([[6]], dtype=np.float32)  # Expected output: 12
output = model.predict(test_input)

print("Test Input:", test_input)
print("Predicted Output:", output)
print("Learned Weights:", model.layers[0].get_weights())  # Check trained weights

Test Input: [[6.]]
Predicted Output: [[11.938638]]
Learned Weights: [array([[1.974322]], dtype=float32), array([0.09270614], dtype=float32)]


Now let's use ReLU:

In [52]:
# --- Define Model with Custom Layer ---
model_relu = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(1,)),  # Input layer
    SimpleDense(units=1, activation='relu')  # Custom Dense layer with ReLU
])

# --- Train Model ---
model_relu.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01), loss='mse')
model_relu.fit(x_train, y_train, epochs=500, verbose=0)

# --- Test Model ---
test_input = np.array([[6]], dtype=np.float32)  # Expected output: 12
output = model_relu.predict(test_input)

print("Test Input:", test_input)
print("Predicted Output:", output)
print("Learned Weights:", model_relu.layers[0].get_weights())  # Check trained weights

Test Input: [[6.]]
Predicted Output: [[11.939157]]
Learned Weights: [array([[1.9745392]], dtype=float32), array([0.09192166], dtype=float32)]
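
Both runs end up with almost the same weights. One way to see why: with the weights printed above for the linear model, every pre-activation on these inputs is positive, and ReLU leaves positive values unchanged. A small NumPy check using those printed numbers:

```python
import numpy as np

w, b = 1.974322, 0.09270614   # weights printed by the model without activation
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])  # training inputs plus the test input

z = w * x + b                 # pre-activation values
relu_z = np.maximum(z, 0.0)   # ReLU

print(np.array_equal(relu_z, z))  # True: ReLU is the identity on positive values
print(z[-1])                      # about 11.9386, matching the prediction for x = 6
```

On data with negative targets the two models would behave differently, since ReLU cannot output negative values.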


### Example 2: Quadratic Layer

In [54]:
# Define the custom quadratic layer
class SimpleQuadratic(Layer):
    def __init__(self, units=32, activation=None):
        """
        Custom quadratic layer that applies a quadratic transformation to the input.
        
        Parameters:
        -----------
        units: int
            The number of units (neurons) in the layer.
        activation: str or callable, optional
            The activation function to apply after the quadratic transformation.
            If None, no activation function will be applied.
        """
        super().__init__()
        self.units = units
        # Get the activation function using Keras' utility function
        self.activation = tf.keras.activations.get(activation)
    
    def build(self, input_shape):
        """
        Initialize the weights for the quadratic transformation: 
        - a: coefficient for the squared input.
        - b: coefficient for the linear input.
        - c: constant term.

        Parameters:
        -----------
        input_shape: tuple
            The shape of the input tensor. We use the last dimension for weight shapes.
        """
        # Initialize the weights a, b, and c with random normal and zeros
        a_init = tf.random_normal_initializer()
        b_init = tf.random_normal_initializer()
        c_init = tf.zeros_initializer()

        # Define the trainable weights
        self.a = self.add_weight(shape=(input_shape[-1], self.units), initializer=a_init, trainable=True, name="a")
        self.b = self.add_weight(shape=(input_shape[-1], self.units), initializer=b_init, trainable=True, name="b")
        self.c = self.add_weight(shape=(self.units,), initializer=c_init, trainable=True, name="c")
   
    def call(self, inputs):
        """
        Apply the quadratic transformation: 
        - First, square the inputs.
        - Then compute the quadratic terms (ax² + bx + c).
        
        Parameters:
        -----------
        inputs: Tensor
            The input tensor to be transformed.
        
        Returns:
        --------
        output: Tensor
            The transformed tensor, optionally passed through an activation function.
        """
        # Square the input tensor element-wise
        x_squared = tf.math.square(inputs)
        
        # Compute the quadratic transformation terms
        x_squared_times_a = tf.linalg.matmul(x_squared, self.a)
        x_times_b = tf.linalg.matmul(inputs, self.b)
        
        # Combine the terms (ax² + bx + c)
        output = x_squared_times_a + x_times_b + self.c
        
        # Apply activation function if specified
        return self.activation(output) if self.activation else output
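
To make the shapes in `call()` concrete, here is a NumPy sketch of the same computation without an activation. The sizes, random values and seed are assumptions for illustration. Each output unit `j` receives the sum over input features `k` of `a[k, j] * x[k]**2 + b[k, j] * x[k]`, plus `c[j]`.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, features, units = 2, 3, 4

x = rng.normal(size=(batch, features))
a = rng.normal(size=(features, units))   # weights for the squared input
b = rng.normal(size=(features, units))   # weights for the linear input
c = np.zeros(units)                      # constant term

# Vectorized form, as in the layer: x^2 @ a + x @ b + c
output = np.square(x) @ a + x @ b + c    # shape (batch, units)

# The same computation written out element by element
expected = np.zeros((batch, units))
for i in range(batch):
    for j in range(units):
        total = c[j]
        for k in range(features):
            total += a[k, j] * x[i, k] ** 2 + b[k, j] * x[i, k]
        expected[i, j] = total

print(output.shape)                   # (2, 4)
print(np.allclose(output, expected))  # True
```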


Let's try:

In [55]:
mnist = tf.keras.datasets.mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  SimpleQuadratic(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)

Epoch 1/5




Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


[0.08106440305709839, 0.9747999906539917]

Let's change the activation function to leaky ReLU:

In [57]:
mnist = tf.keras.datasets.mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  SimpleQuadratic(128, activation='leaky_relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)

Epoch 1/5




Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


[0.10189716517925262, 0.9700999855995178]

Finally, as a baseline, let's remove the quadratic layer so that the flattened input goes through dropout directly into the output layer:

In [56]:
mnist = tf.keras.datasets.mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)

Epoch 1/5




Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


[0.27015531063079834, 0.9243000149726868]

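In this example, the quadratic layer with ReLU gave the best test accuracy (about 97.5%), followed by the quadratic layer with leaky ReLU (about 97.0%) and the model without the quadratic layer (about 92.4%). Keep in mind that the last model has no hidden layer, so it is a much smaller network.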
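
When reading these numbers, it helps to know how large each model is. The counts below follow from the layer shapes used above (a flattened 28 x 28 input, 128 units, 10 classes); `Flatten` and `Dropout` have no weights. The variable names are ours.

```python
inputs, hidden, classes = 28 * 28, 128, 10

# SimpleQuadratic: a and b are (inputs, hidden) matrices, c is a (hidden,) vector
quadratic_layer = 2 * inputs * hidden + hidden
# Dense(10) on top of the 128 hidden units: kernel plus bias
output_after_hidden = hidden * classes + classes
quadratic_model = quadratic_layer + output_after_hidden

# Baseline: Dense(10) applied directly to the flattened input
baseline_model = inputs * classes + classes

print(quadratic_model)  # 202122
print(baseline_model)   # 7850
```

The quadratic models have roughly 25 times as many trainable parameters as the baseline, so the comparison is not between networks of equal capacity.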

<a name="references"></a>
## <font color='blue'> 6. References </font> 

[TensorFlow Advanced Techniques Specialization](https://www.coursera.org/specializations/tensorflow-advanced-techniques)