# **Creating Custom Layers in Keras**  

The video explains the importance of **custom layers** in Keras and how to implement them to extend the functionality of neural networks.  

#### **Why Use Custom Layers?**  
- **Flexibility**: Allows defining custom operations not available in standard Keras layers.  
- **Innovation**: Useful for developing new algorithms and research techniques.  
- **Optimization**: Can be tailored to specific data needs or computational constraints.  
- **Maintainability**: Encapsulates complex logic into reusable components, making the code cleaner.  

#### **Structure of a Custom Layer**  
A **custom layer** in Keras is created by subclassing `Layer` from `tensorflow.keras.layers` and implementing three key methods:  
1. `__init__` → Initializes the layer’s attributes.  
2. `build` → Creates the layer’s weights. Keras calls it once, the first time the layer is called on an input, when the input shape is known.  
3. `call` → Defines the **forward pass** logic.  
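
The split between the three methods can be seen in a minimal sketch. `ScaleLayer` is a hypothetical layer invented for this illustration (it is not part of the notebook); it only shows that the weights do not exist until the first call triggers `build`.

```python
import tensorflow as tf
from tensorflow.keras.layers import Layer


class ScaleLayer(Layer):
    """Hypothetical layer: multiplies each feature by a learned scale."""

    def __init__(self, **kwargs):
        # Only store configuration here; the input shape is not known yet.
        super().__init__(**kwargs)

    def build(self, input_shape):
        # Runs once, on the first call, when the feature dimension is known.
        self.scale = self.add_weight(
            shape=(input_shape[-1],), initializer="ones", trainable=True
        )

    def call(self, inputs):
        # Forward pass: element-wise scaling.
        return inputs * self.scale


layer = ScaleLayer()
built_before = layer.built            # build() has not run yet
outputs = layer(tf.ones((2, 4)))      # the first call triggers build()
built_after = layer.built
n_weights = len(layer.weights)        # a single weight vector of shape (4,)
```

Deferring weight creation to `build` is what lets the same layer class be reused with inputs of different feature sizes.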

#### **Example of a Custom Layer**  
`CustomDenseLayer`, defined in the cells that follow, performs a dense operation (`inputs @ w + b`) followed by a ReLU activation. It can be integrated into a **Sequential** model just like any built-in Keras layer. `MyCustomLayer` is the same layer without the activation.  


In [2]:
from tensorflow.keras.layers import Layer
import tensorflow as tf

In [3]:
class MyCustomLayer(Layer):
    def __init__(self, units=32, **kwargs):
        super(MyCustomLayer, self).__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer="random_normal",
            trainable=True
        )
        self.b = self.add_weight(
            shape=(self.units,),
            initializer="zeros",
            trainable=True
        )
    
    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b
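
Because `MyCustomLayer` forwards `**kwargs` to the base class, it accepts standard arguments such as `name`. If the layer also needs to be saved and reloaded, one common pattern is to override `get_config` so that `units` is included. The sketch below assumes that pattern; `ConfigurableDense` is a hypothetical name used only for this illustration.

```python
import tensorflow as tf
from tensorflow.keras.layers import Layer


class ConfigurableDense(Layer):
    """Hypothetical variant of MyCustomLayer that can be rebuilt from its config."""

    def __init__(self, units=32, **kwargs):
        super().__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer="random_normal",
            trainable=True,
        )
        self.b = self.add_weight(
            shape=(self.units,), initializer="zeros", trainable=True
        )

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

    def get_config(self):
        # Add the constructor argument to the base config (name, dtype, ...).
        config = super().get_config()
        config.update({"units": self.units})
        return config


layer = ConfigurableDense(units=16, name="my_dense")
clone = ConfigurableDense.from_config(layer.get_config())  # new, unbuilt copy
```

The clone shares the configuration of the original layer but not its weights, which are only created when the clone is first called.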


In [4]:
print(tf.executing_eagerly())

a = tf.constant([1, 2, 3])
b = tf.constant([4, 5, 6])
result = tf.add(a, b)
result

True


<tf.Tensor: shape=(3,), dtype=int32, numpy=array([5, 7, 9])>

In [8]:
class CustomDenseLayer(Layer):
    def __init__(self, units=32):
        super(CustomDenseLayer, self).__init__()
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(shape=(input_shape[-1], self.units), initializer="random_normal", trainable=True)
        self.b = self.add_weight(shape=(self.units,), initializer="zeros", trainable=True)

    def call(self, inputs):
        return tf.nn.relu(tf.matmul(inputs, self.w) + self.b)
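
To make the forward pass concrete, here is a NumPy mirror of what `call` computes. The helper name `dense_relu` and the small hand-picked weights are assumptions made for this illustration; the real layer draws its weights from the `random_normal` initializer.

```python
import numpy as np


def dense_relu(inputs, w, b):
    """NumPy mirror of CustomDenseLayer.call: relu(inputs @ w + b)."""
    return np.maximum(inputs @ w + b, 0.0)


x = np.array([[1.0, 2.0]])               # one sample, two features
w = np.array([[1.0, 0.0, -1.0],
              [0.5, 1.0,  1.0]])         # (features, units)
b = np.array([0.0, -3.0, 0.0])           # one bias per unit

y = dense_relu(x, w, b)                  # x @ w + b = [2, -1, 1] -> [[2., 0., 1.]]
```

The negative pre-activation in the second unit is clipped to zero, which is the only difference from the plain dense operation in `MyCustomLayer`.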

In [9]:
from tensorflow.keras.models import Sequential

model = Sequential([
    CustomDenseLayer(64),
    CustomDenseLayer(10)
])

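# Note: the last layer applies ReLU rather than softmax, so its outputs are not
# probabilities, which "categorical_crossentropy" expects by default.
model.compile(optimizer="adam", loss="categorical_crossentropy")
model.summary()  # call the method; a bare `model.summary` only echoes the bound method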
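
The model above is still unbuilt because it has not seen any input. A minimal sketch of a first forward pass follows, assuming a batch of 8 samples with 20 features each (both numbers are arbitrary choices for illustration). The layer definition is repeated so the block is self-contained.

```python
import tensorflow as tf
from tensorflow.keras.layers import Layer
from tensorflow.keras.models import Sequential


class CustomDenseLayer(Layer):
    def __init__(self, units=32):
        super().__init__()
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer="random_normal", trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                                 initializer="zeros", trainable=True)

    def call(self, inputs):
        return tf.nn.relu(tf.matmul(inputs, self.w) + self.b)


model = Sequential([CustomDenseLayer(64), CustomDenseLayer(10)])

x = tf.random.uniform((8, 20))      # assumed shape: (batch, features)
y = model(x)                        # the first call builds both layers

output_shape = tuple(y.shape)
weight_shapes = [tuple(w.shape) for w in model.weights]
```

After this call, each layer holds a kernel sized from the input it actually received: `(20, 64)` for the first layer and `(64, 10)` for the second.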

### **Create a Custom Neural Network**
This section builds a custom network based on **LSTM + Attention for NoVa**, optimized to run on consumer hardware without requiring advanced GPUs.

- The **`NoVaMemoryModel`**, based on LSTM + Attention:

 1. **Memorizes the context** of the conversation.
 2. **Saves and reloads its memory state** to maintain persistence.
 3. Uses a **tanh activation** to **simulate adaptive memory**.
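
As background for the attention step: with its default settings, the Keras `Attention` layer computes unscaled dot-product attention, and when the query, value, and key are all the LSTM outputs it acts as self-attention over time steps. The NumPy sketch below illustrates that computation for a single sequence; the function name and the random input are assumptions made for this example, and it is not the Keras implementation itself.

```python
import numpy as np


def dot_product_self_attention(seq):
    """seq: array of shape (time_steps, features).

    Returns the attended sequence (same shape) and the attention weights.
    """
    scores = seq @ seq.T                               # (T, T) similarities
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ seq, weights


seq = np.random.default_rng(0).normal(size=(5, 4))    # 5 time steps, 4 features
context, weights = dot_product_self_attention(seq)
```

Each output time step is a weighted average of all time steps, which is how the model can mix information from the whole utterance before the dense layers.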

In [56]:
import tensorflow as tf
from tensorflow.keras.layers import Input, LSTM, Dense, Attention, Embedding
from tensorflow.keras.models import Model
import numpy as np
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Sample conversation dataset
data = [
    "Hello, how are you?",
    "I'm fine, thank you! And you?",
    "I'm fine too, what are you doing?",
    "Studying artificial intelligence!",
    "Wow, interesting! Tell me more."
]

# Text tokenization
tokenizer = Tokenizer()
tokenizer.fit_on_texts(data)
sequences = tokenizer.texts_to_sequences(data)
max_len = max(len(seq) for seq in sequences)
padded_sequences = pad_sequences(sequences, maxlen=max_len, padding='post')
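
The effect of `padding='post'` can be sketched in plain Python. `pad_post` is a hypothetical helper written for this illustration; it mimics `pad_sequences` with `padding='post'` and the default `truncating='pre'`, and assumes `maxlen >= 1`.

```python
def pad_post(sequences, maxlen, value=0):
    """Pad each sequence with `value` at the end; keep only the last
    `maxlen` tokens of sequences that are too long."""
    padded = []
    for seq in sequences:
        seq = list(seq)[-maxlen:]                      # drop leading tokens if too long
        padded.append(seq + [value] * (maxlen - len(seq)))
    return padded


example = pad_post([[5, 2], [7, 1, 4, 3, 9]], maxlen=4)
# example == [[5, 2, 0, 0], [1, 4, 3, 9]]
```

The truncation behavior matters later in the notebook: any input longer than `max_len` loses its first tokens when it is padded to the training length.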

In [58]:
# NoVa Memory Model
class NoVaMemoryModel(Model):
    def __init__(self, units=128, vocab_size=1000, embedding_dim=50):
        super(NoVaMemoryModel, self).__init__()
        self.units = units
        # `input_length` is omitted: it is deprecated in Keras 3, and the layer
        # infers the sequence length from its input.
        self.embedding = Embedding(input_dim=vocab_size, output_dim=embedding_dim)
        self.lstm = LSTM(units, return_sequences=True, return_state=True)
        self.attention = Attention()
        self.dense = Dense(units, activation="relu")
        self.output_layer = Dense(units, activation="tanh")  

    def build(self, input_shape):
        super(NoVaMemoryModel, self).build(input_shape)  

    def call(self, inputs, states=None):
        embedded = self.embedding(inputs)
        batch_size = tf.shape(inputs)[0]  
        if states is None:
            states = [tf.zeros((batch_size, self.units)), tf.zeros((batch_size, self.units))]
        lstm_output, state_h, state_c = self.lstm(embedded, initial_state=states)
        memory_context = self.attention([lstm_output, lstm_output, lstm_output])
        final_output = self.dense(memory_context)
        return self.output_layer(final_output), [state_h, state_c]
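
The three values unpacked from `self.lstm(...)` can be checked in isolation. The sketch below uses small, arbitrary sizes (8 units, batch of 2, 5 time steps, 3 features) chosen only for illustration.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import LSTM

units = 8                                        # small value for illustration
lstm = LSTM(units, return_sequences=True, return_state=True)

x = tf.random.normal((2, 5, 3))                  # (batch, time_steps, features)
initial = [tf.zeros((2, units)), tf.zeros((2, units))]   # [hidden, cell]

seq_out, state_h, state_c = lstm(x, initial_state=initial)

seq_shape = tuple(seq_out.shape)                 # one output per time step
h_shape = tuple(state_h.shape)                   # final hidden state
c_shape = tuple(state_c.shape)                   # final cell state

# The final hidden state is the output at the last time step.
last_step_matches = np.allclose(seq_out[:, -1, :].numpy(), state_h.numpy(), atol=1e-6)
```

`state_h` and `state_c` are the pair that the notebook carries between calls and writes to disk as its "memory".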

In [59]:
# Model initialization
units = 128
vocab_size = len(tokenizer.word_index) + 1
embedding_dim = 50
memory_model = NoVaMemoryModel(units, vocab_size, embedding_dim)
memory_model.build((None, max_len))  

In [60]:
# Load memory state
try:
    loaded_states = np.load("nova_memory_state.npy", allow_pickle=True)
    state_h, state_c = tf.convert_to_tensor(loaded_states[0]), tf.convert_to_tensor(loaded_states[1])
    print("✅ Memory state loaded successfully!")
except FileNotFoundError:
    state_h, state_c = tf.zeros((1, units)), tf.zeros((1, units))
    print("⚠️ No previous state found. Initializing empty memory.")


⚠️ No previous state found. Initializing empty memory.


In [63]:
# Function to update and save memory
def update_memory(input_text):
    input_sequence = tokenizer.texts_to_sequences([input_text])
    input_padded = pad_sequences(input_sequence, maxlen=max_len, padding='post')
    # Token IDs are integer indices into the embedding table, so keep them as int32.
    input_tensor = tf.convert_to_tensor(input_padded, dtype=tf.int32)

    global state_h, state_c
    output, new_states = memory_model(input_tensor, [state_h, state_c])
    state_h, state_c = new_states
    
    np.save("nova_memory_state.npy", [state_h.numpy(), state_c.numpy()])
    return output.numpy().flatten()[:10]  # Returns context summary

# Function to retrieve memory context
def retrieve_memory():
    return f"Memory summary: {state_h.numpy().flatten()[:5]} ..."

print("Memory model ready for use.")

Memory model ready for use.
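
A note on the persistence step: `np.save` with a list of two equally shaped arrays stores them stacked in a single array of shape `(2, 1, units)`. One possible alternative, sketched below under that assumption, is `np.savez`, which keeps each state under its own key and needs no `allow_pickle`. The file name and the random stand-in states are hypothetical.

```python
import os
import tempfile

import numpy as np

units = 4                                            # small value for illustration
state_h = np.random.rand(1, units).astype("float32") # stand-in hidden state
state_c = np.random.rand(1, units).astype("float32") # stand-in cell state

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "memory_state.npz")     # hypothetical file name
    np.savez(path, state_h=state_h, state_c=state_c) # one named entry per state

    with np.load(path) as stored:                    # closes the file on exit
        restored_h = stored["state_h"]
        restored_c = stored["state_c"]
```

Named entries make it harder to swap the hidden and cell states by accident when the file is reloaded in a later session.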


In [64]:
# Test the memory model with sample conversations
def test_memory_model():
    # 1. Test initial memory state
    print("Initial Memory State:")
    print(retrieve_memory())
    print("\n" + "="*50 + "\n")
    
    # 2. Test updating memory with new conversations
    test_inputs = [
        "Hello AI, I'm a new user",
        "Can you remember our conversation?",
        "Let's talk about machine learning"
    ]
    
    print("Testing Memory Updates:")
    for text in test_inputs:
        print(f"\nInput: {text}")
        context = update_memory(text)
        print(f"Updated Memory Context: {context}")
        print(f"Current Memory State: {retrieve_memory()}")
        print("-"*30)
    
    # 3. Test memory persistence
    print("\nTesting Memory Persistence:")
    try:
        loaded_states = np.load("nova_memory_state.npy", allow_pickle=True)
        print("✅ Memory successfully persisted!")
        print(f"Stored Memory Shape: {loaded_states[0].shape}")
    except FileNotFoundError:
        print("❌ Memory persistence test failed!")

# Run the test
test_memory_model()

# Visual verification of model structure
print("\nModel Summary:")
memory_model.summary()

Initial Memory State:
Memory summary: [0. 0. 0. 0. 0.] ...


Testing Memory Updates:

Input: Hello AI, I'm a new user
Updated Memory Context: [ 0.00287965 -0.00238571 -0.0029122   0.00569352  0.00678138  0.00466003
 -0.00680607  0.00089767  0.00205536  0.00559776]
Current Memory State: Memory summary: [-0.00640432  0.01239642  0.00356225  0.00675242  0.01898393] ...
------------------------------

Input: Can you remember our conversation?
Updated Memory Context: [ 0.00522634 -0.00766089 -0.0071715   0.00871886  0.01135762  0.0091626
 -0.01366987 -0.00106806  0.00298307  0.01301617]
Current Memory State: Memory summary: [-0.00781322  0.01795602  0.0011015   0.01006286  0.02495648] ...
------------------------------

Input: Let's talk about machine learning
Updated Memory Context: [ 0.00632868 -0.01181056 -0.00911155  0.01069849  0.01470181  0.01121417
 -0.01678194 -0.00242839  0.00501644  0.018376  ]
Current Memory State: Memory summary: [-0.00782993  0.01918023  0.00106169  0.01270488 