# Multimodal AGI Architecture Implementation

**Copyright © 2025 by Ananya Soni. All Rights Reserved.**

---

### **Terms of Use: For Personal Learning Only**

This project is made public so that others may **run the code, study it, and learn** from its implementation, which covers:
* Multimodal AI
* Computer Vision & NLP
* Generative AI & Deep Learning
* Spiking Neural Networks (SNN)

**STRICTLY FORBIDDEN:**

1.  **COMMERCIAL USE:** You may **not** use this project for any profit-making or commercial purpose whatsoever.
2.  **AUTHORSHIP CLAIM:** The **original authorship is retained by Ananya Soni.** You may **not** claim to be the creator of this project or its unique components (Replicated Synapse, Frontal Lobe).

# 🌟 Temporal Causal Synthesis Network (TCS-25) - Core Architecture

## Introduction: A Novel AGI Paradigm
The **Temporal Causal Synthesis Network (TCS-25)** is a deep learning architecture designed for **Multimodal General Intelligence (MGI)**. It integrates neuro-inspired mechanisms such as **Generalized Plasticity (G-Plasticity)**, **Axiomatic Knowledge Systems (AKS)**, and a **Conscious Global Workspace (CGW)**, with the aim of supporting human-like System 2 reasoning and dynamic goal pursuit across 21 parallel data streams.

This notebook implements the full, 10-cell architecture definition using TensorFlow/Keras.

---

## Cell 1: Essential Imports and Hyper-Advanced Constants

This cell defines all necessary library dependencies and establishes the **global hyperparameters**. The scale of these constants (e.g., $\text{HYPER\_LATENT\_DIM}=4096$, $\text{NUM\_PFC\_CONTEXTS}=64$) reflects a commitment to advanced, high-capacity models for complex cognitive tasks. Defining them first ensures modularity and ease of future scaling.
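
To make the scale of these constants concrete, the following back-of-the-envelope sketch counts the weights in a single fully connected layer at these sizes. The helper `dense_param_count` is a hypothetical name introduced for illustration, and float32 storage (4 bytes per weight) is an assumption.

```python
# Rough size estimate for one dense layer at the TCS-25 constants.
# dense_param_count is a hypothetical helper, not part of the notebook.
FRONTAL_LOBE_UNITS = 2048
HYPER_LATENT_DIM = 4096

def dense_param_count(in_dim, out_dim, use_bias=True):
    """Number of trainable weights in a fully connected layer."""
    return in_dim * out_dim + (out_dim if use_bias else 0)

# Example: a bias-free projection from a 2 * FRONTAL_LOBE_UNITS vector
# into the hyper-latent space.
params = dense_param_count(2 * FRONTAL_LOBE_UNITS, HYPER_LATENT_DIM, use_bias=False)
megabytes = params * 4 / 1024**2  # assumes float32 storage

print(f"{params:,} weights, about {megabytes:.0f} MiB")
# 16,777,216 weights, about 64 MiB
```

Estimates of this kind help when deciding whether doubling a constant is affordable, since a dense layer's weight count grows with the product of its input and output widths.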

In [None]:
# Cell 1: imports and constants
import tensorflow as tf
from tensorflow.keras.layers import Layer, Input, Dense, Conv2D, Flatten, Embedding, LSTM, GRU, Bidirectional, TimeDistributed, GlobalMaxPooling1D, Concatenate, Multiply, BatchNormalization, Dot, Reshape, Dropout, Attention
from tensorflow.keras.models import Model
import numpy as np
import traceback
import random
import itertools

# --- Global Constants for TCS-25 Architecture (Hyper-Advanced) ---
IMAGE_SHAPE = (128, 128, 3)
DATA_INPUT_SIZE = 512
TS_STEPS = 10
TS_DIM = 8
SEQ_LEN = 50
SEQ_DIM = 64
GRAPH_DIM = 32
VOCAB_SIZE = 10000
NUM_CLASSES = 10
FRONTAL_LOBE_UNITS = 2048 # Doubled capacity
HYPER_LATENT_DIM = 4096 # Doubled Hyper-Dimensional Latent Space
NUM_PFC_CONTEXTS = 64 # Doubled for deeper executive control
RELATIONAL_EMB_DIM = 512 # Doubled for richer relational reasoning
CAUSAL_STATE_DIM = 256 # Doubled for richer Causal Inference Module
AXIOMATIC_DIM = 128 # NEW: Dimension for Axiomatic Knowledge Embeddings
MLC_OUTPUT_DIM = 5 # Output dimension for Meta-Learning Control
LOSS_WEIGHT_SSTC = 0.95 # Stronger emphasis on Self-Supervised Temporal Contrastive
TRAINING_BATCH_SIZE = 32 # Smaller batch for stability with larger model

---

## Cell 2: Generalized Plasticity (G-Plasticity) and Hebbian Mixin

This component introduces **G-Plasticity**, a mechanism inspired by metaplasticity in the human brain. It supplements standard backpropagation with a **Hebbian update term** ($\Delta W_P \propto \text{pre} \cdot \text{post}$), which is gated by a weighted sum of two internal cognitive signals: **Surprisal** (prediction error, weight 0.6) and **Causal Error** (disagreement with causal inference, weight 0.4).

The `GPlasticityMixin` manages these signals and the dynamic plasticity rate output by the $\text{Meta-Learning Control (MLC)}$ head, allowing the network to **learn *how* and *when* to learn** based on context and surprise.
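
The gated Hebbian term can be sketched in plain NumPy as follows. This is a minimal illustration, not the notebook's TensorFlow code: the function name `gated_hebbian_delta` is hypothetical, and the 0.6/0.4 mixing weights and the default rate of `1e-4` are taken from the mixin defined in this cell.

```python
import numpy as np

def gated_hebbian_delta(x, y, surprisal, causal_error, rate=1e-4):
    """Hebbian weight change scaled by two scalar gating signals.

    x: (batch, in_dim) pre-synaptic activations
    y: (batch, out_dim) post-synaptic activations
    """
    hebbian = x.T @ y / x.shape[0]                     # batch-averaged pre * post
    modulation = 0.6 * surprisal + 0.4 * causal_error  # mixing weights from the mixin
    return rate * modulation * hebbian

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))
y = np.tanh(x @ rng.normal(size=(4, 3)))

delta = gated_hebbian_delta(x, y, surprisal=0.5, causal_error=0.25)
print(delta.shape)                                            # (4, 3)
print(np.allclose(gated_hebbian_delta(x, y, 0.0, 0.0), 0.0))  # True
```

When both signals are zero the update vanishes, so under this rule the Hebbian term only contributes when the model is surprised or its causal prediction is off.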

In [None]:
# Cell 2: G-Plasticity mixin and plastic layers
class GPlasticityMixin:
    """TCS-25's base mixin for Generalized Plasticity (G-Plasticity), now with dynamic rate."""
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # Neuromodulatory signals (tf tensors)
        self.surprisal_signal = tf.constant([[0.0]], dtype=tf.float32)
        self.causal_signal = tf.constant([[0.0]], dtype=tf.float32)
        self.dynamic_plasticity_strength = tf.constant([[0.0001]], dtype=tf.float32)
        self.w_old = None
        self.x_input = None
        self.y_output = None

    def calculate_plasticity_change(self):
        if self.w_old is None or self.x_input is None or self.y_output is None:
            return tf.constant(0.0, dtype=tf.float32), tf.constant(0.0, dtype=tf.float32)

        hebbian_term = tf.matmul(tf.transpose(self.x_input), self.y_output) / tf.cast(tf.shape(self.x_input)[0], tf.float32)
        modulation_scalar = 0.6 * tf.cast(self.surprisal_signal[0, 0], tf.float32) + 0.4 * tf.cast(self.causal_signal[0, 0], tf.float32)
        delta_p = tf.cast(self.dynamic_plasticity_strength[0, 0], tf.float32) * modulation_scalar * hebbian_term
        return delta_p, tf.norm(delta_p)

    def track_activations(self, x_input, y_output):
        self.x_input = tf.stop_gradient(x_input)
        self.y_output = tf.stop_gradient(y_output)

class GPlasticDense(GPlasticityMixin, Dense):
    """G-Plastic Dense"""
    def __init__(self, units, activation='tanh', **kwargs):
        super().__init__(units=units, activation=activation, **kwargs)

    def call(self, inputs):
        self.w_old = self.weights[0] if self.weights else None
        output = super().call(inputs)
        self.track_activations(inputs, output)
        return output

class GPlasticConv2D(GPlasticityMixin, Conv2D):
    """G-Plastic Conv2D"""
    def __init__(self, filters, kernel_size, activation='relu', **kwargs):
        super().__init__(filters=filters, kernel_size=kernel_size, activation=activation, **kwargs)

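    def call(self, inputs):
        self.w_old = self.weights[0] if self.weights else None
        output = super().call(inputs)
        # Flatten both tensors to (batch, features) with tf.reshape so that the matmul in
        # calculate_plasticity_change receives rank-2 tensors. This also avoids
        # instantiating a new Flatten layer on every call.
        batch = tf.shape(inputs)[0]
        self.track_activations(tf.reshape(inputs, [batch, -1]), tf.reshape(output, [batch, -1]))
        return output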

class GPlasticGRU(GPlasticityMixin, Layer):
    """G-Plastic GRU wrapper"""
    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.gru_cell = GRU(units, return_sequences=False, return_state=True)
        self.proj_input = GPlasticDense(units, activation='tanh', name='gplastic_gru_input_proj')

    def call(self, inputs, initial_state=None):
        projected_input = self.proj_input(inputs)
        output, state = self.gru_cell(projected_input, initial_state=initial_state)
        # store weight snapshot for potential plasticity usage
        self.w_old = self.gru_cell.weights[0] if self.gru_cell.weights else None
        return output, state

---

## Cell 3: Axiomatic Knowledge System (AKS) and Episodic Relational Memory (ERM)

This cell defines two critical memory and knowledge components:

1.  **Axiomatic Knowledge Layer (AKL):** This layer is intended to provide a **stable, non-plastic foundation** of fundamental truths (e.g., physical laws, mathematical principles). It holds an embedding matrix of shape (`NUM_PFC_CONTEXTS`, `AXIOMATIC_DIM`) that is mixed by the context mask, giving high-level reasoning a base of core knowledge. As written, the matrix is created with `trainable=True`, so it only stays fixed during training if that flag is set to `False`.
2.  **Episodic Relational Memory (ERM):** A simplified, graph-based buffer that stores past state, context, and causal vectors. The ERM allows the network to **retrieve past experiences** relevant to the current query, providing crucial contextual cues for the $\text{Frontal Lobe}$ and $\text{Causal Inference Module}$.
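
The retrieval idea (score every stored episode against the query, then blend the episodes by softmax weight) can be sketched in NumPy. This is a simplified stand-in for `retrieve_context`, with a hypothetical function name `retrieve` and toy two-dimensional states. Its epsilon placement differs slightly from the method in this cell.

```python
import numpy as np

def retrieve(query, stored_states):
    """Similarity-weighted average of stored state vectors."""
    q = query / (np.linalg.norm(query) + 1e-8)
    s = stored_states / (np.linalg.norm(stored_states, axis=1, keepdims=True) + 1e-8)
    scores = s @ q                            # cosine similarity per episode
    weights = np.exp(scores - scores.max())   # shift for numerical stability
    weights /= weights.sum()                  # softmax over episodes
    return weights @ stored_states, weights

stored = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
blended, weights = retrieve(np.array([1.0, 0.0]), stored)
print(weights.argmax())  # 0
```

Because cosine scores lie in [-1, 1], the softmax here is fairly flat, so every stored episode contributes to the blend. Sharper retrieval would need a temperature or a top-k cut, neither of which is part of the original design.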

In [None]:
# Cell 3: AKL and EpisodicRelationalMemory
class AxiomaticKnowledgeLayer(Layer):
    """Provides an axiomatic, structured knowledge vector for context."""
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.axiom_embeddings = self.add_weight(
            name="axiom_emb_matrix",
            shape=(NUM_PFC_CONTEXTS, AXIOMATIC_DIM),
            initializer='glorot_uniform',
            trainable=True
        )
        self.context_to_axiom_proj = GPlasticDense(AXIOMATIC_DIM, activation='tanh', name='axiom_proj_plastic')

    def call(self, context_mask):
        expanded_mask = tf.expand_dims(context_mask, axis=-1)
        weighted_axioms = expanded_mask * self.axiom_embeddings
        return tf.reduce_sum(weighted_axioms, axis=1)

class EpisodicRelationalMemory:
    """Graph-based memory for episodes."""
    def __init__(self):
        self.buffer = []
        self.max_size = 10000
        self.graph_nodes = {}
        self.graph_edges = {}
        self.next_id = 0  # monotonically increasing, so ids stay unique after eviction

    def store(self, state_vector, context_mask, causal_vector):
        node_id = self.next_id
        self.next_id += 1
        episode = {
            'state': state_vector[0].numpy(),
            'context': context_mask[0].numpy(),
            'causal': causal_vector[0].numpy(),
            'id': node_id
        }

        if self.graph_nodes:
            # Link the new episode to a random episode that is still in memory.
            # The relation vector is a random placeholder, not a learned relation.
            past_node_id = random.choice(list(self.graph_nodes.keys()))
            relation_vec = np.random.rand(CAUSAL_STATE_DIM).astype(np.float32)
            self.graph_edges.setdefault(past_node_id, []).append((node_id, relation_vec))

        self.buffer.append(episode)
        self.graph_nodes[node_id] = episode

        if len(self.buffer) > self.max_size:
            # Evict the oldest episode together with its outgoing edges.
            old_node = self.buffer.pop(0)
            del self.graph_nodes[old_node['id']]
            self.graph_edges.pop(old_node['id'], None)

    def retrieve_context(self, query_vector):
        if not self.graph_nodes:
            return tf.zeros((1, FRONTAL_LOBE_UNITS + CAUSAL_STATE_DIM), dtype=tf.float32)
        query = np.asarray(query_vector[0])
        query_len = query.shape[-1]
        scores = []
        nodes = list(self.graph_nodes.values())
        for node in nodes:
            node_state_slice = node['state'][:query_len]
            denom = (np.linalg.norm(query) * np.linalg.norm(node_state_slice) + 1e-8)
            score = float(np.dot(query, node_state_slice) / denom)
            scores.append(score)
        weights = np.exp(scores) / np.sum(np.exp(scores))
        retrieved_state = np.zeros(FRONTAL_LOBE_UNITS, dtype=np.float32)
        retrieved_causal = np.zeros(CAUSAL_STATE_DIM, dtype=np.float32)
        for i, node in enumerate(nodes):
            retrieved_state += weights[i] * node['state'][:FRONTAL_LOBE_UNITS]
            retrieved_causal += weights[i] * node['causal']
        retrieved_state = tf.constant(retrieved_state, dtype=tf.float32)[tf.newaxis, :]
        retrieved_causal = tf.constant(retrieved_causal, dtype=tf.float32)[tf.newaxis, :]
        return tf.concat([retrieved_state, retrieved_causal], axis=-1)

---

## Cell 4: Relational Self-Attention (RSA) and Hierarchical Temporal State Processor (HTSP)

This section implements two advanced feature-processing units:

1.  **Relational Self-Attention (RSA):** An attention mechanism that produces one softmax weight for each of the **21 feature streams**, computed from the fused global context and a learned **relational projection** of it, and then scales each stream by its weight. This lets the network prioritize relevant modalities (e.g., visual features are down-weighted when only a structural graph input is relevant).
2.  **HTSP Unit (Hierarchical Temporal State Processor):** A conceptual unit (simplified here) that would manage **fast (GRU)** and **slow (LSTM)** time scales. This structure is essential for distinguishing between immediate, short-term changes and long-term, slow-evolving causal relationships.
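
The stream-scaling step of RSA can be sketched in NumPy. The function `scale_streams` and the hand-picked logits are hypothetical. In the layer defined in this cell the weights come from a dense softmax head over the fused context, not from fixed logits.

```python
import numpy as np

def scale_streams(features, logits):
    """Scale each stream by a softmax weight.

    features: (streams, dim) one feature vector per stream
    logits:   (streams,) unnormalized relevance scores
    """
    w = np.exp(logits - logits.max())
    w /= w.sum()
    return features * w[:, None], w

features = np.ones((3, 4))
attended, w = scale_streams(features, np.array([2.0, 0.0, 0.0]))
print(w.argmax())                            # 0
print(np.allclose(attended[1], attended[2])) # True
```

Since the weights sum to one, a uniform weighting over 21 streams multiplies every stream by 1/21, so downstream layers see features at a much smaller scale than the raw projections.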

In [None]:
# Cell 4: Relational Self-Attention and HTSP
class RelationalSelfAttention(Layer):
    """Dynamically scales features based on global context."""
    def __init__(self, num_features, feature_dim, **kwargs):
        super().__init__(**kwargs)
        self.num_features = num_features
        self.feature_dim = feature_dim
        self.relational_embeddings = self.add_weight(
            name="relational_emb_matrix",
            shape=(num_features, num_features, RELATIONAL_EMB_DIM),
            initializer='glorot_uniform',
            trainable=True
        )
        self.query_gen = GPlasticDense(feature_dim, activation='relu', name='rsa_query_plastic')
        self.key_gen = GPlasticDense(feature_dim, activation='relu', name='rsa_key_plastic')
        self.rel_context_proj_gate = GPlasticDense(self.num_features * self.num_features, activation='tanh', name='rel_context_proj_gate')
        self.attention_weights_generator = GPlasticDense(num_features, activation='softmax', name='rsa_weights_plastic')

    def call(self, projected_feature_list):
        global_context_fused = Concatenate()(projected_feature_list)
        queries = tf.stack([self.query_gen(f) for f in projected_feature_list], axis=1)
        keys = tf.stack([self.key_gen(f) for f in projected_feature_list], axis=1)
        attention_scores = Dot(axes=-1)([queries, keys])
        relational_context_proj = self.rel_context_proj_gate(global_context_fused)
        combined_context = Concatenate()([global_context_fused, relational_context_proj])
        attention_weights = self.attention_weights_generator(combined_context)
        attended_features = []
        for i, feature in enumerate(projected_feature_list):
            weight = attention_weights[:, i:i+1]
            weighted_feature = Multiply()([feature, weight])
            attended_features.append(weighted_feature)
        return attended_features, attention_weights, tf.reduce_mean(attention_scores)

class HTSP_Unit(Layer):
    """Manages temporal context across different time scales."""
    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.fast_gru = GRU(units // 2, return_sequences=False, return_state=True)
        self.slow_lstm = LSTM(units // 2, return_sequences=False, return_state=True)
        self.dense_gate = GPlasticDense(units, activation='sigmoid', name='htsp_gate_plastic')
    def call(self, inputs, fast_state, slow_state):
        fast_output, fast_state = self.fast_gru(inputs, initial_state=fast_state)
        slow_output, slow_state_h, slow_state_c = self.slow_lstm(inputs, initial_state=slow_state)
        slow_state = [slow_state_h, slow_state_c]
        fused = Concatenate()([fast_output, slow_output])
        gate = self.dense_gate(fused)
        output = Multiply()([fused, gate])
        return output, fast_state, slow_state

---

## Cell 5: The Cognitive Control Core (CGW, MLC, CIM, BG)

This cell defines the primary executive and regulatory modules inspired by cognitive neuroscience:

* **Conscious Global Workspace (CGW):** The central hub where all high-level vectors ($\text{Plastic GRU output, Axiomatic Knowledge, HLS Decoded, MCC Confidence, etc.}$) are fused and gated by a learnable attention mechanism. This output represents the network's **moment-to-moment "conscious" state.**
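* **Basal Ganglia (BG) Selector:** A softmax gating layer that produces a soft selection over the **Prefrontal Cortex (PFC) Contexts** ($\text{NUM\_PFC\_CONTEXTS}=64$) from a state vector and the $\text{Task Vector}$. In the assembled model (Cell 9) that state vector is the working-memory state rather than the $\text{CGW}$ output, because the $\text{CGW}$ itself consumes the axiomatic vector selected by the BG mask.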
* **Causal Inference Module (CIM):** A dense gating unit that processes the $\text{PFC}$ output and $\text{ERM}$'s causal context to predict the **next causal state** of the environment.
* **Meta-Learning Control (MLC):** Predicts dynamic control parameters, most notably the **MLC Plasticity Rate**, which is fed back to the $\text{G-Plasticity}$ layers.
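
The interplay between the BG selector and the multi-context PFC gating amounts to a soft mixture over per-context networks. The NumPy sketch below uses toy sizes and random weights. The names `W_gate` and `W_ctx` are hypothetical, and the task vector is omitted for brevity.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
batch, state_dim, num_contexts, units = 2, 6, 4, 5  # toy sizes, not the real constants

state = rng.normal(size=(batch, state_dim))
W_gate = rng.normal(size=(state_dim, num_contexts))        # selector weights
W_ctx = rng.normal(size=(num_contexts, state_dim, units))  # one network per context

mask = softmax(state @ W_gate)                                 # (batch, contexts)
context_out = np.tanh(np.einsum('bs,csu->bcu', state, W_ctx))  # (batch, contexts, units)
gated = (context_out * mask[:, :, None]).sum(axis=1)           # (batch, units)

print(gated.shape)  # (2, 5)
```

Every context network is evaluated for every sample and the mask only reweights the results, so the compute cost grows linearly with the number of contexts.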

In [None]:
# Cell 5: CGW (Conscious Global Workspace) and executive modules
class CGWAttentionLayer(Layer):
    """Conscious Global Workspace (CGW) Layer (TCS-25)."""
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.gating_dense = GPlasticDense(1, activation='sigmoid', name='cgw_gating_plastic')
        self.fusion_proj = None  # created on first call, once the output width is known
    def call(self, plastic_gru_output, scaled_symbolic_bias, scaled_vigilance, mcc_confidence, hls_vector, axiomatic_vector):
        fused = Concatenate()([plastic_gru_output, scaled_symbolic_bias, scaled_vigilance, mcc_confidence, hls_vector, axiomatic_vector])
        if self.fusion_proj is None:
            # In tf.keras 2.x, build() only receives the shape of the first call argument,
            # so the fusion projection is sized here instead. A bias-free linear Dense is
            # equivalent to a matmul with a (fusion_dim, output_dim) glorot_uniform kernel.
            self.fusion_proj = Dense(plastic_gru_output.shape[-1], use_bias=False, name='kernel_fusion')
        gated_fused = Multiply()([fused, self.gating_dense(fused)])
        return self.fusion_proj(gated_fused)

class MetaLearningControl(Layer):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.dense_pred = GPlasticDense(MLC_OUTPUT_DIM, activation='sigmoid', name='mlc_output_gplastic')
        self.dropout = Dropout(0.2)
    def call(self, pfc_gated_output):
        x = self.dropout(pfc_gated_output)
        return self.dense_pred(x)

class BasalGangliaSelectionLayer(Layer):
    def __init__(self, num_contexts, **kwargs):
        super().__init__(**kwargs)
        self.context_gate = GPlasticDense(num_contexts, activation='softmax', name='bg_context_gate_gplastic')
    def call(self, cgw_output, task_vector_input):
        fusion = Concatenate()([cgw_output, task_vector_input])
        return self.context_gate(fusion)

class MultiContextExecutiveGating(Layer):
    def __init__(self, units, num_contexts, **kwargs):
        super().__init__(**kwargs)
        self.context_networks = [GPlasticDense(units, activation='tanh', name=f'pfc_context_gplastic_{i}') for i in range(num_contexts)]
    def call(self, cgw_output, context_mask):
        context_outputs = tf.stack([net(cgw_output) for net in self.context_networks], axis=1)
        weighted_mask = tf.expand_dims(context_mask, axis=-1)
        gated_output = Multiply()([context_outputs, weighted_mask])
        return tf.reduce_sum(gated_output, axis=1)

class CausalInferenceModule(Layer):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.dense = GPlasticDense(CAUSAL_STATE_DIM, activation='sigmoid', name='cim_causal_output_gplastic')
        self.gating = GPlasticDense(CAUSAL_STATE_DIM, activation='tanh', name='cim_gating_gplastic')
    def call(self, pfc_gated_output, retrieved_causal_context):
        fused = Concatenate()([pfc_gated_output, retrieved_causal_context])
        return Multiply()([self.dense(fused), self.gating(fused)])

---

## Cell 6: Hyper-Latent State (HLS) Processor and Core Model Builder Setup

This section defines the $\text{Hyper-Latent State (HLS)}$ system and the input structure for the final model:

1.  **HLS Processor/Decoder:** The $\text{HLS}$ is a high-dimensional vector ($\text{HYPER\_LATENT\_DIM}=4096$) designed to capture a sparse, high-level summary of the network's state. The $\text{Processor}$ projects its input (the working-memory state in Cell 8, named `pfc_output` in the layer signature) into this space, and the $\text{Decoder}$ brings it back to a working dimension for the $\text{CGW}$.
2.  **Core Model Builder:** Defines **all 21 input streams** (Image, Text, Time Series, Graph, etc.) and the 4 control inputs ($\text{Vigilance, Bias, Task, Retrieval}$) that feed into the architecture.
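
For experimenting with the 25 inputs, a dummy batch can be generated from a table of input names and shapes. The sketch below is one possible way to do it in NumPy: the names and shapes mirror the `Input` declarations in this cell, while `INPUT_SHAPES` and `make_dummy_batch` are hypothetical helpers. A dict keyed by input-layer name is one of the formats a functional Keras model accepts.

```python
import numpy as np

VOCAB_SIZE, SEQ_LEN = 10000, 50
FRONTAL_LOBE_UNITS, CAUSAL_STATE_DIM = 2048, 256

# Per-sample shapes, keyed by the input-layer names used in this cell.
INPUT_SHAPES = {
    'stream_01_image_input': (128, 128, 3),
    'stream_02_structured_data_input': (512,),
    'stream_03_lstm_input': (10, 8),
    'stream_04_gru_input': (10, 8),
    'stream_05_bidirectional_input': (10, 8),
    'stream_06_timedistributed_input': (10, 8),
    'stream_07_text_seq_input': (SEQ_LEN,),
    'stream_08_transformer_input': (50, 64),
    'stream_09_small_cnn_input': (32, 32, 1),
    'stream_10_deep_cnn_input': (64, 64, 3),
    'stream_11_conv_autoencoder_input': (16, 16, 8),
    'stream_12_fnn_input_100': (100,),
    'stream_13_fnn_input_256': (256,),
    'stream_14_vae_latent_input': (128,),
    'stream_15_rbm_feature_input': (64,),
    'stream_16_graph_input_flat': (32 * 32,),
    'stream_17_attention_vector_input': (40,),
    'stream_18_custom_encoder_input': (80,),
    'stream_19_residual_fwd_input': (512,),
    'stream_20_residual_bwd_input': (512,),
    'stream_21_residual_final_input': (512,),
    'snn_vigilance_input': (1,),
    'symbolic_bias_input': (1,),
    'task_vector_input': (16,),
    'erm_retrieval_input': (FRONTAL_LOBE_UNITS + CAUSAL_STATE_DIM,),
}

def make_dummy_batch(batch_size=2, seed=0):
    """Random float32 arrays for every input, plus integer token ids for the text stream."""
    rng = np.random.default_rng(seed)
    batch = {name: rng.random((batch_size, *shape), dtype=np.float32)
             for name, shape in INPUT_SHAPES.items()}
    # The text stream feeds an Embedding layer, so it needs ids in [0, VOCAB_SIZE).
    batch['stream_07_text_seq_input'] = rng.integers(0, VOCAB_SIZE, size=(batch_size, SEQ_LEN))
    return batch

batch = make_dummy_batch()
print(len(batch), batch['erm_retrieval_input'].shape)  # 25 (2, 2304)
```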

In [None]:
# Cell 6: HLS Processor and start of core model builder
class HLS_Processor(Layer):
    """Projects the PFC output into the massive, sparse HLS."""
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.input_proj = GPlasticDense(FRONTAL_LOBE_UNITS * 2, activation='relu', name='hls_input_proj_plastic')
        self.latent_dense = GPlasticDense(HYPER_LATENT_DIM, activation='relu', name='hls_core_plastic', use_bias=False)
    # No custom build(): latent_dense receives the FRONTAL_LOBE_UNITS * 2 wide output of
    # input_proj, so it must be built lazily from that tensor, not from this layer's input shape.
    def call(self, pfc_output):
        x = self.input_proj(pfc_output)
        hls_output_raw = self.latent_dense(x)
        return hls_output_raw

class HLS_Decoder(Layer):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.decoder = GPlasticDense(FRONTAL_LOBE_UNITS // 2, activation='tanh', name='hls_decoder_plastic')
    def call(self, hls_vector):
        return self.decoder(hls_vector)

# Model inputs. They are declared at module level (rather than inside a builder
# function) because Cells 7 to 9 reference these tensors directly.
image_shape, data_size, ts_steps, ts_dim = IMAGE_SHAPE, DATA_INPUT_SIZE, TS_STEPS, TS_DIM
seq_len, seq_dim, graph_dim, num_classes = SEQ_LEN, SEQ_DIM, GRAPH_DIM, NUM_CLASSES

# Inputs
s1_image_input = Input(shape=image_shape, name='stream_01_image_input')
s9_small_cnn_input = Input(shape=(32, 32, 1), name='stream_09_small_cnn_input')
s10_deep_cnn_input = Input(shape=(64, 64, 3), name='stream_10_deep_cnn_input')
s11_conv_ae_input = Input(shape=(16, 16, 8), name='stream_11_conv_autoencoder_input')
s3_lstm_input = Input(shape=(ts_steps, ts_dim), name='stream_03_lstm_input')
s4_gru_input = Input(shape=(ts_steps, ts_dim), name='stream_04_gru_input')
s5_bidirectional_input = Input(shape=(ts_steps, ts_dim), name='stream_05_bidirectional_input')
s6_timedistributed_input = Input(shape=(ts_steps, ts_dim), name='stream_06_timedistributed_input')
s7_text_seq_input = Input(shape=(seq_len,), name='stream_07_text_seq_input')
s8_transformer_input = Input(shape=(seq_len, seq_dim), name='stream_08_transformer_input')
s2_structured_data_input = Input(shape=(data_size,), name='stream_02_structured_data_input')
s12_fnn_input_100 = Input(shape=(100,), name='stream_12_fnn_input_100')
s13_fnn_input_256 = Input(shape=(256,), name='stream_13_fnn_input_256')
s14_vae_latent_input = Input(shape=(128,), name='stream_14_vae_latent_input')
s15_rbm_feature_input = Input(shape=(64,), name='stream_15_rbm_feature_input')
s16_graph_input_flat = Input(shape=(graph_dim * graph_dim,), name='stream_16_graph_input_flat')
s17_attention_vector_input = Input(shape=(40,), name='stream_17_attention_vector_input')
s18_custom_encoder_input = Input(shape=(80,), name='stream_18_custom_encoder_input')
s19_residual_fwd_input = Input(shape=(512,), name='stream_19_residual_fwd_input')
s20_residual_bwd_input = Input(shape=(512,), name='stream_20_residual_bwd_input')
s21_residual_final_input = Input(shape=(512,), name='stream_21_residual_final_input')
snn_vigilance_input = Input(shape=(1,), name='snn_vigilance_input')
symbolic_bias_input = Input(shape=(1,), name='symbolic_bias_input')
task_vector_input = Input(shape=(16,), name='task_vector_input')

# ERM retrieval input, split into its state part and its causal part
retrieved_memory_and_causal = Input(shape=(FRONTAL_LOBE_UNITS + CAUSAL_STATE_DIM,), name='erm_retrieval_input')
retrieved_state = retrieved_memory_and_causal[:, :FRONTAL_LOBE_UNITS]
retrieved_causal_context = retrieved_memory_and_causal[:, FRONTAL_LOBE_UNITS:]

input_list = [s1_image_input, s2_structured_data_input, s3_lstm_input, s4_gru_input, s5_bidirectional_input,
              s6_timedistributed_input, s7_text_seq_input, s8_transformer_input, s9_small_cnn_input,
              s10_deep_cnn_input, s11_conv_ae_input, s12_fnn_input_100, s13_fnn_input_256,
              s14_vae_latent_input, s15_rbm_feature_input, s16_graph_input_flat, s17_attention_vector_input,
              s18_custom_encoder_input, s19_residual_fwd_input, s20_residual_bwd_input, s21_residual_final_input]
all_inputs = input_list + [snn_vigilance_input, symbolic_bias_input, task_vector_input, retrieved_memory_and_causal]

---

## Cell 7: Multimodal Feature Extraction and RSA Integration

This cell orchestrates the initial feature processing of the 21 data streams:

* **Specialized Encoders:** Each stream is processed by an appropriate encoder ($\text{Conv2D, LSTM, Embedding}$) before being funneled through a **Hybrid Synapse** (a $\text{G-Plastic Dense}$ layer).
* **Relational Self-Attention (RSA) Application:** The raw features are projected into a common dimension ($\text{PROJECTED\_FEATURE\_DIM}=512$) and then fed into the $\text{RSA}$ layer, which outputs **attentional weights** used to scale the features.
* **Ventral/Dorsal Split:** The attended features are conceptually split into **Ventral** (what/object recognition) and **Dorsal** (where/action-oriented) streams for parallel processing, a key neuro-architectural feature.
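
Because the split is defined by two hand-written index lists, it is worth checking that together they cover every stream exactly once. The short check below uses the index lists from this cell's code; `NUM_STREAMS` is a name introduced here for illustration.

```python
# Indices are zero-based positions in all_features_raw (index 0 is f1, the image stream).
ventral_indices = [0, 6, 7, 8, 9, 10, 1, 11, 12, 13, 14, 16]
dorsal_indices = [2, 3, 4, 5, 15, 17, 18, 19, 20]
NUM_STREAMS = 21  # illustrative constant, not defined in the notebook

overlap = set(ventral_indices) & set(dorsal_indices)
missing = set(range(NUM_STREAMS)) - set(ventral_indices) - set(dorsal_indices)

print(len(ventral_indices), len(dorsal_indices), overlap, missing)
# 12 9 set() set()
```

The two lists form a clean partition: 12 ventral streams and 9 dorsal streams, with no stream used twice or left out.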

In [None]:
# Cell 7: Feature extraction and RSA projections
# Helper
def extract_and_hybrid(input_tensor, units, name):
    return GPlasticDense(units, activation='tanh', name=f'hybrid_synapse_gplastic_{name}')(input_tensor)

# Feature extraction
x = GPlasticConv2D(128, 3, activation='relu', padding='same')(s1_image_input)
f1 = extract_and_hybrid(Flatten()(x), 512, 'image')
s7_embedded = Embedding(input_dim=VOCAB_SIZE, output_dim=128, input_length=SEQ_LEN)(s7_text_seq_input)
f7 = extract_and_hybrid(GlobalMaxPooling1D()(s7_embedded), 256, 'text')
f3 = extract_and_hybrid(LSTM(256)(s3_lstm_input), 512, 'lstm')
f4 = extract_and_hybrid(GRU(256)(s4_gru_input), 512, 'gru')
f5 = extract_and_hybrid(Bidirectional(LSTM(256))(s5_bidirectional_input), 512, 'bidir')
s6_td_output = TimeDistributed(GPlasticDense(128, activation='relu'))(s6_timedistributed_input)
f6 = extract_and_hybrid(GlobalMaxPooling1D()(s6_td_output), 256, 'timedist')
f2 = extract_and_hybrid(s2_structured_data_input, 256, 'structured')
f8 = extract_and_hybrid(Flatten()(s8_transformer_input), 512, 'transformer')
f9_conv = GPlasticConv2D(64, 3, activation='relu')(s9_small_cnn_input)
f9 = extract_and_hybrid(Flatten()(f9_conv), 256, 'small_cnn')
f10_conv = GPlasticConv2D(128, 3, activation='relu')(s10_deep_cnn_input)
f10 = extract_and_hybrid(Flatten()(f10_conv), 512, 'deep_cnn')
f11_conv = GPlasticConv2D(64, 3, activation='relu')(s11_conv_ae_input)
f11 = extract_and_hybrid(Flatten()(f11_conv), 256, 'conv_ae')
f12 = extract_and_hybrid(s12_fnn_input_100, 256, 'fnn_100')
f13 = extract_and_hybrid(s13_fnn_input_256, 512, 'fnn_256')
f14 = extract_and_hybrid(s14_vae_latent_input, 512, 'vae_latent')
f15 = extract_and_hybrid(s15_rbm_feature_input, 256, 'rbm_feature')
f16 = extract_and_hybrid(s16_graph_input_flat, 512, 'graph_flat')
f17 = extract_and_hybrid(s17_attention_vector_input, 256, 'attention_vec')
f18 = extract_and_hybrid(s18_custom_encoder_input, 512, 'custom_encoder')
f19 = extract_and_hybrid(s19_residual_fwd_input, 512, 'residual_fwd')
f20 = extract_and_hybrid(s20_residual_bwd_input, 512, 'residual_bwd')
f21 = extract_and_hybrid(s21_residual_final_input, 512, 'residual_final')

all_features_raw = [f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, f11, f12, f13, f14, f15, f16, f17, f18, f19, f20, f21]

# RSA projections
PROJECTED_FEATURE_DIM = 512
projected_features = [GPlasticDense(PROJECTED_FEATURE_DIM, activation='relu', name=f'rsa_proj_gplastic_{i}')(f) for i, f in enumerate(all_features_raw)]

rsa_features, rsa_weights, rsa_scores = RelationalSelfAttention(num_features=21, feature_dim=PROJECTED_FEATURE_DIM, name='relational_self_attention_unit')(projected_features)

ventral_indices = [0, 6, 7, 8, 9, 10, 1, 11, 12, 13, 14, 16]
dorsal_indices = [2, 3, 4, 5, 15, 17, 18, 19, 20]

rsa_ventral = [rsa_features[i] for i in ventral_indices]
rsa_dorsal = [rsa_features[i] for i in dorsal_indices]

---

## Cell 8: Core Fusion, GRU Recurrence, and HLS Injection

This cell marks the transition from multimodal perception to **cognitive working memory**:

1.  **Ventral/Dorsal Fusion:** The separate ventral and dorsal streams are concatenated and projected into the $\text{FRONTAL\_LOBE\_UNITS}$ space.
2.  **ERM Fusion and GRU Recurrence:** The fused features are concatenated with the **Retrieved State** from $\text{ERM}$ and passed into the $\text{GPlasticGRU}$. This forms the **Current Working Memory State** (the core recurrent state).
3.  **HLS Injection:** The current working memory state is then immediately used to update the $\text{Hyper-Latent State}$ via the $\text{HLS\_Processor}$.
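
The tensor shapes along this path can be traced with a small NumPy stand-in. This is a sketch under simplifying assumptions: `units = 8` replaces `FRONTAL_LOBE_UNITS = 2048`, the weights are random, batch normalization is omitted, and the GRU itself is not simulated.

```python
import numpy as np

batch, units = 2, 8  # small stand-in for FRONTAL_LOBE_UNITS = 2048
rng = np.random.default_rng(0)

ventral = rng.normal(size=(batch, units))
dorsal = rng.normal(size=(batch, units))
retrieved_state = rng.normal(size=(batch, units))

# Step 1: concatenate both paths and project back to `units`.
W_fuse = rng.normal(size=(2 * units, units))
fusion = np.tanh(np.concatenate([ventral, dorsal], axis=-1) @ W_fuse)   # (batch, units)

# Step 2: append the retrieved ERM state, then add a length-1 time axis,
# because a recurrent layer expects (batch, time, features).
fused_with_memory = np.concatenate([fusion, retrieved_state], axis=-1)  # (batch, 2 * units)
gru_input = fused_with_memory[:, None, :]                               # (batch, 1, 2 * units)

print(fusion.shape, fused_with_memory.shape, gru_input.shape)
# (2, 8) (2, 16) (2, 1, 16)
```

With a single time step per forward pass, the recurrence only carries information across steps if the previous state is fed back in as the initial state on the next call.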

In [None]:
# Cell 8: Fusion, GRU recurrence, HLS injection
ventral_features = Concatenate(name='ventral_path_fusion_rsa')(rsa_ventral)
ventral_features = GPlasticDense(FRONTAL_LOBE_UNITS, name='ventral_synapse_gplastic')(ventral_features)

dorsal_features = Concatenate(name='dorsal_path_fusion_rsa')(rsa_dorsal)
dorsal_features = GPlasticDense(FRONTAL_LOBE_UNITS, name='dorsal_synapse_gplastic')(dorsal_features)

fusion_features = Concatenate(name='core_fusion')([ventral_features, dorsal_features])
fusion_features = GPlasticDense(FRONTAL_LOBE_UNITS, name='core_pfc_input_synapse')(fusion_features)
fusion_features = BatchNormalization(name='pfc_input_bn')(fusion_features)

fused_with_memory = Concatenate(name='fusion_with_erm')([fusion_features, retrieved_state])

# HTSP state placeholders (declared here, but not wired into all_inputs or the GRU call)
htsp_fast_state = Input(shape=(FRONTAL_LOBE_UNITS,), name='htsp_fast_state')
htsp_slow_state_h = Input(shape=(FRONTAL_LOBE_UNITS,), name='htsp_slow_state_h')
htsp_slow_state_c = Input(shape=(FRONTAL_LOBE_UNITS,), name='htsp_slow_state_c')

# GRU recurrence (plastic)
plastic_gru_output, current_working_memory_state = GPlasticGRU(FRONTAL_LOBE_UNITS, name='erm_gplastic_gru')(tf.expand_dims(fused_with_memory, axis=1))

# HLS injection
hls_vector_raw = HLS_Processor(name='hls_processor')(current_working_memory_state)
hls_decoded = HLS_Decoder(name='hls_decoder')(hls_vector_raw)

---

## Cell 9: The Final Executive and Predictive Heads

This is the main orchestration cell where all executive functions are executed and the final outputs are generated:

* **Executive Flow:** The working memory state passes through the $\text{MLC, BG Selector, AKL, and CGW}$ to generate the highly filtered $\text{PFC Gated Output}$.
* **Predictive Working Memory (PWM):** The $\text{PFC}$ output drives the main **predictive heads**: predicting the **next internal state**, the **next reward**, and the **next causal state** ($\text{CIM}$). These predictions drive the primary self-supervised learning signal.
* **Model Assembly:** The `tcs_net_core` model is assembled (not yet compiled), mapping all 25 inputs to the 10 required outputs (including all control/meta-learning vectors).
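
Since the model returns its 10 outputs as a plain list, a small lookup table makes the order and widths explicit. The keys below are the variable names used in this cell and the widths follow from the Cell 1 constants. `OUTPUT_WIDTHS` and `name_outputs` are hypothetical helpers, not part of the notebook.

```python
NUM_CLASSES, FRONTAL_LOBE_UNITS, CAUSAL_STATE_DIM = 10, 2048, 256
NUM_PFC_CONTEXTS, HYPER_LATENT_DIM, AXIOMATIC_DIM = 64, 4096, 128
PWM_STATE_DIM = FRONTAL_LOBE_UNITS * 2 + CAUSAL_STATE_DIM

# Last-axis width of each output, in the order passed to Model(outputs=[...]).
OUTPUT_WIDTHS = {
    'final_classification_output': NUM_CLASSES,
    'predicted_next_state': PWM_STATE_DIM,
    'predicted_next_reward': 1,
    'bg_context_mask': NUM_PFC_CONTEXTS,
    'final_confidence_output': 1,
    'rsa_weights': 21,
    'predicted_causal_state': CAUSAL_STATE_DIM,
    'hls_vector_raw': HYPER_LATENT_DIM,
    'mlc_plasticity_rate': 1,
    'axiomatic_knowledge_vector': AXIOMATIC_DIM,
}

def name_outputs(outputs):
    """Pair a list of model outputs with the names above, in order."""
    if len(outputs) != len(OUTPUT_WIDTHS):
        raise ValueError(f"expected {len(OUTPUT_WIDTHS)} outputs, got {len(outputs)}")
    return dict(zip(OUTPUT_WIDTHS, outputs))

print(len(OUTPUT_WIDTHS), PWM_STATE_DIM)  # 10 4352
```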

In [None]:
# --- Cell 9: MCC, BG, CGW, PFC gating, predictions ---

# Meta-Cognitive Control (MCC)
mcc_attention_budget = GPlasticDense(1, activation='sigmoid', name='mcc_attention_budget_gplastic')(current_working_memory_state)
mcc_confidence = GPlasticDense(1, activation='sigmoid', name='mcc_confidence_gplastic')(current_working_memory_state)

# Meta-Learning Control (MLC)
mlc_output = MetaLearningControl(name='meta_learning_control_head')(current_working_memory_state)
mlc_plasticity_rate = mlc_output[:, 0:1]

# Basal Ganglia (BG) and Axiomatic Knowledge Integration
bg_context_mask = BasalGangliaSelectionLayer(NUM_PFC_CONTEXTS, name='basal_ganglia_selector')(
    current_working_memory_state, task_vector_input
)
axiomatic_knowledge_vector = AxiomaticKnowledgeLayer(name='axiomatic_knowledge_layer')(bg_context_mask)

# Conscious Global Workspace (CGW)
scaled_symbolic_bias = Multiply()([symbolic_bias_input, mcc_attention_budget])
scaled_vigilance = Multiply()([snn_vigilance_input, mcc_attention_budget])

cgw_output = CGWAttentionLayer(name='conscious_global_workspace')(
    plastic_gru_output,
    scaled_symbolic_bias,
    scaled_vigilance,
    mcc_confidence,
    hls_decoded,
    axiomatic_knowledge_vector
)

# Prefrontal Cortex (PFC) Executive Gating
pfc_gated_output = MultiContextExecutiveGating(
    FRONTAL_LOBE_UNITS, NUM_PFC_CONTEXTS, name='executive_gating_pfc_output'
)(cgw_output, bg_context_mask)

# PWM Heads (Predictive Working Memory Outputs)
PWM_STATE_DIM = FRONTAL_LOBE_UNITS * 2 + CAUSAL_STATE_DIM

predicted_next_state = GPlasticDense(
    PWM_STATE_DIM, activation='sigmoid', name='pwm_next_state_prediction_gplastic'
)(pfc_gated_output)

predicted_next_reward = GPlasticDense(
    1, activation='tanh', name='pwm_next_reward_prediction_scalar_gplastic'
)(pfc_gated_output)

# Causal Inference Module
predicted_causal_state = CausalInferenceModule(name='causal_inference_module')(
    pfc_gated_output, retrieved_causal_context
)

# Final Outputs
final_classification_output = GPlasticDense(
    NUM_CLASSES, activation='softmax', name='classification_output_gplastic'
)(pfc_gated_output)

final_confidence_output = mcc_confidence

# --- Assemble Final Core Model ---
tcs_net_core = Model(
    inputs=all_inputs,
    outputs=[
        final_classification_output,
        predicted_next_state,
        predicted_next_reward,
        bg_context_mask,
        final_confidence_output,
        rsa_weights,
        predicted_causal_state,
        hls_vector_raw,
        mlc_plasticity_rate,
        axiomatic_knowledge_vector,
    ],
    name='Temporal_Causal_Synthesis_Network_Core'
)

---

## Cell 10: The TCS-25 Learner, Loss Functions, and Training Driver

The final cell defines the overarching $\text{TCS\_GeneralIntelligence}$ model (the learner) and the code that drives it:

1.  **Custom `train_step`:** The heart of the architecture is the custom $\text{train\_step}$ method. It computes the combined loss, derives the $\text{MLC}$ plasticity rate and the $\text{Surprisal}$ and $\text{Causal}$ signals, hands them to every G-Plastic layer, and subtracts each layer's $\text{G-Plasticity}$ update from that layer's gradient before the optimizer step.
2.  **Self-Supervised Temporal Contrastive Loss ($\text{L}_{\text{SSTC}}$):** A cosine-similarity loss between the concatenated predicted causal state and axiomatic knowledge vector on one side and the temporal-causal target for the next step on the other. It penalizes predictions that drift away from that target, which pushes the model toward temporal continuity.
3.  **Execution Driver:** The final script instantiates the model, generates **dummy data** (24 input arrays, made up of 21 data streams and 3 control signals, plus 7 targets; the 25th core input is retrieved from the episodic memory buffer at run time), compiles the model with the custom metrics (including $\text{Surprisal}$ and $\text{Causal}$ magnitude), and runs a small training and inference demo.
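
The $\text{L}_{\text{SSTC}}$ term can be checked by hand. The sketch below is a NumPy re-derivation, not the notebook's Keras code. It assumes the term is computed as `(1 + CosineSimilarity loss) * 0.01`, as in this cell's `train_step`, and that the Keras `CosineSimilarity` loss is the negative mean cosine similarity over the batch. The function name `sstc_term` and the `eps` guard are illustrative choices.

```python
import numpy as np

def sstc_term(y_true, y_pred_causal, axiomatic_vector, scale=0.01, eps=1e-12):
    """Return scale * (1 - mean cosine similarity) for a batch.

    Assumed to mirror (1 + CosineSimilarity loss) * 0.01, where the
    prediction is the causal state concatenated with the axiomatic vector.
    """
    # Build the prediction the same way train_step does: concatenate on the last axis.
    y_pred = np.concatenate([y_pred_causal, axiomatic_vector], axis=-1)
    # L2-normalize both sides; eps avoids division by zero for all-zero rows.
    t = y_true / np.maximum(np.linalg.norm(y_true, axis=-1, keepdims=True), eps)
    p = y_pred / np.maximum(np.linalg.norm(y_pred, axis=-1, keepdims=True), eps)
    # Per-sample cosine similarity, then the batch mean.
    cos = np.sum(t * p, axis=-1)
    return scale * (1.0 - cos.mean())
```

Under these assumptions the term is 0 when the prediction is perfectly aligned with the target, 0.01 when the two are orthogonal, and 0.02 when they point in opposite directions, so the term is bounded and never negative.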

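The custom `train_step` applies each layer's plasticity update by subtracting it from that layer's gradient before the optimizer runs. The sketch below shows that step in isolation with NumPy arrays. It is a simplified illustration under stated assumptions: the helper names (`fold_plasticity_into_gradients`, `sgd_step`) are hypothetical, and plain SGD stands in for the notebook's Adam optimizer, whose adaptive scaling would change the size of the resulting weight change.

```python
import numpy as np

def fold_plasticity_into_gradients(variables, gradients, plastic_deltas):
    """Subtract each plasticity delta from the gradient of its variable.

    plastic_deltas is a list of (variable, delta) pairs. Variables are
    matched by identity (id), never by ==, because == on arrays or
    framework variables is an element-wise comparison.
    """
    index_by_id = {id(v): i for i, v in enumerate(variables)}
    adjusted = list(gradients)
    for var, delta in plastic_deltas:
        i = index_by_id.get(id(var))
        if i is None or adjusted[i] is None:
            continue  # unknown variable or no gradient: leave untouched
        adjusted[i] = adjusted[i] - delta.astype(adjusted[i].dtype)
    return adjusted

def sgd_step(variables, gradients, lr):
    """Plain SGD update (an assumption for clarity); None gradients are skipped."""
    return [v if g is None else v - lr * g for v, g in zip(variables, gradients)]

# Tiny demo: one plastic weight vector and one bias without a gradient.
w = np.array([1.0, 2.0])
b = np.array([0.5])
grads = [np.array([0.1, 0.1]), None]
adjusted = fold_plasticity_into_gradients([w, b], grads, [(w, np.array([0.2, 0.0]))])
new_w, new_b = sgd_step([w, b], adjusted, lr=0.5)
```

With plain SGD, subtracting a delta from the gradient moves the weight by `+lr * delta` on top of the usual descent step, which is why the sign in `train_step` is a subtraction.
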
In [None]:
# Cell 10: Learner, data generator, training driver, inference
import traceback  # used by the error handlers below; harmless if Cell 1 already imports it
class TCS_GeneralIntelligence(Model):
    """TCS-25: Learner with SSTC and plasticity handling."""
    def __init__(self, tcs_net_core_model, snn_neuron, rule_reasoner, erm_buffer):
        super().__init__()
        self.tcs_net_core = tcs_net_core_model
        self.snn = snn_neuron
        self.rule_reasoner = rule_reasoner
        self.erm_buffer = erm_buffer
        self.sru = GPlasticDense(1, activation='sigmoid', name='sru_dopamine_gate')

        self.classification_loss_fn = tf.keras.losses.CategoricalCrossentropy(from_logits=False)
        self.state_prediction_loss_fn = tf.keras.losses.MeanSquaredError(name='state_prediction_mse_loss')
        self.reward_prediction_loss_fn = tf.keras.losses.MeanSquaredError(name='reward_prediction_mse_loss')
        self.confidence_loss_fn = tf.keras.losses.MeanSquaredError(name='confidence_mse_loss')
        self.causal_loss_fn = tf.keras.losses.MeanSquaredError(name='causal_mse_loss')
        self.hls_contrastive_loss_fn = tf.keras.losses.CosineSimilarity(axis=-1, name='hls_contrastive_loss')
        self.mlc_reg_loss_fn = tf.keras.losses.MeanSquaredError(name='mlc_reg_loss')

        self.temporal_causal_contrastive_loss = tf.keras.losses.CosineSimilarity(axis=-1, name='sstc_loss')

        self.plasticity_targets = []
        for layer in tcs_net_core_model.layers:
            if isinstance(layer, GPlasticityMixin):
                self.plasticity_targets.append(layer)
            if hasattr(layer, 'proj_input') and isinstance(layer.proj_input, GPlasticDense):
                self.plasticity_targets.append(layer.proj_input)
            if isinstance(layer, HLS_Processor) and isinstance(layer.latent_dense, GPlasticDense):
                self.plasticity_targets.append(layer.latent_dense)
            if isinstance(layer, Dense) and hasattr(layer, 'w_old'):
                self.plasticity_targets.append(layer)

    def compile(self, optimizer, loss, metrics=None):
        # Copy the list so the caller's `metrics` argument is not mutated by extend().
        metrics = list(metrics) if metrics else []
        metrics.extend([
            tf.keras.metrics.Mean(name='classification_loss'),
            tf.keras.metrics.Mean(name='state_prediction_loss'),
            tf.keras.metrics.Mean(name='causal_prediction_loss'),
            tf.keras.metrics.Mean(name='hls_contrastive_loss'),
            tf.keras.metrics.Mean(name='sstc_loss_mean'),
            tf.keras.metrics.Mean(name='mlc_rate_mean'),
            tf.keras.metrics.Mean(name='surprisal_update_mag'),
            tf.keras.metrics.Mean(name='causal_update_mag')
        ])
        super().compile(optimizer=optimizer, loss=loss, metrics=metrics)

    @tf.function
    def train_step(self, data):
        (x_data, y_targets) = data
        x_inputs = x_data[:21]
        x_snn_vigilance_raw = x_data[21]
        x_symbolic_bias_raw = x_data[22]
        x_task_vector = x_data[23]

        y_true_classification = y_targets[0]
        y_true_next_state = y_targets[1]
        y_true_next_reward = y_targets[2]
        y_true_confidence = y_targets[3]
        y_true_causal_state = y_targets[4]
        y_true_next_hls = y_targets[5]
        y_true_temporal_causal = y_targets[6]

        query_vector = x_inputs[18][0:1]
        retrieved_mem_causal = self.erm_buffer.retrieve_context(query_vector)
        x_retrieved_mem_causal = tf.tile(retrieved_mem_causal, [tf.shape(x_inputs[0])[0], 1])

        total_surprisal_mag = tf.constant(0.0)
        total_causal_mag = tf.constant(0.0)

        with tf.GradientTape() as tape:
            # list(...) keeps the concatenation valid whether Keras delivers the inputs as a list or a tuple.
            core_inputs = list(x_inputs) + [x_snn_vigilance_raw, x_symbolic_bias_raw, x_task_vector, x_retrieved_mem_causal]
            (y_pred_classification, y_pred_state, y_pred_reward, context_mask,
             y_pred_confidence, rsa_weights, y_pred_causal, y_pred_hls,
             mlc_plasticity_rate, axiomatic_vector) = self.tcs_net_core(core_inputs, training=True)

            state_prediction_loss = self.state_prediction_loss_fn(y_true_next_state, y_pred_state)
            reward_prediction_loss = self.reward_prediction_loss_fn(y_true_next_reward, y_pred_reward)
            causal_prediction_loss = self.causal_loss_fn(y_true_causal_state, y_pred_causal)
            classification_loss = self.classification_loss_fn(y_true_classification, y_pred_classification)
            confidence_loss = self.confidence_loss_fn(y_true_confidence, y_pred_confidence)
            rsa_sparsity_loss = tf.reduce_mean(tf.norm(rsa_weights, ord=1, axis=-1)) * 0.0005

            hls_contrastive_loss = (1 + self.hls_contrastive_loss_fn(y_true_next_hls, y_pred_hls)) * 0.005
            mlc_reg_loss = tf.reduce_mean(mlc_plasticity_rate) * 0.0001
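            # tf.concat avoids building a new Keras Concatenate layer on every trace.
            sstc_prediction = tf.concat([y_pred_causal, axiomatic_vector], axis=-1)
            sstc_loss = (1 + self.temporal_causal_contrastive_loss(y_true_temporal_causal, sstc_prediction)) * 0.01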

            total_loss = classification_loss \
                       + (LOSS_WEIGHT_SSTC * state_prediction_loss) \
                       + (LOSS_WEIGHT_SSTC * reward_prediction_loss) \
                       + (LOSS_WEIGHT_SSTC * causal_prediction_loss) \
                       + confidence_loss \
                       + rsa_sparsity_loss \
                       + hls_contrastive_loss \
                       + mlc_reg_loss \
                       + sstc_loss

        # The losses are scalars; reshape them to (1, 1) so they concatenate with the
        # (1, 1) slices below and can be indexed as [0, 0] later in this method.
        mean_pred_error = tf.reshape(tf.reduce_mean(state_prediction_loss + reward_prediction_loss), [1, 1])
        sru_input_enhanced = tf.concat([x_snn_vigilance_raw[0:1], mean_pred_error, y_pred_confidence[0:1]], axis=-1)
        global_surprisal_rate = self.sru(sru_input_enhanced)
        global_causal_rate = tf.reshape(tf.reduce_mean(causal_prediction_loss + sstc_loss), [1, 1])

        trainable_vars = self.trainable_variables
        gradients = tape.gradient(total_loss, trainable_vars)

        for layer in self.plasticity_targets:
            # set neuromodulatory signals (tensors)
            layer.dynamic_plasticity_strength = mlc_plasticity_rate
            layer.surprisal_signal = global_surprisal_rate
            layer.causal_signal = global_causal_rate

            # Find the layer's weight by identity. `in` and `list.index` fall back to `==`,
            # which TensorFlow variables overload as an element-wise comparison.
            var_index = None
            for attr in ('kernel', 'w'):
                candidate = getattr(layer, attr, None)
                if candidate is None:
                    continue
                var_index = next((i for i, v in enumerate(trainable_vars) if v is candidate), None)
                if var_index is not None:
                    break

            if var_index is not None:
                delta_p, mag_p = layer.calculate_plasticity_change()
                if gradients[var_index] is not None:
                    # Fold the plasticity delta into the gradient; the cast guards against a dtype mismatch.
                    gradients[var_index] = gradients[var_index] - tf.cast(delta_p, gradients[var_index].dtype)

                total_surprisal_mag += mag_p * global_surprisal_rate[0,0]
                total_causal_mag += mag_p * global_causal_rate[0,0]

        self.optimizer.apply_gradients(zip(gradients, trainable_vars))
        self.compiled_metrics.update_state(y_true_classification, y_pred_classification)

        state_to_store = tf.concat([y_pred_state, y_pred_reward], axis=-1)
        self.erm_buffer.store(state_to_store, context_mask, y_pred_causal)

        return {m.name: m.result() for m in self.metrics}

# Dummy data generator: 21 data-stream inputs + 3 control inputs, and 7 targets (all random values)
def generate_dummy_data_tcs_net(n_samples, num_classes):
    ts_steps, ts_dim, seq_len, seq_dim, graph_dim = TS_STEPS, TS_DIM, SEQ_LEN, SEQ_DIM, GRAPH_DIM
    X_inputs_data = [
        np.random.rand(n_samples, *IMAGE_SHAPE).astype(np.float32),
        np.random.rand(n_samples, DATA_INPUT_SIZE).astype(np.float32),
        np.random.rand(n_samples, ts_steps, ts_dim).astype(np.float32),
        np.random.rand(n_samples, ts_steps, ts_dim).astype(np.float32),
        np.random.rand(n_samples, ts_steps, ts_dim).astype(np.float32),
        np.random.rand(n_samples, ts_steps, ts_dim).astype(np.float32),
        tf.constant(np.random.randint(0, VOCAB_SIZE, size=(n_samples, seq_len))),
        np.random.rand(n_samples, seq_len, seq_dim).astype(np.float32),
        np.random.rand(n_samples, 32, 32, 1).astype(np.float32),
        np.random.rand(n_samples, 64, 64, 3).astype(np.float32),
        np.random.rand(n_samples, 16, 16, 8).astype(np.float32),
        np.random.rand(n_samples, 100).astype(np.float32),
        np.random.rand(n_samples, 256).astype(np.float32),
        np.random.rand(n_samples, 128).astype(np.float32),
        np.random.rand(n_samples, 64).astype(np.float32),
        np.random.rand(n_samples, graph_dim * graph_dim).astype(np.float32),
        np.random.rand(n_samples, 40).astype(np.float32),
        np.random.rand(n_samples, 80).astype(np.float32),
        np.random.rand(n_samples, 512).astype(np.float32),
        np.random.rand(n_samples, 512).astype(np.float32),
        np.random.rand(n_samples, 512).astype(np.float32),
    ]
    X_controls = [
        np.random.rand(n_samples, 1).astype(np.float32),
        np.random.rand(n_samples, 1).astype(np.float32),
        np.random.rand(n_samples, 16).astype(np.float32)
    ]
    X_inputs_all = X_inputs_data + X_controls
    Y_classification = tf.one_hot(np.random.randint(0, num_classes, n_samples), depth=num_classes)
    PWM_STATE_DIM = FRONTAL_LOBE_UNITS * 2 + CAUSAL_STATE_DIM
    Y_true_next_state = np.random.rand(n_samples, PWM_STATE_DIM).astype(np.float32)
    Y_true_next_reward = np.random.rand(n_samples, 1).astype(np.float32)
    Y_true_confidence = 1.0 - np.random.rand(n_samples, 1).astype(np.float32) * 0.5
    Y_true_causal_state = np.random.rand(n_samples, CAUSAL_STATE_DIM).astype(np.float32)
    Y_true_next_hls = np.random.rand(n_samples, HYPER_LATENT_DIM).astype(np.float32)
    SSTC_TARGET_DIM = CAUSAL_STATE_DIM + AXIOMATIC_DIM
    Y_true_temporal_causal = np.random.rand(n_samples, SSTC_TARGET_DIM).astype(np.float32)
    Y_targets = [Y_classification, Y_true_next_state, Y_true_next_reward, Y_true_confidence, Y_true_causal_state, Y_true_next_hls, Y_true_temporal_causal]
    return X_inputs_all, Y_targets

# Execution driver (small demo; may be heavy in Colab)
N_SAMPLES = 1024
EPOCHS = 3
print("\n🌌 --- ASSEMBLING TEMPORAL CAUSAL SYNTHESIS NETWORK (TCS-25) --- 🌌")

try:
    tcs_net_core = build_tcs_net_core_model(IMAGE_SHAPE, DATA_INPUT_SIZE, TS_STEPS, TS_DIM, SEQ_LEN, SEQ_DIM, GRAPH_DIM, NUM_CLASSES)
    # Placeholder stand-ins: the learner stores these objects but train_step never calls them.
    class LIFNeuron: pass
    class RuleBasedReasoner: pass
    snn_neuron_instance = LIFNeuron()
    rule_reasoner_instance = RuleBasedReasoner()
    erm_buffer_instance = EpisodicRelationalMemory()

    tcs_net_model = TCS_GeneralIntelligence(tcs_net_core, snn_neuron_instance, rule_reasoner_instance, erm_buffer_instance)

    tcs_net_model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0003), loss=tf.keras.losses.CategoricalCrossentropy(), metrics=[tf.keras.metrics.CategoricalAccuracy()])

    X_inputs_all, Y_targets = generate_dummy_data_tcs_net(N_SAMPLES, NUM_CLASSES)

    print(f"\n--- Starting TCS-25 Training ({N_SAMPLES} Samples, {EPOCHS} Epochs) ---")
    history = tcs_net_model.fit(X_inputs_all, Y_targets, epochs=EPOCHS, batch_size=TRAINING_BATCH_SIZE, verbose=1)
    print("\n✅ Training Complete. TCS-25 is functional and hyper-plastic.")
except NameError as e:
    print(f"🚨 ERROR: A required constant or class is missing. Error: {e}")
except Exception as e:
    print(f"🚨 ERROR during Training: {e}")
    print(traceback.format_exc())

# Minimal inference demo
print("\n--- Executing Minimal TCS-25 Inference Cycle ---")
try:
    X_test_all, _ = generate_dummy_data_tcs_net(1, NUM_CLASSES)
    query_vector = X_test_all[18][0:1]
    X_test_erm_retrieval = erm_buffer_instance.retrieve_context(query_vector)
    X_test_final = X_test_all + [X_test_erm_retrieval.numpy()]
    results = tcs_net_core.predict(X_test_final, verbose=0)
    final_output, pred_state, pred_reward, context_mask, pred_confidence, rsa_weights, pred_causal, pred_hls, mlc_rate, axiomatic_vector = results

    # Look the tracked metrics up by name; keras.Model has no `get_metric` accessor.
    try:
        metrics_by_name = {m.name: m for m in tcs_net_model.metrics}
    except Exception:
        metrics_by_name = {}
    surprisal_metric = metrics_by_name.get('surprisal_update_mag')
    causal_metric = metrics_by_name.get('causal_update_mag')
    sstc_metric = metrics_by_name.get('sstc_loss_mean')

    surprisal_mag = surprisal_metric.result().numpy() if surprisal_metric is not None else 0.0
    causal_mag = causal_metric.result().numpy() if causal_metric is not None else 0.0
    sstc_loss = sstc_metric.result().numpy() if sstc_metric is not None else 0.0

    attended_indices = np.argsort(rsa_weights[0])[-3:][::-1]

    print("\n====================================================")
    print("  TEMPORAL CAUSAL SYNTHESIS NETWORK (TCS-25) RESULTS ")
    print("====================================================")
    print(f"   -> Raw Predicted Class: {np.argmax(final_output[0])}   (Prob: {np.max(final_output[0]):.4f})")
    print(f"   -> Predicted Next Reward (PWM): {pred_reward[0,0]:.4f}")
    print(f"   -> Inferred Winning Context (BG): Context {np.argmax(context_mask[0])}")
    print(f"   -> Predicted System Confidence (MCC Output): {pred_confidence[0,0]:.4f}")
    print(f"   -> Predicted Causal State (CIM Output): Mean: {np.mean(pred_causal[0]):.4f}")
    print(f"   -> Axiomatic Knowledge Vector (AKE): L2 Norm: {np.linalg.norm(axiomatic_vector[0]):.4f}")
    print(f"   -> Dynamic Plasticity Rate (MLC Output): {mlc_rate[0,0]:.6f}")
    print(f"   -> HLS L1 Norm (Sparsity Check): {np.linalg.norm(pred_hls[0], ord=1):.4f}")
    print(f"   -> SSTC Loss (Temporal Consistency Check): {sstc_loss:.6f}")
    print(f"   -> G-Plasticity Surprisal Update Magnitude (Avg.): {surprisal_mag:.4f}")
    print(f"   -> G-Plasticity Causal Update Magnitude (Avg.): {causal_mag:.4f}")
    print(f"   -> Top 3 Attended Feature Streams (RSA): Index {attended_indices[0]} ({rsa_weights[0, attended_indices[0]]:.4f}), Index {attended_indices[1]} ({rsa_weights[0, attended_indices[1]]:.4f}), Index {attended_indices[2]} ({rsa_weights[0, attended_indices[2]]:.4f})")
    print("====================================================")
except Exception as e:
    print(f"❌ ERROR during TCS-25 Inference Demo: {e}")
    print(traceback.format_exc())