# 🧠 MMABA-PSEUDO: Mamba 2 Neural Memory Benchmark

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/dawsonblock/MMABA-PSEUDO/blob/main/MMABA_Colab.ipynb)

Train **Mamba 2 + PseudoMode Memory** on long-horizon RL tasks.

| Task | Description | Difficulty |
|:--|:--|:--|
| `delayed_cue` | Remember a signal for 200 steps | Medium |
| `copy_memory` | Memorize and reproduce a sequence | Hard |
| `assoc_recall` | Learn key→value pairs | Hard |
| `tmaze` | Navigate using a hint given at the start | Very Hard |

**Expected runtime**: ~1-2 hours on T4 GPU
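
To make the task descriptions concrete, here is a minimal sketch of what a delayed-cue episode might look like. The class `ToyDelayedCue`, the helper `run_episode`, and the reward scheme are hypothetical illustrations written for this notebook; the repository's actual environment may differ.

```python
import random


class ToyDelayedCue:
    """Hypothetical toy task: a binary cue is visible only at step 0.

    The agent earns a reward of 1.0 only if its action on the final
    step matches the cue it saw `horizon` steps earlier.
    """

    def __init__(self, horizon=200, seed=None):
        self.horizon = horizon
        self.rng = random.Random(seed)

    def reset(self):
        self.t = 0
        self.cue = self.rng.randint(0, 1)
        # Observation is cue + 1, so that 0 can mean "no cue visible".
        return self.cue + 1

    def step(self, action):
        self.t += 1
        done = self.t >= self.horizon
        reward = 1.0 if (done and action == self.cue) else 0.0
        return 0, reward, done  # the cue is never shown again


def run_episode(env, policy):
    """Roll out one episode; `policy` sees only the remembered first observation."""
    memory = env.reset()
    total, done = 0.0, False
    while not done:
        _, reward, done = env.step(policy(memory))
        total += reward
    return total
```

A policy that carries the first observation across the whole episode scores 1.0, while one that ignores or inverts it cannot, which is why this kind of task stresses long-horizon memory.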

---
## 1️⃣ Environment Setup

In [None]:
#@title Check GPU Availability { display-mode: "form" }
import torch

if torch.cuda.is_available():
    gpu_name = torch.cuda.get_device_name(0)
    gpu_memory = torch.cuda.get_device_properties(0).total_memory / 1e9
    print(f"✅ GPU Available: {gpu_name}")
    print(f"   Memory: {gpu_memory:.1f} GB")
else:
    print("❌ No GPU detected!")
    print("   Go to Runtime → Change runtime type → Select T4 GPU")
    raise RuntimeError("GPU required for efficient training")

In [None]:
#@title Clone Repository { display-mode: "form" }
import os

REPO_URL = "https://github.com/dawsonblock/MMABA-PSEUDO.git"
REPO_NAME = "MMABA-PSEUDO"

if os.path.exists(REPO_NAME):
    print(f"📁 Repository already exists, pulling latest...")
    %cd {REPO_NAME}
    !git pull
else:
    print(f"📥 Cloning {REPO_URL}...")
    !git clone {REPO_URL}
    %cd {REPO_NAME}

print(f"\n✅ Working directory: {os.getcwd()}")

In [None]:
#@title Install Dependencies { display-mode: "form" }
print("📦 Installing dependencies...")
!pip install -q torch numpy wandb einops triton

print("\n📦 Installing Mamba SSM...")
%cd mamba-main
!pip install -q -e . 2>&1 | tail -5
%cd ..

# Verify installation
import torch
try:
    from mamba_ssm.modules.mamba2 import Mamba2
    print("\n✅ Mamba SSM installed successfully")
except ImportError as e:
    print(f"\n⚠️ Mamba import warning: {e}")
    print("   Falling back to compatibility mode...")

print(f"\n📊 PyTorch: {torch.__version__}")
print(f"   CUDA: {torch.version.cuda}")

---
## 2️⃣ Configuration

In [None]:
#@title Training Configuration { display-mode: "form" }

#@markdown ### Task Settings
TASK = "delayed_cue" #@param ["delayed_cue", "copy_memory", "assoc_recall", "tmaze"]
CONTROLLER = "mamba" #@param ["mamba", "gru"]
HORIZON = 200 #@param {type:"integer"}

#@markdown ### Training Settings
TOTAL_UPDATES = 2000 #@param {type:"integer"}
NUM_ENVS = 64 #@param {type:"integer"}
ROLLOUT_LENGTH = 256 #@param {type:"integer"}

#@markdown ### PPO Hyperparameters
LEARNING_RATE = 3e-4 #@param {type:"number"}
ENT_COEF = 0.05 #@param {type:"number"}
CLIP_COEF = 0.2 #@param {type:"number"}
GAMMA = 0.99 #@param {type:"number"}

#@markdown ### Model Architecture
HIDDEN_SIZE = 128 #@param {type:"integer"}
MEMORY_SLOTS = 16 #@param {type:"integer"}
MEMORY_DIM = 64 #@param {type:"integer"}

#@markdown ### Logging
LOG_INTERVAL = 50 #@param {type:"integer"}
USE_WANDB = False #@param {type:"boolean"}
WANDB_PROJECT = "neural-memory-suite" #@param {type:"string"}
RUN_NAME = "mamba_colab" #@param {type:"string"}

# Build command
CMD = f"""python3 neural_memory_long_ppo.py \\
    --task {TASK} \\
    --controller {CONTROLLER} \\
    --device cuda \\
    --horizon {HORIZON} \\
    --num-envs {NUM_ENVS} \\
    --rollout-length {ROLLOUT_LENGTH} \\
    --total-updates {TOTAL_UPDATES} \\
    --learning-rate {LEARNING_RATE} \\
    --ent-coef {ENT_COEF} \\
    --clip-coef {CLIP_COEF} \\
    --gamma {GAMMA} \\
    --hidden-size {HIDDEN_SIZE} \\
    --memory-slots {MEMORY_SLOTS} \\
    --memory-dim {MEMORY_DIM} \\
    --log-interval {LOG_INTERVAL}"""

if USE_WANDB:
    # Triple quotes are required here because the string spans several lines.
    CMD += f""" \\
    --track \\
    --wandb-project {WANDB_PROJECT} \\
    --run-name {RUN_NAME}"""

print("📋 Training Command:")
print(CMD)
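
Interpolating form values directly into a shell string breaks if a value such as `RUN_NAME` contains a space or a shell metacharacter. As a hedged alternative, the sketch below builds the same kind of command from an argument list and quotes it with the standard library's `shlex.join`. The helper name `build_command` is hypothetical; the option names are the ones already used in the cell above.

```python
import shlex


def build_command(script, options, flags=()):
    """Return a shell-safe command string for `python3 script --name value ...`.

    `options` maps option names (without leading dashes) to values;
    `flags` lists boolean switches that take no value.
    """
    args = ["python3", script]
    for name, value in options.items():
        args += [f"--{name}", str(value)]
    args += [f"--{flag}" for flag in flags]
    # shlex.join quotes each argument only where the shell needs it.
    return shlex.join(args)


example = build_command(
    "neural_memory_long_ppo.py",
    {"task": "delayed_cue", "horizon": 200, "run-name": "my run"},
    flags=["track"],
)
print(example)
```

The resulting string can be passed to `!{...}` in the same way as `CMD`; a run name containing a space stays a single argument.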

---
## 3️⃣ WandB Login (Optional)

In [None]:
#@title Login to Weights & Biases { display-mode: "form" }
#@markdown Run this cell if you enabled `USE_WANDB` above.

if USE_WANDB:
    import wandb
    wandb.login()
    print("✅ WandB logged in!")
else:
    print("ℹ️ WandB logging disabled. Set USE_WANDB=True to enable.")

---
## 4️⃣ Run Training

In [None]:
#@title Start Training { display-mode: "form" }
print(f"🚀 Starting {CONTROLLER.upper()} training on {TASK}...")
print(f"   Config: {NUM_ENVS} envs × {ROLLOUT_LENGTH} steps × {TOTAL_UPDATES} updates")
print(f"   Total samples: {NUM_ENVS * ROLLOUT_LENGTH * TOTAL_UPDATES:,}")
print("\n" + "="*60 + "\n")

!{CMD}

---
## 5️⃣ Benchmark Suite (All Tasks)

In [None]:
#@title Run Full Benchmark Suite { display-mode: "form" }
#@markdown This runs all 4 tasks sequentially (~4-6 hours total)

RUN_BENCHMARK = False #@param {type:"boolean"}

if RUN_BENCHMARK:
    print("🏃 Running full benchmark suite...")
    !python3 neural_memory_long_ppo.py \
        --benchmark-suite \
        --controller {CONTROLLER} \
        --device cuda \
        --num-envs 64 \
        --total-updates 2000 \
        --ent-coef 0.05
else:
    print("ℹ️ Benchmark suite disabled. Set RUN_BENCHMARK=True to run all tasks.")

---
## 6️⃣ Compare Controllers (Mamba vs GRU)

In [None]:
#@title Run Controller Comparison { display-mode: "form" }
#@markdown Train both Mamba and GRU on the same task for comparison

RUN_COMPARISON = False #@param {type:"boolean"}
COMPARISON_UPDATES = 500 #@param {type:"integer"}

if RUN_COMPARISON:
    print("🔬 Running controller comparison...")
    
    print("\n" + "="*40)
    print("Training MAMBA controller...")
    print("="*40)
    !python3 neural_memory_long_ppo.py \
        --task {TASK} \
        --controller mamba \
        --device cuda \
        --num-envs 64 \
        --total-updates {COMPARISON_UPDATES} \
        --ent-coef 0.05 \
        --log-interval 50
    
    print("\n" + "="*40)
    print("Training GRU controller...")
    print("="*40)
    !python3 neural_memory_long_ppo.py \
        --task {TASK} \
        --controller gru \
        --device cuda \
        --num-envs 64 \
        --total-updates {COMPARISON_UPDATES} \
        --ent-coef 0.05 \
        --log-interval 50
    
    print("\n✅ Comparison complete!")
else:
    print("ℹ️ Comparison disabled. Set RUN_COMPARISON=True to compare Mamba vs GRU.")

---
## 7️⃣ Quick Analysis

In [None]:
#@title Analyze Results { display-mode: "form" }
print("📊 Training completed!")
print("\nKey metrics to look for:")
print("  - Return: Should trend toward 1.0 (perfect score)")
print("  - GateMean: Should be ~0.01-0.05 (sparse memory usage)")
print("  - KL: Should stay below 0.01 (stable training)")
print("\nIf using WandB, view detailed charts at: https://wandb.ai")
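
The KL guideline above concerns how far each PPO update moves the policy away from the one that collected the rollout. This notebook does not show how the training script computes its `KL` metric, so the following is only a sketch of one estimator commonly used in PPO implementations: the mean of `(r - 1) - log r`, where `r` is the ratio of new to old action probabilities. The function name `approx_kl` is an assumption for illustration.

```python
import math


def approx_kl(old_logprobs, new_logprobs):
    """Estimate KL(old || new) from log-probabilities of the sampled actions.

    Uses the low-variance estimator mean((r - 1) - log r), where
    r = exp(new_logprob - old_logprob). Each term is non-negative.
    """
    total = 0.0
    for old_lp, new_lp in zip(old_logprobs, new_logprobs):
        log_ratio = new_lp - old_lp
        ratio = math.exp(log_ratio)
        total += (ratio - 1.0) - log_ratio
    return total / len(old_logprobs)


# Identical policies give zero divergence; a small shift gives a small value.
print(approx_kl([-1.0, -2.0], [-1.0, -2.0]))
print(approx_kl([-1.0, -2.0], [-1.1, -1.9]))
```

If the logged value regularly exceeds the 0.01 guideline, lowering `LEARNING_RATE` or `CLIP_COEF` is a common first adjustment.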