# 🧠 Nested Learning Quickstart

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/aryateja2106/nested-learning/blob/main/notebooks/quickstart.ipynb)

**What you'll learn:**
- How to create and run the HOPE model
- The basics of the forward and backward passes
- How to visualize the Continuum Memory System (CMS) update schedule

**Time**: ~2 minutes (CPU) | ~30 seconds (GPU)

> 💡 **Tip**: For full training, use `train_hope.py` from the README.

## Step 1: Install Dependencies

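Run this cell to install the required packages. Skip it if they are already installed. `gradio` is not imported in this notebook; it is for the Gradio demo listed at the end.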


In [None]:
!pip install -q torch numpy matplotlib gradio


## Step 2: Setup & Device Check

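Detect the available device and add the repository root to `sys.path` so the model can be imported in the next step. The path logic assumes the notebook is run from the repository root or from its `notebooks/` directory.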


In [None]:
import torch
import sys
from pathlib import Path

# Add the repo root to sys.path so that `src` is importable
cwd = Path().resolve()
repo_root = cwd.parent if cwd.name == "notebooks" else cwd
sys.path.insert(0, str(repo_root))

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"✅ Device: {device}")
if torch.cuda.is_available():
    print(f"   GPU: {torch.cuda.get_device_name(0)}")
else:
    print("   Using CPU (works fine for this demo)")
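
The path setup above covers two launch locations: the repository root and `notebooks/`. If the notebook is started from somewhere deeper in the tree, a more general approach is to walk up the parent directories until a marker directory is found. The following is a minimal sketch of that idea; `find_repo_root` and the `marker` parameter are hypothetical names introduced here, not part of the repository, and the choice of `src` as the marker is an assumption based on the `src.models.hope` import used in the next step.

```python
from pathlib import Path
from typing import Optional


def find_repo_root(start: Path, marker: str = "src") -> Optional[Path]:
    """Return the nearest directory at or above `start` that contains `marker`.

    Hypothetical helper for illustration; returns None if no ancestor matches.
    """
    start = start.resolve()
    # Check `start` itself first, then each parent up to the filesystem root.
    for candidate in (start, *start.parents):
        if (candidate / marker).is_dir():
            return candidate
    return None


if __name__ == "__main__":
    import sys

    root = find_repo_root(Path.cwd())
    if root is not None:
        sys.path.insert(0, str(root))
    print(f"Repo root: {root}")
```

Returning `None` rather than raising lets the caller decide whether a missing marker is an error or whether to fall back to the current directory.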


## Step 3: Hello World - Forward Pass

Create a tiny HOPE model and run a forward pass.


In [None]:
from src.models.hope import Hope, HopeConfig

torch.manual_seed(42)

# Tiny config - fits on CPU or small GPUs
config = HopeConfig(
    d_model=64,
    d_hidden=256,
    d_key=16,
    d_value=16,
    num_heads=4,
    num_layers=1,
    vocab_size=256,
    max_seq_len=128,
    cms_num_levels=2,
    cms_base_chunk_size=4,
)

model = Hope(config).to(device)
print(f"✅ Model created: {sum(p.numel() for p in model.parameters()):,} parameters")

# Simple forward pass on random token IDs and labels (batch of 2, sequence length 16)
input_ids = torch.randint(0, config.vocab_size, (2, 16), device=device)
labels = torch.randint(0, config.vocab_size, (2, 16), device=device)

with torch.no_grad():
    output = model(input_ids, labels=labels)
    print("✅ Forward pass successful!")
    print(f"   Loss: {output['loss'].item():.4f}")
    print(f"   Logits shape: {output['logits'].shape}")
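
A quick way to sanity-check the loss printed above is to compare it with the loss of a uniform guess. Assuming the model's `loss` is the mean token-level cross-entropy in nats (the usual PyTorch convention, though not confirmed here for `Hope`), a model that assigns equal probability to every token scores `ln(vocab_size)` regardless of the labels. The helper name `uniform_cross_entropy` is hypothetical and only used for this sketch.

```python
import math


def uniform_cross_entropy(vocab_size: int) -> float:
    """Cross-entropy (in nats) of a uniform prediction over `vocab_size` tokens."""
    # -log(1 / vocab_size) == log(vocab_size)
    return math.log(vocab_size)


baseline = uniform_cross_entropy(256)  # vocab_size from the tiny config above
print(f"Uniform-guess loss for vocab_size=256: {baseline:.4f}")  # prints 5.5452
```

With random labels and an untrained model, a loss in the neighborhood of this value is what one would typically expect; the exact number depends on the initialization.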


## Step 4: Mini Training Loop

Run a single training step to verify that the backward pass and the optimizer update work.


In [None]:
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

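model.train()
output = model(input_ids, labels=labels)  # forward pass
loss = output["loss"]
loss.backward()        # backward pass: compute gradients
optimizer.step()       # apply the parameter update
optimizer.zero_grad()  # clear gradients before any further step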

print("✅ Training step successful!")
print(f"   Loss: {loss.item():.4f}")
print("   ✓ Forward pass")
print("   ✓ Backward pass")
print("   ✓ Optimizer step")


## Step 5: Visualize CMS Update Schedule

The Continuum Memory System updates different levels at different frequencies. Let's visualize this!


In [None]:
import matplotlib.pyplot as plt
import numpy as np

# Get CMS update schedule
num_levels = config.cms_num_levels
base_chunk_size = config.cms_base_chunk_size
total_steps = 64

fig, ax = plt.subplots(figsize=(12, 4))
colors = plt.cm.viridis(np.linspace(0, 1, num_levels))

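for level in range(num_levels):
    # Illustrative schedule: each level's chunk size is double the previous one
    chunk_size = base_chunk_size * (2 ** level)
    update_steps = list(range(chunk_size, total_steps + 1, chunk_size))

    # Give each level its own horizontal band so coinciding updates stay visible
    y_lo, y_hi = level / num_levels, (level + 1) / num_levels
    for step in update_steps:
        ax.axvline(x=step, ymin=y_lo, ymax=y_hi, color=colors[level], alpha=0.7, linewidth=2)

    # Empty line plot that only contributes the legend entry
    ax.plot([], [], color=colors[level], linewidth=2,
            label=f'Level {level} (updates every {chunk_size} steps)')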

ax.set_xlim(0, total_steps)
ax.set_xlabel('Training Step', fontsize=12)
ax.set_ylabel('CMS Update Events', fontsize=12)
ax.set_title('Continuum Memory System: Multi-Scale Update Schedule', fontsize=14)
ax.legend(loc='upper right')
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

print("✅ CMS visualization complete!")
print("   High-frequency levels update often (fast adaptation)")
print("   Low-frequency levels update rarely (long-term memory)")
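
The schedule drawn by the plotting cell can also be queried directly. The sketch below mirrors that cell's rule, where level `k` updates every `base_chunk_size * 2**k` steps. This doubling rule is taken from the visualization code in this notebook; whether the `Hope` implementation uses exactly this schedule internally is an assumption. The function names `cms_chunk_size` and `levels_updating_at` are hypothetical.

```python
from typing import List


def cms_chunk_size(level: int, base_chunk_size: int) -> int:
    """Update interval for a level under the doubling rule used in the plot."""
    return base_chunk_size * (2 ** level)


def levels_updating_at(step: int, num_levels: int, base_chunk_size: int) -> List[int]:
    """Return the levels whose update falls on `step` (steps start at 1)."""
    if step <= 0:
        return []
    return [
        level
        for level in range(num_levels)
        if step % cms_chunk_size(level, base_chunk_size) == 0
    ]


# Same values as the tiny config: cms_num_levels=2, cms_base_chunk_size=4
for step in (4, 8, 12, 16):
    print(step, levels_updating_at(step, num_levels=2, base_chunk_size=4))
# 4 [0]
# 8 [0, 1]
# 12 [0]
# 16 [0, 1]
```

Level 0 fires on every multiple of 4, while level 1 fires only on multiples of 8, so every level 1 update coincides with a level 0 update.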


## 🎉 What's Next?

**You've successfully run the HOPE model!** Here's what to explore next:

### For Researchers & Developers

1. **Full Training**: Run `train_hope.py` with one of the configs (small, medium, or large)
   ```bash
   python train_hope.py --config small --steps 500
   ```

2. **Interactive Demo**: Launch the Gradio demo
   ```bash
   python demo/app.py
   ```

3. **Explore Components**: Check out `src/core/optimizers.py` and `src/core/memory.py`

### Learn More

- 📄 **Paper**: [Nested Learning PDF](https://abehrouz.github.io/files/NL.pdf)
- 📝 **Blog**: [Google Research Blog](https://research.google/blog/introducing-nested-learning-a-new-ml-paradigm-for-continual-learning/)
- 📚 **Algorithm Notes**: See `docs/ALGORITHMS.md`

---

**⭐ If this helped you, please star the repo!** It helps others discover this implementation.

[GitHub Repository](https://github.com/aryateja2106/nested-learning)
