# 02 — Memory Architecture (MAC)
This notebook introduces a small Neural Memory module and the Surprise-driven `memorize()` update, inspired by Titans + MIRAS. The memory produces a soft prompt vector conditioned on recent context and learns online via gradients at inference time.

# Theory: Memory as Context (MAC)
The memory emits a small vector, the soft prompt, that is prepended to (or concatenated with) the LLM input. The module learns to predict task-relevant features from recent hidden states. Surprise is the prediction error (MSE) between the memory's output and its target. High surprise produces a large gradient and therefore a stronger update; as surprise falls, the learning signal decays.
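
To make the soft-prompt idea concrete, here is a minimal sketch of prepending one memory vector to a sequence of token embeddings. The tensor sizes, the random stand-ins for the embeddings and the memory output, and the names `inputs_embeds` and `attention_mask` are assumptions for illustration; how the hybrid engine actually wires this in is not shown in this notebook.

```python
import torch

batch, seq_len, d_model = 2, 5, 16   # hypothetical sizes
n_prompt = 1                         # one soft-prompt vector per sequence

# Stand-in for the LLM's token embeddings: (batch, seq_len, d_model)
token_embeds = torch.randn(batch, seq_len, d_model)

# Stand-in for the memory output p: (batch, d_model)
p = torch.randn(batch, d_model)

# Reshape p to (batch, n_prompt, d_model) and prepend along the sequence axis
soft_prompt = p.unsqueeze(1)
inputs_embeds = torch.cat([soft_prompt, token_embeds], dim=1)

# If an attention mask is used, it must grow by the same number of positions
attention_mask = torch.ones(batch, seq_len, dtype=torch.long)
prompt_mask = torch.ones(batch, n_prompt, dtype=torch.long)
attention_mask = torch.cat([prompt_mask, attention_mask], dim=1)

print(inputs_embeds.shape)   # torch.Size([2, 6, 16])
print(attention_mask.shape)  # torch.Size([2, 6])
```

Note that this only works directly when the memory's output dimension equals the LLM's embedding dimension; otherwise a projection layer would be needed in between.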

Key pieces:
- Input: recent hidden/context vector x
- Output: soft prompt vector p
- Surprise: L(x, y) = ||f(x) − y||^2
- Online update: θ ← θ − η ∇θ L
- Recall: use f(x) as a soft prompt for the next LLM step
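
The surprise and update rules above can be sketched directly with autograd. This is a minimal illustration, assuming a single linear layer as a stand-in for f and a hypothetical learning rate `eta`; it applies the plain gradient step θ ← θ − η ∇θ L by hand, whereas the `NeuralMemory` class in this notebook delegates the step to AdamW and uses `nn.MSELoss`, which averages over all elements.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

f = nn.Linear(8, 4)            # stand-in for the memory network f
eta = 0.05                     # learning rate η (hypothetical value)

x = torch.randn(16, 8)         # recent context vectors
y = torch.randn(16, 4)         # targets the memory should predict

def surprise(f, x, y):
    # Squared error ||f(x) - y||^2 per example, averaged over the batch
    return ((f(x) - y) ** 2).sum(dim=-1).mean()

before = surprise(f, x, y)

# One online update: θ ← θ − η ∇θ L
grads = torch.autograd.grad(before, list(f.parameters()))
with torch.no_grad():
    for param, grad in zip(f.parameters(), grads):
        param -= eta * grad

after = surprise(f, x, y)
print(f"surprise before: {before.item():.4f}  after: {after.item():.4f}")
```

With a small enough step, the surprise on the same batch drops after the update, which is the behavior `memorize()` relies on.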

In [None]:
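from typing import Optional

import torch
import torch.nn as nn
import torch.nn.functional as F

class NeuralMemory(nn.Module):
    """Small MLP memory: maps a context vector x to a soft prompt vector p."""

    def __init__(self, input_dim: int, hidden_dim: int, output_dim: int, lr: float = 1e-3, device: Optional[str] = None):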
        super().__init__()
        self.input_dim = input_dim
        self.hidden_dim = hidden_dim
        self.output_dim = output_dim
        self.device = torch.device(device) if device else torch.device("cuda" if torch.cuda.is_available() else "cpu")
        self.net = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.GELU(),
            nn.Linear(hidden_dim, output_dim),
        )
        self.to(self.device)
        self.optim = torch.optim.AdamW(self.parameters(), lr=lr)
        self.loss_fn = nn.MSELoss()

    @torch.no_grad()
    def recall(self, x: torch.Tensor) -> torch.Tensor:
        """Return the soft prompt f(x); no gradients are tracked, so no detach is needed."""
        x = x.to(self.device)
        return self.net(x)

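    def memorize(self, x: torch.Tensor, y: torch.Tensor) -> float:
        """Take one online gradient step on (x, y) and return the surprise (MSE before the step)."""
        x = x.to(self.device)
        y = y.to(self.device)
        pred = self.net(x)
        loss = self.loss_fn(pred, y)  # surprise: mean squared prediction error
        self.optim.zero_grad()
        loss.backward()
        self.optim.step()
        return loss.item()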

# Pick the device used by the cells below
device = "cuda" if torch.cuda.is_available() else "cpu"
print("NeuralMemory device:", device)

In [None]:
# Synthetic test: learn a simple linear mapping online
import random
torch.manual_seed(42)
random.seed(42)

in_dim, hid_dim, out_dim = 128, 64, 128
mem = NeuralMemory(in_dim, hid_dim, out_dim, lr=1e-2, device=device)

# ground truth mapping (unknown to memory)
W_true = torch.randn(in_dim, out_dim, device=device) * 0.5

def sample_xy(batch=32):
    x = torch.randn(batch, in_dim, device=device)
    y = x @ W_true
    return x, y

steps = 200
log_every = 40
losses = []
for t in range(1, steps + 1):
    x, y = sample_xy(batch=64)
    loss = mem.memorize(x, y)
    losses.append(loss)
    if t % log_every == 0:
        print(f"step {t:3d}  loss {loss:.6f}")

# verify recall quality on fresh batch
x_test, y_test = sample_xy(batch=16)
p = mem.recall(x_test)
mse = F.mse_loss(p, y_test).item()
print("test mse:", round(mse, 6))

# Next steps
Integrate this memory into the hybrid engine (Notebook 3) to emit soft prompts from LLM hidden states and adapt online using the Surprise signal.
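
As a rough preview of that integration, the loop below alternates learning and recall on stand-in hidden states. Everything here is an assumption made for illustration, not the design of Notebook 3: the hidden states are random tensors, `pool` is a hypothetical mean-pooling helper, and the target for the surprise is taken to be the current pooled context predicted from the previous one.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
d_model = 32                                  # hypothetical hidden size

# Stand-in for the memory network (same MLP shape as in this notebook)
memory = nn.Sequential(nn.Linear(d_model, 64), nn.GELU(), nn.Linear(64, d_model))
optim = torch.optim.AdamW(memory.parameters(), lr=1e-2)

def pool(hidden):
    # Mean-pool (batch, seq, d_model) hidden states into one context vector per sequence
    return hidden.mean(dim=1)

surprises = []
prev_x = None
for step in range(5):
    hidden = torch.randn(4, 10, d_model)      # stand-in for LLM hidden states
    x = pool(hidden)

    # Learn: assumed target is the current context, predicted from the previous one
    if prev_x is not None:
        loss = F.mse_loss(memory(prev_x), x)  # surprise
        optim.zero_grad()
        loss.backward()
        optim.step()
        surprises.append(loss.item())

    # Recall: soft prompt for the next LLM step (no gradient needed)
    with torch.no_grad():
        soft_prompt = memory(x)               # (batch, d_model)
    prev_x = x

print(len(surprises), tuple(soft_prompt.shape))  # 4 (4, 32)
```

The first step has no previous context, so it only recalls; every later step records one surprise value before emitting the next soft prompt.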