# Hybrid Memory Demo — Notebook

This notebook walks through a tiny but complete hybrid-memory workflow, step by step, following the ideas in the [accompanying article](https://principia-agentica.io/blog/2025/09/19/memory-in-agents-episodic-vs-semantic-and-the-hybrid-that-works/).

What you'll see:
- Two memories:
  - Episodic: last turns and tool outputs (append-only log).
  - Semantic: small policy snippets indexed with a tiny offline encoder.
- A Hybrid Retriever that merges both, annotates provenance, and trims by a token budget.
- Two toy tools: `lookup_user` and `reset_password`.
- A minimal `Agent` that orchestrates: log → retrieve → (maybe) tool → retrieve → answer.
- Minimal traces written as JSONL and visualized at the end.

How to run:
- From the repo root: `just notebook` (opens Jupyter Lab in the examples folder).
- Then open this notebook and Run All.

Everything runs offline and the retrieval logic is deterministic, which makes the notebook well suited to a quick live walkthrough.


## 1) Setup: imports and paths
We add the project `src/` to `sys.path` so imports work smoothly when running from `examples/`.


In [1]:
from pathlib import Path
import sys, json, time

# Robustly locate project root when __file__ is undefined in notebooks
# Strategy: try this file's dir, else CWD, then walk up to find a folder containing 'src'
try:
    _base = Path(__file__).resolve().parent  # type: ignore[name-defined]
except NameError:
    _base = Path.cwd()

candidates = [_base, _base.parent, _base.parent.parent]
ROOT = next((p for p in candidates if (p / 'src').exists()), _base)
SRC = ROOT / 'src'
if str(SRC) not in sys.path:
    sys.path.append(str(SRC))

print('Project root:', ROOT)
print('Using src at:', SRC)


Project root: /Users/fabricio/Projects/fabricio/hybrid-memory-talk
Using src at: /Users/fabricio/Projects/fabricio/hybrid-memory-talk/src


## 2) Tiny offline encoder and policy seeding
We mirror the CLI demo by seeding a small semantic store from `examples/policies/`. The encoder is a tiny hash sketch — no external services.


In [2]:
import hashlib
from typing import Dict, List

from memory.semantic_store import SemanticStore

class TinyHashEncoder:
    def __init__(self, dim: int = 64):
        self.dim = dim
    def embed(self, text: str):
        d = self.dim
        vec = [0.0] * d
        for tok in (text or '').lower().split():
            h = int(hashlib.md5(tok.encode('utf-8')).hexdigest(), 16)
            for i in range(4):
                idx = (h >> (i * 8)) % d
                sign = 1.0 if ((h >> (i * 2)) & 1) else -1.0
                vec[idx] += sign
        return vec

POLICY_DIR = (ROOT / 'examples' / 'policies')

def read_policies() -> List[Dict]:
    items: List[Dict] = []
    for p in sorted(POLICY_DIR.glob('*.md')):
        text = p.read_text(encoding='utf-8').strip()
        lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
        if not lines:
            continue
        section = lines[0].lstrip('# ')
        body = ' '.join(lines[1:]) if len(lines) > 1 else section
        items.append({
            'id': p.stem,
            'text': body,
            'metadata': {'source': p.name, 'section': section, 'tags': ['policy'], 'pii': False},
        })
    return items

encoder = TinyHashEncoder()
sem = SemanticStore(encoder=encoder)
seeded = []
for item in read_policies():
    sem.upsert(item)
    seeded.append(item['id'])
seeded


['policy_cooldown',
 'policy_password',
 'policy_security',
 'policy_steps',
 'policy_verification']
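
To make the "tiny hash sketch" concrete, here is a minimal sketch of how such vectors could be compared with cosine similarity to rank snippets. It repeats the encoder from the cell above so it runs on its own. The `cosine` and `rank` helpers and the sample snippets are illustrative assumptions; the notebook does not show how `SemanticStore` actually scores items.

```python
import hashlib
import math
from typing import Dict, List, Tuple


class TinyHashEncoder:
    """Same hashing scheme as the cell above, repeated so this block is self-contained."""

    def __init__(self, dim: int = 64):
        self.dim = dim

    def embed(self, text: str) -> List[float]:
        vec = [0.0] * self.dim
        for tok in (text or "").lower().split():
            h = int(hashlib.md5(tok.encode("utf-8")).hexdigest(), 16)
            for i in range(4):
                idx = (h >> (i * 8)) % self.dim
                sign = 1.0 if ((h >> (i * 2)) & 1) else -1.0
                vec[idx] += sign
        return vec


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query: str, docs: Dict[str, str], encoder: TinyHashEncoder, k: int = 2) -> List[Tuple[str, float]]:
    """Score every document against the query and keep the k best."""
    q_vec = encoder.embed(query)
    scored = [(doc_id, cosine(q_vec, encoder.embed(text))) for doc_id, text in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical snippets, loosely modelled on the policy files
docs = {
    "policy_password": "Customers must verify their email before a password reset",
    "policy_cooldown": "Users must wait 10 minutes before requesting another reset",
    "policy_security": "Never share reset tokens in chat responses",
}
top = rank("password reset", docs, TinyHashEncoder())
for doc_id, score in top:
    print(f"{doc_id}: {score:.3f}")
```

Because tokens are hashed independently, only exact token overlap contributes signal; that is enough for a small, fixed policy set but would not capture synonyms.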

## 3) Build episodic store, retriever, agent, and tracer
We create the `EpisodicStore`, the `HybridRetriever` (which merges episodic and semantic memory), and the `Agent`.

The tracer writes minimal rows to a notebook-specific JSONL file so you can reload and visualize without clobbering other traces.


In [3]:
from memory.episodic_store import EpisodicStore
from memory.hybrid_retriever import HybridRetriever
from tracing.tracer import Tracer
from agent import Agent

OUT_DIR = ROOT / 'out'
OUT_DIR.mkdir(exist_ok=True)
TRACE_FILE = OUT_DIR / 'traces_nb.jsonl'

epi = EpisodicStore(max_len=50)
retriever = HybridRetriever(episodic=epi, semantic=sem)
tracer = Tracer(path=str(TRACE_FILE))
agent = Agent(retriever=retriever, task_id='support_demo_nb', session='sess_nb', tracer=tracer)

TRACE_FILE


PosixPath('/Users/fabricio/Projects/fabricio/hybrid-memory-talk/out/traces_nb.jsonl')
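
The merge step can be pictured with a small sketch. This is not the `HybridRetriever` from `src/`, whose internals are not shown here; it only illustrates the idea of tagging each item with its kind and trimming to a token budget. The whitespace token estimate, the "skip what does not fit" rule, the episodic-first ordering, and the sample items are all assumptions. The `kind`, `source`, and `text` keys match the ones this notebook reads when it lists the context later.

```python
from typing import Dict, List


def approx_tokens(text: str) -> int:
    """Crude token estimate: whitespace-separated words (an assumption)."""
    return len(text.split())


def merge_with_budget(episodic: List[Dict], semantic: List[Dict], budget: int) -> List[Dict]:
    """Tag items with their kind, then keep them in order while the budget allows."""
    tagged = [dict(item, kind="episodic") for item in episodic]
    tagged += [dict(item, kind="semantic") for item in semantic]

    kept: List[Dict] = []
    used = 0
    for item in tagged:
        cost = approx_tokens(item["text"])
        if used + cost > budget:
            continue  # does not fit; a shorter item later on may still fit
        kept.append(item)
        used += cost
    return kept


# Illustrative items; the episodic source string is made up for this example
episodic = [
    {"source": "episodic@t1#user_turn", "text": "please reset my password"},
]
semantic = [
    {"source": "policy_password.md#Password reset",
     "text": "Customers must verify their email before we can send a password reset link."},
    {"source": "policy_cooldown.md#Cooldown period",
     "text": "After a successful password reset, users must wait 10 minutes before requesting another."},
]

ctx = merge_with_budget(episodic, semantic, budget=20)
for item in ctx:
    print(f"- [{item['kind']}] {item['source']} :: {item['text']}")
```

With a budget of 20, the user turn and the first policy fit, and the second policy is dropped. Keeping the `source` on every item is what lets the final answer cite where each line came from.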

## 4) First user turn — triggers tools
We ask for a password reset and include a concrete email address. The agent should:
1. Log the user turn (episodic).
2. Retrieve hybrid context.
3. Plan tool calls (`lookup_user` and, if the user is verified, `reset_password`).
4. Log tool call and result.
5. Retrieve again and answer with a short, cited response.
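
The five steps above can be sketched as a small function. This is an illustration only: the `Agent` class in `src/` is not reproduced here, and the stub tools, the `call_tool` helper, the email regex, and the plain list used as a log are hypothetical stand-ins. Only the tool names and the turn types come from this notebook.

```python
import re
from typing import Callable, Dict, List

# Hypothetical stand-ins for the demo's tools; only the names come from the notebook
VERIFIED_EMAILS = {"ana@example.com"}


def lookup_user(email: str) -> Dict:
    return {"email": email, "verified": email in VERIFIED_EMAILS}


def reset_password(email: str) -> Dict:
    return {"ok": True, "email": email}


def call_tool(name: str, fn: Callable[[str], Dict], email: str, log: List[Dict]) -> Dict:
    """Step 4: log the tool call, run it, then log the result."""
    log.append({"type": "tool_call", "text": f"{name}({email!r})"})
    result = fn(email)
    log.append({"type": "tool_result", "text": f"{name} -> {result}"})
    return result


# Requires at least one character after each dot, so a sentence-ending "." is not captured
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def answer(query: str, log: List[Dict]) -> str:
    log.append({"type": "user_turn", "text": query})  # 1. log the user turn
    context = list(log)                               # 2. retrieve (stand-in: the whole log)
    match = EMAIL_RE.search(query)                    # 3. plan: act only if an email is given
    if match:
        email = match.group(0)
        user = call_tool("lookup_user", lookup_user, email, log)
        if user["verified"]:
            call_tool("reset_password", reset_password, email, log)
    context = list(log)                               # 5. retrieve again after tool activity
    reply = f"Answered using {len(context)} context items."
    log.append({"type": "assistant_turn", "text": reply})
    return reply


log: List[Dict] = []
print(answer("My email is ana@example.com. Can you reset it?", log))
for entry in log:
    print(entry["type"])
```

Retrieving a second time after the tools run is what allows the final answer to reflect the tool results rather than only the original question.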


In [4]:
q1 = 'Hi, I forgot my password. My email is ana@example.com. Can you reset it?'
a1 = agent.answer(q1)
print(a1)


User asked: Hi, I forgot my password. My email is ana@example.com. Can you reset it?
Key recent events:
- reset_password({'email': 'ana@example.com'})
- reset_password -> {'ok': True, 'token': 'reset_36324b2c36', 'reason': None}
Relevant policy:
- After a successful password reset, users must wait 10 minutes before requesting another. (source: policy_cooldown.md#Cooldown period)
- Customers must verify their email before we can send a password reset link. (source: policy_password.md#Password reset)

Response:
It looks like you want to reset your password. If your email is verified, you'll receive a reset link.

Internal checklist:
- Confirm email on file
- Follow policy step 2 if unverified


### What context did we retrieve?
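Below we list the merged context items with their kind and provenance (source). Each entry is cut to 100 characters, so multi-line entries appear clipped. The captured output was produced after the follow-up turn in section 5 had already run, which is why it also shows that turn's episodic entries.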


In [11]:
ctx_after = retriever.retrieve(q1)
for it in ctx_after:
    kind = it.get('kind')
    src = it.get('source', it.get('metadata', {}).get('source'))
    text = it.get('text','')
    print(f'- [{kind}] {src} :: {text[:100]}')


- [episodic] episodic@2025-10-02T19:20:32.528138Z#tool_result :: reset_password -> {'ok': True, 'token': 'reset_36324b2c36', 'reason': None}
- [episodic] episodic@2025-10-02T19:20:32.528533Z#assistant_turn :: User asked: Hi, I forgot my password. My email is ana@example.com. Can you reset it?
Key recent even
- [episodic] episodic@2025-10-02T19:24:03.274029Z#user_turn :: Thanks! What are the steps involved?
- [episodic] episodic@2025-10-02T19:24:03.275956Z#assistant_turn :: User asked: Thanks! What are the steps involved?
Key recent events:
- User asked: Hi, I forgot my pa
- [semantic] policy_cooldown.md#Cooldown period :: After a successful password reset, users must wait 10 minutes before requesting another.
- [semantic] policy_password.md#Password reset :: Customers must verify their email before we can send a password reset link.
- [semantic] policy_security.md#Security :: Never share reset tokens in chat responses. Tokens are secret and should only be sent by email.


## 5) Second user turn — follow up
Next, we ask for the steps. This turn likely won't trigger tools, but it should pull in the relevant policy snippets.


In [12]:
q2 = 'Thanks! What are the steps involved?'
a2 = agent.answer(q2)
print(a2)

ctx2 = retriever.retrieve(q2)
for it in ctx2:
    kind = it.get('kind')
    src = it.get('source', it.get('metadata', {}).get('source'))
    text = it.get('text','')
    print(f'- [{kind}] {src} :: {text[:100]}')


User asked: Thanks! What are the steps involved?
Key recent events:
- User asked: Thanks! What are the steps involved?
Key recent events:
- User asked: Hi, I forgot my password. My email is ana@example.com. Can you reset it?
Key recent events:
- reset_password({'email': 'ana@example.com'})
- reset_password -> {'ok': True, 'token': 'reset_36324b2c36', 'reason': None}
Relevant policy:
- After a successful password reset, users must wait 10 minutes before requesting another. (source: policy_cooldown.md#Cooldown period)
- Customers must verify their email before we can send a password reset link. (source: policy_password.md#Password reset)

Response:
It looks like you want to reset your password. If your email is verified, you'll receive a reset link.

Internal checklist:
- Confirm email on file
- Follow policy step 2 if unverified
- Thanks! What are the steps involved?
Relevant policy:
- Checklist: confirm identity, verify email, send reset link, and log the action per policy. (source: po

## 6) Minimal traces — visualize spans
The tracer writes compact JSONL rows: `{ts, span, input_len, ctx_len, retrieved_ids, output_len, latency_ms}`.
We load them and show a small summary.
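
Before loading, it may help to see how a row in this shape could be written. The sketch below is an assumption about the general approach (one JSON object per line, appended to a file), not the `Tracer` class from `src/`; `write_span` and the temporary file name are hypothetical.

```python
import json
import tempfile
import time
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict, List


def write_span(path: Path, span: str, input_len: int, ctx_len: int,
               retrieved_ids: List[str], output_len: int, latency_ms: float) -> Dict:
    """Append one trace row as a single JSON line and return it."""
    row = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "span": span,
        "input_len": input_len,
        "ctx_len": ctx_len,
        "retrieved_ids": retrieved_ids,
        "output_len": output_len,
        "latency_ms": latency_ms,
    }
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(row) + "\n")
    return row


with tempfile.TemporaryDirectory() as tmp:
    trace_path = Path(tmp) / "traces_example.jsonl"

    # Time a stand-in "retrieve" step and record it
    start = time.perf_counter()
    ctx_ids = ["policy_password", "policy_cooldown"]
    elapsed_ms = (time.perf_counter() - start) * 1000.0

    write_span(trace_path, span="retrieve", input_len=21, ctx_len=len(ctx_ids),
               retrieved_ids=ctx_ids, output_len=0, latency_ms=elapsed_ms)

    # Read the file back: one JSON object per non-empty line
    loaded = [json.loads(line) for line in trace_path.read_text(encoding="utf-8").splitlines() if line]
    print(loaded[0]["span"], loaded[0]["ctx_len"])
```

Appending one self-contained JSON object per line keeps the file readable even if a run is interrupted, since every complete line can still be parsed on its own.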


In [13]:
def read_jsonl(path: Path):
    rows = []
    if not path.exists():
        return rows
    for line in path.read_text(encoding='utf-8').splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            rows.append(json.loads(line))
        except json.JSONDecodeError:
            # Skip malformed lines (for example, a partially written final row)
            continue
    return rows

rows = read_jsonl(TRACE_FILE)

# Table-like view with dynamic column widths for neat alignment
last_rows = rows[-10:]

# Pre-format data strings
def _lat_str(v):
    try:
        return f"{float(v):.3f}"
    except Exception:
        return str(v)

data = []
for i, r in enumerate(last_rows):
    data.append({
        "idx": f"{i:02d}",
        "ts": r.get("ts", ""),
        "span": str(r.get("span", "")),
        "ctx": str(r.get("ctx_len", 0)),
        "out": str(r.get("output_len", 0)),
        "latency_ms": _lat_str(r.get("latency_ms", 0)),
        "ids": ",".join(str(x) for x in r.get("retrieved_ids", []))[:40],
    })

cols = ["idx", "ts", "span", "ctx", "out", "latency_ms", "ids"]
# Compute widths as max(header, data)
width = {c: len(c) for c in cols}
for row in data:
    for c in cols:
        width[c] = max(width[c], len(row[c]))

# Print header
header = " | ".join(f"{c:<{width[c]}}" for c in cols)
print(header)
print("-" * len(header))

# Print rows with alignment (left for text, right for numeric)
for row in data:
    line = " | ".join([
        f"{row['idx']:>{width['idx']}}",
        f"{row['ts']:<{width['ts']}}",
        f"{row['span']:<{width['span']}}",
        f"{row['ctx']:>{width['ctx']}}",
        f"{row['out']:>{width['out']}}",
        f"{row['latency_ms']:>{width['latency_ms']}}",
        f"{row['ids']:<{width['ids']}}",
    ])
    print(line)


idx | ts                      | span     | ctx | out  | latency_ms | ids                                     
-------------------------------------------------------------------------------------------------------------
 00 | 2025-10-02T16:00:09.850 | qa       |   7 | 1288 |      0.003 | episodic@2025-10-02T18:59:29.012559Z#too
 01 | 2025-10-02T16:20:32.527 | retrieve |   4 |    0 |      1.661 | episodic@2025-10-02T19:20:32.525858Z#use
 02 | 2025-10-02T16:20:32.528 | retrieve |   6 |    0 |      0.229 | episodic@2025-10-02T19:20:32.525858Z#use
 03 | 2025-10-02T16:20:32.528 | qa       |   6 |  699 |      0.002 | episodic@2025-10-02T19:20:32.525858Z#use
 04 | 2025-10-02T16:24:03.275 | retrieve |   7 |    0 |      1.039 | episodic@2025-10-02T19:20:32.528062Z#too
 05 | 2025-10-02T16:24:03.275 | retrieve |   7 |    0 |      0.195 | episodic@2025-10-02T19:20:32.528062Z#too
 06 | 2025-10-02T16:24:03.275 | qa       |   7 | 1288 |      0.002 | episodic@2025-10-02T19:20:32.528062Z#too
 07 | 2025

### Optional: quick latency bar
A tiny visualization using only the standard library (text-based).


In [14]:
def text_bar(val, max_val, width=40):
    n = 0 if max_val <= 0 else int((val / max_val) * width)
    return '█' * n

latencies = [r.get('latency_ms', 0) for r in rows]
if latencies:
    mx = max(latencies)
    print('Latency bars (ms):')
    for i, v in enumerate(latencies[-10:]):
        print(f"{i:02d} {v:6.1f} ms | {text_bar(v, mx)}")
else:
    print('No latencies recorded yet.')


Latency bars (ms):
00    0.0 ms | 
01    1.7 ms | ███████████████████████████
02    0.2 ms | ███
03    0.0 ms | 
04    1.0 ms | █████████████████
05    0.2 ms | ███
06    0.0 ms | 
07    0.2 ms | ███
08    0.2 ms | ██
09    0.0 ms | 


## Wrap-up
You just saw the full flow: episodic logging, semantic policy retrieval, tool calls, hybrid merge with provenance, and minimal traces.

Things to try:
- Edit the policies in `examples/policies/` and re-run the seeding cell.
- Change retrieval defaults via environment variables (see `src/config.py`).
- Ask different questions (with/without email) to see planning change.


### P95 retrieval latency (20 runs)
We can also measure retrieval latency directly by running the retriever multiple times and computing a 95th percentile.


In [15]:
from utils import p95_latency_ms
q_bench = 'password reset steps'
p95_ms = p95_latency_ms(retriever, q_bench, runs=20)
print(f"P95 retrieval latency for '{q_bench}': {p95_ms:.1f} ms")
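# Tiny bar (reuses text_bar). Scaled against itself, so it is always full width;
# pass a shared maximum instead if you want to compare several measurements.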
print('P95 bar:', text_bar(p95_ms, p95_ms))


P95 retrieval latency for 'password reset steps': 0.1 ms
P95 bar: ████████████████████████████████████████
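
The source of `utils.p95_latency_ms` is not shown in this notebook, so the following is only a sketch of one way such a helper could work: time each call with `time.perf_counter` and take the nearest-rank 95th percentile. The names `percentile_nearest_rank` and `measure_p95_ms` are hypothetical, and nearest-rank is just one of several common percentile definitions.

```python
import math
import time
from typing import Callable, List


def percentile_nearest_rank(samples: List[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def measure_p95_ms(fn: Callable[[], object], runs: int = 20) -> float:
    """Call fn `runs` times and return the 95th percentile latency in milliseconds."""
    timings: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000.0)
    return percentile_nearest_rank(timings, 95)


# Stand-in workload; in the notebook this would be a call such as
# `lambda: retriever.retrieve(q_bench)`
p95 = measure_p95_ms(lambda: sorted(range(1000)), runs=20)
print(f"P95: {p95:.3f} ms")
```

With 20 runs, the nearest-rank P95 is the 19th smallest timing, that is, the second slowest run, so a single outlier does not dominate the figure.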
