Commit de57958

docs(memory): architecture doc for SessionRetriever
1 parent f1bd977 commit de57958

1 file changed

Lines changed: 57 additions & 0 deletions

@@ -0,0 +1,57 @@
1+
# Session-Level Hierarchical Retriever
2+
3+
## What this is
4+
5+
`SessionRetriever` implements two-stage hierarchical retrieval at session granularity. It pairs with `SessionSummarizer` (which generates per-session summaries at ingest time) and `SessionSummaryStore` (which indexes those summaries in a dedicated vector collection). At retrieval time, it selects top-K sessions by summary similarity, then takes top-M chunks per selected session from the standard `MemoryStore`.
6+
7+
The effect is a coverage mechanism: outside the fallback paths, retrieved chunks are drawn from up to K distinct sessions, with at most M chunks per session. Single-stage retrieval tends to cluster on the most relevant session and miss evidence spread across sessions. This retriever enforces session diversity structurally rather than through re-ranking heuristics.
8+
9+
## Mental model
10+
11+
Like `HydeRetriever` (hypothesis-driven) and `ProspectiveMemoryManager` (time- or event-triggered), `SessionRetriever` is a query-time retrieval strategy under `memory/retrieval/`. All three are opt-in; callers wire them up when their use case benefits.
12+
13+
## Two-stage flow
14+
15+
1. **Stage 1**: `summaryStore.querySessions(query, topK=K)` selects the top-K sessions by cosine similarity between the query embedding and the indexed session summaries.
16+
2. **Stage 2**: a single `memoryStore.query(query, topK=K*M*3)` over-fetches candidates so post-filtering has enough per-session representatives.
17+
3. **Post-filter**: keep only traces whose `bench-session:<id>` tag (configurable via `sessionTagPrefix`) matches a Stage-1 session.
18+
4. **Group by session** and take the top-M chunks per session. The Stage-2 results are already sorted by cognitive score, so no re-sort is needed within a session.
19+
5. **Optional rerank** over the merged pool via an injected `RerankerService` (0.7 cognitive + 0.3 neural blend, matching `CognitiveMemoryManager.retrieve`).
20+
6. **Truncate** to `recallTopK` (default 10).
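
The post-filter, grouping, and truncation steps (3, 4, and 6) can be sketched as below. This is a minimal illustration, not the code in `SessionRetriever.ts`: the `Trace` and `SelectOptions` shapes and the `selectPerSession` name are hypothetical, and sorting the merged pool by score before truncating is an assumption the steps above do not spell out.

```typescript
// Hypothetical shapes for illustration only; the real types live in the
// modules listed under "Related modules".
interface Trace {
  id: string;
  tags: string[];
  score: number; // cognitive score; input is assumed sorted descending
}

interface SelectOptions {
  sessionTagPrefix: string; // e.g. "bench-session:"
  topM: number; // chunks kept per session
  recallTopK: number; // final truncation
}

// Steps 3, 4 and 6: keep traces tagged with a Stage-1 session, cap each
// session at topM chunks, then truncate the merged pool to recallTopK.
function selectPerSession(
  candidates: Trace[],
  stage1SessionIds: string[],
  opts: SelectOptions,
): Trace[] {
  const allowed = new Set(stage1SessionIds);
  const perSession = new Map<string, Trace[]>();

  for (const trace of candidates) {
    const tag = trace.tags.find((t) => t.startsWith(opts.sessionTagPrefix));
    if (tag === undefined) continue; // no session tag at all
    const sessionId = tag.slice(opts.sessionTagPrefix.length);
    if (!allowed.has(sessionId)) continue; // not a Stage-1 session

    const bucket = perSession.get(sessionId) ?? [];
    if (bucket.length < opts.topM) {
      // Candidates arrive sorted, so the first topM per session are the best.
      bucket.push(trace);
      perSession.set(sessionId, bucket);
    }
  }

  const merged: Trace[] = [];
  perSession.forEach((bucket) => merged.push(...bucket));
  // Assumption: order the merged pool by score so truncation does not drop
  // whole sessions just because they were grouped last.
  merged.sort((a, b) => b.score - a.score);
  return merged.slice(0, opts.recallTopK);
}
```

Capping each bucket while scanning, rather than grouping everything and slicing afterwards, keeps the pass linear in the size of the over-fetched pool.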
21+
22+
## Fallbacks
23+
24+
- **Stage 1 empty**: no sessions indexed for the scope. Fall through to plain `memoryStore.query` and return its top-`recallTopK`. Diagnostics tag: `escalations: ['session-retriever:stage1-empty']`.
25+
- **Stage 2 post-filter empty**: Stage-2 pool had no chunks tagged for Stage-1 sessions. Return raw Stage-2 top-`recallTopK` without session filtering. Diagnostics tag: `escalations: ['session-retriever:stage2-empty']`.
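
The two fallbacks amount to a small decision over what Stage 1 and the post-filter produced. The sketch below assumes a hypothetical `Retrieved` result shape and a hypothetical `resolveWithFallbacks` helper; only the `escalations` strings are taken from the text above. For brevity it treats the plain `memoryStore.query` results and the raw Stage-2 pool as the same unfiltered list.

```typescript
// Hypothetical result shape; only the escalation strings come from the doc.
interface Retrieved<T> {
  traces: T[];
  escalations: string[];
}

function resolveWithFallbacks<T>(
  stage1SessionIds: string[], // sessions selected by Stage 1
  unfiltered: T[], // memoryStore.query results, sorted by score
  sessionFiltered: T[], // the same results after session filtering
  recallTopK: number,
): Retrieved<T> {
  if (stage1SessionIds.length === 0) {
    // No sessions indexed for this scope: plain retrieval.
    return {
      traces: unfiltered.slice(0, recallTopK),
      escalations: ["session-retriever:stage1-empty"],
    };
  }
  if (sessionFiltered.length === 0) {
    // Sessions were selected, but none of their chunks were in the pool.
    return {
      traces: unfiltered.slice(0, recallTopK),
      escalations: ["session-retriever:stage2-empty"],
    };
  }
  return { traces: sessionFiltered.slice(0, recallTopK), escalations: [] };
}
```

Callers can inspect `escalations` to tell a genuine session-scoped result from a degraded one, for example when a deployment has not yet run `SessionSummarizer` at ingest time.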
26+
27+
## When to use
28+
29+
- Long-term conversational memory where answers span multiple sessions (LongMemEval multi-session, LoCoMo multi-hop).
30+
- Deployments where per-session topical coherence is high and session boundaries are semantically meaningful.
31+
- Configurations with an LLM budget for ingest-time summary generation (`SessionSummarizer` call per unique session).
32+
33+
## When NOT to use
34+
35+
- Single-session question answering where `CognitiveMemoryManager.retrieve` already surfaces the right chunks.
36+
- Deployments without ingest-time summarization (no `SessionSummarizer`). With no summaries indexed, `SessionRetriever` would fall through to plain retrieval on every call.
37+
- Very short sessions (< 5 turns) where the summary and chunks are essentially the same content.
38+
39+
## References
40+
41+
- **xMemory** ([arxiv 2602.02007v3](https://arxiv.org/abs/2602.02007v3), 2026) — four-level hierarchy (raw → episode → semantic → theme) with two-stage retrieval. Ablation on LoCoMo shows hierarchy alone beats Naive RAG BLEU 27.9→31.8, F1 36.4→40.8 before any retrieval optimization.
42+
- **TACITREE** ([EMNLP 2025](https://aclanthology.org/2025.emnlp-main.580.pdf), UCSD) — hierarchical tree for multi-session personalized conversation. Level-based retrieval progressively refines from abstract summaries to detail.
43+
- **Anthropic contextual retrieval** (Sep 2024) — the pattern `SessionSummarizer` implements at session granularity. `SessionRetriever` is the retrieval-time counterpart.
44+
45+
## Performance characteristics
46+
47+
- **Stage 1 cost**: one embedding (reusable via a shared `CachedEmbedder`) plus one vector search per query. Bounded by `topK=K`.
48+
- **Stage 2 cost**: one `MemoryStore.query` per query with `topK = K × M × 3` (3 is the over-fetch multiplier). At the defaults this is 45 candidates.
49+
- **Optional rerank cost**: one Cohere `rerank-v3.5` call over the merged K×M pool (~15 documents at defaults). Approximately $0.0001 per query.
50+
- **Fallback cost**: Stage-1 empty → plain `MemoryStore.query` (no extra cost). Stage-2 empty → raw Stage-2 pool (no extra cost).
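
The sizes quoted above follow directly from K and M. The helper below is a hypothetical illustration of that arithmetic. The text gives only the products (15 and 45), so `K = 5, M = 3` in the usage line is an assumed pair consistent with them, not a confirmed default.

```typescript
const OVERFETCH_MULTIPLIER = 3; // the "× 3" from the Stage 2 cost above

// Hypothetical helper: derives the per-query sizes from K and M.
function retrievalBudget(k: number, m: number) {
  return {
    stage1TopK: k, // sessions requested from the summary store
    stage2TopK: k * m * OVERFETCH_MULTIPLIER, // candidates over-fetched
    rerankPoolMax: k * m, // upper bound on documents sent to the reranker
  };
}

// Assumed values: K = 5, M = 3 (consistent with the 15 and 45 quoted above).
const budget = retrievalBudget(5, 3);
// budget.stage2TopK === 45, budget.rerankPoolMax === 15
```

The rerank pool can be smaller than K × M when some sessions contribute fewer than M chunks after filtering, which is why it is expressed as an upper bound.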
51+
52+
## Related modules
53+
54+
- [`src/memory/retrieval/session/SessionSummaryStore.ts`](../../src/memory/retrieval/session/SessionSummaryStore.ts)
55+
- [`src/memory/retrieval/session/SessionRetriever.ts`](../../src/memory/retrieval/session/SessionRetriever.ts)
56+
- [`src/memory/ingest/SessionSummarizer.ts`](../../src/memory/ingest/SessionSummarizer.ts) — summary generation (companion)
57+
- [`src/memory/retrieval/store/MemoryStore.ts`](../../src/memory/retrieval/store/MemoryStore.ts) — underlying trace store used by Stage 2
