By Atum
A comprehensive guide to understanding how state-of-the-art (SOTA) agent memory systems work — what techniques they use, how they differ, and which situations each one fits best.
Who this book is for:
- AI engineers building agents that need to remember across sessions
- Product builders choosing a memory layer for their AI application
- Researchers studying the evolving landscape of agent memory architectures
- Curious minds who want to understand what makes an AI agent "remember"
What's inside:
- Foundations — What agent memory is, why it matters, and the core taxonomy
- Techniques — The engineering building blocks: retrieval, graphs, compression, reflection, and more
- Provider Index — Overview and quick navigation
- Mem0 — Two-phase extract/update pipeline + graph memory
- OpenViking — Filesystem paradigm + tiered context loading
- Hindsight — 4-network structured memory (retain/recall/reflect)
- ByteRover — Hierarchical Context Tree + 5-tier retrieval
- Zep / Graphiti — Temporal knowledge graph with bi-temporal edges
- Supermemory — All-in-one memory + RAG + connectors platform
- Honcho — Dialectic reasoning + deep user identity modeling
- Letta (MemGPT) — Memory-as-OS with self-editing agents
- Cognee — Knowledge graph + vector hybrid with ECL pipeline
- RetainDB — 7 memory types + delta compression (managed SaaS)
- Nuggets — Holographic Reduced Representations (zero deps)
- Claude Code — Leaked source reveals 200-line index, Sonnet side-calls, KAIROS daemon
- The Consumer AI Memory Race — How OpenAI, Anthropic, and Google approach memory
- Benchmarks & Evaluation — LongMemEval, LoCoMo leaderboards and analysis
- Decision Guide — Choosing the right memory system for your use case
- The Future — Open challenges and where the field is heading
How the major systems compare at a glance:

| System | Architecture | Open Source | Best For | LongMemEval | LoCoMo |
|---|---|---|---|---|---|
| Mem0 | Extract → Update pipeline + vector/graph | Yes (Apache 2.0) | Production chat agents | — | 66.9% |
| OpenViking | Filesystem paradigm + tiered context | Yes (Apache 2.0) | Unified context management | — | — |
| Hindsight | 4-network structured memory bank | Yes (MIT) | Long-horizon reasoning agents | 91.4% | 89.6% |
| ByteRover | Hierarchical Context Tree + file-based | Partial (CLI) | Coding agents | — | 92.2% |
| Zep/Graphiti | Temporal knowledge graph | Graphiti: Yes | Enterprise temporal reasoning | — | 75.1% |
| Supermemory | Memory graph + RAG + connectors | Core: Yes | All-in-one platform | 85.2% | #1 |
| Honcho | Dialectic reasoning + user modeling | Yes | Deep personalization | — | — |
| Letta (MemGPT) | Memory-as-OS (RAM/disk tiers) | Yes | Stateful autonomous agents | — | — |
| Cognee | Knowledge graph + vector hybrid | Yes (Apache 2.0) | Institutional knowledge | — | — |
| RetainDB | 7 memory types + delta compression | No (SaaS) | Quick integration, production | SOTA | — |
| Nuggets (Holographic) | HRR superposed vectors | Yes | Lightweight local memory | — | — |
| Claude Code | Markdown files + Sonnet side-call | Source leaked | Individual/team dev with Claude | — | — |
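Several systems in the table (Mem0 most explicitly) follow a two-phase extract/update pattern: first extract candidate facts from a conversation turn, then reconcile each candidate against the existing store. The sketch below is illustrative only, not any provider's real API; the `MemoryStore` class, its line-parsing "extractor" (a stand-in for an LLM call), and the operation labels are all assumptions made for the example.

```python
# Hedged sketch of the two-phase extract -> update memory pattern.
# Phase 1 (extract): pull candidate facts out of raw conversation text.
# Phase 2 (update): reconcile candidates with stored facts -- add new
# ones, update changed ones, and skip exact duplicates.
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    facts: dict = field(default_factory=dict)  # key -> stored value

    def extract(self, turn: str) -> dict:
        # Stand-in for an LLM extraction call: parse "key: value" lines.
        candidates = {}
        for line in turn.splitlines():
            if ":" in line:
                key, value = line.split(":", 1)
                candidates[key.strip()] = value.strip()
        return candidates

    def update(self, candidates: dict) -> list:
        # Decide an operation per candidate, then write it to the store.
        ops = []
        for key, value in candidates.items():
            if key not in self.facts:
                ops.append(("ADD", key))
            elif self.facts[key] != value:
                ops.append(("UPDATE", key))
            else:
                ops.append(("NOOP", key))
            self.facts[key] = value
        return ops


store = MemoryStore()
store.update(store.extract("name: Ada\ncity: London"))
ops = store.update(store.extract("city: Paris\nname: Ada"))
print(ops)  # [('UPDATE', 'city'), ('NOOP', 'name')]
```

Real systems replace the line parser with an LLM extraction prompt and the dict with a vector or graph store, but the add/update/noop reconciliation step is the core of the pattern.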
This book is designed to be kept up-to-date by AI agents. See the Update Guide for instructions on how to:
- Discover which agent platforms are currently top-tier (don't assume — research it)
- Scan each platform for memory provider integrations
- Check what developers are actively discussing
- Decide what qualifies for inclusion
- Update the book without falling into inertia or coverage-gap traps
Found an error or want to add a provider? Open an issue or submit a pull request.
Key academic surveys referenced throughout this book:
- Memory for Autonomous LLM Agents (Du et al., Mar 2026)
- Memory in the Age of AI Agents (Zhang et al., Dec 2025)
- Graph-based Agent Memory (Feb 2026)
- Memory Operations Survey (Du et al., May 2025)
- LongMemEval Benchmark (Wu et al., ICLR 2025)
This book is released under CC BY-SA 4.0. You are free to share and adapt, with attribution.