v0.1.0 - Initial public release
PMB · Personal Memory Brain - local-first persistent memory for AI agents (Claude Code, Cursor, Codex).
Headline numbers
- 94.5% LoCoMo recall@10 (reproducible, full 10-conversation run)
- 70ms p50 warm recall (~3.7s MCP cold start)
- 99.2% top-10 on 900-query multi-language stress test
- 50+ languages via multilingual embedder, with explicit pattern coverage
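For readers who want to sanity-check a recall@10 figure on their own data, here is a minimal sketch of the metric. It assumes recall@k means the fraction of queries with at least one gold memory in the top k results; the benchmark harness may define it differently. The names `recall_at_k`, `results`, and `gold` are hypothetical and not part of PMB.

```python
def recall_at_k(results, gold, k=10):
    """Fraction of queries with at least one relevant id in the top k.

    results: dict mapping query id -> ranked list of retrieved ids (hypothetical shape)
    gold:    dict mapping query id -> set of relevant ids (hypothetical shape)
    """
    if not gold:
        return 0.0
    hits = 0
    for query_id, relevant in gold.items():
        top_k = results.get(query_id, [])[:k]
        # Count the query as a hit if any retrieved id is relevant.
        if any(doc_id in relevant for doc_id in top_k):
            hits += 1
    return hits / len(gold)


# Toy data: q1 retrieves a relevant id, q2 does not.
results = {"q1": ["a", "b"], "q2": ["c"]}
gold = {"q1": {"b"}, "q2": {"z"}}
score = recall_at_k(results, gold, k=10)
```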
What's inside
- MCP-native: one command to wire into Claude Code / Cursor / Codex
- Hybrid retrieval: BM25 + vector + PAMVR boosts, no LLM in the hot path
- Auto VOCAB_BRIDGES: domain adaptation mined from your own data
- Pattern query split: compound queries handled via fan-out + RRF fusion
- Atomic fact extraction: mem0-style, regex-based, no LLM
- Keyed-upsert: fact replacement with history (no duplicate current values)
- Durable embed queue: SQLite-backed, survives process restarts
- Multilingual safety: pmb doctor warns if your model doesn't match your data
- Web dashboard at http://127.0.0.1:8765
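The fan-out + RRF fusion mentioned in the list above can be illustrated with a generic reciprocal rank fusion sketch. This is not PMB's actual code: the function name `rrf_fuse`, the constant `k=60` (a commonly used default), and the sample memory ids are all assumptions made for illustration.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60, top_n=10):
    """Fuse several ranked lists of ids with reciprocal rank fusion.

    Each id scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed default, not PMB's setting.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))[:top_n]


# Hypothetical ranked hits from two retrievers for the same query.
bm25_hits = ["m3", "m1", "m7"]
vector_hits = ["m1", "m9", "m3"]
fused = rrf_fuse([bm25_hits, vector_hits])
```

RRF uses only ranks, so BM25 scores and vector similarities never need to be normalized onto a shared scale. The same function works when the input lists come from the sub-queries of a split compound query.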
Install
pip install pmb-ai
pmb connect claude-code
Restart your agent and say "remember - I prefer Postgres". See the README for full benchmarks, screenshots, and architecture.
Compared to alternatives
| | PMB | mem0 | Letta | Zep |
|---|---|---|---|---|
| LoCoMo recall@10 | 94.5% | ~67-70% | ~76-80% | ~80% |
| p50 latency | 70ms | 1-3s | 1-3s | 1-3s |
| Runs offline | yes | no (cloud) | partial | partial |
| API key required | no | yes | yes | yes |
Honest limitations
- Top-1 universal ceiling without LLM rerank: ~58% on LoCoMo, ~73% on the multi-language stress test. Optional recall.llm_rerank (Ollama) lifts top-1 by 10-15pp at +200ms latency cost.
- Some adversarial cases (mixed-language queries with ambiguous proper nouns) remain hard without LLM-based coreference.
- Tested primarily on Windows + Linux. macOS works but has fewer CI runs.
License
Apache 2.0. No telemetry, no API keys, no cloud.
Full Changelog: https://github.com/oleksiijko/pmb/commits/v0.1.0