Git for knowledge graphs. Versioned, content-addressed agent memory with retrieval under a token budget. Embed it in Rust or Python, run the CLI, or talk to it over MCP.
mnem is a small database for knowledge graphs - nodes (people, documents, products, concepts) connected by edges (relationships). It's "Git for knowledge graphs" in the literal sense: you commit facts, branch timelines, diff revisions, merge concurrent edits, and sign what you wrote. Runs entirely in-process (embed it into your app or agent) or on disk via an embedded key-value store. No server required.
An AI agent that talks to tools, reads documents, or collaborates with other agents needs somewhere to remember what it learned. Today the options are bad:
- Dump everything into the LLM prompt. Expensive, context-window-bound, no versioning, no multi-agent reconciliation.
- Build a bespoke memory service. Weeks of plumbing, no shared format, no provenance.
- Shove markdown "skills" files around. Unversioned, unqueryable, wasteful of tokens.
mnem is the missing third option. Write facts as a graph, query only what you need, version everything, let agents share and reconcile concurrent edits automatically.
mnem is infrastructure. Git, SQLite, and LMDB are substrates; the products people actually use are built on top of them. mnem aims to be that layer for agent memory.
Concretely, that means:
- Zero schema opinions. No hardcoded `user_id`/`agent_id`/`run_id` triples, no conversation-message shape, no rooms-and-wings metaphor. Bring your own nodes, edges, labels, and properties.
- Content-addressed and deterministic. Every object has a cryptographic name. Given the same inputs, two machines produce byte-identical CIDs, so replay and regression tests are real tests.
- Versioned by construction. Commit, branch, diff, 3-way merge, signed history. Not bolted on; the data model is a DAG from the first byte.
- WASM-clean core. `mnem-core` has no tokio, no filesystem, no network. Async and I/O live at the edges (mnem-http, storage backends) so the core compiles to browsers, edge workers, and embedded runtimes unchanged.
Detailed comparisons with specific products (mem0, MemPalace, Graphiti, Letta, Cognee, and others) live in docs/competitive/.
- Retrieval under a token budget - `repo.retrieve().text(q).token_budget(500).execute()` returns ranked, rendered nodes that fit the budget, plus explicit `tokens_used` / `dropped` / `candidates_seen` metadata. No silent truncation of LLM context.
- Hybrid BM25 + vector + optional graph-expand + optional cross-encoder rerank, with automatic paraphrase bridging - five tiers, every one tunable or opt-out:
  - BM25 + cosine vector, fused via Reciprocal Rank Fusion (always on).
  - Auto query expansion via pseudo-relevance feedback (on by default when an embedder is configured). Surfaces aunts for "who is my father's sister" without any user-written synonym map. PRF tokens feed the embedder only, not BM25, so exact-keyword semantics stay intact.
  - HyDE (Hypothetical Document Embeddings) via `--hyde openai:gpt-4o-mini`. The LLM generates a hypothetical answer; mnem embeds that instead of the raw query. Bridges compositional paraphrase a bi-encoder alone cannot. Cohere/OpenAI/Ollama chat adapters ship in `mnem-llm-providers`.
  - Graph-expand via `--graph-expand N`. After fusion, mnem traverses outgoing edges 1 hop from each seed and adds neighbors as scored candidates. Exploits the authored graph - no other chunk-bag memory system can do this.
  - Cross-encoder reranker via `--rerank cohere:rerank-english-v3.0` (also Voyage, Jina adapters). Re-scores the top-K of the fused+expanded list. Reads (query, candidate) jointly, which BM25 and bi-encoders cannot.
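Reciprocal Rank Fusion (tier 1) is what lets BM25 and cosine rankings combine without any score calibration, since it uses only ranks. A minimal Python sketch of the standard formula — illustrative, not mnem's implementation; `k=60` is the conventional RRF constant:

```python
def rrf_fuse(rankings, k=60):
    """Fuse several ranked id lists via Reciprocal Rank Fusion.

    Each ranker contributes 1 / (k + rank) per document, so documents
    that rank well in any list float to the top - no score calibration
    between rankers is needed.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_order = ["alice-berlin", "bob-paris", "alice-climbing"]
vector_order = ["alice-climbing", "alice-berlin", "bob-paris"]
fused = rrf_fuse([bm25_order, vector_order])
# "alice-berlin" wins: it ranks near the top in both lists
```

Because the fused score depends only on rank positions, a document that is mediocre in one ranker but strong in the other still surfaces, which is exactly the behaviour you want when BM25 and the embedder disagree.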
- Content-addressed storage - every object has a cryptographic hash. Deduplication and integrity are automatic.
- Git-shaped versioning - commits, branches, diff, 3-way merge, history walk. Designed for graphs from day one, not retrofitted from a file-tree VCS.
- Sub-millisecond agent queries - label, property, and adjacency indexes turn "nodes where name='Alice'", "all Person nodes", and "outgoing edges of X" into O(log n) point lookups.
- MCP-native - ships an MCP server (`mnem-mcp`) with 13 tools (`mnem_stats`, `mnem_schema`, `mnem_search`, `mnem_text_search`, `mnem_vector_search`, `mnem_retrieve`, `mnem_get_node`, `mnem_traverse`, `mnem_commit`, `mnem_delete_node`, `mnem_list_nodes`, `mnem_resolve_or_create`, `mnem_recent`). Every response carries `_meta` telemetry (bytes, latency, tokens_estimate).
- Git-shaped CLI - `mnem init / add / status / stats / query / retrieve / embed / log / show / diff / ref / config / integrate / doctor`; zero-config, works offline, walks up to find the nearest `.mnem` like `git` does.
- Deterministic across machines - node tree and IndexSet CIDs are byte-identical given the same inputs, so agent replay and regression tests actually work. Verified by the B5 invariant in docs/benchmarks/ai-native.md.
- Cryptographic signatures - every commit can carry an Ed25519 signature; revocation lists let a repo disown compromised keys without invalidating historical commits.
- Embeddable everywhere - pure Rust core compiles to native binaries and WASM. Python bindings (`pip install mnem`) ship today; Node, Go, and Swift sit on the roadmap. The MCP server covers most agent use cases without per-language bindings.
- Local-first - stores a whole repository in a single redb file. No network, no cloud, no account. Optional remote sync lands in v0.3.
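The determinism behind content addressing is easy to picture: an object's ID is a hash over a canonical encoding, so field order and machine don't matter. A toy sketch using canonical JSON as a stand-in for mnem's actual DAG-CBOR + multihash pipeline (the real CID format differs; only the determinism property is being illustrated):

```python
import hashlib
import json

def toy_cid(obj) -> str:
    # Canonical encoding: sorted keys, no whitespace. mnem actually uses
    # DAG-CBOR + multihash CIDs; JSON here only illustrates determinism.
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(canonical).hexdigest()

a = toy_cid({"ntype": "Person", "summary": "Alice lives in Berlin"})
b = toy_cid({"summary": "Alice lives in Berlin", "ntype": "Person"})
assert a == b  # same content, same name - insertion order is irrelevant
```

Identical content yields identical names, which is what makes deduplication automatic and cross-machine replay testable.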
Full methodology, reproduction commands, and honest limitations in docs/benchmarks/ai-native.md.
| Invariant (corpus: 1000 Doc nodes, in-memory backend) | Result |
|---|---|
| B1 token efficiency: budget=100 vs naive top-20 | 20.32x fewer tokens |
| B2 recall@5 jump when you flip `index_content(true)` for content-heavy corpora | ~79x |
| B4 end-to-end retrieve latency (p50, fresh index) | ~6 ms |
| B4 amortised text-search latency (p50) | ~11 µs |
| B5 node-tree + IndexSet CID stable across fresh backends | PASS |
| B6 entity dedup across agents (sequential) | PASS |
A B3 head-to-head protocol against Mem0, Zep, LlamaIndex, and LangChain memory is defined in benchmarks/b3/. The mnem side and the shared scorer are checked in; external-system numbers land as each adapter gets its API keys.
Below the retrieval layer, mnem's index structure is a Prolly tree, which beats a correctness-matched HAMT at diff by up to 324× at 100k entries. Raw numbers in docs/benchmarks/prolly-vs-hamt.md.
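The reason Prolly-tree diff is fast is content-defined chunking: chunk boundaries are a deterministic function of the keys themselves, so an insert reshapes only the chunk(s) around it and every other chunk (and its hash) stays identical, letting diff skip unchanged subtrees. A toy sketch of the boundary idea — the mask and hashing details are assumptions, not mnem's actual chunker:

```python
import hashlib

def chunk_keys(keys, mask=0x0F):
    """Split a sorted key list into chunks at content-defined boundaries.

    A key closes a chunk when the low bits of its hash are zero
    (~1-in-16 keys with this mask). Because the boundary decision
    depends only on the key, neighbouring chunks survive an insert
    unchanged - the property a Prolly-tree diff exploits.
    """
    chunks, current = [], []
    for key in keys:
        current.append(key)
        digest = hashlib.sha256(key.encode()).digest()
        if digest[-1] & mask == 0:
            chunks.append(current)
            current = []
    if current:
        chunks.append(current)
    return chunks

before = chunk_keys([f"node-{i:04}" for i in range(1000)])
after = chunk_keys([f"node-{i:04}" for i in range(1000)] + ["node-9999"])
# only the trailing chunk can differ; every earlier chunk is byte-identical
```

Contrast a fixed-fanout tree, where one insert can shift every subsequent split point and force the diff to touch the whole structure.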
| Component | Linux | macOS | Windows | WASM |
|---|---|---|---|---|
| `mnem-core` | yes | yes | yes | yes |
| `mnem-backend-redb` (embedded KV) | yes | yes | yes | - |
GitHub Actions runs the test matrix on Ubuntu, macOS, and Windows on every push and PR (.github/workflows/ci.yml).
```sh
# macOS / Linux
curl -fsSL https://mnemos.ai/install.sh | sh

# Windows (PowerShell)
iwr -useb https://mnemos.ai/install.ps1 | iex
```

Or via your language's package manager - `pip install mnem`, `npm install -g @mnem/cli`, `cargo install mnem-cli`, `brew install mnemos/tap/mnem`, `winget install mnemos.mnem`. Full matrix (Docker, WASM, build-from-source) in docs/guide/installation.md. Installer and prebuilt binaries arrive with the first tagged release; until then, clone + `cargo build --release -p mnem-cli`.
After install, one command wires mnem into every agent host on your machine:
```sh
mnem integrate
```

Detects Claude Desktop, Cursor, Continue, Zed, Claude Code, Codex, and Gemini CLI by probing their config paths. Asks which to wire. Backs up originals. Writes the MCP entry. Prints how to restart each host. No hand-editing of JSON, no "where does Cursor keep its config" hunt. Details in docs/guide/integrate.md.
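For orientation, the MCP entry such hosts consume is typically of the common `mcpServers` shape shown below. This is illustrative only — the exact keys, binary path, and `args` here are assumptions, not the literal output of `mnem integrate`:

```json
{
  "mcpServers": {
    "mnem": {
      "command": "mnem-mcp",
      "args": ["--repo", "/path/to/project"]
    }
  }
}
```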
```sh
# write a few agent memories
mnem init
mnem add node --label Memory --summary "Alice lives in Berlin and works at Globex"
mnem add node --label Memory --summary "Alice's hobby is climbing; she goes weekly"
mnem add node --label Memory --summary "Bob moved to Paris last month"
mnem add node --label Memory --summary "Alice is allergic to penicillin"

# retrieve under a 200-token budget
mnem retrieve --text "Alice Berlin" --budget 200
```

Output:
```text
# 2 item(s), 57/200 tokens, 0 dropped, 2 candidates
---
[0] score=0.0164 tokens=29 id=... Memory
ntype: Memory
summary: Alice lives in Berlin and works at Globex
---
[1] score=0.0161 tokens=28 id=... Memory
ntype: Memory
summary: Alice's hobby is climbing; she goes weekly
```

The `57/200 tokens, 0 dropped, 2 candidates` line is the feature: mnem tells you whether the budget was tight, so your agent can react (raise the budget, narrow the query, move on).
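Because `dropped` and `tokens_used` are explicit, an agent-side reaction is a few lines. A sketch against the Python API (`repo.retrieve(text=..., token_budget=...)`); the doubling retry policy itself is illustrative, not part of mnem:

```python
def retrieve_with_backoff(repo, text, budget=200, max_budget=1600):
    """Retry with a doubled token budget while candidates are dropped.

    Illustrative policy: `dropped > 0` means the budget was too tight
    for the ranked candidates, so widen it (up to a cap) and try again.
    """
    while True:
        result = repo.retrieve(text=text, token_budget=budget)
        if result.dropped == 0 or budget >= max_budget:
            return result
        budget = min(budget * 2, max_budget)
```

The same signal can drive the opposite move: if `dropped` is large and scores are flat, narrow the query instead of widening the budget.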
Once you point mnem at an embedding provider, `mnem add node` auto-embeds new nodes and `mnem retrieve` auto-fuses BM25 with cosine similarity under RRF.
Local (no API key, runs on your machine):
```sh
# One-time: install Ollama (https://ollama.com/download), then:
ollama serve &   # background daemon
ollama pull nomic-embed-text
mnem config set embed.provider ollama
mnem config set embed.model nomic-embed-text
mnem embed       # backfill existing nodes (one commit)
mnem retrieve --text "where does Alice live" --budget 200
```

Cloud (OpenAI):
```sh
export OPENAI_API_KEY=sk-...
mnem config set embed.provider openai
mnem config set embed.model text-embedding-3-small
mnem config set embed.api_key_env OPENAI_API_KEY
mnem embed
mnem retrieve --text "where does Alice live" --budget 200
```

Provider failures never block commits or queries - they warn on stderr and degrade to text-only retrieval. API keys live in environment variables, never in config.toml. Add `--no-vector` (or `--no-text`) to force a single-ranker run.
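The degrade-don't-fail behaviour is worth a sketch, because it is the difference between a flaky provider costing you ranking quality and costing you the whole query. An assumption about the shape of the logic, not mnem's actual code:

```python
import sys

def embed_or_none(embed_fn, text):
    """Call the configured embedder; on any provider failure, warn on
    stderr and return None so the caller falls back to text-only (BM25)
    retrieval instead of aborting. Sketch of the described behaviour."""
    try:
        return embed_fn(text)
    except Exception as exc:
        print(f"warn: embedder unavailable ({exc}); falling back to text-only",
              file=sys.stderr)
        return None
```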
Once providers are configured, `mnem retrieve --text "..."` automatically runs the best pipeline your config supports. Everything is tunable via flags; no silent constraints:
| Tier | Default | How to enable | How to tune |
|---|---|---|---|
| 1. Hybrid BM25 + cosine vector | always on | configure `embed.provider` | `--no-text`, `--no-vector`, `--text-cap N`, `--vector-cap N` |
| 2. Auto query expansion (PRF) | on by default with embedder | already on | `--auto-expand N`, `--no-auto-expand` |
| 3. HyDE - LLM hypothetical answer | opt-in per call | `--hyde openai:gpt-4o-mini` or `[llm]` config | `--hyde-max-tokens N`, `--hyde-temperature F` |
| 4. Graph-expand - 1-hop edge traversal | opt-in per call | `--graph-expand N` | `--graph-decay F`, `--graph-etype ETYPE` |
| 5. Cross-encoder reranker | auto-on with `[rerank]` config, or `--rerank PROVIDER:MODEL` | configure `rerank.provider` | `--rerank-top-k N` |
Example full stack:
```sh
export OPENAI_API_KEY=sk-... ; export COHERE_API_KEY=...
mnem config set embed.provider openai
mnem config set embed.model text-embedding-3-small
mnem config set rerank.provider cohere
mnem config set rerank.model rerank-english-v3.0
mnem retrieve --text "who is my father's sister" \
  --hyde openai:gpt-4o-mini \
  --graph-expand 20 \
  --limit 5
```

The design principle: what you opt into is a provider, not a retrieval tactic. Every tier has sensible defaults; every default has a CLI override. See docs/guide/semantic-search.md for the full tier-by-tier hierarchy and ADR-0020 for the reranker trade-off analysis.
For honest head-to-head numbers vs mem0, Zep, Graphiti, MemPalace, pinecone+cohere, see the Mnemos-Lab (private) benchmark sandbox.
```rust
use mnem_core::{objects::Node, id::NodeId, repo::ReadonlyRepo};

// ... bring up repo from in-memory or redb stores (see docs/QUICKSTART.md) ...

// Write: commit a node with a short LLM-facing summary.
let alice = Node::new(NodeId::new_v7(), "Person")
    .with_summary("Alice lives in Berlin and works at Globex.");
let mut tx = repo.start_transaction();
tx.add_node(&alice)?;
let repo = tx.commit("alice@example.org", "add Alice")?;

// Read: retrieve under a token budget. Hybrid BM25 + optional vector.
let result = repo
    .retrieve()
    .label("Person")
    .text("Alice Berlin")
    .token_budget(500)
    .execute()?;

for item in &result.items {
    println!("score={:.3} tokens={}\n{}", item.score, item.tokens, item.rendered);
}
println!("used {}/{} tokens, {} dropped, {} candidates",
    result.tokens_used, result.tokens_budget,
    result.dropped, result.candidates_seen);
```

More: sign + verify, persist to disk, CAS on refs, Prolly-tree diff, op-log walk. See docs/QUICKSTART.md and the runnable examples.
```sh
pip install mnem
```

```python
import mnem

repo = mnem.Repo.init_memory()  # or mnem.Repo.open_or_init("/tmp/agent.redb")

with repo.transaction(author="alice@example.com", message="seed") as tx:
    tx.add_node(ntype="Memory", summary="Alice lives in Berlin and works at Globex")
    tx.add_node(ntype="Memory", summary="Alice's hobby is climbing")
    tx.add_node(ntype="Memory", summary="Bob moved to Paris last month")

result = repo.retrieve(text="Alice Berlin", token_budget=500, limit=10)
for item in result:
    print(f"{item.score:.3f} [{item.tokens}t] {item.summary}")
print(f"used {result.tokens_used}/{result.tokens_budget} tokens,",
      f"{result.dropped} dropped of {result.candidates_seen} candidates")
```

The Python wrapper (crates/mnem-py) is a thin pyo3 layer over the same Rust core; retrieval throughput is the numbers in docs/benchmarks/ai-native.md minus a <50 µs FFI crossing per call.
```sh
mnem-http --repo /path/to/project --bind 127.0.0.1:9876

# Another terminal:
curl -s http://127.0.0.1:9876/v1/healthz
curl -s -X POST http://127.0.0.1:9876/v1/nodes \
  -H 'content-type: application/json' \
  -d '{"label":"Memory","summary":"Alice lives in Berlin","author":"me"}'
curl -s 'http://127.0.0.1:9876/v1/retrieve?text=Alice&budget=200'
```

Every response carries a `schema` field (`mnem.v1.healthz` / `mnem.v1.post-node` / `mnem.v1.retrieve` / `mnem.v1.err`) so breaking changes are detectable. Binds loopback-only by default; non-loopback binds emit a stderr warning because v1 has no auth layer - front with a reverse proxy. Design record: ADR-0019.
`mnem-mcp` is a JSON-RPC 2.0 server over stdio that any MCP-aware agent host (Claude Desktop, Continue, Cursor, ...) can spawn. Every response carries `_meta` with byte count, latency, and an estimated token count.
```json
{"jsonrpc":"2.0","id":1,"method":"tools/call",
 "params":{"name":"mnem_retrieve",
           "arguments":{"text":"Alice Berlin","token_budget":500,"limit":5}}}
```

Tools shipped: `mnem_stats`, `mnem_schema`, `mnem_search`, `mnem_text_search`, `mnem_vector_search`, `mnem_retrieve`, `mnem_get_node`, `mnem_traverse`, `mnem_commit`, `mnem_delete_node`, `mnem_list_nodes`, `mnem_resolve_or_create`, `mnem_recent`.
mnem-core composes over two pluggable traits - Blockstore (content-addressed bytes) and OpHeadsStore (current op-head set) - and presents a Jujutsu-style ReadonlyRepo / Transaction facade on top. Storage backends plug in as separate crates; the same core API drives an in-memory store or a redb-backed persistent store.
```text
┌───────────────────────────────────────────────────────────────┐
│  mnem-mcp (stdio server)     mnem (CLI)     your Rust app     │
├───────────────────────────────────────────────────────────────┤
│  retrieve::{Retriever, HeuristicEstimator, render_node}       │
│  index::{Query, TextIndex, VectorIndex, PropPredicate}        │
│  ─────────────────────────────────────────────────────────    │
│  ReadonlyRepo     Transaction     CAS     sign::Verifier      │
│  ─────────────────────────────────────────────────────────    │
│  objects::{Node, Edge, Commit, Operation, View, IndexSet}     │
│  prolly::{chunker, tree, lookup, cursor, diff}                │
│  codec::{dagcbor, dagjson}     id::{Cid, Multihash, Link}     │
│  ─────────────────────────────────────────────────────────    │
│  store::Blockstore     store::OpHeadsStore     ← traits       │
├───────────────────────────────────────────────────────────────┤
│  MemoryBlockstore      RedbBlockstore                         │
│  MemoryOpHeadsStore    RedbOpHeadsStore                       │
└───────────────────────────────────────────────────────────────┘
```
Depth: ARCHITECTURE.md.
mnem offers two complementary retrieval surfaces. Pick whichever fits the shape of the question.
Structured - exact label / prop-eq lookups backed by the Prolly indexes. Use when the agent knows the key.
```rust
let hits = repo.query()
    .label("Person")
    .where_eq("name", "Alice")
    .with_outgoing("knows")
    .limit(5)
    .execute()?;

let alice_id = tx.resolve_or_create_node("Person", "name", "Alice")?; // dedup-on-write
```

Ranked + budgeted - BM25 + vector + filters fused under a token budget. Use when the agent has keywords or an embedding and needs to assemble LLM context under a cap.
```rust
let result = repo.retrieve()
    .label("Document")
    .text("vector databases latency")
    .vector("openai:text-embedding-3-small", query_embedding)
    .text_weight(0.3)     // let the semantic ranker dominate
    .index_content(true)  // also tokenise content (Week-4 win)
    .token_budget(2000)
    .limit(10)
    .execute()?;
```

IndexSet (label / prop / adjacency) is built automatically on every commit and stored through Commit.indexes (SPEC §4.8). The BM25 TextIndex and cosine VectorIndex are rebuilt on demand; persisted variants land in v0.3.
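The budget contract — explicit `tokens_used` / `dropped` counters instead of silent truncation — can be pictured as greedy packing of the ranked candidate list. A sketch of one plausible strategy, not necessarily mnem's exact algorithm:

```python
def pack_under_budget(candidates, budget):
    """Greedily keep ranked candidates while they fit the token budget.

    candidates: (score, tokens, payload) tuples. Returns kept payloads
    plus the tokens_used / dropped counters a budgeted retrieve surface
    would report. Illustrative strategy only.
    """
    kept, tokens_used, dropped = [], 0, 0
    for score, tokens, payload in sorted(candidates, key=lambda c: -c[0]):
        if tokens_used + tokens <= budget:
            kept.append(payload)
            tokens_used += tokens
        else:
            dropped += 1
    return kept, tokens_used, dropped

items = [(0.9, 100, "alice-berlin"), (0.8, 150, "bob-paris"),
         (0.7, 60, "alice-climbing")]
kept, used, dropped = pack_under_budget(items, budget=200)
# kept = ["alice-berlin", "alice-climbing"], used = 160, dropped = 1
```

Note the second-best candidate is dropped while a cheaper third one still fits, which is why reporting `dropped` matters: the caller learns the budget, not the ranker, excluded something.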
| Phase | State | Scope |
|---|---|---|
| 0 Design | done | SPEC, ARCHITECTURE, 19 ADRs |
| 1 Core library | done | content-addressed DAG-CBOR, Prolly trees, redb backend, Ed25519 signing, CAS, cross-platform proptest fuzz |
| 2 Adoption surface | done | mnem-cli + mnem-mcp, Node.summary, BM25 + vector indexes, Retriever with RRF + token budget, B1/B2/B4/B5/B6 benchmarks |
| 3 Remote protocol | planned | mnem-server + clone/push/pull (refs/a2a/* reserved), persisted TextIndex / VectorIndex, B3 head-to-head published |
| 4 Integrations | in progress | mnem-py shipped (pyo3 + abi3-py39); mnem-a2a, Node, WASM still planned |
| 5 Launch | in prep | Show HN, Product Hunt, Twitter |
Full roadmap: ROADMAP.md. Test count at HEAD: 279, owned by mnem-core (224), mnem-cli (21 across integrate + doctor), mnem-backend-redb (11), mnem-embed-providers (9), and mnem-http (7 axum integration tests) plus 7 fuzz cases; mnem-mcp / mnem-py are covered end-to-end through the example binaries rather than per-crate unit tests.
| Crate | Role |
|---|---|
| `mnem-core` | objects, codec, Prolly trees, repo facade, signing, Query / TextIndex / VectorIndex / Retriever |
| `mnem-backend-redb` | embedded KV persistent backend |
| `mnem-core-testutils` | shared dev-only fixtures |
| `mnem-cli` | `mnem` binary - init / add / status / stats / query / retrieve / embed / log / show / diff / ref / config / integrate / doctor |
| `mnem-mcp` | `mnem-mcp` stdio JSON-RPC server exposing retrieval + commit tools to MCP hosts |
| `mnem-http` | `mnem-http` REST server (axum) exposing POST /v1/nodes, GET /v1/nodes/:id, DELETE /v1/nodes/:id, GET /v1/retrieve, GET /v1/stats, GET /v1/healthz. Tokio lives only here; core stays WASM-clean (ADR-0019) |
| `mnem-py` | pyo3 bindings: `pip install mnem` gets you `Repo.retrieve(text=..., token_budget=N)` in Python |
| `mnem-embed-providers` | sync HTTP adapters (OpenAI + Ollama) behind a single `Embedder` trait; what powers semantic search in the CLI |
- Guide - getting-started, core concepts, CLI, Python, MCP, semantic search. Structured to deploy as a static site (e.g. docs.mnemos.ai) later.
- Competitive - where mnem sits vs mem0, MemPalace, and the rest. Detailed tables per competitor; humble framing.
- SPEC.md - canonical format specification. An engineer reading only this document can build a compatible implementation.
- ARCHITECTURE.md - module boundaries, trait hierarchy, data flow, invariants.
- ROADMAP.md - phased execution plan with exit criteria.
- QUICKSTART.md - five-minute hands-on tour (mirrors docs/guide/getting-started.md).
- CAPABILITIES.md - the runnable example binaries and what they prove.
- ADRs - 19 Architecture Decision Records. Every non-trivial choice has a rationale, alternatives considered, consequences, and review triggers. The five most recent cover the Phase-2 retrieval + onboarding + HTTP surface: ADR-0015 (BM25 + vector + RRF), ADR-0016 (content indexing + fusion weights), ADR-0017 (embedding-provider adapters), ADR-0018 (one-line install + `mnem integrate` wizard + `mnem doctor`), ADR-0019 (`mnem-http` crate + tokio boundary).
- Benchmarks - empirical validation of each major design choice.
mnem stands on a substantial body of prior work. The studies that shaped the design:
- Git internals - object format, hashing, atomic writes, refs, packfiles.
- Jujutsu architecture - the modern Rust VCS template.
- IPFS / IPLD / multiformats - CID, multihash, DAG-CBOR, HAMT.
- Dolt / TerminusDB / Pijul - Prolly trees, patch theory, graph merge.
- Google A2A protocol - agent-to-agent interop surface.
- Rust crate audit - 2026-04 dependency state of the art.
- Phase 1 risk review - pre-implementation critique.
```sh
# Clone
git clone https://github.com/Project-Mnemos/mnem.git
cd mnem

# Full test suite (279 tests across the workspace)
cargo test --workspace --tests --lib

# AI-native pitch in one runnable file
cargo run --release -p mnem-core --example agent_retrieve_under_budget

# Token efficiency + latency + determinism + multi-agent invariants
cargo run --release -p mnem-core --example ai_native_bench

# Recall benchmark (judge-free synthetic ground truth)
cargo run --release -p mnem-core --example b2_recall_bench

# Head-to-head scaffold (corpus + questions + mnem runner)
cargo run --release -p mnem-core --example b3_runner -- gen-corpus > benchmarks/b3/corpus.json
cargo run --release -p mnem-core --example b3_runner -- gen-questions > benchmarks/b3/questions.json
cargo run --release -p mnem-core --example b3_runner -- run-mnem \
  benchmarks/b3/corpus.json benchmarks/b3/questions.json \
  > benchmarks/b3/results/mnem.jsonl
python benchmarks/b3/score.py \
  --questions benchmarks/b3/questions.json \
  benchmarks/b3/results/*.jsonl
```

See docs/CAPABILITIES.md for the full demo list and docs/benchmarks/ai-native.md for benchmark methodology.
Early days. The library, CLI, MCP server, and AI-native retrieval surface are all shipped; the remote protocol and cross-system head-to-head numbers are the next major scope.
- Full guide in CONTRIBUTING.md.
- Start at SPEC.md to understand the format.
- Every non-trivial change cites an ADR or opens one.
- Security issues: see SECURITY.md. Do not file them in the public tracker.
- All contributors are expected to uphold the Code of Conduct.
Licensed under the Apache License, Version 2.0.
Contributions are accepted under the same license. By opening a pull request you agree that your contribution is licensed under Apache-2.0, including the patent grant in section 3 of the license.
Built by independent contributors. Star the repo if mnem is useful to you.