Local knowledge graph for AI agents. Hybrid search, vault graph, and MCP server for Obsidian vaults — entirely offline.
engraph turns your markdown vault into a searchable knowledge graph that AI agents can query through MCP. It combines semantic embeddings, full-text search, wikilink graph traversal, and LLM-powered reranking into a single local binary. Same model stack as qmd. No API keys, no cloud — everything runs on your machine.
Plain vector search treats your notes as isolated documents. But knowledge isn't flat — your notes link to each other, share tags, reference the same people and projects. engraph understands these connections.
- 4-lane hybrid search — semantic embeddings + BM25 full-text + graph expansion + cross-encoder reranking, fused via Reciprocal Rank Fusion. An LLM orchestrator classifies queries and adapts lane weights per intent.
- MCP server for AI agents — `engraph serve` exposes 13 tools (search, read, context bundles, note creation) that Claude, Cursor, or any MCP client can call directly.
- Real-time sync — file watcher keeps the index fresh as you edit in Obsidian. No manual re-indexing needed.
- Smart write pipeline — AI agents can create notes with automatic tag resolution, wikilink discovery, and folder placement based on semantic similarity.
- Fully local — llama.cpp inference with GGUF models (~300MB required, plus an optional ~1.3GB for intelligence). Metal GPU-accelerated on macOS (88 files indexed in 70s). No API keys, no cloud.
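The fusion step named above can be sketched in a few lines. This is an illustrative Reciprocal Rank Fusion using the conventional k = 60 constant; engraph's actual constant, per-lane weighting, and two-pass structure may differ:

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion: each lane contributes 1 / (k + rank) for every
/// document it returns; documents surfaced by several lanes accumulate score.
fn rrf_fuse(lanes: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for lane in lanes {
        for (i, doc) in lane.iter().enumerate() {
            // ranks are 1-based in the RRF formula
            *scores.entry(doc.to_string()).or_insert(0.0) += 1.0 / (k + (i + 1) as f64);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    fused
}

fn main() {
    let semantic = vec!["auth.md", "api.md", "sarah.md"];
    let bm25 = vec!["api.md", "sarah.md", "auth.md"];
    let fused = rrf_fuse(&[semantic, bm25], 60.0);
    // api.md wins: ranking well in both lanes beats ranking #1 in only one
    assert_eq!(fused[0].0, "api.md");
}
```

The appeal of RRF is that it fuses lanes with incomparable raw scores (cosine similarity, BM25, cross-encoder logits) using only ranks.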
You have hundreds of markdown notes. You want your AI coding assistant to understand what you've written — not just search keywords, but follow the connections between notes, understand context, and write new notes that fit your vault's structure.
Existing options are either cloud-dependent (Notion AI, Mem), limited to keyword search (Obsidian's built-in), or require you to copy-paste context manually. engraph gives AI agents direct, structured access to your entire vault through a standard protocol.
```
Your vault (markdown files)
        │
        ▼
┌─────────────────────────────────────────────┐
│              engraph index                  │
│                                             │
│  Walk → Chunk → Embed (llama.cpp) → Store   │
│                                             │
│  SQLite: files, chunks, FTS5, vectors,      │
│          edges, centroids, tags, LLM cache  │
└─────────────────────────────────────────────┘
        │
        ▼
┌─────────────────────────────────────────────┐
│              engraph serve                  │
│                                             │
│  MCP Server (stdio) + File Watcher          │
│                                             │
│  Search: Orchestrator → 4-lane retrieval    │
│          → Reranker → Two-pass RRF fusion   │
│                                             │
│  13 tools: search, read, list, context,     │
│            who, project, create, append,    │
│            move...                          │
└─────────────────────────────────────────────┘
        │
        ▼
Claude / Cursor / any MCP client
```
- Index — walks your vault, chunks markdown by headings, embeds with a local GGUF model via llama.cpp (Metal GPU on macOS), stores everything in SQLite with FTS5 + sqlite-vec + a wikilink graph
- Search — an orchestrator classifies the query and sets lane weights, then runs up to four lanes (semantic KNN, BM25 keyword, graph expansion, cross-encoder reranking), fused via RRF
- Serve — starts an MCP server that AI agents connect to, with a file watcher that re-indexes changes in real time
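The "chunks markdown by headings" step in the Index stage can be sketched as follows. This is a simplified illustration (it ignores headings inside code fences and does not cap chunk size, which a real chunker likely would):

```rust
/// Split markdown into chunks at heading boundaries, keeping each
/// heading together with the body text that follows it.
fn chunk_by_headings(markdown: &str) -> Vec<String> {
    let mut chunks: Vec<String> = Vec::new();
    let mut current = String::new();
    for line in markdown.lines() {
        // start a new chunk at every heading (simplified: no fence tracking)
        if line.starts_with('#') && !current.trim().is_empty() {
            chunks.push(current.trim_end().to_string());
            current = String::new();
        }
        current.push_str(line);
        current.push('\n');
    }
    if !current.trim().is_empty() {
        chunks.push(current.trim_end().to_string());
    }
    chunks
}

fn main() {
    let doc = "# Auth\nOAuth 2.0 with PKCE.\n\n## Sessions\nHTTP-only cookies.\n";
    let chunks = chunk_by_headings(doc);
    // one chunk per heading section
    assert_eq!(chunks.len(), 2);
}
```

Chunking at heading boundaries keeps each embedded vector aligned with one coherent topic, which is why results can point at a specific section (e.g. `> # Auth Architecture`) rather than a whole file.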
Install:

```bash
# Homebrew (macOS)
brew install devwhodevs/tap/engraph

# Pre-built binaries (macOS arm64, Linux x86_64)
# → https://github.com/devwhodevs/engraph/releases

# From source (requires CMake for llama.cpp)
cargo install --git https://github.com/devwhodevs/engraph
```

Index your vault:

```bash
engraph index ~/path/to/vault
# Downloads embedding model on first run (~300MB)
# Incremental — only re-embeds changed files on subsequent runs
```
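Incremental indexing implies per-file change detection. A common approach is to compare a stored content hash against the current one (a sketch; engraph's actual scheme may use mtimes or a different hash):

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

fn content_hash(content: &str) -> u64 {
    let mut h = DefaultHasher::new();
    content.hash(&mut h);
    h.finish()
}

/// Compare each file's current content hash with the hash stored at the
/// last index run; only mismatches (and new files) need re-embedding.
fn changed_files<'a>(
    vault: &'a HashMap<String, String>, // path -> current content
    stored: &HashMap<String, u64>,      // path -> hash from last run
) -> Vec<&'a str> {
    let mut out: Vec<&'a str> = vault
        .iter()
        .filter(|(path, content)| {
            stored.get(*path).copied() != Some(content_hash(content.as_str()))
        })
        .map(|(path, _)| path.as_str())
        .collect();
    out.sort();
    out
}

fn main() {
    let mut vault = HashMap::new();
    vault.insert("auth.md".to_string(), "# Auth\nedited".to_string());
    vault.insert("api.md".to_string(), "# API".to_string());

    let mut stored = HashMap::new();
    stored.insert("auth.md".to_string(), content_hash("# Auth\noriginal"));
    stored.insert("api.md".to_string(), content_hash("# API"));

    // only auth.md changed, so only auth.md is re-embedded
    assert_eq!(changed_files(&vault, &stored), vec!["auth.md"]);
}
```

Hashing content rather than trusting mtimes alone avoids re-embedding files that were touched but not actually edited.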
Search:

```bash
engraph search "how does the auth system work"

1. [0.04] 02-Areas/Development/Auth-Architecture.md > # Auth Architecture #6e1b70
   OAuth 2.0 with PKCE for all client types. Session tokens stored in HTTP-only cookies...
2. [0.04] 01-Projects/API-Design.md > # API Design #e3e350
   All endpoints require Bearer token authentication. Tokens are issued by the OAuth 2.0...
3. [0.04] 03-Resources/People/Sarah-Chen.md > # Sarah Chen #4adb39
   Senior Backend Engineer. Tech lead for authentication and security systems...
```
Note how result #3 was found via graph expansion — Sarah's note doesn't mention "auth system" directly, but she's linked from the auth architecture doc via [[Sarah Chen]].
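That expansion step can be pictured as adding the 1-hop wikilink neighbors of the top hits to the candidate pool before fusion (an illustration; engraph's edge storage and hop depth may differ):

```rust
use std::collections::{HashMap, HashSet};

/// Add the 1-hop wikilink/mention neighbors of the seed results to the
/// candidate pool, preserving seed order first.
fn expand_graph(seeds: &[&str], edges: &HashMap<&str, Vec<&str>>) -> Vec<String> {
    let mut seen: HashSet<&str> = seeds.iter().copied().collect();
    let mut pool: Vec<String> = seeds.iter().map(|s| s.to_string()).collect();
    for seed in seeds {
        if let Some(neighbors) = edges.get(seed) {
            for &n in neighbors {
                // only add notes not already in the pool
                if seen.insert(n) {
                    pool.push(n.to_string());
                }
            }
        }
    }
    pool
}

fn main() {
    let mut edges = HashMap::new();
    // Auth-Architecture.md contains the wikilink [[Sarah Chen]]
    edges.insert("Auth-Architecture.md", vec!["Sarah-Chen.md"]);
    let pool = expand_graph(&["Auth-Architecture.md", "API-Design.md"], &edges);
    // Sarah's note enters the pool despite never matching the query text
    assert!(pool.contains(&"Sarah-Chen.md".to_string()));
}
```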
Connect to Claude Code:

```bash
# Start the MCP server
engraph serve
```

Or add to Claude Code's settings (`~/.claude/settings.json`):

```json
{
  "mcpServers": {
    "engraph": {
      "command": "engraph",
      "args": ["serve"]
    }
  }
}
```

Now Claude can search your vault, read notes, build context bundles, and create new notes — all through structured tool calls.
Enable intelligence (optional, ~1.3GB download):

```bash
engraph configure --enable-intelligence
# Downloads Qwen3-0.6B (orchestrator) + Qwen3-Reranker (cross-encoder)
# Adds LLM query expansion + 4th reranker lane to search
```

4-lane search with intent classification:
```bash
engraph search "how does authentication work" --explain

1. [0.04] 01-Projects/API-Design.md > # API Design #e3e350
   All endpoints require Bearer token authentication...

Intent: Conceptual

--- Explain ---
01-Projects/API-Design.md
  RRF: 0.0387
  semantic: rank #2, raw 0.38, +0.0194
  rerank:   rank #2, raw 0.01, +0.0194
02-Areas/Development/Auth-Architecture.md
  RRF: 0.0384
  semantic: rank #1, raw 0.51, +0.0197
  rerank:   rank #4, raw 0.00, +0.0187
```
The orchestrator classified the query as Conceptual (boosting semantic lane weight). The reranker scored each result for relevance as the 4th RRF lane.
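The "boosting" can be pictured as scaling each lane's RRF contribution by an intent-specific weight. The intent labels and weight values below are purely illustrative, not engraph's actual tables:

```rust
/// Per-lane weights chosen for a classified query intent.
/// These numbers are made up for illustration only.
fn lane_weights(intent: &str) -> [(&'static str, f64); 4] {
    match intent {
        // conceptual queries lean on the semantic and reranker lanes
        "Conceptual" => [("semantic", 1.5), ("bm25", 0.8), ("graph", 1.0), ("rerank", 1.2)],
        // lookup queries lean on exact keyword matching
        "Lookup" => [("semantic", 0.8), ("bm25", 1.5), ("graph", 0.7), ("rerank", 1.0)],
        _ => [("semantic", 1.0), ("bm25", 1.0), ("graph", 1.0), ("rerank", 1.0)],
    }
}

/// Weighted RRF contribution of one lane hit: w / (k + rank).
fn contribution(weight: f64, rank: usize, k: f64) -> f64 {
    weight / (k + rank as f64)
}

fn main() {
    let w = lane_weights("Conceptual");
    // for a conceptual query, semantic rank #1 outweighs bm25 rank #1
    assert!(contribution(w[0].1, 1, 60.0) > contribution(w[1].1, 1, 60.0));
}
```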
Rich context for AI agents:

```bash
engraph context topic "authentication" --budget 8000
```

Returns a token-budgeted context bundle: relevant notes, connected people, related projects — ready to paste into a prompt or serve via MCP.
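Budgeted bundling can be sketched as greedy packing of ranked excerpts; the whitespace token estimate here is a stand-in for whatever tokenizer the real context engine uses:

```rust
/// Greedily pack ranked note excerpts into a token budget, skipping any
/// excerpt that would overflow and trying smaller ones further down.
fn pack_bundle<'a>(ranked: &[(&'a str, &'a str)], budget: usize) -> Vec<&'a str> {
    let mut used = 0;
    let mut bundle = Vec::new();
    for (path, excerpt) in ranked {
        // crude token estimate: whitespace-separated words
        let cost = excerpt.split_whitespace().count();
        if used + cost > budget {
            continue;
        }
        used += cost;
        bundle.push(*path);
    }
    bundle
}

fn main() {
    let ranked = [
        ("Auth-Architecture.md", "OAuth 2.0 with PKCE for all client types"),
        ("API-Design.md", "All endpoints require Bearer token authentication"),
        ("Sarah-Chen.md", "Tech lead for authentication"),
    ];
    // a 12-"token" budget fits note 1 (8 words) and note 3 (4 words),
    // skipping note 2 (6 words) because it would overflow
    assert_eq!(pack_bundle(&ranked, 12), vec!["Auth-Architecture.md", "Sarah-Chen.md"]);
}
```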
Person context:

```bash
engraph context who "Sarah Chen"
```

Returns Sarah's note, all mentions across the vault, connected notes via wikilinks, and recent activity.
Vault structure overview:

```bash
engraph context vault-map
```

Returns folder counts, top tags, recent files — gives an AI agent orientation before it starts searching.
Create a note via the write pipeline:

```bash
engraph write create --content "# Meeting Notes\n\nDiscussed auth timeline with Sarah." --tags meeting,auth
```

engraph resolves tags against the registry (fuzzy matching), discovers potential wikilinks ([[Sarah Chen]]), suggests the best folder based on semantic similarity to existing notes, and writes atomically.
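Tag resolution can be approximated as normalization plus a fallback match. This sketch uses lowercase-plus-prefix matching; engraph's actual fuzzy matcher may use edit distance or another scheme:

```rust
/// Resolve a requested tag against the vault's tag registry:
/// exact case-insensitive match first, then a prefix match.
fn resolve_tag<'a>(requested: &str, registry: &[&'a str]) -> Option<&'a str> {
    // normalize: lowercase, spaces to hyphens
    let want = requested.to_lowercase().replace(' ', "-");
    registry
        .iter()
        .find(|t| t.to_lowercase() == want)
        .or_else(|| registry.iter().find(|t| t.to_lowercase().starts_with(&want)))
        .copied()
}

fn main() {
    let registry = ["meetings", "authentication", "api-design"];
    // "auth" resolves to the existing "authentication" tag instead of
    // creating a near-duplicate
    assert_eq!(resolve_tag("auth", &registry), Some("authentication"));
    assert_eq!(resolve_tag("Meetings", &registry), Some("meetings"));
    assert_eq!(resolve_tag("unknown", &registry), None);
}
```

The point of resolving against a registry is hygiene: agents reuse the vault's existing tag vocabulary rather than fragmenting it.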
AI-assisted knowledge work — Give Claude or Cursor deep access to your personal knowledge base. Instead of copy-pasting context, the agent searches, reads, and cross-references your notes directly.
Developer second brain — Index architecture docs, decision records, meeting notes, and code snippets. Search by concept across all of them.
Research and writing — Find connections between notes that you didn't explicitly link. The graph lane surfaces related content through shared wikilinks and mentions.
Team knowledge graphs — Index a shared docs vault. AI agents can answer "who knows about X?" and "what decisions were made about Y?" by traversing the note graph.
| | engraph | Basic RAG (vector-only) | Obsidian search |
|---|---|---|---|
| Search method | 4-lane RRF (semantic + BM25 + graph + reranker) | Vector similarity only | Keyword only |
| Query understanding | LLM orchestrator classifies intent, adapts weights | None | None |
| Understands note links | Yes (wikilink graph traversal) | No | Limited (backlinks panel) |
| AI agent access | MCP server (13 tools) | Custom API needed | No |
| Write capability | Create/append/move with smart filing | No | Manual |
| Real-time sync | File watcher, 2s debounce | Manual re-index | N/A |
| Runs locally | Yes, llama.cpp + Metal GPU | Depends | Yes |
| Setup | One binary, one command | Framework + code | Built-in |
engraph is not a replacement for Obsidian — it's the intelligence layer that sits between your vault and your AI tools.
- 4-lane hybrid search (semantic + FTS5 + graph + cross-encoder reranker) with two-pass RRF fusion
- LLM research orchestrator: query intent classification + query expansion + adaptive lane weights
- llama.cpp inference via Rust bindings (GGUF models, Metal GPU on macOS, CUDA on Linux)
- Intelligence opt-in: heuristic fallback when disabled, LLM-powered when enabled
- MCP server with 13 tools (7 read, 6 write) via stdio
- Real-time file watching with 2s debounce and startup reconciliation
- Write pipeline: tag resolution, fuzzy link discovery, semantic folder placement
- Context engine: topic bundles, person bundles, project bundles with token budgets
- Vault graph: bidirectional wikilink + mention edges with multi-hop expansion
- Placement correction learning from user file moves
- Configurable model overrides for multilingual support
- 270 unit tests, CI on macOS + Ubuntu
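The 2s debounce mentioned above means a burst of file events triggers one re-index, fired only after the vault has been quiet for the full window. A minimal sketch of that logic (engraph's watcher implementation may be structured differently):

```rust
use std::time::{Duration, Instant};

/// Coalesce bursts of file events: the re-index fires only once no new
/// event has arrived for the debounce window (2s in engraph).
struct Debouncer {
    window: Duration,
    last_event: Option<Instant>,
}

impl Debouncer {
    fn new(window: Duration) -> Self {
        Self { window, last_event: None }
    }

    /// Record a file event; the quiet-period timer restarts.
    fn on_event(&mut self, now: Instant) {
        self.last_event = Some(now);
    }

    /// Polled periodically: returns true when the quiet period has
    /// elapsed and the pending re-index should run.
    fn should_fire(&mut self, now: Instant) -> bool {
        match self.last_event {
            Some(t) if now.duration_since(t) >= self.window => {
                self.last_event = None; // consume the pending work
                true
            }
            _ => false,
        }
    }
}

fn main() {
    let start = Instant::now();
    let mut d = Debouncer::new(Duration::from_secs(2));
    d.on_event(start);
    d.on_event(start + Duration::from_millis(500)); // burst: timer restarts
    assert!(!d.should_fire(start + Duration::from_secs(2))); // only 1.5s quiet
    assert!(d.should_fire(start + Duration::from_millis(2500))); // 2s quiet reached
    assert!(!d.should_fire(start + Duration::from_secs(3))); // already consumed
}
```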
- Research orchestrator — query classification and adaptive lane weighting (v1.0)
- LLM reranker — optional local model for result quality (v1.0)
- MCP edit/rewrite tools — full note editing for AI agents (v1.1)
- Temporal search — find notes by time period, detect trends (v1.2)
- HTTP/REST API — complement MCP with a standard web API (v1.3)
- Multi-vault — search across multiple vaults (v1.4)
- Vault health monitor — surface orphan notes, broken links, stale content
Optional config at `~/.engraph/config.toml`:

```toml
vault_path = "~/Documents/MyVault"
top_n = 10
exclude = [".obsidian/", "node_modules/", ".git/"]

# Enable LLM-powered intelligence (query expansion + reranking)
intelligence = true

# Override models for multilingual or custom use
[models]
# embed = "hf:Qwen/Qwen3-Embedding-0.6B-GGUF/qwen3-embedding-0.6b-q8_0.gguf"
# rerank = "hf:ggml-org/Qwen3-Reranker-0.6B-Q8_0-GGUF/qwen3-reranker-0.6b-q8_0.gguf"
```

All data is stored in `~/.engraph/` — a single SQLite database (~10MB typical), the GGUF models, and the vault profile.
```bash
cargo test --lib   # 270 unit tests, no network (requires CMake for llama.cpp)
cargo clippy -- -D warnings
cargo fmt --check

# Integration tests (downloads GGUF model)
cargo test --test integration -- --ignored
```

Contributions welcome. Please open an issue first to discuss what you'd like to change.
The codebase is 19 Rust modules behind a lib crate. CLAUDE.md in the repo root has detailed architecture documentation for AI-assisted development.
MIT
