Local knowledge graph for AI agents. Hybrid search, vault graph, and MCP server for Obsidian vaults — entirely offline.
engraph turns your markdown vault into a searchable knowledge graph that AI agents can query through MCP. It combines semantic embeddings, full-text search, wikilink graph traversal, and LLM-powered reranking into a single local binary. Same model stack as qmd. No API keys, no cloud — everything runs on your machine.
Plain vector search treats your notes as isolated documents. But knowledge isn't flat — your notes link to each other, share tags, reference the same people and projects. engraph understands these connections.
- 4-lane hybrid search — semantic embeddings + BM25 full-text + graph expansion + cross-encoder reranking, fused via Reciprocal Rank Fusion. An LLM orchestrator classifies queries and adapts lane weights per intent.
- MCP server for AI agents — `engraph serve` exposes 13 tools (search, read, context bundles, note creation) that Claude, Cursor, or any MCP client can call directly.
- Real-time sync — file watcher keeps the index fresh as you edit in Obsidian. No manual re-indexing needed.
- Smart write pipeline — AI agents can create notes with automatic tag resolution, wikilink discovery, and folder placement based on semantic similarity.
- Fully local — llama.cpp inference with GGUF models (~300MB required, plus an optional ~1.3GB for intelligence). Metal GPU-accelerated on macOS (88 files indexed in 70s). No API keys, no cloud.
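The fusion step named above can be sketched in a few lines. This is an illustrative Reciprocal Rank Fusion using the conventional k = 60 constant; engraph's actual constant, per-lane weighting, and two-pass structure may differ:

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion: each lane contributes 1 / (k + rank) for every
/// document it returns; documents surfaced by several lanes accumulate score.
fn rrf_fuse(lanes: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for lane in lanes {
        for (i, doc) in lane.iter().enumerate() {
            // ranks are 1-based in the RRF formula
            *scores.entry(doc.to_string()).or_insert(0.0) += 1.0 / (k + (i + 1) as f64);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    fused
}

fn main() {
    let semantic = vec!["auth.md", "api.md", "sarah.md"];
    let bm25 = vec!["api.md", "sarah.md", "auth.md"];
    let fused = rrf_fuse(&[semantic, bm25], 60.0);
    // api.md wins: ranking well in both lanes beats ranking #1 in only one
    assert_eq!(fused[0].0, "api.md");
}
```

The appeal of RRF is that it fuses lanes with incomparable raw scores (cosine similarity, BM25, cross-encoder logits) using only ranks.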
You have hundreds of markdown notes. You want your AI coding assistant to understand what you've written — not just search keywords, but follow the connections between notes, understand context, and write new notes that fit your vault's structure.
Existing options are either cloud-dependent (Notion AI, Mem), limited to keyword search (Obsidian's built-in), or require you to copy-paste context manually. engraph gives AI agents direct, structured access to your entire vault through a standard protocol.
```
Your vault (markdown files)
        │
        ▼
┌─────────────────────────────────────────────┐
│              engraph index                  │
│                                             │
│  Walk → Chunk → Embed (llama.cpp) → Store   │
│                                             │
│  SQLite: files, chunks, FTS5, vectors,      │
│          edges, centroids, tags, LLM cache  │
└─────────────────────────────────────────────┘
        │
        ▼
┌─────────────────────────────────────────────┐
│              engraph serve                  │
│                                             │
│  MCP Server (stdio) + File Watcher          │
│                                             │
│  Search: Orchestrator → 4-lane retrieval    │
│          → Reranker → Two-pass RRF fusion   │
│                                             │
│  13 tools: search, read, list, context,     │
│            who, project, create, append,    │
│            move...                          │
└─────────────────────────────────────────────┘
        │
        ▼
Claude / Cursor / any MCP client
```
- Index — walks your vault, chunks markdown by headings, embeds with a local GGUF model via llama.cpp (Metal GPU on macOS), stores everything in SQLite with FTS5 + sqlite-vec + a wikilink graph
- Search — an orchestrator classifies the query and sets lane weights, then runs up to four lanes (semantic KNN, BM25 keyword, graph expansion, cross-encoder reranking), fused via RRF
- Serve — starts an MCP server that AI agents connect to, with a file watcher that re-indexes changes in real time
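The "chunks markdown by headings" step in the Index stage can be sketched as follows. This is a simplified illustration (it ignores headings inside code fences and does not cap chunk size, which a real chunker likely would):

```rust
/// Split markdown into chunks at heading boundaries, keeping each
/// heading together with the body text that follows it.
fn chunk_by_headings(markdown: &str) -> Vec<String> {
    let mut chunks: Vec<String> = Vec::new();
    let mut current = String::new();
    for line in markdown.lines() {
        // start a new chunk at every heading (simplified: no fence tracking)
        if line.starts_with('#') && !current.trim().is_empty() {
            chunks.push(current.trim_end().to_string());
            current = String::new();
        }
        current.push_str(line);
        current.push('\n');
    }
    if !current.trim().is_empty() {
        chunks.push(current.trim_end().to_string());
    }
    chunks
}

fn main() {
    let doc = "# Auth\nOAuth 2.0 with PKCE.\n\n## Sessions\nHTTP-only cookies.\n";
    let chunks = chunk_by_headings(doc);
    // one chunk per heading section
    assert_eq!(chunks.len(), 2);
}
```

Chunking at heading boundaries keeps each embedded vector aligned with one coherent topic, which is why results can point at a specific section (e.g. `> # Auth Architecture`) rather than a whole file.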
Install:

```bash
# Homebrew (macOS)
brew install devwhodevs/tap/engraph

# Pre-built binaries (macOS arm64, Linux x86_64)
# → https://github.com/devwhodevs/engraph/releases

# From source (requires CMake for llama.cpp)
cargo install --git https://github.com/devwhodevs/engraph
```

Index your vault:

```bash
engraph index ~/path/to/vault
# Downloads embedding model on first run (~300MB)
# Incremental — only re-embeds changed files on subsequent runs
```
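Incremental indexing implies per-file change detection. A common approach is to compare a stored content hash against the current one (a sketch; engraph's actual scheme may use mtimes or a different hash):

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

fn content_hash(content: &str) -> u64 {
    let mut h = DefaultHasher::new();
    content.hash(&mut h);
    h.finish()
}

/// Compare each file's current content hash with the hash stored at the
/// last index run; only mismatches (and new files) need re-embedding.
fn changed_files<'a>(
    vault: &'a HashMap<String, String>, // path -> current content
    stored: &HashMap<String, u64>,      // path -> hash from last run
) -> Vec<&'a str> {
    let mut out: Vec<&'a str> = vault
        .iter()
        .filter(|(path, content)| {
            stored.get(*path).copied() != Some(content_hash(content.as_str()))
        })
        .map(|(path, _)| path.as_str())
        .collect();
    out.sort();
    out
}

fn main() {
    let mut vault = HashMap::new();
    vault.insert("auth.md".to_string(), "# Auth\nedited".to_string());
    vault.insert("api.md".to_string(), "# API".to_string());

    let mut stored = HashMap::new();
    stored.insert("auth.md".to_string(), content_hash("# Auth\noriginal"));
    stored.insert("api.md".to_string(), content_hash("# API"));

    // only auth.md changed, so only auth.md is re-embedded
    assert_eq!(changed_files(&vault, &stored), vec!["auth.md"]);
}
```

Hashing content rather than trusting mtimes alone avoids re-embedding files that were touched but not actually edited.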
Search:

```bash
engraph search "how does the auth system work"

1. [0.04] 02-Areas/Development/Auth-Architecture.md > # Auth Architecture #6e1b70
   OAuth 2.0 with PKCE for all client types. Session tokens stored in HTTP-only cookies...
2. [0.04] 01-Projects/API-Design.md > # API Design #e3e350
   All endpoints require Bearer token authentication. Tokens are issued by the OAuth 2.0...
3. [0.04] 03-Resources/People/Sarah-Chen.md > # Sarah Chen #4adb39
   Senior Backend Engineer. Tech lead for authentication and security systems...
```
Note how result #3 was found via graph expansion — Sarah's note doesn't mention "auth system" directly, but she's linked from the auth architecture doc via [[Sarah Chen]].
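That expansion step can be pictured as adding the 1-hop wikilink neighbors of the top hits to the candidate pool before fusion (an illustration; engraph's edge storage and hop depth may differ):

```rust
use std::collections::{HashMap, HashSet};

/// Add the 1-hop wikilink/mention neighbors of the seed results to the
/// candidate pool, preserving seed order first.
fn expand_graph(seeds: &[&str], edges: &HashMap<&str, Vec<&str>>) -> Vec<String> {
    let mut seen: HashSet<&str> = seeds.iter().copied().collect();
    let mut pool: Vec<String> = seeds.iter().map(|s| s.to_string()).collect();
    for seed in seeds {
        if let Some(neighbors) = edges.get(seed) {
            for &n in neighbors {
                // only add notes not already in the pool
                if seen.insert(n) {
                    pool.push(n.to_string());
                }
            }
        }
    }
    pool
}

fn main() {
    let mut edges = HashMap::new();
    // Auth-Architecture.md contains the wikilink [[Sarah Chen]]
    edges.insert("Auth-Architecture.md", vec!["Sarah-Chen.md"]);
    let pool = expand_graph(&["Auth-Architecture.md", "API-Design.md"], &edges);
    // Sarah's note enters the pool despite never matching the query text
    assert!(pool.contains(&"Sarah-Chen.md".to_string()));
}
```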
Connect to Claude Code:

```bash
# Start the MCP server
engraph serve
```

Or add to Claude Code's settings (`~/.claude/settings.json`):

```json
{
  "mcpServers": {
    "engraph": {
      "command": "engraph",
      "args": ["serve"]
    }
  }
}
```

Now Claude can search your vault, read notes, build context bundles, and create new notes — all through structured tool calls.
Enable intelligence (optional, ~1.3GB download):

```bash
engraph configure --enable-intelligence
# Downloads Qwen3-0.6B (orchestrator) + Qwen3-Reranker (cross-encoder)
# Adds LLM query expansion + 4th reranker lane to search
```

4-lane search with intent classification:
```bash
engraph search "how does authentication work" --explain

1. [0.04] 01-Projects/API-Design.md > # API Design #e3e350
   All endpoints require Bearer token authentication...

Intent: Conceptual

--- Explain ---
01-Projects/API-Design.md
  RRF: 0.0387
  semantic: rank #2, raw 0.38, +0.0194
  rerank:   rank #2, raw 0.01, +0.0194
02-Areas/Development/Auth-Architecture.md
  RRF: 0.0384
  semantic: rank #1, raw 0.51, +0.0197
  rerank:   rank #4, raw 0.00, +0.0187
```
The orchestrator classified the query as Conceptual (boosting semantic lane weight). The reranker scored each result for relevance as the 4th RRF lane.
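The "boosting" can be pictured as scaling each lane's RRF contribution by an intent-specific weight. The intent labels and weight values below are purely illustrative, not engraph's actual tables:

```rust
/// Per-lane weights chosen for a classified query intent.
/// These numbers are made up for illustration only.
fn lane_weights(intent: &str) -> [(&'static str, f64); 4] {
    match intent {
        // conceptual queries lean on the semantic and reranker lanes
        "Conceptual" => [("semantic", 1.5), ("bm25", 0.8), ("graph", 1.0), ("rerank", 1.2)],
        // lookup queries lean on exact keyword matching
        "Lookup" => [("semantic", 0.8), ("bm25", 1.5), ("graph", 0.7), ("rerank", 1.0)],
        _ => [("semantic", 1.0), ("bm25", 1.0), ("graph", 1.0), ("rerank", 1.0)],
    }
}

/// Weighted RRF contribution of one lane hit: w / (k + rank).
fn contribution(weight: f64, rank: usize, k: f64) -> f64 {
    weight / (k + rank as f64)
}

fn main() {
    let w = lane_weights("Conceptual");
    // for a conceptual query, semantic rank #1 outweighs bm25 rank #1
    assert!(contribution(w[0].1, 1, 60.0) > contribution(w[1].1, 1, 60.0));
}
```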
Rich context for AI agents:

```bash
engraph context topic "authentication" --budget 8000
```

Returns a token-budgeted context bundle: relevant notes, connected people, related projects — ready to paste into a prompt or serve via MCP.
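Budgeted bundling can be sketched as greedy packing of ranked excerpts; the whitespace token estimate here is a stand-in for whatever tokenizer the real context engine uses:

```rust
/// Greedily pack ranked note excerpts into a token budget, skipping any
/// excerpt that would overflow and trying smaller ones further down.
fn pack_bundle<'a>(ranked: &[(&'a str, &'a str)], budget: usize) -> Vec<&'a str> {
    let mut used = 0;
    let mut bundle = Vec::new();
    for (path, excerpt) in ranked {
        // crude token estimate: whitespace-separated words
        let cost = excerpt.split_whitespace().count();
        if used + cost > budget {
            continue;
        }
        used += cost;
        bundle.push(*path);
    }
    bundle
}

fn main() {
    let ranked = [
        ("Auth-Architecture.md", "OAuth 2.0 with PKCE for all client types"),
        ("API-Design.md", "All endpoints require Bearer token authentication"),
        ("Sarah-Chen.md", "Tech lead for authentication"),
    ];
    // a 12-"token" budget fits note 1 (8 words) and note 3 (4 words),
    // skipping note 2 (6 words) because it would overflow
    assert_eq!(pack_bundle(&ranked, 12), vec!["Auth-Architecture.md", "Sarah-Chen.md"]);
}
```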
Person context:

```bash
engraph context who "Sarah Chen"
```

Returns Sarah's note, all mentions across the vault, connected notes via wikilinks, and recent activity.
Vault structure overview:

```bash
engraph context vault-map
```

Returns folder counts, top tags, recent files — gives an AI agent orientation before it starts searching.
Create a note via the write pipeline:

```bash
engraph write create --content "# Meeting Notes\n\nDiscussed auth timeline with Sarah." --tags meeting,auth
```

engraph resolves tags against the registry (fuzzy matching), discovers potential wikilinks ([[Sarah Chen]]), suggests the best folder based on semantic similarity to existing notes, and writes atomically.
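Tag resolution can be approximated as normalization plus a fallback match. This sketch uses lowercase-plus-prefix matching; engraph's actual fuzzy matcher may use edit distance or another scheme:

```rust
/// Resolve a requested tag against the vault's tag registry:
/// exact case-insensitive match first, then a prefix match.
fn resolve_tag<'a>(requested: &str, registry: &[&'a str]) -> Option<&'a str> {
    // normalize: lowercase, spaces to hyphens
    let want = requested.to_lowercase().replace(' ', "-");
    registry
        .iter()
        .find(|t| t.to_lowercase() == want)
        .or_else(|| registry.iter().find(|t| t.to_lowercase().starts_with(&want)))
        .copied()
}

fn main() {
    let registry = ["meetings", "authentication", "api-design"];
    // "auth" resolves to the existing "authentication" tag instead of
    // creating a near-duplicate
    assert_eq!(resolve_tag("auth", &registry), Some("authentication"));
    assert_eq!(resolve_tag("Meetings", &registry), Some("meetings"));
    assert_eq!(resolve_tag("unknown", &registry), None);
}
```

The point of resolving against a registry is hygiene: agents reuse the vault's existing tag vocabulary rather than fragmenting it.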
AI-assisted knowledge work — Give Claude or Cursor deep access to your personal knowledge base. Instead of copy-pasting context, the agent searches, reads, and cross-references your notes directly.
Developer second brain — Index architecture docs, decision records, meeting notes, and code snippets. Search by concept across all of them.
Research and writing — Find connections between notes that you didn't explicitly link. The graph lane surfaces related content through shared wikilinks and mentions.
Team knowledge graphs — Index a shared docs vault. AI agents can answer "who knows about X?" and "what decisions were made about Y?" by traversing the note graph.
| | engraph | Basic RAG (vector-only) | Obsidian search |
|---|---|---|---|
| Search method | 4-lane RRF (semantic + BM25 + graph + reranker) | Vector similarity only | Keyword only |
| Query understanding | LLM orchestrator classifies intent, adapts weights | None | None |
| Understands note links | Yes (wikilink graph traversal) | No | Limited (backlinks panel) |
| AI agent access | MCP server (13 tools) | Custom API needed | No |
| Write capability | Create/append/move with smart filing | No | Manual |
| Real-time sync | File watcher, 2s debounce | Manual re-index | N/A |
| Runs locally | Yes, llama.cpp + Metal GPU | Depends | Yes |
| Setup | One binary, one command | Framework + code | Built-in |
engraph is not a replacement for Obsidian — it's the intelligence layer that sits between your vault and your AI tools.
- 4-lane hybrid search (semantic + FTS5 + graph + cross-encoder reranker) with two-pass RRF fusion
- LLM research orchestrator: query intent classification + query expansion + adaptive lane weights
- llama.cpp inference via Rust bindings (GGUF models, Metal GPU on macOS, CUDA on Linux)
- Intelligence opt-in: heuristic fallback when disabled, LLM-powered when enabled
- MCP server with 13 tools (7 read, 6 write) via stdio
- Real-time file watching with 2s debounce and startup reconciliation
- Write pipeline: tag resolution, fuzzy link discovery, semantic folder placement
- Context engine: topic bundles, person bundles, project bundles with token budgets
- Vault graph: bidirectional wikilink + mention edges with multi-hop expansion
- Placement correction learning from user file moves
- Configurable model overrides for multilingual support
- 270 unit tests, CI on macOS + Ubuntu
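The 2s debounce mentioned above means a burst of file events triggers one re-index, fired only after the vault has been quiet for the full window. A minimal sketch of that logic (engraph's watcher implementation may be structured differently):

```rust
use std::time::{Duration, Instant};

/// Coalesce bursts of file events: the re-index fires only once no new
/// event has arrived for the debounce window (2s in engraph).
struct Debouncer {
    window: Duration,
    last_event: Option<Instant>,
}

impl Debouncer {
    fn new(window: Duration) -> Self {
        Self { window, last_event: None }
    }

    /// Record a file event; the quiet-period timer restarts.
    fn on_event(&mut self, now: Instant) {
        self.last_event = Some(now);
    }

    /// Polled periodically: returns true when the quiet period has
    /// elapsed and the pending re-index should run.
    fn should_fire(&mut self, now: Instant) -> bool {
        match self.last_event {
            Some(t) if now.duration_since(t) >= self.window => {
                self.last_event = None; // consume the pending work
                true
            }
            _ => false,
        }
    }
}

fn main() {
    let start = Instant::now();
    let mut d = Debouncer::new(Duration::from_secs(2));
    d.on_event(start);
    d.on_event(start + Duration::from_millis(500)); // burst: timer restarts
    assert!(!d.should_fire(start + Duration::from_secs(2))); // only 1.5s quiet
    assert!(d.should_fire(start + Duration::from_millis(2500))); // 2s quiet reached
    assert!(!d.should_fire(start + Duration::from_secs(3))); // already consumed
}
```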
- Research orchestrator — query classification and adaptive lane weighting (v1.0)
- LLM reranker — optional local model for result quality (v1.0)
- MCP edit/rewrite tools — full note editing for AI agents (v1.1)
- Temporal search — find notes by time period, detect trends (v1.2)
- HTTP/REST API — complement MCP with a standard web API (v1.3)
- Multi-vault — search across multiple vaults (v1.4)
- Vault health monitor — surface orphan notes, broken links, stale content
Optional config at `~/.engraph/config.toml`:

```toml
vault_path = "~/Documents/MyVault"
top_n = 10
exclude = [".obsidian/", "node_modules/", ".git/"]

# Enable LLM-powered intelligence (query expansion + reranking)
intelligence = true

# Override models for multilingual or custom use
[models]
# embed = "hf:Qwen/Qwen3-Embedding-0.6B-GGUF/qwen3-embedding-0.6b-q8_0.gguf"
# rerank = "hf:ggml-org/Qwen3-Reranker-0.6B-Q8_0-GGUF/qwen3-reranker-0.6b-q8_0.gguf"
```

All data is stored in `~/.engraph/` — a single SQLite database (~10MB typical), the GGUF models, and the vault profile.
```bash
cargo test --lib   # 270 unit tests, no network (requires CMake for llama.cpp)
cargo clippy -- -D warnings
cargo fmt --check

# Integration tests (downloads GGUF model)
cargo test --test integration -- --ignored
```

Contributions welcome. Please open an issue first to discuss what you'd like to change.
The codebase is 19 Rust modules behind a lib crate. CLAUDE.md in the repo root has detailed architecture documentation for AI-assisted development.
MIT
