Lexa

Local-first hybrid retrieval for your Obsidian vault and code. A single static Rust binary plus an MCP server, so Codex / Claude Desktop / Cursor / Claude Code can answer questions from your notes without anything leaving your machine.


Quick start

curl -fsSL https://raw.githubusercontent.com/rishiskhare/lexa/main/scripts/install.sh | sh
lexa-obsidian setup
# restart Codex / Claude Desktop / Cursor, then ask:
#   > what did I write about <topic>?

That's it. setup is interactive: it detects your vault, optionally pre-indexes it, writes the MCP config block into ~/.codex/config.toml (and into the Claude Desktop / Claude Code configs if you opt in), and drops an AGENTS.md in your vault root so agents route note questions through Lexa without the "Use lexa-obsidian." prefix.


Demo

> what did I write about the rate limiter redis fallback last quarter?
[
  {
    "path": "Daily/2026-04-30.md",
    "title": "Daily 2026-04-30",
    "score": 0.7141,
    "heading": "Followups",
    "excerpt": "redis down, switching to in-memory backoff in `acquire`. The reranker latency was acceptable at p95 of 261 ms.",
    "line_start": 12,
    "line_end": 18,
    "tags": ["daily", "project/lexa"],
    "breakdown": { "routed_to": "fast" }
  }
]

Want to try it without your own vault? Point Lexa at the bundled demo-vault/:

lexa-obsidian --vault ./demo-vault setup --no-codex --no-agents-md
lexa-obsidian --vault ./demo-vault --hash-embeddings search "reranker latency"

Features

| Feature | What it gives you |
| --- | --- |
| Hybrid retrieval | BM25 (FTS5) + binary-quantized 768-d Matryoshka KNN (sqlite-vec) + cross-encoder rerank, fused with RRF (k=60). |
| Five tiers | instant / dense / fast / deep / auto, mirroring Exa's API. p50 ~9 ms on the fast tier with real Nomic v1.5-Q. |
| Auto-routing | Single-identifier queries → instant (BM25); long question-form → deep; default → fast. No "Use lexa-obsidian." needed. |
| Obsidian-native | Frontmatter stripped before embedding. Wiki-links, inline #tags, ^block-ids, and ![[embeds]] parsed into sidecar tables. |
| MCP-first | Eight tools: search_notes, find_backlinks, list_tags, get_note, get_similar, index_vault, purge_vault, vault_status. |
| Background indexing | The MCP server indexes in the background; content calls return {indexing: true, ...} while in-flight, so Codex never hangs. |
| Live re-indexing | Built-in notify-debouncer-mini watcher: notes you add, edit, or delete in Obsidian appear in (or vanish from) search within ~500 ms. Idempotent on every event. |
| One static binary | No daemon, no Python, no Docker. SQLite + sqlite-vec is the entire backend. |
| 100% local | First run downloads Nomic + BGE ONNX (~390 MB). After that, zero network calls. No telemetry, no API keys. |
| --offline flag | Sets HF_HUB_OFFLINE=1 so fastembed refuses every network fetch: a hard offline guarantee after models prefetch. |
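
To make the auto-routing rule concrete, here is a minimal sketch of a router that follows the three cases listed above. It is not Lexa's actual classify_query: the names Tier and route_query_sketch, the identifier character set, the question-word list, and the 8-word cutoff for "long" are all assumptions chosen for illustration.

```rust
/// Tiers this sketch can route to (hypothetical enum, not Lexa's own type).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Tier {
    Instant,
    Fast,
    Deep,
}

/// Assumed heuristic: a single token made only of identifier-like characters.
fn looks_like_identifier(query: &str) -> bool {
    !query.is_empty()
        && query
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || matches!(c, '_' | '-' | ':' | '.'))
}

/// Assumed heuristic: question form (trailing `?` or a leading question word)
/// plus at least MIN_QUESTION_WORDS whitespace-separated words.
fn looks_like_long_question(query: &str) -> bool {
    const MIN_QUESTION_WORDS: usize = 8; // assumed cutoff
    const QUESTION_WORDS: [&str; 7] = ["what", "why", "how", "when", "where", "who", "which"];
    let words: Vec<&str> = query.split_whitespace().collect();
    let first = words.first().map(|w| w.to_lowercase()).unwrap_or_default();
    let question_form =
        query.trim_end().ends_with('?') || QUESTION_WORDS.iter().any(|w| *w == first);
    question_form && words.len() >= MIN_QUESTION_WORDS
}

/// Single identifier -> Instant, long question -> Deep, anything else -> Fast.
pub fn route_query_sketch(query: &str) -> Tier {
    let q = query.trim();
    if looks_like_identifier(q) {
        Tier::Instant
    } else if looks_like_long_question(q) {
        Tier::Deep
    } else {
        Tier::Fast
    }
}
```

Under these assumed rules, a bare symbol such as acquire goes to the BM25-only tier, the demo question about the rate limiter goes to deep, and a short phrase such as "reranker latency" stays on fast.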

Use cases

| Want to... | Do this |
| --- | --- |
| Ask Codex / Claude / Cursor about your vault | lexa-obsidian setup and ask in plain English. |
| Search a code repo from the CLI | lexa index ~/repo && lexa search "rate limiter fallback". |
| Wire MCP search into a custom agent | Use lexa-obsidian-mcp (rmcp stdio) or lexa-mcp (file-shaped). |
| Embed retrieval into your own Rust app | lexa_obsidian::LexaObsidianDb::open(...); see the crate docs. |
| Try it without your own data | lexa-obsidian --vault ./demo-vault setup --no-codex --no-agents-md. |

Subcommands

lexa-obsidian setup            # interactive bootstrap (most users only need this)
lexa-obsidian doctor           # diagnose every common failure mode
lexa-obsidian models prefetch  # download retrieval models (~390 MB) ahead of time
lexa-obsidian --vault <path> index
lexa-obsidian --vault <path> status
lexa-obsidian --vault <path> tags [--prefix X] [--limit N]
lexa-obsidian --vault <path> backlinks <note>
lexa-obsidian --vault <path> search <query> [--tier auto|fast|deep] [--tag X] [--folder Y] [--json]
lexa-obsidian --vault <path> watch

If --vault is omitted, Lexa falls back to the LEXA_OBSIDIAN_VAULT environment variable. The DB defaults to ~/.lexa/obsidian-<sha-of-vault>.sqlite, so two distinct vaults never share an index.
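
The flag-then-environment fallback can be sketched as a small pure function. This is an illustration only, not Lexa's code: resolve_vault_sketch and non_empty are hypothetical names, and treating an empty or whitespace-only value as "not set" is an assumption.

```rust
use std::path::PathBuf;

/// Treat empty or whitespace-only values as absent (assumed behaviour).
fn non_empty(value: Option<&str>) -> Option<&str> {
    value.map(str::trim).filter(|s| !s.is_empty())
}

/// An explicit `--vault` value wins; otherwise use the value read from
/// `LEXA_OBSIDIAN_VAULT`. Both arrive as plain options so the function
/// has no hidden dependency on the process environment.
pub fn resolve_vault_sketch(flag: Option<&str>, env_value: Option<&str>) -> Option<PathBuf> {
    non_empty(flag)
        .or_else(|| non_empty(env_value))
        .map(PathBuf::from)
}
```

A caller would pass the parsed flag together with std::env::var("LEXA_OBSIDIAN_VAULT").ok().as_deref(), and report an error when the result is None.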


Other ways to install

# Cargo (any Rust target, including Linux ARM64):
cargo install lexa-obsidian

# Homebrew (macOS) — coming soon, file an issue if you want it sooner.

# Manual: grab a tarball from
# https://github.com/rishiskhare/lexa/releases/latest

Project layout

lexa/
├── crates/
│   ├── lexa-core/        # Library: chunking, embedding, hybrid retrieval, fusion.
│   ├── lexa-cli/         # `lexa` CLI for any file tree (code, docs).
│   ├── lexa-mcp/         # `lexa-mcp` rmcp stdio server (file-shaped tools).
│   ├── lexa-obsidian/    # `lexa-obsidian` CLI + `lexa-obsidian-mcp` server.
│   └── lexa-bench/       # Five reproducible benchmark harnesses.
├── docs/
│   ├── ARCHITECTURE.md   # Crate map, schema, retrieval pipeline.
│   ├── BENCHMARKS.md     # Latency / BEIR / agent / SimpleQA numbers.
│   ├── FAQ.md            # First-run latency, vault switching, uninstall, privacy.
│   ├── THREAT_MODEL.md   # Read-only-on-vault guarantee, network footprint.
│   └── adr/              # Six one-page architecture decision records.
├── bench/                # Hand-curated query sets for the agent + SimpleQA harnesses.
├── bench-results/        # Committed JSON artifacts so every published number is reproducible.
├── demo-vault/           # 6-note synthetic vault — try Lexa without your own data.
├── templates/AGENTS.md   # Dropped into your vault root by `setup`.
└── scripts/install.sh    # `curl | sh` installer.

Documentation

  • Architecture (docs/ARCHITECTURE.md)
  • Benchmarks (docs/BENCHMARKS.md): real numbers with hardware, exact corpus URL, and the command that produces them.
  • FAQ (docs/FAQ.md): first-run delay, vault switching, uninstall, model swap, privacy verification with tcpdump.
  • Threat model (docs/THREAT_MODEL.md): what Lexa does, what it does not do, and how to verify each claim.
  • Decision log (docs/adr/): six ADRs covering name, storage, embeddings, chunking, search tiers, MCP posture, Obsidian adapter.

How Lexa maps to Exa

Lexa is local-first, but the architecture follows Exa's: tiered search, hybrid retrieval, Matryoshka prefixes, binary quantization, query-aware highlights.

| Exa concept | Lexa equivalent |
| --- | --- |
| Instant tier (<200 ms, BM25) | --tier instant: FTS5 BM25 only, p50 ~250 µs. |
| Fast tier (neural / hybrid) | --tier dense (KNN-only) or --tier fast (hybrid + RRF), p50 ~9 ms. |
| Auto tier | --tier auto: query router in classify_query. Default. |
| Deep tier (agentic) | --tier deep + SearchOptions::additional_queries for additionalQueries-style fan-out. |
| Hybrid retrieval (BM25 + dense) | RRF (k=60) over FTS5 BM25 and binary-quantized vector KNN, run concurrently. See Exa: Composing a Search Engine. |
| Matryoshka prefix | Nomic v1.5-Q (768d, MRL-trained at {64, 128, 256, 512, 768}); vectors_bin_preview bit[256] table. See Exa 2.0. |
| Binary quantization | sqlite-vec's vec_quantize_binary() and bit[N] columns; SIMD Hamming distance. |
| Cross-encoder reranking | BAAI/bge-reranker-base over top-15 fused candidates, sigmoid-blended at α = 0.7 with the RRF score. |
| Highlights / contents API | search.rs::highlight: query-token-overlap-scored sentence span. |
| LLM-as-judge eval | lexa-bench simpleqa (Harness E). Five-dim rubric. Default judge qwen3:8b via local Ollama. |
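
For readers unfamiliar with the fusion step, the sketch below shows standard Reciprocal Rank Fusion with k = 60 followed by one possible sigmoid blend. It is not taken from Lexa's source. The function names rrf_fuse and blend_sketch are hypothetical, and the blend formula (α weighting the sigmoid of the reranker logit, 1 − α weighting the RRF score) is an assumption, since the table only says the two are blended at α = 0.7.

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion: every ranked list contributes 1 / (k + rank) for
/// each document it contains, using 1-based ranks. A document missing from a
/// list simply receives no contribution from that list.
pub fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<&str, f64> = HashMap::new();
    for ranking in rankings {
        for (idx, doc) in ranking.iter().enumerate() {
            *scores.entry(*doc).or_insert(0.0) += 1.0 / (k + (idx as f64 + 1.0));
        }
    }
    let mut fused: Vec<(String, f64)> = scores
        .into_iter()
        .map(|(doc, score)| (doc.to_string(), score))
        .collect();
    // Highest score first; ties are broken by id so the order is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}

/// Assumed blend: alpha * sigmoid(rerank_logit) + (1 - alpha) * rrf_score.
/// Which term alpha applies to is a guess, not a documented fact.
pub fn blend_sketch(rerank_logit: f64, rrf_score: f64, alpha: f64) -> f64 {
    let sigmoid = 1.0 / (1.0 + (-rerank_logit).exp());
    alpha * sigmoid + (1.0 - alpha) * rrf_score
}
```

With a BM25 ranking of [a, b, c] and a KNN ranking of [b, c, d], document b ends up first because it is the only one that sits near the top of both lists, which is the behaviour RRF is chosen for: it rewards agreement between retrievers without needing their raw scores to be comparable.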

Privacy

Lexa runs entirely locally. The only outbound network call is the first-run download of two ONNX models (Nomic v1.5-Q ~110 MB, BGE reranker ~280 MB) from Hugging Face. After that — zero network. No telemetry, no analytics, no API keys.

For a hard offline guarantee, run lexa-obsidian models prefetch once on a connected machine, then use --offline (or LEXA_OFFLINE=1) to make fastembed refuse every network call. See docs/THREAT_MODEL.md for the verification recipe (tcpdump-style proof) and the full posture.


Development

git clone https://github.com/rishiskhare/lexa
cd lexa
cargo build --workspace --release
cargo test --workspace --release
cargo clippy --workspace --all-targets --release -- -D warnings
cargo fmt --all -- --check

The workspace has 47 tests across 11 suites and no unsafe code outside the SQLite extension loader. All hot paths are covered by Criterion benches in crates/lexa-bench/benches/.


Contributing

Issues and PRs welcome. The decision log under docs/adr/ is the best place to start if you want to understand the design before changing it. For larger changes, open an issue first so the architectural fit can be discussed.


License

Dual-licensed under either of:

  • Apache License, Version 2.0 (LICENSE-APACHE)
  • MIT license (LICENSE-MIT)

at your option.
