Hybrid code search for your coding agent — one offline binary, no model, no database.
A single Rust binary that runs as a Model Context Protocol server, giving a coding agent fast, fully-offline hybrid search over any local repository — no embedding model to download, no external vector or graph database. It installs in seconds, runs air-gapped in a few megabytes of RAM, and keeps its entire state in one SQLite file.
# Your agent calls the search_code MCP tool:
search_code(path=".", query="where does the runtime block on a future?")
# → top hit — a chunk WITH its structure, not just a line number:
{
"file": "src/runtime/handle.rs",
"start_line": 241, "end_line": 341,
"kind": "method",
"signature": "block_on<F: Future>(&self, future: F) -> F::Output",
"snippet": "/// Runs a future to completion on this Handle's associated Runtime...",
"imports": [{ "source": "crate::runtime::task::JoinHandle", "line": 17 }],
"exports": []
}

Structure (signatures, imports/exports, and `struct`/`enum`/`class`/`type` symbols) is extracted for Rust, TypeScript, Python, and Go. Any other language is still searchable, indexed as overlapping text windows.
Note
A hash, not a model. The dominant code-intelligence tools are heavy: Node plus native bindings, a C/C++ toolchain for some grammars, an embedded graph or vector database, and a learned embedding model that downloads on first run. Their strength is deep understanding; their cost is that they are anything but lightweight.
apohara-codesearch takes the other side of that trade. The embedding is a deterministic blake3 feature-hash, not a learned model — so there is nothing to download, nothing to serve, and the same input always produces the same vector. That makes semantic recall weaker than a model-based tool; we compensate with hybrid retrieval (lexical + vector, fused) rather than pretending the hash is semantic. It is a Claude Code MCP plugin, and works with any MCP client.
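As a rough illustration of what a deterministic feature-hash embedding means, the sketch below hashes each token into a fixed-size bucket vector and L2-normalizes the result. It is not the project's implementation: `fnv1a` stands in for blake3 so the example needs only the standard library, `DIMS`, `embed`, and `cosine` are hypothetical names, and whitespace tokenization with unsigned bucket counts is a simplification.

```rust
/// Number of hash buckets. Hypothetical: the real vector dimension is not stated here.
const DIMS: usize = 256;

/// FNV-1a (64-bit), used only as a std-only stand-in for blake3.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for &b in bytes {
        hash ^= b as u64;
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

/// Hash every whitespace-separated token into a bucket, count hits,
/// then L2-normalize. No model and no randomness are involved.
fn embed(text: &str) -> Vec<f32> {
    let mut v = vec![0.0f32; DIMS];
    for token in text.split_whitespace() {
        let bucket = (fnv1a(token.to_lowercase().as_bytes()) % DIMS as u64) as usize;
        v[bucket] += 1.0;
    }
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
    v
}

/// Dot product; equals cosine similarity because `embed` normalizes its output.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

fn main() {
    // Same input, same vector: the property the README relies on.
    assert_eq!(embed("block on future"), embed("block on future"));

    // Similarity comes from shared tokens, not from meaning.
    let shared = cosine(&embed("parse string"), &embed("parse number"));
    println!("similarity with one shared token: {shared:.3}");
}
```

Because similarity here can only come from tokens landing in the same bucket, a query that shares no tokens with its target has nothing to match on apart from accidental collisions, which is why the vector side is paired with BM25 rather than used alone.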
| 🔌 MCP stdio server | Two tools — search_code (hybrid) and reindex (incremental) — over plain JSON-RPC. Works with Claude Code or any MCP client. |
| 🦀 One static binary | No Node, no native bindings, no toolchain, no service. cargo install or npx, then run. The only state is one SQLite file. |
| 🧠 Hybrid ranking | BM25 (SQLite FTS5) + a feature-hash vector (sqlite-vec), merged with Reciprocal Rank Fusion, then MMR-diversified. |
| 🌳 Structural extraction | Per-symbol chunks with signatures + file imports/exports for Rust, TS, Python, Go; everything else indexed as text. |
| 📴 Offline & air-gapped | Zero network at runtime AND at build. No model fetch, no telemetry, no API keys. |
| 🪶 Near-zero footprint | ~22 to 24 MB peak resident memory while indexing 174k- to 224k-LOC repos, flat with repo size (memory-bounded pipeline). |
| ⚡ Incremental + watch | reindex does blake3 content-hash deltas; the watch subcommand keeps the index current as files change (a plain CLI loop, not a plugin hook). |
| 🔁 Deterministic | Same input ⇒ same vector ⇒ byte-stable recall@k/MRR. Re-indexing is stable. |
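The MMR diversification named in the Hybrid ranking row can be sketched as a greedy loop: repeatedly pick the candidate with the best trade-off between its own relevance and its similarity to what has already been picked. This is a generic textbook version under stated assumptions, not the project's code: the `Candidate` struct, the `lambda` weight, and the use of cosine similarity are all illustrative choices.

```rust
/// A candidate hit. All fields are hypothetical, for illustration only.
struct Candidate {
    id: u32,
    relevance: f32,
    vector: Vec<f32>,
}

fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Greedy Maximal Marginal Relevance: returns up to `k` ids.
/// score = lambda * relevance - (1 - lambda) * max similarity to a selected hit.
fn mmr(candidates: &[Candidate], lambda: f32, k: usize) -> Vec<u32> {
    let mut selected: Vec<usize> = Vec::new();
    let mut remaining: Vec<usize> = (0..candidates.len()).collect();

    while selected.len() < k && !remaining.is_empty() {
        let mut best_pos = 0;
        let mut best_score = f32::NEG_INFINITY;
        for (pos, &i) in remaining.iter().enumerate() {
            // Redundancy penalty; 0.0 when nothing has been selected yet.
            let max_sim = selected
                .iter()
                .map(|&s| cosine(&candidates[i].vector, &candidates[s].vector))
                .fold(0.0f32, f32::max);
            let score = lambda * candidates[i].relevance - (1.0 - lambda) * max_sim;
            if score > best_score {
                best_score = score;
                best_pos = pos;
            }
        }
        selected.push(remaining.remove(best_pos));
    }
    selected.into_iter().map(|i| candidates[i].id).collect()
}

fn main() {
    let hits = vec![
        Candidate { id: 1, relevance: 1.0, vector: vec![1.0, 0.0] },
        Candidate { id: 2, relevance: 0.9, vector: vec![1.0, 0.0] }, // near-duplicate of 1
        Candidate { id: 3, relevance: 0.8, vector: vec![0.0, 1.0] },
    ];
    // The near-duplicate is pushed behind the less relevant but distinct hit.
    assert_eq!(mmr(&hits, 0.5, 2), vec![1, 3]);
}
```

For code search, the practical effect is that several overlapping chunks from the same function are less likely to crowd out a distinct, slightly lower-scored hit.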
Register it with your MCP client. For Claude Code, add to .mcp.json:
{ "mcpServers": { "codesearch": { "command": "npx", "args": ["-y", "@apohara/codesearch-mcp"] } } }

The npx wrapper downloads the matching prebuilt binary for your platform on first run. That is the whole install: no model, no database, no daemon.
Other acquisition paths — build from source, run directly, keep the index live
# Build + install from a checkout (lowest-trust path):
cargo install --path crates/apohara-codesearch
# Run the binary directly as a stdio MCP server:
apohara-codesearch serve
# Keep the index current as files change (plain CLI loop, NOT a Claude Code hook):
apohara-codesearch watch <path>

Prebuilt, per-OS binaries are also published on Releases (built by cargo-dist). It also installs as a Claude Code plugin via the apohara marketplace.
[!WARNING] Downloading a prebuilt binary is itself a supply-chain surface. Verify the checksum from the Release, or prefer `cargo install` and build from source.
| Tool | What it does |
|---|---|
| `search_code` | Hybrid BM25 + vector search over a repo path. Lazily indexes on first call. Returns the top-k hits with structural context. |
| `reindex` | Re-index a repo. Incremental by default (blake3 content-hash deltas); `force: true` rebuilds from scratch. |
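A minimal sketch of how content-hash deltas can drive an incremental reindex, assuming the index remembers one hash per file path. Everything here is illustrative: `content_hash` uses FNV-1a as a std-only stand-in for blake3, and `Delta` and `plan_reindex` are hypothetical names rather than the crate's API.

```rust
use std::collections::BTreeMap;

/// Stand-in for a blake3 content hash (FNV-1a, std-only).
fn content_hash(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for &b in bytes {
        hash ^= b as u64;
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

/// What a reindex pass has to do. Hypothetical type.
#[derive(Debug, Default, PartialEq)]
struct Delta {
    changed: Vec<String>, // new or modified files: re-chunk and re-index
    removed: Vec<String>, // files gone from disk: delete their rows
}

/// Compare stored hashes with the files currently on disk.
/// `force = true` treats every current file as changed (full rebuild).
fn plan_reindex(
    stored: &BTreeMap<String, u64>,
    current: &BTreeMap<String, Vec<u8>>,
    force: bool,
) -> Delta {
    let mut delta = Delta::default();
    for (path, bytes) in current {
        let unchanged = stored.get(path) == Some(&content_hash(bytes));
        if force || !unchanged {
            delta.changed.push(path.clone());
        }
    }
    for path in stored.keys() {
        if !current.contains_key(path) {
            delta.removed.push(path.clone());
        }
    }
    delta
}

fn main() {
    let mut stored = BTreeMap::new();
    stored.insert("a.rs".to_string(), content_hash(b"fn a() {}"));
    stored.insert("b.rs".to_string(), content_hash(b"old"));
    stored.insert("c.rs".to_string(), content_hash(b"x"));

    let mut current = BTreeMap::new();
    current.insert("a.rs".to_string(), b"fn a() {}".to_vec()); // untouched
    current.insert("b.rs".to_string(), b"new".to_vec()); // modified
    current.insert("d.rs".to_string(), b"y".to_vec()); // added

    let delta = plan_reindex(&stored, &current, false);
    println!("{delta:?}");
}
```

Hashing content rather than comparing modification times means a file that is touched but not edited costs one hash and no re-chunking.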
Lighter than the graph tools, structure-aware where ripgrep is text-only. It does not match a model-based tool on conceptual recall, and it does not build a call graph — those are deliberately out of scope.
| | apohara-codesearch | graph / embedding tools | ripgrep |
|---|---|---|---|
| Runtime dependencies | one static binary | Node + native bindings + toolchain | one binary |
| Model download | none | hundreds of MB | none |
| External DB / service | none | embedded graph / vector DB | none |
| Offline / air-gapped | ✓ | usually requires a fetch | ✓ |
| Structural context | signatures + imports/exports (4 langs) | call graphs, deep | text only |
| Ranking | hybrid BM25 + vector (RRF) | learned embeddings | exact / regex |
- Walk + chunk. A `.gitignore`-aware walk splits each file into per-symbol chunks (with the symbol's signature attached) plus bounded module-remainder and window chunks, so a giant file never collapses into one diluted chunk.
- Index. Each chunk gets a BM25 lexical row (SQLite FTS5) and a feature-hash vector row (sqlite-vec), keyed on a shared row id. Both sides share one identifier tokenizer, so `parseString` and `parse_string` match each other.
- Search. A query runs through both BM25 and vector k-NN; the two ranked lists are merged with Reciprocal Rank Fusion, diversified with MMR, then the survivors are hydrated with their structural context.
- Stay current. Re-indexing hashes each file and reprocesses only what changed, in a single transaction that keeps the three tables (chunks, FTS5, and sqlite-vec) consistent.
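The fusion in the Search step can be sketched as follows. Reciprocal Rank Fusion scores each document by summing `1 / (k + rank)` over every ranked list it appears in, so it needs only ranks and never has to reconcile BM25 scores with vector distances. The constant `k = 60` is the value commonly used in the literature, assumed here rather than taken from the project, and `rrf` is a hypothetical function name.

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion over any number of ranked id lists (best first).
/// Returns (id, fused score) pairs, highest score first.
fn rrf(lists: &[Vec<u64>], k: f64) -> Vec<(u64, f64)> {
    let mut scores: HashMap<u64, f64> = HashMap::new();
    for list in lists {
        for (i, &id) in list.iter().enumerate() {
            let rank = (i + 1) as f64; // ranks are 1-based
            *scores.entry(id).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(u64, f64)> = scores.into_iter().collect();
    // Sort by score descending; break ties by id so the order is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    fused
}

fn main() {
    let bm25 = vec![1, 2, 3]; // row ids ranked by the lexical side
    let vector = vec![3, 1, 4]; // row ids ranked by vector k-NN
    let fused = rrf(&[bm25, vector], 60.0);
    let ids: Vec<u64> = fused.iter().map(|(id, _)| *id).collect();
    // Ids found by both sides (1 and 3) rise above ids found by only one.
    assert_eq!(ids, vec![1, 3, 2, 4]);
}
```

The shared row id mentioned in the Index step is what makes this cheap: both result lists already refer to the same keys, so fusion is a single pass over two short lists.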
Measured with the default feature-hash embedder on a Ryzen 5 3600 / 48 GB box, driven over the stdio MCP tools:
| Repo | LOC | Cold index | Peak RSS | Warm query | Index on disk |
|---|---|---|---|---|---|
| tokio | 174k Rust | ~10 s | ~22 MB | ~18 ms | 39 MB |
| hugo | 224k Go | ~26 s | ~24 MB | ~22 ms | 54 MB |
Peak resident memory is flat across repo size — no OOM, no external process. One SQLite file is the only state.
Warning
The vector is a robustness layer, not a semantic engine. Because the embedding is a feature-hash, a conceptual query that shares no tokens with the target will not surface it — and on a clean corpus where lexical search already wins, fusion can be a slight net negative. BENCHMARK.md publishes this (synthetic corpus + a one-off external comparison on real OSS, with ≥30% committed known-miss queries) rather than hiding it. Deep structural context (callers/callees, call graphs) is out of scope by design. A real local embedding model is an opt-in, user-supplied build feature — never downloaded — so the default install stays zero-dependency.
See BENCHMARK.md for the method, the reproduce command, and per-mode recall@k / MRR across BM25-only, vector-only, and hybrid.
apohara-codesearch/
├── crates/
│ ├── apohara-indexer/ # the engine (library)
│ │ └── src/
│ │ ├── walker.rs # .gitignore-aware file walk
│ │ ├── parser.rs # tree-sitter structural extraction (Rust/TS/Python/Go)
│ │ ├── chunker.rs # per-symbol + bounded module/window chunks
│ │ ├── tokens.rs # shared snake/camel identifier tokenizer
│ │ ├── embeddings.rs # deterministic blake3 feature-hash vector
│ │ ├── embedder.rs # pluggable Embedder trait (opt-in gguf-embed)
│ │ ├── storage.rs # SQLite: chunks + FTS5 + sqlite-vec
│ │ ├── schema.rs # migrations + embedder refuse-to-mix meta
│ │ ├── search.rs # BM25 + vector + RRF + MMR + structural boost
│ │ └── incremental.rs # blake3-delta reindex in one transaction
│ └── apohara-codesearch/ # the MCP server + CLI
│ ├── src/{main,server,watch,dto}.rs
│ └── examples/ # bench-search (in-CI) · bench-external (one-off)
├── npm/ # @apohara/codesearch-mcp wrapper (downloads the Release binary)
├── .claude-plugin/ + marketplace.json # Claude Code plugin manifest
└── .github/workflows/ # ci.yml (test/clippy/fmt/dist) · release.yml (cargo-dist)
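The shared identifier tokenizer listed as `tokens.rs` is what lets a camelCase query match a snake_case symbol. One possible way to do that, sketched under assumptions: split on non-alphanumeric characters and on lower-to-upper case boundaries, then lowercase. The function name and the exact splitting rules are hypothetical; in particular, this version does not split acronym runs such as `HTTPServer`.

```rust
/// Split an identifier into lowercase word tokens.
/// Boundaries: any non-alphanumeric character, or an uppercase letter
/// that follows a lowercase letter or digit (camelCase hump).
fn tokenize_identifier(ident: &str) -> Vec<String> {
    let mut tokens = Vec::new();
    let mut current = String::new();
    let mut prev_lower_or_digit = false;

    for ch in ident.chars() {
        if !ch.is_alphanumeric() {
            // Separator such as '_' or '-': close the current token.
            if !current.is_empty() {
                tokens.push(std::mem::take(&mut current));
            }
            prev_lower_or_digit = false;
            continue;
        }
        if ch.is_uppercase() && prev_lower_or_digit && !current.is_empty() {
            // camelCase boundary: close the token before the hump.
            tokens.push(std::mem::take(&mut current));
        }
        current.extend(ch.to_lowercase());
        prev_lower_or_digit = ch.is_lowercase() || ch.is_numeric();
    }
    if !current.is_empty() {
        tokens.push(current);
    }
    tokens
}

fn main() {
    // Both spellings reduce to the same token sequence.
    assert_eq!(tokenize_identifier("parseString"), vec!["parse", "string"]);
    assert_eq!(tokenize_identifier("parse_string"), vec!["parse", "string"]);
    println!("{:?}", tokenize_identifier("JoinHandle"));
}
```

Running the same tokenizer on both the BM25 and the vector side matters as much as the rules themselves: if the two sides split identifiers differently, a chunk could match lexically while missing in the vector index, or the reverse.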
- MCP stdio server (`search_code` + `reindex`) + `watch` subcommand
- Structural extraction for Rust, TypeScript, Python, Go
- Hybrid retrieval — BM25 + feature-hash vector, RRF + MMR + structural boost
- Incremental reindex (blake3 content-hash deltas), one SQLite file
- Honest benchmark — synthetic (in-CI) + external real-OSS, with committed known-miss
- Large-OSS soak (Rust + Go ≥100k LOC) — flat ~22 MB peak RSS
- Pluggable `Embedder` trait (opt-in, default stays zero-model)
- Real local embedder backend (candle / safetensors, opt-in, user-supplied)
- Skip generated/minified assets in the walker (DB-bloat hardening)
- Per-language chunk-cap validation (TypeScript / Python)
Contributions are welcome.
- Fork the repository.
- Create a feature branch (`git checkout -b feature/my-change`).
- Make your change and run the suite: `cargo test --workspace` (clippy `-D warnings` + `rustfmt --check` gate CI).
- Open a pull request.
Unless you state otherwise, any contribution you intentionally submit for inclusion in this work, as defined in the Apache-2.0 license, shall be dual-licensed as below, without any additional terms or conditions.
Licensed under either of MIT or Apache-2.0, at your option. See NOTICE for third-party dependency licenses.
Maintained by SuarezPM.