Lightweight MCP server for semantic code search
BM25 + optional dense vectors + tree-sitter chunking
QEX is a high-performance MCP server for semantic code search built in Rust. It combines BM25 full-text search with optional dense vector embeddings for hybrid retrieval — delivering Cursor-quality search from a single ~19 MB binary. Tree-sitter parsing understands code structure (functions, classes, methods), Merkle DAG change detection enables incremental indexing, and everything runs locally with zero cloud dependencies.
- Pluggable Embedding Backends — Trait-based abstraction over ONNX Runtime (local) and OpenAI API embedding providers with env var configuration
- Hybrid Search — BM25 + dense vector search with Reciprocal Rank Fusion for 48% better accuracy than dense-only retrieval
- Support for 10 Languages — Python, JavaScript, TypeScript, Rust, Go, Java, C, C++, C#, and Markdown via tree-sitter
- Incremental Indexing — Merkle DAG change detection, only re-indexes what changed
- Optional Dense Vectors — snowflake-arctic-embed-s (33 MB, 384-dim, INT8 quantized) via ONNX Runtime, or OpenAI text-embedding-3-small via API
- MCP Native — plugs directly into Claude Code as a tool server via stdio
Claude Code uses grep + glob for code search — effective but token-hungry and lacks semantic understanding. Cursor uses vector embeddings with cloud indexing (~3.5 GB stack). QEX is the middle ground:
- BM25 + Dense Hybrid: 48% better accuracy than dense-only retrieval (Superlinked 2025)
- Tree-sitter Chunking: Understands code structure — functions, classes, methods — not just lines
- Incremental Indexing: Merkle DAG change detection, only re-indexes what changed
- Zero Cloud Dependencies: Everything runs locally via ONNX Runtime
- MCP Native: Plugs directly into Claude Code as a tool server
Install from crates.io:

```bash
cargo install qex-mcp

# Add to Claude Code
claude mcp add qex --scope user -- ~/.cargo/bin/qex
```

Build from source:
```bash
# Build (BM25-only, ~19 MB)
cargo build --release

# Or with dense vector search (~36 MB)
cargo build --release --features dense

# Or with OpenAI embedding support
cargo build --release --features openai

# Or with all embedding backends
cargo build --release --features "dense,openai"

# Install
cp target/release/qex ~/.local/bin/

# Add to Claude Code
claude mcp add qex --scope user -- ~/.local/bin/qex
```

That's it. Claude will now have access to the `search_code` and `index_codebase` tools.
Dense search adds semantic understanding — finding "authentication middleware" even when the code says verify_token. Two embedding backends are available:
Requires the `dense` feature flag. Zero cloud dependencies.

```bash
# Download the embedding model (~33 MB)
./scripts/download-model.sh

# Or via MCP tool (after adding to Claude)
# Claude: "download the embedding model"
```

Model: snowflake-arctic-embed-s — 384-dim, INT8 quantized, 512-token max.
When the model is present, search automatically switches to hybrid mode. No configuration needed.
Requires the `openai` feature flag and an API key. Supports any OpenAI-compatible API.

```bash
# Build with OpenAI support (can combine with dense)
cargo build --release --features "dense,openai"

# Configure
export QEX_EMBEDDING_PROVIDER=openai
export QEX_OPENAI_API_KEY=sk-...  # or OPENAI_API_KEY
```

See Configuration for all options.
```
Claude Code ──(stdio/JSON-RPC)──▶ qex
                                   │
                   ┌───────────────┼───────────────┐
                   ▼               ▼               ▼
              tree-sitter       tantivy      ort + usearch
               Chunking          BM25        Dense Vectors
              (11 langs)        (<1ms)        (optional)
                   │               │               │
                   └───────┬───────┘               │
                           ▼                       │
                    Ranking Engine ◄───────────────┘
                  (RRF + multi-factor)
                           │
                           ▼
                    Ranked Results
```
- Query Analysis — Tokenization, stop-word removal, intent detection
- BM25 Search — Full-text search via tantivy with field boosts (name, content, tags, path)
- Dense Search (optional) — Embed query via pluggable backend (ONNX or OpenAI) → HNSW cosine similarity → top-k vectors
- Reciprocal Rank Fusion — Merge BM25 and dense results: score = Σ 1/(k + rank)
- Multi-factor Ranking — Re-rank by chunk type, name match, path relevance, tags, docstring presence
- Test Penalty — Down-rank test files (0.7×) to prioritize implementation code
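The fusion and penalty steps above can be sketched in a few lines. This is an illustrative Python sketch, not QEX's Rust implementation: the constant `k=60` is the conventional RRF default and an assumption here, and only the 0.7× test-file factor comes from the text.

```python
def rrf_scores(bm25_ranked, dense_ranked, k=60):
    """score(d) = Σ 1/(k + rank_d) over both ranked lists (ranks are 1-based)."""
    scores = {}
    for ranked in (bm25_ranked, dense_ranked):
        for position, chunk_id in enumerate(ranked, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + position)
    return scores

def rank(bm25_ranked, dense_ranked, test_files=frozenset(), k=60, penalty=0.7):
    """Fuse the two result lists, then down-rank chunks from test files."""
    scores = rrf_scores(bm25_ranked, dense_ranked, k)
    for chunk_id in scores:
        if chunk_id in test_files:
            scores[chunk_id] *= penalty
    return sorted(scores, key=scores.get, reverse=True)
```

Because RRF only uses ranks, not raw scores, BM25 and cosine-similarity results can be merged without score normalization.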
- File Walking — Respects `.gitignore`, filters by extension
- Tree-sitter Parsing — Language-aware AST traversal, extracts functions/classes/methods
- Chunk Enrichment — Tags (async, auth, database...), complexity score, docstrings, decorators
- BM25 Indexing — 14-field tantivy schema with per-field boosts
- Dense Indexing (optional) — Batch embedding via the `Embedder` trait (ONNX or OpenAI) → HNSW index
- Merkle Snapshot — SHA-256 DAG for incremental change detection
- Dimension Guard — `dense_meta.json` tracks provider/model/dimensions; mismatches trigger full re-index
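The core of the incremental step is comparing content hashes against the previous snapshot. A minimal Python sketch of the idea (QEX's actual Merkle DAG also hashes directories up to a root so unchanged subtrees can be skipped wholesale; the function names here are illustrative):

```python
import hashlib

def snapshot(files):
    """Map each path to a SHA-256 digest of its contents (the DAG's leaf hashes)."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}

def diff(old, new):
    """Compare snapshots: only files whose hashes differ need re-indexing."""
    changed = [p for p in new if p in old and new[p] != old[p]]
    added = [p for p in new if p not in old]
    removed = [p for p in old if p not in new]
    return changed, added, removed
```

On a re-index with no edits, `diff` returns three empty lists and the indexer can return immediately, which is what makes the no-change path fast.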
Index a project for semantic search.
| Parameter | Type | Required | Description |
|---|---|---|---|
| `path` | string | yes | Absolute path to project directory |
| `force` | boolean | no | Force full re-index (default: false) |
| `extensions` | string[] | no | Only index specific extensions, e.g. `["py", "rs"]` |
Returns file count, chunk count, detected languages, and timing.
Search the indexed codebase with natural language or keywords.
| Parameter | Type | Required | Description |
|---|---|---|---|
| `path` | string | yes | Absolute path to project directory |
| `query` | string | yes | Search query (natural language or keywords) |
| `limit` | integer | no | Max results (default: 10) |
| `extension_filter` | string | no | Filter by extension, e.g. `"py"` |
Auto-indexes if needed. Returns ranked results with code snippets, file paths, line numbers, and relevance scores.
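Since QEX speaks MCP over stdio, a client invokes this tool with a standard JSON-RPC `tools/call` message. A sketch of building such a request in Python — the envelope is the standard MCP shape, and the arguments follow the parameter table above:

```python
import json

def search_code_request(path, query, limit=10, request_id=1):
    """Build the JSON-RPC `tools/call` message an MCP client writes to stdin."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "search_code",
            "arguments": {"path": path, "query": query, "limit": limit},
        },
    })
```

In practice Claude Code constructs these messages itself; the sketch only shows what crosses the stdio boundary.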
Check if a project is indexed and get stats.
| Parameter | Type | Required | Description |
|---|---|---|---|
| `path` | string | yes | Absolute path to project directory |
Returns index status, file/chunk counts, languages, and whether dense search is available.
Delete all index data for a project.
| Parameter | Type | Required | Description |
|---|---|---|---|
| `path` | string | yes | Absolute path to project directory |
Download the embedding model for dense search. Requires the `dense` feature.
| Parameter | Type | Required | Description |
|---|---|---|---|
| `force` | boolean | no | Re-download even if exists (default: false) |
| Language | Extensions | Chunk Types |
|---|---|---|
| Python | `.py`, `.pyi` | function, method, class, module-level, imports |
| JavaScript | `.js` | function, method, class, module-level |
| TypeScript | `.ts`, `.tsx` | function, method, class, interface, module-level |
| Rust | `.rs` | function, method, struct, enum, trait, impl, macro |
| Go | `.go` | function, method, struct, interface |
| Java | `.java` | method, class, interface, enum |
| C | `.c`, `.h` | function, struct |
| C++ | `.cpp`, `.cc`, `.cxx`, `.hpp` | function, method, class, struct, namespace |
| C# | `.cs` | method, class, struct, interface, enum, namespace |
| Markdown | `.md` | section, document |
| Crate | Description |
|---|---|
| `qex-core` | Core library: chunking, search, indexing, Merkle DAG |
| `qex-mcp` | MCP server binary (stdio transport via rmcp) |
```
qex/
├── Cargo.toml                     # Workspace root
├── scripts/
│   └── download-model.sh          # Model download script
├── crates/
│   ├── qex-core/                  # Core library
│   │   └── src/
│   │       ├── lib.rs
│   │       ├── chunk/             # Tree-sitter chunking engine
│   │       │   ├── tree_sitter.rs     # AST traversal
│   │       │   ├── multi_language.rs  # Language dispatcher
│   │       │   └── languages/         # 11 language implementations
│   │       ├── search/            # Search engines
│   │       │   ├── bm25.rs            # Tantivy BM25 index
│   │       │   ├── dense.rs           # HNSW vector index (feature: dense)
│   │       │   ├── embedding.rs       # Embedder trait + ONNX backend (feature: dense|openai)
│   │       │   ├── openai_embedder.rs # OpenAI API backend (feature: openai)
│   │       │   ├── hybrid.rs          # Reciprocal Rank Fusion (feature: dense)
│   │       │   ├── ranking.rs         # Multi-factor re-ranking
│   │       │   └── query.rs           # Query analysis
│   │       ├── index/             # Incremental indexer
│   │       │   ├── mod.rs             # Main indexing logic
│   │       │   └── storage.rs         # Project storage layout
│   │       ├── merkle/            # Change detection
│   │       │   ├── mod.rs             # Merkle DAG
│   │       │   ├── change_detector.rs
│   │       │   └── snapshot.rs
│   │       └── ignore.rs          # Gitignore-aware file walking
│   │
│   └── qex-mcp/                   # MCP server binary
│       └── src/
│           ├── main.rs            # Entry point, stdio transport
│           ├── server.rs          # Tool handlers
│           ├── tools.rs           # Parameter schemas
│           └── config.rs          # CLI args
│
└── tests/fixtures/                # Test source files
```
All data is stored locally under `~/.qex/`:

```
~/.qex/
├── projects/
│   └── {name}_{hash}/             # Per-project index
│       ├── tantivy/               # BM25 index
│       ├── dense/                 # Vector index (optional)
│       │   ├── dense.usearch      # HNSW index file
│       │   ├── dense_mapping.json # Chunk ID ↔ vector key mapping
│       │   └── dense_meta.json    # Provider/model/dimensions guard
│       ├── snapshot.json          # Merkle DAG
│       └── stats.json             # Index stats
│
└── models/
    └── arctic-embed-s/            # Embedding model (optional)
        ├── model.onnx             # 33 MB, INT8 quantized
        └── tokenizer.json
```
QEX uses a pluggable `Embedder` trait to support multiple embedding providers. The backend is selected via the `QEX_EMBEDDING_PROVIDER` environment variable.
Local inference with zero cloud dependencies. Requires the `dense` feature flag.
| Variable | Default | Description |
|---|---|---|
| `QEX_EMBEDDING_PROVIDER` | `onnx` | Set to `onnx` (or omit) |
| `QEX_ONNX_MODEL_DIR` | `~/.qex/models/arctic-embed-s` | Override model directory |
Cloud-based embeddings via the OpenAI API (or any compatible API such as Ollama, LiteLLM, or Azure). Requires the `openai` feature flag.
| Variable | Default | Description |
|---|---|---|
| `QEX_EMBEDDING_PROVIDER` | — | Set to `openai` |
| `QEX_OPENAI_API_KEY` | — | API key (also reads `OPENAI_API_KEY`) |
| `QEX_OPENAI_MODEL` | `text-embedding-3-small` | Model name |
| `QEX_OPENAI_BASE_URL` | `https://api.openai.com/v1` | API base URL |
| `QEX_OPENAI_DIMENSIONS` | auto | Override dimensions for unknown models |
Security features:
- SSRF protection: only HTTPS or `http://localhost` URLs allowed for the base URL
- API key sanitization: keys are never leaked in error messages
- Typed retry: exponential backoff (1s, 2s, 4s) on 429/5xx/timeout/connection errors
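The retry behavior described above can be sketched as follows — an illustrative Python version with an injectable `sleep` so the backoff schedule is visible; the function and exception names are placeholders, not QEX's Rust API:

```python
import time

class RetryableError(Exception):
    """Stands in for HTTP 429/5xx, timeout, or connection errors."""

def with_retry(call, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Retry `call` on retryable failures with exponential backoff: 1s, 2s, 4s."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RetryableError:
            if attempt == max_attempts - 1:
                raise  # budget exhausted: surface the error to the caller
            sleep(base_delay * (2 ** attempt))
```

Keying the decision on a typed error (rather than retrying everything) means non-retryable failures such as a 401 from a bad API key fail fast instead of burning the backoff budget.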
Compatible APIs: Any OpenAI-compatible embeddings endpoint works. Set `QEX_OPENAI_BASE_URL` to your provider's URL:

```bash
# Ollama
export QEX_OPENAI_BASE_URL=http://localhost:11434/v1

# Azure OpenAI
export QEX_OPENAI_BASE_URL=https://your-resource.openai.azure.com/openai/deployments/your-model
```

When switching embedding providers or models, QEX detects the mismatch via `dense_meta.json` and automatically triggers a full re-index. This prevents silent search-quality degradation from mismatched vector spaces.
```bash
# Run tests (BM25-only)
cargo test                                       # 41 tests

# Run tests (with dense search)
cargo test --features dense                      # 48 tests

# Run tests (with OpenAI embedder)
cargo test --features openai                     # 50 tests

# Run tests (all features)
cargo test --features "dense,openai"             # 55 tests

# Build for release
cargo build --release                            # ~19 MB binary
cargo build --release --features dense           # ~36 MB binary
cargo build --release --features "dense,openai"  # All backends
```

| Crate | Version | Purpose |
|---|---|---|
| tantivy | 0.22 | BM25 full-text search |
| tree-sitter | 0.24 | Code parsing (11 languages) |
| rmcp | 0.17 | MCP server framework (stdio) |
| rusqlite | 0.32 | SQLite metadata (bundled) |
| ignore | 0.4 | Gitignore-compatible file walking |
| rayon | 1.10 | Parallel chunking |
| ort | 2.0.0-rc.11 | ONNX Runtime (optional, dense) |
| usearch | 2.24 | HNSW vector index (optional, dense) |
| tokenizers | 0.22 | HuggingFace tokenizer (optional, dense) |
| ureq | 3 | Sync HTTP client (optional, openai) |
Benchmarked on an Apple Silicon Mac:
| Metric | Value |
|---|---|
| Full index (400 chunks) | ~20s with dense, ~2s BM25-only |
| Incremental index (no changes) | <100ms |
| BM25 search | <5ms |
| Hybrid search | ~50ms (includes embedding) |
| Binary size | 19 MB (BM25) / 36 MB (dense) |
| Model size | 33 MB (INT8 quantized) |