Code Intelligence for AI Agents — Hybrid Python/Rust MCP server with 80% token savings
Miller is a Model Context Protocol (MCP) server that gives AI agents fast, accurate code understanding without reading entire files. Pre-indexed search, symbol navigation, and call tracing return exactly what you need in a lean format that uses 80% fewer tokens than raw file reads.
| Without Miller | With Miller |
|---|---|
| Read 500-line file (2000+ tokens) | get_symbols → 20-line overview (200 tokens) |
grep across codebase (slow, no context) |
fast_search → semantic + text, ranked results |
| Manual reference hunting | fast_refs → all usages in <20ms |
| Reading files to trace calls | trace_call_path → visual call graph |
The key insight: AI agents waste most tokens on file reads. Miller's pre-indexed approach returns just what you need, in a format optimized for AI consumption.
Every tool outputs its own lean format optimized for that tool's data — no verbose JSON by default:
fast_search — grep-style with code context:
10 matches for "authenticate":
src/auth/handler.py:42
41: # Auth logic
42→ def authenticate(user, password):
43: return verify_credentials(user, password)
get_symbols — scannable signature list:
src/auth.py: 5 symbols
authenticate (function) line 42
def authenticate(user: str, password: str) -> bool
AuthMiddleware (class) line 58
class AuthMiddleware
fast_refs — compact reference list:
authenticate: 12 references in 5 files
src/routes/login.py:23 [Call]
src/middleware/auth.py:45 [Call]
tests/test_auth.py:12 [Call]
trace_call_path — ASCII tree visualization:
handleRequest (server.py:50)
├── authenticate (auth.py:42)
│ └── verify_credentials (crypto.py:15)
└── log_request (logger.py:30)
Token savings: 70-90% reduction compared to JSON, depending on the tool.
- 29 Languages: Python, TypeScript, Rust, Go, Java, C#, and more
- Hybrid Search: Tantivy FTS + semantic vectors + cross-encoder re-ranking
- Pre-Indexed: Sub-20ms queries on codebases of any size
- GPU Accelerated: CUDA, MPS (Apple Silicon), DirectML (Windows AMD/Intel)
- Development Memory:
checkpoint/recall/planfor cross-session continuity - Real-time Updates: File watcher keeps index fresh as you code
We use Jina-code-embeddings-0.5b with Matryoshka Representation Learning (MRL).
- Index: 64-dimension
short_vector(IVF-PQ) for lightning-fast candidate retrieval. - Rerank: Full 896-dimension
vectorloaded from disk for high-precision sorting. - Impact: Index size reduced by ~90%, search remains semantically rich.
Reachability, Transitive Closure, and PageRank are computed in Rust using petgraph and rayon.
- Impact Analysis: "What breaks if I change this?" is calculated in microseconds.
- Dead Code: Uses Strongly Connected Components (SCCs) to detect "Dead Islands" (circular dependencies reachable by no one), not just orphans.
We replaced Python's watchdog with Rust's notify crate.
- Behavior: Rust maintains the file hash map and event loop. Python is only notified when content actually changes (hash mismatch).
- Overhead: 0% CPU idle usage on 100k+ file repos.
Semantic search automatically filters based on intent.
- Query: "How is User defined?" -> Filters
kind IN ('Class', 'Struct') - Query: "Where is User used?" -> Filters
kind IN ('Variable', 'Parameter')
Python (MCP + ML) Rust (Parsing)
├── FastMCP Protocol ←───→ ├── Tree-sitter (29 languages)
├── sentence-transformers ├── Symbol extraction
├── SQLite (relations) ├── Call graph building
└── LanceDB (vectors) └── PyO3 zero-copy bridge
pip install miller-corePrerequisites:
- Python 3.12+
- Rust: rustup.rs
- uv: astral.sh/uv (recommended) or pip
All Platforms:
git clone https://github.com/anortham/miller.git
cd miller
uv sync # Install Python dependencies
maturin develop --release # Build Rust extension
uv run pytest python/tests/ -v -o "addopts=" # VerifyWindows Notes:
- Use Python 3.12 for CUDA + DirectML GPU support
- If using PowerShell, run from Developer Command Prompt for Rust builds
- See GPU_SETUP.md for GPU acceleration setup
macOS Notes:
- Apple Silicon (M1/M2/M3) uses MPS for GPU acceleration automatically
- Xcode Command Line Tools required:
xcode-select --install
Linux Notes:
- CUDA support requires PyTorch with CUDA:
pip install torch --index-url https://download.pytorch.org/whl/cu121
After building from source, restart Claude Code, then run:
# From the miller directory (recommended)
cd /path/to/miller
claude mcp add miller -- uv run python -m miller.server
# Or with explicit path (works from anywhere)
claude mcp add miller -- uv run --directory /path/to/miller python -m miller.server
# User-scoped (available in all projects, not just this one)
claude mcp add --scope user miller -- uv run --directory /path/to/miller python -m miller.server
# If installed from PyPI
claude mcp add miller -- python -m miller.serverAfter adding, restart Claude Code to connect. Verify with:
claude mcp listAlternatively, edit the config file directly:
Claude Code (~/.claude/settings.json):
{
"mcpServers": {
"miller": {
"command": "uv",
"args": ["run", "--directory", "/path/to/miller", "python", "-m", "miller.server"]
}
}
}Claude Desktop (config location varies by OS):
- macOS:
~/Library/Application Support/Claude/claude_desktop_config.json - Windows:
%APPDATA%\Claude\claude_desktop_config.json - Linux:
~/.config/Claude/claude_desktop_config.json
{
"mcpServers": {
"miller": {
"command": "python",
"args": ["-m", "miller.server"],
"cwd": "/path/to/your/project"
}
}
}If installed from PyPI, simplify to:
{
"mcpServers": {
"miller": {
"command": "python",
"args": ["-m", "miller.server"]
}
}
}fast_search("authentication logic") # Semantic search
fast_search("IUserService", method="pattern") # Code patternsget_symbols("src/auth.py", mode="structure") # Names only (90% token savings)
get_symbols("src/auth.py", target="login", mode="full") # Just what you needfast_refs("authenticate") # Find ALL usages before refactoringtrace_call_path("handleRequest", direction="downstream") # What does it call?
trace_call_path("saveUser", direction="upstream") # Who calls it?checkpoint("Fixed auth bug - was missing await") # Save progress
recall(query="authentication") # Find past work
plan(action="save", title="Add OAuth") # Track tasks| Format | Description | When to Use |
|---|---|---|
| text (default) | Tool-specific lean format | Most queries — optimized for AI reading |
| json | Structured data | Programmatic processing |
| toon | Tabular notation | Large result sets needing structure |
| tree | ASCII visualization | Call path tracing |
Default is always the leanest option. Override with output_format="json" when you need structured data.
# After Rust changes
maturin develop --release
# Run tests
uv run pytest python/tests/ -v
# Lint
cargo clippy && cargo fmt
uv run ruff check python/miller/This is a TDD project. See CLAUDE.md for development guidelines.
Miller provides advanced capabilities beyond simple search.
| Tool | Purpose | Key Tech |
|---|---|---|
fast_search |
Find code by meaning or pattern | Hybrid Search + Type-Aware Reranking |
fast_refs |
Find all usages (100% accurate) | SQLite Identifiers + Rust Graph |
trace_call_path |
visual call graph (up/down) | Rust GraphProcessor (BFS) |
fast_explore |
Find dead code, hot spots, types | Rust SCC Detection + PageRank |
get_architecture_map |
High-level module dependency graph | Aggregated Imports |
validate_imports |
"Dry run" import checking | SQLite Resolution |
find_similar_implementation |
Deduplication / DRY enforcement | Code-to-Code Embeddings |
export MILLER_USE_MRL=false && export MALLOC_ARENA_MAX=2 && uv run miller-server-http --port 8765- CLAUDE.md — Agent onboarding & development rules
- docs/GPU_SETUP.md — CUDA, MPS, DirectML setup
- docs/TOON.md — TOON format specification
- docs/DISTRIBUTION.md — Build & release
v0.1.0 — Initial release. Production-ready for code intelligence workflows.
- ✅ 29-language parsing (tree-sitter via Rust)
- ✅ Hybrid search (FTS + semantic + re-ranking)
- ✅ Development memory (checkpoint/recall/plan)
- ✅ Real-time file watching
- ✅ GPU acceleration (CUDA, MPS, DirectML)
MIT
Tree-sitter extractors originally developed for Julie, a Rust MCP server.