
Argyph

The local-first MCP server for serious codebases. Zero config, zero cloud, full context.


Argyph is a single MCP server that gives any AI coding agent fast, structured, and semantic context over a codebase. It runs entirely on your machine, indexes incrementally, and is ready in under a second on previously-indexed repos.

The name is a portmanteau of Argus (the hundred-eyed watcher of Greek myth) and Glyph (a carved symbol with bound meaning) — a server that watches a codebase and gives the agent its symbols.


What it does

Argyph replaces the half-dozen MCP servers most developers stitch together (one for grep, one for embeddings, one for symbol search, one for repo packing) with a single tool. It exposes four pillars of context behind one MCP endpoint:

  1. Ask-first retrieval — ask is the primary agent-facing lookup tool. It routes bare identifiers to symbol search, structured locators to locate, and natural-language questions to hybrid search, returning bounded Span results instead of whole files (sketched below).
  2. File and symbol intelligence — a tree-sitter-driven symbol graph with find_definition, find_references, callers, callees, imports, and outline tools. Structural queries return in milliseconds.
  3. Semantic search — hybrid (BM25 + vector) search over AST-aware chunks, backed by an embedded LanceDB store. A bundled local embedding model means no API key is required for full functionality.
  4. Repo packing — token-budgeted, repomix-style flattening of a repo or subset for agents that need to absorb a codebase quickly.

Everything is read-only. Argyph never edits, commits, or executes code.
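
For orientation, here is roughly what an agent-side ask call looks like over MCP. The tools/call envelope is standard MCP; the argument name and routing shown are a sketch, and the authoritative schema is in docs/tools-reference.md:

{
  "method": "tools/call",
  "params": {
    "name": "ask",
    "arguments": {
      "query": "where is session expiration controlled?"
    }
  }
}

A natural-language query like this one is routed to hybrid search; a bare identifier such as parseConfig would be routed to symbol search instead.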


Why local-first

Most existing context servers depend on a cloud vector database (Milvus, Pinecone) and a remote embedding API. That's a non-starter for proprietary code at most companies, and a tax on cold starts everywhere else. Argyph runs entirely on the developer's machine: a single binary, an embedded vector store, an optional bundled embedding model, no daemon, no account, no key required to get full functionality.


Install

Current release: v1.0.1 — published on npm, crates.io, the Homebrew tap, and as prebuilt GitHub-release binaries.

Claude Code

claude mcp add argyph -- npx argyph@latest

npm / npx

npx argyph

Homebrew (macOS / Linux)

brew install Ezzy1630/argyph/argyph

Universal installer

curl -fsSL https://raw.githubusercontent.com/Ezzy1630/argyph/main/scripts/install.sh | bash

Intel Mac (x86_64-apple-darwin) users: the ONNX Runtime backend doesn't ship a prebuilt for Intel macOS, so no prebuilt binary is produced for that target. Use the Cargo path below — it builds in a couple of minutes and works fine on Intel Macs.

Cargo

cargo install argyph builds Argyph from source. Requires a Rust toolchain ≥ 1.88 (rustup update). The bundled local embedder uses ort (ONNX Runtime); on most platforms ort ships a prebuilt dynamic library and there is nothing to do. On Linux you may need libssl-dev and a working C toolchain; on Windows the MSVC build tools are required. If the build fails on ort-sys linkage, set ORT_STRATEGY=download before re-running.

cargo install argyph --locked

Build from source

git clone https://github.com/Ezzy1630/argyph.git
cd argyph
cargo build --release
./target/release/argyph serve

Claude Desktop (DXT)

Download argyph.dxt from the latest release and double-click.


Quick start

# In any repo
cd ~/code/your-repo
argyph init
claude mcp add argyph -- npx argyph@latest
claude

In the chat:

What does this codebase do, and where is session expiration controlled?

Argyph indexes Tier 0 in under a second on first run, Tier 1 (symbol graph) in seconds, and Tier 2 (embeddings) in the background. You can query immediately — tools return what's available now plus an index_coverage field so the agent knows how complete the results are.
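
A sketch of the idea: a tool result carrying an index_coverage field alongside its payload (field names and states here are illustrative; the real schema is in docs/tools-reference.md):

{
  "index_coverage": {
    "tier0": "ready",
    "tier1": "building",
    "tier2": "pending"
  }
}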


How it works


Argyph builds the index in three tiers, each useful before the next completes:

| Tier | What it builds | Time on a 1M-LOC repo | Useful for |
|------|----------------|-----------------------|------------|
| 0 | File inventory, hashes, .gitignore-aware tree | <1 s | Tree views, ripgrep, packing |
| 1 | Symbol graph (defs, refs, calls, imports) | ~30 s | Go-to-def, find-references, call graphs |
| 1.5 | Structural index (non-code: md, json, yaml…) | seconds (after Tier 1) | locate for structured non-code files |
| 2 | Embeddings + hybrid index | minutes (background) | Fuzzy semantic queries |

The 80/20 insight: most agent queries are structural (where is parseConfig defined?, what calls validateUser?) and don't need embeddings. Argyph serves those from Tier 1 in milliseconds, even while Tier 2 is still building.

After the first index, the on-disk .argyph/ directory persists and only changed files are reprocessed on subsequent runs. A filesystem watcher keeps everything live.
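
To see where the index stands at any moment, an agent can call get_index_status. A sketch of the kind of answer it returns (field names and numbers invented for illustration; see docs/tools-reference.md for the real shape):

{
  "tier0": { "state": "ready", "files": 1842 },
  "tier1": { "state": "ready", "symbols": 30517 },
  "tier1_5": { "state": "ready" },
  "tier2": { "state": "building", "progress": 0.63 },
  "watcher": "running"
}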


Tools

| Tool | Description | Tier required |
|------|-------------|---------------|
| ask | Primary lookup router returning bounded spans | 0/1/1.5/2 |
| get_index_status | Tier readiness, embedding progress, watcher state | 0 |
| get_repo_overview | Languages, entry points, README excerpt, tree | 0 |
| search_text | Ripgrep-style regex / literal search | 0 |
| find_definition | Locate the definition of a named symbol | 1 |
| find_references | Reference sites with surrounding context | 1 |
| get_callers | Functions that call a given function | 1 |
| get_callees | Functions a given function calls | 1 |
| get_imports | Imports of a file, and files that import it | 1 |
| get_symbol_outline | Hierarchical outline of a file | 1 |
| search_semantic | Hybrid BM25 + vector over AST-aware chunks | 2 |
| pack_repo | Token-budgeted repo flattening (XML or markdown) | 0+1 |
| locate | Smallest natural span containing the target | 1.5 |
| locate_smart | Retrieval subagent (opt-in; needs provider config) | 1.5 |
| expand_span | Expand one truncated span from a session handle | 0 |
| memory_save | Persist a memory entry under a scope | 0 |
| memory_search | FTS5 search over persistent memories | 0 |
| memory_list | List memories in a scope | 0 |
| memory_forget | Delete a memory entry by id | 0 |

Full schema reference: docs/tools-reference.md.
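
As an example of a Tier-1 structural lookup, here is a find_definition call and the kind of bounded result it might return. The tools/call envelope is standard MCP; the argument and result fields below are illustrative, not the exact schema:

Request:

{
  "method": "tools/call",
  "params": {
    "name": "find_definition",
    "arguments": { "symbol": "parseConfig" }
  }
}

Response (sketch):

{
  "path": "src/config.rs",
  "span": { "start_line": 118, "end_line": 142 },
  "kind": "function"
}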


Optional features

locate_smart — retrieval subagent

locate_smart is an opt-in tool that runs a bounded multi-step retrieval loop using an LLM provider. It's disabled by default; enable it in .argyph/config.toml:

[locate_smart]
enabled  = true
provider = "openai"        # or "anthropic" | "ollama"
model    = "gpt-4o-mini"
# endpoint = "http://localhost:11434"   # only for local providers

Or via env:

ARGYPH_LOCATE_SMART_ENABLED=1
ARGYPH_LOCATE_SMART_PROVIDER=anthropic
ARGYPH_LOCATE_SMART_MODEL=claude-haiku-4-5

When disabled, calls to locate_smart return LOCATE_SMART_DISABLED immediately and no provider keys are required.
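
In JSON terms, a disabled call might surface roughly like this (only the LOCATE_SMART_DISABLED code is given above; the surrounding shape is illustrative):

{
  "error": {
    "code": "LOCATE_SMART_DISABLED",
    "hint": "enable [locate_smart] in .argyph/config.toml"
  }
}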

Build with the smart feature:

cargo install argyph --features smart

Configuration

Config is layered (highest priority first): env vars, .argyph/config.toml in the repo, built-in defaults. A config file is never required.

ARGYPH_LOG=info
ARGYPH_EMBED_PROVIDER=local        # local | openai | voyage
OPENAI_API_KEY=...                  # standard provider env vars
ARGYPH_DISABLE_WATCHER=true         # for sandboxed environments

Install agent lookup instructions in CLAUDE.md, AGENTS.md, or GEMINI.md:

argyph init

Why Argyph (vs alternatives)

| Tool | Symbol graph | Semantic search | Local-first | Single install | Incremental |
|------|--------------|-----------------|-------------|----------------|-------------|
| claude-context (Zilliz) | | yes | | yes | yes |
| GitNexus / CodeGraphContext | yes | | | | |
| repomix | | | yes | yes | |
| Serena | yes | | yes | yes | |
| Argyph | yes | yes | yes | yes | yes |

Benchmarks

Reproducible numbers and methodology are in docs/benchmarks.md; reproduce locally with cargo bench --workspace. Numbers below are the median of three runs on the reference hardware tagged m3-pro (Apple M3 Pro, 36 GB RAM, macOS 15) unless stated otherwise.

Hot paths (criterion, mean times)

| Bench | Mean | What it measures |
|-------|------|------------------|
| locate_parse_path_bare | ~15 ns | Parsing a bare path locator |
| locate_parse_path_heading | ~18 ns | Parsing path > Heading locators |
| locate_strategy_path_only | ~19 ns | Strategy dispatch for a path-only locator |
| locate_strategy_scoped | ~32 ns | Strategy dispatch for a scoped query locator |
| token_count_rust_file | ~4.4 µs | cl100k_base tokenization of one source file |
| walk_project_root | ~244 ms | Full walkdir traversal of this repo |

System-level

End-to-end timings on an Apple M-series laptop, macOS 15. Reproduce via:

cargo run --release -p argyph-benches --bin system_bench -- /path/to/repo

| Fixture | Files | LOC | Tier 0 cold | Tier 1 full |
|---------|-------|-----|-------------|-------------|
| BurntSushi/ripgrep | 215 | ~52K | 71 ms | 6.8 s |
| microsoft/TypeScript (src/) | 709 | ~452K | 30 ms | 8.2 s |
| microsoft/TypeScript (whole repo) | 81,310 | ~2M | 2.1 s | see note ↓ |

Tier 0 — the "useful immediately" gate — scales linearly and stays fast (81K files in 2.1 s). search_text and pack_repo are available the moment Tier 0 is ready.

Tier 1 (the symbol graph) is fast on normal-to-large repos. Two optimizations landed for v1.0: the within-file reference resolver was rewritten from an O(symbols²) substring scan to one-pass hash-indexed lookups (4.2× faster on the TypeScript compiler source: 34.6 s → 8.2 s), and the per-file symbol/chunk SQL writes are now batched.

On very large monorepos (the full 81K-file TypeScript repo, ~2M LOC) Tier 1 still does not finish within 30 min — the residual cost is raw tree-sitter parse volume across 79K files, which needs streaming/parallel indexing (tracked in ROADMAP.md). The server never blocks at any scale: Tier-1 tools return INDEX_NOT_READY with a retry hint until the graph is ready. Full data: docs/benchmarks.md.
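
A sketch of that non-blocking behavior from the agent's side (the INDEX_NOT_READY code and retry hint are described above; the field names are illustrative):

{
  "error": {
    "code": "INDEX_NOT_READY",
    "tier_required": 1,
    "retry_after_ms": 1500
  }
}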


Architecture

Argyph is a Rust workspace of twelve focused crates with strict module ownership. The full architecture, including the Supervisor lifecycle, the three-tier indexing model, and per-crate responsibility boundaries, is documented in ARCHITECTURE.md.


Project status

Stable. The current release is v1.0.1 (see Install above). The build plan and milestones are in docs/BUILD_PLAN.md; the current milestone is tracked at the top of ROADMAP.md.


Contributing

Argyph is built with substantial AI assistance, but human-architected and human-reviewed. Contribution guide and the (strict) AI agent rules are in CONTRIBUTING.md. Commit conventions, including the project's attribution policy, are in docs/COMMIT_CONVENTIONS.md.


Author

Built by Ezzy1630. See AUTHORS.md.


License

Dual-licensed under either of:

  - Apache License, Version 2.0 (LICENSE-APACHE)
  - MIT license (LICENSE-MIT)

at your option.
