EgentCodePlexus

A code intelligence graph for LLMs and AI code agents: one-shot CLI, zero-copy mmap, sub-second per query.

繁體中文 (Traditional Chinese)


🎯 Mission

ecp exists to be the structural-knowledge layer that an autonomous AI coding agent calls 20–50 times per task. Every design decision falls out of that one premise:

  • Built for agents, not IDEs. Output is token-cheap (TOON / compact JSON), every flag surfaces via --help, every command is non-interactive and stdout-parseable. No UI, no human-skim layout cruft eating the agent's context window.
  • No warm-up, no daemon. Each invocation mmaps a zero-copy rkyv graph file and exits. Read queries return in ~140–170 ms including process startup; a 22k-file repo cold-indexes in under 3 s. An agent can fire dozens of queries per task without amortising a server boot, and there is no "daemon died, please restart" failure mode.
  • Honest answers over readable graphs. When a call site can't be statically resolved (dynamic dispatch, unresolved import, reflection), ecp emits a BlindSpot record, not a guessed edge. An agent that acts on a hallucinated dependency is much more expensive than one that gets an "I don't know" it can route around.
  • Polyglot reach. 31 languages parsed at the structural level so modern multi-stack repos (service code + Dockerfiles + GitHub Actions + Terraform + SQL + smart contracts) stop being black holes the moment you leave the main language.
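The BlindSpot idea is easiest to see from the consumer side. The sketch below is illustrative only: the `Edge` and `BlindSpot` shapes, the `reason` field, and `plan_followups` are hypothetical names, not ecp's actual output schema. It shows only that an unresolved call site arrives as a distinct value an agent can branch on, rather than as an edge it has to trust.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Edge:
    """A statically resolved call (hypothetical shape)."""
    caller: str
    callee: str


@dataclass(frozen=True)
class BlindSpot:
    """A call site that could not be resolved (hypothetical shape)."""
    caller: str
    reason: str  # e.g. "dynamic dispatch", "unresolved import", "reflection"


Record = Union[Edge, BlindSpot]


def plan_followups(records: list[Record]) -> dict[str, list[str]]:
    """Split records into edges to trust and call sites to read by hand."""
    plan: dict[str, list[str]] = {"trusted": [], "needs_manual_read": []}
    for rec in records:
        if isinstance(rec, Edge):
            plan["trusted"].append(f"{rec.caller} -> {rec.callee}")
        else:
            plan["needs_manual_read"].append(f"{rec.caller} ({rec.reason})")
    return plan
```

An agent following this pattern spends file reads only on the `needs_manual_read` entries instead of re-verifying every edge.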

🎙️ Agent Interviews: see how real AI agents (Gemini CLI, Codex) use and evaluate ecp in autonomous workflows.

Built on top of GitNexus by Abhigyan Patwari: same conceptual model (a structural knowledge graph of a repo), rewritten in Rust for a different audience. Licensed under PolyForm Noncommercial 1.0.0; see NOTICES.md for required attribution.


⚑ Performance

The Mission section above is why ecp is built the way it is. This section is the receipts.

Head-to-head vs. upstream GitNexus

Measured on the gitnexus codebase (TypeScript) using scripts/parity/benchmark_vs_gitnexus.py:

| Phase | ecp (Rust) | gitnexus (Node) | Speedup |
|---|---|---|---|
| Cold Index | ~970 ms | ~58 s | 60× |
| Symbol Context | ~70 ms | ~430 ms | 6× |
| Blast Radius | ~70 ms | ~460 ms | 6× |
| Cypher Query | ~70 ms | ~400 ms | 5× |

Note: ecp query latency includes full process startup (no daemon). GitNexus (v1.6.5) query latency is against a warm, indexed repo via its CLI.
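For readers who want to check the Speedup column, the arithmetic is just the ratio of the two timings. The snippet below recomputes it from the approximate values in the table above (converted to milliseconds); the `speedup` helper is an illustrative name, not part of the benchmark script.

```python
# Approximate timings copied from the table above, in milliseconds:
# (ecp, gitnexus) per phase.
timings_ms = {
    "Cold Index": (970, 58_000),
    "Symbol Context": (70, 430),
    "Blast Radius": (70, 460),
    "Cypher Query": (70, 400),
}


def speedup(ecp_ms: float, gitnexus_ms: float) -> float:
    """Ratio of GitNexus time to ecp time; above 1 means ecp is faster."""
    return gitnexus_ms / ecp_ms


ratios = {phase: round(speedup(*pair), 1) for phase, pair in timings_ms.items()}
```

Because the inputs are themselves rounded ("~"), the resulting ratios are best read as approximate, in line with the whole-number figures in the Speedup column.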

Scalability: single run on .sample_repo (a 2.1 GB polyglot collection of ~40 real-world open source projects across 25+ languages, used for cross-language stress testing)

Ingest performance:

| Metric | Value |
|---|---|
| Files indexed | 22,645 across 25 detected languages |
| Wall-clock (Cold) | 2.60 s (parse + resolve + serialize) |
| Wall-clock (Incremental) | 4.9 ms (xxh3_64 hash walk, zero dirty files) |
| Hardware | AMD Ryzen 9 9950X (16 logical), 39.2 GiB RAM, Linux 6.6.87 |
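The incremental row relies on a content-hash walk that finds no dirty files. A minimal sketch of that general technique follows. It is not ecp's implementation: `blake2b` from the standard library stands in for xxh3_64, the in-memory `cache` dict stands in for whatever on-disk cache ecp keeps, and how ecp decides which files to re-hash is not described here.

```python
import hashlib
from pathlib import Path


def content_digest(path: Path) -> str:
    """Hash file bytes. blake2b is a stdlib stand-in for xxh3_64 here."""
    return hashlib.blake2b(path.read_bytes(), digest_size=8).hexdigest()


def dirty_files(root: Path, cache: dict[str, str]) -> list[str]:
    """Return relative paths whose content hash differs from the cache.

    The cache is updated in place, so an immediate second walk over
    unchanged files reports nothing.
    """
    dirty = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        key = path.relative_to(root).as_posix()
        digest = content_digest(path)
        if cache.get(key) != digest:
            dirty.append(key)
            cache[key] = digest
    return dirty
```

Only the paths returned by `dirty_files` would need re-parsing; an empty list corresponds to the "zero dirty files" case in the table.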

Per-query latency (including process startup):

| Query | Median | Notes |
|---|---|---|
| coverage (registry overview) | 1.4 ms | smallest read: just registry mmap |
| routes (HTTP route map across repo) | 142.3 ms | enumerates declarative + imperative |
| coverage --detailed (frameworks + blind-spots) | 143.4 ms | full registry + per-framework scoring |
| impact <symbol> --direction down | 145.0 ms | BFS over Calls / Extends edges |
| inspect <symbol> (signature + callers + callees) | 145.6 ms | symbol resolution + 1-hop traversal |
| find <name> --mode bm25 (lexical search) | 154.5 ms | Tantivy query + 5-bucket partition |
| cypher 'MATCH (a:Class)-[:HasMethod]->(b:Method) ...' | 161.5 ms | one pattern, one row returned |
| cypher 'MATCH (a:Method)-[:Calls]->(b:Method) ...' | 174.2 ms | broader pattern, more matches |
| impact --baseline HEAD~1 (change-set blast radius) | 359.0 ms | git diff + parallel per-file parse + BFS |

Reproduce: python scripts/benchmark/benchmark_ecp.py.
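The per-query figures above include process startup. One minimal way to take that kind of measurement is sketched below. This is not the repository's benchmark_ecp.py; `median_latency_ms` is a hypothetical helper, and the stand-in command is simply the Python interpreter so the block runs on a machine without ecp installed.

```python
import statistics
import subprocess
import sys
import time


def median_latency_ms(argv: list[str], runs: int = 5) -> float:
    """Median wall-clock time of one fresh process per run, in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=True)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


# Stand-in command so the sketch runs anywhere. On a machine with an
# indexed repo, swap in e.g. ["ecp", "inspect", "validateUser"].
baseline = median_latency_ms([sys.executable, "-c", "pass"], runs=3)
```

Timing the whole `subprocess.run` call, not just the query, is what makes the number comparable to a daemon-less CLI's real cost per invocation.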


vs. upstream GitNexus

Same conceptual model, different audience. ecp is not a drop-in replacement; choose based on who reads the graph and what they do with it.

| Dimension | EgentCodePlexus | GitNexus |
|---|---|---|
| Primary consumer | Autonomous AI code agents | Human devs + IDE integration |
| Runtime | Stateless one-shot CLI (zero warm-up) | Long-running MCP server |
| Performance (gitnexus codebase benchmark above) | ~1 s cold index / ~70 ms query | ~60 s cold index / ~400 ms query |
| Unresolved edge | BlindSpot record (honest unknown) | Heuristic guess |
| Default output | TOON / compact JSON (token-cheap) | Wiki / UI rendering |
| Languages | 31 (14 deep + 17 structural) | 14 (deep, 9-dimension) |
| Storage | Rust + rkyv zero-copy mmap | Node.js + LadybugDB |

Full breakdown of all 8 dimensions, philosophy, and decision matrix → docs/vs-gitnexus.md


📦 Install

Prebuilt binaries are published with each GitHub Release. The installer scripts fall back to a cargo source build only when a matching release asset is unavailable.

# Linux / macOS
curl -sSfL https://github.com/coseto6125/egent-code-plexus/releases/latest/download/install.sh | sh

# Windows PowerShell
iwr https://github.com/coseto6125/egent-code-plexus/releases/latest/download/install.ps1 -UseBasicParsing | iex

# Explicit cargo path (same source build, no installer wrapper)
cargo install --git https://github.com/coseto6125/egent-code-plexus egent-code-plexus --bin ecp --locked

Optional CPU-tuned source build:

repo=https://github.com/coseto6125/egent-code-plexus
RUSTFLAGS="-C target-cpu=native" cargo install --git "$repo" egent-code-plexus --bin ecp --locked --profile release-dist

🚀 Quick start

# 1. Index the current repo (incremental; first query also auto-indexes)
ecp admin index --repo .

# 2. Locate a symbol (exact name by default)
ecp find loginUser
ecp find login --mode bm25       # ranked BM25, top-K partitioned by source/tests/ref/doc/config

# 3. Blast radius: who breaks if I change this?
ecp impact validateUser --direction upstream

# 4. Full symbol context (signature, body, callers, callees, 1-hop impact)
ecp inspect validateUser

# 5. Every HTTP route in the repo (declarative @Get + imperative app.get())
ecp routes
ecp routes /api/users --method POST     # route → handler → caller chain

Read-side commands accept --format text|json|toon. Default per command is the token-cheapest representation (mostly toon; find defaults to text; cypher/coverage default to json).
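Since defaults differ per command, an agent-side wrapper may prefer to pass `--format` explicitly. The sketch below shows one way to do that from Python. The helper names `ecp_argv` and `run_ecp_json` are hypothetical, placing `--format` after the positional arguments is an assumption, and no particular JSON schema is assumed for the output.

```python
import json
import subprocess


def ecp_argv(command: str, *args: str, fmt: str = "json") -> list[str]:
    """Build an argv list for a read-side command with an explicit format."""
    if fmt not in ("text", "json", "toon"):
        raise ValueError(f"unsupported format: {fmt}")
    return ["ecp", command, *args, "--format", fmt]


def run_ecp_json(command: str, *args: str):
    """Run ecp once and decode stdout as JSON (schema not assumed here)."""
    proc = subprocess.run(ecp_argv(command, *args, fmt="json"),
                          capture_output=True, text=True, check=True)
    return json.loads(proc.stdout)
```

For example, `ecp_argv("find", "login", "--mode", "bm25")` yields the argv for the BM25 search from the quick start with JSON output requested.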


CLI surface

Two tiers: agent commands at top level (query/refactor/verify) and admin commands under ecp admin (registry/hooks/destructive). Run ecp --help and ecp admin --help for full flag matrices.

| Command | Purpose |
|---|---|
| inspect <name> | One symbol → metadata, decorators, signature, callers, callees, 1-hop impact |
| find <pattern> | Locate symbols: exact (default) · --mode fuzzy substring · --mode bm25 lexical ranking; bm25 partitions output into source / tests / reference / document / config buckets |
| impact <name> --direction <up\|down> | Blast-radius traversal with confidence filtering. --since <ref> for change-set impact. |
| rename --symbol <old> --new-name <new> | AST-aware multi-file rename across 14 languages. Always --dry-run first. |
| cypher '<query>' | openCypher escape hatch; m.content returns source body. |
| coverage | Registry overview, framework coverage, blind-spot catalog, graph freshness. |
| routes [<path>] | Enumerate HTTP routes (declarative + imperative); with <path> show handler + callers. |
| contracts | Cross-repo API contract inventory (routes / queue / RPC). |
| diff | Resolver-delta: edge-level binding tier-degradation + route / contract changes. |
| tool-map | Calls to external HTTP / DB / Redis / queue clients via per-file import-binding analysis. |
| shape-check | Drift between HTTP consumer access patterns and Route response shapes. |
| peers | Multi-session peer collaboration (status / diff / log / gc). |
| review | Aggregated LLM-workflow audit: runs impact + coverage + tool-map + shape-check + diff in one shot, filtered to high-confidence signals. |
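The blast-radius traversal behind `impact` is described elsewhere in this README as a BFS over Calls / Extends edges. A minimal sketch of such a traversal is given below, under stated assumptions: edges are plain `(caller, callee)` pairs, "up" is taken to mean "toward callers" and "down" "toward callees", and confidence filtering is left out entirely. The function name and data layout are hypothetical.

```python
from collections import deque


def blast_radius(edges: list[tuple[str, str]], start: str,
                 direction: str = "up") -> list[str]:
    """BFS from `start` over (caller, callee) pairs.

    "up" walks toward callers, "down" walks toward callees.
    Returns reached symbols in visit order, excluding `start`.
    """
    neighbors: dict[str, list[str]] = {}
    for caller, callee in edges:
        src, dst = (callee, caller) if direction == "up" else (caller, callee)
        neighbors.setdefault(src, []).append(dst)

    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order
```

With edges `apiLogin -> loginUser -> validateUser -> hashPassword`, walking "up" from `validateUser` reaches `loginUser` and then `apiLogin`, which answers the "who breaks if I change this?" question.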

Admin namespace (ecp admin <cmd>, hidden from top-level help):

| Command | Purpose |
|---|---|
| index --repo <path> | Build / refresh the graph; incremental via xxh3_64 content cache. --force for full rebuild. |
| drop / prune / rename-branch | Index lifecycle: delete, prune stale branch dirs, rename branch on-disk. |
| install-hook | Install the git reference-transaction hook (auto-track branch switches). |
| config | Interactive TOML wizard for .ecp/config.toml. |
| mcp serve / mcp tools | MCP server (stdio) for LLM hosts; tools lists the exposed tool surface. |

All commands resolve .ecp/graph.bin from CWD unless --graph <path> is given. Agent-facing commands are non-interactive by design: every flag surfaces via --help, every output stream is parseable. Run ecp admin with no subcommand to open the interactive admin TUI for index maintenance, host integrations, config, groups, and diagnostics.
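The path rule above, combined with a read-only memory map, can be illustrated with a toy example. Everything about the file layout here is an assumption made for the sketch: real graph files use rkyv's archive format, not the fixed pairs of little-endian u32 values used below, and `resolve_graph_path` / `read_edge` are hypothetical helpers.

```python
import mmap
import struct
from pathlib import Path
from typing import Optional

# Toy record layout (assumption): (caller_id, callee_id) as two u32 values.
RECORD = struct.Struct("<II")


def resolve_graph_path(cwd: Path, override: Optional[Path] = None) -> Path:
    """A --graph style override wins; otherwise .ecp/graph.bin under cwd."""
    return override if override is not None else cwd / ".ecp" / "graph.bin"


def read_edge(graph: Path, index: int) -> tuple[int, int]:
    """Map the file read-only and decode one record in place by offset."""
    with open(graph, "rb") as fh:
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return RECORD.unpack_from(mm, index * RECORD.size)
```

The point of the pattern is that `read_edge` touches only the bytes of the record it needs; nothing is parsed or copied for the rest of the file.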


MCP server (for LLM hosts)

ecp ships an MCP server exposing core commands as MCP tools. Hosts that speak MCP (Claude Code, Cursor, Windsurf, Cline, Codex CLI, Gemini CLI) can register ecp and call the tools autonomously.

ecp admin mcp tools          # inspect what tools will be exposed
ecp admin mcp serve          # run the server (default: spawn mode, fresh subprocess per call)

Manual host config example for Claude Code (~/.config/claude-code/mcp-servers.json):

{
  "mcpServers": {
    "ecp": { "command": "ecp", "args": ["admin", "mcp", "serve"] }
  }
}
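When the host config file already lists other MCP servers, the entry above has to be merged rather than pasted over the file. A small sketch of that merge follows; `register_ecp` is a hypothetical helper, and the caller supplies the config path (for instance the Claude Code path named above).

```python
import json
from pathlib import Path

# The server entry from the manual config example above.
ECP_ENTRY = {"command": "ecp", "args": ["admin", "mcp", "serve"]}


def register_ecp(config_path: Path) -> dict:
    """Add the ecp entry under "mcpServers", keeping any other servers."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    config.setdefault("mcpServers", {})["ecp"] = ECP_ENTRY
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Running it twice is harmless: the second call rewrites the same `ecp` entry and leaves the other servers untouched.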

Progressive path for human operators:

ecp admin
→ Agent Integrations
→ MCP
→ <host>
→ install

Codex CLI native integration

The Codex native path is separate from MCP. It prepares a patch for an openai/codex fork instead of editing the running Codex installation directly.

Progressive path for human operators:

ecp admin
→ Agent Integrations
→ Codex CLI
→ install
→ native-tools

Bundled skills use the same progressive path:

ecp admin
→ Agent Integrations
→ Codex CLI
→ install
→ skills
→ all | ecp | simplify

Scripted path for AI agents and automation:

ecp admin codex install native-tools
ecp admin codex install skills all
ecp admin codex install skills ecp
ecp admin codex install skills simplify

The bundled skills teach workflow selection that command help cannot infer by itself:

| Skill | Use when |
|---|---|
| ecp | The agent needs to decide whether graph-aware symbol, impact, route, contract, or rename workflows are better than grep / file reads. |
| simplify | The agent is reviewing changed code and should start from ecp impact, blind spots, egress, shape drift, and resolver deltas before reading raw diffs. |

The native-tools component writes:

~/.config/ecp/host-integration/codex-cli.patch

Apply the patch in your Codex CLI fork, then wire the generated module into Codex's tool registry:

cd /path/to/openai-codex-fork
git apply ~/.config/ecp/host-integration/codex-cli.patch

To verify a fork that already has the native marker, set ECP_CODEX_CLI_CHECKOUT before checking status in the TUI:

ECP_CODEX_CLI_CHECKOUT=/path/to/openai-codex-fork ecp admin
# Agent Integrations → Codex CLI → status

The equivalent scripted checks are:

ECP_CODEX_CLI_CHECKOUT=/path/to/openai-codex-fork ecp admin codex status
ecp admin codex uninstall native-tools
ecp admin codex uninstall skills all

Architecture

crates/
├── ecp-core        # Zero-copy graph (rkyv + mmap), incremental cache, graph queries
├── ecp-analyzer    # Tree-sitter parsers, HTTP route detector, framework confidence
├── ecp-mcp         # MCP server (stdio): exposes core commands as tools
└── ecp-cli         # `ecp` binary, Tantivy BM25 engine, token-optimized output

Parse → resolve → serialize runs through an MPSC channel into a single builder thread that assembles the graph and writes a zero-copy .ecp/graph.bin. Read paths (inspect, cypher, impact, …) mmap this file directly. The xxh3_64 content cache keeps incremental rebuilds sub-second on a 22k-file repo.
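The multi-producer, single-consumer shape described above can be sketched in a few lines. This is a Python analogue for illustration, not the Rust implementation: `queue.Queue` stands in for the MPSC channel, the "parsing" is a placeholder, and the calling thread plays the role of the single builder.

```python
import queue
import threading

_DONE = object()  # sentinel: each producer sends exactly one when finished


def parse_worker(files: list[str], out: "queue.Queue") -> None:
    """Producer: 'parse' each file and send its symbols to the builder."""
    for name in files:
        out.put((name, [f"{name}::symbol"]))  # placeholder for real parsing
    out.put(_DONE)


def build_graph(shards: list[list[str]]) -> dict[str, list[str]]:
    """Run one producer thread per shard and a single consumer loop."""
    channel: "queue.Queue" = queue.Queue()
    workers = [threading.Thread(target=parse_worker, args=(shard, channel))
               for shard in shards]
    for w in workers:
        w.start()

    graph: dict[str, list[str]] = {}
    remaining = len(workers)
    while remaining:  # only this thread ever mutates `graph`
        item = channel.get()
        if item is _DONE:
            remaining -= 1
        else:
            name, symbols = item
            graph[name] = symbols
    for w in workers:
        w.join()
    return graph
```

Because only the consumer loop writes to `graph`, the assembled structure needs no locking, which is the usual motivation for funnelling parallel parse results through one builder.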


Language coverage

31 languages parsed at the structural level (functions / classes / methods / imports / calls). 14 of them (the original GitNexus set) get full-depth coverage across imports, named bindings, exports, heritage, types, constructors, config, frameworks, entry points, calls, and rename. The remaining 17 are structural-only (Bash, Crystal, Cairo, Dockerfile, Docker Compose, GitHub Actions, HCL, Lua, Markdown, Move, Nim, Solidity, SQL, Verilog, Vyper, YAML, Zig).

📊 Full Language Capability Matrix: detailed per-language status and rationale.


⚙️ Tuning

| Env var | Default | Effect |
|---|---|---|
| ECP_MAX_FILE_BYTES | 16777216 (16 MiB) | Skip source files larger than this during ingest. Caps worst-case worker RAM at num_threads × MAX. |
| ECP_CSPROJ_MAX_DEPTH | 4 | Directory recursion depth for *.csproj discovery. Raise for deeply-nested .NET monorepos. |
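The worst-case RAM bound in the first row is simple arithmetic. The sketch below works it through; the helper names are hypothetical, and how ecp itself parses or validates the variable is not specified here.

```python
DEFAULT_MAX_FILE_BYTES = 16 * 1024 * 1024  # 16 MiB, the documented default


def max_file_bytes(env: dict[str, str]) -> int:
    """Read ECP_MAX_FILE_BYTES from an env mapping, else use the default."""
    return int(env.get("ECP_MAX_FILE_BYTES", DEFAULT_MAX_FILE_BYTES))


def worst_case_worker_ram(num_threads: int, env: dict[str, str]) -> int:
    """Upper bound from the table: num_threads x MAX, in bytes."""
    return num_threads * max_file_bytes(env)
```

With the default and 16 worker threads, the bound is 16 × 16 MiB = 256 MiB (268,435,456 bytes); lowering `ECP_MAX_FILE_BYTES` to 1 MiB on 4 threads brings it down to 4 MiB.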

License & acknowledgments

Licensed under PolyForm Noncommercial 1.0.0. Personal use, research, hobby projects, and noncommercial organizations are explicitly permitted. Commercial use is not granted by this license; contact the upstream GitNexus author Abhigyan Patwari for commercial rights.

Built on:

  • GitNexus: original design, CLI surface, and conceptual model
  • tree-sitter: robust incremental AST parsing
  • rkyv: zero-copy deserialization framework
  • Tantivy: blazing fast Rust full-text search engine
  • Rayon: data parallelism for multi-core concurrent AST parsing
  • xxhash (xxh3_64): extremely fast non-cryptographic hashing for content-based incremental indexing
  • DashMap: high-performance concurrent hash maps for graph assembly
  • memmap2: zero-copy memory mapping for sub-millisecond graph access
  • msgspec: high-performance JSON serialization for inter-process communication

Onboarding for AI agents (URL bootstrap, Claude Code skill, plugin install) lives at docs/skills/ecp-onboard/. Concurrency invariants and how to re-verify them: ./scripts/audit/audit-concurrency.sh.

Release status

The current verified install path is cargo install --git ..., which builds ecp from source. Release installers already contain the checksum and provenance-verification flow, but they require a published tag and release assets before the binary download path can be end-to-end verified. The agent-facing onboarding skill is documented in docs/skills/ecp-onboard/ONBOARDING.md; it is intended to guide users through install, first index, optional groups, MCP wiring, and next steps. The assisted configuration/setup flow is still being refined.
