Architecture and How It Works
SymForge is a local-first MCP server for code intelligence. It serves an agent from a live repository index instead of making the agent assemble context by reading broad chunks of files.
Use this page for understanding what SymForge owns, how the index works, and where the boundary sits between SymForge, the shell, and the client.
flowchart LR
Client["MCP client<br/>Codex, Claude, Gemini, Kilo, etc."] --> Server["symforge stdio MCP server"]
Server --> Startup["startup planner"]
Startup -->|local session| Local["in-process LiveIndex"]
Startup -->|shared sessions| Daemon["optional local daemon"]
Daemon --> Local
Workspace["workspace files"] --> Parser["tree-sitter parsers<br/>config extractors"]
Parser --> Local
Watcher["filesystem watcher"] --> Local
Git["git status, diffs, history"] --> Signals["frecency, co-change,<br/>temporal hotspots"]
Signals --> Local
Local --> Snapshot[".symforge/index.bin"]
Snapshot --> Local
Local --> Tools["MCP tools<br/>resources<br/>prompts"]
Tools --> Client
Tools --> Edits["structural edit engine"]
Edits --> Workspace
Edits --> Impact["analyze_file_impact"]
Impact --> Local
Tools --> Analytics["optional analytics queue"]
Analytics --> AnalyticsDb[".symforge/analytics.db"]
SymForge should be the first stop for:
- repository orientation
- source-code file outlines
- symbol lookup and symbol source reads
- text search with enclosing symbol context
- structural AST search
- reference and dependent tracing
- changed-file and symbol-diff inspection
- syntax diagnostics for supported code/config files
- edit planning and symbol-scoped source edits
- post-edit reindexing and impact analysis
SymForge does not replace:
- `cargo`, `npm`, test runners, or package managers
- Docker and process control
- runtime debugging
- OS diagnostics
- literal document reads where exact prose is the target
- Startup discovers the workspace unless auto-indexing is disabled.
- Files are admitted according to project boundaries, ignore rules, and noise policy.
- Source files are parsed with tree-sitter language extractors.
- Config/document files use dedicated extractors where available.
- Symbols, references, file text, parse diagnostics, and metadata are published into `LiveIndex`.
- Query tools read from the in-process index.
- The watcher and `analyze_file_impact` keep changed files fresh.
- Snapshots under `.symforge/index.bin` warm future startup.
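The pipeline above can be sketched as a toy in-memory index. This is a minimal illustration, not SymForge's implementation: Python's `ast` module stands in for the tree-sitter extractors, and `ToyIndex`, `FileEntry`, `admitted`, `extract`, `build_index`, and the `IGNORED` patterns are all hypothetical names introduced here.

```python
import ast
import fnmatch
from dataclasses import dataclass, field

# Hypothetical ignore/noise patterns; the real admission policy is not shown on this page.
IGNORED = ["vendor/*", "*.min.js", "target/*"]


@dataclass
class FileEntry:
    text: str
    symbols: list[str]
    diagnostics: list[str]


@dataclass
class ToyIndex:
    """In-memory stand-in for the LiveIndex: path -> parsed entry."""
    files: dict[str, FileEntry] = field(default_factory=dict)

    def publish(self, path: str, text: str) -> None:
        # Parse, then replace whatever the index held for this path.
        symbols, diagnostics = extract(text)
        self.files[path] = FileEntry(text, symbols, diagnostics)

    def find_symbol(self, name: str) -> list[str]:
        # Queries read only from the in-process index, never from disk.
        return sorted(p for p, e in self.files.items() if name in e.symbols)


def admitted(path: str) -> bool:
    """Admit a file unless it matches an ignore/noise pattern."""
    return not any(fnmatch.fnmatch(path, pat) for pat in IGNORED)


def extract(text: str) -> tuple[list[str], list[str]]:
    """Return (symbols, diagnostics). Python's ast stands in for tree-sitter."""
    try:
        tree = ast.parse(text)
    except SyntaxError as err:
        return [], [f"line {err.lineno}: {err.msg}"]
    defs = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    return [n.name for n in ast.walk(tree) if isinstance(n, defs)], []


def build_index(workspace: dict[str, str]) -> ToyIndex:
    index = ToyIndex()
    for path, text in workspace.items():
        if admitted(path):
            index.publish(path, text)
    return index


workspace = {
    "app/main.py": "def run():\n    return 1\n",
    "app/broken.py": "def oops(:\n",
    "vendor/lib.py": "def run():\n    return 2\n",
}
index = build_index(workspace)
print(index.find_symbol("run"))                     # vendor copy was never admitted
print(index.files["app/broken.py"].diagnostics)     # parse diagnostics are kept, not dropped

# Watcher / impact step: re-publishing a changed file keeps the index fresh.
index.publish("app/broken.py", "def oops():\n    pass\n")
print(index.find_symbol("oops"))
```

The point of the sketch is the ordering: admission happens before parsing, diagnostics are stored alongside symbols, and a changed file is refreshed by publishing it again rather than by rebuilding everything.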
| Area | Role |
|---|---|
| `src/protocol/` | MCP protocol surface, tool handlers, prompts, resources, result metadata, formatting |
| `src/live_index/` | In-memory file/symbol/reference store, queries, search, snapshots, rank signals |
| `src/parsing/` | Tree-sitter integration, language extractors, config extractors, diagnostics |
| `src/analytics/` | Local SQLite analytics store and bounded background writer |
| `src/cli/` | `init`, `hook`, `trust`, and `analytics` command handling |
| `src/daemon.rs` | Shared local daemon and session routing |
| `src/sidecar/` | Local sidecar state, token stats, and HTTP handler surfaces |
| `src/watcher/` | Filesystem watching and reconciliation |
| `src/git.rs` | Git status, diffs, retry helpers, and temporal input |
| `npm/` | JavaScript launcher, installer, and npm packaging tests |
- Local-first index: source spans depend on exact local bytes, so query serving should stay in process whenever possible.
- Snapshots are acceleration, not authority: `.symforge/index.bin` is used for warm startup, but current files and reindexing remain the source of truth.
- Explicit recovery: bad parses and stale state are surfaced in `health`. Recovery paths are tools such as `validate_file_syntax`, `analyze_file_impact`, and `index_folder`.
- Symbol-scoped edits: edit tools resolve targets server-side, write atomically, and reindex after successful changes.
- Capability evidence: optional ranking and routing features report whether they were applied, unavailable, disabled, stale, or falling back.
- No fake success: result metadata distinguishes found, empty, ambiguous, invalid, and failure states separately from the human-readable text.
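As an illustration of the "no fake success" principle, the sketch below keeps the outcome state in a separate machine-readable field instead of encoding it only in the message text. The `Outcome` enum, `ToolResult`, `lookup_symbol`, `safe_lookup`, and the index shape are hypothetical; only the five state names come from this page.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    FOUND = "found"
    EMPTY = "empty"
    AMBIGUOUS = "ambiguous"
    INVALID = "invalid"
    FAILURE = "failure"


@dataclass
class ToolResult:
    outcome: Outcome      # machine-readable state
    text: str             # human-readable rendering
    matches: list[str]


def lookup_symbol(index: dict[str, list[str]], name: str) -> ToolResult:
    """index maps a symbol name to 'file:line' locations (assumed shape)."""
    if not name.isidentifier():
        return ToolResult(Outcome.INVALID, f"'{name}' is not a valid symbol name", [])
    hits = index.get(name, [])
    if not hits:
        return ToolResult(Outcome.EMPTY, f"no symbol named {name}", [])
    if len(hits) > 1:
        return ToolResult(Outcome.AMBIGUOUS, f"{len(hits)} candidates for {name}", hits)
    return ToolResult(Outcome.FOUND, f"{name} at {hits[0]}", hits)


def safe_lookup(index, name: str) -> ToolResult:
    """Report unexpected errors as FAILURE rather than as an empty result."""
    try:
        return lookup_symbol(index, name)
    except Exception as err:
        return ToolResult(Outcome.FAILURE, f"lookup failed: {err}", [])


toy_index = {
    "parse": ["src/a.py:10"],
    "run": ["src/a.py:20", "src/b.py:5"],
}
for query in ["parse", "run", "missing", "not a name"]:
    result = safe_lookup(toy_index, query)
    print(result.outcome.value, "|", result.text)
```

A caller can branch on `result.outcome` without parsing `result.text`, so an empty result is never mistaken for a failed one.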
Query responses open with a machine-readable header so the agent knows how much to believe the result instead of guessing:
- Match type: exact, constrained, or heuristic.
- Source authority: current index, disk-refreshed, or worktree target (rebased) for routed edits.
- Parse state: parsed, partial, or degraded for the files involved.
- Completeness: full, budget-limited, or truncated by a result cap, always with the actual numbers (for example "output is ~707 tokens; budget is 600").
- Scope: what was searched and which noise classes (vendor, generated, tests, personal tooling) were filtered.
- Evidence anchors: `file:line` references the agent can jump to.
Truncation is never silent: every bounded response says it was bounded and by
how much. `ask` extends the same idea to routing: it reports route confidence
(exact vs inferred), its rationale, and a suggested next step, and it
downgrades its own confidence on compound questions rather than returning a
confident false negative.
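A minimal sketch of the "never silent" rule for the completeness field follows. It assumes a crude four-characters-per-token estimate and applies the result cap before the token budget; `Completeness`, `estimate_tokens`, and `bound_results` are hypothetical names, and the real header format may differ.

```python
from dataclasses import dataclass


@dataclass
class Completeness:
    state: str    # "full", "budget-limited", or "truncated"
    detail: str   # always carries the actual numbers


def estimate_tokens(text: str) -> int:
    # Assumption for this sketch: roughly four characters per token.
    return max(1, len(text) // 4)


def bound_results(lines: list[str], budget_tokens: int, max_results: int):
    """Apply a result cap, then a token budget, and report which limit applied."""
    total = len(lines)
    capped = lines[:max_results]
    kept: list[str] = []
    used = 0
    for line in capped:
        cost = estimate_tokens(line)
        if used + cost > budget_tokens:
            break
        kept.append(line)
        used += cost

    if len(kept) < len(capped):
        needed = sum(estimate_tokens(line) for line in capped)
        detail = (f"output is ~{needed} tokens; budget is {budget_tokens}; "
                  f"showing {len(kept)} of {total}")
        return kept, Completeness("budget-limited", detail)
    if total > max_results:
        detail = f"showing {len(kept)} of {total} results (cap {max_results})"
        return kept, Completeness("truncated", detail)
    return kept, Completeness("full", f"{total} of {total} results")


hits = ["x" * 40] * 5   # five results of about ten tokens each
for budget, cap in [(25, 10), (1000, 3), (1000, 10)]:
    kept, completeness = bound_results(hits, budget, cap)
    print(completeness.state, "|", completeness.detail)
```

Every return path produces a `Completeness` value, so a caller cannot receive a shortened list without also receiving the reason and the counts.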
Some valid code trips upstream tree-sitter grammar limitations (for example,
TypeScript `import('rxjs').Subscription[]`, or Angular `@if (a > b)` control
flow inside `.html` templates). SymForge separates these expected partials
from genuine repo defects in `health`, but only after a proof, never a
heuristic:
- Neutralize only the suspected construct, token-preservingly (a space replaces the offending characters, so adjacent tokens can never fuse).
- Re-parse the whole file.
- Excuse the file iff the re-parse is completely clean.
A genuinely broken file that merely contains the known construct stays
classified as an unexpected partial, so real defects cannot hide behind grammar
limitations. Verdicts are memoized by content hash, so repeated health calls
and render paths do not re-pay the parse cost.
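The proof steps above can be sketched with a toy parser. Everything here is a stand-in: `toy_parse_is_clean` rejects any `@` to simulate a grammar limitation, `KNOWN_CONSTRUCT` is a hypothetical pattern for the suspected construct, and the sketch pads with one space per offending character so byte offsets stay stable (an assumption about how token preservation is done).

```python
import hashlib
import re

# Hypothetical pattern for a construct the toy parser is known not to understand.
KNOWN_CONSTRUCT = re.compile(r"@if")

_verdicts: dict[str, str] = {}   # content hash -> verdict
parse_calls = 0                  # counts parser invocations to show memoization


def toy_parse_is_clean(text: str) -> bool:
    """Toy parser: rejects any '@' and any unbalanced () or {}."""
    global parse_calls
    parse_calls += 1
    if "@" in text:
        return False
    closers = {")": "(", "}": "{"}
    stack = []
    for ch in text:
        if ch in "({":
            stack.append(ch)
        elif ch in closers:
            if not stack or stack.pop() != closers[ch]:
                return False
    return not stack


def neutralize(text: str) -> str:
    """Blank out only the suspected construct; adjacent tokens cannot fuse."""
    return KNOWN_CONSTRUCT.sub(lambda m: " " * len(m.group()), text)


def classify(text: str) -> str:
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    if key in _verdicts:
        return _verdicts[key]          # memoized: no re-parse for identical content
    if toy_parse_is_clean(text):
        verdict = "parsed"
    elif KNOWN_CONSTRUCT.search(text) and toy_parse_is_clean(neutralize(text)):
        verdict = "expected partial"   # proof: whole file is clean once neutralized
    else:
        verdict = "unexpected partial"
    _verdicts[key] = verdict
    return verdict


template = "<p>@if (a > b) { <b>hi</b> }</p>"
broken = "<p>@if (a > b) { <b>hi</b> </p>"   # same construct, plus a missing brace
print(classify(template))
print(classify(broken))
```

The broken template still contains the known construct, but the re-parse after neutralizing is not clean, so it is not excused.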
- `health` to confirm index/runtime state.
- `get_repo_map`, `explore`, or `ask` to orient.
- `search_symbols`, `search_text`, or `search_files` to narrow.
- `get_file_context`, `get_symbol`, or `get_symbol_context` to inspect.
- `edit_plan` before non-trivial edits.
- A structural edit tool for source changes.
- `analyze_file_impact` for touched files or `index_folder` after broad work.
The simplest model:
- SymForge answers "where is the code, what does it reference, and how do I edit this symbol?"
- The shell answers "does the project build, do tests pass, what process is running, and what did the OS do?"