Skip to content
Colin Vaughn edited this page Jun 16, 2026 · 3 revisions

CodeGraph

CodeGraph turns any folder of code into a persistent, queryable knowledge graph, then lets you (or an AI assistant) query that compact graph instead of re-reading the whole codebase. It extracts symbols and relationships across 30+ languages with tree-sitter, clusters them into communities, surfaces the structurally important pieces, and writes machine-readable graphs, human-readable reports, and interactive visualizations.

It is a single static Rust binary (codegraph) with no runtime, plus an MCP server so a coding assistant can consult the graph before grepping or reading files.

60-second tour

codegraph extract .                      # build codegraph-out/ for the current directory
codegraph query "authentication flow"    # get a relevant subgraph back
codegraph explain parse_config           # show a node and its neighbours
codegraph affected parse_config          # what would changing it break?
codegraph search "MATCH (c:class) WHERE c.loc > 500 RETURN c"   # structural query (CGQL)
codegraph diff HEAD~10 HEAD              # how the graph changed between two revisions
codegraph refactor rename old_name --to new_name   # a safe, confidence-scored rename plan
codegraph serve                          # expose the graph to an assistant over MCP

See Quickstart for a fuller walk-through and Installation to get the binary.

What CodeGraph gives you

  • A canonical graph.json you can query without reading source.
  • A Markdown report of god nodes, surprising connections, import cycles, and suggested questions (Analysis and Reports).
  • 2D, 3D, and SVG visualizations plus GraphML / Cypher / DOT / Obsidian / wiki exports (Output Formats, Visualizations).
  • Graph queries: relevant-subgraph search, shortest path, and reverse impact (Querying).
  • Structural search with CGQL, a Cypher-inspired query language matching on kind/visibility/loc/fan-in-out plus named architectural patterns (Commands).
  • Time-travel diff between two git revisions: added/removed dependencies, removed APIs, architectural drift, new cycles, and change hotspots (Commands).
  • Safe refactor: confidence-scored rename/move/extract plans for an agent to apply, with a graph-invariant verify afterwards (Commands).
  • An MCP server (20 read-only tools) and one-command assistant integration (MCP Server, Assistant Integration).
  • Multi-repo federation with real cross-repo edge resolution (Workspaces and Federation).

Documentation map

Getting started: Installation - Quickstart

Concepts: Architecture - Languages - CodeGraph vs Other Tools

Using CodeGraph: Commands - Extraction - Querying

Integrations: MCP Server - Assistant Integration

Scaling: Workspaces and Federation - Incremental Updates - PR Dashboard

Reference: Configuration - Development

Design principles

  • Offline by default. A code-only corpus makes no network calls. Only the opt-in --semantic pass over documents calls an LLM (Semantic Analysis).
  • Auditable. Every edge carries a confidence level (EXTRACTED, INFERRED, AMBIGUOUS).
  • Deterministic. The same input produces the same graph; ids and community numbers are stable across rebuilds.

Clone this wiki locally