ContextAI

Code graph extraction for LLM-assisted debugging. Turn a codebase into a directed property graph so an LLM gets surgical, structured, bounded context — the broken node plus its neighbors — instead of the whole repository.

Why

LLMs fail on large codebases for three reasons:

Too much context → noise, confusion, hallucination
Too little context → blind spots, wrong fixes
No structure → the model can't see how components relate

Bugs don't live in isolation — they live in the relationships between components. ContextAI makes those relationships explicit and traversable by representing every meaningful piece of code as a node and every relationship as an edge. When debugging, you feed the LLM only the broken node, its adjacent nodes, and the connecting edges.
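
To make the idea concrete, here is a minimal sketch of pulling one node's neighborhood out of an edge list. The toy graph, the node IDs, and the `neighborhood` helper are all hypothetical and are not taken from the ContextAI codebase; only the edge type names come from this README.

```python
def neighborhood(edges, broken):
    """Return the broken node, its adjacent nodes, and the edges connecting them."""
    # Keep only edges that touch the broken node on either end.
    touching = [(src, kind, dst) for src, kind, dst in edges if broken in (src, dst)]
    nodes = {broken}
    for src, _kind, dst in touching:
        nodes.update((src, dst))
    return sorted(nodes), touching


# Hypothetical toy graph: (from, edge type, to)
EDGES = [
    ("route:/checkout", "HANDLES", "fn:checkout"),
    ("fn:checkout", "CALLS", "fn:charge_card"),
    ("fn:charge_card", "CALLS_EXTERNAL", "lib:payments"),
    ("fn:render_home", "RENDERS", "tpl:home.html"),
]

context_nodes, context_edges = neighborhood(EDGES, "fn:charge_card")
print(context_nodes)
print(context_edges)
```

Everything unrelated to `fn:charge_card` (the home page renderer, the route) stays out of the context handed to the model.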

What it does

   source code ──▶ extraction pipeline ──▶ directed property graph ──▶ bounded LLM context
                   (AST · framework ·          (nodes + edges,           (a node + its
                    patterns · runtime)         fully typed schema)        neighborhood)

ContextAI scans a Python project — .py modules and .ipynb notebooks — and emits a typed graph (graph.json) plus an interactive visualization (graph.html). Each node carries its signature, side effects, error-handling profile, and complexity; each edge carries its contract, criticality, and failure behavior.

Quickstart

Requirements: Python 3.11+

# install
pip install -e .          # or: pip install -r requirements.txt

# extract a graph from any file or directory
contextai path/to/your/project/      # or: python3 run.py path/to/your/project/

# outputs (under out/, gitignored):
#   out/graph.json   — all nodes + edges
#   out/index.html   — dashboard: coverage · connectivity health · interactive graph
#   out/graph.html   — raw interactive graph (pyvis, self-contained)

Open out/index.html for the full picture — extraction coverage, connectivity health (fragmentation, isolated nodes, sinks/sources), node-type distribution, and the live graph with a code/location details panel, all in one self-contained file.

Run it against the bundled benchmark (the VS Code Flask tutorial):

python3 run.py python-sample-vscode-flask-tutorial/

Architecture

Layer	Tools	Gets you	Status
1. AST + types	Python `ast` (`.py` + `.ipynb`)	functions, classes, imports, call sites, signatures, side effects	✅ Implemented
2. Framework conventions	custom parsers	route → handler, templates, static assets (Flask)	✅ Flask; others planned
2.5. Pattern matching	regex on source	DB tables, hardcoded URLs, cache ops, template refs	🟡 Partial
3. Runtime tracing	`sys.settrace` + asyncio hooks	actual call chains, dynamic dispatch, async fan-out	✅ Implemented
4. LLM pass	Claude / GPT	semantic intent, implicit relationships	⏳ Planned

No single method captures everything: static analysis sees what code says, runtime tracing sees what it does. ContextAI merges both into one graph.
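
The gap between the two views is easiest to see with dynamic dispatch. The sketch below uses the standard `sys.settrace` hook to record caller/callee pairs; the `trace_calls` helper and the example functions are hypothetical and much simpler than the project's own tracer, which also hooks asyncio.

```python
import sys


def trace_calls(func, *args, **kwargs):
    """Run func under sys.settrace and return the observed (caller, callee) pairs."""
    observed = []

    def tracer(frame, event, arg):
        if event == "call" and frame.f_back is not None:
            observed.append((frame.f_back.f_code.co_name, frame.f_code.co_name))
        return None  # returning None skips per-line tracing inside the frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # always restore whatever tracer was active before
    return observed


def save_json(payload):
    return f"json:{payload}"


def save_csv(payload):
    return f"csv:{payload}"


HANDLERS = {"json": save_json, "csv": save_csv}


def export(kind, payload):
    # The callee is picked at runtime, so a static pass cannot name it from this line.
    return HANDLERS[kind](payload)


calls = trace_calls(export, "csv", "rows")
print(calls)
```

A static pass sees only `HANDLERS[kind](payload)`; the trace shows that this particular run went from `export` to `save_csv`.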

Pipeline

run.py
  ├─ Pass 1: walk files → emit NODES   (ast_extractor, flask_convention_extractor)
  └─ Pass 2: resolve IDs → emit EDGES  (edge_extractor)

graph/extractors/runtime/   ← trace a running app and merge real call chains
  ├─ tracer.py              sys.settrace + asyncio monkey-patches
  ├─ call_log.py            captured calls → call_log.json
  ├─ script_runner.py       drive a target script/entry point under the tracer
  └─ edge_injector.py       merge runtime calls into the static graph
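
The two static passes can be illustrated with the standard `ast` module. This is a simplified sketch, not the code in `ast_extractor` or `edge_extractor`: the `extract` function, the `module.name` ID format, and the sample source are assumptions, and only plain-name calls to functions in the same file are resolved.

```python
import ast

SOURCE = '''
def load(path):
    return open(path).read()

def parse(text):
    return text.split()

def main(path):
    return parse(load(path))
'''


def extract(source, module="demo"):
    tree = ast.parse(source)

    # Pass 1: every function definition becomes a node with a stable ID.
    nodes = {}
    for item in ast.walk(tree):
        if isinstance(item, ast.FunctionDef):
            nodes[item.name] = {
                "id": f"{module}.{item.name}",
                "type": "FUNCTION",
                "line": item.lineno,
            }

    # Pass 2: resolve call-site names against the node table to emit CALLS edges.
    edges = []
    for item in ast.walk(tree):
        if not isinstance(item, ast.FunctionDef):
            continue
        for call in ast.walk(item):
            if (
                isinstance(call, ast.Call)
                and isinstance(call.func, ast.Name)
                and call.func.id in nodes
            ):
                edges.append((nodes[item.name]["id"], "CALLS", nodes[call.func.id]["id"]))
    return nodes, edges


nodes, edges = extract(SOURCE)
print(sorted(n["id"] for n in nodes.values()))
print(edges)
```

Splitting the work this way means edges are only emitted once every node ID is known, so a call to a function defined later in the file still resolves.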

The graph schema

schema.py is the source of truth (Pydantic v2).

Nodes carry: id, type, name, location, code, signature (typed inputs/outputs), side_effects, error_handling, and metadata (complexity, test coverage, staleness).

Edges carry: id, type, from, to, direction, contract (input/output shape), criticality, on_failure (retry / default / throw / circuit-break), and performance.
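
As a rough picture of the field lists above, here is a sketch using standard-library dataclasses rather than the Pydantic v2 models in schema.py. The field names follow this README; the field types, the `from_` rename, and every example value (IDs, `location` format, `direction`, `criticality`) are assumptions.

```python
from dataclasses import dataclass, field

# The four failure behaviors named in this README.
ON_FAILURE = {"retry", "default", "throw", "circuit-break"}


@dataclass
class Node:
    id: str
    type: str
    name: str
    location: str
    code: str
    signature: dict = field(default_factory=dict)       # typed inputs/outputs
    side_effects: list = field(default_factory=list)
    error_handling: dict = field(default_factory=dict)
    metadata: dict = field(default_factory=dict)         # complexity, test coverage, staleness


@dataclass
class Edge:
    id: str
    type: str
    from_: str          # "from" is a Python keyword, so this sketch renames the field
    to: str
    direction: str
    criticality: str
    on_failure: str
    contract: dict = field(default_factory=dict)         # input/output shape
    performance: dict = field(default_factory=dict)

    def __post_init__(self):
        if self.on_failure not in ON_FAILURE:
            raise ValueError(f"unknown on_failure: {self.on_failure!r}")


charge = Node(
    id="fn:charge_card", type="FUNCTION", name="charge_card",
    location="billing.py:10", code="def charge_card(order): ...",
    side_effects=["network"],
)
call = Edge(
    id="e1", type="CALLS", from_="fn:checkout", to=charge.id,
    direction="forward", criticality="high", on_failure="retry",
)
print(call)
```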

Node types

Boundary: API_ENDPOINT, MESSAGE_CONSUMER, CRON_JOB
Logic: FUNCTION, CLASS, MIDDLEWARE, ROUTE_HANDLER
Data: SCHEMA, MODEL, DTO, DATABASE, TABLE, COLLECTION
Infra: MESSAGE_QUEUE, FILE_STORAGE, EXTERNAL_LIBRARY
Scaffolding: MODULE_INIT, ENTRY_POINT, FILE, TEMPLATE, STATIC_ASSET

Edge types

Category	Edges
Call	`CALLS`, `CALLS_ASYNC`, `DELEGATES_TO`
Dependency	`IMPORTS`, `IMPORTS_SIDE_EFFECT`, `INHERITS`, `IMPLEMENTS`, `INSTANTIATES`, `INJECTS`
Data	`READS`, `WRITES`, `VALIDATES`, `TRANSFORMS`, `MAPS_TO`, `RETURNS`
Communication	`HANDLES`, `GUARDS`, `PUBLISHES_TO`, `SUBSCRIBES_TO`, `CALLS_EXTERNAL`, `RENDERS`, `SERVES_STATIC`
App wiring	`USES_APP_INSTANCE`

Analysis tools

All default to graph.json in the current directory.

python3 tools/graph_connectivity.py     # health score, isolated nodes, islands
python3 tools/coverage_from_graph.py    # % of source lines covered by nodes
python3 tools/graph_duplicates.py       # duplicate / overlapping node ranges
python3 tools/diff_graph.py a.json b.json   # diff two graphs
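
For intuition about what "isolated nodes" and "islands" mean, the sketch below computes both on a toy graph, treating edges as undirected. It is not the code in tools/graph_connectivity.py, the `connectivity` helper is hypothetical, and the tool's health score is not reproduced here because its formula is not described in this README.

```python
def connectivity(nodes, edges):
    """Return (isolated nodes, islands) where an island is a weakly connected component."""
    neighbors = {n: set() for n in nodes}
    for src, dst in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)  # ignore direction when grouping islands

    isolated = sorted(n for n, adj in neighbors.items() if not adj)

    islands, seen = [], set()
    for start in nodes:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # iterative depth-first search from this start node
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbors[current] - component)
        seen |= component
        islands.append(sorted(component))
    return isolated, islands


isolated, islands = connectivity(
    ["a", "b", "c", "d", "e", "f"],
    [("a", "b"), ("b", "c"), ("d", "e")],
)
print(isolated)
print(islands)
```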

Runtime tracing

python3 tools/run_with_tracing.py \
  --target your_app/main.py \
  --project-root your_app/ \
  --build-static \
  --output graph_runtime.json

Runs your app under the tracer, then merges observed call chains into the static graph (confirming static edges, filling gaps, and adding runtime-only edges).
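
The three outcomes of a merge can be pictured as set operations on (caller, callee) pairs. This is a conceptual sketch only: the `classify` helper, the bucket names, and the sample edges are hypothetical, and the real edge injector works on full edge records rather than bare pairs.

```python
def classify(static_edges, runtime_calls):
    """Bucket edges by whether the static pass, the runtime trace, or both saw them."""
    static = set(static_edges)
    runtime = set(runtime_calls)
    return {
        "confirmed": sorted(static & runtime),      # static edge also observed at runtime
        "runtime_only": sorted(runtime - static),   # e.g. dynamic dispatch the AST missed
        "unobserved": sorted(static - runtime),     # present in code, not exercised this run
    }


result = classify(
    static_edges=[("main", "load"), ("main", "parse")],
    runtime_calls=[("main", "load"), ("export", "save_csv")],
)
print(result)
```

An edge landing in the unobserved bucket does not mean it is wrong; it only means this particular run never took that path.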

Public API

graph/api.py is the only surface the MCP server (and any other client) should import — never reach into extractors or GraphStore directly.

from graph.api import (
    build_graph, load_graph, find_node, get_context, list_gaps, get_edge_path,
    run_trace, merge_trace, start_trace, stop_trace,
)

Querying. get_context(store, node_id, depth=2, direction="in") returns a bounded subgraph around a node. It is incoming-biased by default: deep predecessors (who calls this — the blast radius), one shallow successor hop, and a successor pull around gap nodes. Pass direction="out" / "both" to change the bias.
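
A standalone sketch of the direction bias may help. It does not call `get_context`; `bounded_context` and `walk` are hypothetical helpers that assume the bias means "full depth on the favored side, one hop on the other", and the gap-node successor pull is left out.

```python
from collections import deque


def walk(adjacency, start, depth):
    """Breadth-first walk that stops expanding once `depth` hops are reached."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == depth:
            continue
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return seen


def bounded_context(edges, node_id, depth=2, direction="in"):
    succ, pred = {}, {}
    for src, dst in edges:
        succ.setdefault(src, []).append(dst)
        pred.setdefault(dst, []).append(src)
    # Favored side gets the full depth; the other side gets a single shallow hop.
    in_depth = depth if direction in ("in", "both") else 1
    out_depth = depth if direction in ("out", "both") else 1
    keep = walk(pred, node_id, in_depth) | walk(succ, node_id, out_depth)
    kept_edges = [(s, d) for s, d in edges if s in keep and d in keep]
    return sorted(keep), kept_edges


EDGES = [("a", "b"), ("b", "c"), ("x", "c"), ("c", "d"), ("d", "e")]
incoming, _ = bounded_context(EDGES, "c")
outgoing, _ = bounded_context(EDGES, "c", direction="out")
print(incoming)
print(outgoing)
```

With the default bias, the callers two hops up (`a`) are kept while the callee two hops down (`e`) is dropped; `direction="out"` flips that.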

Runtime tracing has three capture modes, all converging on one merge:

Mode	Entry point	Use it for
One-shot script / IDE run	`run_trace(target, project_root, …)`	capture + merge a single script or entry point in one call
Long-running session	`start_trace(project_root)` … `stop_trace(project_root, base_graph, output)`	a server or worker traced across many requests without restarting — every call in between is unioned into one capture
Per-request web	`TracingMiddleware` / `AsyncTracingMiddleware` (`tools/trace_middleware.py`)	trace one request at a time, triggered by an `X-Trace: 1` header

All three feed merge_trace(base_graph, call_log, project_root, output) — the single seam that folds a runtime call log onto a base graph. The base is a parameter: pass the static graph to merge a single action, or a prior runtime graph to accumulate a sequence of actions (call counts sum, edges union). The static graph is never mutated — every merge writes a fresh overlay.
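
The accumulation rule (call counts sum, edges union, base untouched) can be sketched in a few lines. This is not `merge_trace` itself: the `merge_call_log` helper, the dictionary layout, and the `call_count` field name are assumptions made for illustration.

```python
from copy import deepcopy


def merge_call_log(base_graph, call_log):
    """Return a new graph with runtime calls folded in; base_graph is left unchanged."""
    merged = deepcopy(base_graph)
    index = {(e["from"], e["to"]): e for e in merged["edges"]}
    for caller, callee in call_log:
        edge = index.get((caller, callee))
        if edge is None:
            # Runtime-only edge: add it to the overlay.
            edge = {"from": caller, "to": callee, "type": "CALLS", "call_count": 0}
            merged["edges"].append(edge)
            index[(caller, callee)] = edge
        edge["call_count"] = edge.get("call_count", 0) + 1
    return merged


static = {"edges": [{"from": "a", "to": "b", "type": "CALLS"}]}
first = merge_call_log(static, [("a", "b"), ("b", "c")])   # merge one action
second = merge_call_log(first, [("a", "b")])               # accumulate a second action
print(second["edges"])
```

Passing `first` back in as the base is what lets a sequence of actions build up in one graph while `static` stays exactly as extracted.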

Testing

pip install pytest
pytest -q

The suite (153 tests) is built on an inductive strategy: every atomic extraction pattern (structural, web/API, data access, messaging, signatures, data flow, async runtime) has a minimal fixture and an exact-count assertion. The working premise is that an extractor which handles every base case correctly should also handle their combinations.

tests/
  test_static_induction.py   structural · web · data · messaging · signatures · data flow
  test_phase2_edges.py       call resolution, dynamic dispatch, super(), properties
  test_phase3_runtime.py     tracer capture + edge-injector merge + end-to-end
  test_phase3_http.py        HTTP / async routing patterns
  test_runtime_api.py        public API: direction-aware context, merge/run/session tracing
  test_notebook_extractor.py .ipynb flattening + node/edge extraction across cells
  test_dashboard.py          metrics builder + self-contained dashboard generation
  fixtures/                  minimal atomic patterns per test
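
An exact-count test in this style might look like the sketch below. It is illustrative only: `count_function_nodes` is a hypothetical stand-in that counts `ast.FunctionDef` nodes, whereas the real suite asserts against the project's extractors and fixture files.

```python
import ast

# Minimal fixture: one atomic pattern, nothing else.
FIXTURE = "def only_function():\n    return 1\n"


def count_function_nodes(source):
    """Stand-in for the extractor: count function definitions in the source."""
    return sum(isinstance(node, ast.FunctionDef) for node in ast.walk(ast.parse(source)))


def test_single_function_yields_exactly_one_node():
    # Exact count, not "at least one": a duplicate or missing node both fail.
    assert count_function_nodes(FIXTURE) == 1


test_single_function_yields_exactly_one_node()
```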

Project structure

schema.py                       NodeSchema + EdgeSchema (Pydantic, source of truth)
run.py                          entry point: extract → store → visualize (contextai CLI)
graph/
  api.py                        public API consumed by the MCP server
  extractors/
    ast_extractor.py            Python AST → nodes (functions, classes, schemas, …)
    edge_extractor.py           all edge types
    notebook_extractor.py       .ipynb → flatten code cells → reuse AST pipeline
    flask_convention_extractor.py  templates + static assets
    runtime/                    sys.settrace tracer + call log + script runner + edge injector
  store/graph_store.py          NetworkX graph + JSON persistence + direction-aware neighbor traversal
  visualizer/visualizer.py      pyvis HTML output (out/graph.html)
tools/                          connectivity, coverage, duplicates, diff, tracing
benchmarks/flask-tutorial/      hand-authored ground-truth graph (diff target)
dashboard/                      self-contained dashboard → out/index.html
  metrics.py                    reuses the coverage + connectivity tools → one payload
  dashboard.py / template.html  embed graph + metrics into a single HTML file
out/                            generated artifacts (gitignored)
docs/                           design + planning notes
tests/                          inductive test suite + fixtures
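
The notebook path listed above (flatten code cells, then reuse the AST pipeline) could be sketched as follows. This is not the code in notebook_extractor.py; `flatten_notebook` and the sample notebook are hypothetical, and cells containing IPython magics would need extra handling before `ast.parse` accepts them.

```python
import ast
import json


def flatten_notebook(raw_json):
    """Join code cells into one source string and remember where each line came from."""
    notebook = json.loads(raw_json)
    lines, origin = [], []  # origin[i] = (cell index, line within that cell) for lines[i]
    for cell_index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # The notebook format stores source as either a string or a list of lines.
        text = "".join(source) if isinstance(source, list) else source
        for line_no, line in enumerate(text.splitlines(), start=1):
            lines.append(line)
            origin.append((cell_index, line_no))
    return "\n".join(lines), origin


NOTEBOOK = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {"cell_type": "code", "source": ["import os\n", "x = 1"]},
        {"cell_type": "code", "source": "def f():\n    return x\n"},
    ]
})

flat, origin = flatten_notebook(NOTEBOOK)
tree = ast.parse(flat)  # the flattened source can go through the normal AST pass
print(flat)
print(origin)
```

The `origin` list is what makes cell-aware locations possible: line 3 of the flattened source maps back to line 1 of the third cell.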

Roadmap

Done:
Static extraction (AST): nodes, edges, signatures, side effects
Flask framework conventions (routes, templates, static assets)
Runtime tracing (sync + async call chains)
Inductive test suite (153 tests)
Jupyter .ipynb notebook extraction: code cells → AST pipeline, with cell-aware locations

In progress or planned:
MCP server: expose the graph to LLM clients as a tool (docs/MCP_SERVER_PLAN.md)
LLM integration: neighborhood-context retrieval for debugging
More frameworks (Django, FastAPI), git/version metadata, derived impact edges (AFFECTS, DEPENDS_ON, TRIGGERS)
Multi-language extraction (JS/TS → UI_COMPONENT)

See docs/ for design and planning notes.

Status

Alpha. The static extractor and runtime tracer work and are covered by tests. The LLM/MCP consumption layer — the part that turns the graph into better debugging answers — is in active development.

License

No license has been chosen yet. Until one is added, all rights are reserved by the author.
