shobhit0521/ContextAI

ContextAI

Code graph extraction for LLM-assisted debugging. Turn a codebase into a directed property graph so an LLM gets surgical, structured, bounded context — the broken node plus its neighbors — instead of the whole repository.



Why

LLMs fail on large codebases for three reasons:

  • Too much context → noise, confusion, hallucination
  • Too little context → blind spots, wrong fixes
  • No structure → the model can't see how components relate

Bugs don't live in isolation — they live in the relationships between components. ContextAI makes those relationships explicit and traversable by representing every meaningful piece of code as a node and every relationship as an edge. When debugging, you feed the LLM only the broken node, its adjacent nodes, and the connecting edges.


What it does

   source code ──▶ extraction pipeline ──▶ directed property graph ──▶ bounded LLM context
                   (AST · framework ·          (nodes + edges,           (a node + its
                    patterns · runtime)         fully typed schema)        neighborhood)

ContextAI scans a Python project — .py modules and .ipynb notebooks — and emits a typed graph (graph.json) plus an interactive visualization (graph.html). Each node carries its signature, side effects, error-handling profile, and complexity; each edge carries its contract, criticality, and failure behavior.


Quickstart

Requirements: Python 3.11+

# install
pip install -e .          # or: pip install -r requirements.txt

# extract a graph from any file or directory
contextai path/to/your/project/      # or: python3 run.py path/to/your/project/

# outputs (under out/, gitignored):
#   out/graph.json   — all nodes + edges
#   out/index.html   — dashboard: coverage · connectivity health · interactive graph
#   out/graph.html   — raw interactive graph (pyvis, self-contained)

Open out/index.html for the full picture — extraction coverage, connectivity health (fragmentation, isolated nodes, sinks/sources), node-type distribution, and the live graph with a code/location details panel, all in one self-contained file.

Run it against the bundled benchmark (the VS Code Flask tutorial):

python3 run.py python-sample-vscode-flask-tutorial/

Architecture

| Layer | Tools | Gets you | Status |
|---|---|---|---|
| 1. AST + types | Python ast (.py + .ipynb) | functions, classes, imports, call sites, signatures, side effects | ✅ Implemented |
| 2. Framework conventions | custom parsers | route → handler, templates, static assets (Flask) | ✅ Flask; others planned |
| 2.5. Pattern matching | regex on source | DB tables, hardcoded URLs, cache ops, template refs | 🟡 Partial |
| 3. Runtime tracing | sys.settrace + asyncio hooks | actual call chains, dynamic dispatch, async fan-out | ✅ Implemented |
| 4. LLM pass | Claude / GPT | semantic intent, implicit relationships | ⏳ Planned |

No single method captures everything: static analysis sees what the code says; runtime tracing sees what it does. ContextAI merges both into one graph.

Pipeline

run.py
  ├─ Pass 1: walk files → emit NODES   (ast_extractor, flask_convention_extractor)
  └─ Pass 2: resolve IDs → emit EDGES  (edge_extractor)

graph/extractors/runtime/   ← trace a running app and merge real call chains
  ├─ tracer.py              sys.settrace + asyncio monkey-patches
  ├─ call_log.py            captured calls → call_log.json
  ├─ script_runner.py       drive a target script/entry point under the tracer
  └─ edge_injector.py       merge runtime calls into the static graph
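
The two-pass split can be illustrated with the standard-library ast module alone. This is a minimal sketch of the idea, not ContextAI's extractor: the `module:function` ID scheme and the names `extract_nodes` and `extract_edges` are assumptions made for illustration, and only direct calls by bare name are resolved.

```python
import ast


def extract_nodes(source: str, module: str) -> dict:
    """Pass 1: emit one node per function definition, keyed by a hypothetical ID."""
    tree = ast.parse(source)
    return {
        f"{module}:{node.name}": node
        for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef)
    }


def extract_edges(nodes: dict, module: str) -> list:
    """Pass 2: resolve call sites against the known node IDs and emit CALLS edges."""
    edges = []
    for caller_id, func in nodes.items():
        for call in ast.walk(func):
            if isinstance(call, ast.Call) and isinstance(call.func, ast.Name):
                callee_id = f"{module}:{call.func.id}"
                if callee_id in nodes:  # builtins and imports stay unresolved here
                    edges.append((caller_id, "CALLS", callee_id))
    return edges


SOURCE = """
def total():
    return sum(load())

def load():
    return [1, 2, 3]
"""

nodes = extract_nodes(SOURCE, "demo")
edges = extract_edges(nodes, "demo")
print(edges)
```

Note that `total` calls `load` before `load` is defined in the file. Collecting every node first is what lets the second pass resolve that call, which is one plausible reason for keeping the passes separate.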

The graph schema

schema.py is the source of truth (Pydantic v2).

Nodes carry: id, type, name, location, code, signature (typed inputs/outputs), side_effects, error_handling, and metadata (complexity, test coverage, staleness).

Edges carry: id, type, from, to, direction, contract (input/output shape), criticality, on_failure (retry / default / throw / circuit-break), and performance.
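
To make the field lists above concrete, here is a rough stand-in written with standard-library dataclasses rather than Pydantic. The field names follow the lists above; everything else is an assumption for illustration: the field types, the default values, the `from_id` / `to_id` attribute names (needed because `from` is a Python keyword), and the `to_json` helper. The real definitions live in schema.py and may differ.

```python
from dataclasses import dataclass, field, asdict
from enum import Enum


class OnFailure(str, Enum):
    RETRY = "retry"
    DEFAULT = "default"
    THROW = "throw"
    CIRCUIT_BREAK = "circuit-break"


@dataclass
class Node:
    id: str
    type: str                      # e.g. "FUNCTION" or "ROUTE_HANDLER"
    name: str
    location: str                  # assumed "path:line" string
    code: str = ""
    signature: dict = field(default_factory=dict)
    side_effects: list = field(default_factory=list)
    error_handling: dict = field(default_factory=dict)
    metadata: dict = field(default_factory=dict)


@dataclass
class Edge:
    id: str
    type: str
    from_id: str                   # serialized as "from"
    to_id: str                     # serialized as "to"
    direction: str = "forward"     # assumed value
    contract: dict = field(default_factory=dict)
    criticality: str = "medium"    # assumed value
    on_failure: OnFailure = OnFailure.THROW
    performance: dict = field(default_factory=dict)

    def to_json(self) -> dict:
        """Rename the keyword-safe attributes back to the schema's field names."""
        data = asdict(self)
        data["from"] = data.pop("from_id")
        data["to"] = data.pop("to_id")
        data["on_failure"] = self.on_failure.value
        return data


node = Node(id="demo:load", type="FUNCTION", name="load", location="demo.py:2")
edge = Edge(id="e1", type="CALLS", from_id="demo:total", to_id=node.id,
            on_failure=OnFailure.RETRY)
payload = edge.to_json()
print(payload["from"], "->", payload["to"], payload["on_failure"])
```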

Node types
  • Boundary: API_ENDPOINT, MESSAGE_CONSUMER, CRON_JOB
  • Logic: FUNCTION, CLASS, MIDDLEWARE, ROUTE_HANDLER
  • Data: SCHEMA, MODEL, DTO, DATABASE, TABLE, COLLECTION
  • Infra: MESSAGE_QUEUE, FILE_STORAGE, EXTERNAL_LIBRARY
  • Scaffolding: MODULE_INIT, ENTRY_POINT, FILE, TEMPLATE, STATIC_ASSET

Edge types
  • Call: CALLS, CALLS_ASYNC, DELEGATES_TO
  • Dependency: IMPORTS, IMPORTS_SIDE_EFFECT, INHERITS, IMPLEMENTS, INSTANTIATES, INJECTS
  • Data: READS, WRITES, VALIDATES, TRANSFORMS, MAPS_TO, RETURNS
  • Communication: HANDLES, GUARDS, PUBLISHES_TO, SUBSCRIBES_TO, CALLS_EXTERNAL, RENDERS, SERVES_STATIC
  • App wiring: USES_APP_INSTANCE

Analysis tools

All default to graph.json in the current directory.

python3 tools/graph_connectivity.py     # health score, isolated nodes, islands
python3 tools/coverage_from_graph.py    # % of source lines covered by nodes
python3 tools/graph_duplicates.py       # duplicate / overlapping node ranges
python3 tools/diff_graph.py a.json b.json   # diff two graphs
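
For a sense of what the connectivity check measures, the sketch below computes isolated nodes, sources, sinks, and islands (weakly connected components) from a plain edge list. It is an illustration under assumed inputs, not the code in tools/graph_connectivity.py; the function name `connectivity_report` and the `(source, target)` tuple format are hypothetical, and no health-score formula is attempted.

```python
from collections import defaultdict


def connectivity_report(node_ids, edges):
    """Summarize isolated nodes, sources, sinks, and weakly connected islands."""
    in_deg = defaultdict(int)
    out_deg = defaultdict(int)
    neighbors = defaultdict(set)  # direction ignored, used for island detection
    for src, dst in edges:
        out_deg[src] += 1
        in_deg[dst] += 1
        neighbors[src].add(dst)
        neighbors[dst].add(src)

    islands, seen = [], set()
    for start in node_ids:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            component.add(current)
            stack.extend(neighbors[current] - seen)
        islands.append(sorted(component))

    return {
        "isolated": [n for n in node_ids if not neighbors[n]],
        "sources": [n for n in node_ids if out_deg[n] and not in_deg[n]],
        "sinks": [n for n in node_ids if in_deg[n] and not out_deg[n]],
        "islands": islands,
    }


report = connectivity_report(["a", "b", "c", "d"], [("a", "b"), ("b", "c")])
print(report)
```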

Runtime tracing

python3 tools/run_with_tracing.py \
  --target your_app/main.py \
  --project-root your_app/ \
  --build-static \
  --output graph_runtime.json

Runs your app under the tracer, then merges observed call chains into the static graph (confirming static edges, filling gaps, and adding runtime-only edges).
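
The core mechanism, sys.settrace, can be shown in a few lines. The sketch below records caller → callee pairs for synchronous calls only; the helper name `trace_calls` is hypothetical, and the project's tracer (with its asyncio hooks and call log) is certainly more involved.

```python
import sys
from collections import Counter


def trace_calls(func, *args, **kwargs):
    """Run func under sys.settrace and count (caller, callee) pairs."""
    calls = Counter()

    def tracer(frame, event, arg):
        if event == "call" and frame.f_back is not None:
            caller = frame.f_back.f_code.co_name
            callee = frame.f_code.co_name
            calls[(caller, callee)] += 1
        return None  # no per-line tracing inside the new frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # always restore whatever tracer was active
    return result, calls


def inner(x):
    return x * 2


def outer(x):
    return inner(x) + inner(x + 1)


result, calls = trace_calls(outer, 3)
print(result, dict(calls))
```

Each observed pair corresponds to a candidate CALLS edge; matching those function names back to graph node IDs is the part a merge step would have to handle.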


Public API

graph/api.py is the only surface the MCP server (and any other client) should import — never reach into extractors or GraphStore directly.

from graph.api import (
    build_graph, load_graph, find_node, get_context, list_gaps, get_edge_path,
    run_trace, merge_trace, start_trace, stop_trace,
)

Querying. get_context(store, node_id, depth=2, direction="in") returns a bounded subgraph around a node. It is incoming-biased by default: deep predecessors (who calls this — the blast radius), one shallow successor hop, and a successor pull around gap nodes. Pass direction="out" / "both" to change the bias.
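
One way to picture the incoming bias is a breadth-first walk that goes deep over predecessors and only one hop over successors. The sketch below is an assumption-laden illustration, not the logic in graph/api.py: `bounded_context` is a hypothetical name, the graph is a bare list of `(source, target)` pairs, the mirrored behavior for direction="out" is a guess, and the successor pull around gap nodes is left out.

```python
from collections import deque


def bounded_context(edges, node_id, depth=2, direction="in"):
    """Collect node IDs around node_id, biased toward the chosen direction."""
    preds, succs = {}, {}
    for src, dst in edges:
        succs.setdefault(src, set()).add(dst)
        preds.setdefault(dst, set()).add(src)

    def walk(adjacency, limit):
        seen = {node_id}
        queue = deque([(node_id, 0)])
        while queue:
            current, dist = queue.popleft()
            if dist == limit:
                continue  # hop budget exhausted on this branch
            for nxt in adjacency.get(current, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
        return seen

    if direction == "in":
        return walk(preds, depth) | walk(succs, 1)   # deep callers, one callee hop
    if direction == "out":
        return walk(succs, depth) | walk(preds, 1)   # assumed mirror of "in"
    if direction == "both":
        return walk(preds, depth) | walk(succs, depth)
    raise ValueError(f"unknown direction: {direction!r}")


EDGES = [
    ("route", "handler"), ("handler", "service"), ("cron", "service"),
    ("service", "repo"), ("repo", "db"),
]
incoming = bounded_context(EDGES, "service", depth=2, direction="in")
outgoing = bounded_context(EDGES, "service", depth=2, direction="out")
print(sorted(incoming))
print(sorted(outgoing))
```

With the default bias, `db` stays out of the context for `service` because it is two successor hops away, while `route` is included as a second-level caller.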

Runtime tracing has three capture modes, all converging on one merge:

| Mode | Entry point | Use it for |
|---|---|---|
| One-shot script / IDE run | run_trace(target, project_root, …) | capture + merge a single script or entry point in one call |
| Long-running session | start_trace(project_root), then stop_trace(project_root, base_graph, output) | a server or worker traced across many requests without restarting; every call in between is unioned into one capture |
| Per-request web | TracingMiddleware / AsyncTracingMiddleware (tools/trace_middleware.py) | trace one request at a time, triggered by an X-Trace: 1 header |
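
The per-request mode can be pictured as a WSGI wrapper that turns tracing on only when the header is present. The class below is a hypothetical sketch, deliberately named differently from the project's TracingMiddleware, whose interface is not shown here; the `on_call` callback is likewise an assumption. In WSGI, an X-Trace request header arrives as the `HTTP_X_TRACE` environ key.

```python
import sys


class HeaderTracingSketch:
    """WSGI wrapper that traces a request only when it carries X-Trace: 1."""

    def __init__(self, app, on_call):
        self.app = app
        self.on_call = on_call  # hypothetical sink for (caller, callee) pairs

    def __call__(self, environ, start_response):
        if environ.get("HTTP_X_TRACE") != "1":
            return self.app(environ, start_response)

        def tracer(frame, event, arg):
            if event == "call" and frame.f_back is not None:
                self.on_call(frame.f_back.f_code.co_name, frame.f_code.co_name)
            return None

        previous = sys.gettrace()
        sys.settrace(tracer)
        try:
            # Materialize the body so lazily produced chunks run under the tracer.
            return list(self.app(environ, start_response))
        finally:
            sys.settrace(previous)


def greet():
    return b"hello"


def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [greet()]


seen = []
wrapped = HeaderTracingSketch(app, lambda caller, callee: seen.append((caller, callee)))

untraced_body = wrapped({}, lambda status, headers: None)
untraced_count = len(seen)
traced_body = wrapped({"HTTP_X_TRACE": "1"}, lambda status, headers: None)
print(seen)
```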

All three feed merge_trace(base_graph, call_log, project_root, output) — the single seam that folds a runtime call log onto a base graph. The base is a parameter: pass the static graph to merge a single action, or a prior runtime graph to accumulate a sequence of actions (call counts sum, edges union). The static graph is never mutated — every merge writes a fresh overlay.
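
The merge semantics described above (counts sum, edges union, base untouched) can be sketched as follows. This is not merge_trace itself: the function name `merge_call_log`, the `(caller, callee, count)` record shape, and the `call_count` and `origin` keys are all assumptions made for the example.

```python
import copy


def merge_call_log(base_graph, call_log):
    """Fold (caller, callee, count) records onto a fresh copy of base_graph."""
    merged = copy.deepcopy(base_graph)  # the base graph is never mutated
    by_endpoints = {(e["from"], e["to"]): e for e in merged["edges"]}
    for caller, callee, count in call_log:
        edge = by_endpoints.get((caller, callee))
        if edge is None:
            # Runtime-only edge: not present in the base graph.
            edge = {"from": caller, "to": callee, "type": "CALLS", "origin": "runtime"}
            merged["edges"].append(edge)
            by_endpoints[(caller, callee)] = edge
        edge["call_count"] = edge.get("call_count", 0) + count
    return merged


static = {"edges": [{"from": "a", "to": "b", "type": "CALLS"}]}
first = merge_call_log(static, [("a", "b", 2), ("b", "c", 1)])
second = merge_call_log(first, [("a", "b", 3)])  # accumulate onto a prior overlay
print(second["edges"])
```

Passing `first` as the base of the second merge is what accumulates a sequence of actions, while `static` stays exactly as it was extracted.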


Testing

pip install pytest
pytest -q

The suite (153 tests) follows an inductive strategy: every atomic extraction pattern (structural, web/API, data access, messaging, signatures, data flow, async runtime) has a minimal fixture and an exact-count assertion. The working assumption is that an extractor which handles every base case will also handle their combinations.

tests/
  test_static_induction.py   structural · web · data · messaging · signatures · data flow
  test_phase2_edges.py       call resolution, dynamic dispatch, super(), properties
  test_phase3_runtime.py     tracer capture + edge-injector merge + end-to-end
  test_phase3_http.py        HTTP / async routing patterns
  test_runtime_api.py        public API: direction-aware context, merge/run/session tracing
  test_notebook_extractor.py .ipynb flattening + node/edge extraction across cells
  test_dashboard.py          metrics builder + self-contained dashboard generation
  fixtures/                  minimal atomic patterns per test
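
As an illustration of the fixture-plus-exact-count style, the test below pins the number of definitions found in a tiny source snippet. It uses the standard-library ast module as a stand-in for the extractor; the fixture, the `count_definitions` helper, and the test name are hypothetical and do not come from the suite.

```python
import ast

FIXTURE = '''
class Repo:
    def get(self): ...
    def put(self): ...

def helper(): ...
'''


def count_definitions(source: str) -> dict:
    """Stand-in extractor: count class and function definitions in source."""
    tree = ast.parse(source)
    return {
        "CLASS": sum(isinstance(n, ast.ClassDef) for n in ast.walk(tree)),
        "FUNCTION": sum(isinstance(n, ast.FunctionDef) for n in ast.walk(tree)),
    }


def test_class_with_two_methods_and_one_function():
    counts = count_definitions(FIXTURE)
    # Exact equality fails on missed nodes and on duplicated nodes alike.
    assert counts == {"CLASS": 1, "FUNCTION": 3}


test_class_with_two_methods_and_one_function()
```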

Project structure

schema.py                       NodeSchema + EdgeSchema (Pydantic, source of truth)
run.py                          entry point: extract → store → visualize (contextai CLI)
graph/
  api.py                        public API consumed by the MCP server
  extractors/
    ast_extractor.py            Python AST → nodes (functions, classes, schemas, …)
    edge_extractor.py           all edge types
    notebook_extractor.py       .ipynb → flatten code cells → reuse AST pipeline
    flask_convention_extractor.py  templates + static assets
    runtime/                    sys.settrace tracer + call log + script runner + edge injector
  store/graph_store.py          NetworkX graph + JSON persistence + direction-aware neighbor traversal
  visualizer/visualizer.py      pyvis HTML output (out/graph.html)
tools/                          connectivity, coverage, duplicates, diff, tracing
benchmarks/flask-tutorial/      hand-authored ground-truth graph (diff target)
dashboard/                      self-contained dashboard → out/index.html
  metrics.py                    reuses the coverage + connectivity tools → one payload
  dashboard.py / template.html  embed graph + metrics into a single HTML file
out/                            generated artifacts (gitignored)
docs/                           design + planning notes
tests/                          inductive test suite + fixtures

Roadmap

Done:
  • Static extraction (AST): nodes, edges, signatures, side effects
  • Flask framework conventions (routes, templates, static assets)
  • Runtime tracing (sync + async call chains)
  • Inductive test suite (153 tests)
  • Jupyter .ipynb notebook extraction: code cells → AST pipeline, with cell-aware locations

Planned or in progress:
  • MCP server: expose the graph to LLM clients as a tool (docs/MCP_SERVER_PLAN.md)
  • LLM integration: neighborhood-context retrieval for debugging
  • More frameworks (Django, FastAPI), git/version metadata, derived impact edges (AFFECTS, DEPENDS_ON, TRIGGERS)
  • Multi-language extraction (JS/TS → UI_COMPONENT)

See docs/ for design and planning notes.


Status

Alpha. The static extractor and runtime tracer work and are covered by tests. The LLM/MCP consumption layer — the part that turns the graph into better debugging answers — is in active development.


License

No license has been chosen yet. Until one is added, all rights are reserved by the author.

About

Solving the problem of LLMs working on large codebases and breaking existing pipelines.
