Code Intelligence

scarecr0w12 edited this page Jun 20, 2026 · 3 revisions

Code Intelligence (Codegraph)

CortexPrism's code intelligence system builds a structural model of a codebase. Tree-sitter WASM parsers extract code symbols and their relationships into a graph database, and an FTS5 index provides full-text search over those symbols.

Architecture

┌──────────────────────────────────────────────────────────────┐
│                    Code Intelligence                          │
│                                                               │
│  tree-sitter WASM (12 languages, 43+ extension mappings)      │
│          │                                                    │
│          ▼                                                    │
│  ┌──────────────────────────────────────┐                    │
│  │          AST Extraction               │                    │
│  │  - Functions, classes, interfaces     │                    │
│  │  - Imports, exports, calls            │                    │
│  │  - Decorators, HTTP calls, async      │                    │
│  │  - Complexity estimation              │                    │
│  └──────────────────────────────────────┘                    │
│          │                                                    │
│          ▼                                                    │
│  ┌──────────────────────────────────────┐                    │
│  │        Code Graph (memory.db)         │                    │
│  │  - 12 node labels (CodeFunction,      │                    │
│  │    CodeClass, CodeInterface, etc.)     │                    │
│  │  - 18 edge types (CALLS, IMPORTS,     │                    │
│  │    DEFINES, IMPLEMENTS, INHERITS)      │                    │
│  │  - FTS5 full-text search              │                    │
│  └──────────────────────────────────────┘                    │
│          │                                                    │
│          ▼                                                    │
│  ┌──────────────────────────────────────┐                    │
│  │        Codegraph Resolver             │                    │
│  │  - Exact symbol match                 │                    │
│  │  - Method on class resolution         │                    │
│  │  - Wildcard/relative import           │                    │
│  │  - Import-map resolution (package →   │                    │
│  │    local path fallthrough)            │                    │
│  │  - Type inference                     │                    │
│  │  - Fallback search                    │                    │
│  │  - FFI bridge detection (JNI, cgo)    │                    │
│  └──────────────────────────────────────┘                    │
│          │                                                    │
│          ▼                                                    │
│  ┌──────────────────────────────────────┐                    │
│  │           7 Agent Tools               │                    │
│  │  code_index    code_search_symbol     │                    │
│  │  code_trace_path  code_get_architecture│                   │
│  │  code_analyze_impact  code_list_projects│                  │
│  │  code_pilot                            │                   │
│  └──────────────────────────────────────┘                    │
│                                                               │
│  Web UI: D3.js force-directed graph                           │
│  ┌──────────────────────────────────────┐                    │
│  │  - Interactive dependency graph       │                    │
│  │  - Symbol search with FTS5            │                    │
│  │  - Impact analysis (blast radius)     │                    │
│  │  - Path tracer (caller ↔ callee)      │                    │
│  │  - Architecture panel (layers, FFI)   │                    │
│  │  - 4 bottom-panel tabs: Ownership,    │                    │
│  │    History, Q&A, Pilot                │                    │
│  │  - Incremental sync watcher (30s)     │                    │
│  └──────────────────────────────────────┘                    │
└──────────────────────────────────────────────────────────────┘

Supported Languages

Tree-sitter WASM parsers are lazy-loaded from a CDN (12 grammar parsers, 43+ file extension mappings):

| Language | File Extensions | Parser |
| --- | --- | --- |
| TypeScript | .ts, .tsx | tree-sitter-typescript |
| JavaScript | .js, .jsx, .mjs, .cjs | tree-sitter-javascript |
| Python | .py, .pyi | tree-sitter-python |
| Go | .go | tree-sitter-go |
| Rust | .rs | tree-sitter-rust |
| Java | .java | tree-sitter-java |
| Kotlin | .kt, .kts | tree-sitter-kotlin |
| C | .c, .h | tree-sitter-c |
| C++ | .cpp, .cc, .cxx, .hpp, .hh | tree-sitter-cpp |
| C# | .cs | tree-sitter-c_sharp |
| Ruby | .rb | tree-sitter-ruby |
| PHP | .php | tree-sitter-php |
| Swift | .swift | tree-sitter-swift |
| Scala | .scala | tree-sitter-scala |
| Lua | .lua | tree-sitter-lua |
| Bash | .sh, .bash, .zsh | tree-sitter-bash |
| SQL | .sql | tree-sitter-sql |
| Vue | .vue | tree-sitter-vue |
| Svelte | .svelte | tree-sitter-svelte |
| HTML | .html | tree-sitter-html |
| CSS | .css, .scss, .less | tree-sitter-css |
| JSON | .json | tree-sitter-json |
| YAML | .yaml, .yml | tree-sitter-yaml |
| TOML | .toml | tree-sitter-toml |
| XML | .xml | tree-sitter-xml |
| Markdown | .md, .mdx | tree-sitter-markdown |
| GraphQL | .graphql, .gql | tree-sitter-graphql |
| Protobuf | .proto | tree-sitter-protobuf |
| HCL | .tf, .hcl | tree-sitter-hcl |
| Dockerfile | .dockerfile | tree-sitter-dockerfile |
| CMake | .cmake | tree-sitter-cmake |
| Dart | .dart | tree-sitter-dart |
| R | .r | tree-sitter-r |
| Zig | .zig | tree-sitter-zig |
| Nim | .nim | tree-sitter-nim |
| Elixir | .ex, .exs | tree-sitter-elixir |
| Erlang | .erl, .hrl | tree-sitter-erlang |
| Haskell | .hs | tree-sitter-haskell |
| OCaml | .ml, .mli | tree-sitter-ocaml |
| Emacs Lisp | .el | tree-sitter-elisp |

When a language has no tree-sitter grammar installed, the file is silently skipped (not counted as an error). Unsupported file extensions are also skipped.
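
As a rough illustration of extension-based detection and the skip behaviour described above, the sketch below maps a file path to a language and separates indexable files from skipped ones. The names `EXTENSION_TO_LANGUAGE`, `detectLanguage`, and `partitionByLanguage` are hypothetical, and the map holds only a small subset of the table; the actual CortexPrism implementation is not shown on this page.

```typescript
// Hypothetical subset of the extension-to-language mapping listed in the table.
const EXTENSION_TO_LANGUAGE = new Map<string, string>([
  [".ts", "typescript"],
  [".tsx", "typescript"],
  [".js", "javascript"],
  [".py", "python"],
  [".go", "go"],
  [".rs", "rust"],
]);

// Returns the language for a path, or null when the extension is not mapped.
function detectLanguage(filePath: string): string | null {
  const fileName = filePath.slice(filePath.lastIndexOf("/") + 1);
  const dot = fileName.lastIndexOf(".");
  if (dot <= 0) return null; // no extension, or a dotfile such as ".gitignore"
  const ext = fileName.slice(dot).toLowerCase();
  return EXTENSION_TO_LANGUAGE.get(ext) ?? null;
}

// Splits paths into files to parse and files to skip. Skipped files are
// collected separately and are not treated as errors.
function partitionByLanguage(paths: string[]): { indexable: string[]; skipped: string[] } {
  const indexable: string[] = [];
  const skipped: string[] = [];
  for (const p of paths) {
    if (detectLanguage(p) === null) skipped.push(p);
    else indexable.push(p);
  }
  return { indexable, skipped };
}
```

Keeping skipped files out of the error count matches the behaviour stated above: a repository with many unsupported files would otherwise report a misleadingly high error total.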

Node Labels (12)

CodeProject, CodePackage, CodeFile, CodeModule, CodeFunction, CodeMethod, CodeClass, CodeInterface, CodeEnum, CodeType, CodeRoute, CodeResource

Edge Types (18)

CALLS, IMPORTS, DEFINES, DEFINES_METHOD, IMPLEMENTS, INHERITS, HTTP_CALLS, ASYNC_CALLS, DECORATES, USES_TYPE, USAGE, MEMBER_OF, CONTAINS_PACKAGE, CONTAINS_FILE, HANDLES, TESTS, CONFIGURES, DATA_FLOWS
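
One way to keep these two vocabularies consistent in client code is to encode them as literal union types. This is only a sketch: the label and edge names come from the lists above, while `CodeEdge` and `isEdgeType` are hypothetical helpers, not part of the documented API.

```typescript
// The 12 node labels listed above, as a readonly tuple.
const NODE_LABELS = [
  "CodeProject", "CodePackage", "CodeFile", "CodeModule",
  "CodeFunction", "CodeMethod", "CodeClass", "CodeInterface",
  "CodeEnum", "CodeType", "CodeRoute", "CodeResource",
] as const;
type NodeLabel = (typeof NODE_LABELS)[number];

// The 18 edge types listed above.
const EDGE_TYPES = [
  "CALLS", "IMPORTS", "DEFINES", "DEFINES_METHOD", "IMPLEMENTS", "INHERITS",
  "HTTP_CALLS", "ASYNC_CALLS", "DECORATES", "USES_TYPE", "USAGE", "MEMBER_OF",
  "CONTAINS_PACKAGE", "CONTAINS_FILE", "HANDLES", "TESTS", "CONFIGURES", "DATA_FLOWS",
] as const;
type EdgeType = (typeof EDGE_TYPES)[number];

// Hypothetical edge shape; the real schema in memory.db may differ.
interface CodeEdge {
  sourceId: string;
  targetId: string;
  type: EdgeType;
}

// Type guard for validating untrusted strings (for example, API input).
function isEdgeType(value: string): value is EdgeType {
  return (EDGE_TYPES as readonly string[]).indexOf(value) !== -1;
}
```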

Ingestion Pipeline

Indexing (code_index)

code_index(projectPath)
  1. Directory walk (max 200K files, depth 100)
  2. File hash comparison (skip unchanged)
  3. Language detection by extension
  4. WASM parser validation (integrity check)
  5. AST extraction per file
  6. Chunked bulk insert (BFS-batched: 2 queries/level instead of N+1)
  7. Edge insertion with INSERT OR IGNORE + orphan cleanup (PRAGMA foreign_keys = OFF)
  8. FTS5 index rebuild (explicit `rebuild` command syncing virtual table with content table)

Indexing returns diagnostics: nodeCount, edgeCount, fileCount, errorCount, and errorSample (first 5 error messages). Missing grammars and unsupported languages are silently skipped.
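
The hash comparison in step 2 and the diagnostics shape can be sketched as follows. The field names in `IndexDiagnostics` are taken from the text above; `selectFilesToIndex` and `summarizeErrors` are hypothetical helpers, and the hashes are assumed to be precomputed strings (the hashing algorithm is not specified on this page).

```typescript
// Diagnostics fields as named in the text above.
interface IndexDiagnostics {
  nodeCount: number;
  edgeCount: number;
  fileCount: number;
  errorCount: number;
  errorSample: string[]; // first 5 error messages
}

// Returns the paths whose current hash differs from the stored hash (or that
// have no stored hash). With force = true, every file is selected.
function selectFilesToIndex(
  currentHashes: Map<string, string>,
  storedHashes: Map<string, string>,
  force = false,
): string[] {
  const selected: string[] = [];
  currentHashes.forEach((hash, path) => {
    if (force || storedHashes.get(path) !== hash) selected.push(path);
  });
  return selected;
}

// Builds the error portion of the diagnostics: total count plus a capped sample.
function summarizeErrors(errors: string[]): Pick<IndexDiagnostics, "errorCount" | "errorSample"> {
  return { errorCount: errors.length, errorSample: errors.slice(0, 5) };
}
```

Passing `force = true` corresponds to the `force` parameter of `code_index`, which re-indexes all files regardless of the hash check.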

Incremental Sync (v0.45.3)

The Codegraph page starts a 30-second polling loop that calls POST /api/codegraph/incremental-sync. Each call re-indexes only changed files (detected by hash comparison), and the graph visualization refreshes automatically when new nodes or edges are found. The watcher stops when the user leaves the page or switches projects.
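
A minimal sketch of such a watcher is shown below. It assumes the sync call resolves to counts of new nodes and edges; the `SyncResult` field names, `shouldRefresh`, and `startSyncWatcher` are hypothetical, since the response shape of the endpoint is not documented here. The sync function is injected so the loop can be exercised without a server.

```typescript
// Hypothetical response shape for the incremental-sync endpoint.
interface SyncResult {
  newNodes: number;
  newEdges: number;
}

// The graph is refreshed only when the sync found something new.
function shouldRefresh(result: SyncResult): boolean {
  return result.newNodes > 0 || result.newEdges > 0;
}

// Starts polling and returns a stop function. The caller would invoke the
// stop function when leaving the page or switching projects.
function startSyncWatcher(
  sync: () => Promise<SyncResult>,
  refresh: () => void,
  intervalMs = 30_000,
): () => void {
  let stopped = false;
  const timer = setInterval(async () => {
    try {
      const result = await sync();
      // Skip the refresh if the watcher was stopped while the request ran.
      if (!stopped && shouldRefresh(result)) refresh();
    } catch {
      // A failed poll is ignored here; the next tick simply retries.
    }
  }, intervalMs);
  return () => {
    stopped = true;
    clearInterval(timer);
  };
}
```

The `stopped` flag guards against a refresh firing after the user has already switched projects while a request was still in flight.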

Resolver (7 strategies)

  1. Exact symbol match — direct name lookup
  2. Method on class — resolve ClassName.methodName
  3. Wildcard import — expand import * as X references
  4. Relative import path — resolve ./path/to/module
  5. Import-map resolution — package-level import maps resolved first, then fallback to generic candidate match with lower confidence
  6. Type inference — follow type declarations to methods
  7. Fallback search — FTS5 full-text lookup
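
The ordered-strategy idea can be sketched as a chain in which the first strategy to return a match wins. Only three of the seven strategies are shown, the confidence values are illustrative, the substring match merely stands in for the FTS5 lookup, and all type and function names are hypothetical.

```typescript
// Hypothetical symbol record; the real graph schema is not shown on this page.
interface SymbolEntry {
  id: string;
  name: string;
  owner?: string; // class name, present for methods only
}

interface Resolution {
  symbolId: string;
  strategy: string;
  confidence: number;
}

interface Strategy {
  name: string;
  confidence: number; // illustrative values, not the real scores
  resolve: (ref: string, table: SymbolEntry[]) => string | null;
}

const STRATEGIES: Strategy[] = [
  {
    name: "exact",
    confidence: 1.0,
    resolve: (ref, table) => {
      for (const s of table) if (s.owner === undefined && s.name === ref) return s.id;
      return null;
    },
  },
  {
    name: "method-on-class",
    confidence: 0.9,
    resolve: (ref, table) => {
      const dot = ref.lastIndexOf(".");
      if (dot <= 0) return null;
      const owner = ref.slice(0, dot);
      const method = ref.slice(dot + 1);
      for (const s of table) if (s.owner === owner && s.name === method) return s.id;
      return null;
    },
  },
  {
    name: "fallback-search",
    confidence: 0.3,
    // Case-insensitive substring match, standing in for a full-text lookup.
    resolve: (ref, table) => {
      const needle = ref.toLowerCase();
      for (const s of table) if (s.name.toLowerCase().indexOf(needle) !== -1) return s.id;
      return null;
    },
  },
];

// Tries each strategy in order and returns the first match, or null.
function resolveReference(ref: string, table: SymbolEntry[]): Resolution | null {
  for (const strategy of STRATEGIES) {
    const symbolId = strategy.resolve(ref, table);
    if (symbolId !== null) {
      return { symbolId, strategy: strategy.name, confidence: strategy.confidence };
    }
  }
  return null;
}
```

Ordering the strategies from most to least precise means a cheap, certain match is never overridden by a fuzzier one, and the reported confidence tells callers how much to trust the edge.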

Polyglot Cross-Language Analysis (v0.45.3)

The architecture endpoint runs detectFFIBridges on loaded nodes via src/codegraph/polyglot.ts. When FFI bridges are detected (JNI, cgo, ctypes, etc.), the architecture response includes an ffiBridges field listing the cross-language connections. The module provides AST node normalization across 15+ languages with cross-language call tracing and language family grouping.

Agent Tools

code_index

Full repository indexing with incremental sync.

| Parameter | Type | Description |
| --- | --- | --- |
| projectPath | string | Root directory to index |
| projectName | string | Optional project identifier |
| force | boolean | Re-index all files (skip hash check) |

code_search_symbol

FTS5-backed symbol search across projects.

| Parameter | Type | Description |
| --- | --- | --- |
| query | string | Symbol name to search |
| project | string | Optional project filter |
| kind | string | Optional node label filter |
| limit | number | Max results (default 20) |

code_trace_path

Bidirectional call graph traversal.

| Parameter | Type | Description |
| --- | --- | --- |
| symbolId | string | Starting symbol node ID |
| direction | string | "inbound", "outbound", or "both" |
| maxDepth | number | Maximum traversal depth (default 5) |
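
To illustrate how the `direction` and `maxDepth` parameters could interact, here is a depth-limited breadth-first traversal over an in-memory edge list. It assumes "inbound" means callers and "outbound" means callees; `CallEdge` and `traceReachable` are hypothetical names, and the real tool queries the graph database rather than an array.

```typescript
type Direction = "inbound" | "outbound" | "both";

// Hypothetical CALLS edge: `from` calls `to`.
interface CallEdge {
  from: string;
  to: string;
}

// Returns every symbol reachable from startId within maxDepth hops,
// mapped to the depth at which it was first reached.
function traceReachable(
  edges: CallEdge[],
  startId: string,
  direction: Direction,
  maxDepth = 5,
): Map<string, number> {
  const depthOf = new Map<string, number>([[startId, 0]]);
  let frontier: string[] = [startId];
  for (let depth = 1; depth <= maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const id of frontier) {
      for (const edge of edges) {
        const neighbors: string[] = [];
        if (direction !== "inbound" && edge.from === id) neighbors.push(edge.to); // callees
        if (direction !== "outbound" && edge.to === id) neighbors.push(edge.from); // callers
        for (const n of neighbors) {
          if (!depthOf.has(n)) {
            depthOf.set(n, depth);
            next.push(n);
          }
        }
      }
    }
    frontier = next;
  }
  return depthOf;
}
```

The visited map doubles as cycle protection, so recursive call chains terminate even when `maxDepth` is large.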

code_get_architecture

System architecture diagram extraction.

| Parameter | Type | Description |
| --- | --- | --- |
| project | string | Project name |

Returns layers, modules, dependency structure, and FFI bridge information.

code_analyze_impact

Blast radius analysis for a symbol.

| Parameter | Type | Description |
| --- | --- | --- |
| symbolId | string | Target symbol node ID |

Returns direct/indirect callers, callees, dead code detection, and complexity metrics.

code_list_projects

Project registry with language statistics.

Returns per-project node counts, edge counts, language breakdowns, and last-indexed timestamps. The list merges indexed codegraph projects with filesystem projects from the Projects page; projects that have not been indexed yet are auto-indexed on first Codegraph load.

code_pilot

Token-optimized context builder with AST-aware pruning (v0.44.1, config wired in v0.45.3).

| Parameter | Type | Description |
| --- | --- | --- |
| project | string | Project name |
| tokenBudget | number | Max tokens to return (500–64000) |
| pruning | string | Pruning mode: "full", "signatures", or "imports" |
| includeTests | boolean | Include test files |
| filePattern | string | Optional glob filter (e.g. src/**/*.ts) |

Reads indexed project files, applies AST-aware pruning and chunking, and returns optimized code chunks with dependency and symbol metadata. The tool loads saved pilot config from loadConfig() (token budget, pruning mode, include tests) and uses those values as defaults when arguments aren't explicitly provided.
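
The defaulting behaviour described above can be sketched as follows: explicit arguments win, and saved config fills in whatever is missing. The type names and `resolvePilotOptions` are hypothetical, and clamping the budget into the documented 500–64000 range is an assumption; the tool may instead reject out-of-range values.

```typescript
type PruningMode = "full" | "signatures" | "imports";

// Hypothetical shape of the saved pilot config.
interface PilotConfig {
  tokenBudget: number;
  pruning: PruningMode;
  includeTests: boolean;
}

// Tool arguments as listed in the parameter table; all but project optional.
interface PilotArgs {
  project: string;
  tokenBudget?: number;
  pruning?: PruningMode;
  includeTests?: boolean;
  filePattern?: string;
}

interface ResolvedPilotOptions extends PilotConfig {
  project: string;
  filePattern?: string;
}

const MIN_TOKEN_BUDGET = 500;
const MAX_TOKEN_BUDGET = 64000;

// Explicit arguments take precedence; saved config supplies the defaults.
function resolvePilotOptions(args: PilotArgs, saved: PilotConfig): ResolvedPilotOptions {
  const budget = args.tokenBudget ?? saved.tokenBudget;
  return {
    project: args.project,
    // Assumption: out-of-range budgets are clamped rather than rejected.
    tokenBudget: Math.min(MAX_TOKEN_BUDGET, Math.max(MIN_TOKEN_BUDGET, budget)),
    pruning: args.pruning ?? saved.pruning,
    includeTests: args.includeTests ?? saved.includeTests,
    filePattern: args.filePattern,
  };
}
```

Using `??` rather than `||` matters for `includeTests`: an explicit `false` must override a saved `true`, which `||` would not allow.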

Web UI (Codegraph Page)

  • D3.js force-directed graph — interactive zoomable dependency visualization; nodes sized by degree (connection count), hover tooltips show type/name/file/line
  • Symbol search — FTS5-powered search with type filters and cross-repo search (🌐 All repos)
  • Impact analysis — click a node to see blast radius (callers + callees)
  • Path tracer — trace calls between any two symbols
  • Architecture panel — live Node/Edge/Hotspot counts, FFI bridge listing
  • 4 bottom-panel tabs: Ownership (git-blame), History (commit log), Q&A (semantic search with citations), Pilot (token budget config)
  • Incremental sync — 30-second polling auto-refreshes the graph on file changes
  • Index/Re-index button — inline path prompt next to project selector; empty-state shows actionable "Index a Project" button

API Endpoints

Core

| Method | Path | Description |
| --- | --- | --- |
| GET | /api/codegraph/projects | List indexed projects (merged with filesystem projects) |
| POST | /api/codegraph/index | Start code indexing (returns nodeCount, edgeCount, fileCount, errorCount, errorSample) |
| POST | /api/codegraph/incremental-sync | Re-index changed files only |
| GET | /api/codegraph/search?q=&project= | Search symbols |
| POST | /api/codegraph/impact | Impact analysis (returns { nodes: [...] }) |
| GET | /api/codegraph/architecture?project= | Architecture extraction (includes FFI bridges) |
| POST | /api/codegraph/trace | Path tracing (returns { paths: [[...]] }) |
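
As a small usage sketch for the search endpoint, the helper below builds a request URL from the `q` and `project` query parameters shown above. The function name `buildSearchUrl` and the base URL in the usage note are hypothetical; substitute the host and port of your own CortexPrism instance.

```typescript
// Builds a URL for GET /api/codegraph/search. Query values are percent-encoded
// so that symbol names containing spaces or punctuation survive the round trip.
function buildSearchUrl(baseUrl: string, query: string, project?: string): string {
  const root = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  let url = `${root}/api/codegraph/search?q=${encodeURIComponent(query)}`;
  if (project !== undefined && project !== "") {
    url += `&project=${encodeURIComponent(project)}`;
  }
  return url;
}
```

A caller might pass the result to `fetch`, for example `fetch(buildSearchUrl("http://localhost:3000", "parseFile", "demo"))`, and read the JSON body; the shape of that body is not documented on this page.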

Extended

| Method | Path | Description |
| --- | --- | --- |
| GET | /api/codegraph/search-all?q=&language= | Cross-repo FTS search across all projects |
| GET | /api/codegraph/languages?project= | Distinct languages per project |
| GET | /api/codegraph/ownership?file= | Git blame author attribution |
| GET | /api/codegraph/history?file= | Git log commit history |
| GET | /api/codegraph/qa?q=&project= | Semantic code Q&A with citations |
| GET | /api/codegraph/fitness?project= | Architecture fitness rule checks |
| GET | /api/codegraph/pilot-config | Codebase pilot token budget |
| PUT | /api/codegraph/pilot-config | Update pilot config |
