AST

Tree-sitter indexing, cross-reference storage, and AST-aware code skeletons for search and embeddings.

Languages and parsers

The AST subsystem covers 8 languages: C, C++, Python, Java, Kotlin, JavaScript, Rust, and TypeScript. It uses 7 tree-sitter parsers because C and C++ share one parser.
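The language-to-parser relationship can be pictured as a small lookup table. The sketch below is illustrative only: the parser names (such as "cpp") and the helper parser_for are hypothetical and not taken from the engine's code. It shows how eight languages collapse to seven parsers when C and C++ share one.

```python
# Hypothetical mapping: parser names are illustrative, not the engine's identifiers.
LANGUAGE_TO_PARSER = {
    "C": "cpp",        # C and C++ share one parser
    "C++": "cpp",
    "Python": "python",
    "Java": "java",
    "Kotlin": "kotlin",
    "JavaScript": "javascript",
    "Rust": "rust",
    "TypeScript": "typescript",
}


def parser_for(language: str) -> str:
    """Return the parser name for a language, or raise for unsupported ones."""
    try:
        return LANGUAGE_TO_PARSER[language]
    except KeyError:
        raise ValueError(f"unsupported language: {language}") from None


languages = sorted(LANGUAGE_TO_PARSER)
parsers = sorted(set(LANGUAGE_TO_PARSER.values()))
print(len(languages), "languages,", len(parsers), "parsers")  # 8 languages, 7 parsers
```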

Indexing flow

AST indexing is a two-phase process:

1. Parse files and store the extracted symbols.
2. Link cross-references between definitions and usages.

Indexing runs in the background in batches, so large workspaces are processed incrementally instead of blocking the main request path.
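The two phases and the batching can be sketched as follows. This is a minimal illustration under stated assumptions, not the engine's implementation: a regular expression stands in for the tree-sitter parse, and the names batched, parse_phase, and link_phase are hypothetical. In a real service the batch loop would be driven by a background worker. Here it runs inline to keep the example self-contained.

```python
import re
from itertools import islice
from typing import Dict, Iterable, Iterator, List, Set

# Assumption: a regex stands in for a real tree-sitter parse of each file.
DEF_RE = re.compile(r"^\s*def\s+(\w+)", re.MULTILINE)
WORD_RE = re.compile(r"\w+")


def batched(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def parse_phase(files: Dict[str, str], batch_size: int = 2) -> Dict[str, str]:
    """Phase 1: extract definitions batch by batch (symbol name -> defining file)."""
    definitions: Dict[str, str] = {}
    for batch in batched(sorted(files), batch_size):
        for path in batch:
            for name in DEF_RE.findall(files[path]):
                definitions[name] = path
    return definitions


def link_phase(files: Dict[str, str], definitions: Dict[str, str]) -> Dict[str, Set[str]]:
    """Phase 2: link each definition to the other files that mention it."""
    usages: Dict[str, Set[str]] = {name: set() for name in definitions}
    for path, text in files.items():
        for word in set(WORD_RE.findall(text)):
            if word in definitions and definitions[word] != path:
                usages[word].add(path)
    return usages


files = {
    "a.py": "def helper():\n    return 1\n",
    "b.py": "def main():\n    return helper()\n",
    "c.py": "x = helper()\n",
}
definitions = parse_phase(files)
links = link_phase(files, definitions)
print(definitions)             # {'helper': 'a.py', 'main': 'b.py'}
print(sorted(links["helper"]))  # ['b.py', 'c.py']
```

Linking runs only after all definitions are stored, because a usage in one file may refer to a definition found in a later batch.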

Storage model

AST data is stored in LMDB. The key layout uses prefixes that separate concerns:

d| for definitions
c| for fuzzy lookup
u| for back-links
classes| for inheritance
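Prefixed keys work well in an ordered key-value store because every record of one kind sits in a contiguous key range. The sketch below uses an in-memory stand-in rather than LMDB itself, and everything after each prefix (for example pkg::Foo or Base|Derived) is an assumed format for illustration only. The source specifies only the four prefixes.

```python
import bisect
from typing import Dict, List, Tuple


class PrefixStore:
    """In-memory stand-in for an ordered key-value store such as LMDB."""

    def __init__(self) -> None:
        self._keys: List[bytes] = []          # kept sorted, like a B-tree's key order
        self._values: Dict[bytes, bytes] = {}

    def put(self, key: bytes, value: bytes) -> None:
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def scan(self, prefix: bytes) -> List[Tuple[bytes, bytes]]:
        """Return all (key, value) pairs whose key starts with `prefix`."""
        start = bisect.bisect_left(self._keys, prefix)
        out = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            out.append((key, self._values[key]))
        return out


store = PrefixStore()
# Hypothetical key suffixes and values; only the prefixes come from the documentation.
store.put(b"d|pkg::Foo", b"definition record")
store.put(b"c|Foo", b"pkg::Foo")
store.put(b"u|pkg::Foo|main.py", b"usage record")
store.put(b"classes|Base|Derived", b"")

print(store.scan(b"c|"))        # [(b'c|Foo', b'pkg::Foo')]
print(store.scan(b"classes|"))  # [(b'classes|Base|Derived', b'')]
```

Note that a scan for c| does not pick up classes| keys, because the separator is part of the prefix being matched.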

Skeletonization

The skeletonizer builds abbreviated code views from declarations and selected members. These reduced snippets are used as embedding-friendly text, preserving structure while trimming implementation detail.
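To give a rough idea of what a skeleton looks like, the sketch below keeps class headers and function signatures and replaces every body with an ellipsis. It uses Python's standard ast module in place of tree-sitter and handles only Python source, so it illustrates the idea rather than the engine's skeletonizer. The function name skeletonize is hypothetical. It needs Python 3.9 or later for ast.unparse.

```python
import ast
from typing import List, Union

FunctionNode = Union[ast.FunctionDef, ast.AsyncFunctionDef]


def _signature(fn: FunctionNode, indent: str) -> str:
    """Render a function signature with its body replaced by '...'."""
    keyword = "async def" if isinstance(fn, ast.AsyncFunctionDef) else "def"
    returns = f" -> {ast.unparse(fn.returns)}" if fn.returns else ""
    return f"{indent}{keyword} {fn.name}({ast.unparse(fn.args)}){returns}: ..."


def skeletonize(source: str) -> str:
    """Keep top-level declarations and class methods; drop implementation detail."""
    lines: List[str] = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            lines.append(_signature(node, ""))
        elif isinstance(node, ast.ClassDef):
            bases = ", ".join(ast.unparse(base) for base in node.bases)
            lines.append(f"class {node.name}({bases}):" if bases else f"class {node.name}:")
            members = [
                _signature(member, "    ")
                for member in node.body
                if isinstance(member, (ast.FunctionDef, ast.AsyncFunctionDef))
            ]
            lines.extend(members or ["    ..."])
    return "\n".join(lines)


SOURCE = '''
class Stack(Base):
    def push(self, item: int) -> None:
        self.items.append(item)

def size(stack):
    return len(stack.items)
'''

skeleton = skeletonize(SOURCE)
print(skeleton)
```

The result keeps the names, parameters, and annotations that matter for search while leaving out statements such as self.items.append(item).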

HTTP endpoints

The engine exposes AST-related HTTP endpoints under /ast-*, including:

/ast-file-symbols
/ast-status

These endpoints return symbol information and the indexing status of the AST service.
