A type-safe query language for Tree-sitter.
Query in, typed data out.
Tree-sitter solved parsing. It powers syntax highlighting and code navigation at GitHub, drives the editing experience in Zed, Helix, and Neovim. It gives you a fast, accurate, incremental syntax tree for virtually any language.
The hard problem now is what comes after parsing: extracting structured data from the tree. Done by hand, that looks like this:
function extractFunction(node: SyntaxNode): FunctionInfo | null {
  if (node.type !== "function_declaration") {
    return null;
  }
  const name = node.childForFieldName("name");
  const body = node.childForFieldName("body");
  if (!name || !body) {
    return null;
  }
  return {
    name: name.text,
    body,
  };
}

Every extraction requires a new function, each one a potential source of bugs that won't surface until production.
Plotnik extends Tree-sitter queries with type annotations:
(function_declaration
  name: (identifier) @name :: string
  body: (statement_block) @body
) @func :: FunctionInfo

The query describes structure, and Plotnik infers the output type:
interface FunctionInfo {
  name: string;
  body: SyntaxNode;
}

This structure is guaranteed by the query engine. No defensive programming needed.
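To illustrate what that guarantee means for calling code, here is a minimal sketch of a consumer of `FunctionInfo`. The `SyntaxNode` interface is a pared-down stand-in for the Tree-sitter binding's node type, and `summarize` is a hypothetical helper, not part of Plotnik.

```typescript
// Pared-down stand-in for the Tree-sitter node type
// (assumption: only `type` and `text` are needed here).
interface SyntaxNode {
  type: string;
  text: string;
}

// Shape copied from the inferred interface shown above.
interface FunctionInfo {
  name: string;
  body: SyntaxNode;
}

// Hypothetical consumer: both fields are non-optional,
// so the function body contains no null checks.
function summarize(info: FunctionInfo): string {
  return `${info.name}: ${info.body.text.length} chars in ${info.body.type}`;
}

const example: FunctionInfo = {
  name: "greet",
  body: { type: "statement_block", text: "{ return 1; }" },
};
console.log(summarize(example));
```

The point of the sketch is the absence of `if (!name || !body)` branches: the type carries the guarantee, so the consumer can use the fields directly.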
Tree-sitter already has queries:
(function_declaration
  name: (identifier) @name
  body: (statement_block) @body)

The result is a flat capture list:
query.matches(tree.rootNode);
// → [{ captures: [{ name: "name", node }, { name: "body", node }] }, ...]

The assembly layer is up to you:
const name = match.captures.find((c) => c.name === "name")?.node;
const body = match.captures.find((c) => c.name === "body")?.node;
if (!name || !body) throw new Error("Missing capture");
return { name: name.text, body };

This means string-based lookup, null checks, and manual type definitions kept in sync by convention.
Tree-sitter queries are designed for matching. Plotnik adds the typing layer: the query is the type definition.
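As a self-contained illustration of that hand-written assembly layer, the sketch below turns one flat match into a `FunctionInfo`. The `SyntaxNode`, `Capture`, and `QueryMatch` interfaces are simplified stand-ins for the binding's types, and `requireCapture` and `toFunctionInfo` are hypothetical helper names.

```typescript
// Simplified stand-ins for the binding's types (assumptions, not the real API surface).
interface SyntaxNode {
  type: string;
  text: string;
}
interface Capture {
  name: string;
  node: SyntaxNode;
}
interface QueryMatch {
  captures: Capture[];
}

// Maintained by hand; nothing ties this to the query text.
interface FunctionInfo {
  name: string;
  body: SyntaxNode;
}

// Hypothetical helper: string-based lookup with a runtime failure mode.
function requireCapture(match: QueryMatch, name: string): SyntaxNode {
  const found = match.captures.find((c) => c.name === name);
  if (!found) throw new Error(`Missing capture: ${name}`);
  return found.node;
}

// Hand-written assembly: the capture names must match the query by convention only.
function toFunctionInfo(match: QueryMatch): FunctionInfo {
  return {
    name: requireCapture(match, "name").text,
    body: requireCapture(match, "body"),
  };
}

const sampleMatch: QueryMatch = {
  captures: [
    { name: "name", node: { type: "identifier", text: "greet" } },
    { name: "body", node: { type: "statement_block", text: "{}" } },
  ],
};
console.log(toFunctionInfo(sampleMatch).name);
```

Renaming `@name` in the query without updating the `"name"` string here would compile cleanly and fail only at runtime, which is the gap the typing layer is meant to close.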
| Hand-written extraction | Plotnik |
|---|---|
| Manual navigation | Declarative pattern matching |
| Runtime type errors | Compile-time type inference |
| Repetitive extraction code | Single-query extraction |
| Ad-hoc data structures | Generated structs/interfaces |
Plotnik extends Tree-sitter's query syntax with:
- Named expressions for composition and reuse
- Recursion for arbitrarily nested structures
- Type annotations for precise output shapes
- Alternations: untagged for simplicity, tagged for precision (discriminated unions)
Typical use cases:
- Scripting: Count patterns, extract metrics, audit dependencies
- Custom linters: Encode your business rules and architecture constraints
- LLM Pipelines: Extract signatures and types as structured data for RAG
- Code Intelligence: Outline views, navigation, symbol extraction across grammars
Start simple—extract all function names from a file:
Functions = (program
  {(function_declaration name: (identifier) @name :: string)}* @functions)

Plotnik infers the output type:
type Functions = {
  functions: { name: string }[];
};

Scale up to tagged unions for richer structure:
Statement = [
  Assign: (assignment_expression
    left: (identifier) @target :: string
    right: (Expression) @value)
  Call: (call_expression
    function: (identifier) @func :: string
    arguments: (arguments (Expression)* @args))
]
Expression = [
  Ident: (identifier) @name :: string
  Num: (number) @value :: string
]
TopDefinitions = (program (Statement)+ @statements)

This produces:
type Statement =
  | { $tag: "Assign"; $data: { target: string; value: Expression } }
  | { $tag: "Call"; $data: { func: string; args: Expression[] } };
type Expression =
  | { $tag: "Ident"; $data: { name: string } }
  | { $tag: "Num"; $data: { value: string } };
type TopDefinitions = {
  statements: [Statement, ...Statement[]];
};

Then process the results:
for (const stmt of result.statements) {
  switch (stmt.$tag) {
    case "Assign":
      console.log(`Assignment to ${stmt.$data.target}`);
      break;
    case "Call":
      console.log(
        `Call to ${stmt.$data.func} with ${stmt.$data.args.length} args`,
      );
      break;
  }
}

For the detailed specification, see the Language Reference.
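Because the generated unions are discriminated on `$tag`, TypeScript can also check that every variant is handled. The sketch below restates the `Statement` and `Expression` types from the example and adds an exhaustiveness guard; `assertNever`, `showExpression`, and `describeStatement` are illustrative names, not part of Plotnik's output.

```typescript
// Types restated from the tagged-union example.
type Expression =
  | { $tag: "Ident"; $data: { name: string } }
  | { $tag: "Num"; $data: { value: string } };

type Statement =
  | { $tag: "Assign"; $data: { target: string; value: Expression } }
  | { $tag: "Call"; $data: { func: string; args: Expression[] } };

// Illustrative guard: accepts only `never`, so a missing case is a compile error.
function assertNever(value: never): never {
  throw new Error(`Unhandled variant: ${JSON.stringify(value)}`);
}

function showExpression(expr: Expression): string {
  switch (expr.$tag) {
    case "Ident":
      return expr.$data.name;
    case "Num":
      return expr.$data.value;
    default:
      return assertNever(expr);
  }
}

function describeStatement(stmt: Statement): string {
  switch (stmt.$tag) {
    case "Assign":
      return `${stmt.$data.target} = ${showExpression(stmt.$data.value)}`;
    case "Call":
      return `${stmt.$data.func}(${stmt.$data.args.map(showExpression).join(", ")})`;
    default:
      return assertNever(stmt);
  }
}

const sample: Statement[] = [
  { $tag: "Assign", $data: { target: "x", value: { $tag: "Num", $data: { value: "42" } } } },
  { $tag: "Call", $data: { func: "print", args: [{ $tag: "Ident", $data: { name: "x" } }] } },
];
for (const stmt of sample) {
  console.log(describeStatement(stmt));
}
```

If a new variant is later added to the query, the `default` branches stop type-checking until the new `$tag` is handled, so the omission shows up at compile time.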
- CLI Guide — Command-line tool usage
- Language Reference — Complete syntax and semantics
- Type System — How output types are inferred from queries
- Runtime Engine — VM execution model (for contributors)
Plotnik bundles 15 languages out of the box: Bash, C, C++, CSS, Go, HTML, Java, JavaScript, JSON, Python, Rust, TOML, TSX, TypeScript, and YAML. The underlying arborium collection includes 60+ permissively-licensed grammars—additional languages can be enabled as needed.
Working now: Parser with error recovery, type inference, query execution, CLI tools (check, dump, infer, exec, trace, tree, langs).
Next up: CLI distribution (Homebrew, npm), language bindings (TypeScript/WASM, Python), LSP server, editor extensions.
Max Brunsfeld created Tree-sitter; Amaan Qureshi and other contributors maintain the parser ecosystem that makes this project possible.
This project is licensed under the Apache License (Version 2.0).
