# Expressive Language (`.rag`)

The expressive language is TinyAgents' **declarative blueprint format**. A `.rag` file describes an agent graph (its start node, state channels, nodes, and routing) as compact, side-effect-free source text. It compiles into the **same** [`graph`](Graph-Runtime) and [`harness`](Harness) structures as hand-written Rust: the runtime never knows whether a graph came from a Rust builder or from a `.rag` source string.

Crucially, `.rag` source can only **reference capabilities by name** (models, tools, subgraphs, routers, reducers) that Rust has already registered and allowed. It can never define behaviour or embed host code. That property is what makes `.rag` the **safe boundary for agent-authored plans**: a language model can emit a blueprint, and the same compiler + registry gate that validates a human-written file validates the model's output before anything runs. See [Recursion and RLM](Recursion-and-RLM) for where self-authoring fits in the recursive picture.

Source lives in [`src/language/`](https://github.com/tinyhumansai/tinyagents/tree/main/src/language); the module spec is [`docs/modules/expressive-language/README.md`](https://github.com/tinyhumansai/tinyagents/blob/main/docs/modules/expressive-language/README.md).

## The pipeline

A `.rag` file flows through four fixed phases.
Each phase is a separate submodule so callers can stop at the level of safety they need:

```text
source --> lexer --> tokens --> parser --> AST --> compiler --> Blueprint
        (lexer.rs)           (parser.rs)        (compiler.rs)
```

| Phase   | Module                          | Input                  | Output              | Validates                                          |
|---------|---------------------------------|------------------------|---------------------|----------------------------------------------------|
| Lex     | `language::lexer`               | `&str`                 | `Vec<SpannedToken>` | tokens, strings, numbers, `//` comments            |
| Parse   | `language::parser`              | tokens                 | `Program` (AST)     | structure: well-formed blocks, expected tokens     |
| Compile | `language::compiler::compile`   | `Program`              | `Vec<Blueprint>`    | semantics: duplicate names, start/targets, routing |
| Bind    | `language::compiler` (resolver) | `Blueprint` + registry | `()` (checked)      | capability references against the registry         |

The compiler then offers a final, optional phase, `build_graph`, that materialises a `Blueprint` into a runnable `CompiledGraph` using a Rust-supplied `NodeFactory`. Behaviour is always Rust's job; the blueprint only describes topology.

### 1. Lexer (`lexer.rs`)

`tokenize(source)` turns text into [`SpannedToken`]s, each carrying a 1-based line/column [`Span`]. The token set is deliberately tiny:

- identifiers / keywords: `[A-Za-z_][A-Za-z0-9_]*` (keywords like `graph`, `node`, `start` are lexed as identifiers and recognised contextually)
- numbers: integer or decimal, optionally signed (`50`, `1.5`, `-3`)
- double-quoted strings with `\n`, `\t`, `\r`, `\\`, `\"` escapes
- punctuation: `{ } [ ] ,` and the arrow `->`
- `//` line comments (skipped)

Lexical errors (unterminated string, invalid escape, malformed number, stray character) surface as `TinyAgentsError::Parse` with the offending line/column.

### 2. Parser (`parser.rs`)

`parse(&tokens)` (or the one-shot `parse_str(source)`) is a small hand-written recursive-descent parser. It performs **structural** validation only (expected-token checks and well-formed blocks) and produces a `Program` AST.
It does *not* check whether names exist or routes are valid; that is the compiler's job. Errors are `TinyAgentsError::Parse` with the span of the offending token.

### 3. Compiler (`compiler.rs`)

`compile(&program)` lowers each `graph` declaration into one serializable [`Blueprint`] and runs the semantic checks:

- duplicate node names within a graph are rejected,
- a graph must declare a `start` node, and it must be defined,
- every `next`, route, and edge target must be a defined node or the reserved `END`,
- a node may use static routing (`next` / an incident edge) **or** command routing (`routes`), never both.

Failures are `TinyAgentsError::Compile`. Routing precedence when lowering a node is: explicit `routes` > `next` > top-level edge > terminal. A `next END` or an edge to `END` becomes [`Routing::Terminal`].

### 4. Capability binding (the registry gate)

A compiled `Blueprint` is inert until its references are checked against what Rust has registered. This is the safety boundary, and there are two paths:

- **Minimal / manual**: `bind_capabilities(&blueprint, &resolver)` checks only `model` and `tool` references against a `CapabilityResolver` allowlist. Build one with `CapabilityResolver::from_lists(models, tools)` or the chaining `allow_model` / `allow_tool` helpers.
- **Strict / registry-backed**: `bind_capabilities_with_registry(&blueprint, &registry)` builds a fully populated resolver from a live [`CapabilityRegistry`](Registry) (models, tools, subgraphs, routers, reducers, plus the default node kinds) and runs `CapabilityResolver::bind_blueprint`. This additionally validates node `kind`s, subgraph/router references, and channel reducers.

The default recognised node kinds are `agent`, `model`, `tool_executor`, `subgraph`, `graph`, `router`, `human` (`DEFAULT_NODE_KINDS`).
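To make the minimal binding path concrete, here is a small self-contained sketch of an allowlist gate. It is not TinyAgents' `CapabilityResolver`: the `Allowlist` and `NodeRefs` types, the `check` method, and the error strings are all hypothetical stand-ins, and only the idea (chained `allow_model` / `allow_tool` helpers, first unregistered reference reported as an error) is taken from the description above.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the capability references one node carries.
pub struct NodeRefs {
    pub name: String,
    pub model: Option<String>,
    pub tools: Vec<String>,
}

/// Hypothetical allowlist of model and tool names (illustration only).
#[derive(Default)]
pub struct Allowlist {
    models: HashSet<String>,
    tools: HashSet<String>,
}

impl Allowlist {
    /// Chaining helper: permit one model name.
    pub fn allow_model(mut self, name: &str) -> Self {
        self.models.insert(name.to_string());
        self
    }

    /// Chaining helper: permit one tool name.
    pub fn allow_tool(mut self, name: &str) -> Self {
        self.tools.insert(name.to_string());
        self
    }

    /// Walks the nodes in order and reports the first reference
    /// that is not on the allowlist; nothing is executed here.
    pub fn check(&self, nodes: &[NodeRefs]) -> Result<(), String> {
        for node in nodes {
            if let Some(model) = &node.model {
                if !self.models.contains(model) {
                    return Err(format!(
                        "node `{}`: unregistered model `{}`",
                        node.name, model
                    ));
                }
            }
            for tool in &node.tools {
                if !self.tools.contains(tool) {
                    return Err(format!(
                        "node `{}`: unregistered tool `{}`",
                        node.name, tool
                    ));
                }
            }
        }
        Ok(())
    }
}
```

The point of the design is that the check is purely a name lookup: a blueprint that mentions a tool Rust never registered fails before any handler is constructed.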
Reference conventions used by the strict path:

- `subgraph` / `graph`: the node's `model` field names a registered subgraph blueprint,
- `router`: the `model` field names a registered router function,
- everything else: the `model` field names a registered chat model.

An unknown kind is a `Compile` error; the first unregistered model/tool/subgraph/router/reducer reference is a `Capability` error.

The convenience façade `compile_source(source, &registry)` runs the whole chain (`parse -> compile -> registry-bind`) and returns validated blueprints in one call.

### Materialising into the runtime

```text
Blueprint --build_graph(&blueprint, &factory)--> CompiledGraph
```

`build_graph` walks each `NodeSpec`, asks the caller's `NodeFactory::make` for a runnable handler, and wires routing into durable graph topology:

- `Routing::Next(target)` → `GraphBuilder::add_edge` (a static successor),
- `Routing::Conditional(_)` → `GraphBuilder::mark_command_routing` (the node decides its route at runtime by returning a `Command` `goto`),
- `Routing::Terminal` → `GraphBuilder::set_finish` (route to `END`).

The blueprint's `start` node becomes the graph entry. Because the factory is the only source of node *behaviour*, declarative source can never smuggle in arbitrary code.

## The `Blueprint` artifact

A [`Blueprint`] is the inspectable, fully serializable output of the compiler. It can be stored, diffed, reviewed in a UI, and reloaded independently of the source text, which is exactly what the agent-authoring and review workflows need.

```text
Blueprint {
    graph_id: String,                  // the graph name
    start:    String,                  // validated start node
    channels: Vec,                     // { name, reducer }
    nodes:    Vec<NodeSpec>,           // { name, kind, model?, prompt?, tools, routing }
    edges:    Vec,                     // { from, to }
    defaults: Vec<(String, Literal)>,
}
```

`NodeSpec::routing` is one of `Next(target)`, `Conditional([(label, target), …])`, or `Terminal`. An unspecified node `kind` defaults to `"model"` during compilation.
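The three routing variants, combined with the lowering precedence stated in the compiler section (`routes` > `next` > top-level edge > terminal), can be sketched in a few lines. This is an illustration under assumptions, not the crate's code: the `Routing` enum here is a local mirror of the variants named above, `lower_routing` is a hypothetical helper, and treating an empty route table as "no routes" is a guess made for the sketch.

```rust
/// Local mirror of the routing variants described above (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub enum Routing {
    Next(String),
    Conditional(Vec<(String, String)>),
    Terminal,
}

/// Hypothetical helper: picks a node's routing with the precedence
/// explicit `routes` > `next` > top-level edge > terminal.
pub fn lower_routing(
    routes: &[(String, String)],
    next: Option<&str>,
    edge_target: Option<&str>,
) -> Routing {
    // Assumption: an empty route table counts as "no routes declared".
    if !routes.is_empty() {
        return Routing::Conditional(routes.to_vec());
    }
    // `next` wins over a top-level edge; a target of END, or no target
    // at all, makes the node terminal.
    match next.or(edge_target) {
        Some("END") | None => Routing::Terminal,
        Some(target) => Routing::Next(target.to_string()),
    }
}
```

Under these assumptions, a node with only `next agent` lowers to `Routing::Next("agent")`, and a node with no routing information at all lowers to `Routing::Terminal`.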
## Grammar

The implemented v1 grammar (what the parser actually accepts) is:

```text
program     = graph_decl*

graph_decl  = "graph" ident "{" graph_item* "}"

graph_item  = "start" ident
            | "defaults" "{" ( ident literal )* "}"
            | "channel" ident ident            // channel name, then reducer
            | node_decl
            | edge_decl                        // ident "->" ident

node_decl   = "node" ident "{" node_item* "}"

node_item   = "kind" ident
            | "model" string
            | "system" string                  // alias for `prompt`
            | "prompt" string
            | "tools" "[" ( string ("," string)* )? "]"
            | "next" ident
            | "routes" "{" ( ident "->" ident )* "}"

literal     = string | number | ident
```

Notes:

- `END` is a reserved terminal target; it is written as a bare identifier.
- `system` and `prompt` both populate a node's prompt; `system` is accepted as an alias.
- The broader spec sketches future primitives (`command`, `sends`, `join`, `interrupt`, `metadata`, `retry`, `timeout`, `steering`, capability allow-lists, state schemas). The v1 parser above is the safe subset that is actually implemented; the AST and `Blueprint` leave room for the rest.

## A worked example

```text
// A support workflow with a tool loop.
graph support_agent {
    start agent

    defaults {
        recursion_limit 50
        backoff "exponential"
        checkpoint inherit
    }

    channel messages messages
    channel tool_calls append

    node agent {
        kind agent
        model "default"
        system "Resolve support requests using tools when useful."
        tools ["lookup_user", "create_ticket"]
        routes {
            tool_call -> tools
            final -> END
        }
    }

    node tools {
        kind tool_executor
        next agent
    }
}
```

Compiling and binding it from Rust (see the runnable [`examples/rag_blueprint.rs`](https://github.com/tinyhumansai/tinyagents/blob/main/examples/rag_blueprint.rs)):

```text
use tinyagents::language::parser::parse_str;
use tinyagents::language::compiler::{compile, bind_capabilities, CapabilityResolver};

let program = parse_str(SUPPORT_AGENT)?;        // lex + parse
let blueprint = compile(&program)?.remove(0);   // semantic compile

let allow = CapabilityResolver::from_lists(
    ["default".to_string()],                                   // allowed models
    ["lookup_user".to_string(), "create_ticket".to_string()],  // allowed tools
);
bind_capabilities(&blueprint, &allow)?;         // registry gate
```

The compiled blueprint reports:

- `start` = `agent`
- channels `messages` (reducer `messages`) and `tool_calls` (reducer `append`)
- node `agent` (kind `agent`, model `"default"`, tools `[lookup_user, create_ticket]`) with conditional routes `tool_call -> tools`, `final -> END`
- node `tools` (kind `tool_executor`) with `next -> agent`

This is the textbook agent loop: the agent node calls the model, routes to the tool executor when there is a tool call, and back, until it routes `final` to `END`.

Run it:

```text
cargo run --example rag_blueprint
```

## Self-authoring: a model emits, compiles, and runs `.rag`

The deepest recursion in TinyAgents is a model **writing the workflow it runs inside**. Because `.rag` is declarative and registry-bound, a model's output passes through the *exact same* `parse -> compile -> bind -> build_graph` path as a human-authored file, with the capability allowlist as the safety boundary. The model never executes code; it only produces source that a Rust-side `NodeFactory` materialises.

[`examples/openai_self_blueprint.rs`](https://github.com/tinyhumansai/tinyagents/blob/main/examples/openai_self_blueprint.rs) demonstrates the full loop:

1. Ask the model (OpenAI, behind the `openai` feature) to output **only** `.rag` source, handing it the grammar plus a worked example in the system prompt.
2. Strip any `` ``` `` fences and feed the text to `parse_str` → `compile`.
3. Bind the blueprint against a `CapabilityResolver` allowlist: only allowlisted models/tools pass; anything else is rejected.
4. `build_graph` with a trivial `NodeFactory`, then run to `END`.

If the model's output fails to parse, compile, or bind, the diagnostic and the offending source are surfaced, never executed. This is precisely why generated topology must flow through the compiler and policy checks instead of being installed directly: producing a graph never grants new capabilities.

```text
cargo run --features openai --example openai_self_blueprint
```

## Where `.rag` fits

- `.rag` defines graph **topology and bindings** declaratively.
- [`.ragsh`](REPL-Language-RAGSH) is the **imperative** counterpart: it inspects, scripts, and recursively orchestrates harness/graph runs, and it can draft, validate, compile, and (under policy) register `.rag` blueprints through this same compiler.
- Both lower into the same [graph](Graph-Runtime) + [harness](Harness) runtime as hand-written Rust.

## Errors

| Error                         | Phase                 | Examples                                                                                   |
|-------------------------------|-----------------------|--------------------------------------------------------------------------------------------|
| `TinyAgentsError::Parse`      | lexer / parser        | unterminated string, invalid escape, unexpected token                                      |
| `TinyAgentsError::Compile`    | `compile` / node kind | duplicate node, missing/undefined start, unknown target, mixed routing, unknown node kind  |
| `TinyAgentsError::Capability` | binding               | unregistered model, tool, subgraph, router, or reducer                                     |

## See also

- [Graph Runtime](Graph-Runtime): the durable runtime blueprints lower into.
- [Registry](Registry): the capability catalog `.rag` binds against by name.
- [REPL Language (.ragsh)](REPL-Language-RAGSH): the imperative RLM/CodeAct surface.
- [Recursion and RLM](Recursion-and-RLM): self-authoring and the recursive model.