Graph Runtime

The graph runtime (src/graph/) is TinyAgents' durable, typed workflow engine — a LangGraph-style superstep executor where nodes read a committed state snapshot, return partial updates, and the runtime merges, routes, checkpoints, and (when needed) pauses for human input. It is the second of the five surfaces in the Architecture and the layer that makes recursion durable: a graph node can embed another compiled graph, and a model standing inside a graph can author one. See Recursion and RLM for the full recursive picture.

The graph runtime owns structure and durability. It does not call models or tools itself — nodes call into the Harness when they need a model, a tool loop, or a sub-agent. This separation keeps execution deterministic, provider-neutral, and testable.

Mental model

A graph is a set of named nodes wired by edges, executed in supersteps over a typed State. Two virtual nodes bracket every run:

START ("__start__") — the synthetic entry; set_entry(node) is sugar for add_edge(START, node).
END ("__end__") — the synthetic terminal; set_finish(node) is sugar for add_edge(node, END).

flowchart TD
    Start((START)) --> A[agent]
    A -->|needs_tool| T[tool]
    A -->|done| End((END))
    T --> A

Each superstep takes the active node set, runs every active node against the same committed snapshot, folds their results through a reducer at the step boundary, persists a checkpoint, then selects the next active set. The loop stops when the active set empties, all branches reach END, an interrupt pauses the run, or the recursion limit is hit.

Building a graph: `GraphBuilder<State, Update>`

GraphBuilder (graph::builder) is generic over two types:

State — the committed graph state (must be Clone + Send + Sync + 'static).
Update — the partial-update type each node returns, merged into State by a StateReducer.

For the common "each node returns the whole next state" case, use GraphBuilder::<State, State>::overwrite(), which installs the OverwriteStateReducer:

let graph = GraphBuilder::<AgentState, AgentState>::overwrite()
    .add_node("agent", agent_node)
    .add_node("tool", tool_node)
    .set_entry("agent")
    .add_conditional_edges(
        "agent",
        |s: &AgentState| if s.needs_tool { "tool".into() } else { "done".into() },
        [("tool", "tool"), ("done", END)],
    )
    .add_edge("tool", "agent")
    .compile()?;

Key builder methods:

Method	Effect
`add_node(id, handler)`	Register an async node returning `Result<NodeResult<Update>>`.
`add_edge(from, to)`	Static, unconditional edge.
`set_entry(node)` / `set_finish(node)`	`START → node` / `node → END`.
`add_conditional_edges(from, router, routes)`	Router closure → label table.
`mark_command_routing(node)`	Node routes only via `Command.goto`.
`set_reducer(r)`	Install a custom `StateReducer`.
`with_recursion_limit(n)`	Cap supersteps (default 50).
`with_parallel(true)`	Run multi-node supersteps concurrently (see fan-out).
`with_graph_id(id)`	Override the generated graph id.
`compile()`	Validate topology and freeze into a `CompiledGraph`.

compile() enforces the invariants: an entry must exist, START cannot route straight to END and cannot be an edge target, END cannot be an edge source, a node cannot mix static and conditional edges, and a mark_command_routing node cannot also declare static/conditional edges. Failures return TinyAgentsError::Validation, MissingStart, MissingNode, etc.

Nodes and `NodeContext`

A node handler has the shape Fn(State, NodeContext) -> impl Future<Output = Result<NodeResult<Update>>>. It receives a clone of the committed state and a per-task NodeContext:

node_id, run_id, optional thread_id, and the 1-based step.
resume: Option<serde_json::Value> — None on a normal run; set to the resume payload when an interrupted node is re-run (see interrupts).
fork: Option<ForkId> — the branch identity when this node runs as one fork of a concurrent superstep (None in sequential mode or single-node steps).

Nodes never mutate shared state directly; they return a NodeResult, and the runtime owns all state changes through the reducer.

Node results: updates, commands, interrupts

NodeResult<Update> (graph::command) is one of three outcomes:

NodeResult::Update(update) — a partial update merged through the reducer.
NodeResult::Command(Command) — combine an optional update with explicit routing and/or a resume value.
NodeResult::Interrupt(Interrupt) — pause for human-in-the-loop input.

A Command<Update> has three orthogonal fields:

pub struct Command<Update> {
    pub update: Option<Update>, // applied before routing
    pub goto: Vec<NodeId>,      // overrides static/conditional edges
    pub resume: Option<serde_json::Value>, // paired with an interrupt resume
}

goto is the mechanism for dynamic routing and for Send-style fan-out: returning multiple targets schedules them all in the next superstep. Pair goto with mark_command_routing(node) so the compiler knows the node owns its own routing.

Routing precedence

At each step boundary the executor resolves the next targets for every completed node in this order (route in graph::compiled):

Command goto — explicit targets win.
Static edge — the single add_edge target.
Conditional edge — the router closure returns a label, resolved against the route table; an unknown label is TinyAgentsError::MissingRoute.
Sink — no routing configured: that branch ends.

Targets equal to END are dropped from the next active set; when the set empties the run completes.

Typed state, reducers, and channels

Reducers (graph::reducer) decide how writes merge at the step boundary. Two traits exist:

StateReducer<State, Update> — merges a partial Update into the whole State. This is the contract the executor uses.
Reducer<T> — merges two values of the same channel type (channel-style state, one merge policy per key).

Built-in markers: OverwriteStateReducer / OverwriteReducer (last write wins), AppendReducer (concatenate vectors), SetUnionReducer (union, dedup, first-seen order), MinReducer / MaxReducer, and the closure-backed ClosureStateReducer / ClosureReducer for custom merges. Because the executor folds branch updates in deterministic active-set index order, a reducer is also the fan-in / join for parallel supersteps — the merged state is reproducible regardless of which branch finishes first.

The superstep executor (`graph::compiled`)

compile() produces a CompiledGraph<State, Update> — immutable, cheap to clone (heavy fields are Arc-shared), and safe to run concurrently. Three entry points run it:

run(state) — run to completion or to an interrupt, with no thread (no checkpoints are persisted even if a checkpointer is attached).
run_with_thread(thread_id, state) — run under a thread id, persisting a checkpoint at every superstep boundary when a checkpointer is configured.
resume(thread_id, command) — resume an interrupted run from its latest checkpoint.

Each run returns a GraphExecution<State>:

Field	Meaning
`state`	Final committed state.
`visited`	Ordered list of executed nodes (may repeat across steps).
`steps`	Number of supersteps executed.
`interrupts`	Interrupts that paused the run (`is_interrupted()` is `!interrupts.is_empty()`).
`status`	A compact `GraphRunStatus` snapshot.
`checkpoint_id`	Latest persisted checkpoint id, if any.

Inside a step the executor: schedules the active nodes, runs each handler against its own state clone, folds results (fold_result) into updates / goto map / the lowest-index interrupt, applies the reducer at the boundary, persists a checkpoint, then selects the next active set. Exceeding the recursion limit is a deterministic TinyAgentsError::RecursionLimit.

Parallel fan-out

By default a step runs its active nodes sequentially, short-circuiting on the first error or interrupt. Compile with with_parallel(true) and a step with more than one active node runs every branch concurrently via futures::future::join_all. Each branch executes on its own cloned snapshot and a distinct ForkId { branch_index, node }. All branches are driven to completion before the boundary; results are then folded in active-set index order, so:

the reducer resolves conflicting writes deterministically (lower index first);
the lowest-index branch that errors or interrupts is the step's terminal outcome — an error aborts the run, an interrupt persists a checkpoint whose pending nodes are that branch and every later active node;
because branches never share mutable state, concurrency is data-race free.

Run status snapshots (`graph::status`)

GraphRunStatus is a compact, observable summary at an execution boundary — not a checkpoint. It answers "is this run active?", "which node is executing?", "which interrupt is waiting?" without deserializing full state. Crucially for recursion it carries root_run_id, parent_run_id, and checkpoint_namespace, so nested subgraph/sub-agent runs roll up into a single observable run tree.

Checkpoints, durability, and time travel (`graph::checkpoint`)

Checkpoints are graph-runtime persistence, separate from harness memory. They are written at superstep boundaries only — never mid-node — because re-running a node from its start is far easier to reason about than suspending an async Rust stack, and it matches interrupt/resume exactly.

A Checkpoint<State> records the thread_id lineage, this checkpoint_id and its parent_checkpoint_id, the namespace (for nested subgraphs), the committed state, the next_nodes to run on resume, the completed_tasks, any pending_writes, pending interrupts, and free-form metadata (source, step). CheckpointMetadata is the lightweight listing form that avoids deserializing full state.

The Checkpointer<State> trait (put / get / list) abstracts storage; InMemoryCheckpointer ships for tests and single-process runs. Attach one with with_checkpointer(...). Because every boundary is a parent-linked checkpoint, the lineage is a tree you can list, inspect, and resume from any point — this is the foundation for time travel: replay or fork a run from an earlier checkpoint by resuming the desired thread_id / checkpoint.

Interrupts and resume

An Interrupt { id, node, payload } is a human-in-the-loop pause. Interrupts require a checkpointer. When a node returns NodeResult::Interrupt, the executor applies the updates collected so far, persists a checkpoint whose next_nodes are the interrupted node plus every later active node, and returns control with GraphExecution::is_interrupted() == true.

To continue, call resume(thread_id, Command { resume: Some(value), .. }). The executor loads the latest checkpoint, re-runs the pending nodes with NodeContext::resume set to your value, and proceeds. Resume without a checkpointer, or with no pending nodes, returns TinyAgentsError::Resume.

Subgraphs: graph-level recursion (`graph::subgraph`)

A CompiledGraph can be embedded as a node in a parent graph — this is the graph-level recursion mechanism, the structural counterpart to the harness' sub-agents. Two adapters wrap a child graph into a node handler:

shared_subgraph_node(child) — parent and child share the same State/Update channel. The child runs over the parent's state; its final state becomes the parent update.
adapter_subgraph_node(child, to_child, from_child) — parent and child use different state shapes. to_child projects parent state into the child input; from_child folds the child's final state back into a parent update.

Both adapters append the embedding node id to the child's checkpoint namespace, so parent and child checkpoint ids never collide — nested runs stay independently inspectable. Combined with with_recursion_limit, subgraphs let a graph run a graph (and, transitively, a model author a graph that runs inside the graph it is executing in). The module spec (docs/modules/graph/subagents-recursion.md) extends this with a graph-level SubAgentNode / RecursionPolicy design — depth tracking, root_run_id / parent_run_id preservation, and child usage/cost rollups — for embedding a harness agent as a graph node. See Recursion and RLM.

Streaming and events (`graph::stream`)

Attach a GraphEventSink with with_event_sink(...) to observe execution. The executor emits low-level GraphEvents: StepStarted / StepCompleted, TaskScheduled, NodeStarted / NodeCompleted / NodeFailed, StateUpdated, RouteSelected, CheckpointSaved, and InterruptEmitted. NoopSink discards them; CollectingSink buffers them for assertions in tests.

StreamMode mirrors LangGraph's high-level projections — Values, Updates, Messages, Debug, Interrupts, Custom — as a selection enum over the same event stream.

Topology export and visualization (`graph::export`)

A GraphTopology is a serializable, behavior-free description of a graph's structure: graph_id, entry, recursion_limit, parallel, nodes (NodeInfo), edges (EdgeInfo), conditional_edges (ConditionalEdgeInfo / RouteInfo), finish_nodes, and channels (ChannelInfo). All collections are sorted, so exports are deterministic regardless of HashMap order.

Extract one with CompiledGraph::topology() or GraphBuilder::topology(), or from a .rag Blueprint via blueprint_to_topology. Render it with:

to_json / from_json — round-trippable JSON for snapshots and tooling.
to_mermaid — a Mermaid diagram for docs and UIs.
blueprint_to_json / blueprint_to_mermaid — straight from a blueprint.

This makes a graph — whether hand-written or agent-authored — inspectable before it ever runs.

The harness boundary

flowchart LR
    GraphNode[Graph Node] --> Harness[AgentHarness]
    Harness --> Model[ChatModel]
    Harness --> Tool[ToolRegistry]
    Harness --> Events[EventSink]

A node may call an AgentHarness, but the graph does not know or care whether the harness uses OpenAI, Anthropic, Ollama, a mock model, a tool loop, or a sub-agent. Nondeterministic work stays inside nodes; routing, durability, and topology stay explicit and inspectable.

When to use a graph

Reach for the graph runtime when a workflow needs explicit state transitions, branch routing, guarded loops, checkpointing/resume, inspectable topology, deterministic tests around model calls, sub-agent orchestration, or human review points. For a single model call, use the Harness directly. Both .rag and .ragsh (see Recursion and RLM) lower into exactly these graph types — a model can build the same workflow you can.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Graph Runtime

Graph Runtime

Mental model

Building a graph: `GraphBuilder<State, Update>`

Nodes and `NodeContext`

Node results: updates, commands, interrupts

Routing precedence

Typed state, reducers, and channels

The superstep executor (`graph::compiled`)

Parallel fan-out

Run status snapshots (`graph::status`)

Checkpoints, durability, and time travel (`graph::checkpoint`)

Interrupts and resume

Subgraphs: graph-level recursion (`graph::subgraph`)

Streaming and events (`graph::stream`)

Topology export and visualization (`graph::export`)

The harness boundary

When to use a graph

See also

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

TinyAgents

Clone this wiki locally

Uh oh!

Graph Runtime

Graph Runtime

Mental model

Building a graph: GraphBuilder<State, Update>

Nodes and NodeContext

Node results: updates, commands, interrupts

Routing precedence

Typed state, reducers, and channels

The superstep executor (graph::compiled)

Parallel fan-out

Run status snapshots (graph::status)

Checkpoints, durability, and time travel (graph::checkpoint)

Interrupts and resume

Subgraphs: graph-level recursion (graph::subgraph)

Streaming and events (graph::stream)

Topology export and visualization (graph::export)

The harness boundary

When to use a graph

See also

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

TinyAgents

Clone this wiki locally

Building a graph: `GraphBuilder<State, Update>`

Nodes and `NodeContext`

The superstep executor (`graph::compiled`)

Run status snapshots (`graph::status`)

Checkpoints, durability, and time travel (`graph::checkpoint`)

Subgraphs: graph-level recursion (`graph::subgraph`)

Streaming and events (`graph::stream`)

Topology export and visualization (`graph::export`)