Kadyapam edited this page Jun 11, 2026 · 16 revisions

noetl-tools

Shared tool registry for the NoETL Rust runtime. Hosts the concrete Tool implementations the CLI and worker dispatch through, plus the supporting auth resolver, template engine, and result envelope types.

What this crate is

The execution layer of the Rust runtime split:

  • noetl-tools (this crate) — concrete tool implementations. Each tool ships a ToolConfig schema, an async execute() method, and a ToolResult envelope.
  • noetl-executor — shared utilities + types the CLI and worker call into (template rendering, condition evaluation, capability validation, event envelope, tool dispatch bridge).
  • noetl-worker — NATS pull consumer.
  • noetl (cli) — local-mode runner + workspace member for noetl-executor.

The CLI and worker both depend on noetl-tools directly (alongside noetl-executor). The dispatch flow:

noetl CLI's tree walker      worker's NATS pull loop
        │                            │
        └─── noetl-executor ─────────┘
                  │
                  └─── noetl-tools::ToolRegistry
                              │
                              ├── RhaiTool         ┐
                              ├── ShellTool        │
                              ├── HttpTool         │
                              ├── DuckdbTool       ├── concrete tool kinds
                              ├── PostgresTool     │
                              ├── SnowflakeTool    │
                              ├── TransferTool     │
                              ├── PythonTool       │
                              └── ScriptTool       ┘
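
The dispatch flow above can be sketched with a small registry keyed by tool kind. This is a minimal illustration, not the crate's actual API: the trait here is synchronous for brevity (the real execute() is async), the config is a flat string map rather than ToolConfig, and the kind() method, EchoTool, and the dispatch() signature are all hypothetical.

```rust
use std::collections::HashMap;

/// Outcome status, using the Success / Error / Timeout variants named on this page.
#[derive(Debug, Clone, PartialEq)]
pub enum ToolStatus {
    Success,
    Error,
    Timeout,
}

/// Hypothetical, simplified result envelope; the real ToolResult carries more.
#[derive(Debug, Clone, PartialEq)]
pub struct ToolResult {
    pub status: ToolStatus,
    pub data: String,
}

/// Hypothetical synchronous stand-in for the crate's async Tool trait.
pub trait Tool {
    /// Kind string used as the registry key (assumed for this sketch).
    fn kind(&self) -> &'static str;
    fn execute(&self, config: &HashMap<String, String>) -> ToolResult;
}

/// Toy tool that returns its `message` config value unchanged.
pub struct EchoTool;

impl Tool for EchoTool {
    fn kind(&self) -> &'static str {
        "echo"
    }

    fn execute(&self, config: &HashMap<String, String>) -> ToolResult {
        ToolResult {
            status: ToolStatus::Success,
            data: config.get("message").cloned().unwrap_or_default(),
        }
    }
}

/// Registry that both callers (CLI tree walker, worker loop) would share.
#[derive(Default)]
pub struct ToolRegistry {
    tools: HashMap<&'static str, Box<dyn Tool>>,
}

impl ToolRegistry {
    pub fn register(&mut self, tool: Box<dyn Tool>) {
        self.tools.insert(tool.kind(), tool);
    }

    /// Look the tool up by kind; an unknown kind becomes an Error envelope.
    pub fn dispatch(&self, kind: &str, config: &HashMap<String, String>) -> ToolResult {
        match self.tools.get(kind) {
            Some(tool) => tool.execute(config),
            None => ToolResult {
                status: ToolStatus::Error,
                data: format!("unknown tool kind: {}", kind),
            },
        }
    }
}
```

Registering `Box::new(EchoTool)` and calling `dispatch("echo", &config)` returns a Success envelope; any unregistered kind returns an Error envelope instead of panicking.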

Tool kinds

| Tool | Source | Purpose |
| --- | --- | --- |
| RhaiTool | tools::rhai | Embedded Rhai scripting; HTTP helpers, sleep, get_gcp_token, log/print/parse_json. |
| ShellTool | tools::shell | Spawn shell commands; capture stdout/stderr/exit_code. |
| HttpTool | tools::http | Direct reqwest calls; JSON / form / binary bodies; GCP ADC bearer auth. |
| DuckdbTool | tools::duckdb | In-process DuckDB query execution; SELECT returns {columns, rows, row_count}; non-SELECT returns {affected_rows}. |
| PostgresTool | tools::postgres | Connection-pooled PG; same envelope shape as DuckDB. |
| SnowflakeTool | tools::snowflake | Snowflake-specific connection params (account / warehouse / database / schema). |
| NatsTool | tools::nats | NATS JetStream / KV / Object Store operations (kv_get, kv_put, kv_delete, kv_keys, kv_purge, object_get, object_put, object_delete, object_list, object_info, js_publish, js_get_msg, js_stream_info). No subscriptions per execution-model.md. |
| McpTool | tools::mcp | MCP (Model Context Protocol) JSON-RPC bridge: tools/call, tools/list, health probe, passthrough for other JSON-RPC methods. Endpoint resolved from config or NOETL_MCP_<SERVER>_ENDPOINT env var. SSE and plain-JSON response bodies both handled. Added in 2.16.0. |
| TransferTool | tools::transfer | Database-to-database transfer (snowflake / postgres / http source → snowflake / postgres / duckdb target). Modes: append / replace / upsert. |
| PythonTool | tools::python | Spawn Python subprocesses (script execution). |
| ScriptTool | tools::script | Generic script-as-string execution. |
| ResultFetchTool | tools::result_fetch | Playbook-step surface for explicit cross-step / cross-node fetches of a noetl://execution/<eid>/result/<step>/<id> URI. Tries Arrow Flight (via noetl-arrow-flight-client) first, then falls back to HTTP GET /api/result/resolve. Output mirrors DuckDB / Postgres: {data: {rows, columns, row_count}}. Added in 2.11.0; see agents/rules/execution-model.md for the policy-vs-infrastructure boundary it sits on. 2.12.0: derive_flight_endpoint emits http:// / https:// instead of the broken grpc://. 2.13.0 (R-2.3 Phase C2.3): adds the bearer_token (keychain alias preferred per execution-model.md; literal fallback for back-compat with the postgres-tool pattern) and tls_ca_path playbook config fields. 2.14.0 (R-2.3 Phase C2.4): adds client_cert_path and client_key_path for mTLS (worker-filesystem paths to the PEM pair the worker presents on the TLS handshake when the server is configured with NOETL_FLIGHT_CLIENT_CA). A partial pair (cert without key, or vice versa) raises FlightFetchError::Transport at build_flight_config time so misconfigured playbooks fail fast. All four auth fields are independently opt-in and are threaded into the Flight client via FlightConfig::new().tls(FlightTlsConfig::new().ca_certificate(...).identity(cert, key)).auth(FlightAuth::bearer(...)). |
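
The ResultFetchTool fail-fast rule for the mTLS pair (both paths or neither) can be sketched as a single match. This is an illustrative reconstruction under stated assumptions, not the crate's code: FetchAuthFields and client_identity are hypothetical names, FlightFetchError is reduced to its Transport variant with a String payload, and the error messages are invented.

```rust
/// Hypothetical stand-in for the crate's FlightFetchError; only the
/// Transport variant named on this page is sketched.
#[derive(Debug, PartialEq)]
pub enum FlightFetchError {
    Transport(String),
}

/// The four independently opt-in auth fields named on this page,
/// modeled as optional strings (the struct itself is hypothetical).
#[derive(Debug, Default, Clone)]
pub struct FetchAuthFields {
    pub bearer_token: Option<String>,
    pub tls_ca_path: Option<String>,
    pub client_cert_path: Option<String>,
    pub client_key_path: Option<String>,
}

/// Returns the (cert, key) pair when both are set, None when neither is,
/// and a Transport error on a partial pair so the playbook fails fast.
pub fn client_identity(
    fields: &FetchAuthFields,
) -> Result<Option<(&str, &str)>, FlightFetchError> {
    match (
        fields.client_cert_path.as_deref(),
        fields.client_key_path.as_deref(),
    ) {
        (Some(cert), Some(key)) => Ok(Some((cert, key))),
        (None, None) => Ok(None),
        (Some(_), None) => Err(FlightFetchError::Transport(
            "client_cert_path is set but client_key_path is missing".to_string(),
        )),
        (None, Some(_)) => Err(FlightFetchError::Transport(
            "client_key_path is set but client_cert_path is missing".to_string(),
        )),
    }
}
```

Validating the pair before any connection is attempted keeps the failure at configuration time, which matches the stated intent that misconfigured playbooks fail fast.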

Supporting modules

| Module | Purpose |
| --- | --- |
| registry | ToolRegistry, Tool trait, ToolConfig, AuthConfig + AuthType enum. |
| context | ExecutionContext: per-execution variable + secret + worker-id state. execution_id: i64 mirrors Python noetl.event.execution_id. |
| auth | AuthResolver + concrete providers. GcpAuth uses the gcp_auth crate; supports workload identity on GKE + gcloud fallback on dev hosts. |
| template | TemplateEngine: {{ workload.x }} substitution; render_value(json, ctx) for nested templates. 2.12.0: the StepResultProxy minijinja Object impl aliases .result back to the step dict (matches Python noetl/core/dsl/render.py's StepResultProxy fall-through), so {{ producer.result.reference.ref }} resolves to producer.reference.ref when there's no real result key. This gives cross-runtime parity: playbooks written for the Python worker work unchanged on the Rust worker. Literal result keys still take precedence (Python convention). |
| result | ToolResult + ToolStatus (Success / Error / Timeout). |
| error | ToolError enum covering all tool-side failures. |
| arrow_codec | Apache Arrow encode/decode (R-2 data plane). As of 2.10.0 also exposes try_encode_tabular_json(value) -> Option<TabularEncoding> for the worker's R-2.2 shm-cache fast path: detects DuckDB / Postgres / Snowflake {columns, rows} JSON (top-level or nested under data), infers per-column Arrow types (Int64 / Float64 / Boolean / Utf8 with mixed-type fallback to Utf8), and encodes as a Feather V2 / IPC stream tagged with ARROW_STREAM_MEDIA_TYPE = "application/vnd.apache.arrow.stream" (mirrors Python's noetl.core.storage.arrow_ipc.ARROW_STREAM_MEDIA_TYPE). |
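
The per-column type inference described for arrow_codec can be sketched without any Arrow dependency. This is a rough illustration only: Cell stands in for a JSON scalar, ArrowType for the four Arrow types listed, and both function names are hypothetical. Two behaviours are assumptions of this sketch rather than documented facts: nulls are ignored when inferring, and an Int64 / Float64 mix is treated like any other mix (fallback to Utf8) rather than widened.

```rust
/// Hypothetical stand-in for a JSON scalar inside a `rows` entry.
#[derive(Debug, Clone, PartialEq)]
pub enum Cell {
    Null,
    Bool(bool),
    Int(i64),
    Float(f64),
    Text(String),
}

/// The four Arrow column types named on this page.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArrowType {
    Int64,
    Float64,
    Boolean,
    Utf8,
}

/// Type of a single cell; nulls carry no type information (assumption).
fn cell_type(cell: &Cell) -> Option<ArrowType> {
    match cell {
        Cell::Null => None,
        Cell::Bool(_) => Some(ArrowType::Boolean),
        Cell::Int(_) => Some(ArrowType::Int64),
        Cell::Float(_) => Some(ArrowType::Float64),
        Cell::Text(_) => Some(ArrowType::Utf8),
    }
}

/// A column keeps its type only if every non-null cell agrees;
/// any disagreement (or an all-null column) falls back to Utf8.
pub fn infer_column_type(cells: &[Cell]) -> ArrowType {
    let mut inferred: Option<ArrowType> = None;
    for cell in cells {
        let current = match cell_type(cell) {
            Some(t) => t,
            None => continue,
        };
        match inferred {
            None => inferred = Some(current),
            Some(prev) if prev == current => {}
            Some(_) => return ArrowType::Utf8,
        }
    }
    inferred.unwrap_or(ArrowType::Utf8)
}

/// Infer one type per column of a {columns, rows} payload.
/// Rows shorter than the column list are padded with nulls.
pub fn infer_schema(columns: &[&str], rows: &[Vec<Cell>]) -> Vec<(String, ArrowType)> {
    columns
        .iter()
        .enumerate()
        .map(|(i, name)| {
            let cells: Vec<Cell> = rows
                .iter()
                .map(|row| row.get(i).cloned().unwrap_or(Cell::Null))
                .collect();
            ((*name).to_string(), infer_column_type(&cells))
        })
        .collect()
}
```

Falling back to Utf8 on any disagreement keeps the encoder total: every tabular payload gets some schema, at the cost of losing numeric typing for inconsistent columns.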

Multi-tool steps (task_sequence)

A playbook step whose tool: value is a list of named sub-tools is dispatched through tools::task_sequence::TaskSequenceTool, which runs the sub-tools in order through a fresh registry and aggregates their results as a label-keyed map ({label1: <data1>, label2: <data2>, ...}).

Context sharing across sub-tools, in order of how it accumulates:

  • set: on a sub-tool publishes rendered values into the running context for later sub-tools (forward-only). The just-executed tool's own result data is available to its set: expressions as {{ output.<field> }}.
  • Sibling references — a later sub-tool reads an earlier sibling's output directly by label: {{ <label>.<field> }}. Each sub-tool's result is injected into the running context under its label as soon as it completes, so subsequent sub-tools (and a later python sub-tool's stdin variables) can reference it without any set: plumbing. The result is exposed under both {{ label.field }} and {{ label.data.field }} — a synthetic .data self-reference is added when the result object lacks its own data key, matching the single-tool build_context shape. Added in 3.1.1 (noetl/ai-meta#87): before this, a sibling reference rendered empty, which was masked in quoted positions ('{{ ... }}' → valid '') but produced a syntax error at or near "," in unquoted numeric SQL positions.
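
The sibling-reference behaviour above can be sketched as two small functions: one that injects a finished sub-tool's result under its label, and one that resolves a dotted path the way a template reference would. This is a simplified model, not the crate's implementation: Value, inject_result, and lookup are hypothetical, and the synthetic data entry is modeled as a one-level copy of the result rather than a true self-reference.

```rust
use std::collections::BTreeMap;

/// Hypothetical minimal value type standing in for JSON.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Str(String),
    Num(i64),
    Map(BTreeMap<String, Value>),
}

/// Inject a completed sub-tool's result into the running context under
/// its label. When the result is a map without its own `data` key, add a
/// synthetic `data` entry pointing back at the same fields, so both
/// `label.field` and `label.data.field` resolve.
pub fn inject_result(ctx: &mut BTreeMap<String, Value>, label: &str, result: Value) {
    let exposed = match result {
        Value::Map(mut fields) => {
            if !fields.contains_key("data") {
                let alias = Value::Map(fields.clone());
                fields.insert("data".to_string(), alias);
            }
            Value::Map(fields)
        }
        other => other,
    };
    ctx.insert(label.to_string(), exposed);
}

/// Resolve a dotted path such as "fetch.data.count" against the context.
/// Returns None for an unknown label or field, or when the path tries to
/// descend into a non-map value.
pub fn lookup<'a>(ctx: &'a BTreeMap<String, Value>, path: &str) -> Option<&'a Value> {
    let mut parts = path.split('.');
    let mut current = ctx.get(parts.next()?)?;
    for part in parts {
        current = match current {
            Value::Map(fields) => fields.get(part)?,
            _ => return None,
        };
    }
    Some(current)
}
```

After `inject_result(&mut ctx, "fetch", ...)` with a result containing `count`, both `lookup(&ctx, "fetch.count")` and `lookup(&ctx, "fetch.data.count")` return the same value, while a label that has not run yet resolves to None.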

The pipeline short-circuits on the first sub-tool failure (or a spec.policy.rules do: fail), returning the partial labeled_results with the failed task index.
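
The ordered run, label-keyed aggregation, and short-circuit can be sketched together. This is an illustrative model under simplifying assumptions: results are plain strings, each sub-tool is a function pointer that receives the results gathered so far, and SubTool, SequenceFailure, run_sequence, and the two demo sub-tools are hypothetical names. Policy rules (do: fail) are not modeled.

```rust
use std::collections::BTreeMap;

/// Hypothetical sub-tool: a label plus a function that sees earlier results.
pub struct SubTool {
    pub label: &'static str,
    pub run: fn(&BTreeMap<String, String>) -> Result<String, String>,
}

/// What a failed sequence reports: which task failed, why, and the
/// partial label-keyed results collected before the failure.
#[derive(Debug, PartialEq)]
pub struct SequenceFailure {
    pub failed_index: usize,
    pub error: String,
    pub labeled_results: BTreeMap<String, String>,
}

/// Run sub-tools in order, keying each result by its label.
/// Stops at the first failure and returns the partial results.
pub fn run_sequence(tasks: &[SubTool]) -> Result<BTreeMap<String, String>, SequenceFailure> {
    let mut labeled_results = BTreeMap::new();
    for (index, task) in tasks.iter().enumerate() {
        match (task.run)(&labeled_results) {
            Ok(data) => {
                labeled_results.insert(task.label.to_string(), data);
            }
            Err(error) => {
                return Err(SequenceFailure {
                    failed_index: index,
                    error,
                    labeled_results,
                });
            }
        }
    }
    Ok(labeled_results)
}

/// Demo sub-tool (hypothetical): always succeeds.
pub fn fetch_rows(_ctx: &BTreeMap<String, String>) -> Result<String, String> {
    Ok("3 rows".to_string())
}

/// Demo sub-tool (hypothetical): always fails.
pub fn broken_step(_ctx: &BTreeMap<String, String>) -> Result<String, String> {
    Err("connection refused".to_string())
}
```

With tasks labeled fetch, load, and report, where load uses `broken_step`, `run_sequence` returns a failure with `failed_index` 1 and a partial map containing only the fetch entry; the report task never runs.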

Pages

  • Tool kinds overview — per-tool config schema + result envelope reference.
  • Consumers — who calls into this crate and how (CLI tree walker via noetl-executor::tools_bridge, worker via noetl-executor::dispatch_via_registry).
  • NatsTool — KV / Object Store / JetStream-publish config reference + worked example.
  • McpTool — MCP JSON-RPC bridge config reference + endpoint resolution + parity table vs the Python tool.

Versioning

Currently published on crates.io as noetl-tools = "3.1.1". Release highlights:

  • 3.1.1 fixes multi-tool sibling references in task_sequence (noetl/ai-meta#87): a later sub-tool can now read an earlier sibling's output via {{ <label>.<field> }}.
  • 3.0.0 was a breaking change to TaskSequenceTool (data-binding only).
  • 2.16.0 added McpTool (R-3 Phase B-2, noetl/ai-meta#39).
  • 2.15.0 added NatsTool (R-3 Phase B-1, noetl/ai-meta#38).
  • 2.11.0 added the ResultFetchTool (R-2.3 playbook surface for cross-step / cross-node fetches).
  • 2.10.0 added arrow_codec::try_encode_tabular_json (R-2.2 tabular shm encoding).

Both noetl-worker and noetl-executor pin ^3. See CHANGELOG.md for the full history.

