executor crate architecture
Since v2.17.1, the CLI is a Cargo workspace. The root crate
(noetl, producing the noetl and ntl binaries) is unchanged;
a new workspace-member crate noetl-executor hosts the shared
execution utilities and types that both this CLI and the
noetl-worker daemon (R-1.3) depend on.
This page documents the executor crate's module layout, what
moved out of playbook_runner.rs and why, and the deliberate
architectural decision that the CLI keeps its own control loop
rather than sharing one with the worker.
    repos/cli/
      Cargo.toml                # workspace root + noetl binary
      src/
        main.rs
        playbook_runner.rs      # CLI tree walker (control loop stays here)
        ...
      executor/                 # workspace member: noetl-executor
        Cargo.toml
        src/
          lib.rs
          playbook.rs           # YAML playbook types
          template.rs           # render_template + Rhai/JSON helpers
          condition.rs          # evaluate_condition + evaluate_rhai_condition
          capabilities.rs       # validate_capabilities + ValidationReport
          runtime.rs            # ExecutionContext + CredentialResolver trait
          events.rs             # EventSink trait + ExecutorEvent + EventEmitter
          tools_bridge.rs       # noetl-tools registry bridge (scaffold)
          worker/
            mod.rs
            source.rs           # Command + CommandSource (worker-only)
| Module | Purpose | Used by |
|---|---|---|
| `playbook` | Pydantic-like YAML types: `Playbook`, `Step`, `Tool`, `NextFormat`, `RuntimeCapabilities`, etc. All field accessors `pub`. | CLI + worker |
| `template` | `render_template`, `render_template_with_result`, `get_json_path`, `json_to_rhai`, `rhai_to_json_string`. Takes `&HashMap<String, String>` views of the per-execution variables + step results so each binary owns its own context shape. | CLI + worker |
| `condition` | `evaluate_condition` (simple `{{ a == b }}` / 'in' / truthy) and `evaluate_rhai_condition` (full Rhai expression eval). Same context-view contract as `template`. | CLI + worker |
| `capabilities` | `validate_capabilities` + `ValidationReport` / `ValidationError`. Pure function: returns the report rather than `bail!`ing so the CLI can format errors with `playbook_path` and the worker with `execution_id`. | CLI + worker |
| `runtime` | `ExecutionContext` (executor-side variant with async `step_results` + `Arc<dyn CredentialResolver>`); `CredentialResolver` trait. | worker (CLI uses its own `ExecutionContext` in `playbook_runner.rs`) |
| `events` | `ExecutorEvent` (mirrors the Python `noetl.runtime.events.report_event` envelope), `EventSink` trait, `NoopSink`, `EventEmitter`. | CLI + worker |
| `tools_bridge` | Scaffold for replacing the CLI's inline tool implementations with calls into the `noetl-tools` registry. Filled in incrementally per Strategy B (one tool kind per sub-PR). | CLI (worker already uses `noetl-tools` directly) |
| `worker::source` | `Command` envelope + `CommandSource` trait. Worker-only: the CLI's tree walker doesn't consume this. | worker (R-1.3) |
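The "return the report rather than `bail!`" decision for `capabilities` can be illustrated with a minimal, std-only sketch. The struct fields, the `&[&str]` / `HashSet` signature, and the `format_for_cli` / `format_for_worker` helpers are assumptions made for illustration; only the names `validate_capabilities`, `ValidationReport`, `ValidationError`, `playbook_path`, and `execution_id` come from this page, and the real crate's shapes may differ.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the crate's `ValidationError`.
#[derive(Debug, Clone, PartialEq)]
pub struct ValidationError {
    pub capability: String,
    pub message: String,
}

/// Hypothetical stand-in for `ValidationReport`: data only, no formatting.
#[derive(Debug, Default, PartialEq)]
pub struct ValidationReport {
    pub errors: Vec<ValidationError>,
}

impl ValidationReport {
    pub fn is_ok(&self) -> bool {
        self.errors.is_empty()
    }
}

/// Pure function: no I/O and no early exit. It only reports what is missing.
/// (Assumed signature; the real function takes the crate's own types.)
pub fn validate_capabilities(required: &[&str], available: &HashSet<&str>) -> ValidationReport {
    let errors = required
        .iter()
        .filter(|cap| !available.contains(**cap))
        .map(|cap| ValidationError {
            capability: cap.to_string(),
            message: format!("capability `{cap}` is not available"),
        })
        .collect();
    ValidationReport { errors }
}

/// CLI-side formatting (hypothetical helper): errors are keyed by playbook path.
pub fn format_for_cli(playbook_path: &str, report: &ValidationReport) -> Vec<String> {
    report
        .errors
        .iter()
        .map(|e| format!("{playbook_path}: {}", e.message))
        .collect()
}

/// Worker-side formatting (hypothetical helper): errors are keyed by execution id.
pub fn format_for_worker(execution_id: u64, report: &ValidationReport) -> Vec<String> {
    report
        .errors
        .iter()
        .map(|e| format!("execution {execution_id}: {}", e.message))
        .collect()
}
```

Because the validator never formats or aborts, each binary decides how a failure is surfaced, which is the property the table row describes.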
| What | Where it lives | Why |
|---|---|---|
| The CLI's recursive tree walker (`run`, `execute_step`, `execute_next_steps`, `execute_router_arcs`) | `repos/cli/src/playbook_runner.rs` | Natural fit for local YAML execution; flattening into a pull-model iterator would lose local-debug clarity |
| The CLI's inline tool implementations (`execute_tool`, `execute_shell_command`, `execute_http_request`, `execute_duckdb_query`, etc.) | `repos/cli/src/playbook_runner.rs` (today) | Migrated incrementally to `noetl-tools` registry via `tools_bridge` per Strategy B; ~870 LoC of inline tool dispatch |
| The worker's NATS pull loop | `repos/worker/src/` | Different shape than tree walker; tied to NATS durable-consumer semantics |
| `RunOutcome` (the CLI's JSON output envelope) | `repos/cli/src/playbook_runner.rs` | Not a YAML input type; worker has a different output envelope (event-log writes) |
Mid-implementation discovery, documented in § H.10 of the global hybrid cloud blueprint:
- The CLI is a recursive tree walker. It loads the YAML, walks the workflow, evaluates `next` arcs / `case` conditions / `then` blocks in place, dispatches each step inline. Control flow is the call stack.
- The worker is a pull-model consumer. It subscribes to a NATS durable consumer, pulls one command at a time, executes it, emits events, repeats. No tree. No recursion.
- The original "unified `CommandSource` trait; both binaries plug in their own impl" was the wrong abstraction for the CLI. Flattening the tree walker into a pull-model iterator loses local-debug clarity, complicates `case`/`then` state management, and breaks integration tests written against the tree shape.
noetl-executor was re-scoped from a control-loop crate to a utilities-and-types crate. The CLI's tree walker stays. The worker's pull loop stays. Both call into the executor for the same template rendering, condition evaluation, credential resolution, capability validation, and event shape.
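The two control-loop shapes, and the way both can call the same shared utility, can be sketched with std-only Rust. Everything here is a simplified assumption: the `Step` struct, the synchronous `CommandSource::pull` signature, the in-memory `QueueSource` standing in for a NATS durable consumer, and the toy `{{ name }}` substitution are hypothetical and are not the crate's actual API.

```rust
use std::collections::{HashMap, VecDeque};

/// Shared utility (the executor crate's role): renders `{{ name }}`
/// placeholders from a borrowed view of the variables. A toy stand-in.
pub fn render_template(template: &str, vars: &HashMap<String, String>) -> String {
    let mut out = template.to_string();
    for (name, value) in vars {
        out = out.replace(&format!("{{{{ {name} }}}}"), value);
    }
    out
}

/// Hypothetical step shape: a message to render plus child steps.
pub struct Step {
    pub message: String,
    pub next: Vec<Step>,
}

/// CLI shape: recursion over the tree; the call stack is the control flow.
pub fn walk(step: &Step, vars: &HashMap<String, String>, log: &mut Vec<String>) {
    log.push(render_template(&step.message, vars));
    for child in &step.next {
        walk(child, vars, log);
    }
}

/// Worker shape: commands arrive one at a time from a source; no tree.
/// (Assumed synchronous signature, for illustration only.)
pub trait CommandSource {
    fn pull(&mut self) -> Option<String>;
}

/// In-memory stand-in for a durable consumer.
pub struct QueueSource(pub VecDeque<String>);

impl CommandSource for QueueSource {
    fn pull(&mut self) -> Option<String> {
        self.0.pop_front()
    }
}

/// Pull one command, execute it, repeat until the source is drained.
pub fn pull_loop(
    source: &mut dyn CommandSource,
    vars: &HashMap<String, String>,
    log: &mut Vec<String>,
) {
    while let Some(command) = source.pull() {
        log.push(render_template(&command, vars));
    }
}
```

In this sketch the only code the two loops share is `render_template`, which mirrors the utilities-and-types scope described above: the loops differ, the helpers do not.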
| Topic | Where |
|---|---|
| The migration roadmap | Appendix H of the global hybrid cloud blueprint |
| The architectural finding | § H.10 of the same doc |
| The Polars-pattern endpoint (`pip install noetl` ships Rust runtime + Python wrapper) | § H.9 |
| Apache Arrow data plane | § H.4 + § H.11 (local-mode Feather buffer) |
| Tracking issues | noetl/cli#19 (this CLI's R-1.1 sub-issue) · noetl/ai-meta#30 (umbrella) |
| Sub-PR | Scope | PR |
|---|---|---|
| R-1.1 PR-1 | Crate skeleton (lib.rs, runtime, events, source/dispatch placeholders) | #20 |
| R-1.1 PR-2a | YAML types → `noetl-executor::playbook` | #21 |
| R-1.1 PR-2b | Utilities (template + condition + capabilities); § H.10 restructure (placeholder `LocalPlaybookSource` removed; `CommandSource` → `worker::source`) | #22 |
| R-1.1 PR-2c-1 | `noetl-tools = "2.8.7"` dep + `tools_bridge` scaffold | #23 |
| R-1.1 PR-2c-2 | `tools_bridge` adapters: `BridgeContext`, `to_tools_context`, `to_tools_config` (all 8 `Tool` variants), `from_tools_result`; `dispatch_via_registry` becomes async with per-tool-kind match arms | #24 |
| R-1.1 PR-2c-3 | `Tool::Rhai` arm wired through `noetl-tools::RhaiTool`; new `to_tools_context_for_rhai` helper groups CLI flat variables into nested Maps for Rhai field access; CLI's inline `execute_rhai_script` + `rhai_http_request` deleted (~220 LoC) | #25 |
| R-1.1 PR-2c-4 | `Tool::Shell` arm wired through `noetl-tools::ShellTool`; per-command dispatch loop preserves CLI's "fresh bash invocation per command" semantics; new `shell_command_config(&str)` helper; CLI's inline `execute_shell_command` deleted (~79 LoC) | #26 |
| R-1.1 PR-2c-5 | `Tool::Http` arm wired through `noetl-tools::HttpTool`; new `resolve_auth_to_bearer` helper (CLI `AuthConfig` → Bearer token via `noetl-tools::auth::GcpAuth`); new `http_tool_config` helper that injects the Bearer header into request headers; new `reshape_http_result` helper that maps noetl-tools' `{status_code, headers, body}` envelope back to the CLI's pre-existing `{status, body}` shape so playbook steps keep branching on `<step>.body.status`. `Tool::Auth` arm also routes through `resolve_auth_to_bearer` so both paths share the GCP ADC code path. CLI's inline `execute_http_request` + `get_auth_token` deleted (~148 LoC) | #27 |
| R-1.1 PR-2c-6 | `Tool::DuckDb` arm wired through `noetl-tools::DuckdbTool`; new `duckdb_tool_config` helper (translates CLI's `Vec<String>` params to noetl-tools' `Vec<serde_json::Value>` and maps `db` → `db_path`); new `reshape_duckdb_result` helper unwraps noetl-tools' `{columns, rows, row_count}` envelope back to the CLI's pre-existing rows-array shape (and `{affected_rows}` back to `{"status": "ok"}` for non-SELECT). Empty / whitespace-only query short-circuits to an empty outcome, mirroring the CLI's existing guard. Path resolution + `mkdir -p` stay at the CLI call site since the bridge has no knowledge of the playbook directory. CLI's inline `execute_duckdb_query` deleted (~55 LoC). Feature gain: params that were silently ignored (`_params: &[String]`) are now bound at `?` placeholders | #28 |
| R-1.1 PR-2c-7 | Codifies the § H.10 finding for `Tool::Playbook`: sub-playbook execution stays in the CLI's tree walker (`PlaybookRunner::new(path).run()` is the recursion case; the bridge cannot replace it without re-opening § H.10). Replaces the silent `BridgeOutcome::empty()` stub with `anyhow::bail!` naming § H.10 so accidental misuse is loud rather than silent. Variable preparation (parent merge + DSL v2 `input:` / DSL v1 `args:` rendering + `workload.` prefix) DID move into a new `noetl_executor::tools_bridge::prepare_sub_playbook_vars` helper: pure, reusable, testable. CLI's call site uses the helper but keeps its `PlaybookRunner` recursion inline. No semantic divergences (no execution path changed) | #29 |
| R-1.1 PR-2c-8 | Final substantive sub-PR. Codifies that `Tool::Auth` and `Tool::Sink` stay inline by design but extracts the shareable pure logic into bridge helpers: `auth_context_updates(provider, token, project)` (replaces inline `set_variable` calls; preserves CLI's pre-PR-2c-8 ordering), `format_sink_payload(format, raw)` (json passthrough / yaml dump / csv conversion), and `json_to_csv(json_str)` (lifted verbatim from CLI). Dispatch arms for both kinds bail loudly with helper-pointing messages. CLI's inline `json_to_csv` deleted (~42 LoC). Removes the `dispatch_via_registry_returns_empty_for_unwired_kind` test: every `Tool` variant now either dispatches through the registry, bails with a § H.10 finding, or bails as unsupported. GCS → `object_store` migration tracked as a separate follow-up (R-2.x scope, not R-1.x). No semantic divergences | #30 |
After PR-2c-8, the remaining R-1.1 work is PR-2d: integration-test pass + final docs + closing noetl/cli#19.
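As an illustration of the PR-2c-3 idea of grouping the CLI's flat variables into nested maps for field access, here is a minimal std-only sketch. It assumes flat keys use a dotted `prefix.field` form (such as the `workload.` prefix mentioned for PR-2c-7); the `GroupedVars` type and `group_flat_vars` function are hypothetical names, and the real `to_tools_context_for_rhai` helper builds Rhai values rather than plain maps.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical result shape: un-prefixed variables stay scalar,
/// dotted variables are grouped under their first segment.
#[derive(Debug, Default, PartialEq)]
pub struct GroupedVars {
    pub scalars: BTreeMap<String, String>,
    pub groups: BTreeMap<String, BTreeMap<String, String>>,
}

/// Groups `prefix.field` keys into nested maps so a script could read
/// `prefix.field` as a field access instead of a flat lookup.
/// Assumption: only the first `.` separates group from field.
pub fn group_flat_vars(flat: &HashMap<String, String>) -> GroupedVars {
    let mut grouped = GroupedVars::default();
    for (key, value) in flat {
        match key.split_once('.') {
            Some((prefix, field)) => {
                grouped
                    .groups
                    .entry(prefix.to_string())
                    .or_default()
                    .insert(field.to_string(), value.clone());
            }
            None => {
                grouped.scalars.insert(key.clone(), value.clone());
            }
        }
    }
    grouped
}
```

Keeping the grouping as a pure function over a borrowed map matches the context-view contract used by the other executor utilities, so it can be unit-tested without a Rhai engine.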
Each tool replacement that crosses a behaviour line documents the deltas in the PR body and here.
| PR | Surface | CLI behaviour (pre-replacement) | `noetl-tools` behaviour | User-visible impact |
|---|---|---|---|---|
| #25 | rhai `timestamp()` | `chrono::Local::now().format("%H:%M:%S")` → `"14:23:45"` | `chrono::Utc::now().timestamp().to_string()` → `"1716847425"` | Scripts that displayed `timestamp()` for human reading need updating |
| #25 | rhai HTTP helpers (`http_get`, `http_post`, `http_delete`, `*_auth`) | `curl` subprocess | Direct `reqwest` calls | Same surface; different error shape on network failures |
| #25 | rhai `get_gcp_token` | `gcloud auth print-access-token` shellout | `gcp_auth` crate (workload-identity aware) | Better on GKE pods; equivalent on hosts with `gcloud` configured |
| #26 | shell stdout streaming | Line-by-line to terminal as command runs | Collected; returned at end | Breaks real-time output UX for long-running shell steps: users see nothing until command completes |
| #26 | shell result envelope | Captured stdout string | `data: {exit_code, stdout, stderr}` JSON | Transparent: bridge unwraps `data["stdout"]` and trims trailing `\n` |
| #27 | http transport | `curl` subprocess | `reqwest` direct | Same envelope on success; different error shape on transport failure (anyhow message vs curl exit code) |
| #27 | http auth (GCP ADC) | `gcloud auth print-access-token` shellout | `gcp_auth` crate | Better on GKE pods (workload-identity aware); equivalent on hosts with `gcloud` configured |
| #27 | http JSON body | Sent verbatim via `curl -d`; caller had to set `Content-Type` | Auto-detected as JSON; `reqwest` sets `Content-Type` | Most callers were already sending JSON with `Content-Type: application/json`; transparent |
| #27 | http result envelope | `{"status": int, "body": <json-or-string>}` | `data: {"status_code": int, "headers": {...}, "body": <json>}` | Transparent: bridge's `reshape_http_result` maps back to the CLI's `{status, body}` shape so 4xx/5xx come back inside the envelope (not as anyhow errors) |
| #27 | `Tool::Auth` GCP token | `gcloud auth print-access-token` shellout (separate code path from rhai `get_gcp_token`) | `gcp_auth` crate via shared `resolve_auth_to_bearer` helper | Both `Tool::Http` and `Tool::Auth` now share the GCP ADC code path, which eliminates the divergence between the two |
| #28 | duckdb SELECT result envelope | JSON array of row objects (pretty-printed) | `data: {"columns": [...], "rows": [...], "row_count": N}` | Transparent: bridge's `reshape_duckdb_result` maps back to the CLI's rows-array shape |
| #28 | duckdb non-SELECT result envelope | Literal string `{"status": "ok"}` | `data: {"affected_rows": N}` | Transparent: bridge maps back; `affected_rows` dropped (CLI never exposed it) |
| #28 | duckdb params binding | `_params: &[String]` accepted but silently ignored | Bound as JSON values at `?` placeholders | Feature gain: playbooks that intended their `params:` field would now see them applied (no breakage for playbooks with stale params + no `?` placeholders since DuckDB ignores extra params) |
| #28 | duckdb path resolution + mkdir | `resolve_duckdb_path` + `fs::create_dir_all(parent)` in `execute_duckdb_query` | Open as-given, no mkdir | CLI keeps owning these at the call site before dispatch; bridge has no knowledge of the playbook directory |
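The "transparent" rows above all follow one pattern: the bridge reshapes the `noetl-tools` envelope back into the shape the CLI already exposed. A minimal std-only sketch of that pattern for the shell and http envelopes follows. The struct types are hypothetical models (the real envelopes are JSON values), `unwrap_shell_stdout` and `is_failure` are illustrative names, and stripping every trailing newline is an assumption, since this page only says the bridge "trims trailing `\n`".

```rust
use std::collections::HashMap;

/// Hypothetical model of the noetl-tools shell envelope.
pub struct ShellData {
    pub exit_code: i32,
    pub stdout: String,
    pub stderr: String,
}

/// CLI shape: just the captured stdout, without trailing newlines.
/// Assumption: all trailing `\n` characters are stripped.
pub fn unwrap_shell_stdout(data: &ShellData) -> &str {
    data.stdout.trim_end_matches('\n')
}

/// Hypothetical model of the noetl-tools http envelope.
pub struct ToolsHttpResult {
    pub status_code: u16,
    pub headers: HashMap<String, String>,
    pub body: String,
}

/// The CLI's pre-existing `{status, body}` shape.
#[derive(Debug, PartialEq)]
pub struct CliHttpResult {
    pub status: u16,
    pub body: String,
}

/// Maps the richer envelope back to the CLI shape; headers are dropped.
/// A 4xx/5xx status stays data inside the envelope rather than an `Err`.
pub fn reshape_http_result(result: ToolsHttpResult) -> CliHttpResult {
    CliHttpResult {
        status: result.status_code,
        body: result.body,
    }
}

/// Playbook-style branch on the reshaped result (illustrative helper).
pub fn is_failure(result: &CliHttpResult) -> bool {
    result.status >= 400
}
```

Keeping the reshaping in one place means playbooks written against the old envelope keep branching on the same fields while the transport underneath changes.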
- `playbook_runner.rs`: 2,688 → 1,606 lines (-1,082 net across PR-2a + PR-2b + PR-2c-3 + PR-2c-4 + PR-2c-5 + PR-2c-6 + PR-2c-7 + PR-2c-8)
- New code in `noetl-executor`: 7 modules + 1 worker submodule, ~3,400 LoC
- `noetl-executor` unit tests: 0 → 80 across PR-1 + PR-2a + PR-2b + PR-2c-1 + PR-2c-2 + PR-2c-3 + PR-2c-4 + PR-2c-5 + PR-2c-6 + PR-2c-7 + PR-2c-8
- Workspace-wide tests: 162 passing (80 `noetl-executor` + 41 `noetl` + 41 `ntl`)