docs(deployment-spec): NOETL_SIDE_EFFECT_BARRIER env var (#104 Phase E)
16afd6c
docs(deployment-spec): NOETL_RESULT_TIER_DR env var (#104 Phase F)
3d77005
docs(deployment-spec): result-tier flags incl. Phase D mint-authoritative (#104)
33bf7e6
docs(deploy-spec): off-server state builder shadow env vars + metrics (RFC #115 Phase 4)
NOETL_STATE_BUILDER_SHADOW (+ _STREAM/_BATCH/_TIMEOUT_MS/_IDLE_SLEEP_MS) and the
noetl_worker_state_builder_* metrics (wal_events_total WAL-read proof,
event_scans_total no-scan proof, builds_total{outcome}, chain_hops). System pool
only, observation-only, default off; drive cutover staged behind the server's
NOETL_STATE_BUILDER=offserver. Lockstep with noetl/worker#118.
c1af68c
deployment-spec: scrape-path note — kind=VMServiceScrape vs prod=GMP PodMonitoring (the lag gauge is blind on prod without it)
0dcbbd4
deployment-spec: document materializer-lag gauge + materializer counters (CQRS #103 flip guardrail)
0030f30
deployment-spec: require memory-backed /dev/shm for the Arrow IPC cache
The worker SIGBUSes (exit 135) and crash-loops when the Arrow IPC cache
(NOETL_IPC_CACHE_BUDGET_BYTES, default 256 MB) writes past the k8s default
64 MiB /dev/shm tmpfs. Document the required memory-backed /dev/shm
(emptyDir medium: Memory, sizeLimit 320Mi >= budget), the coherent
memory limit (768Mi), and the three values that must move together.
Fixed in noetl/ops#193. Tracks noetl/ai-meta#112.
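
The "three values that must move together" can be sketched as a sizing
check. This is an illustrative sketch only: `check_shm_sizing` and
`SizingError` are hypothetical names, not part of the worker, and the
256 MB default is treated as 256 MiB for the arithmetic. It assumes that a
memory-backed emptyDir counts against the container memory limit.

```rust
/// One mebibyte, used to express the documented sizes.
const MIB: u64 = 1024 * 1024;

/// Hypothetical error type for the sketch; not a worker API.
#[derive(Debug, PartialEq)]
enum SizingError {
    /// /dev/shm is smaller than the IPC cache budget (the SIGBUS case).
    ShmSmallerThanBudget,
    /// The container memory limit cannot hold the tmpfs-backed /dev/shm.
    MemoryLimitSmallerThanShm,
}

/// Checks that budget <= /dev/shm sizeLimit <= container memory limit.
fn check_shm_sizing(
    budget_bytes: u64,
    shm_limit_bytes: u64,
    memory_limit_bytes: u64,
) -> Result<(), SizingError> {
    if shm_limit_bytes < budget_bytes {
        return Err(SizingError::ShmSmallerThanBudget);
    }
    if memory_limit_bytes < shm_limit_bytes {
        return Err(SizingError::MemoryLimitSmallerThanShm);
    }
    Ok(())
}
```

With the documented values (budget 256, sizeLimit 320Mi, limit 768Mi) the
check passes; with the k8s default 64 MiB /dev/shm it fails on the first
condition.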
eb78f3b
docs(deployment-spec): NOETL_EVENT_RESULT_CONTEXT_MAX_BYTES (results-by-reference budget)
Refs noetl/ai-meta#101, noetl/worker#89
a1d3b09
worker-credentials: pre-dispatch terminal-vs-retryable call.error (noetl/ai-meta#78)
987abba
deployment-spec: sealed credential delivery + NOETL_SEALED_CREDENTIALS (Phase 5c)
Documents the worker-side opt-in for sealed credential responses:
NOETL_SEALED_CREDENTIALS env gate, X25519 keypair registered in the
register payload's runtime JSON blob, zeroize on the resolved bytes.
noetl/worker#58 (Secrets Wallet Phase 5c, noetl/ai-meta#61).
05679e3
deployment-spec: worker mTLS client env (Phase 4b)
NOETL_TLS_CLIENT_CERT / NOETL_TLS_CLIENT_KEY (present a client cert) +
NOETL_TLS_CA (trust a private-CA server) in a Transport-security subsection;
rustls-tls backend; https NOETL_SERVER_URL note; wait-for-api init-container
mTLS caveat (Phase 4c). noetl/worker#56 (Secrets Wallet Phase 4b, noetl/ai-meta#61).
0caa246
wiki: deployment-specification page (env-var catalogue + runtime contract)
New top-level page covering the deployment shape for noetl-worker:
runtime contract, NATS layout, KEDA scaling, resources, health
probes, FULL env-var catalogue with the why behind each one,
secrets handling, snowflake node-id derivation, observability,
kind-validation procedure.
Captures the existing env surface (WORKER_ID, WORKER_POOL_NAME,
WORKER_HEARTBEAT_INTERVAL, WORKER_MAX_CONCURRENT,
WORKER_METRICS_BIND, WORKER_NATS_LAG_POLL_INTERVAL,
NOETL_SERVER_URL, NATS_URL/USER/PASSWORD/STREAM/CONSUMER/SUBJECT/
FILTER_SUBJECT, NOETL_SNOWFLAKE_NODE_ID, NOETL_SHARD_ID,
NOETL_NODE_ID, NODE_NAME, NOETL_SNOWFLAKE_EPOCH_MS,
NOETL_KEYCHAIN_ENV_VARS, NOETL_IPC_CACHE_BUDGET_BYTES, HOSTNAME,
RUST_LOG). Several of these had only inline rustdoc coverage
before; this page is the durable single source of truth.
Sidebar gains an "Operations" section linking the new page;
Home references it in the Pages list.
Going forward this page is the single source of truth for env
vars + ports + dependencies; any code change that touches
std::env::var or envy::from_env must update it in the same
change set per the new agents/rules/wiki-maintenance.md Rule 2a
(landing separately on ai-meta).
6fe72b5
wiki: document credential alias resolution in worker-credentials
Adds a "Two ways playbooks reference a credential" section to the
worker-credentials page, covering both the inline `auth:` struct
shape (already supported) and the bare alias string shape that
noetl-worker 5.10+ resolves at dispatch time via the new
`auth_alias` helper (noetl/ai-meta#48).
Documents:
- The 4 supported credential types (postgres, bearer, api_key,
basic) and how each maps onto the tool config.
- The error surface (missing alias → clear name-bearing error,
unsupported type → named error).
- The override semantics (playbook overrides win, unprefixed keys
accepted for back-compat).
Ships in lockstep with noetl/worker PR per Rule 2b.
Refs noetl/ai-meta#48
4832d48
docs(worker): add nats + mcp tool kinds page + update crate dep in Home
Documents the nats (2.15.0) and mcp (2.16.0) tool kinds from the
worker's dispatch perspective: playbook config shape, credential wiring
via NOETL_KEYCHAIN_ENV_VARS, endpoint resolution (mcp), and the
dynamic registry dispatch path.
Cross-linked from Home.md (Pages section) and _Sidebar.md (Architecture
section). Also bumps the noetl-tools version in the dependency graph
diagram from "2.11" to "2.16".
Tracks noetl/ai-meta#40.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
6e8413d
wiki: worker-credentials page (NOETL_KEYCHAIN_ENV_VARS, noetl/worker#35)
378f461
wiki: bump noetl-tools 2.8.7 -> 2.11 + noetl-executor 0.2 -> 0.3 in graph
Reflects the actual deps the worker carries after noetl/worker#32
(picks up the result_fetch tool kind from noetl-tools 2.11).
noetl-executor has been at 0.3.0 since R-1.2 PR-2d-1.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
4e44c4d
docs(adoption): noetl/worker#31 (producer-side credential scrubbing)
Extends the landing-history table with PR #31 (5.6.0): a new
`src/scrub.rs` module runs at the top of `build_call_done_result`
so all three emit paths (inline `result.context`, durable PUT, shm
cache) ride a scrubbed clone. Mirrors Python's
`producer_scrub_payload` shape — sensitive-key matching with case +
separator normalisation, value-pattern matching for Bearer / Basic /
JWT / private-key blocks + vendor prefixes, recursive walk replacing
matches with `[REDACTED]`.
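
A minimal sketch of the recursive walk described above, assuming a small
stand-in `Json` type instead of the worker's real value type. The key list
and value patterns are illustrative placeholders, and every name here
(`Json`, `normalise_key`, `looks_like_secret`, `scrub`) is hypothetical
rather than taken from `src/scrub.rs`.

```rust
use std::collections::BTreeMap;

/// Stand-in JSON value for the sketch (assumption; not the worker's type).
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Num(f64),
    Str(String),
    Arr(Vec<Json>),
    Obj(BTreeMap<String, Json>),
}

const REDACTED: &str = "[REDACTED]";

/// Illustrative sensitive keys, already in normalised form.
const SENSITIVE_KEYS: &[&str] = &["password", "apikey", "accesstoken", "secret"];

/// Case + separator normalisation: "Api-Key" and "api_key" both become "apikey".
fn normalise_key(key: &str) -> String {
    key.chars()
        .filter(|c| c.is_ascii_alphanumeric())
        .map(|c| c.to_ascii_lowercase())
        .collect()
}

/// Value-pattern matching for a few credential shapes (illustrative subset).
fn looks_like_secret(value: &str) -> bool {
    value.starts_with("Bearer ") || value.starts_with("Basic ") || value.contains("-----BEGIN")
}

/// Returns a scrubbed clone; the input is left untouched.
fn scrub(value: &Json) -> Json {
    match value {
        Json::Str(s) if looks_like_secret(s) => Json::Str(REDACTED.to_string()),
        Json::Arr(items) => Json::Arr(items.iter().map(scrub).collect()),
        Json::Obj(map) => Json::Obj(
            map.iter()
                .map(|(k, v)| {
                    let key = normalise_key(k);
                    let sensitive = SENSITIVE_KEYS.iter().any(|s| *s == key);
                    let scrubbed = if sensitive {
                        Json::Str(REDACTED.to_string())
                    } else {
                        scrub(v)
                    };
                    (k.clone(), scrubbed)
                })
                .collect(),
        ),
        other => other.clone(),
    }
}
```

Returning a clone rather than mutating in place matches the "scrubbed
clone" wording above: every emit path can share the one scrubbed value.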
Updates the "in flight or pending" section: credential scrubbing is
no longer an open follow-up. Only remaining open work before the
next major Appendix H phase (R-2.3 Arrow Flight) is live kind-cluster
validation of R-2.1 + R-2.2 + scrub end-to-end.
87/87 lib tests pass (15 new in scrub::tests + 2 new in
executor::command::tests).
4baeec4
docs(adoption): noetl/worker#30 (R-2.2 tabular outputs as Arrow IPC)
Extends the landing-history table with PR #30 (5.5.0): the worker
now stages tabular tool outputs (DuckDB / Postgres / Snowflake
rowsets, `{columns, rows}` shape) as Arrow IPC stream bytes in
the shm cache instead of JSON when they exceed the inline budget.
Colocated consumers switch on `IpcHint.media_type =
"application/vnd.apache.arrow.stream"` to pick the Arrow decoder;
non-tabular outputs stay on the JSON fallback (#28).
The 5-row fallback chain from #29 is preserved — only the shm
payload encoding changes per-context. Durable PUT stays JSON
regardless so cross-node consumers see no behaviour change.
Updates the "in flight or pending" section: R-2.1 + R-2.2 are
both fully landed worker-side. Next major Appendix H phase is
R-2.3 (Arrow Flight gRPC endpoint for tabular cross-node fetch);
open follow-ups are credential-scrubbing-before-durable-PUT and
live kind-cluster validation of R-2.1 + R-2.2 together.
f10255b
docs(adoption): noetl/worker#24 closed — cross-node durable result-store slice
Extends the landing-history table with the fifth + final PR in the
`call.done` payload series:
5. #29 (5.4.0) — durable result-store via
`ControlPlaneClient::put_result` → `ResultRef` reference with
optional nested `ipc: IpcHint`. Cross-node consumers can now
fetch the bytes via `GET /api/result/resolve`; colocated
consumers keep the shm fast path.
`build_call_done_result` is now a 5-row fallback chain:
- inline `context` (≤ 100 KB)
- durable + ipc (both succeed)
- durable only (shm fails)
- ipc only (durable fails — matches #28 degraded mode)
- status only (everything fails — visible drop)
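
The five rows above can be read as a small decision function. This is a
sketch under stated assumptions, not the body of `build_call_done_result`:
`Emit` and `choose_emit` are hypothetical names, the two booleans stand in
for the outcome of the durable PUT and the shm staging attempt, and
"100 KB" is taken as 100 * 1024 bytes.

```rust
/// Which shape the `call.done` result takes (hypothetical enum for the sketch).
#[derive(Debug, PartialEq)]
enum Emit {
    InlineContext,
    DurableWithIpc,
    DurableOnly,
    IpcOnly,
    StatusOnly,
}

/// Assumption: the inline budget of "100 KB" is 100 * 1024 bytes.
const INLINE_BUDGET_BYTES: usize = 100 * 1024;

/// Picks the fallback row from the payload size and the two staging outcomes.
fn choose_emit(payload_len: usize, durable_ok: bool, shm_ok: bool) -> Emit {
    if payload_len <= INLINE_BUDGET_BYTES {
        return Emit::InlineContext;
    }
    match (durable_ok, shm_ok) {
        (true, true) => Emit::DurableWithIpc,
        (true, false) => Emit::DurableOnly,
        (false, true) => Emit::IpcOnly,
        (false, false) => Emit::StatusOnly,
    }
}
```

In-budget payloads never touch either store, so the staging outcomes only
matter once the payload exceeds the inline budget.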
Updates the "in flight or pending" section: #24 is fully closed; the
R-2.1 worker-side application has landed in both colocated AND
cross-node forms. Next major Appendix H phase is R-2.2 (tabular
tool outputs via `arrow_codec::encode_record_batch`).
021ae4d
docs(adoption): noetl/worker#24 call.done reference-only payload series
Documents the four-PR progression (#25 → #26 → #27 → #28) aligning the
worker's `call.done` payload with the Python broker's
`_validate_reference_only_payload` contract:
1. #25 (5.1.3) — minimum-viable `{status}`-only emit (forward-progress
fix; strips all tool output).
2. #26 (5.2.0) — restores data flow via inline `result.context` for
in-budget outputs.
3. #27 (5.2.1) — Rust-side pre-check against
`NOETL_EVENT_RESULT_CONTEXT_MAX_BYTES` so over-budget tool output
is visibly dropped rather than silently truncated.
4. #28 (5.3.0) — colocated-consumer slice: over-budget outputs stage
in the same-node Arrow IPC cache (`noetl-arrow-cache` 0.1.0,
R-2.1) and ride the event as `result.reference`.
Also updates the "Sub-PRs in flight or pending" section: the
colocated slice is now ✅; the durable `result_store` cross-node path
is the next open #24 slice; the next major Appendix H phase is R-2.2
(tabular tool outputs via `arrow_codec::encode_record_batch`).
a281cdd
noetl-executor-adoption: kind-validation fixes row (5.1.1 + 5.1.2)
Add an R-1.2 sub-PR landing history row for the three latent
worker bugs surfaced + fixed during the kind-validation pass:
- noetl/worker#19 — NATS URL inline auth (5.1.1)
- noetl/worker#21 — registration payload (5.1.2)
- noetl/worker#23 — command_id numeric JSON (5.1.2)
These had never fired before because the Rust worker had only
been exercised against MockSource in unit tests + anonymous-NATS
dev configs. Kind validation against the real Python broker
caught all three in sequence.
Refs noetl/ai-meta#30 validation summary.
a33a2ef
noetl-executor-adoption: NATS consumer-lag landed (#17); R-1.2 chunk done
Add the consumer-lag row to the R-1.2 sub-PR landing history
table (noetl-worker 5.1.0 / noetl/worker#17):
- New src/nats/lag_poller.rs with periodic poll task spawned
from Worker::run. Updates two gauges:
noetl_worker_nats_consumer_pending and
noetl_worker_nats_consumer_ack_pending (both labeled
stream + consumer).
- Cadence WORKER_NATS_LAG_POLL_INTERVAL env (default 5s,
clamped ≥1s).
- KEDA can now drive worker-pool autoscaling off the worker's
own /metrics endpoint.
- Behaviour-additive (feat:) → 5.0.0 → 5.1.0.
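
The cadence rule (default 5s, clamped ≥1s) can be sketched as below. The
function name is hypothetical, and the sketch assumes the raw value is a
whole number of seconds; the format the worker actually parses for
WORKER_NATS_LAG_POLL_INTERVAL is not stated here.

```rust
use std::time::Duration;

/// Resolves the poll interval from an optional raw setting.
/// Assumption: the value is an integer number of seconds.
fn lag_poll_interval(raw: Option<&str>) -> Duration {
    const DEFAULT_SECS: u64 = 5;
    const MIN_SECS: u64 = 1;
    // Missing or unparsable values fall back to the default.
    let secs = raw
        .and_then(|v| v.trim().parse::<u64>().ok())
        .unwrap_or(DEFAULT_SECS);
    // Clamp so the poll task never runs more often than once per second.
    Duration::from_secs(secs.max(MIN_SECS))
}
```

A caller would pass `std::env::var("WORKER_NATS_LAG_POLL_INTERVAL").ok().as_deref()`.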
Strike through the final "in flight" bullet and add a closing
note: the R-1.2 worker chunk + every documented follow-up is
done; next Appendix H phase is R-2.1 (noetl-arrow-cache crate).
Refs noetl/ai-meta#30.
49a841b
noetl-executor-adoption: meta.attempts landed (#15); both EE-3 follow-ups done
Add the meta.attempts row to the R-1.2 sub-PR landing history
table (noetl-worker 5.0.0 / noetl/worker#15):
- Threads Command.attempts: u32 (executor 0.3.0) through every
emitted envelope's meta field.
- meta: Some({"attempts": N}) populated ALWAYS — even attempts=0
is a meaningful signal; projector reads uniformly.
- 7 helpers + CommandExecutor::emit_event signatures gained
attempts: u32 → feat: + BREAKING CHANGE: → 4.0.0 → 5.0.0.
Both EE-3 follow-ups now ✅:
- App-side snowflake event_id (#14, 4.0.0)
- meta.attempts propagation (#15, 5.0.0)
Remaining pending bullet: NATS consumer lag metric (PR-2e
follow-up; unrelated to EE).
Refs noetl/ai-meta#30.
38be9c1
noetl-executor-adoption: app-side snowflake event_id landed (#14)
Add a new R-1.2 sub-PR row for the snowflake event_id work in
noetl-worker 4.0.0:
- Layout mirrors noetl.core.common.get_snowflake_id 1:1 — same
41/10/12 bit split, same env knobs (NOETL_SNOWFLAKE_EPOCH_MS,
NOETL_SNOWFLAKE_NODE_ID, NOETL_SHARD_ID).
- Cross-stack compat test (id_layout_matches_python_helper_formula)
reconstructs the Python helper's bit-packing formula from a
Rust-generated id.
- Constructor signatures changed → feat: + BREAKING CHANGE: →
semantic-release auto-bumped 3.0.0 → 4.0.0.
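
A minimal sketch of a 41/10/12 bit split, assuming the conventional
snowflake ordering (timestamp in the high 41 bits, then 10 bits of node id,
then a 12-bit sequence). The field order and the `pack` / `unpack` names are
assumptions for illustration; the worker's actual layout is whatever
noetl.core.common.get_snowflake_id defines.

```rust
const NODE_BITS: u32 = 10;
const SEQ_BITS: u32 = 12;
const TS_MASK: u64 = (1 << 41) - 1;
const NODE_MASK: u64 = (1 << NODE_BITS) - 1;
const SEQ_MASK: u64 = (1 << SEQ_BITS) - 1;

/// Packs (ms since the custom epoch, node id, sequence) into one 63-bit id.
/// Out-of-range inputs are masked down to their field width.
fn pack(ts_ms_since_epoch: u64, node_id: u64, seq: u64) -> u64 {
    ((ts_ms_since_epoch & TS_MASK) << (NODE_BITS + SEQ_BITS))
        | ((node_id & NODE_MASK) << SEQ_BITS)
        | (seq & SEQ_MASK)
}

/// Splits an id back into (timestamp, node id, sequence).
fn unpack(id: u64) -> (u64, u64, u64) {
    (
        (id >> (NODE_BITS + SEQ_BITS)) & TS_MASK,
        (id >> SEQ_BITS) & NODE_MASK,
        id & SEQ_MASK,
    )
}
```

A round-trip through `pack` and `unpack` is the same kind of check the
cross-stack test performs: rebuild the fields from a generated id and
compare them against the formula.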
Update the "in flight or pending" bullet to strike through the
app-side snowflake entry (done) and link noetl/worker#13 as the
remaining EE-3 follow-up (meta.attempts propagation).
Refs noetl/ai-meta#30.
7332aa9
noetl-executor-adoption: PR-EE-3 landed; EE series complete
After noetl/worker#11 merged (noetl-worker 3.0.0):
- Add EE-3 row to the R-1.2 sub-PR landing history table.
- Replace the "in flight or pending" EE bullet with the four-PR
complete status (✅ EE-1 / EE-2 / EE-3 / EE-4).
- Surface two follow-ups that fell out of EE-3 scope:
- App-side snowflake event_id generation (observability.md
Principle 3 — current code passes None and the server's
gen_snowflake() DB default fires).
- meta.attempts propagation (Command.attempts exists on the
executor 0.3.0 shape but outgoing events leave meta: None).
Refs noetl/ai-meta#30 — EE umbrella.
a581bf1
noetl-executor-adoption: EE series progress (3 of 4 landed)
Expand the cross-repo EE bullet under "Sub-PRs in flight or
pending" so worker contributors can see the full status before
picking up PR-EE-3:
- PR-EE-1 ✅ noetl/cli#37 (noetl-executor 0.3.1)
- PR-EE-2 ✅ noetl/server#6 + pipeline-fix #7 (noetl-server 2.0.1)
- PR-EE-4 ✅ noetl/noetl#639 (noetl 3.0.0)
- PR-EE-3 ⏳ here — replace WorkerEvent with ExecutorEvent
re-export; sends new wire shape. Both servers accept either
form via aliases so this is safe to land any time.
Drop the duplicate envelope-reconciliation bullet (the longer
one consolidates it). Cross-link to noetl/server wiki
event-envelope page.
ac8dc75
docs(wiki): record PR-2e (observability harness) + EE-1 progress
PR-2e (noetl/worker#8) landed; noetl-worker now at 2.1.0 with
Prometheus /metrics endpoint + 7 metrics covering every boundary
(pulls, dispatch, event emit, concurrent dispatches).
Updates:
- R-1.2 sub-PR landing history extended with PR-2e row covering
the 7-metric inventory, the dedicated 9090 port, the
observability.md Principle 2 satisfaction, and the
noetl/ai-meta#32 closure.
- Sub-PRs-in-flight: PR-2e removed (landed); event envelope
reconciliation roadmap updated to reflect PR-EE-1 landed on
noetl/cli (executor 0.3.1) and PR-EE-3 (worker switch) is the
remaining worker-side step waiting for server-side
PR-EE-2 + PR-EE-4.
- NATS consumer lag metric noted as a follow-up from PR-2e.
Refs noetl/ai-meta#30 -- Appendix H umbrella.
Refs noetl/worker#8 -- PR-2e.
Refs noetl/cli#37 -- PR-EE-1 (which the worker switch will
eventually adopt).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2617ea9
docs(wiki): record R-1.2 PR-2d-2 worker adoption (1.1.0 → 2.0.0)
Updates noetl-executor-adoption.md to reflect that PR-2d-2 landed
(was 'planned' in the previous version):
- "What the worker imports" section: drops 'Planned' framing;
documents each surface (CommandSource, ClaimOutcome, Pulled<H>,
Command) with its actual worker call site.
- New "AckHandle design" sub-section explains why NatsAckHandle is
NatsAckHandle { message, notification } not bare Message --
ClaimOutcome doesn't carry execution_id on AlreadyClaimed /
RetryLater / Failed variants, so the notification metadata rides
in the ack handle for observability.md Principle 4 compliance.
- New "Lossless WorkerCommand → ExecutorCommand translation" table
documents each field mapping; references the
translate_carries_full_context_as_input_including_cases test that
locks in the contract.
- R-1.2 sub-PR landing history extended with PR-2d-2 + the two CI
fixes (PRs #4 and #5) so the version cascade 1.1.0 -> 1.1.2 ->
2.0.0 is auditable.
- Sub-PRs-in-flight section: PR-2d-2 removed (landed); PR-2e
(Prometheus metrics harness, noetl/ai-meta#32) added as the
next worker-side R-1.2 deliverable.
- Adds worker-source link reference.
Refs noetl/ai-meta#30 -- Appendix H umbrella.
Refs noetl/worker#6 -- the PR this row documents.
Refs noetl/ai-meta#32 -- the metrics harness follow-up.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fe5753f
docs(wiki): update noetl-executor-adoption for 0.3.0 trait surface
PR-2d-1 (noetl/cli#35) landed the breaking CommandSource trait
redesign for noetl-executor 0.3.0. Worker wiki updated to:
- Document the new types the worker will import from
noetl_executor::worker::source in PR-2d-2: CommandSource trait
(with ack lifecycle + associated AckHandle), ClaimOutcome enum
(4-state — Claimed/AlreadyClaimed/RetryLater/Failed), Pulled<H>
wrapper, and the enriched Command (now carries render_context +
attempts).
- Replace the placeholder "Planned for PR-2d" section with a
concrete table mapping each new type to its worker call site.
- Add R-1.2 PR-2d-1 to the companion-PRs table on noetl/cli with
a deep-link to the cli wiki's design-decisions section.
- Bump docs.rs link versions 0.2.1 -> 0.3.0 throughout.
- Rename "PR-2d" -> "PR-2d-2" in the "in flight" section since
the planned work split into two PRs (cli trait redesign +
worker adoption).
Refs noetl/ai-meta#30 -- Appendix H umbrella.
Refs noetl/cli#35 -- the PR that landed the 0.3.0 surface this
worker page now documents.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
34930c8
docs(wiki): scaffold worker wiki (Home + noetl-executor-adoption + release-pipeline + sidebar)
First substantive content for the noetl-worker wiki. Replaces the
"Welcome to the worker wiki!" stub with a Home page covering crate
architecture, module layout, and pages.
- **Home.md**: dependency graph, module layout table (per-module
link to GitHub source), distribution channels, related repo
cross-links.
- **noetl-executor-adoption.md**: companion to the cli wiki's
executor-crate-architecture page from the worker side. Lists
what the worker imports from noetl-executor today (Operator,
Condition, evaluate_structured_condition), what stays
worker-local (pull-loop control flow per § H.10), and the R-1.2
sub-PR landing history (PR-2c shipped; PR-2d planned).
- **release-pipeline.md**: documents the semantic-release →
release-worker flow, the GitHub-Actions safeguard gotcha
(GITHUB_TOKEN-created tags don't fire workflows), and the fix
pattern noetl/cli uses (#32) which noetl/worker#4 will adopt.
- **_Sidebar.md**: navigation; cross-links to cli + tools +
server + noetl + ops wikis.
Cross-references all use the noetl/<repo>#<NN> format per the
ai-meta cross-repo issue-linking discipline.
Refs noetl/ai-meta#30 -- Appendix H umbrella.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
337dd2a