
Umbrella Rust Worker Migration

Kadyapam edited this page Jun 2, 2026 · 3 revisions

Umbrella — Rust Worker Migration (Appendix H) — CLOSED 2026-06-02

ai-task: noetl/ai-meta#30 (CLOSED) · Opened: 2026-05-30 · Closed: 2026-06-02 · Status: ✅ R-1 / R-2 / R-3 work landed; R-4 superseded by system-playbook approach · Successor: Umbrella: System Pool Design — the rest of migration goes via system playbooks on the system worker pool, not via more compiled Rust binaries · Blueprint: Appendix H of the global hybrid cloud architecture

Closed 2026-06-02 with R-1 + R-2 + R-3 done. R-4 was a "6-month checkpoint to decide whether to port _handle_event_inner to Rust"; superseded by the architecture decision that remaining migration goes via compilable + pluggable system playbooks. Three R-3 follow-up gaps remain in their own umbrellas: #43, #47, #48. See the closing comment on the issue for the full audit trail.

Goal

Migrate the runtime hot path from Python to Rust while protecting the production Python platform. The rollout is phased so that each step can be shipped, and rolled back, independently.

Phases

R-0 — Blueprint update ✅ (2026-05-29)

Appendix H drafted in noetl/docs#174. Landed 2026-05-29.

R-1 — Shared noetl-executor crate ✅ (2026-05-27 → 2026-05-31)

CLI and worker depend on the same execution-engine crate; ≥80% shared code path.

| Sub-task | Done | Status |
| --- | --- | --- |
| R-1.1 PR-1 — crate skeleton | 2026-05-27 | |
| R-1.1 PR-2a — extract YAML types | 2026-05-27 | |
| R-1.1 PR-2b — extract parser + command-gen | 2026-05-28 | |
| R-1.1 PR-2c — replace inline tools with noetl-tools (8 sub-PRs) | 2026-05-29 → 2026-05-30 | ✅ closes noetl/cli#19 |
| R-1.1 PR-2d — integration tests + docs | 2026-05-30 | |
| R-1.2 PR-1 — publish executor 0.1.0 to crates.io | 2026-05-30 | |
| R-1.2 PR-2a — i64 execution_id alignment | 2026-05-30 | |
| R-1.2 PR-2b/c — worker condition module adopts executor | 2026-05-30 | |
| R-1.2 PR-2d — worker NatsCommandSource impl | 2026-05-31 | |
| R-1.2 PR-2e — worker observability harness | 2026-05-31 | ✅ closes #32 |

R-2 — Arrow + Flight ✅ (2026-05-28 → 2026-06-01)

| Sub-task | Done | Status |
| --- | --- | --- |
| R-2.1 — arrow-rs in noetl-tools | 2026-05-28 | |
| R-2.2 PR-A/B — tabular encoder fallback | 2026-05-29 | |
| R-2.3 Phase A — pyarrow.flight.FlightServerBase on Python server | 2026-05-30 | |
| R-2.3 Phase B — noetl-arrow-flight-client crate | 2026-05-31 | |
| R-2.3 Phase C1 — get_flight_info discovery | 2026-05-31 | |
| R-2.3 Phase C2 — mTLS + token auth | 2026-06-01 | ✅ closes #33 |

R-3 — Tool-kind parity ✅ (2026-06-01 → 2026-06-02; follow-up gaps tracked in #43, #47, #48)

Rust worker reaches tool-kind parity for the production playbook mix. KEDA scales Rust and Python pools off the same NATS subjects (per-pool routing).

| Sub-task | Date | Status | Tracking |
| --- | --- | --- | --- |
| R-3 Phase B-1 — nats tool kind in noetl-tools | 2026-06-02 | ✅ closed | #38 |
| R-3 Phase B-2 — mcp tool kind in noetl-tools | 2026-06-02 | ✅ closed | #39 |
| R-3 Phase B-3 — bump worker noetl-tools ^2.16 | 2026-06-02 | ✅ closed | #40 |
| R-3 Phase B-4 — KEDA dual-scaling for Rust pool | 2026-06-02 | ✅ closed | #41 |
| R-3 Phase C-1 — agent tool kind routing decision | 2026-06-02 | ✅ closed | #42 |
| R-3 Phase C-2 — container tool kind callback design | 2026-06-02 → in flight | 🔄 design | #43 |
| Per-pool NATS subject routing (PRs 1-6) | 2026-05-31 → 2026-06-01 | | #42 |
| KEDA multi-trigger scaler | 2026-06-01 | | noetl/noetl#657 |
| Multi-arch image publishing (worker v5.9.0) | 2026-06-02 | ✅ closed | #44 |
| Routing flag flip on prod-shaped env | 2026-06-01 | | noetl/ops#138 |
| task_sequence tool kind | 2026-06-02 surfaced | ⏳ Ready to pick up | #47 |
| Credential alias resolution | 2026-06-02 surfaced | ⏳ Ready to pick up | #48 |

R-4 — _handle_event_inner decision point

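Six-month review based on production load data: port to Rust if handle_event p90 is materially constraining, otherwise declare hybrid the durable design. This checkpoint was never reached; it was superseded on 2026-06-02 by the decision to carry the remaining migration through system playbooks.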
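To make the p90 criterion concrete, here is a minimal sketch of a nearest-rank percentile over latency samples. It is not taken from the NoETL codebase: the function names, the millisecond unit, and the `budget_ms` threshold are all assumptions for illustration, since the page does not define what "materially constraining" means numerically.

```rust
/// Nearest-rank percentile over latency samples in milliseconds.
/// Returns the smallest sample such that at least `pct` percent of all
/// samples are less than or equal to it. Hypothetical helper, not NoETL API.
pub fn percentile_nearest_rank(samples_ms: &[u64], pct: u32) -> Option<u64> {
    if samples_ms.is_empty() || pct == 0 || pct > 100 {
        return None;
    }
    let mut sorted = samples_ms.to_vec();
    sorted.sort_unstable();
    // rank = ceil(pct * n / 100), computed in integers; always in 1..=n here.
    let rank = (pct as usize * sorted.len() + 99) / 100;
    Some(sorted[rank - 1])
}

/// True when the p90 latency exceeds the given budget.
/// `budget_ms` is an assumed threshold; the source does not specify one.
pub fn p90_exceeds_budget(samples_ms: &[u64], budget_ms: u64) -> Option<bool> {
    percentile_nearest_rank(samples_ms, 90).map(|p90| p90 > budget_ms)
}
```

Integer arithmetic is used for the rank so the result does not depend on floating-point rounding; a production review would more likely read p90 from the metrics backend than recompute it from raw samples.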

Recent activity

| Date | Event |
| --- | --- |
| 2026-06-02 | Kind validation against worker v5.9.0 surfaced #47 + #48 (regression gaps). Routing scheme proven end-to-end. Session details. |
| 2026-06-02 | Worker v5.9.0 republished as native multi-arch via noetl/worker#39 (replaces failed QEMU build). |
| 2026-06-02 | R-3 Phase B/C-1 issues all closed: #38-#42 (5 closures). |
| 2026-06-01 | Ops PR-5 routing flag flip merged via noetl/ops#138. Server publishes start landing on noetl.commands.shared.<eid> / noetl.commands.python.<eid>. |
| 2026-05-31 | Rust worker observability harness PR-2e merged; /metrics endpoint + snowflake event_id. |
| 2026-05-30 | PR-2c noetl-tools registry sweep complete (8 sub-PRs); noetl/cli#19 closed. |
| 2026-05-29 | Appendix H blueprint landed via noetl/docs#174. |
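The per-pool subject scheme mentioned in the 2026-06-01 entry can be sketched as a small builder and parser. This is an illustration only: the `Pool` enum, the function names, and the reading of `<eid>` as the i64 execution id are assumptions, and the wildcard helper assumes (without confirmation from the source) that a pool's workers subscribe with a single-token NATS wildcard.

```rust
/// Pools that appear in the two subjects quoted on this page.
/// The variant set is an assumption; the real router may define more.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Pool {
    Shared,
    Python,
}

impl Pool {
    /// Subject token for this pool.
    pub fn token(self) -> &'static str {
        match self {
            Pool::Shared => "shared",
            Pool::Python => "python",
        }
    }
}

/// Build `noetl.commands.<pool>.<eid>`, assuming `<eid>` is the execution id.
pub fn command_subject(pool: Pool, execution_id: i64) -> String {
    format!("noetl.commands.{}.{}", pool.token(), execution_id)
}

/// Hypothetical subscription pattern for one pool; `*` matches exactly
/// one subject token in NATS.
pub fn pool_wildcard(pool: Pool) -> String {
    format!("noetl.commands.{}.*", pool.token())
}

/// Parse a subject back into its pool and execution id.
/// Returns None for unknown pools, other prefixes, or a non-numeric id.
pub fn parse_command_subject(subject: &str) -> Option<(Pool, i64)> {
    let rest = subject.strip_prefix("noetl.commands.")?;
    let (pool, eid) = rest.split_once('.')?;
    let pool = match pool {
        "shared" => Pool::Shared,
        "python" => Pool::Python,
        _ => return None,
    };
    eid.parse::<i64>().ok().map(|id| (pool, id))
}
```

Keeping the pool as the third token means a scaler or worker can select a whole pool with one wildcard subscription, while the trailing execution id stays available for per-execution filtering.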

Next concrete steps

  1. Pick up #47 (task_sequence tool kind). Read the Python implementation in repos/noetl/noetl/worker/process.py first to decide between Option A (implement in noetl-tools) and Option B (route to the Python pool).
  2. Pick up #48 (credential alias resolution). Add ControlPlaneClient::get_credential if it is not already present, then wire it into the tool dispatch path so that auth: "<alias>" resolves before the request reaches the tool.
  3. Continue #43 container tool kind callback design.
  4. Re-run kind regression after #47 + #48 land; target ≥45/53 COMPLETED (today's 17/53 + the ~35 gap fixes).
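The alias resolution described in step 2 could look roughly like the sketch below. Everything here is hypothetical: the `CredentialSource` trait stands in for whatever `ControlPlaneClient::get_credential` ends up exposing (its real signature is not given on this page), and the `Credential`, `Auth`, and `ResolveError` types are invented for illustration.

```rust
use std::collections::HashMap;

/// Illustrative credential payload; the real shape is not specified here.
#[derive(Debug, Clone, PartialEq)]
pub struct Credential {
    pub kind: String,
    pub secret: String,
}

/// What a step's `auth` field may hold before dispatch (assumed model).
#[derive(Debug, Clone, PartialEq)]
pub enum Auth {
    Alias(String),
    Resolved(Credential),
}

#[derive(Debug, PartialEq)]
pub enum ResolveError {
    UnknownAlias(String),
}

/// Hypothetical stand-in for the control-plane credential lookup.
pub trait CredentialSource {
    fn get_credential(&self, alias: &str) -> Option<Credential>;
}

/// In-memory source, useful for tests of the dispatch path.
pub struct InMemorySource(pub HashMap<String, Credential>);

impl CredentialSource for InMemorySource {
    fn get_credential(&self, alias: &str) -> Option<Credential> {
        self.0.get(alias).cloned()
    }
}

/// Resolve `auth` before the tool sees it: aliases are looked up,
/// already-resolved credentials pass through unchanged.
pub fn resolve_auth<S: CredentialSource>(
    source: &S,
    auth: Auth,
) -> Result<Credential, ResolveError> {
    match auth {
        Auth::Resolved(cred) => Ok(cred),
        Auth::Alias(alias) => source
            .get_credential(&alias)
            .ok_or_else(|| ResolveError::UnknownAlias(alias)),
    }
}
```

Resolving in the dispatch layer keeps individual tools unaware of aliases, so a missing alias fails once, before any tool-specific code runs.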

Related

NoETL Dashboard

Active Umbrellas

Closed Umbrellas

Conventions

Per-repo wikis
