Kadyapam edited this page Jun 1, 2026 · 13 revisions

NoETL Ops

This wiki is the operational + deployment companion to noetl/ops. Topics that live in this repository — Kubernetes manifests, deployment playbooks, CI/CD, infrastructure automation — have their reference pages here.

For NoETL application documentation (Python API, DSL semantics, the v2 distributed-runtime spec, etc.) see the noetl/noetl wiki.

Where the manifests live. As of the Scope B consolidation (May 2026), all NoETL operational manifests live exclusively in noetl/ops/ci/manifests/. The previous parallel copy at noetl/noetl/ci/manifests/ was deleted; only a MOVED.md breadcrumb remains there. The automation/development/noetl.yaml playbook reads from local ci/manifests/... paths (no more cross-repo $NOETL_REPO/ci/manifests/...).

Pages

Manifests (ci/manifests/)

| Page | What |
| --- | --- |
| KEDA Scaler | Worker-pool autoscaling via NATS JetStream consumer lag. Install Helm chart + apply ScaledObject. |
| NATS Supercluster | Multi-cluster JetStream topology with gateway-meshed clusters. Apply 2-cluster reference manifest. |

Automation playbooks (automation/)

| Page | What |
| --- | --- |
| GKE Helm install | Install + upgrade NoETL on GKE via the Helm chart + Cloud SQL + PgBouncer + chart-templated KEDA. The GKE deploy path the project supports. |
| Firestore MCP agent | Firestore document, event-log, replay, and batch helper methods used by domain playbooks. |

(The automation/development/noetl.yaml kind playbook is currently documented inline in the noetl/noetl wiki under its operational sections; it will migrate here in a future Scope B refactor.)

Kind-cluster validation rigs (automation/development/)

Reproducible end-to-end smoke tests for individual feature surfaces. Each rig is <feature>-validation.yaml + validate-<feature>.sh + validate-<feature>.sql.
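
The naming convention above can be expressed as a small helper. This is only an illustrative sketch: the `rig_files` function and its dictionary keys are hypothetical names, and the `automation/development` root is taken from the section heading rather than from any documented API.

```python
from pathlib import PurePosixPath


def rig_files(feature: str, root: str = "automation/development") -> dict[str, PurePosixPath]:
    """Return the three files that make up one validation rig.

    Follows the <feature>-validation.yaml / validate-<feature>.sh /
    validate-<feature>.sql pattern. The keys are illustrative labels.
    """
    base = PurePosixPath(root)
    return {
        "playbook": base / f"{feature}-validation.yaml",
        "script": base / f"validate-{feature}.sh",
        "sql": base / f"validate-{feature}.sql",
    }


files = rig_files("flight-tls")
for kind, path in files.items():
    print(f"{kind}: {path}")
```

For example, the `flight-tls-validation` rig would resolve to a playbook, a shell script, and a SQL probe file that all share the `flight-tls` feature name.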

| Rig | What it exercises | Worker pool |
| --- | --- | --- |
| rust-worker-r2-validation | R-2.1 cross-node durable PUT, R-2.1 colocated shm cache, R-2.2 Arrow IPC encoding, producer-side credential scrub. Same PIN_RUST_WORKER=1 auto-pinning shape as the other Rust-only rigs (scales Python worker pool → 0, waits for drain, restores on exit via cleanup trap). SQL probes filter by execution_id = :exec_id passed via psql -v since worker_id only lands on command.claimed events under the post-EE-4 schema. | Rust |
| result-fetch-validation | result_fetch tool kind (noetl-tools 2.11+): producer over-budget Arrow IPC ref → fetch_via_flight + fetch_via_http via the playbook surface. Scales the Python worker pool to 0 and waits for full pod drain to pin commands to the Rust worker (the Phase A over-budget branch only fires on the Rust side). | Rust |
| flight-tls-validation | R-2.3 Phase C2 full trust boundary: server TLS (C2.1) + client TLS (C2.2) + bearer-token middleware (C2.3) + mTLS (C2.4) all on, talking through the result_fetch tool kind. Companion generate-flight-tls.sh bootstraps the certs + Secrets via openssl (private tmpdir, no repo leakage); --off reverts. The production swap to cert-manager is drop-in: the Secret shape stays the same. | Rust |
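
The auto-pinning shape mentioned for the Rust-only rigs (scale the Python worker pool to 0, wait for drain, restore on exit) can be sketched as follows. The rigs themselves do this in shell with a cleanup trap; the Python `try`/`finally` below is a swapped-in analogue, not the rigs' actual code. All three callables are hypothetical hooks that a caller would wire to the cluster tooling.

```python
import time
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def pin_to_rust_worker(
    get_replicas: Callable[[], int],      # hypothetical: current Python pool size
    set_replicas: Callable[[int], None],  # hypothetical: scale the Python pool
    running_pods: Callable[[], int],      # hypothetical: Python pods still alive
    poll_seconds: float = 2.0,
    timeout_seconds: float = 120.0,
) -> Iterator[None]:
    """Scale the Python pool to 0, wait for drain, restore on exit."""
    original = get_replicas()
    try:
        set_replicas(0)
        deadline = time.monotonic() + timeout_seconds
        # Wait until no Python worker pod is left to claim commands.
        while running_pods() > 0:
            if time.monotonic() >= deadline:
                raise TimeoutError("Python worker pods did not drain in time")
            time.sleep(poll_seconds)
        yield
    finally:
        # Plays the role of the shell cleanup trap: always restore the pool.
        set_replicas(original)
```

The `finally` branch runs whether the body succeeds, fails, or the drain times out, so the Python pool returns to its original size in every case.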

Each .sh is self-contained: it registers the playbook, kicks off the execution, polls for completion, runs the SQL probes, and samples the worker /metrics endpoint. The rigs pair with agents/rules/deployment-validation.md: anything that ships in a container image MUST run through one of these before GKE rollout.
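
The sequence each script follows can be outlined as below. This is a rough sketch under stated assumptions, not the scripts' actual code: `run_rig`, its callable parameters, and the `"completed"` / `"failed"` status strings are all hypothetical. `psql_probe_command` only illustrates passing `exec_id` through `psql -v` and omits connection options.

```python
import time
from typing import Callable, Sequence


def psql_probe_command(exec_id: str, sql_file: str) -> list[str]:
    """Build a psql invocation that exposes :exec_id to the probe file."""
    return ["psql", "-v", f"exec_id={exec_id}", "-f", sql_file]


def run_rig(
    register: Callable[[], None],            # hypothetical: register the playbook
    start: Callable[[], str],                # hypothetical: returns an execution id
    status: Callable[[str], str],            # hypothetical: returns a status string
    probes: Sequence[Callable[[str], None]], # hypothetical: SQL probe runners
    sample_metrics: Callable[[], str],       # hypothetical: reads worker /metrics
    poll_seconds: float = 2.0,
    timeout_seconds: float = 300.0,
) -> str:
    """Register, execute, poll, probe, then sample metrics, in that order."""
    register()
    exec_id = start()
    deadline = time.monotonic() + timeout_seconds
    while True:
        state = status(exec_id)
        if state in ("completed", "failed"):  # assumed terminal states
            break
        if time.monotonic() >= deadline:
            raise TimeoutError(f"execution {exec_id} did not finish in time")
        time.sleep(poll_seconds)
    if state == "failed":
        raise RuntimeError(f"execution {exec_id} failed")
    for probe in probes:
        probe(exec_id)
    return sample_metrics()
```

Passing the steps in as callables keeps the ordering logic separate from the cluster and database tooling, so the flow can be exercised with stubs.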

Cross-references
