mtls rust stack
Mutual TLS on the control-plane API between noetl-server-rust and
noetl-worker-rust — the transport that authenticates + encrypts the
worker→server credential channel (GET /api/credentials/<alias>) so a resolved
secret no longer travels plaintext on the wire.
Part of the Secrets Wallet umbrella (noetl/ai-meta#61), Phase 4:
| Phase | Where | What |
|---|---|---|
| 4a | noetl/server#103 (v2.30.0) | Server opt-in TLS/mTLS listener — NOETL_TLS_CERT / NOETL_TLS_KEY / NOETL_TLS_CLIENT_CA. |
| 4b | noetl/worker#56 (v5.12.0) | Worker mTLS client — NOETL_TLS_CLIENT_CERT / NOETL_TLS_CLIENT_KEY / NOETL_TLS_CA. |
| 4c | noetl/ops#163 | cert-manager issues the certs in-cluster; manifests wire the deployments. |
The certs are minted in-cluster by cert-manager — no manual openssl /
kubectl create secret, nothing secret in git.
ci/manifests/noetl/tls/ in noetl/ops:
| File | What |
|---|---|
| certificates.yaml | Self-signed Issuer → CA Certificate → CA Issuer → noetl-server-tls (serverAuth, SAN = service DNS + localhost) + noetl-worker-tls (clientAuth). cert-manager materializes both Secrets (tls.crt / tls.key / ca.crt). |
| server-rust-mtls-patch.yaml | Mounts noetl-server-tls, sets the server NOETL_TLS_* env + https public URL, swaps probes to tcpSocket. |
| worker-rust-mtls-patch.yaml | Mounts noetl-worker-tls, sets the worker NOETL_TLS_CLIENT_* env + https NOETL_SERVER_URL, rewrites wait-for-api to curl the mTLS endpoint with the client cert. |
# cert-manager (once)
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.16.2/cert-manager.yaml
kubectl -n cert-manager rollout status deploy/cert-manager-webhook --timeout=180s
# issue the certs
kubectl apply -f ci/manifests/noetl/tls/certificates.yaml
kubectl -n noetl wait --for=condition=Ready certificate/noetl-server-tls certificate/noetl-worker-tls --timeout=120s
# flip the rust deployments to mTLS
kubectl -n noetl patch deploy noetl-server-rust --type strategic --patch-file ci/manifests/noetl/tls/server-rust-mtls-patch.yaml
kubectl -n noetl patch deploy noetl-worker-rust --type strategic --patch-file ci/manifests/noetl/tls/worker-rust-mtls-patch.yaml

Verify: the server logs `Server listening (TLS) tls=true mtls=true`; the worker logs `TLS enabled mtls=true ca=true` → `Worker registered`. The full runbook (verify + revert) is in the manifest directory's README.md.
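The log check can be scripted. The following is a minimal sketch, not the runbook from the manifest directory: it assumes the log lines appear exactly as quoted on this page and that `kubectl logs deploy/<name>` reaches a pod that has already started. The function names `has_marker` and `verify_mtls` are hypothetical.

```shell
# Succeeds if the text on stdin contains the fixed string given as $1.
has_marker() {
  grep -F -q -- "$1"
}

# Checks both deployments for the log lines quoted on this page.
# Namespace and deployment names follow the patch commands in this section;
# the exact log wording is an assumption taken from the text.
verify_mtls() {
  kubectl -n noetl logs deploy/noetl-server-rust \
    | has_marker 'tls=true mtls=true' \
    || { echo 'server: no "tls=true mtls=true" line yet' >&2; return 1; }

  kubectl -n noetl logs deploy/noetl-worker-rust \
    | has_marker 'Worker registered' \
    || { echo 'worker: no "Worker registered" line yet' >&2; return 1; }

  echo 'mTLS markers found in server and worker logs'
}
```

Running `verify_mtls` after the two `kubectl patch` commands returns non-zero until both markers show up, so it can sit in a retry loop or a CI step.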
A listener that requires client certs rejects the certless handshake that Kubernetes liveness/readiness probes and curl-based init containers perform by default. This setup handles two consequences:

- Server probes → `tcpSocket`. An `httpGet`/HTTPS probe can't present a client cert, so the mTLS handshake fails and the pod never goes Ready. A TCP port-open probe is the pragmatic fix; a separate non-mTLS health port is the production-grade alternative.
- Worker `wait-for-api` init → mTLS curl. The init container's plain-HTTP `curl` can't complete against an mTLS server, so the pod hangs in `Init`. It's rewritten to curl `https://…/api/health` with the mounted client cert.
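The rewritten init step amounts to a retry loop around an mTLS `curl`. The sketch below illustrates that loop; it is not the manifest's actual init script. The function name `wait_for_api`, the retry count, the 2-second delay, and the idea that the mounted Secret exposes `tls.crt`, `tls.key`, and `ca.crt` in one directory are assumptions. The health URL is passed in rather than hard-coded.

```shell
# Polls an mTLS-protected health endpoint until it answers successfully.
#   $1  full health URL (https://.../api/health)
#   $2  directory holding tls.crt, tls.key and ca.crt (assumed mount layout)
#   $3  maximum number of attempts (default 30)
wait_for_api() {
  local url="$1" tls_dir="$2" attempts="${3:-30}" i=0
  while [ "$i" -lt "$attempts" ]; do
    # --cert/--key present the client identity; --cacert verifies the server.
    if curl --silent --fail --output /dev/null \
         --cert "$tls_dir/tls.crt" \
         --key "$tls_dir/tls.key" \
         --cacert "$tls_dir/ca.crt" \
         "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  echo "API not reachable over mTLS after $attempts attempts" >&2
  return 1
}
```

Without `--cert` and `--key` the same request fails at the TLS handshake, which is the hang in `Init` described above.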
Phase 4d — automation/helm/noetl/ now exposes the same mTLS shape as a
values-gated opt-in (noetl/ops#165,
Secrets Wallet Phase 4 GKE flip). Disabled by default — when off the chart
renders byte-identical to pre-Phase-4 deployments.
# bootstrap CA (non-prod / smoke) — chart provisions a self-signed CA in-cluster
helm upgrade --install noetl ./automation/helm/noetl/ \
--set workerPool.enabled=true \
--set tls.enabled=true \
--set tls.certManager.bootstrap.enabled=true
# production — point at a ClusterIssuer backed by GCP CAS or SPIRE/SPIFFE
helm upgrade --install noetl ./automation/helm/noetl/ \
-f my-values.yaml \
--set tls.enabled=true \
--set tls.certManager.issuerRef.name=gcp-cas-issuer \
--set tls.certManager.issuerRef.kind=ClusterIssuer

The chart's tls: values block (excerpt):
tls:
  enabled: false              # master toggle; off renders no Cert/Issuer
  mtls:
    enabled: true             # require + verify client certs (else server-only TLS)
  certManager:
    enabled: true             # let chart issue Certificate resources
    issuerRef:
      name: noetl-ca-issuer
      kind: Issuer            # Issuer or ClusterIssuer
    bootstrap:
      enabled: false          # provision a self-signed CA Issuer (non-prod)
      caCertificateName: noetl-ca
      caSecretName: noetl-ca-tls
    server: { secretName: noetl-server-tls, dnsNames: [], duration: 8760h, renewBefore: 720h }
    worker: { secretName: noetl-worker-tls, duration: 8760h, renewBefore: 720h }

What the chart wires (under tls.enabled: true):
- `templates/tls-certificates.yaml` → cert-manager `Certificate` resources for the server (serverAuth, SAN = service DNS) + the worker pool (clientAuth), optionally bootstrapping a self-signed CA Issuer for non-prod.
- `server-deployment.yaml` → mounts `noetl-server-tls`, sets `NOETL_TLS_CERT` / `KEY` (+ `CLIENT_CA` when `mtls.enabled: true`), `https` `NOETL_PUBLIC_SERVER_URL`, and switches probes to `tcpSocket` under mTLS (`httpGet HTTPS` under server-only TLS).
- `worker-pool-deployment.yaml` → mounts `noetl-worker-tls`, sets `NOETL_TLS_CLIENT_CERT` / `KEY` + `NOETL_TLS_CA`, overrides `NOETL_SERVER_URL` to `https://`, and rewrites the `wait-for-api` init container to curl the mTLS endpoint with the client cert.
The env contract is the Rust noetl-server / noetl-worker contract; under a Python image these env vars are no-ops. Helm-managed cutover to the Rust binaries is a separate concern.
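When debugging a worker container, it can help to check which of the worker-side variables are actually set. The sketch below is an illustrative preflight check, not the worker's startup logic: the mapping from variables to modes (both client cert and key set means mTLS, CA only means server-verified TLS, nothing set means plaintext) is an assumption, and `worker_tls_mode` is a hypothetical name.

```shell
# Prints the transport mode implied by the worker-side variables:
#   mtls      - client cert and key are both set
#   tls       - only the CA bundle is set (server verified, no client cert)
#   plaintext - none of the variables are set
# Returns 1 when only one of cert/key is set (half-configured identity).
# The mode mapping is an assumption for illustration, not documented behaviour.
worker_tls_mode() {
  local cert="${NOETL_TLS_CLIENT_CERT:-}"
  local key="${NOETL_TLS_CLIENT_KEY:-}"
  local ca="${NOETL_TLS_CA:-}"

  if [ -n "$cert" ] && [ -n "$key" ]; then
    echo mtls
  elif [ -n "$cert" ] || [ -n "$key" ]; then
    echo 'incomplete: set both NOETL_TLS_CLIENT_CERT and NOETL_TLS_CLIENT_KEY' >&2
    return 1
  elif [ -n "$ca" ]; then
    echo tls
  else
    echo plaintext
  fi
}
```

A result of `plaintext` inside a pod that should be on mTLS suggests the patch or chart values did not reach that deployment.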
The kind reference (ci/manifests/noetl/tls/, Phase 4c) and the Helm
values-gated path (above, Phase 4d) cover the same shape. Production GKE
should point tls.certManager.issuerRef at a ClusterIssuer backed by GCP
Certificate Authority Service or a SPIRE/SPIFFE federation, not the
self-signed bootstrap Issuer.
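A production rollout can guard against the bootstrap Issuer by inspecting the `issuerRef` on the issued Certificate. This is a sketch under assumptions: `require_cluster_issuer` is a hypothetical helper, the default namespace `noetl` is taken from the kind commands above, and the Certificate name is passed in because the names the Helm chart generates are not stated here.

```shell
# Prints "<certificate> -> <kind>/<name>" for the issuer behind a cert-manager
# Certificate and fails unless the issuer kind is ClusterIssuer.
#   $1  Certificate resource name
#   $2  namespace (default: noetl, an assumption)
require_cluster_issuer() {
  local cert_name="$1" namespace="${2:-noetl}" ref
  ref=$(kubectl -n "$namespace" get certificate "$cert_name" \
        -o jsonpath='{.spec.issuerRef.kind}/{.spec.issuerRef.name}') || return 1
  echo "$cert_name -> $ref"
  case "$ref" in
    ClusterIssuer/*) return 0 ;;
    *) echo "$cert_name is not issued by a ClusterIssuer" >&2; return 1 ;;
  esac
}
```

For example, `require_cluster_issuer noetl-server-tls` fails while the Certificate still points at a namespaced Issuer. It does not check that the ClusterIssuer is backed by GCP CAS or SPIRE/SPIFFE.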
- GKE Helm install
- System worker pool
- Server deployment-specification (§ Transport security)
- Worker deployment-specification (§ Transport security)