Kadyapam edited this page Jun 6, 2026 · 2 revisions

mTLS for the Rust noetl stack

Mutual TLS on the control-plane API between noetl-server-rust and noetl-worker-rust. It is the transport that authenticates and encrypts the worker→server credential channel (GET /api/credentials/<alias>), so a resolved secret no longer travels over the wire in plaintext.

Part of the Secrets Wallet umbrella (noetl/ai-meta#61), Phase 4:

| Phase | Where | What |
|---|---|---|
| 4a | noetl/server#103 (v2.30.0) | Server opt-in TLS/mTLS listener: NOETL_TLS_CERT / NOETL_TLS_KEY / NOETL_TLS_CLIENT_CA. |
| 4b | noetl/worker#56 (v5.12.0) | Worker mTLS client: NOETL_TLS_CLIENT_CERT / NOETL_TLS_CLIENT_KEY / NOETL_TLS_CA. |
| 4c | noetl/ops#163 | cert-manager issues the certs in-cluster; manifests wire the deployments. |

The certs are minted in-cluster by cert-manager — no manual openssl / kubectl create secret, nothing secret in git.

Manifests

ci/manifests/noetl/tls/ in noetl/ops:

| File | What |
|---|---|
| certificates.yaml | self-signed Issuer → CA Certificate → CA Issuer → noetl-server-tls (serverAuth, SAN = service DNS + localhost) + noetl-worker-tls (clientAuth). cert-manager materializes both Secrets (tls.crt / tls.key / ca.crt). |
| server-rust-mtls-patch.yaml | mounts noetl-server-tls, sets the server NOETL_TLS_* env + https public URL, swaps probes to tcpSocket. |
| worker-rust-mtls-patch.yaml | mounts noetl-worker-tls, sets the worker NOETL_TLS_CLIENT_* env + https NOETL_SERVER_URL, rewrites wait-for-api to curl the mTLS endpoint with the client cert. |
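Once the Certificates are Ready, one way to confirm the usages and SANs listed for certificates.yaml is to decode the issued Secrets. The following is a minimal sketch, not part of noetl/ops: the helper names cert_extensions and secret_cert are illustrative, and it assumes kubectl, a base64 that accepts -d, and OpenSSL 1.1.1 or later (for x509 -ext) on PATH.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting cert-manager-issued TLS Secrets.

# cert_extensions: read a PEM certificate on stdin and print its
# subjectAltName and extendedKeyUsage extensions (first cert in the file).
cert_extensions() {
  openssl x509 -noout -ext subjectAltName,extendedKeyUsage
}

# secret_cert NAMESPACE SECRET: print tls.crt from a TLS Secret as PEM.
secret_cert() {
  kubectl -n "$1" get secret "$2" -o jsonpath='{.data.tls\.crt}' | base64 -d
}

# Usage, against a cluster where both Certificates are Ready:
#   secret_cert noetl noetl-server-tls | cert_extensions
#   secret_cert noetl noetl-worker-tls | cert_extensions
```

For the server Secret the output should list "TLS Web Server Authentication" and the service DNS names; for the worker Secret it should list "TLS Web Client Authentication".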

Enable (kind)

# cert-manager (once)
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.16.2/cert-manager.yaml
kubectl -n cert-manager rollout status deploy/cert-manager-webhook --timeout=180s

# issue the certs
kubectl apply -f ci/manifests/noetl/tls/certificates.yaml
kubectl -n noetl wait --for=condition=Ready certificate/noetl-server-tls certificate/noetl-worker-tls --timeout=120s

# flip the rust deployments to mTLS
kubectl -n noetl patch deploy noetl-server-rust --type strategic --patch-file ci/manifests/noetl/tls/server-rust-mtls-patch.yaml
kubectl -n noetl patch deploy noetl-worker-rust --type strategic --patch-file ci/manifests/noetl/tls/worker-rust-mtls-patch.yaml

Verify: server logs Server listening (TLS) tls=true mtls=true; worker logs TLS enabled mtls=true ca=true, then Worker registered. The full runbook (verify + revert) is in the manifest directory's README.md.
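The log check can be scripted. This is a rough sketch under assumptions: has_marker and check_mtls_logs are hypothetical helper names, the deployments live in the noetl namespace as in the commands above, and the markers are matched as plain substrings because the exact log line format may differ between versions.

```shell
#!/bin/sh
# has_marker TEXT: succeed if any line on stdin contains TEXT (fixed string).
has_marker() {
  grep -F -q -- "$1"
}

# check_mtls_logs [NAMESPACE]: look for the expected startup markers in the
# logs of both Rust deployments. Defaults to the noetl namespace.
check_mtls_logs() {
  ns=${1:-noetl}
  kubectl -n "$ns" logs deploy/noetl-server-rust | has_marker 'mtls=true' \
    || { echo "server: no mtls=true marker in logs" >&2; return 1; }
  kubectl -n "$ns" logs deploy/noetl-worker-rust | has_marker 'Worker registered' \
    || { echo "worker: no 'Worker registered' marker in logs" >&2; return 1; }
  echo "mTLS markers found"
}

# Usage:
#   check_mtls_logs          # or: check_mtls_logs my-namespace
```

If the pods have been running long enough for the startup lines to rotate out of the log, restart the deployment before checking.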

The two probe/health caveats

A listener that requires client certificates rejects the certless handshake that Kubernetes liveness/readiness probes and curl-based init containers perform by default. This setup handles two consequences:

  1. Server probes → tcpSocket. An httpGet/HTTPS probe can't present a client cert, so its handshake fails and the pod never becomes Ready. A TCP port-open probe is the pragmatic fix; a separate non-mTLS health port is the production-grade alternative.
  2. Worker wait-for-api init → mTLS curl. The init container's plain-HTTP curl can't complete against an mTLS server, so the pod hangs in Init. It's rewritten to curl https://…/api/health with the mounted client cert.
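To illustrate the second caveat, here is a minimal sketch of an mTLS-aware wait loop. It is not the script shipped in worker-rust-mtls-patch.yaml. It assumes the worker's NOETL_TLS_CA, NOETL_TLS_CLIENT_CERT and NOETL_TLS_CLIENT_KEY variables hold paths to PEM files and that NOETL_SERVER_URL is the https base URL without a trailing slash; the CURL override and the wait_for_api name are hypothetical and exist only to make the loop easy to exercise.

```shell
#!/bin/sh
# wait_for_api [MAX_ATTEMPTS] [DELAY_SECONDS]
# Poll the health endpoint over mTLS until it answers or attempts run out.
wait_for_api() {
  max=${1:-60}
  delay=${2:-2}
  i=0
  while [ "$i" -lt "$max" ]; do
    # --cacert verifies the server; --cert/--key present the client identity.
    # Without --cert/--key an mTLS listener rejects the handshake.
    if ${CURL:-curl} --silent --fail --output /dev/null \
         --cacert "$NOETL_TLS_CA" \
         --cert "$NOETL_TLS_CLIENT_CERT" \
         --key "$NOETL_TLS_CLIENT_KEY" \
         "${NOETL_SERVER_URL}/api/health"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage inside an init container:
#   wait_for_api 60 2 || exit 1
```

Bounding the attempts makes a misconfigured certificate show up as a failed init container rather than a pod stuck in Init indefinitely.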

Helm chart (GKE)

Phase 4d: automation/helm/noetl/ now exposes the same mTLS shape as a values-gated opt-in (noetl/ops#165, Secrets Wallet Phase 4 GKE flip). It is disabled by default; when off, the chart renders byte-identical to pre-Phase-4 deployments.

# bootstrap CA (non-prod / smoke) — chart provisions a self-signed CA in-cluster
helm upgrade --install noetl ./automation/helm/noetl/ \
  --set workerPool.enabled=true \
  --set tls.enabled=true \
  --set tls.certManager.bootstrap.enabled=true

# production — point at a ClusterIssuer backed by GCP CAS or SPIRE/SPIFFE
helm upgrade --install noetl ./automation/helm/noetl/ \
  -f my-values.yaml \
  --set tls.enabled=true \
  --set tls.certManager.issuerRef.name=gcp-cas-issuer \
  --set tls.certManager.issuerRef.kind=ClusterIssuer

The chart's tls: values block (excerpt):

tls:
  enabled: false                  # master toggle — off renders no Cert/Issuer
  mtls:
    enabled: true                 # require + verify client certs (else server-only TLS)
  certManager:
    enabled: true                 # let chart issue Certificate resources
    issuerRef:
      name: noetl-ca-issuer
      kind: Issuer                # Issuer or ClusterIssuer
    bootstrap:
      enabled: false              # provision a self-signed CA Issuer (non-prod)
      caCertificateName: noetl-ca
      caSecretName: noetl-ca-tls
    server: { secretName: noetl-server-tls, dnsNames: [], duration: 8760h, renewBefore: 720h }
    worker: { secretName: noetl-worker-tls,                  duration: 8760h, renewBefore: 720h }

What the chart wires (under tls.enabled: true):

  • templates/tls-certificates.yaml → cert-manager Certificate resources for the server (serverAuth, SAN = service DNS) + the worker pool (clientAuth), optionally bootstrapping a self-signed CA Issuer for non-prod.
  • server-deployment.yaml → mounts noetl-server-tls, sets NOETL_TLS_CERT/KEY (+ CLIENT_CA when mtls.enabled: true), https NOETL_PUBLIC_SERVER_URL, and switches probes to tcpSocket under mTLS (httpGet HTTPS under server-only TLS).
  • worker-pool-deployment.yaml → mounts noetl-worker-tls, sets NOETL_TLS_CLIENT_CERT/KEY + NOETL_TLS_CA, overrides NOETL_SERVER_URL to https://, and rewrites the wait-for-api init container to curl the mTLS endpoint with the client cert.
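The wiring above can be spot-checked locally by rendering the chart and counting the cert-manager resources it emits. This is a sketch under assumptions: count_kind and render are hypothetical helpers, the chart path is the one used in the commands above, and the chart is assumed to render with default values (add -f with your own values file if it does not).

```shell
#!/bin/sh
# count_kind KIND: count top-level "kind: KIND" lines in a YAML stream on
# stdin. Indented occurrences (e.g. issuerRef.kind) are not counted.
count_kind() {
  grep -c -x "kind: $1" || true
}

# render [helm flags...]: render the chart without touching a cluster.
render() {
  helm template noetl ./automation/helm/noetl/ "$@"
}

# Usage:
#   render | count_kind Certificate      # tls off: expect 0
#   render --set workerPool.enabled=true --set tls.enabled=true \
#          --set tls.certManager.bootstrap.enabled=true | count_kind Certificate
```

With tls.enabled left at its default, the count should be 0; with TLS and the bootstrap CA enabled, it should be nonzero.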

The env contract is the Rust noetl-server / noetl-worker contract; under a Python image these env vars are no-ops. Helm-managed cutover to the Rust binaries is a separate concern.

Scope + GKE

The kind reference (ci/manifests/noetl/tls/, Phase 4c) and the Helm values-gated path (above, Phase 4d) cover the same shape. Production GKE should point tls.certManager.issuerRef at a ClusterIssuer backed by GCP Certificate Authority Service or a SPIRE/SPIFFE federation, not the self-signed bootstrap Issuer.
