
manifests keda

Kadyapam edited this page May 24, 2026 · 2 revisions

KEDA Scaler

KEDA-driven autoscaling for NoETL worker pools, keyed off NATS JetStream consumer lag. Operational guide for the static sample manifest committed at ci/manifests/keda/.

Profile note. Two KEDA artifacts exist with different NATS values:

  • This page covers ci/manifests/keda/scaledobject-worker-cpu-01.yaml — the kind-cluster artifact (account: NOETL, nats.nats.svc.cluster.local:8222). Applied via the automation/development/noetl.yaml playbook. Pinned by a drift-guard test against the noetl.core.runtime.keda generator.
  • The GKE-cluster artifact is rendered by the Helm chart at automation/helm/noetl/templates/worker-keda-scaledobject.yaml with account: $G and nats-headless.nats.svc.cluster.local:8222. See GKE Helm install for that path.

The two profiles are separate by design — kind runs a single-account NATS, GKE runs the Helm NATS chart's default-jetstream account.

For the underlying Python generator (ScaledObjectSpec / build_worker_scaledobject / dump_scaledobject_yaml in noetl/core/runtime/keda.py), see the KEDA Scaler page on the noetl/noetl wiki.

What's in ci/manifests/keda/

File | Purpose
scaledobject-worker-cpu-01.yaml | ScaledObject for the existing single-pool noetl-worker Deployment. Generated verbatim by the snippet documented in the file's header comment.
README.md | Quick-start: install KEDA + apply this manifest + verify.

Architecture

Each worker pool consumes a NATS JetStream subject. When pending messages grow faster than workers can drain them, KEDA scales the target Deployment up. When the backlog clears, KEDA scales back down to minReplicaCount.

KEDA reads consumer lag from the NATS monitoring API at nats.nats.svc.cluster.local:8222/jsz and exposes it as an external metric that drives a regular HorizontalPodAutoscaler underneath the ScaledObject wrapper.
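As a rough illustration of how lag turns into a replica count, the sketch below assumes the HPA treats lagThreshold as a per-replica average target, so the steady-state count is the lag divided by the threshold, rounded up and clamped to the configured bounds. The function name and defaults are illustrative (the defaults mirror the sample manifest on this page); the real controller also applies a tolerance band and stabilization windows, which this sketch ignores.

```python
import math


def desired_replicas(lag: int, lag_threshold: int = 10,
                     min_replicas: int = 1, max_replicas: int = 20) -> int:
    """Approximate steady-state replica count for a given consumer lag.

    Assumption: the metric is an average-per-replica target, so the
    unclamped answer is ceil(lag / lag_threshold).
    """
    if lag_threshold <= 0:
        raise ValueError("lag_threshold must be positive")
    wanted = math.ceil(lag / lag_threshold)
    # Clamp to the ScaledObject bounds (minReplicaCount / maxReplicaCount).
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    for lag in (0, 35, 200, 1000):
        print(f"lag={lag:>4} -> replicas={desired_replicas(lag)}")
```

With the sample values, a backlog of 35 messages asks for 4 replicas, and anything from 191 messages upward is capped at 20.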

Sample manifest

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: noetl-worker-scaler-worker-cpu-01
  namespace: noetl
  labels:
    app: noetl-worker
    worker-pool: worker-cpu-01
    managed-by: noetl
spec:
  scaleTargetRef:
    name: noetl-worker
  minReplicaCount: 1
  maxReplicaCount: 20
  pollingInterval: 10
  cooldownPeriod: 30
  triggers:
  - type: nats-jetstream
    metadata:
      natsServerMonitoringEndpoint: nats.nats.svc.cluster.local:8222
      account: NOETL
      stream: NOETL_COMMANDS
      consumer: noetl_worker_pool
      lagThreshold: '10'
      activationLagThreshold: '1'
      useHttps: 'false'

Critical: account: NOETL matches the account where the noetl user lives (see ci/manifests/nats/nats.yaml on the noetl/noetl repo). KEDA's nats-jetstream scaler filters the monitoring API by account; pointing at the wrong account (e.g. the global $G) returns num_pending: 0 silently and breaks scaling. The Python generator default is locked at NOETL and pinned by a unit-test assertion.
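To make the account filtering concrete, here is a minimal sketch of an account-scoped lookup over a /jsz-style payload. The JSON shape (account_details, stream_detail, consumer_detail, num_pending) is an assumption about the monitoring response, trimmed to the fields that matter here, and find_num_pending is a hypothetical helper, not KEDA's or NoETL's code.

```python
from typing import Optional


def find_num_pending(jsz: dict, account: str, stream: str,
                     consumer: str) -> Optional[int]:
    """Return num_pending for one consumer, or None if it is not found
    under the given account (assumed /jsz payload shape)."""
    for acct in jsz.get("account_details") or []:
        if acct.get("name") != account:
            continue
        for st in acct.get("stream_detail") or []:
            if st.get("name") != stream:
                continue
            for cons in st.get("consumer_detail") or []:
                if cons.get("name") == consumer:
                    return cons.get("num_pending", 0)
    return None


# Illustrative payload: the stream lives under the NOETL account only.
SAMPLE_JSZ = {
    "account_details": [
        {"name": "$G", "stream_detail": []},
        {
            "name": "NOETL",
            "stream_detail": [
                {
                    "name": "NOETL_COMMANDS",
                    "consumer_detail": [
                        {"name": "noetl_worker_pool", "num_pending": 200}
                    ],
                }
            ],
        },
    ]
}

if __name__ == "__main__":
    for account in ("NOETL", "$G"):
        lag = find_num_pending(SAMPLE_JSZ, account,
                               "NOETL_COMMANDS", "noetl_worker_pool")
        print(account, lag)
```

Looking up the consumer under $G finds nothing in this sample. A scaler that reports "not found" as zero lag would never scale up, which is consistent with the silent failure described above.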

Install + verify

KEDA install is a manual, one-off cluster setup step. It is deliberately not bundled into noetl k8s deploy so the diff stays small and reviewable.

One-time KEDA install

helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda \
  --namespace keda \
  --create-namespace \
  --version 2.15.0

kubectl rollout status deployment/keda-operator -n keda

Apply the worker-pool scaler

# From the noetl/ops repo root
kubectl apply -f ci/manifests/keda/scaledobject-worker-cpu-01.yaml

Verify

kubectl get scaledobject -n noetl noetl-worker-scaler-worker-cpu-01
kubectl get hpa -n noetl
kubectl describe scaledobject noetl-worker-scaler-worker-cpu-01 -n noetl

KEDA creates a regular HorizontalPodAutoscaler behind the scenes — that's the actual driver. The ScaledObject is the KEDA-flavored wrapper that lets you trigger off arbitrary external metrics (NATS lag in our case).

Drive a smoke test

To watch a real scale-up loop:

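# Pause the consumer so messages accumulate as lag instead of draining.
# The deadline below uses BSD/macOS date syntax (-v+1H); on GNU/Linux use
#   date -u -d '+1 hour' +'%Y-%m-%d %H:%M:%S'
nats --server nats://noetl:noetl@<nats>:4222 consumer pause -f \
  NOETL_COMMANDS noetl_worker_pool "$(date -u -v+1H +'%Y-%m-%d %H:%M:%S')"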

# Publish a burst: pick a count of at least lagThreshold * maxReplicaCount
# (10 * 20 = 200 with the sample manifest) to saturate scale-up
for i in $(seq 1 200); do
  nats --server nats://noetl:noetl@<nats>:4222 pub noetl.commands "smoke-$i"
done

# Watch the HPA TARGETS climb past 10/replica and replicas scale up
kubectl get hpa -n noetl -w

# Resume the consumer; workers drain, lag goes to 0
nats --server nats://noetl:noetl@<nats>:4222 consumer resume -f \
  NOETL_COMMANDS noetl_worker_pool

# After the HPA scale-down stabilization window (~5 min default),
# replicas scale back to minReplicaCount. cooldownPeriod only comes into
# play when scaling to zero (minReplicaCount: 0).

Validation in the local kind cluster showed: 200-message burst → HPA TARGETS jumped from 0/10 to 200/10 (avg) → replicas scaled 1 → 4 → 8 → 16 → 20 (capped) → drain → back to 1 in ~5 min total.
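The 1 → 4 → 8 → 16 → 20 staircase can be reproduced with a small simulation. This is a sketch under a stated assumption: that each HPA sync may raise the replica count to at most max(2 × current, 4) when no explicit scaling behavior is set on the HPA. That cap is our reading of the controller's default, not something confirmed by this repo, and scale_up_steps is a hypothetical helper.

```python
import math


def scale_up_steps(lag: int, lag_threshold: int = 10,
                   start: int = 1, max_replicas: int = 20) -> list:
    """Simulate successive HPA scale-up steps for a fixed backlog.

    Assumption: each step is capped at max(2 * current, 4) replicas.
    """
    # Replica count the metric asks for, clamped to the configured bounds.
    target = min(max_replicas, max(start, math.ceil(lag / lag_threshold)))
    steps = [start]
    current = start
    while current < target:
        cap = max(2 * current, 4)  # assumed per-step scale-up limit
        current = min(target, cap)
        steps.append(current)
    return steps


if __name__ == "__main__":
    print(scale_up_steps(200))  # [1, 4, 8, 16, 20]
```

Under that assumption the 200-message burst needs four scale-up steps to reach the cap, matching the sequence observed in the kind cluster.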

Tuning

Knob | Default | Rule of thumb
lagThreshold | 10 | Set roughly equal to one worker's steady-state in-flight count. Higher → fewer scale-ups; lower → more aggressive.
activationLagThreshold | 1 | Leave at 1 unless using minReplicaCount: 0, in which case set high enough to filter noise (e.g. 5).
pollingInterval | 10s | Lower → faster reaction, higher KEDA + NATS monitoring load.
cooldownPeriod | 30s | Only takes effect when scaling to zero (minReplicaCount: 0); scale-down between 1 and N replicas follows the HPA stabilization window. Higher → fewer scale-to-zero churns. Tune against work burstiness.
minReplicaCount | 1 | Set to 0 for true scale-to-zero; combine with a higher activationLagThreshold.
maxReplicaCount | 20 | Cap. Should respect cluster resource budget and downstream-dep limits (Postgres conns, NATS message rate).
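One way to sanity-check maxReplicaCount against a downstream limit is simple budget arithmetic. The sketch below is illustrative only: the connection budget, per-worker connection count, and reserved headroom are made-up numbers, and max_replicas_for_budget is a hypothetical helper rather than part of NoETL.

```python
def max_replicas_for_budget(conn_budget: int, conns_per_worker: int,
                            reserved: int = 0) -> int:
    """Largest replica count whose combined connections fit the budget.

    conn_budget:      total connections the dependency allows (hypothetical)
    conns_per_worker: connections each worker pod holds open (hypothetical)
    reserved:         headroom kept for other clients (hypothetical)
    """
    if conns_per_worker <= 0:
        raise ValueError("conns_per_worker must be positive")
    usable = max(0, conn_budget - reserved)
    return usable // conns_per_worker


if __name__ == "__main__":
    # Example: 100 Postgres connections, 20 reserved, 4 per worker.
    print(max_replicas_for_budget(100, 4, reserved=20))  # 20
```

If the result comes out below the configured maxReplicaCount, either lower the cap or shrink each worker's connection pool.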

Multi-cluster + per-tenant accounts

account: NOETL is the current default (matches the deployment in ci/manifests/nats/). A future out-of-phase round will derive per-tenant accounts from the URN tenant segment, paired with the NATS Supercluster topology so each tenant gets its own account on each cluster. The account-aware scaler extension is catalog-era work.

