feat(jobs): deploy_idle_scaler - scale-to-zero idle descheduler (#54) #94
Merged
…54)

Worker half of scale-to-zero. New periodic job (every 2 min) that patches idle, healthy, not-pinned deployments to replicas=0 (~$0 compute), reversible via the api wake endpoint. Sibling to deployment_expirer (idle ≠ expired).

Flag-gated behind DEPLOY_SCALE_TO_ZERO_ENABLED (default OFF): Work() short-circuits at DEBUG when off, with no k8s patch and no DB write (proven by TestDeployIdleScaler_FlagOffNoOp: zero SQL issued, zero scale calls). Fail-open when k8s is unreachable (nil client → WARN per tick, other jobs unaffected).

Idle SIGNAL (stated honestly): deployments.last_activity_at (api migration 068), stamped at create, bumped on deploy/redeploy/wake. v1 idle = "no deploy/redeploy/wake for N min" (default 30, floored at 5), NOT per-HTTP traffic, because the api/worker are not in the request path and no nginx-ingress request scrape is wired yet. Follow-up noted in the job header to lift this to traffic-based idle.

CAS double-guard: candidate SELECT and the scaled_to_zero UPDATE share the healthy + not-zeroed + not-always-on predicate, so a row that raced into a woken/pinned/redeployed/expired state between SELECT and UPDATE is skipped (0 rows), never wrongly slept. NotFound Deployment = skip (torn down), not fail.

Metric (rule 25): instant_deploy_scaled_to_zero_total{outcome} (scaled_down | woke_up | wake_failed | scale_failed) + instant_deploy_idle_apps gauge, primed in metrics_test.go. Alert+tile+catalog ship in the infra PR.

Tests: flag-off no-op, nil-k8s no-op, scale-down happy path (+counter+gauge), CAS-race skip, NotFound skip, scale-error → scale_failed, idle-minutes floor, provider_id→namespace derivation, real k8sDeployScaleClient vs fake clientset. make gate GREEN.

Awaiting operator enable of DEPLOY_SCALE_TO_ZERO_ENABLED to verify real scale-down in prod.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
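The CAS double-guard described in this commit message can be illustrated with a small in-memory model. This is only a sketch of the idea, not the job's code: the `deployment` struct, `eligible`, `selectCandidates`, and `markScaledToZero` are hypothetical names, and a slice stands in for the deployments table. The point it shows is that the write re-checks the same predicate as the read, so a row that changed in between yields zero affected rows.

```go
package main

import "fmt"

// deployment is a hypothetical in-memory stand-in for one deployments row.
type deployment struct {
	ID           string
	Healthy      bool
	ScaledToZero bool
	AlwaysOn     bool
}

// eligible is the guard shared by the read and the write:
// healthy, not already zeroed, not always-on.
func eligible(d deployment) bool {
	return d.Healthy && !d.ScaledToZero && !d.AlwaysOn
}

// selectCandidates mimics the candidate SELECT: it returns the IDs of all
// rows that currently satisfy the guard.
func selectCandidates(rows []*deployment) []string {
	var ids []string
	for _, d := range rows {
		if eligible(*d) {
			ids = append(ids, d.ID)
		}
	}
	return ids
}

// markScaledToZero mimics the guarded UPDATE: it re-checks the guard at
// write time and returns the number of rows affected (0 or 1).
func markScaledToZero(rows []*deployment, id string) int {
	for _, d := range rows {
		if d.ID == id && eligible(*d) {
			d.ScaledToZero = true
			return 1
		}
	}
	return 0
}

func main() {
	rows := []*deployment{
		{ID: "a", Healthy: true},
		{ID: "b", Healthy: true},
	}
	ids := selectCandidates(rows)

	// Simulate a race: "b" is pinned always-on after the SELECT ran.
	rows[1].AlwaysOn = true

	for _, id := range ids {
		fmt.Println(id, "rows affected:", markScaledToZero(rows, id))
	}
}
```

In a SQL-backed version the same effect would presumably come from repeating the predicate in the UPDATE's WHERE clause and treating zero affected rows as a skip rather than an error.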
Closes the 100%-patch gaps on deploy_idle_scaler.go: Kind(), the no-config cluster-constructor error path (CI-only, gated like the status client), list-query error, scan error, foreign provider_id skip, db-flip error, gauge-sample error. Work() and the SQL helpers are now at 100%.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…e + AddWorker wiring

Closes the 6 uncovered changed lines flagged by the patch-coverage gate:

- config: DEPLOY_SCALE_TO_ZERO_IDLE_MINUTES parse branch (valid override + sub-5/non-numeric floor-to-30) now exercised directly.
- deploy_idle_scaler: introduce newDeployScaleClientset package-var seam so NewK8sDeployScaleClientFromCluster's success return is testable without a reachable cluster; add a success test alongside the NoConfig error test.
- workers: extract the idle-scaler k8s-client wiring (the AddWorker else-branch) into a unit-testable buildIdleScaleK8s helper; cover both the success and the fail-open (nil) branches via the seam.

Flag remains default-OFF; behaviour unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
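For readers unfamiliar with the parse branch mentioned above, here is a minimal sketch of what "valid override + sub-5/non-numeric floor-to-30" could look like. It assumes that any value below 5, or anything that does not parse as an integer, falls back to the default of 30; the function and constant names (`parseIdleMinutes`, `defaultIdleMinutes`, `minIdleMinutes`) are hypothetical and not taken from the repository.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

const (
	defaultIdleMinutes = 30 // default idle threshold stated in the PR
	minIdleMinutes     = 5  // values below this are rejected
)

// parseIdleMinutes returns the configured idle threshold in minutes.
// Assumption: empty, non-numeric, and below-minimum inputs all fall back
// to the default instead of being clamped to the minimum.
func parseIdleMinutes(raw string) int {
	n, err := strconv.Atoi(raw)
	if err != nil || n < minIdleMinutes {
		return defaultIdleMinutes
	}
	return n
}

func main() {
	raw := os.Getenv("DEPLOY_SCALE_TO_ZERO_IDLE_MINUTES")
	fmt.Printf("idle threshold: %d min\n", parseIdleMinutes(raw))
}
```

Falling back to the default rather than clamping to 5 means a typo in the environment variable cannot silently produce the most aggressive setting.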
Worker half of scale-to-zero / idle descheduling (Task #54). New periodic job (every 2 min) that patches idle, healthy, not-pinned deployments to replicas=0 (~$0 compute), reversible via the api wake endpoint. Sibling to deployment_expirer (idle ≠ expired).

Safety

- Flag-gated behind DEPLOY_SCALE_TO_ZERO_ENABLED (default OFF): Work() short-circuits at DEBUG when off, with no k8s patch and no DB write (TestDeployIdleScaler_FlagOffNoOp asserts zero SQL + zero scale calls).
- Fail-open when k8s is unreachable (nil client → WARN per tick, other jobs unaffected).
- CAS double-guard: the candidate SELECT and the scaled_to_zero UPDATE share the healthy + not-zeroed + not-always-on predicate, so a row that raced into a woken/pinned/redeployed/expired state is skipped, never wrongly slept. NotFound Deployment = skip (torn down), not fail.

The idle SIGNAL (stated honestly)

deployments.last_activity_at (api migration 068): stamped at create, bumped on deploy/redeploy/wake. v1 idle = "no deploy/redeploy/wake for N min" (default 30, floored at 5), NOT per-HTTP traffic, because the api/worker are not in the request path and no nginx-ingress request scrape is wired yet. Follow-up to lift to traffic-based idle is noted in the job header.

Metric (rule 25)

instant_deploy_scaled_to_zero_total{outcome} (scaled_down | woke_up | wake_failed | scale_failed) + instant_deploy_idle_apps gauge, primed in metrics_test.go. Alert+tile+catalog ship in the infra PR.

Tests

flag-off no-op, nil-k8s no-op, scale-down happy path (+counter+gauge), CAS-race skip, NotFound skip, scale-error → scale_failed, idle-minutes floor, provider_id→namespace derivation, real k8sDeployScaleClient vs fake clientset. make gate GREEN.

Companion PRs

- feat/scale-to-zero-observability

Awaiting operator enable of DEPLOY_SCALE_TO_ZERO_ENABLED to verify real scale-down in prod.

🤖 Generated with Claude Code
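Since the flag is still awaiting an operator enable, a rough sketch of the gate may help reviewers picture what happens before and after it is flipped. The following is an assumption-laden model, not the job's implementation: `idleScaler`, `scaleClient`, `target`, `fakeClient`, and both error values are hypothetical, and the returned map is a local tally in which only `scaled_down` and `scale_failed` correspond to metric outcomes named in this PR (`skipped` does not).

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical errors used only by this sketch.
var (
	errNotFound    = errors.New("deployment not found")
	errUnavailable = errors.New("api server unavailable")
)

// scaleClient is a hypothetical minimal interface over the k8s scale call.
type scaleClient interface {
	ScaleToZero(namespace, name string) error
}

type target struct{ Namespace, Name string }

type idleScaler struct {
	enabled bool                    // mirrors DEPLOY_SCALE_TO_ZERO_ENABLED
	k8s     scaleClient             // nil when no cluster client could be built
	logf    func(level, msg string) // injected so tests can capture log levels
}

// Work runs one tick and returns a tally of what happened per target.
func (s *idleScaler) Work(targets []target) map[string]int {
	counts := map[string]int{}
	if !s.enabled {
		// Flag off: return before touching k8s or the database.
		s.logf("DEBUG", "scale-to-zero disabled; nothing to do")
		return counts
	}
	if s.k8s == nil {
		// Fail-open: warn and give up this tick without erroring.
		s.logf("WARN", "no k8s client; skipping this tick")
		return counts
	}
	for _, t := range targets {
		err := s.k8s.ScaleToZero(t.Namespace, t.Name)
		switch {
		case err == nil:
			counts["scaled_down"]++
		case errors.Is(err, errNotFound):
			counts["skipped"]++ // already torn down: skip, not a failure
		default:
			counts["scale_failed"]++
		}
	}
	return counts
}

// fakeClient returns a canned error per deployment name and counts calls.
type fakeClient struct {
	errs  map[string]error
	calls int
}

func (f *fakeClient) ScaleToZero(namespace, name string) error {
	f.calls++
	return f.errs[name]
}

func main() {
	fake := &fakeClient{errs: map[string]error{"gone": errNotFound, "flaky": errUnavailable}}
	s := &idleScaler{
		enabled: true,
		k8s:     fake,
		logf:    func(level, msg string) { fmt.Println(level, msg) },
	}
	fmt.Println(s.Work([]target{{"ns-a", "web"}, {"ns-b", "gone"}, {"ns-c", "flaky"}}))
}
```

Checking the flag before the nil-client check keeps a disabled deployment quiet: with the flag off, an unreachable cluster produces only a DEBUG line rather than a WARN every tick.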