feat(deploy): scale-to-zero — Scale() compute method + wake endpoint (#54)#261

Merged
mastermanas805 merged 10 commits into master from feat/deploy-scale-to-zero on Jun 5, 2026

@mastermanas805 mastermanas805 commented Jun 5, 2026

No description provided.

…(Task #54)

API half of scale-to-zero (idle descheduling). Flag-gated behind
DEPLOY_SCALE_TO_ZERO_ENABLED (default OFF) — fully inert when off.

- migration 068: deployments.last_activity_at / scaled_to_zero / always_on
  (+ partial idle-candidate index; backfill last_activity_at from updated_at).
- compute.Provider.Scale(appID, replicas): k8s patches Deployment replicas in
  place (NotFound = no-op so a stale row can't wedge the scaler; idempotent on
  already-at-target); noop logs + no-ops.
- POST /deploy/:id/wake: explicit fast wake — scales back to 1 + clears sleep
  state. 501 when flag off (no scale, no DB write — proven by flag-off test).
  Documented cold-start contract (api is not in the request path; transparent
  wake-on-request needs an activator, out of scope for v1).
- model helpers: MarkDeploymentScaledToZero (CAS: healthy + not-zeroed +
  not-always-on), WakeDeployment, SetDeploymentAlwaysOn; redeploy
  (MarkDeploymentBuilding) clears scaled_to_zero + bumps last_activity_at.
- deploymentToMap surfaces scaled_to_zero/always_on; OpenAPI documents /wake.
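
The Scale contract in the list above (NotFound is a no-op, already-at-target is idempotent) can be illustrated with a minimal in-memory sketch. The Provider shape follows the description, but its exact types are an assumption, and fakeCluster is a hypothetical stand-in for the k8s client, not the repository's implementation.

```go
package main

import "sync"

// Provider follows the Scale(appID, replicas) shape named in the description.
// The exact parameter and return types in the repository are an assumption.
type Provider interface {
	Scale(appID string, replicas int32) error
}

// fakeCluster is a hypothetical in-memory stand-in for the k8s API.
type fakeCluster struct {
	mu       sync.Mutex
	replicas map[string]int32 // appID -> current replica count
	writes   int              // number of updates actually issued
}

// Scale sets the replica count for appID in place.
func (c *fakeCluster) Scale(appID string, replicas int32) error {
	c.mu.Lock()
	defer c.mu.Unlock()

	current, ok := c.replicas[appID]
	if !ok {
		// NotFound is a no-op, so a stale row cannot wedge the scaler.
		return nil
	}
	if current == replicas {
		// Already at target: idempotent, no update is issued.
		return nil
	}
	c.replicas[appID] = replicas
	c.writes++
	return nil
}

// Compile-time check that fakeCluster satisfies Provider.
var _ Provider = (*fakeCluster)(nil)
```

Calling Scale("app-1", 0) twice on a cluster holding app-1 at one replica issues a single write, and scaling an unknown appID returns nil without touching state.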

Tests: k8s Scale (down/wake/idempotent/notfound/get+update errors), noop Scale,
wake flag-off 501-inert (panicking provider proves compute is never reached),
model CAS/wake/pin/redeploy-clears (DB-gated, run in CI).
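
The panicking-provider technique mentioned above can be sketched as follows. This is an illustration under assumptions, not the code in deploy_wake.go: wakeHandler, wakeStatus, and recordingProvider are hypothetical names, the path parsing is simplified, and the DB write and team check of the real handler are omitted.

```go
package main

import (
	"net/http"
	"net/http/httptest"
	"strings"
)

// Provider is the assumed shape of the compute interface.
type Provider interface {
	Scale(appID string, replicas int32) error
}

// panickingProvider fails loudly if compute is ever reached.
type panickingProvider struct{}

func (panickingProvider) Scale(string, int32) error {
	panic("compute reached while the flag is off")
}

// recordingProvider remembers which apps were scaled.
type recordingProvider struct{ calls []string }

func (p *recordingProvider) Scale(appID string, _ int32) error {
	p.calls = append(p.calls, appID)
	return nil
}

// wakeHandler is a hypothetical handler for POST /deploy/<id>/wake.
func wakeHandler(enabled bool, p Provider) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if !enabled {
			// Flag off: answer 501 before touching compute or storage.
			http.Error(w, "scale-to-zero is disabled", http.StatusNotImplemented)
			return
		}
		parts := strings.Split(strings.Trim(r.URL.Path, "/"), "/")
		if len(parts) != 3 {
			http.NotFound(w, r)
			return
		}
		if err := p.Scale(parts[1], 1); err != nil {
			http.Error(w, "scale failed", http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	}
}

// wakeStatus drives one request through the handler and returns the status.
func wakeStatus(enabled bool, p Provider) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/deploy/app-1/wake", nil)
	wakeHandler(enabled, p).ServeHTTP(rec, req)
	return rec.Code
}
```

With the flag off, wakeStatus(false, panickingProvider{}) returns 501 without panicking, which is what shows the provider was never called.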

Awaiting operator enable of DEPLOY_SCALE_TO_ZERO_ENABLED to verify real
scale-down in prod.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mastermanas805 mastermanas805 enabled auto-merge (squash) June 5, 2026 14:16
mastermanas805 and others added 8 commits June 5, 2026 19:49
sqlmock-driven coverage for the flag-ON Wake branches (happy path scale+flip+
re-read, not-found, cross-team 404, scale-failure 503, DB-flip 503) so the
100%-patch gate is satisfied on deploy_wake.go's handler body. The flag-off
501-inert path stays in deploy_wake_test.go.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
FindActiveDeploymentByTeamEnvName / ByAppID now scan 33 columns after mig 068
added last_activity_at, scaled_to_zero, always_on to deployments. The redeploy
in-place mock rows still provided 30, so scanDeployment failed with
"expected 30 destination arguments in Scan, not 33", breaking
TestDeployNew_Redeploy_WrongTeam_DefenceInDepth and
TestDeployNew_Redeploy_UpdateStatusError_StillAccepts.

Extend deploymentColumnsList + every AddRow tuple in this file (7 mock rows
across the ByTeamEnvName and ByAppID query paths) with the 3 new columns in
the model's Scan order: sql.NullTime{}, false, false.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The patch-coverage gate (100% of changed lines) flagged 6 uncovered lines added
by the scale-to-zero feature commits:

- config.go 517-518: the DEPLOY_SCALE_TO_ZERO_ENABLED=true branch — add
  TestLoad_DeployScaleToZeroEnabled (truthy + falsy table, mirrors the
  DeploySourceGitEnabled test) and register the key in allKeys().
- deploy_wake.go 57-58 / 67 / 101-105: the requireTeam-error arm, the generic
  GetDeploymentByAppID driver-error (503 fetch_failed) arm, and the post-write
  re-read-failure fallback (scale+DB already succeeded → 200, not 5xx). Add
  TestWake_RequireTeamFails, TestWake_FetchDriverError503,
  TestWake_ReReadFailureFallsBack + a no-auth app helper.
- deployment.go 591-592 / 610-611 / 626-627: the fmt.Errorf error returns of
  MarkDeploymentScaledToZero / WakeDeployment / SetDeploymentAlwaysOn. Add
  sqlmock-driven *_DriverError tests (the happy + CAS/RowsAffected paths are
  already covered by the real-DB tests).

Test-only; no production code change. Flag remains default-OFF.
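
A truthy + falsy table of the kind described for TestLoad_DeployScaleToZeroEnabled might look like the sketch below. The parsing rule is an assumption (strconv.ParseBool, with anything unparseable treated as off); the real config.go may accept a different set of values, and scaleToZeroEnabled, flagCases, and failingFlagCases are hypothetical names.

```go
package main

import "strconv"

// scaleToZeroEnabled is a hypothetical parser for the raw value of
// DEPLOY_SCALE_TO_ZERO_ENABLED. Unset or unparseable input means OFF.
func scaleToZeroEnabled(raw string) bool {
	enabled, err := strconv.ParseBool(raw)
	return err == nil && enabled
}

// flagCases is the truthy + falsy table.
var flagCases = []struct {
	raw  string
	want bool
}{
	{"true", true},
	{"1", true},
	{"false", false},
	{"0", false},
	{"", false},    // unset: default OFF
	{"yes", false}, // not accepted by strconv.ParseBool
}

// failingFlagCases returns the raw inputs whose parsed value differs
// from the expectation; an empty result means the table passes.
func failingFlagCases() []string {
	var failed []string
	for _, tc := range flagCases {
		if scaleToZeroEnabled(tc.raw) != tc.want {
			failed = append(failed, tc.raw)
		}
	}
	return failed
}
```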

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…rkBuilding SQL

Two more mock sites still on the pre-068 shape, surfaced by the model/handler
*_Branches + redeploy-CASMiss tests:

- coverage_provision_gate_test.go: deploymentMockCols() + deploymentMockRow()
  (the shared full-deployment mock used by every models *_Branches test) now
  include last_activity_at / scaled_to_zero / always_on (33 cols, matching
  scanDeployment) — fixes the "expected 30 destination arguments, not 33" Scan
  errors in TestGetDeploymentByAppID/ByID/ByTeam/...Branches.
- MarkDeploymentBuilding's SQL now also sets scaled_to_zero=false +
  last_activity_at=now() on redeploy (a redeploy is activity + brings replicas
  back to 1). Update the ExpectExec regexes that pinned the old SQL: 2 in
  coverage_deployment_test.go, 5 in deploy_redeploy_inplace_mock_test.go.

Completes the rule-16 enumeration of every deployments-row mock site rippled by
migration 068. Test-only.
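
The state transitions these SQL changes encode can be modelled in memory as a rough sketch. The guard conditions (healthy, not already zeroed, not always-on) and the redeploy reset come from the commit messages. The struct, the function names, and the "building" status value are assumptions, and the real helpers run as SQL updates rather than Go field writes.

```go
package main

import "time"

// deployment holds only the columns relevant to scale-to-zero.
type deployment struct {
	Status         string
	ScaledToZero   bool
	AlwaysOn       bool
	LastActivityAt time.Time
}

// markScaledToZero mimics the compare-and-set guard: it flips the row only
// when the deployment is healthy, not already zeroed, and not pinned
// always-on. It reports whether the row changed (RowsAffected == 1 in SQL).
func markScaledToZero(d *deployment) bool {
	if d.Status != "healthy" || d.ScaledToZero || d.AlwaysOn {
		return false
	}
	d.ScaledToZero = true
	return true
}

// markBuilding mimics the redeploy path: a redeploy counts as activity and
// brings replicas back, so it clears scaled_to_zero and bumps
// last_activity_at. The "building" status value is an assumption.
func markBuilding(d *deployment, now time.Time) {
	d.Status = "building"
	d.ScaledToZero = false
	d.LastActivityAt = now
}
```

A second markScaledToZero call on the same row returns false, which is the CAS-miss case the mock tests pin.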

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The scale-to-zero wake route is live in the router but had no routeTestMap row,
so TestDoneBar_EveryRouteCovered failed ("route POST /deploy/:id/wake has no
mapped test and no exemption"). Map it to TestWake_HappyPath — the flag-ON
handler suite (deploy_wake_mock_test.go) drives the route through requireTeam +
the scale + DB-flip + re-read contract, which the guard's ../handlers AST scan
recognises as a covering integration test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…d codes

TestErrorCode_HasAgentAction caught the two new wake-path error codes
(deploy_wake.go) lacking codeToAgentAction entries.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
# Conflicts:
#	internal/config/config.go
#	internal/config/config_test.go
@mastermanas805 mastermanas805 merged commit 197bd02 into master Jun 5, 2026
19 checks passed