feat(chart): opt-in Argo Rollouts (blue-green) + pre-deploy migration Job#3653
Open
nicacioliveira wants to merge 3 commits into
Open
feat(chart): opt-in Argo Rollouts (blue-green) + pre-deploy migration Job#3653nicacioliveira wants to merge 3 commits into
nicacioliveira wants to merge 3 commits into
Conversation
Contributor
There was a problem hiding this comment.
5 issues found across 6 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="deploy/helm/studio/templates/deployment.yaml">
<violation number="1" location="deploy/helm/studio/templates/deployment.yaml:1">
P1: Enabling Argo Rollouts removes the Deployment, but HPA still targets Deployment, leaving autoscaling with an invalid target.</violation>
</file>
Reply with feedback, questions, or to request a fix.
Re-trigger cubic
| @@ -1,3 +1,4 @@ | |||
| {{- if not (and .Values.argoRollouts .Values.argoRollouts.enabled) }} | |||
Contributor
There was a problem hiding this comment.
P1: Enabling Argo Rollouts removes the Deployment, but HPA still targets Deployment, leaving autoscaling with an invalid target.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At deploy/helm/studio/templates/deployment.yaml, line 1:
<comment>Enabling Argo Rollouts removes the Deployment, but HPA still targets Deployment, leaving autoscaling with an invalid target.</comment>
<file context>
@@ -1,3 +1,4 @@
+{{- if not (and .Values.argoRollouts .Values.argoRollouts.enabled) }}
apiVersion: apps/v1
kind: Deployment
</file context>
nicacioliveira
added a commit
that referenced
this pull request
Jun 8, 2026
Five issues caught in PR review #3653, fixed without changing the opt-in defaults or expanding scope outside the chart: 1. P1 — HPA targeted Deployment even when argoRollouts.enabled (so the Deployment didn't exist and autoscaling was silently broken). hpa.yaml now switches `scaleTargetRef` to `argoproj.io/v1alpha1 Rollout` when rollouts are on. 2. P1 — Migration Job (PreSync, sync-wave -1) ran before its envFrom ConfigMap/Secret on first install/sync. Dropped the helm pre-install/pre-upgrade hook (helm hooks run before non-hook resources, which caused the ordering bug on first install), kept only ArgoCD's PreSync + sync-wave -1. ConfigMap and Secret now get `argocd.argoproj.io/sync-wave: "-2"` annotations conditionally when migrationJob.enabled, so they materialize before the Job. For plain `helm install` (no ArgoCD), documented that `--wait` should be used so helm waits for both Job completion and Deployment readiness together. 3. P2 — Strategy selection silently fell back to blueGreen when both blueGreen.enabled and canary.enabled were false. Added an explicit `fail` in rollout.yaml requiring exactly one to be true (matching the existing both-on fail). 4. P2 — `terminationGracePeriodSeconds` used a truthiness check that silently dropped explicit `0` (immediate termination). Switched to `hasKey` so an explicit `0` is honored. 5. P2 — `--skip-migrations` was unconditionally appended to any custom `image.command` when migrationJob.enabled, which would break or be meaningless on non-deco entrypoints (e.g. debug containers running `sleep infinity`). Added `migrationJob.injectSkipMigrationsFlag` (default true) so the auto-append can be turned off when the user overrides `image.command`. Verification: - `helm lint deploy/helm/studio` clean - Default render unchanged (no Rollout / no preview Service / no Job / no --skip-migrations / pod command identical) - With `argoRollouts.enabled=true autoscaling.enabled=true`: HPA targets the Rollout (argoproj.io/v1alpha1) instead of the missing Deployment - With `migrationJob.enabled=true`: ConfigMap + Secret both carry `sync-wave: "-2"`; Job carries PreSync + `sync-wave: "-1"` - With `argoRollouts.enabled=true` + both strategies disabled: helm template fails with a clear message - With `migrationJob.injectSkipMigrationsFlag=false`: `--skip-migrations` is NOT appended to the pod command Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… Job
Adds two opt-in flags to the studio Helm chart so consumers can switch from
the default Deployment to an Argo Rollouts Rollout with blue-green strategy,
and move DB migrations out of pod startup into a dedicated pre-sync Job.
Both default to off — existing installs keep the exact same Deployment with
the on-startup migration path. No selector / label changes, no PVC churn.
## What's new
- `argoRollouts.enabled` — when true, render a `Rollout` (argoproj.io/v1alpha1)
instead of the `Deployment`. Pod template is shared via the new
`chart-deco-studio.podTemplate` helper so the two workload kinds describe an
identical pod surface. Supports `blueGreen` (default) and `canary` strategies.
- `migrationJob.enabled` — when true, render a Job that runs
`bun run --cwd=apps/mesh migrate` ONCE before pods start. Carries BOTH
`helm.sh/hook: pre-install,pre-upgrade` and
`argocd.argoproj.io/hook: PreSync` annotations so it sequences correctly
whether installed via `helm upgrade` directly or synced by ArgoCD. The
runtime pod command gets `--skip-migrations` appended (the studio CLI
already exposes this flag — see `apps/mesh/src/cli.ts`), eliminating the
race between N replicas migrating concurrently and giving a clear
pre-deploy gate: migration Job fails → release aborted.
## New / modified files
- `templates/_pod-template.tpl` (new) — shared pod template helper + the
`podCommand` helper that appends `--skip-migrations` when migrationJob is on.
- `templates/deployment.yaml` — now wraps in `{{- if not argoRollouts.enabled }}`
and references the helper; lifts the entire `spec.template` body out.
- `templates/rollout.yaml` (new) — gated on `argoRollouts.enabled`, mirrors
the Deployment via the same helper, picks blueGreen or canary based on
values. Mutual-exclusion `fail` for both-on configs.
- `templates/migration-job.yaml` (new) — gated on `migrationJob.enabled`.
Sync-wave -1, dual hooks, bounded backoffLimit/activeDeadlineSeconds/TTL.
- `templates/service-preview.yaml` (new) — rendered only for blue-green;
Argo Rollouts manages its selector to point at the preview ReplicaSet.
- `values.yaml` — adds `argoRollouts` and `migrationJob` blocks; both off
by default, defaults preserve current behavior.
## Why opt-in
The chart is open-source and not everyone has the argo-rollouts controller
installed. Defaulting to Deployment keeps zero requirements on the consumer's
cluster. Internal CD (deco-apps-cd) flips both flags on for the deco-studio /
deco-studio-stg releases — that's a separate change.
## Migration discipline note
Blue-green amplifies the schema/code overlap window. The Job moves migrations
to a single execution point BEFORE the new ReplicaSet probes, but it does NOT
make destructive DDL safe — the old (blue) ReplicaSet still serves traffic
during the overlap window with the migrated schema. Destructive changes
(DROP/RENAME/type changes) still require expand-contract discipline at the
migration code level. This is independent of the chart and is being handled
team-side as a code/review practice.
## Verification
- `helm lint deploy/helm/studio` passes
- `helm template deco-studio deploy/helm/studio` (default) renders identical
workload surface to before — Deployment with `bun run deco --no-local-mode`,
no Rollout, no preview Service, no migration Job
- `helm template ... --set argoRollouts.enabled=true --set migrationJob.enabled=true`
renders Rollout with blueGreen, preview Service, migration Job with the
PreSync hooks, and `--skip-migrations` appended to the pod command
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Five issues caught in PR review #3653, fixed without changing the opt-in defaults or expanding scope outside the chart: 1. P1 — HPA targeted Deployment even when argoRollouts.enabled (so the Deployment didn't exist and autoscaling was silently broken). hpa.yaml now switches `scaleTargetRef` to `argoproj.io/v1alpha1 Rollout` when rollouts are on. 2. P1 — Migration Job (PreSync, sync-wave -1) ran before its envFrom ConfigMap/Secret on first install/sync. Dropped the helm pre-install/pre-upgrade hook (helm hooks run before non-hook resources, which caused the ordering bug on first install), kept only ArgoCD's PreSync + sync-wave -1. ConfigMap and Secret now get `argocd.argoproj.io/sync-wave: "-2"` annotations conditionally when migrationJob.enabled, so they materialize before the Job. For plain `helm install` (no ArgoCD), documented that `--wait` should be used so helm waits for both Job completion and Deployment readiness together. 3. P2 — Strategy selection silently fell back to blueGreen when both blueGreen.enabled and canary.enabled were false. Added an explicit `fail` in rollout.yaml requiring exactly one to be true (matching the existing both-on fail). 4. P2 — `terminationGracePeriodSeconds` used a truthiness check that silently dropped explicit `0` (immediate termination). Switched to `hasKey` so an explicit `0` is honored. 5. P2 — `--skip-migrations` was unconditionally appended to any custom `image.command` when migrationJob.enabled, which would break or be meaningless on non-deco entrypoints (e.g. debug containers running `sleep infinity`). Added `migrationJob.injectSkipMigrationsFlag` (default true) so the auto-append can be turned off when the user overrides `image.command`. Verification: - `helm lint deploy/helm/studio` clean - Default render unchanged (no Rollout / no preview Service / no Job / no --skip-migrations / pod command identical) - With `argoRollouts.enabled=true autoscaling.enabled=true`: HPA targets the Rollout (argoproj.io/v1alpha1) instead of the missing Deployment - With `migrationJob.enabled=true`: ConfigMap + Secret both carry `sync-wave: "-2"`; Job carries PreSync + `sync-wave: "-1"` - With `argoRollouts.enabled=true` + both strategies disabled: helm template fails with a clear message - With `migrationJob.injectSkipMigrationsFlag=false`: `--skip-migrations` is NOT appended to the pod command Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…louts only This PR's stated scope was opt-in Argo Rollouts blue-green for the studio chart. The migration Job was scope creep on my part — the team is handling migration discipline (expand-contract) at the migration code level, not via chart-side orchestration. Removed: - templates/migration-job.yaml (deleted) - `--skip-migrations` auto-append helper in _pod-template.tpl - `migrationJob` block in values.yaml - sync-wave annotations on ConfigMap and Secret (reverted to origin/main) - Cross-references to migrationJob in argoRollouts and Rollout comments Kept (legitimate Argo Rollouts support): - templates/_pod-template.tpl (shared pod spec helper for Deployment + Rollout) - templates/rollout.yaml (blueGreen + canary, mutual-exclusion + neither-set fails) - templates/service-preview.yaml (preview Service for blueGreen) - templates/deployment.yaml (wrapped with `if not argoRollouts.enabled`) - templates/hpa.yaml (targets Rollout when rollouts on; otherwise Deployment) - templates/_pod-template.tpl: terminationGracePeriodSeconds uses hasKey so explicit `0` is honored - `argoRollouts` block in values.yaml This addresses cubic findings 1 (HPA target), 3 (strategy neither-set fail), 4 (terminationGracePeriodSeconds 0). Findings 2 (migration hook ordering) and 5 (--skip-migrations on custom commands) are no longer applicable since the migration Job is gone entirely. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2b3f3a0 to
b78e0e3
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Two opt-in flags in the studio Helm chart so consumers can switch from the default Deployment to an Argo Rollouts Rollout (blue-green or canary), and move DB migrations out of pod startup into a dedicated pre-sync Job.
Both default to off — existing installs keep the exact same Deployment with the on-startup migration path. No selector / label changes, no PVC churn.
What's new
`argoRollouts.enabled` — render a `Rollout` (argoproj.io/v1alpha1) instead of `Deployment`. Pod template is shared via the new `chart-deco-studio.podTemplate` helper so the two workload kinds describe an identical pod surface. Supports `blueGreen` (default) and `canary` strategies.
`migrationJob.enabled` — render a Job that runs `bun run --cwd=apps/mesh migrate` ONCE before pods start. Carries BOTH `helm.sh/hook: pre-install,pre-upgrade` and `argocd.argoproj.io/hook: PreSync` annotations so it sequences correctly whether installed via `helm upgrade` directly or synced by ArgoCD. The runtime pod command gets `--skip-migrations` appended (the studio CLI already exposes this flag — see `apps/mesh/src/cli.ts`), eliminating the race between N replicas migrating concurrently and giving a clear pre-deploy gate: migration Job fails → release aborted.
Files
Why opt-in
The chart is open-source and not every consumer has the argo-rollouts controller installed. Defaulting to Deployment keeps zero requirements on the consumer's cluster. Internal CD (deco-apps-cd) flips both flags on for the deco-studio / deco-studio-stg releases — that's a separate change.
Important: migration discipline note
Blue-green amplifies the schema/code overlap window. The Job moves migrations to a single execution point BEFORE the new ReplicaSet probes, but it does NOT make destructive DDL safe — the old (blue) ReplicaSet still serves traffic during the overlap window with the migrated schema. Destructive changes (DROP/RENAME/type changes) still require expand-contract discipline at the migration code level. This is independent of the chart and is being handled team-side as a code/review practice.
Test plan
Summary by cubic
Adds opt-in Argo Rollouts (blue-green/canary) to the
studioHelm chart via a shared pod template. Off by default; keeps the Deployment path unchanged and drops the previously proposed migration Job.New Features
argoRollouts.enabled: renders aRollout(argoproj.io/v1alpha1) instead of aDeployment; blue-green by default, canary supported._pod-template.tplso Deployment and Rollout use the same pod spec.Bug Fixes
Rolloutwhen rollouts are enabled; otherwise theDeployment.terminationGracePeriodSeconds: 0viahasKeycheck.Written for commit b78e0e3. Summary will update on new commits.