feat: allow configuring the Deployment update strategy #328

Merged
ChrisJBurns merged 1 commit into backstage:main from briferz:feat/deployment-strategy
May 4, 2026

briferz commented Apr 23, 2026

Summary

Adds a new backstage.strategy value that maps to the Deployment spec.strategy field so operators can opt into Recreate (or RollingUpdate with maxSurge: 0) and guarantee that only one Backstage pod runs at a time.

Default behavior is unchanged: when backstage.strategy is unset (the default {}), no strategy block is rendered and Kubernetes keeps applying its own default (RollingUpdate with maxSurge: 25% / maxUnavailable: 25%).

Motivation

Backstage plugins (e.g. catalog, auth) run knex database migrations during plugin initialization on every pod startup. Under the current default RollingUpdate strategy, a new pod starts while the old pod is still running, and both race the knex_migrations_lock row. If either pod is killed mid-migration (OOM, node drain, probe failure during the rollout), the lock stays set (is_locked = 1) and every subsequent pod fails catalog startup with:

MigrationLocked: Migration table is already locked
If you are sure migrations are not running you can release the lock manually
by running 'knex migrate:unlock'

Recovery requires manual DB intervention, which is not great in production.

Allowing operators to pin the strategy to Recreate (or RollingUpdate with maxSurge: 0) removes the concurrent-migration race entirely for single-replica deployments (which is the common case for Backstage).

Usage

backstage:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 1

or

backstage:
  strategy:
    type: Recreate

Changes

  • values.yaml: add backstage.strategy (default {}) with helm-docs comment.
  • values.schema.tmpl.json: add JSON schema entry with examples; values.schema.json regenerated by the jsonschema-dereference pre-commit hook.
  • templates/backstage-deployment.yaml: conditionally render spec.strategy from .Values.backstage.strategy using common.tplvalues.render so templated values (e.g. global settings) are supported.
  • ci/strategy-values.yaml: new CI test case (RollingUpdate with maxSurge: 0).
  • Chart.yaml: bump chart version 2.6.3 → 2.7.0 (minor: new additive feature, backward-compatible).
  • README.md: regenerated by helm-docs.
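
For readers who want to picture the template change, here is a minimal sketch of how the conditional rendering could look. It is not copied from the PR diff: the indentation, the surrounding fields, and the exact placement inside templates/backstage-deployment.yaml are assumptions. It relies on two known behaviors: an empty map (`{}`) is falsy in a Go template `if`, and `common.tplvalues.render` takes a dict with `value` and `context` keys.

```yaml
# Sketch only: surrounding Deployment fields are omitted and the layout is assumed.
apiVersion: apps/v1
kind: Deployment
spec:
  {{- /* The default {} is falsy, so nothing is rendered when backstage.strategy is unset. */}}
  {{- if .Values.backstage.strategy }}
  strategy:
    {{- /* tplvalues.render runs the value through tpl, so templated strings are expanded. */}}
    {{- include "common.tplvalues.render" (dict "value" .Values.backstage.strategy "context" $) | nindent 4 }}
  {{- end }}
```

Guarding on the value itself, rather than always emitting `strategy:`, is what keeps the rendered manifest byte-identical for existing installs.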

Test plan

  • pre-commit run --all-files passes (helm-docs, jsonschema-dereference, codespell).
  • ct lint --charts charts/backstage passes against all ci/*-values.yaml files including the new strategy-values.yaml.
  • helm template with default values produces no strategy block in the rendered Deployment (preserves existing behavior).
  • helm template with ci/strategy-values.yaml correctly renders the strategy block with maxSurge: 0, maxUnavailable: 1.
  • Commit is GPG-signed and DCO signed-off per CONTRIBUTING.md.
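
As an illustration of the last two `helm template` checks, the fragment below shows roughly what the rendered Deployment would be expected to contain when ci/strategy-values.yaml is supplied. This is a sketch, not captured output: all other Deployment fields are omitted, and the key order is an assumption (Helm's YAML serialization typically sorts map keys alphabetically).

```yaml
# Expected shape of the rendered fragment (assumed, not captured output).
spec:
  strategy:
    rollingUpdate:
      maxSurge: 0        # never start a new pod in addition to the existing one
      maxUnavailable: 1  # allow the single replica to be taken down first
    type: RollingUpdate
```

With default values, the same render should contain no `strategy:` key under `spec` at all.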

Adds a new `backstage.strategy` value that maps to the Deployment
`spec.strategy` field, so operators can opt into `Recreate` (or
`RollingUpdate` with `maxSurge: 0`) to guarantee a single Backstage
pod runs at a time.

Motivation: Backstage plugins (catalog, auth, etc.) run knex database
migrations during plugin initialization. Under the default
RollingUpdate strategy (maxSurge 25%) a new pod starts while the old
one is still running, and both race the `knex_migrations_lock` row.
If a pod is killed mid-migration, the lock stays set (`is_locked=1`)
and every subsequent pod fails catalog startup with
`MigrationLocked: Migration table is already locked`, requiring manual
DB intervention. Letting operators pin the strategy to Recreate (or
maxSurge 0) prevents concurrent migration attempts entirely.

The field defaults to `{}` and is only rendered into the manifest when
set, preserving the current behavior (Kubernetes default RollingUpdate
25%/25%) for existing installs.

Signed-off-by: Luis Brito <luis.brito@digitalfemsa.com>
@briferz briferz requested a review from a team as a code owner April 23, 2026 04:11
@ChrisJBurns ChrisJBurns merged commit 38a8122 into backstage:main May 4, 2026
5 checks passed