feat(ci): add recovery-mode to npmjs-release workflow #8650

Merged
lokesh-bitgo merged 1 commit into master from
zahinmohammad/WCN-308-recovery-mode
Apr 29, 2026

@zahin-mohammad
Contributor

Summary

  • Adds a recovery-mode boolean input to npmjs-release.yml that publishes only the versions missing from npm when a prior release left npm in a partial-publish state (e.g., npm 503 mid-batch).
  • New get-recovery-context preview job resolves rel/latest HEAD into a pinned SHA and writes the planned publish list to the run summary before the env-gated publish job is dispatched.
  • release-bitgojs checkout is pinned to that SHA, so the publish cannot drift from what the env reviewer approved.
  • Skips version bumping, master→rel/latest merge, and GPG signing in recovery mode; runs yarn lerna publish from-package --yes instead of the full release.
  • Verifies post-publish that every non-private package version on rel/latest is reachable on the registry; fails the run if any are still missing.
  • Adds top-level concurrency: { group: npmjs-release, cancel-in-progress: false } so a recovery and a normal release cannot race against rel/latest.
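
The pieces listed above could fit together roughly as follows. This is a minimal sketch, not the merged workflow: only the `recovery-mode` input, the job names `get-recovery-context` and `release-bitgojs`, the `concurrency` settings, and the `lerna publish from-package` command come from this PR. The environment name, action versions, output name, and install step are assumptions, the normal release path is omitted, and npm authentication is left out.

```yaml
# Hypothetical sketch of the recovery path only; names marked "assumed" are not from the PR.
name: npmjs-release
on:
  workflow_dispatch:
    inputs:
      recovery-mode:
        description: Publish only the versions missing from npm
        type: boolean
        default: false

# An in-progress run is never cancelled; a newer run in the same group waits for it.
concurrency:
  group: npmjs-release
  cancel-in-progress: false

jobs:
  get-recovery-context:
    if: ${{ inputs.recovery-mode }}
    runs-on: ubuntu-latest
    outputs:
      sha: ${{ steps.resolve.outputs.sha }}   # assumed output name
    steps:
      - uses: actions/checkout@v4
        with:
          ref: rel/latest
      # Resolve the branch head once, so later jobs use this exact commit.
      - id: resolve
        run: echo "sha=$(git rev-parse HEAD)" >> "$GITHUB_OUTPUT"

  release-bitgojs:
    needs: get-recovery-context
    runs-on: ubuntu-latest
    environment: npmjs-release   # assumed name of the reviewer-gated environment
    steps:
      # Check out the pinned SHA rather than the branch, so a later push to
      # rel/latest cannot change what gets published after approval.
      - uses: actions/checkout@v4
        with:
          ref: ${{ needs.get-recovery-context.outputs.sha }}
      - run: yarn install --frozen-lockfile   # assumed install step
      - run: yarn lerna publish from-package --yes
```

Passing the SHA as a job output means the preview and the publish read the same commit even if the environment approval takes a long time.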

Why

On Apr 28 a partial-publish failure (npm 503 mid-batch) left bitgo@50.34.0 and @bitgo/account-lib@27.21.0 unpublished, while their tags and the chore(root): publish modules commit had already been pushed to rel/latest. The full release flow could not be re-run because lerna's conventional bump would re-attempt versions that already had tags. Recovery mode instead runs lerna publish from-package, which is idempotent and publishes only what is missing.

Existing behavior is unchanged when recovery-mode == false.

Validated with parallel correctness and adversarial reviewers across five rounds: the race window between preview and publish was closed via SHA pinning, the silent no-op was closed via a context-job assertion, and post-publish verification was added.

Test plan

  • Trigger workflow with recovery-mode: true against rel/latest (currently HEAD 36fbed6c); verify it publishes bitgo@50.34.0 and @bitgo/account-lib@27.21.0 and skips the 93 already-published versions.
  • Verify the post-publish verifier reports all public versions present.
  • Verify Express docker job runs after recovery and pushes bitgo/express:15.27.0.
  • Verify recovery-mode: false still triggers the normal release path (compare against the most recent successful run for shape, not output).
  • Verify recovery-mode: true && dry-run: true produces a preview without publishing.
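
The post-publish check in the test plan could be approximated with a step like the one below. It is a sketch under assumptions, not the verifier added in this PR: the step name is hypothetical, and it assumes `jq` is available on the runner and that `lerna ls --json` (which omits private packages unless `--all` is passed) reflects the versions on rel/latest.

```yaml
steps:
  - name: Verify every public version is on the registry   # hypothetical step name
    run: |
      missing=0
      # Build name@version for each public package in the monorepo.
      for spec in $(yarn --silent lerna ls --json | jq -r '.[] | "\(.name)@\(.version)"'); do
        # npm view prints nothing when the registry does not have that version.
        if [ -z "$(npm view "$spec" version 2>/dev/null)" ]; then
          echo "missing: $spec" | tee -a "$GITHUB_STEP_SUMMARY"
          missing=1
        fi
      done
      # A non-zero exit fails the run if anything is still unpublished.
      exit "$missing"
```

The same loop, run before publishing and without the final `exit`, would give the kind of planned publish list that a preview job can write to the run summary.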

Ticket: WCN-308

@linear

linear Bot commented Apr 28, 2026

Adds a `recovery-mode` workflow_dispatch input that publishes only the
versions missing from npm (via `lerna publish from-package`) when a
prior release left npm in a partial-publish state. Pins the checkout
to a SHA captured by a pre-flight job so the publish cannot drift from
what the env reviewer approved, asserts the SHA up front, and verifies
post-publish that every public package version on rel/latest is now on
the registry.

Ticket: WCN-308
@zahin-mohammad zahin-mohammad force-pushed the zahinmohammad/WCN-308-recovery-mode branch from 278b2c0 to c0dbca6 Compare April 28, 2026 22:48
@zahin-mohammad zahin-mohammad marked this pull request as ready for review April 28, 2026 22:48
@zahin-mohammad zahin-mohammad requested review from a team as code owners April 28, 2026 22:48
@lokesh-bitgo lokesh-bitgo merged commit 911965e into master Apr 29, 2026
22 checks passed
zahin-mohammad added a commit that referenced this pull request Apr 29, 2026
#8650 introduced an `always()` chain on `release-bitgojs` so it can
run when one of its two context jobs is skipped (get-release-context
in recovery mode, get-recovery-context in normal mode). The Express
jobs (get-express-release-context, publish-express-to-docker-hub)
relied on GHA's implicit `success()` against `release-bitgojs`, which
propagates the skipped context-job status as not-success along the
chain. Express therefore skips in both modes even though
release-bitgojs itself succeeded.

Override with explicit `always() && needs.<>.result == 'success'` on
both Express jobs so the docker publish runs whenever release-bitgojs
actually succeeded, regardless of which context job ran.

Ticket: WCN-357
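
The override described in this follow-up commit might look like the fragment below. It is a sketch only: the job names come from the commit message, but the exact `needs` lists, the runner, and the placeholder steps are assumptions, and the commit message does not spell out which job each `needs.<>` refers to.

```yaml
jobs:
  get-express-release-context:
    needs: release-bitgojs
    # always() replaces the implicit success() check, which would skip this job
    # because a context job upstream of release-bitgojs was skipped.
    if: ${{ always() && needs.release-bitgojs.result == 'success' }}
    runs-on: ubuntu-latest
    steps:
      - run: echo "resolve the Express version here"   # placeholder step

  publish-express-to-docker-hub:
    needs: [release-bitgojs, get-express-release-context]
    # Same override; also require the Express context job itself to have succeeded (assumed).
    if: ${{ always() && needs.release-bitgojs.result == 'success' && needs.get-express-release-context.result == 'success' }}
    runs-on: ubuntu-latest
    steps:
      - run: echo "build and push the Express image here"   # placeholder step
```

Checking `needs.<job>.result` explicitly keeps the safety of the default behaviour (no docker publish after a failed release) while ignoring skipped jobs further up the chain.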