
feat(migrate): --up-if-clean on up + atlas Executor recover wrapper #14

Merged
intel352 merged 2 commits into main from feat/up-if-clean-and-atlas-recover
May 2, 2026


Conversation

@intel352
Contributor

@intel352 intel352 commented May 2, 2026

Summary

Two upstream fixes for staging-deploy blockers in GoCodeAlone/core-dump:

  1. core-dump#150: the --up-if-clean flag was only on the repair-dirty subcommand in v0.3.6. core-dump's Dockerfile.migrate CMD passes it to up, which cobra rejects. Add the flag to up as an idempotency hint: the exit-0 behavior when no migrations are pending is unchanged, but the flag is now accepted.
  2. workflow#513: atlas Executor.Execute panics with index out of range [0] with length 0 against core-dump's migrations corpus. A defensive recover() wrapper converts the panic into a phase-labeled error.
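
To illustrate the first fix, here is a minimal sketch of why an unregistered flag fails parsing and why registering it as a no-op hint is enough. It uses Go's standard flag package instead of cobra so that it runs without dependencies; the actual change registers the flag on the cobra up command in pkg/cli/root.go. The names newUpFlags and parseUp are hypothetical.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// newUpFlags builds a flag set for a stand-in "up" subcommand.
// withUpIfClean toggles whether --up-if-clean is registered, mimicking
// the state before (v0.3.6) and after this PR. Hypothetical helper.
func newUpFlags(withUpIfClean bool) (*flag.FlagSet, *bool) {
	fs := flag.NewFlagSet("up", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage text out of the demo output
	upIfClean := new(bool)
	if withUpIfClean {
		fs.BoolVar(upIfClean, "up-if-clean", false,
			"idempotency hint; behavior matches plain up")
	}
	return fs, upIfClean
}

// parseUp reports whether the argument list is accepted by the parser.
func parseUp(withUpIfClean bool, args []string) error {
	fs, _ := newUpFlags(withUpIfClean)
	return fs.Parse(args)
}

func main() {
	args := []string{"--up-if-clean"}
	// Flag not registered: parsing returns a non-nil error.
	fmt.Println("before:", parseUp(false, args))
	// Flag registered: parsing succeeds and returns a nil error.
	fmt.Println("after:", parseUp(true, args))
}
```

The point of the sketch is that acceptance is purely a parsing concern: the flag can be registered without changing what up does afterwards.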

Changes

  • pkg/cli/root.go: register --up-if-clean on up subcommand; thread through; introduce buildDriverAndRequestForTest test seam.
  • internal/atlas/driver.go: introduce atlasExecutor interface + openForTest + newAtlasExecutorForTest test seams; add runWithRecover() helper; wrap ExecuteN/Pending calls in Up/Status to convert panics into phase-labeled errors.
  • pkg/cli/root_test.go: flag-acceptance test + idempotency-on-clean-DB test + applies-when-pending test.
  • internal/atlas/driver_recover_test.go: 3 runWithRecover unit tests + TestUp_RecoversAtlasExecutorPanic using panicking fake executor (no real DB needed).
  • CHANGELOG.md: Unreleased entry for both fixes.

Test plan

  • go test -short ./... — all packages green (db-requiring tests skipped in short mode, covered by existing CI postgres workflow)
  • TDD regression invariant proven: reverting runWithRecover in Up() causes TestUp_RecoversAtlasExecutorPanic to fail with the original panic: runtime error: index out of range [0] with length 0
  • CI green
  • After v0.3.7 is released and core-dump's Dockerfile.migrate pin is bumped: pre_deploy migrate runs against staging without process death; if atlas still panics on the core-dump corpus, the wrapper catches the panic and surfaces the phase name.

Regression invariant (TDD proof)

With fix reverted (runWithRecover replaced by direct ex.ExecuteN call):

FAIL: TestUp_RecoversAtlasExecutorPanic
panic: runtime error: index out of range [0] with length 0 [recovered, repanicked]

With fix restored:

PASS: TestUp_RecoversAtlasExecutorPanic (0.00s)

🤖 Generated with Claude Code

Two upstream fixes for staging-deploy blockers:

1. core-dump#150 — --up-if-clean flag was only on repair-dirty, not on
   up. core-dump's Dockerfile.migrate CMD invokes
   workflow-migrate up ... --up-if-clean which cobra rejects in v0.3.6.
   Add the flag to up as an idempotency hint; effective behavior is
   the same as plain up (already exits 0 when no migrations pending),
   but the flag is now accepted so deploy CMDs that pass it succeed.

2. workflow#513 — atlas Executor.Execute panics with
   runtime error: index out of range [0] with length 0 against the
   core-dump migrations corpus. Defensive recover wrapper around all
   atlas Executor calls converts the panic into a typed error containing
   the phase name (atlas-execute panic / atlas-pending panic) so
   callers can identify which atlas operation panicked. Root-cause
   investigation deferred; this is the must-have defensive fix.

Driver-level seam (atlasExecutor interface + newAtlasExecutorForTest +
openForTest package vars) lets tests inject a panicking fake to validate
the recover wrapper without a real database. Same pattern used by
buildDriverAndRequestForTest on the CLI side.

TDD: each new test was written first and verified to fail before
implementation. Regression invariant: reverting runWithRecover in Up()
causes TestUp_RecoversAtlasExecutorPanic to fail with the original panic.

See workflow/docs/plans/2026-05-02-staging-deploy-blockers-design.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 2, 2026 13:12

Copilot AI left a comment


Pull request overview

This PR addresses two staging-deploy blockers by (1) making the up subcommand accept --up-if-clean (so deploy CMDs that pass it don’t fail cobra parsing) and (2) adding a defensive recover() wrapper around Atlas executor calls to prevent process-killing panics and surface phase-labeled errors instead.

Changes:

  • Register --up-if-clean on migrate up and thread it through the command’s execution path (plus a test seam for driver construction).
  • Wrap Atlas executor ExecuteN and Pending calls with runWithRecover and add seams/interfaces to allow panic-injection tests.
  • Add unit tests for the new CLI flag behavior and Atlas panic recovery; update the changelog.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| pkg/cli/root.go | Adds --up-if-clean to up, prints a distinct message when set, and introduces a test seam for driver/request construction. |
| pkg/cli/root_test.go | Adds CLI-level tests ensuring up accepts --up-if-clean and remains a no-op on clean DB / applies when pending (via fake driver). |
| internal/atlas/driver.go | Introduces atlasExecutor interface, test seams, and runWithRecover to convert Atlas panics during Up/Status into returned errors. |
| internal/atlas/driver_recover_test.go | Adds unit tests for runWithRecover plus a fake panicking executor test for Driver.Up. |
| CHANGELOG.md | Documents the new flag acceptance and Atlas panic recovery behavior under Unreleased. |


Comment thread CHANGELOG.md Outdated
Comment thread internal/atlas/driver_recover_test.go
Comment thread internal/atlas/driver.go
Comment thread internal/atlas/driver.go Outdated
Address 4 Copilot nits on PR #14:

- Replace "typed error" with "wrapped error" in runWithRecover doc-comment,
  CHANGELOG.md Unreleased entry, and driver_recover_test.go comments. The
  implementation uses fmt.Errorf("%s panic: %v", ...) which produces a plain
  formatted error, not a custom Go error type.
- Drop dead `_ = drv` blank assignment in Status() — drv is consumed by the
  next line (newAtlasExecutorForTest call) so the blank assignment was
  unreachable noise.

No behavior change. All tests green.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@intel352 intel352 merged commit 6ddb03e into main May 2, 2026
6 checks passed
@intel352 intel352 deleted the feat/up-if-clean-and-atlas-recover branch May 2, 2026 13:46