Allow sst refresh to fail without blocking deploy #614

Merged
willwashburn merged 1 commit into main from fix/sst-refresh-continue-on-error
Mar 21, 2026

@willwashburn (Member) commented Mar 21, 2026

Summary

  • Adds `continue-on-error: true` to the refresh step
  • `sst refresh` exits non-zero when it detects state changes (even when it succeeds), which was blocking the deploy step
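
For context, the change described above might look roughly like the following in the workflow file. This is a sketch rather than the actual diff: the step name comes from the review excerpt further down, while the `npx sst` invocation is an assumption about how the repository runs the CLI.

```yaml
# Sketch of the refresh step after this PR (the `run` line is assumed).
- name: Refresh SST state
  # Let the job continue even if this step exits non-zero.
  continue-on-error: true
  run: npx sst refresh --stage production
```

With `continue-on-error: true`, GitHub Actions records the step's `outcome` as `failure` when the command exits non-zero but reports its `conclusion` as `success`, so later steps run as usual.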

Test plan

  • Trigger deploy and verify it proceeds past refresh to deploy

🤖 Generated with Claude Code


sst refresh exits non-zero when it detects state changes, even on
success. Adding continue-on-error so deploy still proceeds.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@willwashburn willwashburn merged commit 2be5353 into main Mar 21, 2026
22 of 31 checks passed
@willwashburn willwashburn deleted the fix/sst-refresh-continue-on-error branch March 21, 2026 13:01
@devin-ai-integration devin-ai-integration bot (Contributor) left a comment

Devin Review found 1 potential issue.

          aws-region: ${{ env.AWS_REGION_INPUT }}

      - name: Refresh SST state
        continue-on-error: true

🔴 sst deploy proceeds unconditionally after sst refresh failure, risking production deployment on stale/corrupt state

Adding `continue-on-error: true` to the "Refresh SST state" step means that if `sst refresh --stage production` fails (e.g., due to state corruption, resource drift, or permission errors), the workflow silently continues to `sst deploy --stage production` at line 74. Since `sst refresh` synchronizes the state file with actual cloud infrastructure, deploying against a stale or inconsistent state can cause resource conflicts, duplicate resources, or failed deployments that are harder to recover from. The deploy step has no `if:` condition checking the refresh outcome, so it runs regardless. At minimum, the refresh step should have an `id`, and the deploy step should either gate on the refresh outcome or log a prominent warning.
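
One way to gate on the refresh outcome, as suggested above, is sketched below. The step id `refresh` and the `npx sst` commands are assumptions for illustration, since the full workflow is not shown in this conversation. Note that a strict gate like this would skip the deploy whenever refresh exits non-zero, which, per the PR summary, can also happen when a successful refresh detects state changes.

```yaml
- name: Refresh SST state
  id: refresh                # hypothetical id so later steps can reference this step
  continue-on-error: true
  run: npx sst refresh --stage production   # assumed invocation

- name: Deploy SST app
  # `outcome` holds the step's raw result, before continue-on-error is applied.
  if: steps.refresh.outcome == 'success'
  run: npx sst deploy --stage production    # assumed invocation
```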

Prompt for agents
In .github/workflows/deploy-web.yml, instead of unconditionally swallowing the sst refresh failure, give the refresh step an id and add a conditional warning + gating logic:

1. On line 61, add `id: refresh` to the "Refresh SST state" step (keep continue-on-error: true).
2. Before the "Deploy SST app" step (line 69), add a new step that checks the refresh outcome and emits a warning:
   - name: Warn on refresh failure
     if: steps.refresh.outcome == 'failure'
     run: echo '::warning::SST refresh failed — deploying with potentially stale state'
3. Alternatively, if a refresh failure should block deployment in most cases, remove continue-on-error: true and instead add retry logic (e.g., using nick-fields/retry action) to handle transient failures while still failing on genuine state corruption.
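
Steps 1 and 2 above could be combined along these lines. This is a minimal sketch: the `run` commands are assumed (the actual invocations in `deploy-web.yml` are not shown here), and the warning text is only illustrative.

```yaml
- name: Refresh SST state
  id: refresh                # lets later steps read steps.refresh.outcome
  continue-on-error: true
  run: npx sst refresh --stage production   # assumed invocation

- name: Warn on refresh failure
  # Runs only when the refresh command itself exited non-zero.
  if: steps.refresh.outcome == 'failure'
  run: echo '::warning::SST refresh failed; deploying with potentially stale state'

- name: Deploy SST app
  run: npx sst deploy --stage production    # assumed invocation
```

The `::warning::` workflow command surfaces the message as an annotation on the run, so a failed refresh stays visible even though the job continues to the deploy step.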
