feat: Cancel pending deployments on successful production deployment #1124
Merged
Implement automatic cancellation of pending workflow runs when a production deployment succeeds. This prevents multiple redundant deployments and saves CI/CD resources.
Changes:
- Add 'actions: write' permission to workflow and deploy-production job
- New 'Cancel pending deployments' step in deploy-production job
- Runs after smoke test to ensure only successful deployments trigger cancellation
- Implements pagination for large numbers of queued runs
- Includes error handling with try-catch for API resilience
- Detailed logging showing commit SHA and branch for each cancelled run
Addresses issue #149
Address blocking issues identified by GPT 5.5 and Opus 4.6 code reviews:
Critical Fixes:
- Replace hardcoded 'Build-Test-And-Deploy.yml' with context.workflow
  Eliminates maintenance trap and silent failure mode on file renames
- Add outer try-catch around listWorkflowRuns API call
  Prevents transient API errors from blocking production deployment tagging
- Fix pagination check: runs.length < 100 instead of hasMore flag
  Clearer logic, prevents extra empty page fetch
- Add debug logging for skipped current run
  Improves auditability and debugging
Improvements:
- Better comments explaining purpose and context
- Improved error messages with action suggestions
- More detailed logging with created_at timestamp
- Outer error handling preserves deployment success despite API issues
Note: Remaining consideration from review - filtering by commit age could be added in future to only cancel older commits (not newer ones). This is a design choice pending business requirements clarification.
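The commit-age filtering left open in the note above could look roughly like the following. This is a minimal sketch, not code from this PR: the helper name `selectOlderRuns` and the sample data are hypothetical, and it assumes each run object carries the ISO 8601 `created_at` timestamp that the step already logs.

```javascript
// Hypothetical helper (not part of this PR): keep only runs created before
// the run that just deployed, so runs for newer commits are never cancelled.
function selectOlderRuns(runs, currentRun) {
  const cutoff = Date.parse(currentRun.created_at);
  return runs.filter(
    (run) => run.id !== currentRun.id && Date.parse(run.created_at) < cutoff
  );
}

// Example with made-up data: run 2 is the one that just deployed.
const sampleRuns = [
  { id: 1, created_at: '2024-05-01T10:00:00Z' },
  { id: 2, created_at: '2024-05-01T10:05:00Z' },
  { id: 3, created_at: '2024-05-01T10:10:00Z' },
];
const olderRunIds = selectOlderRuns(sampleRuns, sampleRuns[1]).map((run) => run.id);
console.log(olderRunIds); // [ 1 ]
```

Whether run creation time is the right proxy for commit age is part of the design choice the note leaves open.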
Pull request overview
Adds post-deployment cleanup to the production deployment workflow so that, after a successful smoke test, older queued workflow runs are cancelled to reduce wasted CI/CD work and avoid deployment confusion.
Changes:
- Grants `actions: write` permission to enable cancelling workflow runs via the GitHub Actions API.
- Adds a GitHub Script step after the production smoke test to enumerate and cancel queued runs of the same workflow.
- Logs cancellation attempts and continues deployment even if cancellation fails.
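The "continue even if cancellation fails" behaviour in the last bullet can be sketched as a small wrapper. The name `bestEffort` and its result shape are assumptions for illustration; the PR itself describes try-catch blocks written inline in the script.

```javascript
// Hypothetical wrapper (not from this PR): run a cleanup task and swallow
// any failure so the surrounding deployment job still succeeds.
async function bestEffort(label, task) {
  try {
    return { ok: true, value: await task() };
  } catch (err) {
    console.log(`${label} failed; continuing deployment: ${err.message}`);
    return { ok: false, error: err };
  }
}
```

A caller would wrap the whole cancellation routine, for example `await bestEffort('Cancel pending deployments', () => cancelAll())`, where `cancelAll` stands in for whatever function performs the cleanup.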
Comments suppressed due to low confidence (2)
.github/workflows/Build-Test-And-Deploy.yml:251
`workflow_id` for `listWorkflowRuns` must be a workflow numeric id or the workflow file name/path (e.g. `Build-Test-And-Deploy.yml`), but `context.workflow` is the workflow name (here: "Build, Test, and Deploy EssentialCSharp.Web"). This will return 404/no results and prevent cancellations. Consider resolving the workflow id via `listRepoWorkflows` (matching by `context.workflow`) or parsing the workflow file from `GITHUB_WORKFLOW_REF`/`context.workflowRef`.
owner: context.repo.owner,
repo: context.repo.repo,
workflow_id: context.workflow,
status: 'queued',
per_page: 100,
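One way to follow the second suggestion in this comment is to parse the file name out of the workflow ref. The sketch below assumes the documented `GITHUB_WORKFLOW_REF` shape, `owner/repo/.github/workflows/<file>@<ref>`; the helper name `workflowFileFromRef` is hypothetical and the ref string is a made-up example.

```javascript
// Hypothetical helper: extract the workflow file name from a
// GITHUB_WORKFLOW_REF-style string. Returns null if the shape is unexpected.
function workflowFileFromRef(workflowRef) {
  const pathPart = String(workflowRef || '').split('@')[0];
  const marker = '.github/workflows/';
  const idx = pathPart.indexOf(marker);
  if (idx === -1) return null;
  const file = pathPart.slice(idx + marker.length);
  return file || null;
}

// Made-up ref for illustration only.
const exampleRef = 'octo-org/octo-repo/.github/workflows/Build-Test-And-Deploy.yml@refs/heads/main';
console.log(workflowFileFromRef(exampleRef)); // Build-Test-And-Deploy.yml
```

Inside the step, the input would presumably come from `process.env.GITHUB_WORKFLOW_REF`, and the returned file name could be passed as `workflow_id`; a `null` result is a signal to skip cancellation rather than guess.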
.github/workflows/Build-Test-And-Deploy.yml:283
- Pagination here is vulnerable to skipping runs because the script cancels runs while paginating by `page`. Since cancelling changes each run’s status from `queued`, the queued-run list shrinks and later items can shift to earlier pages; incrementing `page` can then miss remaining queued runs (especially when there are >100). Prefer collecting all queued run ids across pages first (read-only), or repeatedly fetching `page: 1` until no queued runs remain.
// Last page has fewer results than requested (pagination)
if (runs.length < 100) {
break;
}
page++;
}
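The first remedy in this comment, collecting ids read-only and cancelling afterwards, might look like the sketch below. It is not the code merged in this PR. The `actions` parameter is assumed to be an object shaped like `github.rest.actions` in actions/github-script (with `listWorkflowRuns` and `cancelWorkflowRun`); the function names are hypothetical.

```javascript
// Phase 1 (read-only): gather the ids of all queued runs before cancelling
// anything, so the result pages cannot shift underneath the loop.
async function collectQueuedRunIds(actions, params, currentRunId) {
  const ids = [];
  for (let page = 1; ; page++) {
    const { data } = await actions.listWorkflowRuns({
      ...params,
      status: 'queued',
      per_page: 100,
      page,
    });
    const runs = data.workflow_runs;
    for (const run of runs) {
      if (run.id !== currentRunId) ids.push(run.id); // never cancel ourselves
    }
    if (runs.length < 100) break; // short page means last page
  }
  return ids;
}

// Phase 2: cancel each collected run; one failure does not stop the rest.
async function cancelRuns(actions, repoParams, runIds) {
  const cancelled = [];
  for (const run_id of runIds) {
    try {
      await actions.cancelWorkflowRun({ ...repoParams, run_id });
      cancelled.push(run_id);
    } catch (err) {
      console.log(`Could not cancel run ${run_id}: ${err.message}`);
    }
  }
  return cancelled;
}
```

Separating the two phases costs one extra pass over the ids but keeps the listing stable, which is the property the reviewer is asking for.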
Address two critical issues identified in PR review:
1. Fix API Response Handling (Comment 3254800774)
   GitHub Actions API returns response object with data.workflow_runs array, not an array directly. Updated all three locations (lines 253, 255, 259, 279) to correctly destructure: const runs = data.workflow_runs; This fixes pagination and iteration logic that would have failed silently.
2. Remove Over-Scoped Workflow Permissions (Comment 3254800781)
   Removed actions: write from workflow-level permissions to follow least-privilege. Only deploy-production job needs this permission; it remains at job level.
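To illustrate the first fix: the list endpoint wraps the runs in `data.workflow_runs`, so treating the response itself as an array gives an `undefined` length rather than an error. The helper below is a hypothetical, defensive sketch and not the code from this commit.

```javascript
// Hypothetical helper: pull the run array out of an Octokit-style response,
// falling back to an empty list if the shape is not what we expect.
function runsFromResponse(response) {
  const runs = response?.data?.workflow_runs;
  return Array.isArray(runs) ? runs : [];
}

// Made-up response for illustration.
const fakeResponse = { data: { total_count: 2, workflow_runs: [{ id: 10 }, { id: 11 }] } };
console.log(fakeResponse.length);                   // undefined
console.log(runsFromResponse(fakeResponse).length); // 2
```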
Description
Implements automatic cancellation of pending GitHub Actions workflow runs when a production deployment succeeds. This resolves the issue where multiple concurrent deployments could pile up and waste CI/CD resources.
Why This Matters
When developers push code frequently, GitHub Actions queues up multiple workflow runs. Previously, if older code deployed to production while newer code was still building, both would complete, wasting resources and potentially causing confusion about which version is deployed.
With this change, once a production deployment succeeds (confirmed by the smoke test), any pending workflow runs are automatically cancelled, ensuring only necessary CI/CD work runs.
How It Works
Key Features
Code Review History
This implementation was reviewed by independent agents (GPT 5.5 and Opus 4.6) and addresses all critical findings:
- context.workflow (resilient to renames)
Testing Recommendations
Closes #149