Skip to content

Promote dev to main: fix(triggers) respond-to-ci on mixed-state SHA#1242

Merged
zbigniewsobiecki merged 1 commit intomainfrom
dev
Apr 30, 2026
Merged

Promote dev to main: fix(triggers) respond-to-ci on mixed-state SHA#1242
zbigniewsobiecki merged 1 commit intomainfrom
dev

Conversation

@zbigniewsobiecki
Copy link
Copy Markdown
Member

Promotes 1 commit from dev:

What changed

Closes the long-standing bug where a fast-failing sibling check_suite (e.g. E2B Template Rebuild) on a PR head SHA caused respond-to-ci to be permanently lost — because the failure event arrived before the rest of CI completed (correctly deferring), and the eventual success event only ever dispatched review. The success handler now queries aggregate state and forks: if any check on the SHA failed, dispatch respond-to-ci instead of review.

Live incident: ucho PR #176 on 2026-04-30 — full root-cause + fix history in #1241.

Risk

Low. Pure trigger-handler change, fully unit-tested. Behavior change is additive: the new code path only fires when an existing situation (success-event-after-mixed-failure) currently silently drops a respond-to-ci dispatch. All existing happy paths preserved.

CI on dev

  • CI (lint + test 1-4 + Docker validate)
  • Build and Deploy (Dev)
  • Push on dev

🤖 Generated with Claude Code

…uite-success (#1241)

GitHub fires `check_suite.completed` once per workflow. When workflow A's
suite fails fast (e.g. the E2B template-rebuild on ucho/PR#176 — 38s) and
workflow B's suite is still running, the failure handler correctly defers
with "not all complete yet". When workflow B's suite eventually completes
with `conclusion=success`, only the success handler fires — and it
unconditionally dispatches `review`. That review is silently skipped at
worker time (`pollWaitForChecks` sees `allPassing=false`), and no later
event with `conclusion=failure` ever wakes the failure handler back up.
Net: `respond-to-ci` is permanently lost for that SHA.

Fix: in the success handler, after the author + base gates, query the
aggregate `getCheckSuiteStatus` and fork. When `allComplete && anyFailed`,
dispatch `respond-to-ci` (gated by its own trigger config + attempt limit)
instead of `review`. Otherwise, current behavior — dispatch review with
`waitForChecks: true` so the worker polls if checks are still in progress.

Single-source the dispatch envelope: extract `dispatchRespondToCi` plus
`fixAttempts` / `MAX_ATTEMPTS` / `resetFixAttempts` into a new shared
module `respond-to-ci-dispatch.ts`. Both handlers converge on it. The
failure handler keeps its early `gateTriggerEnabled` so disabled projects
don't burn GitHub API calls; the helper re-checks the same gate so the
success-handler fork is also guarded.

Tests: 5 new TDD cases on the success handler — failure conclusion,
timed_out conclusion, in-progress checks (defers to worker), all-passing
(review), and the "already-reviewed at HEAD with CI failure" case (CI
failure trumps prior approval). Adjusted the existing pre-existing
"getCheckSuiteStatus not called" assertions to match the new flow:
removed where the call now happens (post-base-gate), kept where the gate
short-circuits before it.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 30, 2026

Codecov Report

❌ Patch coverage is 98.94737% with 1 line in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/triggers/github/respond-to-ci-dispatch.ts 98.46% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

@zbigniewsobiecki zbigniewsobiecki merged commit 674906a into main Apr 30, 2026
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant