fix(triage-dependabot): force-close prerelease PRs + preserve cooldown on mark-done flakes #38
Merged
zkoppert merged 2 commits on Jun 15, 2026
The previous close-prerelease outcome only posted '@dependabot close' as a
comment and relied on Dependabot to act on it. Dependabot has historically
ignored that directive for hours - on
github-community-projects/contributors#496 the hourly cron posted the
comment 12+ times between 00:05 and 20:08 UTC on 2026-06-15 before the PR
actually closed, spamming the PR with duplicate directives.

I now follow the comment with a direct 'gh pr close --delete-branch' call
so the PR shuts on the first cron tick. The comment stays in place so
Dependabot's own tracking still sees the directive.

Only the narrow 'already closed / not found' race on the close call is
swallowed (e.g. a parallel run, or Dependabot acting on the comment between
the two calls). Every other failure - auth, rate limit, timeout, transient
API error, branch-deletion failure - propagates so the outer run loop
records it in stats.errors and skips the mark_thread_done + cooldown that
would otherwise hide the still-open PR until the next cron tick. Without
this propagation the fix would silently re-introduce the spam pathology it
is trying to escape.

Tests:
- do_dependabot_close_posts_correct_comment updated to assert both calls
(comment + close with --delete-branch).
- New do_dependabot_close_swallows_already_closed_error verifies the
benign-race path.
- New do_dependabot_close_propagates_real_close_failure verifies rate
limit / auth / API failures propagate so cooldown does not engage.
- New do_dependabot_close_propagates_timeout verifies TimeoutExpired
propagates for the same reason.

README.md and SKILL.md updated to describe the two-step close.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Zack Koppert <zkoppert@github.com>
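The two-step close described in this commit can be sketched roughly as follows. This is a minimal illustration, not the skill's actual code: the names `do_dependabot_close` and `_is_already_closed_stderr` come from the PR, but their bodies, the marker strings, the `_run_gh` helper, and the injectable `run` parameter are assumptions made so the example stays self-contained.

```python
import subprocess
from typing import Callable, Optional, Sequence

# Hypothetical marker strings; the exact stderr text the skill matches is not shown in the PR.
_ALREADY_CLOSED_MARKERS = ("already closed", "not found")


def _is_already_closed_stderr(stderr: str) -> bool:
    """Return True only for the benign 'already closed / not found' race."""
    lowered = (stderr or "").lower()
    return any(marker in lowered for marker in _ALREADY_CLOSED_MARKERS)


def _run_gh(cmd: Sequence[str]) -> None:
    """Hypothetical runner: raises CalledProcessError or TimeoutExpired on failure."""
    subprocess.run(list(cmd), check=True, capture_output=True, text=True, timeout=60)


def do_dependabot_close(
    repo: str,
    number: int,
    run: Optional[Callable[[Sequence[str]], None]] = None,
) -> None:
    """Post the directive comment, then force-close the PR and delete its branch."""
    run = run or _run_gh
    # Step 1: keep the comment so Dependabot's own tracking still sees the directive.
    run(["gh", "pr", "comment", str(number), "--repo", repo, "--body", "@dependabot close"])
    # Step 2: close directly instead of waiting for Dependabot to act.
    try:
        run(["gh", "pr", "close", str(number), "--repo", repo, "--delete-branch"])
    except subprocess.CalledProcessError as exc:
        # Swallow only the benign race; everything else propagates to the caller.
        if not _is_already_closed_stderr(exc.stderr or ""):
            raise
```

Because `subprocess.TimeoutExpired` is not a `CalledProcessError`, a timeout on either call propagates untouched, which matches the stated intent that only the narrow race is silenced.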
Previously, every post-action branch in the run loop called
mark_thread_done() BEFORE setting state[pr_url] = now. If
mark_thread_done threw (transient notifications-API failure), control
jumped to the outer 'action failed' except handler and the cooldown
was never written - so the next hourly cron tick would re-attempt the
already-completed action (re-merge, re-close, re-comment), producing
exactly the spam pathology this skill is built to avoid.
I added a _safe_mark_thread_done(thread_id, *, dry_run, stats, context)
helper that swallows CalledProcessError + TimeoutExpired, appends to
stats.errors, logs a warning, and returns bool. Every post-action call
site (merge, label-and-merge, close-prerelease, terminal-skip,
branch-protected, archived-repo) was refactored to:
    do_action()
    stats.<counter> += 1
    state[pr_url] = now  # cooldown set FIRST
    _safe_mark_thread_done(...)
    _cleanup_stale_entries(...)
so a flaky mark-done never undoes the cooldown.
The excluded-dep paths (open + closed) were deliberately left with
their existing inline try/except. Those have DIFFERENT semantics: the
'action' there IS the mark-done, so cooldown must NOT engage on
mark-done failure or the next tick could not retry. Reviewer (Opus)
caught a stated-invariant mismatch on the archived-repo branch, which
actually had the SAME semantics as the post-action branches; that
branch is now also unified on the helper. Reviewer (GPT) caught that
terminal-skip never wrote cooldown at all, even pre-refactor, so a
flaky mark-done on a closed PR would loop forever; terminal-skip now
sets cooldown before the safe-mark-done call.
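For contrast, the excluded-dep semantics could look something like the sketch below, where the mark-done call is itself the action. The function name `handle_excluded_dep`, its parameters, and the assumption that cooldown is written on success are all hypothetical; the PR only states that cooldown must not engage when mark-done fails.

```python
import subprocess


def handle_excluded_dep(pr_url, thread_id, state, errors, now, mark_done):
    """Excluded-dep path: mark-done IS the action, so cooldown follows success only."""
    try:
        mark_done(thread_id)
    except (subprocess.CalledProcessError, subprocess.TimeoutExpired) as exc:
        # Record the failure but leave state untouched so the next tick can retry.
        errors.append(f"mark-done failed for {pr_url}: {exc}")
        return False
    state[pr_url] = now  # assumed: cooldown engages only after the action succeeded
    return True
```

The inline try/except keeps the run going after a failure, while the missing cooldown entry is what lets the next cron tick try the mark-done again.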
Tests:
- 5 new run-loop regression tests assert state[pr_url] IS set when
mark_thread_done throws, for merge, label-and-merge, close-prerelease,
and terminal-skip outcomes.
- 2 new unit tests for the helper (no-op on empty thread_id; records to
stats.errors and returns False on failure).
All 213 tests pass.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Zack Koppert <zkoppert@github.com>
Pull request overview
This PR fixes two related bugs in the triage-dependabot skill that caused comment spam and silent action loss. The fix for bug 1 adds a direct `gh pr close --delete-branch` call after the `@dependabot close` comment to force-close prerelease PRs on the first cron tick (previously Dependabot ignored the comment for hours, causing 12+ duplicate comments). The fix for bug 2 introduces a `_safe_mark_thread_done` helper and reorders call sites so the per-PR cooldown is written BEFORE the mark-done call, preventing a flaky notifications API from unwinding the cooldown and causing replayed actions.
Changes:
- Added `_is_already_closed_stderr` helper and a direct `gh pr close --delete-branch` call inside `do_dependabot_close`, swallowing only the benign "already closed" race and propagating all other failures.
- Introduced `_safe_mark_thread_done` wrapper and refactored all post-action call sites (merge, label-and-merge, close-prerelease, terminal-skip, archived-repo, branch-protected) to write cooldown before calling mark-done, so a flaky notifications API cannot undo the cooldown.
- Added 8 new tests covering both bugs' failure/propagation paths, plus updated 1 existing test and updated documentation in SKILL.md and README.md.
Show a summary per file
| File | Description |
|---|---|
| `.copilot/skills/triage-dependabot/triage_dependabot.py` | Core bug fixes: `_is_already_closed_stderr`, force-close in `do_dependabot_close`, `_safe_mark_thread_done` helper, and reordered cooldown/mark-done at all call sites |
| `.copilot/skills/triage-dependabot/tests.py` | 8 new tests + 1 updated test covering force-close behavior, error propagation, cooldown preservation, and the safe helper |
| `.copilot/skills/triage-dependabot/SKILL.md` | Documentation updated to reflect the two-step close behavior |
| `.copilot/skills/triage-dependabot/README.md` | Decision table and action descriptions updated for force-close |
Copilot's findings
- Files reviewed: 4/4 changed files
- Comments generated: 0
Why

Two related bugs in the triage-dependabot skill caused comment spam and silent action loss.

Bug 1: prerelease close was a comment-only directive. `do_dependabot_close` posted an `@dependabot close` comment and relied on Dependabot to actually shut the PR. Dependabot has historically ignored that directive for hours, and the hourly cron keeps re-encountering the still-open PR. On github-community-projects/contributors#496 the cron posted the directive 12+ times between 00:05 and 20:08 UTC on 2026-06-15 before Dependabot finally acted, spamming the PR with duplicate "@dependabot close" comments. The 1-hour `ACTION_COOLDOWN_SECONDS` does not save us: the cron fires once per hour, so `(now - last)` lands right at the 3600s boundary and `in_cooldown` returns False on the very next tick.

Bug 2: a flaky mark-done would unwind the cooldown for every post-action branch. This latent issue was uncovered by the multi-model review of the bug 1 fix. The run loop called `mark_thread_done()` BEFORE setting `state[pr_url] = now` (the cooldown). If `mark_thread_done` threw a transient `CalledProcessError` / `TimeoutExpired`, control jumped to the outer "action failed" except handler and the cooldown was never written. The next cron tick would re-attempt the already-completed action - re-merge, re-close, re-comment. Same silent-loss family as bug 1, hidden behind a different code path.

What changed
Bug 1 (force-close prerelease PRs): I now follow the `@dependabot close` comment with a direct `gh pr close <num> --repo <repo> --delete-branch` call so the PR shuts on the first cron tick. The comment stays in place so Dependabot's own tracking still records the directive. Only the narrow "already closed / not found" race on the close call is swallowed (e.g. a parallel run, or Dependabot acting on the comment between the two calls). Every other failure - auth, rate limit, timeout, transient API error, branch-deletion failure - propagates so the outer run loop records it in `stats.errors` and skips the cooldown that would otherwise hide the still-open PR until the next cron tick.

Bug 2 (preserve cooldown across flaky mark-done): Added `_safe_mark_thread_done(thread_id, *, dry_run, stats, context)` helper that wraps `mark_thread_done`, swallows `CalledProcessError` / `TimeoutExpired`, appends to `stats.errors`, logs a warning, and returns bool. Refactored every post-action call site (merge, label-and-merge, close-prerelease, terminal-skip, branch-protected, archived-repo) to set `state[pr_url] = now` first and call the helper afterwards, so a flaky mark-done can no longer undo the cooldown. The excluded-dep paths (open + closed) deliberately keep their existing inline try/except - those have different semantics (the "action" there IS the mark-done, so cooldown must NOT engage on mark-done failure or the next tick could not retry). The terminal-skip path picked up an explicit `state[pr_url] = now` before its safe-mark-done call - previously it never wrote cooldown at all, so a flaky mark-done on a closed PR would loop forever (GPT-5.4 reviewer catch).

The scope-narrowing on bug 1 and the bug 2 fix both came out of multi-model review (Opus / Sonnet / GPT). All three independently flagged the broad `except` in the first draft as a regression; Opus and GPT independently flagged additional latent gaps (archived-repo branch consistency, terminal-skip cooldown).

Files touched: `.copilot/skills/triage-dependabot/{triage_dependabot.py, tests.py, README.md, SKILL.md}`.

Testing
- `pytest tests.py` -> 213 passed (added 8 new tests; updated 1).
- `test_do_dependabot_close_posts_correct_comment` updated to assert both calls land in order. New `test_do_dependabot_close_swallows_already_closed_error` covers the benign-race path. New `test_do_dependabot_close_propagates_real_close_failure` and `test_do_dependabot_close_propagates_timeout` verify real failures bubble to `stats.errors` and skip cooldown.
- 5 new run-loop regression tests assert `state[pr_url]` IS set when `mark_thread_done` throws, for the merge, label-and-merge, close-prerelease, and terminal-skip outcomes. 2 new unit tests for the helper itself (no-op on empty thread_id; records to `stats.errors` and returns False on failure).
- `test_run_excluded_dep_mark_done_failure_does_not_abort_run` (which asserts the OPPOSITE invariant for excluded-dep paths - cooldown NOT set on mark-done failure) still passes, confirming I preserved the intentional semantic difference there.

Tradeoffs and alternatives considered
- … `gh pr close`. Rejected. Dependabot's own state tracking benefits from seeing the `@dependabot close` directive (it records the PR as intentionally declined and is less likely to re-open the same bump on the next poll). Belt and suspenders.
- … `gh pr close` and log warning. Rejected after multi-model review. A blanket swallow combined with the unconditional `mark_thread_done` + cooldown at the call site would silently lose any auth / rate-limit / timeout failure - the exact silent-loss pattern this PR is meant to eliminate. Narrow stderr matching keeps the benign race silent while letting real failures bubble to `stats.errors`.
- … `ACTION_COOLDOWN_SECONDS` from 3600 to e.g. 7200. Treats the symptom, not the cause. The PR would still be open, still re-notifying on every push, still costing reviewer attention. Force-closing is the root-cause fix.

Impact
- … `gh pr close` failure now lands in `stats.errors` and the next cron tick retries instead of being hidden behind a cooldown.

Rollout
- … `~/Library/LaunchAgents/com.zkoppert.triage-dependabot.plist` runs hourly and will pick up the new code automatically on the next tick after merge.
- … `@dependabot close` comments and that the PR closes on the first tick.
- … `stats.errors` across the same period for any `mark-done failed for ...` entries - they should be rare but harmless now (cooldown still engages), and any uptick would indicate a notifications API issue worth investigating.
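The cooldown-boundary problem described under Why (a 1-hour cooldown checked by an hourly cron) can be illustrated with a small sketch. `ACTION_COOLDOWN_SECONDS` and `in_cooldown` are names from the PR, but the signature and the strict `<` comparison here are assumptions about how the check might be written.

```python
ACTION_COOLDOWN_SECONDS = 3600  # 1-hour cooldown, as stated in the PR


def in_cooldown(last, now, cooldown=ACTION_COOLDOWN_SECONDS):
    """Assumed check: still cooling down only while strictly inside the window."""
    return (now - last) < cooldown


last_action = 0.0
print(in_cooldown(last_action, last_action + 3599))  # True: one second inside the window
print(in_cooldown(last_action, last_action + 3600))  # False: exactly on the boundary
print(in_cooldown(last_action, last_action + 3605))  # False: cron fired a few seconds late
```

With an hourly schedule, the next tick lands at or just past the boundary, so under this assumed comparison the cooldown alone never suppresses the repeat action.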