Skip to content

fix(triage-dependabot): force-close prerelease PRs + preserve cooldown on mark-done flakes#38

Merged
zkoppert merged 2 commits into
mainfrom
zkoppert/triage-dependabot-force-close-prerelease
Jun 15, 2026
Merged

fix(triage-dependabot): force-close prerelease PRs + preserve cooldown on mark-done flakes#38
zkoppert merged 2 commits into
mainfrom
zkoppert/triage-dependabot-force-close-prerelease

Conversation

@zkoppert

@zkoppert zkoppert commented Jun 15, 2026

Copy link
Copy Markdown
Owner

Why

Two related bugs in the triage-dependabot skill caused comment spam and silent action loss.

Bug 1: prerelease close was a comment-only directive. do_dependabot_close posted an @dependabot close comment and relied on Dependabot to actually shut the PR. Dependabot has historically ignored that directive for hours, and the hourly cron keeps re-encountering the still-open PR. On github-community-projects/contributors#496 the cron posted the directive 12+ times between 00:05 and 20:08 UTC on 2026-06-15 before Dependabot finally acted, spamming the PR with duplicate "@dependabot close" comments. The 1-hour ACTION_COOLDOWN_SECONDS does not save us: the cron fires once per hour, so (now - last) lands right at the 3600s boundary and in_cooldown returns False on the very next tick.

Bug 2: a flaky mark-done would unwind the cooldown for every post-action branch. Latent issue uncovered by the multi-model review of the bug 1 fix. The run loop called mark_thread_done() BEFORE setting state[pr_url] = now (the cooldown). If mark_thread_done threw a transient CalledProcessError / TimeoutExpired, control jumped to the outer "action failed" except handler and the cooldown was never written. Next cron tick would re-attempt the already-completed action - re-merge, re-close, re-comment. Same silent-loss family as bug 1, hidden behind a different code path.

What changed

Bug 1 (force-close prerelease PRs): I now follow the @dependabot close comment with a direct gh pr close <num> --repo <repo> --delete-branch call so the PR shuts on the first cron tick. The comment stays in place so Dependabot's own tracking still records the directive. Only the narrow "already closed / not found" race on the close call is swallowed (e.g. a parallel run, or Dependabot acting on the comment between the two calls). Every other failure - auth, rate limit, timeout, transient API error, branch-deletion failure - propagates so the outer run loop records it in stats.errors and skips the cooldown that would otherwise hide the still-open PR until the next cron tick.

Bug 2 (preserve cooldown across flaky mark-done): Added _safe_mark_thread_done(thread_id, *, dry_run, stats, context) helper that wraps mark_thread_done, swallows CalledProcessError / TimeoutExpired, appends to stats.errors, logs a warning, and returns bool. Refactored every post-action call site (merge, label-and-merge, close-prerelease, terminal-skip, branch-protected, archived-repo) to:

do_action()
stats.<counter> += 1
state[pr_url] = now      # cooldown set FIRST
_safe_mark_thread_done(...)
_cleanup_stale_entries(...)

so a flaky mark-done can no longer undo the cooldown. The excluded-dep paths (open + closed) deliberately keep their existing inline try/except - those have different semantics (the "action" there IS the mark-done, so cooldown must NOT engage on mark-done failure or the next tick could not retry). The terminal-skip path picked up an explicit state[pr_url] = now before its safe-mark-done call - previously it never wrote cooldown at all, so a flaky mark-done on a closed PR would loop forever (GPT-5.4 reviewer catch).

The scope-narrowing on bug 1 and the bug 2 fix both came out of multi-model review (Opus / Sonnet / GPT). All three independently flagged the broad except in the first draft as a regression; Opus and GPT independently flagged additional latent gaps (archived-repo branch consistency, terminal-skip cooldown).

Files touched: .copilot/skills/triage-dependabot/{triage_dependabot.py, tests.py, README.md, SKILL.md}.

Testing

  • pytest tests.py -> 213 passed (added 8 new tests; updated 1).
  • Bug 1 tests: test_do_dependabot_close_posts_correct_comment updated to assert both calls land in order. New test_do_dependabot_close_swallows_already_closed_error covers the benign-race path. New test_do_dependabot_close_propagates_real_close_failure and test_do_dependabot_close_propagates_timeout verify real failures bubble to stats.errors and skip cooldown.
  • Bug 2 tests: 5 new run-loop regression tests assert state[pr_url] IS set when mark_thread_done throws, for the merge, label-and-merge, close-prerelease, and terminal-skip outcomes. 2 new unit tests for the helper itself (no-op on empty thread_id; records to stats.errors and returns False on failure).
  • Existing test_run_excluded_dep_mark_done_failure_does_not_abort_run (which asserts the OPPOSITE invariant for excluded-dep paths - cooldown NOT set on mark-done failure) still passes, confirming I preserved the intentional semantic difference there.

Tradeoffs and alternatives considered

  • Bug 1 alternative: skip the comment, only call gh pr close. Rejected. Dependabot's own state tracking benefits from seeing the @dependabot close directive (it records the PR as intentionally declined and is less likely to re-open the same bump on the next poll). Belt and suspenders.
  • Bug 1 alternative: catch every exception from gh pr close and log warning. Rejected after multi-model review. A blanket swallow combined with the unconditional mark_thread_done + cooldown at the call site would silently lose any auth / rate-limit / timeout failure - the exact silent-loss pattern this PR is meant to eliminate. Narrow stderr matching keeps the benign race silent while letting real failures bubble to stats.errors.
  • Bug 1 alternative: tighten ACTION_COOLDOWN_SECONDS from 3600 to e.g. 7200. Treats the symptom, not the cause. The PR would still be open, still re-notifying on every push, still costing reviewer attention. Force-closing is the root-cause fix.
  • Bug 2 alternative: leave the latent cooldown-loss bug for a follow-up PR. Rejected. The bug was uncovered while reviewing the bug 1 fix and shares the silent-action-loss pattern. Fixing both together keeps the run-loop invariants consistent and avoids a second PR that touches the same call sites.
  • Bug 2 alternative: refactor the excluded-dep paths to use the helper too. Rejected. Those paths have DIFFERENT semantics (cooldown gated on mark-done success because the mark-done IS the action). Forcing them onto the helper would conflate two distinct contracts and break the existing resilience test.

Impact

  • Bug 1: Comment spam on prerelease Dependabot PRs drops from 1 per hourly cron tick (12+ on contributors#496 before it closed) to 1 per PR lifetime. Reviewer noise removed: no more "why did this bot post the same comment 12 times" investigations.
  • Bug 2: Eliminates a silent action-loss surface that could replay any successful merge / close / comment on the next cron tick if the notifications API hiccupped. In practice this was rare but unbounded; with the fix, the cooldown is durable across mark-done failures and the next tick correctly short-circuits.
  • New silent-failure surface eliminated: any non-benign gh pr close failure now lands in stats.errors and the next cron tick retries instead of being hidden behind a cooldown.

Rollout

  • Merge to main. The launchd job at ~/Library/LaunchAgents/com.zkoppert.triage-dependabot.plist runs hourly and will pick up the new code automatically on the next tick after merge.
  • I will watch the next ~10 cron runs (~10 hours) on the next freshly opened prerelease Dependabot PR to confirm zero duplicate @dependabot close comments and that the PR closes on the first tick.
  • I will also watch stats.errors across the same period for any mark-done failed for ... entries - they should be rare but harmless now (cooldown still engages), and any uptick would indicate a notifications API issue worth investigating.
  • Rollback: revert these two commits; behavior reverts to comment-only close + the latent mark-done cooldown-loss bug returns.

The previous close-prerelease outcome only posted '@dependabot close'
as a comment and relied on Dependabot to act on it. Dependabot has
historically ignored that directive for hours - on
github-community-projects/contributors#496 the hourly cron posted the
comment 12+ times between 00:05 and 20:08 UTC on 2026-06-15 before the
PR actually closed, spamming the PR with duplicate directives.

I now follow the comment with a direct 'gh pr close --delete-branch'
call so the PR shuts on the first cron tick. The comment stays in
place so Dependabot's own tracking still sees the directive.

Only the narrow 'already closed / not found' race on the close call
is swallowed (e.g. a parallel run, or Dependabot acting on the
comment between the two calls). Every other failure - auth, rate
limit, timeout, transient API error, branch-deletion failure -
propagates so the outer run loop records it in stats.errors and skips
the mark_thread_done + cooldown that would otherwise hide the
still-open PR until the next cron tick. Without this propagation the
fix would silently re-introduce the spam pathology it is trying to
escape.

Tests:
- do_dependabot_close_posts_correct_comment updated to assert both
  calls (comment + close with --delete-branch).
- New do_dependabot_close_swallows_already_closed_error verifies the
  benign-race path.
- New do_dependabot_close_propagates_real_close_failure verifies rate
  limit / auth / API failures propagate so cooldown does not engage.
- New do_dependabot_close_propagates_timeout verifies TimeoutExpired
  propagates for the same reason.
- README.md and SKILL.md updated to describe the two-step close.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Zack Koppert <zkoppert@github.com>
@zkoppert zkoppert self-assigned this Jun 15, 2026
Previously, every post-action branch in the run loop called
mark_thread_done() BEFORE setting state[pr_url] = now. If
mark_thread_done threw (transient notifications-API failure), control
jumped to the outer 'action failed' except handler and the cooldown
was never written - so the next hourly cron tick would re-attempt the
already-completed action (re-merge, re-close, re-comment), producing
exactly the spam pathology this skill is built to avoid.

I added a _safe_mark_thread_done(thread_id, *, dry_run, stats, context)
helper that swallows CalledProcessError + TimeoutExpired, appends to
stats.errors, logs a warning, and returns bool. Every post-action call
site (merge, label-and-merge, close-prerelease, terminal-skip,
branch-protected, archived-repo) was refactored to:

    do_action()
    stats.<counter> += 1
    state[pr_url] = now      # cooldown set FIRST
    _safe_mark_thread_done(...)
    _cleanup_stale_entries(...)

so a flaky mark-done never undoes the cooldown.

The excluded-dep paths (open + closed) were deliberately left with
their existing inline try/except. Those have DIFFERENT semantics: the
'action' there IS the mark-done, so cooldown must NOT engage on
mark-done failure or the next tick could not retry. Reviewer (Opus)
caught a stated-invariant mismatch on the archived-repo branch, which
actually had the SAME semantics as the post-action branches; that
branch is now also unified on the helper. Reviewer (GPT) caught that
terminal-skip never wrote cooldown at all, even pre-refactor, so a
flaky mark-done on a closed PR would loop forever; terminal-skip now
sets cooldown before the safe-mark-done call.

Tests:
- 5 new run-loop regression tests assert state[pr_url] IS set when
  mark_thread_done throws, for merge, label-and-merge, close-prerelease,
  and terminal-skip outcomes.
- 2 new unit tests for the helper (no-op on empty thread_id; records to
  stats.errors and returns False on failure).

All 213 tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Zack Koppert <zkoppert@github.com>
@zkoppert zkoppert changed the title fix(triage-dependabot): force-close prerelease PRs via gh pr close fix(triage-dependabot): force-close prerelease PRs + preserve cooldown on mark-done flakes Jun 15, 2026
@zkoppert zkoppert marked this pull request as ready for review June 15, 2026 21:05
Copilot AI review requested due to automatic review settings June 15, 2026 21:05
@zkoppert zkoppert merged commit 46886b8 into main Jun 15, 2026
3 checks passed
@zkoppert zkoppert deleted the zkoppert/triage-dependabot-force-close-prerelease branch June 15, 2026 21:06

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes two related bugs in the triage-dependabot skill that caused comment spam and silent action loss. Bug 1 added a direct gh pr close --delete-branch call after the @dependabot close comment to force-close prerelease PRs on the first cron tick (previously Dependabot ignored the comment for hours, causing 12+ duplicate comments). Bug 2 introduced a _safe_mark_thread_done helper and reordered call sites so the per-PR cooldown is written BEFORE the mark-done call, preventing a flaky notifications API from unwinding the cooldown and causing replayed actions.

Changes:

  • Added _is_already_closed_stderr helper and a direct gh pr close --delete-branch call inside do_dependabot_close, swallowing only the benign "already closed" race and propagating all other failures.
  • Introduced _safe_mark_thread_done wrapper and refactored all post-action call sites (merge, label-and-merge, close-prerelease, terminal-skip, archived-repo, branch-protected) to write cooldown before calling mark-done, so a flaky notifications API cannot undo the cooldown.
  • Added 8 new tests covering both bugs' failure/propagation paths, plus updated 1 existing test and updated documentation in SKILL.md and README.md.
Show a summary per file
File Description
.copilot/skills/triage-dependabot/triage_dependabot.py Core bug fixes: _is_already_closed_stderr, force-close in do_dependabot_close, _safe_mark_thread_done helper, and reordered cooldown/mark-done at all call sites
.copilot/skills/triage-dependabot/tests.py 8 new tests + 1 updated test covering force-close behavior, error propagation, cooldown preservation, and the safe helper
.copilot/skills/triage-dependabot/SKILL.md Documentation updated to reflect the two-step close behavior
.copilot/skills/triage-dependabot/README.md Decision table and action descriptions updated for force-close

Copilot's findings

  • Files reviewed: 4/4 changed files
  • Comments generated: 0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants