fix(ci): close stale-bot gaps for re-marking and active-PR auto-close #60402
The stale workflow had two gaps.

First, `actions/stale` tracks visited issues/PRs in a per-cycle state set, so a PR visited while still fresh is skipped for the rest of the cycle even if it ages past the threshold mid-cycle. With ~7k open issues and `operations-per-run: 250`, one cycle took so long that PRs manually unstaled by an author were never re-evaluated before the author force-pushed weeks later. Bumping the budget to 5000, combined with the higher app-token API rate limit, lets a full cycle complete in a single run, so re-marking happens on the next scheduled tick.

Second, `remove-pr-stale-when-updated: false` meant pushing commits to a stale PR did NOT clear the label, so authors actively addressing review feedback still got auto-closed seven days later. Flipping it to `true` matches the behavior most contributors expect.

Piggybacks on the existing `POSTHOG_SCHEDULED_ACTIONS` GitHub App, already used by update-ai-costs, browserslist, and update-bot-ips. The app token also replaces the workflow-scoped `issues:`/`pull-requests:` write permissions: the action now receives the app token via `repo-token` instead of using `GITHUB_TOKEN`.
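The second gap can be illustrated with a small timeline model. This is only a sketch under stated assumptions, not the action's actual logic: `close_day` is a hypothetical helper, the PR is taken to be labelled stale on day 0, and the close timer is assumed to count from the PR's last update. The 7-day defaults mirror the PR thresholds quoted in this description.

```python
def close_day(
    push_day: int,
    remove_pr_stale_when_updated: bool,
    days_before_stale: int = 7,
    days_before_close: int = 7,
) -> int:
    """Day on which a PR is closed if it was labelled stale on day 0,
    received a push on `push_day`, and then saw no further activity.

    Assumption (not taken from the action's source): the close timer
    restarts from the last update while the label is present.
    """
    if remove_pr_stale_when_updated:
        # The push clears the label, so the PR has to age back into
        # staleness before the close countdown can start again.
        restale_day = push_day + days_before_stale
        return restale_day + days_before_close
    # The label survives the push, so only the close countdown applies.
    return push_day + days_before_close


print(close_day(3, remove_pr_stale_when_updated=False))  # → 10
print(close_day(3, remove_pr_stale_when_updated=True))   # → 17
```

Under these assumptions, an author who pushes on day 3 and then waits for review is closed a week after the push with the old setting, but gets a full stale-plus-close window with the new one.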
Pull request overview
This PR updates the scheduled stale workflow to make stale PR handling more reliable by using a higher-rate-limit GitHub App token and allowing updated PRs to clear the stale label.
Changes:
- Adds a `POSTHOG_SCHEDULED_ACTIONS` app-token step and passes it to `actions/stale`.
- Drops write permissions from the workflow `GITHUB_TOKEN`.
- Enables PR unstaling on update and raises `operations-per-run`.
  exempt-pr-labels: 'waiting'
  enable-statistics: true
- operations-per-run: 250
+ operations-per-run: 5000
Problem
Two gaps in the stale-PR workflow that have been silently misbehaving:

1. PRs unstaled by their author were never re-marked. `actions/stale` keeps a per-cycle "already processed this run" set: every issue/PR the action visits, even one it decides isn't stale and takes no action on, is added to the set and skipped for the rest of the cycle. The set only resets when the action gets all the way through the open issue + PR list. With `operations-per-run: 250` and PostHog's volume (~7k open issues + ~200 open PRs), a single cycle was taking weeks to complete. So if an author removed the `stale` label and went idle, the bot would visit the PR once while it was still fresh, stamp it as processed, and not look again until the cycle finally rolled over, by which time the author had usually force-pushed and bumped `updated_at` again. The bot looked broken; it was actually just rate-limited by its own budget. Example: #54280 was unstaled on Apr 22 and sat completely idle for 36 days without ever being re-marked.
2. Pushing commits to a stale PR did not clear the label. `remove-pr-stale-when-updated: false` meant an author actively addressing review feedback would still get auto-closed seven days later. Not the behavior most contributors expect.

Changes

Piggybacks on the existing `POSTHOG_SCHEDULED_ACTIONS` GitHub App that already powers `update-ai-costs`, `browserslist`, and `update-bot-ips`, using the same SHA-pinned `actions/create-github-app-token` step. With the app token providing auth via `repo-token`, the workflow-level `issues: write` / `pull-requests: write` permissions on `GITHUB_TOKEN` are no longer needed and have been dropped (matching the `auto-assign-reviewers` pattern).

The holiday-skip behavior, message text, thresholds (7-day stale / 7-day close on PRs, 730 / 14 on issues), and `waiting` exempt label are all unchanged.

How did you test this code?
I'm Claude (Opus 4.7): agent-authored, no manual run of the workflow.

- `actions/stale` state model and per-cycle skip behavior, checked by reading `src/classes/issues-processor.ts` in `actions/stale@v10.2.0`:
  - `state.addIssueToProcessed(issue)` is called on every visited issue regardless of whether the action mutated it
  - `state.reset()` only fires when the listing call returns no issues (`issues.length <= 0`), i.e. all issues exhausted
  - the staleness check is `!_updatedSince(issue.updated_at, daysBeforeStale)`; there's no exemption based on prior manual label removal, so the only reason re-marking wouldn't happen is the per-cycle skip set
- #54280: marked stale by `github-actions[bot]`, label manually removed the same day, zero activity (no commits, no comments) for 36 days, no re-mark. This fits the cycle-too-long hypothesis exactly
- `POSTHOG_SCHEDULED_ACTIONS` is already used by 3 other scheduled workflows (`update-ai-costs.yml`, `browserslist.yml`, `update-bot-ips.yml`), so it is a semantic fit for a daily cron and avoids the naming mismatch of having stale comments appear under "Assign Reviewers"
- `enable-statistics: true` output should show whether a full cycle now completes in one run (look for the "Statistics" block in the action log)

If 5000 turns out to still be too low, the budget can be bumped further without other changes.
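The state model described above can be sketched as a toy simulation. This is a simplified illustration rather than a port of `issues-processor.ts`: `Item`, `ToyStaleBot`, and `first_stale_day` are hypothetical names, each visited item is assumed to cost exactly one operation (the real action counts API calls), and the 20-item list is made up to keep the numbers small.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Item:
    number: int
    last_updated_day: int
    stale: bool = False


@dataclass
class ToyStaleBot:
    days_before_stale: int
    operations_per_run: int
    # Survives between runs, standing in for the action's persisted state.
    processed: set = field(default_factory=set)

    def run(self, items: list, today: int) -> None:
        remaining = self.operations_per_run
        for item in items:
            if remaining == 0:
                break  # budget exhausted: keep the set for the next run
            if item.number in self.processed:
                continue  # visited earlier this cycle, skipped even if it aged
            remaining -= 1  # toy cost model: one operation per visited item
            if today - item.last_updated_day >= self.days_before_stale:
                item.stale = True
            self.processed.add(item.number)
        if remaining > 0:
            # The whole list was walked with budget to spare: cycle complete.
            self.processed.clear()


def first_stale_day(
    operations_per_run: int,
    n_items: int = 20,
    days_before_stale: int = 7,
    horizon: int = 60,
) -> Optional[int]:
    """Day on which item #1 (last updated on day 0) is first marked stale."""
    items = [Item(number=n, last_updated_day=0) for n in range(1, n_items + 1)]
    bot = ToyStaleBot(days_before_stale, operations_per_run)
    for today in range(1, horizon + 1):  # one scheduled run per day
        bot.run(items, today)
        if items[0].stale:
            return today
    return None


print(first_stale_day(operations_per_run=2))   # → 12
print(first_stale_day(operations_per_run=50))  # → 7
```

With the small budget, item #1 is visited on day 1 while still fresh and is not looked at again until the cycle rolls over, so it is marked five days late. With a budget that covers the whole list, every run is a full cycle and the item is marked on the day it crosses the threshold.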
Publish to changelog?
no
🤖 Agent context
Authored via Claude Code (Opus 4.7) in a session with @webjunkie investigating why the stale bot never re-marked #54280 after a manual `stale` label removal.

Decisions worth flagging:

- The root cause is not `operations-per-run` exhaustion per se. Initial hypothesis was that the action gives up mid-list when budget runs out and loses its place; this was debunked by reading the source: state persists in the actions cache across runs. The real issue is that an issue visited as not stale still consumes its slot in the per-cycle "processed" set, so it isn't reconsidered until the whole list has been walked. With 250 ops/run that walk was taking many weeks. 5000 should comfortably fit one full pass in a single run for current repo size.
- App token instead of `GITHUB_TOKEN`: primary motivation is the 15× higher API rate limit, which the action will actually use when it's no longer artificially capped at 250 ops. The identity change (comments now come from the scheduled-actions app rather than `github-actions[bot]`) is incidental but matches what other scheduled workflows already do.
- The `remove-pr-stale-when-updated` flip is a separate bug from the cycle issue and could ship on its own, but the two changes are scoped narrowly enough together that bundling keeps the PR focused on "make the stale workflow actually behave the way contributors expect".
- A `since`-filtered listing pattern: out of scope and not warranted while the stock action's knobs cover the gap.

Reviewer ask: app-token piggybacking is fine to validate by inspection, since the secret refs and SHA-pinned action match the existing pattern verbatim. The `enable-statistics` output on the first run will tell us whether 5000 is the right number.

Generated by Claude Code