fix(webhook): don't retain failed PR metadata jobs#118

Merged
entrius merged 1 commit into test from fix/metadata-job-remove-on-fail on May 20, 2026

Conversation

@anderdc (Collaborator) commented May 20, 2026

Summary

  • A fetch-pr-metadata job uses a stable custom jobId per PR (meta-<repo>-<pr>). BullMQ ignores add() when a job with that id already exists in any state, including jobs retained in the failed set.
  • With removeOnFail: 50, a metadata job that exhausted its 3 retries during a transient GitHub outage sat in the failed set and blocked every later edited / closed / reopened / synchronize webhook for the same PR until the global 50-slot cap evicted it. In the meantime, body, last_edited_at, closing_issue_numbers, and downstream issues.solved_by_pr went stale.
  • Set removeOnFail to true at the two metadata enqueue sites (webhook handler + backfill follow-up) so failed jobs are evicted immediately and the next webhook gets a fresh fetch. Failure detail remains in service logs.

Fixes #75. PR_FILES enqueues are unaffected — their jobId already varies by head/base SHA, so a failed entry can't squat across pushes.
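To make the failure mode concrete, here is a minimal sketch of the dedup behaviour described in the summary. It is a toy in-memory model, not BullMQ itself and not code from this repository: `ToyQueue`, `metadataJobId`, `simulate`, and the `example-repo` / `42` values are all hypothetical names chosen for illustration. The model also deliberately ignores the count-based eviction that a numeric `removeOnFail` eventually performs; it only shows why a retained failed job blocks the next `add()` with the same id.

```typescript
// Toy in-memory model (NOT BullMQ): add() is ignored while the id exists.
type RemoveOnFail = boolean | number;

interface ToyJob {
  id: string;
  state: "waiting" | "failed";
  removeOnFail: RemoveOnFail;
}

class ToyQueue {
  private jobs = new Map<string, ToyJob>();

  // Returns true if the job was enqueued, false if the id was already taken.
  add(id: string, removeOnFail: RemoveOnFail): boolean {
    if (this.jobs.has(id)) return false;
    this.jobs.set(id, { id, state: "waiting", removeOnFail });
    return true;
  }

  // Simulates a job exhausting all of its retries.
  fail(id: string): void {
    const job = this.jobs.get(id);
    if (!job) return;
    if (job.removeOnFail === true) {
      this.jobs.delete(id); // evicted immediately, so the id is free again
    } else {
      job.state = "failed"; // retained, so the id stays occupied
    }
  }
}

// Hypothetical helper following the meta-<repo>-<pr> pattern from the summary.
function metadataJobId(repo: string, pr: number): string {
  return `meta-${repo}-${pr}`;
}

// Replays: first webhook, outage exhausts retries, later webhook for the same PR.
function simulate(removeOnFail: RemoveOnFail): boolean {
  const queue = new ToyQueue();
  const id = metadataJobId("example-repo", 42);
  queue.add(id, removeOnFail); // first webhook enqueues the fetch
  queue.fail(id); // transient outage, retries exhausted
  return queue.add(id, removeOnFail); // later `edited` webhook tries again
}

console.log(simulate(50)); // false: the retained failed job blocks the refresh
console.log(simulate(true)); // true: the failed job was evicted, refresh is enqueued
```

The same reasoning suggests why PR_FILES enqueues are unaffected: if the id changes with every push, a retained failed entry never collides with the next `add()`.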

Test plan

  • npm run format:check
  • npm run lint
  • npm run build

A `fetch-pr-metadata` job uses a stable custom jobId per PR
(`meta-<repo>-<pr>`). BullMQ ignores `add()` when a job with that id
already exists in any state, including the failed-retention set. With
`removeOnFail: 50`, a metadata job that exhausted its 3 retries during a
transient GitHub outage sat in the failed set and blocked every later
`edited`/`closed`/`reopened`/`synchronize` webhook for the same PR until
the global 50-slot cap evicted it. In the meantime, `body`,
`last_edited_at`, `closing_issue_numbers`, and downstream
`issues.solved_by_pr` went stale.

Set `removeOnFail` to `true` for these enqueues so failed jobs are
evicted immediately and the next webhook gets a fresh fetch. Failure
detail remains in service logs. Fixes #75.
@entrius entrius merged commit 437fddf into test May 20, 2026
2 checks passed
@entrius entrius deleted the fix/metadata-job-remove-on-fail branch May 20, 2026 23:35

Development

Successfully merging this pull request may close these issues.

[Bug] Failed PR metadata jobs can block future metadata refreshes
