Batch webhook delivery cleanup to prevent DB lock contention #2555
Follow-up to #2550, which added a created_at index and increased cleanup frequency. The delete itself was still unbatched and vulnerable whenever a backlog accumulates (first run after deploy, clock skew, etc.). Delete in small batches with a pause between each to let other transactions through, following the SolidQueue pattern.
Pull request overview
This PR mitigates database lock contention caused by unbatched deletion of stale Webhook::Delivery rows by switching cleanup to a batched delete with a brief pause between batches, aligning with the operational intent described in the follow-up to #2550.
Changes:
- Update `Webhook::Delivery.cleanup` to delete stale rows in batches (default 500).
- Add a configurable pause (default 100ms) between delete batches to reduce sustained lock pressure.
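The batching loop described above could look roughly like the sketch below. It is not the code from this PR: `delete_in_batches`, the in-memory `rows` array, and the `cutoff` value are hypothetical stand-ins so the example runs without Rails or a database. In a Rails app the block would presumably issue one bounded delete, for example a `limit(batch_size).delete_all` on the stale-row scope, which is how SolidQueue's `clear_finished_in_batches` works.

```ruby
# Hypothetical helper (not from the PR): runs one bounded delete per iteration
# until the backlog is drained, pausing between batches so that other
# transactions get a chance to acquire their locks.
# The block receives the batch size, must delete at most that many rows,
# and must return the number of rows it deleted.
def delete_in_batches(batch_size: 500, pause: 0.1)
  total = 0
  loop do
    deleted = yield(batch_size)
    total += deleted
    break if deleted < batch_size # a short batch means nothing stale is left
    sleep(pause) if pause.positive?
  end
  total
end

# In-memory stand-in for the deliveries table (assumption, for illustration).
cutoff = Time.now - (7 * 24 * 3600) # hypothetical retention window
stale  = Array.new(1200) { |i| { id: i, created_at: cutoff - (i + 1) } }
fresh  = Array.new(50)   { |i| { id: 1200 + i, created_at: cutoff + (i + 1) } }
rows   = stale + fresh

batches = 0
total = delete_in_batches(batch_size: 500, pause: 0) do |limit|
  batches += 1
  # Pick at most `limit` stale rows, then remove exactly those rows.
  stale_ids = rows.select { |r| r[:created_at] < cutoff }.first(limit).map { |r| r[:id] }
  rows.reject! { |r| stale_ids.include?(r[:id]) }
  stale_ids.size
end

puts "deleted #{total} rows in #{batches} batches, #{rows.size} rows kept"
```

Each iteration holds locks only for the duration of one small delete, so the total work is unchanged but no single statement blocks writers for long. The pause is set to 0 here only to keep the example fast; the PR's default is 100ms.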
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a51dc48b76
Summary
Follow-up to #2550. The `created_at` index and increased frequency reduced per-run volume, but the delete itself was still unbatched and therefore vulnerable whenever a backlog accumulates (first run after deploy, clock skew, etc.). On 2026-02-16 at 22:43 UTC this held row locks long enough to exhaust the Puma thread pool and fail Cloudflare health checks.
Now deletes in batches of 500 with a 100ms pause between each, following the SolidQueue `clear_finished_in_batches` pattern. This lets queued INSERT/UPDATE transactions acquire their locks between batches.
Test plan
- `Webhook::Delivery` test coverage passes unchanged; batching is a performance detail, not a correctness change