feat(deletion): Add partition support to BulkDeleteQuery and cleanup command #107906
Merged
Conversation
Contributor
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Add a `partition` parameter to BulkDeleteQuery that enables splitting
deletion work across multiple runs using modulo-based row bucketing.
When `partition=(bucket, total, key_column)` is provided, the DELETE
query adds `WHERE {key} % {total} = {bucket}`, so each run only handles
a fraction of eligible rows. This allows spreading deletion load across
multiple scheduled jobs to reduce dead tuple bursts and autovacuum
contention on high-churn tables like accounts_spike_projections.
The partition filter is applied in the inner SELECT subquery (where the
LIMIT is), ensuring each partition independently selects and deletes its
own candidate rows.
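For illustration, the partitioned DELETE then has roughly this shape. This is a sketch with hard-coded placeholder values (the table, date filter, and batch size are illustrative, not the actual `BulkDeleteQuery` output):

```python
# Hypothetical shape of the generated statement for bucket 0 of 4
# (placeholder names and values; not the actual BulkDeleteQuery source).
bucket, total, key = 0, 4, "id"
query = f"""
DELETE FROM accounts_spike_projections
WHERE id IN (
    SELECT id FROM accounts_spike_projections
    WHERE valid_date < now() - interval '90 days'
      AND {key} % {total} = {bucket}  -- partition filter lives in the inner SELECT
    LIMIT 1000
)
"""
```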
Force-pushed from c6dba4c to 2e2f9c0
ajay-sentry reviewed Feb 10, 2026
```python
# Parse and validate --partition flag
parsed_partition: tuple[int, int, str] | None = None
if partition is not None:
    parts = partition.split("/")
```
Contributor
A thought here: we could have had 2 separate params for total_buckets and bucket_id or something, to make it a little more straightforward.
ajay-sentry approved these changes Feb 10, 2026
ajay-sentry (Contributor) left a comment
Looks good. I see partition_key isn't used on the cron, but it makes sense if we want to easily update later.
…cleanup command
Expose BulkDeleteQuery's partition support via the `sentry cleanup` CLI:
--partition-bucket BUCKET (0-based bucket index)
--partition-total TOTAL (total number of buckets)
--partition-key COLUMN (default: id)
This allows K8s CronJobs to split bulk deletion work across multiple
scheduled runs. For example, the spikeprotections cleanup can be split
into 4 jobs at 6-hour intervals, each handling ~25% of eligible rows
via `id % 4 = {0,1,2,3}`.
Includes input validation: both flags must be used together, bucket
must be non-negative and less than total, and total must be positive.
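A minimal sketch of how these flags and the validation rules could be wired up with Click (option names come from the commit message; the function body is illustrative, not the actual cleanup.py code):

```python
import click

@click.command()
@click.option("--partition-bucket", type=int, default=None,
              help="0-based bucket index; requires --partition-total.")
@click.option("--partition-total", type=int, default=None,
              help="Total number of buckets; requires --partition-bucket.")
@click.option("--partition-key", default="id",
              help="Column used for modulo bucketing.")
def cleanup(partition_bucket, partition_total, partition_key):
    # Both flags must be used together.
    if (partition_bucket is None) != (partition_total is None):
        raise click.UsageError(
            "--partition-bucket and --partition-total must be used together"
        )
    partition = None
    if partition_total is not None:
        # total must be positive; bucket must be in [0, total).
        if partition_total <= 0:
            raise click.UsageError("--partition-total must be positive")
        if not 0 <= partition_bucket < partition_total:
            raise click.UsageError(
                "--partition-bucket must be non-negative and less than --partition-total"
            )
        partition = (partition_bucket, partition_total, partition_key)
    # `partition` would then be passed down to BulkDeleteQuery(...).
```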
Force-pushed from 2e2f9c0 to 4191754
jaydgoss pushed a commit that referenced this pull request on Feb 12, 2026
…command (#107906)
Summary

- Add a `partition` parameter to `BulkDeleteQuery` to split deletion work across multiple runs using modulo-based row bucketing
- Add `--partition-bucket`, `--partition-total`, and `--partition-key` flags to the `sentry cleanup` CLI command
- Fully backward compatible: behavior is unchanged when partition flags are not provided

Problem

The daily `sentry cleanup` CronJob for the `SpikeProjections` and `Spike` models runs a tight DELETE loop via `BulkDeleteQuery._continuous_query()`, creating a burst of dead tuples on `db-usage-1`. This triggers a massive autovacuum that causes WAL replication delay, forcing `db-usage-repl` to fall back to GSCP recovery.

Since `valid_date` values are always at midnight UTC, simply running the CronJob more frequently doesn't help: all eligible rows become deletable at the same instant. We need a way to partition the rows across multiple runs.

Solution

Add `id % N` partitioning to `BulkDeleteQuery`:

```
sentry cleanup --model=SpikeProjections --days=90 --partition-bucket=0 --partition-total=4
sentry cleanup --model=SpikeProjections --days=90 --partition-bucket=1 --partition-total=4
sentry cleanup --model=SpikeProjections --days=90 --partition-bucket=2 --partition-total=4
sentry cleanup --model=SpikeProjections --days=90 --partition-bucket=3 --partition-total=4
```

Each run adds `WHERE id % 4 = {bucket}` to the DELETE query, handling ~25% of eligible rows. The `--partition-key` flag allows using a different column (defaults to `id`).

Using `id` (auto-increment) ensures uniform distribution, following the same principle as PR getsentry/getsentry#18736, which switched spike projection batching from `organization_id` (snowflake, uneven) to `subscription.id` (auto-increment, uniform).
Changes

`src/sentry/db/deletion.py`:

- Added a `partition: tuple[int, int, str] | None` parameter to `BulkDeleteQuery.__init__()`
- Added the partition filter to the WHERE clause in `execute()`
- Added the partition filter to `iterator()` via `Func(F(key), Value(total), function="MOD")` (see the ORM sketch below)

`src/sentry/runner/commands/cleanup.py`:

- Added the `--partition-bucket` CLI flag (integer, 0-based bucket index)
- Added the `--partition-total` CLI flag (integer, total number of buckets)
- Added the `--partition-key` CLI flag (default: `id`)
- Validation: bucket and total must be used together, bucket must be non-negative and less than total, and total must be positive
- Threaded through `cleanup()` → `_cleanup()` → `run_bulk_query_deletes()` → `BulkDeleteQuery()`
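The `iterator()` change suggests an ORM expression of roughly this shape. A minimal sketch assuming a Django queryset and hypothetical names (`qs`, `apply_partition`); the actual `BulkDeleteQuery` internals may differ:

```python
from django.db.models import F, Func, Value

def apply_partition(qs, bucket: int, total: int, key: str = "id"):
    # Annotate each row with MOD(key, total), then keep only the rows whose
    # computed bucket matches: the ORM equivalent of WHERE key % total = bucket.
    return qs.annotate(
        _partition=Func(F(key), Value(total), function="MOD")
    ).filter(_partition=bucket)
```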
Test plan

- [x] `test_partition_restriction`: verifies only rows in the matching bucket are deleted
- [x] `test_partition_with_datetime_restriction`: combines partition + date filter
- [x] `test_partition_all_buckets_cover_all_rows`: verifies complete coverage across all buckets
- [x] `test_iteration_with_partition`: verifies `iterator()` respects the partition filter
- [x] `test_partition_bucket_exceeds_total`: validation error for bucket >= total
- [x] `test_partition_negative_bucket`: validation error for a negative bucket
- [x] `test_partition_zero_total`: validation error for zero total
- [x] `test_partition_bucket_without_total`: validation error when only bucket is set
- [x] `test_partition_total_without_bucket`: validation error when only total is set

Related

- Ops PR (K8s config): getsentry/ops#19081
- Analysis: daily mass deletion on `accounts_spike_projections` causes replication delay on `db-usage-1`