Skip to content

ref(grouping): Clear old and invalid secondary grouping config options#114378

Merged
lobsterkatie merged 4 commits intomasterfrom
kmclb-clear-old-and-invalid-secondary-grouping-config-options
Apr 29, 2026
Merged

ref(grouping): Clear old and invalid secondary grouping config options#114378
lobsterkatie merged 4 commits intomasterfrom
kmclb-clear-old-and-invalid-secondary-grouping-config-options

Conversation

@lobsterkatie
Copy link
Copy Markdown
Member

@lobsterkatie lobsterkatie commented Apr 29, 2026

Right now, we store an expiry date for each project's grouping config transition period, but we never go back and clean out old/expired transition settings once that period has ended. We also don't make any effort to fix invalid data if/when we find it.

Other than wasting space in the database, this normally isn't an issue, unless the corrupt data leads our assumptions to be broken and our code to therefore fail, as happened recently when a project somehow got the same value set for its default and secondary grouping configs. (In that case, the fact that we discard new secondary hashes led us to also discard new primary hashes, breaking grouping.)

To prevent future problems, and to clean up after ourselves, this PR therefore adds a new helper, _clean_up_expired_config_options (which, as the name suggests, deletes a project's secondary config and secondary config expiry ProjectOption records), and calls it when necessary in the spot where we're already checking the relevant values, namely in the existing is_in_transition function. Like grouping config upgrades, these database calls are guarded by both a cache key and a lock in order to prevent race conditions. It also turns out there were no tests specifically covering the transition check, so this adds them, and includes a check of the clean-up work as part of what they test.

@github-actions github-actions Bot added the Scope: Backend Automatically applied to PRs that change backend components label Apr 29, 2026
@lobsterkatie lobsterkatie changed the title ref(grouping): Cear old and invalid secondary grouping config options ref(grouping): Clear old and invalid secondary grouping config options Apr 29, 2026
@getsentry getsentry deleted a comment from github-actions Bot Apr 29, 2026
@lobsterkatie lobsterkatie marked this pull request as ready for review April 29, 2026 22:06
@lobsterkatie lobsterkatie requested a review from a team as a code owner April 29, 2026 22:06
Comment thread src/sentry/grouping/ingest/config.py
@lobsterkatie lobsterkatie force-pushed the kmclb-clear-old-and-invalid-secondary-grouping-config-options branch from 30f6125 to fb0ac8a Compare April 29, 2026 22:20
Copy link
Copy Markdown
Member

@yuvmen yuvmen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, just a small question

Comment thread src/sentry/grouping/ingest/config.py
@lobsterkatie lobsterkatie merged commit 6d79887 into master Apr 29, 2026
80 checks passed
@lobsterkatie lobsterkatie deleted the kmclb-clear-old-and-invalid-secondary-grouping-config-options branch April 29, 2026 22:42
cleptric pushed a commit that referenced this pull request May 5, 2026
#114378)

Right now, we store an expiry date for each project's grouping config transition period, but we never go back and clean out old/expired transition settings once that period has ended. We also don't make any effort to fix invalid data if/when we find it.

Other than wasting space in the database, this normally isn't an issue, unless the corrupt data leads our assumptions to be broken and our code to therefore fail, as happened recently when a project somehow got the same value set for its default and secondary grouping configs. (In that case, the fact that we discard new secondary hashes led us to also discard new primary hashes, breaking grouping[1].)

To prevent future problems, and to clean up after ourselves, this PR therefore adds a new helper, `_clean_up_expired_config_options` (which, as the name suggests, deletes a project's secondary config and secondary config expiry `ProjectOption` records), and calls it when necessary in the spot where we're already checking the relevant values, namely in the existing `is_in_transition` function. Like grouping config upgrades, these database calls are guarded by both a cache key and a lock in order to prevent race conditions. It also turns out there were no tests specifically covering the transition check, so this adds them, and includes a check of the clean-up work as part of what they test.


[1] https://sentry.sentry.io/issues/7368625348
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Scope: Backend Automatically applied to PRs that change backend components

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants