-
-
Notifications
You must be signed in to change notification settings - Fork 4.5k
fix(deletions): Delete seer matched group hash metadata first #102612
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #102612 +/- ##
========================================
Coverage 80.93% 80.93%
========================================
Files 8930 8930
Lines 391174 391191 +17
Branches 24858 24858
========================================
+ Hits 316579 316629 +50
+ Misses 74227 74194 -33
Partials 368 368 |
9fbee5c to
a3951b3
Compare
|
bugbot run |
a3951b3 to
8c99fc1
Compare
|
bugbot run |
| # If we update the columns first, the deletion of the grouphash metadata rows will have less work to do, | ||
| # thus, improving the performance of the deletion. | ||
| if options.get("deletions.group-hashes-metadata.update-seer-matched-grouphash-ids"): | ||
| GroupHashMetadata.objects.filter(seer_matched_grouphash_id__in=hash_ids).update( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what we should have done from the beginning (#101720).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice :)
This will ensure we iteratively delete chunks of Activity rather than failing during Group instance delete. This error will show en masse once the cleanup script runs again. This will start happening since we enabled #102612 today. Fixes [SENTRY-5BYJ](https://sentry.sentry.io/issues/6997944963/).
Issues attributed to commits in this pull requestThis pull request was merged and Sentry observed the following issues: |
When deleting a GroupHash row, all GroupHashMetadata rows pointing to it via `seer_matched_grouphash` need updating (see code): https://github.com/getsentry/sentry/blob/698262018e6009759d8562e2da63be749df7c32d/src/sentry/models/grouphashmetadata.py#L115-L118 Before #101720, we would only delete GroupHash rows and that would time out because we would stomp queries longer than 30 seconds. In #101720 we added the deletion of the GroupHashMetadata rows but we should have also added the updating. The new code will have these three stages: ``` GroupHashMetadata.objects.filter(seer_matched_grouphash_id__in=hash_ids).update(seer_matched_grouphash=None) GroupHashMetadata.objects.filter(grouphash_id__in=hash_ids).delete() GroupHash.objects.filter(id__in=hash_ids).delete() ``` Fixes [SENTRY-5ABJ](https://sentry.sentry.io/issues/6930113529/). For posterity, this is the top of the stack trace: ``` OperationalError canceling statement due to user request SQL: UPDATE "sentry_grouphashmetadata" SET "seer_matched_grouphash_id" = NULL WHERE "sentry_grouphashmetadata"."seer_matched_grouphash_id" IN (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s) ```
This will ensure we iteratively delete chunks of Activity rather than failing during Group instance delete. This error will show en masse once the cleanup script runs again. This will start happening since we enabled #102612 today. Fixes [SENTRY-5BYJ](https://sentry.sentry.io/issues/6997944963/).
This is a follow-up to #102612. Fixes [SENTRY-5C13](https://sentry.sentry.io/issues/7001709353/).
This is a follow-up to #102612. Fixes [SENTRY-5C13](https://sentry.sentry.io/issues/7001709353/). For posterity ``` OperationalError canceling statement due to user request SQL: UPDATE "sentry_grouphashmetadata" SET "seer_matched_grouphash_id" = NULL WHERE "sentry_grouphashmetadata"."seer_matched_grouphash_id" IN (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s) ``` [Link](https://github.com/getsentry/sentry/blob/cdfbaa2cebe0f119104586cbe11da667072d5637/src/sentry/deletions/defaults/group.py#L267-L272) to code: ```python if options.get("deletions.group-hashes-metadata.update-seer-matched-grouphash-ids"): # This is the line where the error comes from GroupHashMetadata.objects.filter( seer_matched_grouphash_id__in=hash_ids ).update(seer_matched_grouphash=None) GroupHashMetadata.objects.filter(grouphash_id__in=hash_ids).delete() GroupHash.objects.filter(id__in=hash_ids).delete() ```
When deleting a GroupHash row, all GroupHashMetadata rows pointing to it via
seer_matched_grouphashneed updating (see code):sentry/src/sentry/models/grouphashmetadata.py
Lines 115 to 118 in 6982620
Before #101720, we would only delete GroupHash rows and that would time out because we would stomp queries longer than 30 seconds. In #101720 we added the deletion of the GroupHashMetadata rows but we should have also added the updating.
The new code will have these three stages:
Fixes SENTRY-5ABJ.
For posterity, this is the top of the stack trace: