@armenzg armenzg commented Nov 3, 2025

When deleting a GroupHash row, all GroupHashMetadata rows pointing to it via seer_matched_grouphash need updating (see code):

# The `GroupHash` record representing the match Seer sent back as a match (if any)
seer_matched_grouphash = FlexibleForeignKey(
    "sentry.GroupHash", related_name="seer_matchees", on_delete=models.SET_NULL, null=True
)

Before #101720, we deleted only GroupHash rows, and that timed out because queries running longer than 30 seconds get stomped (cancelled). In #101720 we added the deletion of the GroupHashMetadata rows, but we should also have added the update of seer_matched_grouphash.

The new code will have these three stages:

GroupHashMetadata.objects.filter(seer_matched_grouphash_id__in=hash_ids).update(seer_matched_grouphash=None)
GroupHashMetadata.objects.filter(grouphash_id__in=hash_ids).delete()
GroupHash.objects.filter(id__in=hash_ids).delete()
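
To make the ordering concrete, here is a minimal sketch of the same three stages against an in-memory SQLite database. It is an illustration only, not Sentry's code: the `grouphash` and `grouphashmetadata` tables are simplified stand-ins, `delete_group_hashes` is a hypothetical helper, and the foreign keys are declared without any `ON DELETE` action so that the database itself rejects a delete while references remain.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript(
    """
    CREATE TABLE grouphash (id INTEGER PRIMARY KEY);
    CREATE TABLE grouphashmetadata (
        id INTEGER PRIMARY KEY,
        grouphash_id INTEGER NOT NULL REFERENCES grouphash (id),
        seer_matched_grouphash_id INTEGER REFERENCES grouphash (id)
    );
    INSERT INTO grouphash (id) VALUES (1), (2), (3);
    -- Rows 20 and 30 belong to other hashes but point at hash 1 as their Seer match.
    INSERT INTO grouphashmetadata (id, grouphash_id, seer_matched_grouphash_id)
    VALUES (10, 1, NULL), (20, 2, 1), (30, 3, 1);
    """
)


def delete_group_hashes(conn: sqlite3.Connection, hash_ids: list[int]) -> None:
    """Hypothetical helper mirroring the three stages described above."""
    marks = ", ".join("?" for _ in hash_ids)
    # Stage 1: clear the Seer-match pointers held by metadata rows of *other* hashes.
    conn.execute(
        "UPDATE grouphashmetadata SET seer_matched_grouphash_id = NULL "
        f"WHERE seer_matched_grouphash_id IN ({marks})",
        hash_ids,
    )
    # Stage 2: delete the metadata rows owned by the hashes being removed.
    conn.execute(f"DELETE FROM grouphashmetadata WHERE grouphash_id IN ({marks})", hash_ids)
    # Stage 3: nothing references the hashes any more, so they can be deleted.
    conn.execute(f"DELETE FROM grouphash WHERE id IN ({marks})", hash_ids)
    conn.commit()


delete_group_hashes(conn, [1])
remaining = conn.execute(
    "SELECT id, grouphash_id, seer_matched_grouphash_id FROM grouphashmetadata ORDER BY id"
).fetchall()
print(remaining)
```

In this sketch the metadata rows for hashes 2 and 3 survive with their match pointer cleared, while the metadata row owned by hash 1 is removed together with the hash.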

Fixes SENTRY-5ABJ.

For posterity, this is the top of the stack trace:

OperationalError
canceling statement due to user request

SQL: UPDATE "sentry_grouphashmetadata" SET "seer_matched_grouphash_id" = NULL WHERE "sentry_grouphashmetadata"."seer_matched_grouphash_id" IN (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)

@armenzg armenzg self-assigned this Nov 3, 2025
@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Nov 3, 2025
cursor[bot]

This comment was marked as outdated.

codecov bot commented Nov 3, 2025

Codecov Report

❌ Patch coverage is 66.66667% with 1 line in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/sentry/deletions/defaults/group.py | 50.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@           Coverage Diff            @@
##           master   #102612   +/-   ##
========================================
  Coverage   80.93%    80.93%           
========================================
  Files        8930      8930           
  Lines      391174    391191   +17     
  Branches    24858     24858           
========================================
+ Hits       316579    316629   +50     
+ Misses      74227     74194   -33     
  Partials      368       368           

@armenzg armenzg force-pushed the 0/fix/group_hash_metadata/cleanup/armenzg branch 2 times, most recently from 9fbee5c to a3951b3 Compare November 3, 2025 20:52
armenzg commented Nov 3, 2025

bugbot run

@armenzg armenzg changed the title from "fix(deletions): A simpler approach to deleting group hashes" to "fix(deletions): Delete seer matched group hash metadata first" Nov 3, 2025
@armenzg armenzg force-pushed the 0/fix/group_hash_metadata/cleanup/armenzg branch from a3951b3 to 8c99fc1 Compare November 4, 2025 18:48
armenzg commented Nov 4, 2025

bugbot run

# If we update the columns first, the deletion of the grouphash metadata rows will have less work to do,
# thus, improving the performance of the deletion.
if options.get("deletions.group-hashes-metadata.update-seer-matched-grouphash-ids"):
    GroupHashMetadata.objects.filter(seer_matched_grouphash_id__in=hash_ids).update(
This is what we should have done from the beginning (#101720).

@armenzg armenzg marked this pull request as ready for review November 4, 2025 19:06
@armenzg armenzg requested a review from a team as a code owner November 4, 2025 19:06
@yuvmen yuvmen left a comment

Nice :)

@armenzg armenzg merged commit 4582533 into master Nov 5, 2025
69 checks passed
@armenzg armenzg deleted the 0/fix/group_hash_metadata/cleanup/armenzg branch November 5, 2025 12:26
armenzg added a commit that referenced this pull request Nov 5, 2025
This will ensure we iteratively delete chunks of Activity rather than failing during Group instance delete.

This error will show en masse once the cleanup script runs again.
This will start happening since we enabled #102612 today.

Fixes [SENTRY-5BYJ](https://sentry.sentry.io/issues/6997944963/).
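
The idea of iterating over chunks instead of issuing one large delete can be sketched as follows. This is a minimal illustration and not the code from the referenced commit: `chunked`, `delete_in_chunks`, and `fake_delete` are hypothetical names, and the batch size of 100 is an assumption chosen for the example.

```python
from collections.abc import Callable, Iterator, Sequence


def chunked(ids: Sequence[int], size: int) -> Iterator[Sequence[int]]:
    """Yield consecutive slices of `ids`, each holding at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(ids), size):
        yield ids[start : start + size]


def delete_in_chunks(
    ids: Sequence[int],
    delete_batch: Callable[[Sequence[int]], int],
    size: int = 100,
) -> int:
    """Call `delete_batch` once per chunk and return the total number of rows it reports."""
    deleted = 0
    for batch in chunked(ids, size):
        # Each call stays small, so no single statement has to cover every id.
        deleted += delete_batch(batch)
    return deleted


# Stand-in for a real delete; it only records the batches it was handed.
batches: list[list[int]] = []


def fake_delete(batch: Sequence[int]) -> int:
    batches.append(list(batch))
    return len(batch)


total = delete_in_chunks(list(range(250)), fake_delete, size=100)
print(total, [len(b) for b in batches])
```

In a real deletion task, `delete_batch` would wrap the actual query for one batch of ids; keeping each batch bounded is what lets an individual statement finish before a query timeout applies.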
sentry bot commented Nov 6, 2025

Issues attributed to commits in this pull request

This pull request was merged and Sentry observed the following issues:

priscilawebdev pushed a commit that referenced this pull request Nov 6, 2025
When deleting a GroupHash row, all GroupHashMetadata rows pointing to it
via `seer_matched_grouphash` need updating (see code):

https://github.com/getsentry/sentry/blob/698262018e6009759d8562e2da63be749df7c32d/src/sentry/models/grouphashmetadata.py#L115-L118

Before #101720, we would only delete GroupHash rows and that would time
out because we would stomp queries longer than 30 seconds. In #101720 we
added the deletion of the GroupHashMetadata rows but we should have also
added the updating.

The new code will have these three stages:
```
GroupHashMetadata.objects.filter(seer_matched_grouphash_id__in=hash_ids).update(seer_matched_grouphash=None)
GroupHashMetadata.objects.filter(grouphash_id__in=hash_ids).delete()
GroupHash.objects.filter(id__in=hash_ids).delete()
```

Fixes [SENTRY-5ABJ](https://sentry.sentry.io/issues/6930113529/).

For posterity, this is the top of the stack trace:
```
OperationalError
canceling statement due to user request

SQL: UPDATE "sentry_grouphashmetadata" SET "seer_matched_grouphash_id" = NULL WHERE "sentry_grouphashmetadata"."seer_matched_grouphash_id" IN (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)
```
armenzg added a commit that referenced this pull request Nov 6, 2025
armenzg added a commit that referenced this pull request Nov 6, 2025
This is a follow-up to #102612.

Fixes [SENTRY-5C13](https://sentry.sentry.io/issues/7001709353/).

For posterity, this is the top of the stack trace:
```
OperationalError
canceling statement due to user request

SQL: UPDATE "sentry_grouphashmetadata" SET "seer_matched_grouphash_id" = NULL WHERE "sentry_grouphashmetadata"."seer_matched_grouphash_id" IN (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)
```

[Link](https://github.com/getsentry/sentry/blob/cdfbaa2cebe0f119104586cbe11da667072d5637/src/sentry/deletions/defaults/group.py#L267-L272)
to code:
```python
if options.get("deletions.group-hashes-metadata.update-seer-matched-grouphash-ids"):
    # This is the line where the error comes from
    GroupHashMetadata.objects.filter(
        seer_matched_grouphash_id__in=hash_ids
    ).update(seer_matched_grouphash=None)
GroupHashMetadata.objects.filter(grouphash_id__in=hash_ids).delete()
GroupHash.objects.filter(id__in=hash_ids).delete()
```