@armenzg armenzg commented Oct 17, 2025

The issue comes from this block:

try:
    if seer_deletion:
        # Tell seer to delete grouping records for these groups
        # It's low priority to delete the hashes from seer, so we don't want
        # any network errors to block the deletion of the groups
        hash_values = [gh[1] for gh in hashes_chunk]
        may_schedule_task_to_delete_hashes_from_seer(project_id, hash_values)
except Exception:
    logger.warning("Error scheduling task to delete hashes from seer")
finally:
    hash_ids = [gh[0] for gh in hashes_chunk]
    GroupHash.objects.filter(id__in=hash_ids).delete()

The update is triggered because of this on_delete:

seer_matched_grouphash = FlexibleForeignKey(
    "sentry.GroupHash", related_name="seer_matchees", on_delete=models.SET_NULL, null=True
)

Currently, when we try to delete all the group hashes, we update the related group hash metadata first. This query ends up failing because it takes longer than 30 seconds:

SQL: UPDATE "sentry_grouphashmetadata" SET "seer_matched_grouphash_id" = NULL WHERE "sentry_grouphashmetadata"."seer_matched_grouphash_id" IN (%s, ..., %s)

This can be resolved by deleting the group hash metadata rows before trying to delete the group hash rows. This will avoid the update statement altogether.
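
The effect of the reordering can be illustrated outside of Django with a small sketch. The two-table schema, the grouphash_id column, and every helper name below are assumptions made for illustration. The statements are issued by hand to mimic the SET_NULL handling described above and are not taken from Sentry's code.

```python
import sqlite3


def make_db():
    """Build a toy schema; the table layout is an assumption, not Sentry's real one."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE grouphash (id INTEGER PRIMARY KEY, hash TEXT);
        CREATE TABLE grouphashmetadata (
            id INTEGER PRIMARY KEY,
            grouphash_id INTEGER NOT NULL,
            seer_matched_grouphash_id INTEGER
        );
        """
    )
    conn.executemany("INSERT INTO grouphash VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])
    # Metadata rows 2 and 3 record that seer matched them to group hash 1.
    conn.executemany(
        "INSERT INTO grouphashmetadata VALUES (?, ?, ?)",
        [(1, 1, None), (2, 2, 1), (3, 3, 1)],
    )
    return conn


def null_out_matches(conn, hash_ids):
    """Emulate SET_NULL by hand and return how many rows the UPDATE touched."""
    marks = ",".join("?" * len(hash_ids))
    cur = conn.execute(
        "UPDATE grouphashmetadata SET seer_matched_grouphash_id = NULL "
        f"WHERE seer_matched_grouphash_id IN ({marks})",
        hash_ids,
    )
    return cur.rowcount


def delete_rows(conn, table, column, hash_ids):
    """Delete every row of `table` whose `column` is in `hash_ids`."""
    marks = ",".join("?" * len(hash_ids))
    conn.execute(f"DELETE FROM {table} WHERE {column} IN ({marks})", hash_ids)


def delete_hashes(conn, hash_ids, metadata_first):
    """Delete group hashes; return the number of rows the SET NULL update touched."""
    if metadata_first:
        # New order: remove the metadata rows before anything can reference them.
        delete_rows(conn, "grouphashmetadata", "grouphash_id", hash_ids)
    updated = null_out_matches(conn, hash_ids)
    delete_rows(conn, "grouphashmetadata", "grouphash_id", hash_ids)
    delete_rows(conn, "grouphash", "id", hash_ids)
    return updated


def remaining(conn):
    """Return (group hash count, metadata count) left in the toy database."""
    tables = ("grouphash", "grouphashmetadata")
    return tuple(conn.execute(f"SELECT COUNT(*) FROM {t}").fetchone()[0] for t in tables)


old_updates = delete_hashes(make_db(), [1, 2, 3], metadata_first=False)
new_updates = delete_hashes(make_db(), [1, 2, 3], metadata_first=True)
print(old_updates, new_updates)
```

In this sketch the UPDATE is still executed in the metadata-first path, but it finds no rows to touch. Both paths leave the two tables empty.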

This fix was initially started in #101545; however, the solution has completely changed, so I am starting a new PR.

Fixes [SENTRY-5ABJ](https://sentry.io/organizations/sentry/issues/6930113529/).
@armenzg armenzg self-assigned this Oct 17, 2025
@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Oct 17, 2025
cursor[bot]

This comment was marked as outdated.


iterations += 1

if iterations == GROUP_HASH_ITERATIONS:
Member Author

Drive-by metric.

    GroupHashMetadata.objects.filter(grouphash_id__in=hash_ids).delete()
except Exception:
    # XXX: Let's make sure that no issues are caused by this and then remove it
    logger.exception("Error deleting group hash metadata")
Member Author

Once I enable the option, I would like to know if any problems are caused by this while falling back to the original behaviour rather than completely aborting the process.
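
The fallback described in this comment could look roughly like the sketch below. The function name delete_chunk, the injected callables, and the boolean metadata_first (standing in for the option) are hypothetical; this is a minimal illustration of "log and fall back", not the code in this PR.

```python
import logging

logger = logging.getLogger("deletions.sketch")


def delete_chunk(hash_ids, delete_metadata, delete_hashes, metadata_first):
    """Delete one chunk; a failure in the metadata step is logged, never fatal."""
    if metadata_first:
        try:
            delete_metadata(hash_ids)
        except Exception:
            # Report the problem, then fall through to the original behaviour,
            # where deleting the hashes takes care of the metadata itself.
            logger.exception("Error deleting group hash metadata")
    delete_hashes(hash_ids)


calls = []


def failing_metadata_delete(hash_ids):
    """Stand-in for a metadata delete that times out."""
    raise RuntimeError("simulated timeout")


# The hashes are still deleted even though the metadata step failed.
delete_chunk([1, 2], failing_metadata_delete, calls.append, metadata_first=True)
```

Because the exception is logged with logger.exception, the traceback stays visible while the deletion continues.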


- __repr__ = sane_repr("group_id", "hash")
+ __repr__ = sane_repr("group_id", "hash", "metadata")
__str__ = __repr__
Member Author

It makes print statements during debugging actually useful.

register(
"deletions.group-hashes-batch-size",
- default=10000,
+ default=100,
flags=FLAG_AUTOMATOR_MODIFIABLE,
)
register(
"deletions.group.delete_group_hashes_metadata_first",
Member Author

This will control the new behaviour.

assert grouphash_a.metadata is not None
assert grouphash_a.metadata.seer_matched_grouphash is None
assert grouphash_b.metadata is not None
assert grouphash_b.metadata.seer_matched_grouphash == grouphash_a
Member Author

This verifies that the column is set to the first group hash. It does not verify that we end up running the update statement; however, I want to confirm that we're testing the same code path.

"""
Test that when deleting group hashes, the group hash metadata is deleted first (which will not update the references to the other group hashes)
"""
with self.options({"deletions.group.delete_group_hashes_metadata_first": True}):
Member Author

The two tests are functionally the same.
This test avoids the update call.

armenzg commented Oct 17, 2025

bugbot run

cursor[bot]

This comment was marked as outdated.

# gh B -> ghm B -> gh C
# gh C -> ghm C -> gh A
#
# Deleting group hashes A, B & C (since they all point to the same group) will require:
Member Author

This function is called like this: delete_group_hashes(project_id, group_ids); thus, all group hashes point to the same group.

# Deleting group hashes A, B & C (since they all point to the same group) will require:
# * Updating columns ghmB & ghmC to point to None
# * Deleting the group hash metadata rows
# * Deleting the group hashes
Member Author

Before this PR, the approach was:

  • Start group hashes deletion
    • Which triggers updating the group hash metadata column
    • Which then triggers deleting the group hash metadata rows
  • Now that the child group hash metadata rows are deleted, we can delete the group hashes

@armenzg armenzg marked this pull request as ready for review October 17, 2025 14:15
@armenzg armenzg requested review from a team as code owners October 17, 2025 14:15
@armenzg armenzg requested a review from markstory October 17, 2025 14:15

codecov bot commented Oct 17, 2025

Codecov Report

❌ Patch coverage is 73.33333% with 4 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines | Patch % | Lines
src/sentry/deletions/defaults/group.py | 60.00% | 4 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master   #101720      +/-   ##
===========================================
+ Coverage   80.85%    80.95%   +0.09%     
===========================================
  Files        8707      8707              
  Lines      387091    387104      +13     
  Branches    24524     24524              
===========================================
+ Hits       313000    313393     +393     
+ Misses      73743     73363     -380     
  Partials      348       348              

# Verify that seer matched event_b to event_a's hash
assert event_a.group_id == event_b.group_id
# Make sure it has not changed
assert grouphash_a.metadata is not None
Member

The Django ORM record won't change even if the underlying db record was changed. You could add grouphash_a.refresh_from_db() to ensure that you have the latest data from Postgres.
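
The staleness can be shown with a framework-free analogy. The sketch below uses sqlite3 and a plain dataclass rather than Django; MetadataRow and load are hypothetical names, and the second call to load plays the role that refresh_from_db() plays for a model instance.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class MetadataRow:
    """In-memory copy of one row; it does not track later database changes."""
    id: int
    seer_matched_grouphash_id: Optional[int]


def load(conn, row_id):
    """Read the current database state of one row into a fresh object."""
    row = conn.execute(
        "SELECT id, seer_matched_grouphash_id FROM grouphashmetadata WHERE id = ?",
        (row_id,),
    ).fetchone()
    return MetadataRow(*row)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE grouphashmetadata (id INTEGER PRIMARY KEY, seer_matched_grouphash_id INTEGER)"
)
conn.execute("INSERT INTO grouphashmetadata VALUES (1, 42)")

obj = load(conn, 1)
# Another code path changes the row behind the object's back.
conn.execute("UPDATE grouphashmetadata SET seer_matched_grouphash_id = NULL WHERE id = 1")

stale_value = obj.seer_matched_grouphash_id  # the object still holds the old value
obj = load(conn, 1)  # reload from the database
fresh_value = obj.seer_matched_grouphash_id  # now reflects the update
```

An assertion made on the object before the reload checks the old in-memory value, so it would pass regardless of what happened in the database.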

Member Author

I forgot to push that change. I have it locally. I will make it part of my next PR.

Member Author

I added this in #101796

@armenzg armenzg merged commit 03e49a0 into master Oct 17, 2025
69 checks passed
@armenzg armenzg deleted the 0/delete_group_hashes_metadata_first/armenzg branch October 17, 2025 14:48

sentry-io bot commented Oct 17, 2025

Issues attributed to commits in this pull request

This pull request was merged and Sentry observed the following issues:

armenzg commented Oct 17, 2025

> Issues attributed to commits in this pull request
>
> This pull request was merged and Sentry observed the following issues:

This is part of the ThreadLeaks project and not part of the CI test runs. I can see my host name in there.

armenzg added a commit that referenced this pull request Oct 20, 2025
Last week, when I was debugging tests for #101720, it was confusing to
find events and groups that had nothing to do with the tests I was
working on.

This refactor moves the majority of the logic from the setUp function to
the first test, since that's where it's needed.
armenzg added a commit that referenced this pull request Oct 22, 2025
armenzg added a commit that referenced this pull request Oct 23, 2025