fix(grouping): Fix unmerged seer grouphash integrity error #83081
would it be better to set
|
Good question. There's some disconnect between the UPDATE: That worked! Have updated the PR. |
This PR has a migration; here is the generated SQL for --
-- Alter field seer_matched_grouphash on grouphashmetadata
--
-- (no-op) |
      # The `GroupHash` record representing the match Seer sent back as a match (if any)
      seer_matched_grouphash = FlexibleForeignKey(
    -     "sentry.GroupHash", related_name="seer_matchees", on_delete=models.DO_NOTHING, null=True
    +     "sentry.GroupHash", related_name="seer_matchees", on_delete=models.SET_NULL, null=True
lgtm, there shouldn't be too many GroupHashMetadata per GroupHash, correct?
It's technically possible for it to be a large number, but most often it's not.
Also, I assume (?) that the null setting happens only when delete is called on the linked object, yes? If so, then in cases where there's been no manual unmerge (the 99.9% case), all of the GroupHashMetadata records pointing at the GroupHash instance will already have been deleted by the time we get to the null setting, as they'll belong to sibling GroupHash records of the GroupHash record in question. (See point 3 in the first list in the PR description.) So only in the 0.1% case will there even be any remaining GroupHashMetadata records on which to update the seer_matched_grouphash value.
This is happening: https://sentry.sentry.io/issues/6882819222/
If we're deleting a group hash, why do we need to update the group hash metadata?
Shouldn't the group hash metadata also be deleted?
looks great!
We've been getting occasional integrity errors when trying to delete `grouphash` records, because of the `seer_matched_grouphash` foreign key in the `GroupHashMetadata` model. Normally, the process works like this:

1. An event creates `grouphash_A` and `group_1`.
2. A new event creates `grouphash_B`, which Seer then matches to `grouphash_A`. The new event goes in `group_1`, and `grouphash_B.metadata.seer_matched_grouphash` is set to `grouphash_A`.
3. `group_1` ages out, and we delete it. In this process deletions happen from the bottom up, so first the `GroupHashMetadata` records for both `grouphash_A` and `grouphash_B` are deleted, then `grouphash_A` and `grouphash_B` are themselves deleted, and finally `group_1` is deleted. Because of that ordering, the `grouphash_B.metadata.seer_matched_grouphash` link to `grouphash_A` disappears before we try to delete `grouphash_A`, and no integrity errors are thrown.

However, occasionally it goes like this:

1. Same as above
2. Same as above
3. The user decides they disagree with Seer, and unmerges `grouphash_B` from `group_1`, such that it now points to `group_2`, which is newer than `group_1` and therefore doesn't expire when `group_1` does.
4. `group_1` ages out, and deletion happens as before, except this time it's only the `GroupHashMetadata` record for `grouphash_A` which gets deleted ahead of `grouphash_A`'s deletion. So even after `grouphash_A` is deleted, `grouphash_B.metadata.seer_matched_grouphash` still points to it, and boom: you've got yourself an integrity error.

This fixes the problem by updating the `on_delete` setting for the `seer_matched_grouphash` field to be `SET_NULL`, so that Django will automatically break the foreign key link whenever a `seer_matched_grouphash` is deleted. I chose to have that happen during deletion rather than unmerging because when unmerges happen it's nice for debugging to be able to compare Seer's result to the human-mediated one.
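For anyone who wants to reproduce the unmerge scenario outside of Sentry, here is a minimal sketch using only Python's standard-library `sqlite3`. It is an analogy, not the code in this PR: the two-table schema, the hard-coded ids, and the helper names (`build_db`, `expire_group_1`, `run`) are all made up for illustration. It also uses a database-level `ON DELETE SET NULL` clause to stand in for Django's `models.SET_NULL`, which Django applies itself while collecting objects for deletion (consistent with the no-op migration SQL above).

```python
import sqlite3


def build_db(on_delete_clause: str) -> sqlite3.Connection:
    """Build a toy schema after the unmerge: grouphash_B already lives in group_2.

    Table and column names are simplified stand-ins, not Sentry's real schema.
    """
    conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
    conn.execute("CREATE TABLE grouphash (id INTEGER PRIMARY KEY, group_id INTEGER)")
    conn.execute(
        "CREATE TABLE grouphashmetadata ("
        " id INTEGER PRIMARY KEY,"
        " grouphash_id INTEGER NOT NULL REFERENCES grouphash(id),"
        f" seer_matched_grouphash_id INTEGER REFERENCES grouphash(id) {on_delete_clause})"
    )
    # grouphash_A (id=1) is in group_1; grouphash_B (id=2) was unmerged into group_2.
    conn.execute("INSERT INTO grouphash (id, group_id) VALUES (1, 1), (2, 2)")
    # grouphash_B's metadata still records grouphash_A as its Seer match.
    conn.execute(
        "INSERT INTO grouphashmetadata (id, grouphash_id, seer_matched_grouphash_id)"
        " VALUES (1, 1, NULL), (2, 2, 1)"
    )
    return conn


def expire_group_1(conn: sqlite3.Connection) -> None:
    """Delete bottom-up: grouphash_A's metadata first, then grouphash_A itself."""
    conn.execute("DELETE FROM grouphashmetadata WHERE grouphash_id = 1")
    conn.execute("DELETE FROM grouphash WHERE id = 1")


def run(on_delete_clause: str):
    """Return (deletion_succeeded, grouphash_B's seer_matched_grouphash_id)."""
    conn = build_db(on_delete_clause)
    try:
        expire_group_1(conn)
        succeeded = True
    except sqlite3.IntegrityError:
        succeeded = False
    row = conn.execute(
        "SELECT seer_matched_grouphash_id FROM grouphashmetadata WHERE grouphash_id = 2"
    ).fetchone()
    return succeeded, row[0]


if __name__ == "__main__":
    print("no on-delete action:", run(""))
    print("ON DELETE SET NULL: ", run("ON DELETE SET NULL"))
```

With no on-delete action the second `DELETE` is rejected because `grouphash_B`'s metadata row still references `grouphash_A`. With the set-null behaviour the delete goes through and the stale link is cleared, while the metadata row itself survives.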