fix(deletion): Fix MonitorCheckIn deletion failures #103786
Conversation
Our previous attempt at correcting processing timeouts for MonitorCheckIn deletions failed and introduced integrity errors, because MonitorCheckIn can't be bulk deleted without manually removing child dependencies (there is no CASCADE on the database FKs). Instead, we add proper deletion tasks to the affected models. Note that we also use `get_child_relations_bulk` here, an old method that is not otherwise used in our codebase today. With it, we may be able to speed up querying for child relations of models with complex relations, when they can't simply be bulk deleted via BulkModelDeletionTask.
wedamija left a comment:
I think maybe we should cascade this like Monitor -> MonitorIncident -> MonitorCheckIn rather than Monitor -> MonitorCheckIn -> MonitorIncident.
If we delete the MonitorIncident rows first, then the check-ins won't have to cascade delete to the MonitorIncident table, and you can use the bulk deletion stuff.
This also helps with the MonitorEnvironment deletions, since they won't be referenced by MonitorIncident either.
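A minimal sketch of that suggested ordering, assuming the `ModelDeletionTask`/`ModelRelation` API used in the diff below; the `MonitorDeletionTask` class shape, import paths, and `monitor_id` filters are illustrative assumptions, not the actual implementation:

```python
from sentry.deletions.base import BaseRelation, ModelDeletionTask, ModelRelation
from sentry.monitors.models import Monitor


class MonitorDeletionTask(ModelDeletionTask[Monitor]):
    def get_child_relations(self, instance: Monitor) -> list[BaseRelation]:
        from sentry.monitors import models

        # Illustrative ordering only: list MonitorIncident before
        # MonitorCheckIn so that, once the incidents are deleted, nothing
        # references the check-ins and they can take the fast bulk path.
        return [
            ModelRelation(models.MonitorIncident, {"monitor_id": instance.id}),
            ModelRelation(models.MonitorCheckIn, {"monitor_id": instance.id}),
        ]
```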
```python
from collections.abc import Sequence

from sentry.deletions.base import BaseRelation, ModelDeletionTask, ModelRelation
from sentry.monitors.models import MonitorCheckIn


class MonitorCheckInDeletionTask(ModelDeletionTask[MonitorCheckIn]):
    def get_child_relations_bulk(
        self, instance_list: Sequence[MonitorCheckIn]
    ) -> list[BaseRelation]:
        """
        Return bulk child relations for MonitorCheckIn deletion.

        Uses __in queries to efficiently delete MonitorIncidents that
        reference these check-ins.
        """
        from sentry.monitors import models

        checkin_ids = [ci.id for ci in instance_list]

        # A check-in can be referenced as either the starting or the resolving
        # check-in of an incident, so both FK columns must be covered.
        return [
            ModelRelation(
                models.MonitorIncident,
                {"starting_checkin_id__in": checkin_ids},
            ),
            ModelRelation(
                models.MonitorIncident,
                {"resolving_checkin_id__in": checkin_ids},
            ),
        ]
```
I think we can remove this if we cascade in the other direction
…itor and monitor_environment flows allow MonitorIncident to be deleted not in bulk - there shouldn't be so many as to cause problems without it
Refactored according to discussion with Dan; this leverages two facts: deleting MonitorIncident rows first means the check-ins no longer cascade to the incident table and can use bulk deletion, and MonitorEnvironment rows are likewise no longer referenced by MonitorIncident once the incidents are gone.
In #103786, we refactored the way MonitorCheckIn gets deleted through the Monitor and MonitorEnvironment deletion flows, but in the process hurt direct MonitorCheckIn cleanup, which also happens daily. Added a task to handle it and set the relations to use bulk deletion rather than defaulting to the non-bulk path.
Our previous attempt (#103495) at correcting processing timeouts for MonitorCheckIn deletions failed and introduced integrity errors, because MonitorCheckIn can't be bulk deleted without manually removing child dependencies (no CASCADE on the database FKs).
Added deletion of MonitorIncident directly before attempting to bulk delete MonitorCheckIns, essentially removing the children of the MonitorCheckIn rows before deleting them.
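For context, a hedged sketch of how a daily cleanup might drive this deletion, assuming Sentry's `deletions.get(...)`/`task.chunk()` manager API; the 90-day cutoff and the `date_added` filter are assumptions for illustration:

```python
from datetime import timedelta

from django.utils import timezone

from sentry import deletions
from sentry.monitors.models import MonitorCheckIn

# Hypothetical scheduled cleanup: resolve the registered deletion task for
# MonitorCheckIn and chunk through expired rows. With MonitorIncident listed
# as a child relation, each chunk removes the referencing incidents before
# the check-ins themselves, avoiding FK integrity errors.
cutoff = timezone.now() - timedelta(days=90)  # assumed retention window
task = deletions.get(
    model=MonitorCheckIn,
    query={"date_added__lt": cutoff},
)
while task.chunk():
    pass  # each call deletes one batch; True means more rows remain
```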