-
-
Notifications
You must be signed in to change notification settings - Fork 4.5k
feat(deletions): Schedule task to delete pending deletions groups #103820
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Sometimes, groups do not get deleted and end up in limbo. This task will handle groups that are pending deletion before 90 days retention and older than a day.
| "task": "deletions:sentry.tasks.delete_pending_groups", | ||
| # Runs every 2 hours during 9am-5pm Eastern Time (EST: UTC-5) | ||
| # 9am, 11am, 1pm, 3pm, 5pm EST = 14:00, 16:00, 18:00, 20:00, 22:00 UTC | ||
| "schedule": task_crontab("0", "14,16,18,20,22", "*", "*", "*"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The schedule now is wonky. I want to see it run during my working hours. Once there are no issues I will switch it to once a day.
| # This can timeout for lack of an index on the status field | ||
| # Not using the last_seen index to avoid the lack of composite index on status and last_seen | ||
| groups = Group.objects.filter(status__in=statuses_to_delete).values_list( | ||
| "id", "project_id", "last_seen" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Grabbing last_seen here to filter it out programatically down below.
| # Not using the last_seen index to avoid the lack of composite index on status and last_seen | ||
| groups = Group.objects.filter(status__in=statuses_to_delete).values_list( | ||
| "id", "project_id", "last_seen" | ||
| )[:BATCH_LIMIT] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since there's no index on status, let's grab a few a time. Currently, the query takes about 20 seconds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a compund index with status being the first on the index so it should be hitting and using it, you can try and EXPLAIN the query and you will see its hitting the index
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sentry/src/sentry/models/group.py
Lines 635 to 643 in b2fff1d
| status = BoundedPositiveIntegerField( | |
| default=GroupStatus.UNRESOLVED, | |
| choices=( | |
| (GroupStatus.UNRESOLVED, _("Unresolved")), | |
| (GroupStatus.RESOLVED, _("Resolved")), | |
| (GroupStatus.IGNORED, _("Ignored")), | |
| ), | |
| db_index=True, | |
| ) |
It looks like there is an index on status, and checking locally I have an index on the status column as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh my! I'm glad it's there. I may have been looking at the list of compound indexes.
| ) | ||
|
|
||
| for project_id, group_ids in groups_by_project.items(): | ||
| # Schedule deletion tasks in chunks of GROUP_CHUNK_SIZE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to how we schedule it here:
sentry/src/sentry/api/helpers/group_index/delete.py
Lines 87 to 95 in 1483996
| # Schedule a task per GROUP_CHUNK_SIZE batch of groups | |
| for i in range(0, len(group_ids), GROUP_CHUNK_SIZE): | |
| delete_groups_for_project.apply_async( | |
| kwargs={ | |
| "project_id": project.id, | |
| "object_ids": group_ids[i : i + GROUP_CHUNK_SIZE], | |
| "transaction_id": str(transaction_id), | |
| } | |
| ) |
| chunk = group_ids[i : i + GROUP_CHUNK_SIZE] | ||
| transaction_id = str(uuid4()) | ||
|
|
||
| delete_groups_for_project.apply_async( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As reference, the called task:
| def delete_groups_for_project( |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #103820 +/- ##
========================================
Coverage 80.62% 80.62%
========================================
Files 9281 9282 +1
Lines 396223 396285 +62
Branches 25256 25256
========================================
+ Hits 319438 319488 +50
- Misses 76325 76337 +12
Partials 460 460 |
yuvmen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good! one minor comment
| # Not using the last_seen index to avoid the lack of composite index on status and last_seen | ||
| groups = Group.objects.filter(status__in=statuses_to_delete).values_list( | ||
| "id", "project_id", "last_seen" | ||
| )[:BATCH_LIMIT] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a compund index with status being the first on the index so it should be hitting and using it, you can try and EXPLAIN the query and you will see its hitting the index
|
|
||
| for project_id, group_ids in groups_by_project.items(): | ||
| # Schedule deletion tasks in chunks of GROUP_CHUNK_SIZE | ||
| for i in range(0, len(group_ids), GROUP_CHUNK_SIZE): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, we got the chunked function you can use
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will handle this later. I want to merge it now.
markstory
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me. These scheduled tasks will help cleanup any API based deletions that failed and aren't being retried as there isn't a scheduled deletion row to restart those deletions currently.
| # Not using the last_seen index to avoid the lack of composite index on status and last_seen | ||
| groups = Group.objects.filter(status__in=statuses_to_delete).values_list( | ||
| "id", "project_id", "last_seen" | ||
| )[:BATCH_LIMIT] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sentry/src/sentry/models/group.py
Lines 635 to 643 in b2fff1d
| status = BoundedPositiveIntegerField( | |
| default=GroupStatus.UNRESOLVED, | |
| choices=( | |
| (GroupStatus.UNRESOLVED, _("Unresolved")), | |
| (GroupStatus.RESOLVED, _("Resolved")), | |
| (GroupStatus.IGNORED, _("Ignored")), | |
| ), | |
| db_index=True, | |
| ) |
It looks like there is an index on status, and checking locally I have an index on the status column as well.
This is a follow-up to #103820. A failure to delete a group should be re-processed within 24 hours. Changes included: * Schedule re-processing task every 6 hours * Re-process groups pending deletion that are older than 6 hours rather than midnight from the day before
This is a follow-up to #103820. A failure to delete a group should be re-processed within 24 hours. Changes included: * Schedule re-processing task every 6 hours * Re-process groups pending deletion that are older than 6 hours rather than midnight from the day before
This is a follow-up to #103820. A failure to delete a group should be re-processed within 24 hours. Changes included: * Schedule re-processing task every N hours instead of odd schedule * Re-process groups pending deletion that are older than 6 hours rather than midnight from the day before
Sometimes, groups do not get deleted and end up in limbo.
This task will handle groups that are pending deletion before 90 days retention and older than a day.