feat(deletions): Trigger nodestore deletions from group deletions #15065
Conversation
When we stop writing events to postgres, we will no longer be able to trigger nodestore deletions from an event.
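For context, the task added here pages through a group's events via eventstore and deletes the corresponding nodestore rows in chunks. The following is only a simplified sketch of that shape, not the code from this PR: the constructor signature, import paths, chunk-size value, and pagination conditions are assumptions.

from sentry import eventstore, nodestore
from sentry.models import Event  # import path assumed for the sketch

from ..base import BaseDeletionTask


class GroupNodeDeletionTask(BaseDeletionTask):
    """Deletes nodestore data for all events of a group, one chunk at a time."""

    DEFAULT_CHUNK_SIZE = 10000  # illustrative value

    def __init__(self, manager, group_id, project_id, **kwargs):
        self.group_id = group_id
        self.project_id = project_id
        self.last_event = None
        super(GroupNodeDeletionTask, self).__init__(manager, **kwargs)

    def chunk(self):
        conditions = []
        if self.last_event is not None:
            # Page strictly past the last event of the previous chunk;
            # orderby below is descending on (timestamp, event_id).
            conditions.extend(
                [
                    ["timestamp", "<=", self.last_event.timestamp],
                    [
                        ["timestamp", "<", self.last_event.timestamp],
                        ["event_id", "<", self.last_event.id],
                    ],
                ]
            )

        events = eventstore.get_events(
            filter=eventstore.Filter(
                conditions=conditions,
                project_ids=[self.project_id],
                group_ids=[self.group_id],
            ),
            limit=self.DEFAULT_CHUNK_SIZE,
            referrer="deletions.group",
            orderby=["-timestamp", "-event_id"],
        )
        if not events:
            return False  # nothing left to delete

        self.last_event = events[-1]
        node_ids = [Event.generate_node_id(self.project_id, event.id) for event in events]
        nodestore.delete_multi(node_ids)
        return True  # signal that another chunk may remain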
from ..base import BaseDeletionTask, BaseRelation, ModelDeletionTask, ModelRelation


class GroupNodeDeletionTask(BaseDeletionTask):
Will be expanded to handle EventAttachment in the future.
We might also want to handle UserReports this way, since we don't always write a group_id when these are received
src/sentry/deletions/__init__.py
Outdated
@@ -24,6 +24,7 @@
from __future__ import absolute_import

from .base import BulkModelDeletionTask, ModelDeletionTask, ModelRelation  # NOQA
from .defaults.group import GroupNodeDeletionTask  # NOQA
Just for tests :(
Could you please elaborate? What is the issue there?
This was the easiest way I could find to overwrite the DEFAULT_CHUNK_SIZE value, so that the test also runs through the batching logic.
Not following why you cannot just import GroupNodeDeletionTask in the test and overwrite it there.
But if this is a parameter you want to overwrite, why not make it an option? It will be very hard to understand later why this import is here.
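Concretely, importing the task in the test and overriding the value there would look something like this (hypothetical test body; the direct import plus a temporary override of DEFAULT_CHUNK_SIZE is the whole point):

from sentry.deletions.defaults.group import GroupNodeDeletionTask


def test_group_deletion_runs_in_batches(self):
    # Shrink the chunk size so the batching loop is exercised without
    # having to create thousands of events.
    original_chunk_size = GroupNodeDeletionTask.DEFAULT_CHUNK_SIZE
    GroupNodeDeletionTask.DEFAULT_CHUNK_SIZE = 1
    try:
        pass  # create a few events, delete the group, assert nodestore is empty
    finally:
        GroupNodeDeletionTask.DEFAULT_CHUNK_SIZE = original_chunk_size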
Ah sorry, I think I messed up the import path before. Removed.
Could you please provide some details about how the test plan worked?
@@ -36,6 +89,15 @@ def get_child_relations(self, instance):

relations.extend([ModelRelation(m, {"group_id": instance.id}) for m in model_list])

relations.extend(
The event deletion would still run and thus try to delete the same nodes on nodestore. Would that make the nodestore deletion fail?
It is safe to delete the same node twice from nodestore. We are already performing two deletes from nodestore right now (both of which will soon go away): one triggered by event deletion (https://github.com/getsentry/sentry/blob/master/src/sentry/deletions/defaults/event.py#L8-L15) and one by node field deletion (https://github.com/getsentry/sentry/blob/master/src/sentry/db/models/fields/node.py#L184).
    node_id = Event.generate_node_id(self.project_id, event.id)
    node_ids.append(node_id)

nodestore.delete_multi(node_ids)
What happens if this fails halfway through and throws an exception?
Specifically:
- Is delete_multi an atomic operation, or can it fail with half the nodes deleted and half not?
- If this fails, how do we retry the deletion?
delete_multi may result in partial deletes, although it won't raise an exception. For the Bigtable implementation, the SDK we use has some default retry logic that will be applied, but we don't have a way of knowing when it has reached the retry deadline and given up.
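For reference, the "partial delete without an exception" behaviour comes from the bulk-mutation API the Bigtable backend ultimately sits on: Table.mutate_rows reports a status per row instead of raising. A sketch using the google-cloud-bigtable client directly (project, instance, and table names are made up, and this is not Sentry's nodestore wrapper):

from google.cloud import bigtable

client = bigtable.Client(project="my-project")             # hypothetical project
table = client.instance("my-instance").table("nodedata")   # hypothetical names

node_ids = ["node-key-1", "node-key-2"]  # example row keys
rows = []
for node_id in node_ids:
    row = table.row(node_id)
    row.delete()  # queue a "delete entire row" mutation
    rows.append(row)

# mutate_rows returns one status per row rather than raising, so some
# deletes can fail while the rest succeed -- the partial-delete case.
statuses = table.mutate_rows(rows)
failed = [rows[i].row_key for i, status in enumerate(statuses) if status.code != 0]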
OK for delete_multi.
Still, this task can throw for other reasons. Do we retry in those cases?
Yeah, I think we run Celery deletion tasks with infinite retry if any kind of exception is raised.
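Not the actual task definition, but in Celery terms the retry behaviour being described looks roughly like this (task and helper names are hypothetical; max_retries=None is what makes the retries unbounded):

from celery import shared_task


def delete_group_node_chunk(group_id, project_id):
    """Placeholder for one deletion chunk; returns True if more work remains."""
    return False


@shared_task(bind=True, max_retries=None)  # max_retries=None -> retry indefinitely
def run_group_node_deletion(self, group_id, project_id):
    try:
        has_more = delete_group_node_chunk(group_id, project_id)
    except Exception as exc:
        # Any failure (nodestore, Snuba, network, ...) reschedules the task.
        raise self.retry(exc=exc, countdown=15)
    if has_more:
        # Continue with the next chunk in a fresh task.
        run_group_node_deletion.delay(group_id, project_id)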
events = eventstore.get_events(
    filter=eventstore.Filter(
        conditions=conditions, project_ids=[self.project_id], group_ids=[self.group_id]
    ),
    limit=self.DEFAULT_CHUNK_SIZE,
    referrer="deletions.group",
    orderby=["-timestamp", "-event_id"],
)
Are we guaranteed that this operation happens before ClickHouse has dropped the events for retention? What about failures? How does the retry policy work, and is the deletion retried before the events are dropped by Snuba?
I believe nodestore values will expire after the retention period, so we only need to cover the case where groups are deleted via a user action prior to the retention period end.
src/sentry/deletions/__init__.py
Outdated
Since we are only adding a nodestore deletion here (and not yet removing it from the original place where this happens), the test is going to pass regardless of the code being added here. I also ran the tests locally omitting this line, to make sure that deletion is also being performed from group deletions now: https://github.com/getsentry/sentry/blob/master/src/sentry/deletions/defaults/group.py#L34
Let's resolve the import issue; otherwise it seems fine.
node_ids = []
for event in events:
    node_id = Event.generate_node_id(self.project_id, event.id)
    node_ids.append(node_id)
Nit: I think this works well as a list comprehension:
node_ids = [Event.generate_node_id(self.project_id, event.id) for event in events]
I like it
self.store_event(
    data={
        "event_id": event_id2,
        "timestamp": iso_format(before_now(minutes=1)),
        "fingerprint": ["group1"],
    },
    project_id=project.id,
)
I'd advise adding a third event with a different fingerprint and asserting that its node is not removed from nodestore.
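Something like the following, reusing the names from the test snippet above (event_id3 and the nodestore assertions are assumptions, not code from the PR):

# Third event under a different fingerprint -- it belongs to another group
# and must survive the deletion of group1.
self.store_event(
    data={
        "event_id": event_id3,
        "timestamp": iso_format(before_now(minutes=1)),
        "fingerprint": ["group2"],
    },
    project_id=project.id,
)

# ... delete the group that owns event_id and event_id2 ...

assert nodestore.get(Event.generate_node_id(project.id, event_id)) is None
assert nodestore.get(Event.generate_node_id(project.id, event_id2)) is None
# The unrelated group's event is still present in nodestore.
assert nodestore.get(Event.generate_node_id(project.id, event_id3)) is not None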