
feat(deletions): Trigger nodestore deletions from group deletions #15065

Merged
merged 4 commits into from Oct 21, 2019

Conversation

lynnagara (Member)

When we stop writing events to postgres, we will no longer be able to
trigger nodestore deletions from an event.
from ..base import BaseDeletionTask, BaseRelation, ModelDeletionTask, ModelRelation


class GroupNodeDeletionTask(BaseDeletionTask):
Member Author

Will be expanded to handle EventAttachment in the future.

Member Author

We might also want to handle UserReports this way, since we don't always write a group_id when these are received.

@@ -24,6 +24,7 @@
from __future__ import absolute_import

from .base import BulkModelDeletionTask, ModelDeletionTask, ModelRelation # NOQA
from .defaults.group import GroupNodeDeletionTask # NOQA
Member Author

Just for tests :(

Contributor

Could you please elaborate? What is the issue there?

Member Author

This was the easiest way I could find to overwrite the DEFAULT_CHUNK_SIZE value, so that we would also be running through the batching logic in the test.
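Overriding a class attribute like DEFAULT_CHUNK_SIZE for a test can be done directly on the class, restoring it afterwards so other tests are unaffected. A minimal sketch — the class body here is a hypothetical stand-in, not Sentry's actual GroupNodeDeletionTask:

```python
# Hypothetical stand-in for the task class under test; in Sentry the real
# class lives in the deletions module.
class GroupNodeDeletionTask:
    DEFAULT_CHUNK_SIZE = 10000

    def chunk_size(self):
        return self.DEFAULT_CHUNK_SIZE


def test_with_small_chunk_size():
    # Shrink the chunk size so the test exercises the batching logic with
    # only a handful of events, then restore the original value.
    original = GroupNodeDeletionTask.DEFAULT_CHUNK_SIZE
    GroupNodeDeletionTask.DEFAULT_CHUNK_SIZE = 2
    try:
        assert GroupNodeDeletionTask().chunk_size() == 2
    finally:
        GroupNodeDeletionTask.DEFAULT_CHUNK_SIZE = original


test_with_small_chunk_size()
assert GroupNodeDeletionTask.DEFAULT_CHUNK_SIZE == 10000
```

In a pytest codebase the same effect is usually achieved with the monkeypatch fixture, which handles the restore automatically.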

Contributor

I'm not following why you can't just import GroupNodeDeletionTask in the test and override it there.
But if this is a parameter you want to override, why not make it an option? It will be very hard to understand later why this import is here.

Member Author

Ah sorry, I think I messed up the import path before. Removed.

@fpacifici (Contributor) left a comment

Could you please provide some details about how the test plan worked?

@@ -36,6 +89,15 @@ def get_child_relations(self, instance):

relations.extend([ModelRelation(m, {"group_id": instance.id}) for m in model_list])

relations.extend(
Contributor

The event deletion would still run and thus try to delete the same nodes in nodestore. Would that make the nodestore deletion fail?

Member Author

It is safe to delete the same node twice from nodestore. We are already performing two deletes from nodestore right now (both of which will soon go away): one triggered by event deletion (https://github.com/getsentry/sentry/blob/master/src/sentry/deletions/defaults/event.py#L8-L15) and one by nodefield deletion (https://github.com/getsentry/sentry/blob/master/src/sentry/db/models/fields/node.py#L184).
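The idempotency claim can be illustrated with a toy nodestore. This dict-backed stand-in is purely illustrative — it shares only the delete_multi name with Sentry's real nodestore API:

```python
class FakeNodestore:
    """Dict-backed stand-in showing why double-deletes are safe."""

    def __init__(self):
        self._nodes = {}

    def set(self, node_id, data):
        self._nodes[node_id] = data

    def delete_multi(self, node_ids):
        for node_id in node_ids:
            # Deleting a missing key is a no-op, so a second delete of the
            # same node (e.g. from both the event deletion path and the
            # group deletion path) cannot fail.
            self._nodes.pop(node_id, None)


store = FakeNodestore()
store.set("abc", {"message": "hello"})
store.delete_multi(["abc"])
store.delete_multi(["abc"])  # second delete of the same node is harmless
assert "abc" not in store._nodes
```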

node_id = Event.generate_node_id(self.project_id, event.id)
node_ids.append(node_id)

nodestore.delete_multi(node_ids)
Contributor

What happens if this fails halfway through and throws an exception?
Specifically:

  • is delete_multi an atomic operation, or can it fail with half the nodes deleted and half not?
  • if this fails, how do we retry the deletion?

Member Author

delete_multi may result in partial deletes, although it won't raise an exception. For the Bigtable implementation, the SDK we use has some default retry logic, but we don't have a way of knowing when it has reached the retry deadline and given up.

Contributor

OK for delete_multi.
Still, this task can throw for other reasons. Do we retry in those cases?

@lynnagara (Member Author) Oct 17, 2019

Yeah, I think we run Celery deletion tasks with infinite retry if any kind of exception is raised.
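The retry behaviour described here isn't visible in this diff; in plain Python it amounts to a loop like the following. This is an illustration of the pattern, not the actual Celery task configuration:

```python
def run_with_retry(task, max_attempts=None):
    """Re-run `task` until it succeeds. With max_attempts=None this mirrors
    'infinite retry on any exception'; a real Celery task would also apply
    a backoff delay between attempts."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return task()
        except Exception:
            if max_attempts is not None and attempts >= max_attempts:
                raise


calls = {"n": 0}

def flaky_deletion():
    # Fails twice, then succeeds -- simulating a transient nodestore error.
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient nodestore error")
    return "deleted"

assert run_with_retry(flaky_deletion) == "deleted"
assert calls["n"] == 3
```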

Comment on lines +35 to +42
events = eventstore.get_events(
filter=eventstore.Filter(
conditions=conditions, project_ids=[self.project_id], group_ids=[self.group_id]
),
limit=self.DEFAULT_CHUNK_SIZE,
referrer="deletions.group",
orderby=["-timestamp", "-event_id"],
)
Contributor

Are we guaranteed that this operation happens before ClickHouse has dropped the events for retention? What about failures? How does the retry policy work, and is the deletion retried before the events are dropped by Snuba?

Member Author

I believe nodestore values will expire after the retention period, so we only need to cover the case where groups are deleted via a user action prior to the retention period end.
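The quoted eventstore call fetches at most DEFAULT_CHUNK_SIZE events per pass; a surrounding loop (not shown in the diff) would repeat until a short page signals there is nothing left. A self-contained sketch of that batching pattern, with the event source and nodestore stubbed out as plain Python:

```python
# Stand-in event store: 25 ids pending deletion.
node_ids = list(range(25))

def fetch_chunk(limit):
    # Stand-in for the eventstore.get_events(..., limit=chunk_size) call:
    # returns the next batch of ids still present in the store.
    return node_ids[:limit]

def delete_multi(ids):
    # Stand-in for nodestore.delete_multi.
    del node_ids[:len(ids)]

def delete_in_chunks(chunk_size):
    """Fetch and delete in fixed-size batches until a short or empty page
    indicates everything has been removed."""
    deleted = 0
    while True:
        chunk = fetch_chunk(chunk_size)
        if not chunk:
            break
        delete_multi(chunk)
        deleted += len(chunk)
        if len(chunk) < chunk_size:
            break
    return deleted

assert delete_in_chunks(10) == 25  # 10 + 10 + 5
assert node_ids == []
```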


@lynnagara (Member Author)

Since we are only adding a nodestore deletion here (and not yet removing from the original place where this happens), the test is going to pass regardless of the code being added here.

I also ran the tests locally, omitting this line, to make sure that the nodestore deletion is now also being performed from group deletions: https://github.com/getsentry/sentry/blob/master/src/sentry/deletions/defaults/group.py#L34

@fpacifici (Contributor) left a comment

Let's resolve the import issue; otherwise it seems fine.


Comment on lines 49 to 52
node_ids = []
for event in events:
node_id = Event.generate_node_id(self.project_id, event.id)
node_ids.append(node_id)
Contributor

Nit: I think this works well as a list comprehension:

node_ids = [Event.generate_node_id(self.project_id, event.id) for event in events]

Member Author

I like it

Comment on lines +43 to +51
self.store_event(
data={
"event_id": event_id2,
"timestamp": iso_format(before_now(minutes=1)),
"fingerprint": ["group1"],
},
project_id=project.id,
)

Contributor

I'd advise adding a third event with a different fingerprint and asserting that it is not removed from nodestore.
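The suggested assertion can be sketched with toy pieces; here the nodestore is a plain dict and the fingerprint-to-group mapping is simplified, so none of this is Sentry's real test API:

```python
# Toy model: each event's node records which group (fingerprint) it belongs to.
nodestore = {
    "event1": {"group": "group1"},
    "event2": {"group": "group1"},
    "event3": {"group": "group2"},  # different fingerprint -> different group
}

def delete_group_nodes(group):
    """Delete only the nodes belonging to the given group."""
    doomed = [key for key, node in nodestore.items() if node["group"] == group]
    for key in doomed:
        del nodestore[key]

delete_group_nodes("group1")

# Events from the deleted group are gone...
assert "event1" not in nodestore and "event2" not in nodestore
# ...but the third event, with a different fingerprint, survives.
assert "event3" in nodestore
```

The point of the third event is exactly this last assertion: it guards against an over-broad delete that wipes nodes outside the deleted group.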

@lynnagara lynnagara merged commit 552f1ac into master Oct 21, 2019
@lynnagara lynnagara deleted the deletions branch October 21, 2019 17:30
@github-actions github-actions bot locked and limited conversation to collaborators Dec 19, 2020