feat(deletions): Trigger nodestore deletions from group deletions #15065

Merged · 4 commits · Oct 21, 2019

Changes from 1 commit
1 change: 1 addition & 0 deletions src/sentry/deletions/__init__.py
@@ -24,6 +24,7 @@
from __future__ import absolute_import

from .base import BulkModelDeletionTask, ModelDeletionTask, ModelRelation # NOQA
from .defaults.group import GroupNodeDeletionTask # NOQA

Member Author: Just for tests :(

Contributor: Could you please elaborate? What is the issue there?

Member Author: This was the easiest way I could find to overwrite the DEFAULT_CHUNK_SIZE value, so that we would also be running through the batching logic in the test.

Contributor: Not following why you can't just import GroupNodeDeletionTask in the test and overwrite it there. But if this is a parameter you want to overwrite, why not make it an option? It will be very hard to understand (later) why this import is here.

Member Author: Ah sorry, I think I messed up the import path before. Removed.
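
For reference, the test in this PR ends up doing exactly what the reviewer suggests; a minimal sketch of the pattern (test_chunking_sketch is a hypothetical name, not a test in this PR):

    from sentry.deletions import GroupNodeDeletionTask

    def test_chunking_sketch(self):
        # Shrink the chunk size so chunk() must run more than once. Note this
        # mutates a class attribute, so it leaks into later tests unless restored.
        GroupNodeDeletionTask.DEFAULT_CHUNK_SIZE = 1
        ...  # store several events, run the deletion, assert the nodes are gone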

from .manager import DeletionTaskManager

default_manager = DeletionTaskManager(default_task=ModelDeletionTask)
64 changes: 63 additions & 1 deletion src/sentry/deletions/defaults/group.py
@@ -1,6 +1,59 @@
from __future__ import absolute_import, print_function

-from ..base import ModelDeletionTask, ModelRelation
+from sentry import eventstore, nodestore
+from sentry.models import Event
+
+from ..base import BaseDeletionTask, BaseRelation, ModelDeletionTask, ModelRelation


class GroupNodeDeletionTask(BaseDeletionTask):

Member Author: Will be expanded to handle EventAttachment in future.

Member Author: We might also want to handle UserReports this way, since we don't always write a group_id when these are received.

"""
Deletes nodestore data for group
"""

DEFAULT_CHUNK_SIZE = 10000

def __init__(self, manager, group_id, project_id, **kwargs):
self.group_id = group_id
self.project_id = project_id
self.last_event = None
super(GroupNodeDeletionTask, self).__init__(manager, **kwargs)

def chunk(self):
conditions = []
if self.last_event is not None:
conditions.extend(
[
["timestamp", "<=", self.last_event.timestamp],
[
["timestamp", "<", self.last_event.timestamp],
["event_id", "<", self.last_event.event_id],
],
]
)

events = eventstore.get_events(
filter=eventstore.Filter(
conditions=conditions, project_ids=[self.project_id], group_ids=[self.group_id]
),
limit=self.DEFAULT_CHUNK_SIZE,
referrer="deletions.group",
orderby=["-timestamp", "-event_id"],
)

Contributor (on the eventstore.get_events call): Are we guaranteed that this operation happens before Clickhouse has dropped the events for retention? What about failures? How does the retry policy work, and is the deletion retried before the events are dropped by Snuba?

Member Author: I believe nodestore values will expire after the retention period, so we only need to cover the case where groups are deleted via a user action before the retention period ends.


        if not events:
            return False

        self.last_event = events[-1]

        node_ids = []
        for event in events:
            node_id = Event.generate_node_id(self.project_id, event.id)
            node_ids.append(node_id)

Contributor: Nit, I think this works well as a list comprehension:

    node_ids = [Event.generate_node_id(self.project_id, event.id) for event in events]

Member Author: I like it.


        nodestore.delete_multi(node_ids)

Contributor: What happens if this fails halfway through and throws an exception? Specifically:

  • is delete_multi an atomic operation, or can it fail with half the nodes deleted and half not?
  • if it fails, how do we retry the deletion?

Member Author: delete_multi may result in partial deletes, although it won't raise an exception. For the Bigtable implementation, the SDK we use has some default retry logic that will be applied, but we don't have a way of knowing when it has reached the retry deadline and given up.

Contributor: OK for delete_multi. Still, this task can throw for other reasons. Do we retry in those cases?

Member Author (@lynnagara, Oct 17, 2019): Yeah, I think we run Celery deletion tasks with infinite retry if any kind of exception is raised.
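
That pattern, as a minimal hypothetical sketch (not the actual Sentry task definition; with max_retries=None Celery retries a task indefinitely):

    from celery import shared_task

    @shared_task(bind=True, max_retries=None)  # None means retry forever
    def run_deletion_sketch(self, deletion_id):
        try:
            perform_deletion(deletion_id)  # assumed helper for the sketch
        except Exception as exc:
            # Deletion is idempotent, so re-running a partially completed
            # chunk after a failure is safe.
            raise self.retry(exc=exc, countdown=60)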


        return True


class GroupDeletionTask(ModelDeletionTask):
@@ -36,6 +89,15 @@ def get_child_relations(self, instance):

        relations.extend([ModelRelation(m, {"group_id": instance.id}) for m in model_list])

        relations.extend(

Contributor: The event deletion would still run and thus try to delete the same nodes in nodestore. Would that make the nodestore deletion fail?

Member Author: It is safe to delete the same node twice from nodestore. We are already performing two deletes from nodestore right now (both of which will soon go away): one triggered by event deletion (https://github.com/getsentry/sentry/blob/master/src/sentry/deletions/defaults/event.py#L8-L15) and one by nodefield deletion (https://github.com/getsentry/sentry/blob/master/src/sentry/db/models/fields/node.py#L184).
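
A minimal illustration of that idempotency claim (sketch only; the node IDs are made up):

    from sentry import nodestore

    node_ids = ["a" * 32, "b" * 32]  # hypothetical node IDs
    nodestore.delete_multi(node_ids)
    nodestore.delete_multi(node_ids)  # deleting already-missing nodes is a no-op
    assert not nodestore.get(node_ids[0])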

            [
                BaseRelation(
                    {"group_id": instance.id, "project_id": instance.project_id},
                    GroupNodeDeletionTask,
                )
            ]
        )

        return relations

    def delete_instance(self, instance):
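
To connect the pieces: a BaseRelation pairs a parameter dict with a task class, and the deletion manager later instantiates that task with those parameters. A rough sketch of the resulting flow, inferred from the constructor and chunk() semantics above (not code from this PR; assumes a group object is in scope):

    from sentry.deletions import default_manager
    from sentry.deletions.defaults.group import GroupNodeDeletionTask

    task = GroupNodeDeletionTask(
        default_manager, group_id=group.id, project_id=group.project_id
    )
    # chunk() deletes up to DEFAULT_CHUNK_SIZE events per call and returns
    # False once no events remain.
    while task.chunk():
        pass
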
36 changes: 28 additions & 8 deletions tests/sentry/deletions/test_group.py
@@ -13,32 +13,46 @@
    ScheduledDeletion,
    UserReport,
)
from sentry import nodestore
from sentry.deletions import GroupNodeDeletionTask
from sentry.tasks.deletion import run_deletion
-from sentry.testutils import TestCase
+from sentry.testutils import TestCase, SnubaTestCase
from sentry.testutils.helpers.datetime import iso_format, before_now


-class DeleteGroupTest(TestCase):
+class DeleteGroupTest(TestCase, SnubaTestCase):
    def test_simple(self):
-        key = "key"
-        value = "value"
+        GroupNodeDeletionTask.DEFAULT_CHUNK_SIZE = 1  # test chunking logic
        event_id = "a" * 32
        event_id2 = "b" * 32
        project = self.create_project()
        node_id = Event.generate_node_id(project.id, event_id)
        node_id2 = Event.generate_node_id(project.id, event_id2)

        event = self.store_event(
            data={
                "event_id": event_id,
-                "tags": {key: value},
+                "tags": {"foo": "bar"},
                "timestamp": iso_format(before_now(minutes=1)),
                "fingerprint": ["group1"],
            },
            project_id=project.id,
        )

        self.store_event(
            data={
                "event_id": event_id2,
                "timestamp": iso_format(before_now(minutes=1)),
                "fingerprint": ["group1"],
            },
            project_id=project.id,
        )

Contributor (on the two store_event calls above): I'd advise adding a third event with a different fingerprint and asserting that event is not removed from nodestore.
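
A sketch of that suggested addition, following the conventions already used in this test (event_id3, node_id3, and the "group2" fingerprint are made up for illustration):

    event_id3 = "c" * 32
    node_id3 = Event.generate_node_id(project.id, event_id3)
    self.store_event(
        data={
            "event_id": event_id3,
            "timestamp": iso_format(before_now(minutes=1)),
            "fingerprint": ["group2"],  # different group
        },
        project_id=project.id,
    )

    # ...and after run_deletion(deletion.id):
    assert nodestore.get(node_id3)  # event in the other group survives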

        group = event.group
        group.update(status=GroupStatus.PENDING_DELETION)

-        project = self.create_project()
-        group = self.create_group(project=project)
-        event = self.create_event(group=group)

        UserReport.objects.create(group_id=group.id, project_id=event.project_id, name="Jane Doe")

@@ -50,6 +64,9 @@ def test_simple(self):
        deletion = ScheduledDeletion.schedule(group, days=0)
        deletion.update(in_progress=True)

        assert nodestore.get(node_id)
        assert nodestore.get(node_id2)

        with self.tasks():
            run_deletion(deletion.id)

@@ -58,3 +75,6 @@ def test_simple(self):
        assert not GroupRedirect.objects.filter(group_id=group.id).exists()
        assert not GroupHash.objects.filter(group_id=group.id).exists()
        assert not Group.objects.filter(id=group.id).exists()

        assert not nodestore.get(node_id)
        assert not nodestore.get(node_id2)
26 changes: 25 additions & 1 deletion tests/sentry/tasks/test_deletion.py
@@ -6,6 +6,7 @@

import pytest

from sentry import nodestore
from sentry.constants import ObjectStatus
from sentry.exceptions import DeleteAborted
from sentry.models import (
@@ -178,12 +179,30 @@ def test_cancels_without_pending_status(self):
class DeleteGroupTest(TestCase):
    def test_simple(self):
        event_id = "a" * 32
        event_id_2 = "b" * 32
        project = self.create_project()

        node_id = Event.generate_node_id(project.id, event_id)
        node_id_2 = Event.generate_node_id(project.id, event_id_2)

        event = self.store_event(
-            data={"event_id": event_id, "timestamp": iso_format(before_now(minutes=1))},
+            data={
+                "event_id": event_id,
+                "timestamp": iso_format(before_now(minutes=1)),
+                "fingerprint": ["group1"],
+            },
            project_id=project.id,
        )

        self.store_event(
            data={
                "event_id": event_id_2,
                "timestamp": iso_format(before_now(minutes=1)),
                "fingerprint": ["group1"],
            },
            project_id=project.id,
        )

        group = event.group
        group.update(status=GroupStatus.PENDING_DELETION)

@@ -192,13 +211,18 @@ def test_simple(self):
        GroupMeta.objects.create(group=group, key="foo", value="bar")
        GroupRedirect.objects.create(group_id=group.id, previous_group_id=1)

        assert nodestore.get(node_id)
        assert nodestore.get(node_id_2)

        with self.tasks():
            delete_groups(object_ids=[group.id])

        assert not Event.objects.filter(id=event.id).exists()
        assert not GroupRedirect.objects.filter(group_id=group.id).exists()
        assert not GroupHash.objects.filter(group_id=group.id).exists()
        assert not Group.objects.filter(id=group.id).exists()
        assert not nodestore.get(node_id)
        assert not nodestore.get(node_id_2)


class DeleteApplicationTest(TestCase):