Skip to content

Conversation

@relaxolotl
Copy link
Contributor

@relaxolotl relaxolotl commented Oct 19, 2021

There's currently a significant cost (well, greater than what it should be) associated with renaming celery tasks. As mentioned in #28973 (comment) which can be verified via the instructions found in #28973 (comment), renaming tasks will cause all in-flight tasks associated with the previous name to fail and be dropped. One must be willing to eat the cost of these dropped in-flight tasks which could end up being a fairly costly decision, meaning that there's a fairly high impact associated with something which doesn't make any meaningful change to any of the logic in the code.

This attempts to make such changes more painless by adding an additional parameter legacy_names to the instrumented_task decorator: if a task is renamed, previous names can be added to the legacy_names list which will take care of making sure that any tasks associated with the previous name(s) will complete. Metrics associated with instrumented tasks has also been updated appropriately so it's possible to determine when a certain legacy name is no longer in use, and can be removed from the task.


The nitty gritty implementation overview: This registers additional tasks with the celery registry, one per legacy name supplied in legacy_names. None of these are associated with the actual task that is exposed and exported to other modules; Only the task associated with the current name as specified by name will be the one that is available and invokable by other modules.

@relaxolotl
Copy link
Contributor Author

@tonyo @oioki: This PR also updates the metrics associated with symbolication tasks to reflect the namespace change that was made in #28973. This will most likely cause a minor blip in our current metrics dashboards, which I plan to update if/after this gets merged. I'd like to get your thoughts on this in case this should be reverted, or if there needs to be additional prep work done in anticipation of this change.

cc @flub @loewenheim @Swatinem

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming this registers a handler for any task with the given name?
Meaning that we register the same handler with multiple names, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, this registers the handler with all of the provided legacy names.

Copy link
Contributor

@flub flub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really like this, it's too much complexity and moving for very little gain. Why can't we keep those things being named by their original task names forever?

@flub
Copy link
Contributor

flub commented Oct 20, 2021

See also #28973 (comment) which is related to this opinion 😄

@relaxolotl
Copy link
Contributor Author

relaxolotl commented Oct 20, 2021

I don't really like this, it's too much complexity and moving for very little gain. Why can't we keep those things being named by their original task names forever?

These names are currently being generated in a way such that they have some amount of meaning associated with the actual task itself. celery's documentation also encourages using module names as namespaces. Semantic names like these are expected to change over time to reflect changes in the code, its use cases, and its ownership. Keeping these immutable and pinned to their original value applies restrictions to these names that poorly reflect how they're being used and interpreted.

Let's consider the scenario that motivated this change to begin with: subsections of a shared file, store.py were being split out into a new file to better reflect code ownership of that file, and to prevent teams from being blocked by approvals from other teams that had no stake in the change.

#28973 (comment) suggests to rename the paths of the files such that they conform to the existing naming task scheme: tasks/store.py splits off into tasks/store/processing.py and tasks/store/symbolication.py. This works as a compromise, but it's coercing the directory structure to follow a naming decision that was made in the past that may not be applicable in the present. How does /store relate to the current contents of tasks/store.py? What does "store" mean in relation to symbolicating events, and general event processing?

I think that the greater problem that this unearths is that perhaps such names shouldn't be assigned any meaning if they're expected to stay static through a task's entire lifetime. If names were generated with zero or minimal association with the actual task's purpose, then it wouldn't make sense to change task names under most circumstances, barring maybe something like an accidental naming collision. The change in this PR at least opens up the ability to re-assign names if needed, buying us time to at least have a discussion about the naming scheme of tasks and if the issue outlined above warrants some reconsideration about how tasks are named.

@flub
Copy link
Contributor

flub commented Oct 20, 2021

I don't really like this, it's too much complexity and moving for very little gain. Why can't we keep those things being named by their original task names forever?

These names are currently being generated in a way such that they have some amount of meaning associated with the actual task itself. celery's documentation also encourages using module names as namespaces. Semantic names like these are expected to change over time to reflect changes in the code, its use cases, and its ownership. Keeping these immutable and pinned to their original value applies restrictions to these names that poorly reflect how they're being used and interpreted.

The Celery documentation describes in details how it can auto-name tasks based on the modules. I wouldn't claim (or at least I didn't see) that it encourages that. I think the sentry code base chose a very good pattern of making the task names explicit and to not let them depend on the modules where the code happens to live. This exactly decouples the task naming from the module naming which is great for refactoring.

Let's consider the scenario that motivated this change to begin with: subsections of a shared file, store.py were being split out into a new file to better reflect code ownership of that file, and to prevent teams from being blocked by approvals from other teams that had no stake in the change.

#28973 (comment) suggests to rename the paths of the files such that they conform to the existing naming task scheme: tasks/store.py splits off into tasks/store/processing.py and tasks/store/symbolication.py. This works as a compromise, but it's coercing the directory structure to follow a naming decision that was made in the past that may not be applicable in the present. How does /store relate to the current contents of tasks/store.py? What does "store" mean in relation to symbolicating events, and general event processing?

I would argue that the current task naming is logical: We have a bunch of tasks that are all related to the part of the pipeline that's involved in "storing" a received event: sentry.tasks.store.*. And I see no reason to start changing these task names. The fact that the current module layout maps neatly onto the task layout is more a sign that the code was structured logically after how the tasks was, which is generally fairly reasonable. The code might as well have lived in sentry.processing.tasks.store.py or some other reasonable location and the task names would still have made sense.

Now the implementation in this single module is getting unwieldy for a number of reasons and we want to move the code around. I already argued why I don't think renaming the tasks is appropriate, so it's just a question of how we would like to structure the code of the implementation. The current store.py + symbolication.py split is reasonable, however it made people wonder why the tasks names are different from the module names. So an alternative proposal that more clearly relates all the "storing a received event" tasks would be to instead create a sentry.tasks.store a package instead of a module.

I by far feel not as strong about where the split up code ends up living as to not renaming the tasks. I'm fine with the split PR as is. Also because I now (think I) understand how Celery finds tasks and it is fairly easy to make sure the tasks are available in the workers after moving things around I think if further restructuring of code is desired for something it is not something that will be that difficult.

I think that the greater problem that this unearths is that perhaps such names shouldn't be assigned any meaning if they're expected to stay static through a task's entire lifetime. If names were generated with zero or minimal association with the actual task's purpose, then it wouldn't make sense to change task names under most circumstances, barring maybe something like an accidental naming collision. The change in this PR at least opens up the ability to re-assign names if needed, buying us time to at least have a discussion about the naming scheme of tasks and if the issue outlined above warrants some reconsideration about how tasks are named.

This is maybe going slightly off topic, but naming tasks logically makes sense to me. It makes understanding the code and architecture much easier. It makes debugging easier, it facilitates prod hot fixes and prodding in the prod data during incidents etc.

Base automatically changed from ref/split-store to master October 20, 2021 17:58
@relaxolotl relaxolotl requested a review from a team October 20, 2021 17:58
@relaxolotl relaxolotl requested a review from a team as a code owner October 20, 2021 17:58
@relaxolotl
Copy link
Contributor Author

relaxolotl commented Oct 20, 2021

The Celery documentation describes in details how it can auto-name tasks based on the modules. I wouldn't claim (or at least I didn't see) that it encourages that.

I'm making a nitpicky point here, but it states this underneath the linked section:

A best practice is to use the module name as a name-space, this way names won’t collide if there’s already a task with that name defined in another module.

Which reads pretty heavily to me as encouragement, or a recommendation. The documentation does explain how its auto-naming behaviour behaviour works as you've mentioned, but this is a recommendation outside of that explanation to use the module name.

I think the sentry code base chose a very good pattern of making the task names explicit and to not let them depend on the modules where the code happens to live. This exactly decouples the task naming from the module naming which is great for refactoring.

I'm confused about this assertion, particularly the bit about decoupling: As far as I can tell, a large number of task names are scoped by the modules they live in: everything in store, everything in activity, app_store_connect, commits, deletion, reprocessing2, etc. It is true that some tasks do not exhibit this coupling, such as those in code_owners, beacon and reprocessing, but I would say that closer to more than half, if not a majority of task names are scoped by their module. That's a large number of tasks which are coupled with the module that they live in, meaning that any refactors that change the module they live in would be affected by the rename problem that this is attempting to solve.

Regarding the actual naming choice of store.py, I feel a large part of this section is constructed off of the assertion that "store" is an appropriate name for the tasks in store.py:

We have a bunch of tasks that are all related to the part of the pipeline that's involved in "storing" a received event: sentry.tasks.store.*.

So an alternative proposal that more clearly relates all the "storing a received event" tasks would be to instead create a sentry.tasks.store a package instead of a module.

Perhaps this is splitting hairs, I would be hesitant to agree that a majority of the tasks that live in store.py and symbolication.py are necessarily related to "storing (events)" at this point. Looking at the composition of these files, there are 4 functions between store.py and symbolication.py that concern themselves with the actual act of storing an event: save_event, _do_save_event, submit_save_event and create_failed_event. One could argue that delete_raw_event falls into this as well, based on its name. Something to note is that only save_event is a task out of that set of functions.

All of the other functions in both files are concerned about processing events prior to their storage, and simply mutate events while letting those 4-5 aforementioned functions take care of the actual work of storing them. preprocess_event is responsible for determining how an event should be routed for processing, process_event applies data scrubbing, symbolication, and plugin-specific changes to events. symbolicate_event fills in symbolication data in an event. All of these defer to save_event once they have completed, and therefore aren't particularly concerned with how or where their events are stored.

This means that approximately 5/23 functions in these files, or 1/10 tasks are actually related to event storage. As a result, this leads me to believe that "store" is more of a misnomer than an accurate name for these tasks. Despite this, I'm not willing to die on a hill for this specific name change, and if you feel strongly enough that "store" is an appropriate name then I can drop those parts of the diff from the PR.

Hair splitting about names aside, my main concern here is how to make it easier to rename tasks. I would be open to other ideas on how one could restructure existing modules that contain tasks while avoiding the issues outlined in the PR description (ie event dropping). In other words, I'm curious as to what you mean by the below statement:

it is fairly easy to make sure the tasks are available in the workers after moving things around I think if further restructuring of code is desired for something it is not something that will be that difficult.

I'm assuming that this is saying that it's actually fairly easy to ensure that tasks whose names have been changed will not experience a loss in events after the change is deployed. Could you elaborate on this, if you feel that it's a better solution to the problem than what's proposed in the PR?

This is maybe going slightly off topic, but naming tasks logically makes sense to me. It makes understanding the code and architecture much easier. It makes debugging easier, it facilitates prod hot fixes and prodding in the prod data during incidents etc.

I feel this at least agrees with what this PR is trying to accomplish. All it's trying to do is to make it easier for task names to correctly reflect their purpose and/or location in the codebase, instead of legacy names that cannot be easily changed.

The point that paragraph is making is that I don't think that a value like a task's name should be used in a field that is expected to be set in stone and immutable after conception. I think a potential solution could be to use a stable, less meaningful identifier for a task name for the purposes of registering it with celery, and a secondary name that prioritizes human-readability.

@flub
Copy link
Contributor

flub commented Oct 21, 2021

The Celery documentation describes in details how it can auto-name tasks based on the modules. I wouldn't claim (or at least I didn't see) that it encourages that.

I'm making a nitpicky point here, but it states this underneath the linked section:

A best practice is to use the module name as a name-space, this way names won’t collide if there’s already a task with that name defined in another module.

Which reads pretty heavily to me as encouragement, or a recommendation. The documentation does explain how its auto-naming behaviour behaviour works as you've mentioned, but this is a recommendation outside of that explanation to use the module name.

Ok, fine. I'll go a step further and say I disagree with celery's assertion in that case 😄

I think the sentry code base chose a very good pattern of making the task names explicit and to not let them depend on the modules where the code happens to live. This exactly decouples the task naming from the module naming which is great for refactoring.

I'm confused about this assertion, particularly the bit about decoupling: As far as I can tell, a large number of task names are scoped by the modules they live in: [...]

What I mean is that every invocation of @instrumeted_task carries a name='...' argument (as far as I could tell). That is why I say it is decoupled because the name is not automatically inferred from the module the function names the code happens to live in. And #28973 shows that this allows us to move code around without having to deal with renamed tasks.

Hair splitting about names aside, my main concern here is how to make it easier to rename tasks.

My main assertion is that is it not desirable to ever rename a task. My secondary assertion is that task names do not need to match module and function names and that these will diverge over time due to refactoring, and that this is fine.

I would be open to other ideas on how one could restructure existing modules that contain tasks while avoiding the issues outlined in the PR description (ie event dropping). In other words, I'm curious as to what you mean by the below statement:

it is fairly easy to make sure the tasks are available in the workers after moving things around I think if further restructuring of code is desired for something it is not something that will be that difficult.

By doing what was done in #28973: you can move the implementation of tasks without renaming the tasks.


I've tried to explain to explain why I think it is a bad idea to rename tasks. If none of the reasoning is convincing and you really believe this PR is important than go ahead, I'm now assuming it is because you disagree with me rather than that I failed to explain myself.

@relaxolotl
Copy link
Contributor Author

considering the only usage of this change has now been removed, i'm closing this PR.

@relaxolotl relaxolotl closed this Oct 21, 2021
@github-actions github-actions bot locked and limited conversation to collaborators Nov 6, 2021
@mgaeta mgaeta deleted the ref/legacy-task-names branch March 18, 2022 20:29
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants