-
-
Notifications
You must be signed in to change notification settings - Fork 4.5k
ref: Move all symbolicator-related code in store.py to a separate file (NATIVE-259) #28973
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
untitaker
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you!
|
generally LGTM. I put this in NATIVE-259 if you want to link the ticket btw |
5fc9d7f to
755aaec
Compare
c760372 to
023645f
Compare
ingest can stop being pulled in for reviews on PRs that do not require their feedback or should not require their feedback NATIVE-259
023645f to
a2212cd
Compare
src/sentry/tasks/symbolication.py
Outdated
|
|
||
| def _continue_to_process_event(): | ||
| process_task = process_event_from_reprocessing if from_reprocessing else process_event | ||
| # TODO: this uses a private "store" function. is there a way to not do this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nts: figure out an answer to this, namely "is this ok?" and how to make it ok if it isn't
also probably good to think about the the line that has been drawn by splitting all of the symbolicator stuff out into its own file: are there other places where it's using internal store.py stuff? is there symbolication logic in store.py that could and should be moved out into a method in this file? what are the low hanging fruit for those?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for now, the compromise is to rename _do_process_event to do_process_event and focus on just splitting the code in this PR. patching up some of the lines drawn by this split, such as using what is supposed to be an internal store.py function in symbolicate.py can come in a follow-up PR.
a2212cd to
c760372
Compare
c760372 to
7fd4abc
Compare
|
@untitaker can you take a second look at this when you have time? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is one change to the task name attribute that really needs to be reverted and tested (probably manually), otherwise I believe this is good to go
src/sentry/tasks/symbolication.py
Outdated
|
|
||
|
|
||
| @instrumented_task( | ||
| name="sentry.tasks.symbolication.symbolicate_event_low_priority", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I previously said we need to import/re-export all of those new tasks from tasks/store.py, then retracted that comment because that's not how celery dispatch actually works. I mistakenly assumed that Celery works off of import paths (it can do that, it's just not how we set it up). However, if we rename those identifiers away from sentry.tasks.store to sentry.tasks.symbolication I think we have a problem again: In-flight symbolication tasks will fail to dispatch when spawned from old code and run with new codebase.
To test this:
- check out
master, do not start or havesentry devserverorsentry run workerrunning - open
sentry shell, import one of the tasks from this module and spawn it (mytask.delay(...)) -- you don't really need to provide correct arguments, you just need to demonstrate whether celery finds the right code to execute, in any case you should not get an error in the shell - optional verification step:
sentry queues listshould list non-zero task count for the symbolication queue (depending on which task you picked). If it is zero, you either didn't spawn the task successfully (step 2 failed) or have some background process running (step 1 failed) - check out this branch
- run
sentry devserver --workersor runsentry run worker(doesn't matter) - that command will spew an error message that the task could not be imported
if you repeat this test without switching between branches, celery will either execute the task successfully or at least tell you that you spawned it with the wrong arguments (depending on what you passed into .delay()) -- that's why your tests pass
this matters because during deployment we have the same scenario: the code that spawns the celery task can be old while the worker that is supposed to process the task may already run on the new version
If I remember correctly preserving the old value for name= is enough. Having a way to define a legacy alias would be nice too. I think this can be easily patched into the instrumented_task handler, but I am not sure.
We investigated and talked about this in a call but didn't define clear action items afterwards and didn't write anything down (sorry)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can confirm this explodes as described:
17:24:30 worker | sentry/.venv/lib/python3.6/site-packages/celery/fixups/django.py:206: UserWarning: Using settings.DEBUG leads to a memory
17:24:30 worker | leak, never use this setting in production environments!
17:24:30 worker | leak, never use this setting in production environments!''')
17:24:31 worker | Traceback (most recent call last):
17:24:31 worker | File "sentry/.venv/lib/python3.6/site-packages/celery/worker/consumer/consumer.py", line 562, in on_task_received
17:24:31 worker | strategy = strategies[type_]
17:24:31 worker | KeyError: 'sentry.tasks.store.symbolicate_event_low_priority'
17:24:31 worker | 17:24:31 [ERROR] celery.worker.consumer.consumer: Received unregistered task of type 'sentry.tasks.store.symbolicate_event_low_priority'.
17:24:31 worker | The message has been ignored and discarded.
17:24:31 worker |
17:24:31 worker | Did you remember to import the module containing this task?
17:24:31 worker | Or maybe you're using relative imports?
17:24:31 worker |
17:24:31 worker | Please see
17:24:31 worker | http://docs.celeryq.org/en/latest/internals/protocol.html
17:24:31 worker | for more information.
for now i'll keep the old names - will follow up with a second PR that explores adding in legacy names. the hardest part will be to figure out when to axe legacy names, but it's better than having to lose events every time a rename happens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a strong -1 on renaming any of these tasks
| "sentry.tasks.sentry_apps", | ||
| "sentry.tasks.servicehooks", | ||
| "sentry.tasks.store", | ||
| "sentry.tasks.symbolication", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes 👍
this is necessary because if celery is not instructed to import the module, it may be that the tasks cannot be discovered by their task name. the tests and probably local manual testing work regardless of this change because celery probably imports sentry.tasks.symbolication as a side effect of loading another module
flub
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| from sentry.tasks.store import should_demote_symbolication |
| # The names of tasks and metrics in this file point to tasks.store instead of tasks.symbolicator | ||
| # for legacy reasons, namely to prevent celery from dropping older tasks and needing to | ||
| # update metrics tooling (e.g. DataDog). All (as of 19/10/2021) of these tasks were moved | ||
| # out of tasks/store.py, hence the "store" bit of the name. | ||
| # | ||
| # New tasks and metrics are welcome to use the correct naming scheme as they are not | ||
| # burdened by aforementioned legacy concerns. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since it's unlikely that somebody in the future will be reading all of the comments in a PR, i've added this in to explain why the naming scheme in this file continues to use tasks.store instead of tasks.symbolication.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
have we considered a different split instead?
src/sentry/tasks/store/process.pysrc/sentry/tasks/store/symbolication.py
That would also make sense and avoid this kerfuffle as well plus paves the way for more structure in the whole store subtree of tasks.
(apologies to bring this up so late, but this isn't a complete drastic change i think)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i can make this change so that the "store" bit isn't completely misleading - i'll continue the conversation in #29421 instead of here, just so this can be merged and consequently unblock a PR that depends on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
jk i'm not doing this that would require me to update imports in 22 files for the store.py rename
|
@flub imports are now updated, and should be ready for a re-review.
i think the conversation in #28973 (comment) covers this, and i'm keeping the original task names (and metrics names for consistency & discoverability) so things continue running smoothly as a result. |
aka reduce the number of angry markus "why is my review requested on this" comments on native PRs.
Ingest can stop being pulled in for reviews on PRs that do not
require their feedback or should not require their feedback.