Skip to content

Conversation

@relaxolotl
Copy link
Contributor

@relaxolotl relaxolotl commented Sep 29, 2021

aka reduce the number of angry markus "why is my review requested on this" comments on native PRs.

Ingest can stop being pulled in for reviews on PRs that do not
require their feedback or should not require their feedback.

Copy link
Member

@untitaker untitaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you!

@flub
Copy link
Contributor

flub commented Sep 30, 2021

generally LGTM. I put this in NATIVE-259 if you want to link the ticket btw

@flub flub changed the title WIP: ref: Move all symbolicator-related code in store.py to a separate file WIP: ref: Move all symbolicator-related code in store.py to a separate file (NATIVE-259) Sep 30, 2021
relaxolotl and others added 2 commits October 1, 2021 23:11
ingest can stop being pulled in for reviews on PRs that do not
require their feedback or should not require their feedback

NATIVE-259

def _continue_to_process_event():
process_task = process_event_from_reprocessing if from_reprocessing else process_event
# TODO: this uses a private "store" function. is there a way to not do this?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nts: figure out an answer to this, namely "is this ok?" and how to make it ok if it isn't

also probably good to think about the the line that has been drawn by splitting all of the symbolicator stuff out into its own file: are there other places where it's using internal store.py stuff? is there symbolication logic in store.py that could and should be moved out into a method in this file? what are the low hanging fruit for those?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for now, the compromise is to rename _do_process_event to do_process_event and focus on just splitting the code in this PR. patching up some of the lines drawn by this split, such as using what is supposed to be an internal store.py function in symbolicate.py can come in a follow-up PR.

@relaxolotl
Copy link
Contributor Author

@untitaker can you take a second look at this when you have time?

Copy link
Member

@untitaker untitaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is one change to the task name attribute that really needs to be reverted and tested (probably manually), otherwise I believe this is good to go



@instrumented_task(
name="sentry.tasks.symbolication.symbolicate_event_low_priority",
Copy link
Member

@untitaker untitaker Oct 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I previously said we need to import/re-export all of those new tasks from tasks/store.py, then retracted that comment because that's not how celery dispatch actually works. I mistakenly assumed that Celery works off of import paths (it can do that, it's just not how we set it up). However, if we rename those identifiers away from sentry.tasks.store to sentry.tasks.symbolication I think we have a problem again: In-flight symbolication tasks will fail to dispatch when spawned from old code and run with new codebase.

To test this:

  1. check out master, do not start or have sentry devserver or sentry run worker running
  2. open sentry shell, import one of the tasks from this module and spawn it (mytask.delay(...)) -- you don't really need to provide correct arguments, you just need to demonstrate whether celery finds the right code to execute, in any case you should not get an error in the shell
  3. optional verification step: sentry queues list should list non-zero task count for the symbolication queue (depending on which task you picked). If it is zero, you either didn't spawn the task successfully (step 2 failed) or have some background process running (step 1 failed)
  4. check out this branch
  5. run sentry devserver --workers or run sentry run worker (doesn't matter)
  6. that command will spew an error message that the task could not be imported

if you repeat this test without switching between branches, celery will either execute the task successfully or at least tell you that you spawned it with the wrong arguments (depending on what you passed into .delay()) -- that's why your tests pass

this matters because during deployment we have the same scenario: the code that spawns the celery task can be old while the worker that is supposed to process the task may already run on the new version

If I remember correctly preserving the old value for name= is enough. Having a way to define a legacy alias would be nice too. I think this can be easily patched into the instrumented_task handler, but I am not sure.

We investigated and talked about this in a call but didn't define clear action items afterwards and didn't write anything down (sorry)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can confirm this explodes as described:

17:24:30 worker  | sentry/.venv/lib/python3.6/site-packages/celery/fixups/django.py:206: UserWarning: Using settings.DEBUG leads to a memory
17:24:30 worker  |             leak, never use this setting in production environments!
17:24:30 worker  |   leak, never use this setting in production environments!''')
17:24:31 worker  | Traceback (most recent call last):
17:24:31 worker  |   File "sentry/.venv/lib/python3.6/site-packages/celery/worker/consumer/consumer.py", line 562, in on_task_received
17:24:31 worker  |     strategy = strategies[type_]
17:24:31 worker  | KeyError: 'sentry.tasks.store.symbolicate_event_low_priority'
17:24:31 worker  | 17:24:31 [ERROR] celery.worker.consumer.consumer: Received unregistered task of type 'sentry.tasks.store.symbolicate_event_low_priority'.
17:24:31 worker  | The message has been ignored and discarded.
17:24:31 worker  | 
17:24:31 worker  | Did you remember to import the module containing this task?
17:24:31 worker  | Or maybe you're using relative imports?
17:24:31 worker  | 
17:24:31 worker  | Please see
17:24:31 worker  | http://docs.celeryq.org/en/latest/internals/protocol.html
17:24:31 worker  | for more information.

for now i'll keep the old names - will follow up with a second PR that explores adding in legacy names. the hardest part will be to figure out when to axe legacy names, but it's better than having to lose events every time a rename happens.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a strong -1 on renaming any of these tasks

"sentry.tasks.sentry_apps",
"sentry.tasks.servicehooks",
"sentry.tasks.store",
"sentry.tasks.symbolication",
Copy link
Member

@untitaker untitaker Oct 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes 👍

this is necessary because if celery is not instructed to import the module, it may be that the tasks cannot be discovered by their task name. the tests and probably local manual testing work regardless of this change because celery probably imports sentry.tasks.symbolication as a side effect of loading another module

Copy link
Contributor

@flub flub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from sentry.tasks.store import should_demote_symbolication
also needs to be updated with this move

Comment on lines +31 to +37
# The names of tasks and metrics in this file point to tasks.store instead of tasks.symbolicator
# for legacy reasons, namely to prevent celery from dropping older tasks and needing to
# update metrics tooling (e.g. DataDog). All (as of 19/10/2021) of these tasks were moved
# out of tasks/store.py, hence the "store" bit of the name.
#
# New tasks and metrics are welcome to use the correct naming scheme as they are not
# burdened by aforementioned legacy concerns.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since it's unlikely that somebody in the future will be reading all of the comments in a PR, i've added this in to explain why the naming scheme in this file continues to use tasks.store instead of tasks.symbolication.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have we considered a different split instead?

  • src/sentry/tasks/store/process.py
  • src/sentry/tasks/store/symbolication.py

That would also make sense and avoid this kerfuffle as well plus paves the way for more structure in the whole store subtree of tasks.

(apologies to bring this up so late, but this isn't a complete drastic change i think)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i can make this change so that the "store" bit isn't completely misleading - i'll continue the conversation in #29421 instead of here, just so this can be merged and consequently unblock a PR that depends on this.

Copy link
Contributor Author

@relaxolotl relaxolotl Oct 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

jk i'm not doing this that would require me to update imports in 22 files for the store.py rename

@relaxolotl relaxolotl requested review from flub and untitaker October 19, 2021 20:09
@relaxolotl
Copy link
Contributor Author

@flub imports are now updated, and should be ready for a re-review.

did we test locally if events still get processed and symbolicated etc?

i think the conversation in #28973 (comment) covers this, and i'm keeping the original task names (and metrics names for consistency & discoverability) so things continue running smoothly as a result.

@relaxolotl relaxolotl merged commit 0c5c663 into master Oct 20, 2021
@relaxolotl relaxolotl deleted the ref/split-store branch October 20, 2021 17:58
@github-actions github-actions bot locked and limited conversation to collaborators Nov 5, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants