
investigate oop with tasks in celery #62

Closed
hancush opened this issue Apr 25, 2018 · 2 comments

hancush commented Apr 25, 2018

we've defined a base task class with dynamic shared context for all of our delayed work in e52602f. this context is contingent on the standardized file we're operating on, e.g., each task expects a standardized file id.

however, the need for dynamic context introduces a challenge, because "the __init__ constructor (of the Task class) will only be called once per process."

this means we cannot use the __init__ method to establish context, given a standardized file id. instead, we define a setup method that accepts this id and sets class attributes accordingly.

we run this method each time a task is issued via celery's task_prerun signal. this signal provides access to the pending task (sender), as well as its args and kwargs, with which we can run setup prior to executing any task code.

this has the effect of giving us access to those contextual attributes in the task, without having to call setup at the top of each one.
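the moving parts above can be sketched without celery. this is a plain-python stand-in (hypothetical names, not our actual task code) for the one-instance-per-process constraint and the prerun hook:

```python
# plain-python stand-in (hypothetical names, no celery) for the pattern
# described above: the task object is created once per process, so
# per-invocation context is attached by a prerun hook, not __init__.

class BaseTask:
    def __init__(self):
        # called once per process; it cannot see a per-task file id
        self.file_id = None

    def setup(self, file_id):
        # establish context for this invocation, as our setup method does
        self.file_id = file_id

def on_task_prerun(task, *args, **kwargs):
    # mimics a task_prerun signal handler: run setup with the task's
    # args before any task code executes
    task.setup(args[0])

task = BaseTask()  # one instance for the whole process

def run_task(task, file_id):
    on_task_prerun(task, file_id)  # the "signal" fires first
    return task.file_id            # the task body sees the context attribute

assert run_task(task, 42) == 42
```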

however, it feels a little hacky.

when a task method is bound to a base task class, the code of the bound task is injected into that class as its run method. however, because our task method is not defined in a class context, it's not possible to define a common run method on the base class and extend it via super() from within the task: since the task code is injected as the run method of the base class, running super(BaseClass, self).run() calls the run method of celery's Task class (the base class's parent), which raises a NotImplementedError.

source:

def run(self, *args, **kwargs):
    """The body of the task executed by workers."""
    raise NotImplementedError('Tasks must define the run method.')

exception:

tp = <class 'celery.backends.base.NotImplementedError'>
value = NotImplementedError('Tasks must define the run method.',), tb = None

    def reraise(tp, value, tb=None):
        """Reraise exception."""
        if value.__traceback__ is not tb:
            raise value.with_traceback(tb)
>       raise value
E       celery.backends.base.NotImplementedError: Tasks must define the run method.
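the failure can be reproduced with plain classes standing in for celery's injection (hypothetical names; celery builds the task class differently under the hood, but the method resolution is the same):

```python
# rough stdlib sketch (hypothetical names, no celery) of the failure:
# the task function becomes the run method of a subclass of BaseClass,
# so super(BaseClass, self) resolves to Task, whose run raises.

class Task:
    def run(self, *args, **kwargs):
        raise NotImplementedError('Tasks must define the run method.')

class BaseClass(Task):
    pass

def a_task(self):
    # from inside the injected run, super(BaseClass, self) skips past
    # BaseClass in the MRO straight to Task.run
    super(BaseClass, self).run()

# stand-in for the injection: a new task class with a_task as its run
TaskClass = type('a_task', (BaseClass,), {'run': a_task})

message = None
try:
    TaskClass().run()
except NotImplementedError as exc:
    message = str(exc)

assert message == 'Tasks must define the run method.'
```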

is there something else we should hook into, to run common code prior to task execution? or are the conventions here just unconventional?


hancush commented Apr 25, 2018

update:

we can achieve super() functionality if we write a base class and an empty pass-through class that inherits from the base class, then bind the task methods to the pass-through class, like so:

import celery
from celery import shared_task

class BaseClass(celery.Task):
    def run(self):
        self.foo = 'bar'

class PassThroughClass(BaseClass):
    pass

@shared_task(bind=True, base=PassThroughClass)
def a_task(self):
    super(PassThroughClass, self).run()

    print(self.foo)  # prints bar
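a plain-python sketch (hypothetical names, no celery) of why the extra inheritance level makes super() resolve to the base class's run rather than celery's Task.run:

```python
# stdlib sketch (hypothetical names, no celery): the injected run now
# lives one level below PassThroughClass in the hierarchy, so
# super(PassThroughClass, self) lands on BaseClass.run instead of
# the raising Task.run.

class Task:
    def run(self, *args, **kwargs):
        raise NotImplementedError('Tasks must define the run method.')

class BaseClass(Task):
    def run(self):
        self.foo = 'bar'

class PassThroughClass(BaseClass):
    pass

def a_task(self):
    super(PassThroughClass, self).run()  # resolves to BaseClass.run
    return self.foo

# stand-in for the injection: a new task class with a_task as its run
TaskClass = type('a_task', (PassThroughClass,), {'run': a_task})

assert TaskClass().run() == 'bar'
```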

@fgregg, i think this pattern is preferable. what do you think?


hancush commented Apr 25, 2018

we're sticking with our signal approach! i've extended the docstrings in 0cdbf06.

@hancush hancush closed this as completed Apr 25, 2018