Class-based task definitions #158

Closed
piercefreeman opened this issue Jan 4, 2020 · 2 comments

@piercefreeman

Hi there -

I'm exploring tasktiger for a larger project. The sticking point so far has been the function-based definition of each worker handler. Is it possible instead to register a class instance as a task, so that we can customize the task with instance variables configured during initialization?

i.e., something like the following:

# tiger_task.py (the `tiger` TaskTiger instance is created earlier in this file)
class MyTask:
    __name__ = "MyTask"

    def __init__(self, echo):
        self.echo = echo

    def __call__(self, obj):
        print("Hello", [obj] * self.echo)

task = MyTask(echo=3)
my_task = tiger.task(task)

if __name__ == "__main__":
    tiger.run_worker()

# separate script that queues the task
from tiger_task import tiger, my_task

for _ in range(200):
    my_task.delay(abc=True)

This errors out with TypeError: __init__() got an unexpected keyword argument 'abc', since it's trying to re-initialize the actual class during the fork instead of forking the instantiated object and then running its call method.

From digging into the code so far, this looks like a fundamental assumption in the function serialization logic, which grabs the function from its module path / global namespace. I'm wondering whether anyone has a workaround that would allow more object-oriented calling, as in the code above.

@AlecRosenbaum
Contributor

In order to pass the function to the worker, we need to be able to serialize a dotted path to the task, which is then used to import and run the task on the worker side.

If you define tasks like this, my_task.__name__ == 'MyTask', so the worker ends up running module.MyTask(*args, **kwargs). I'm not aware of a way to programmatically generate a dotted path to a callable global variable, which is essentially what my_task is in this example. I have a feeling it's not possible to do from the object-definition side because you could end up with ambiguous scenarios when you have multiple references to the same object. It might be possible if you try to get a stack trace and inspect it looking for an assignment, but that would be really messy and unreliable.

There's a pretty nasty way to work around it that I wouldn't recommend, which would be setting __name__ after init to match the global name:

task = MyTask(echo=3)
task.__name__ = 'my_task'
my_task = tiger.task(task)
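
The lookup problem, and what this workaround changes, can be reproduced without tasktiger or Redis. The sketch below is not tasktiger's actual code: `dotted_path`, `resolve`, and the `demo_tasks` module are hypothetical stand-ins. It only assumes that the worker rebuilds the callable from `__module__` and `__name__`, as described above.

```python
import importlib
import sys
import types


def dotted_path(func):
    # Hypothetical serializer: derive "module.name" from the callable itself.
    return f"{func.__module__}.{func.__name__}"


def resolve(path):
    # Hypothetical worker-side lookup: import the module, then fetch the name.
    module_name, attr = path.rsplit(".", 1)
    return getattr(importlib.import_module(module_name), attr)


class MyTask:
    __name__ = "MyTask"

    def __init__(self, echo):
        self.echo = echo

    def __call__(self, obj):
        return [obj] * self.echo


# Register a stand-in module so the example is importable from anywhere.
demo = types.ModuleType("demo_tasks")
sys.modules["demo_tasks"] = demo
MyTask.__module__ = "demo_tasks"
demo.MyTask = MyTask
demo.my_task = MyTask(echo=3)

# The instance reports the class-level __name__, so the path points at the class.
path_before = dotted_path(demo.my_task)
found_before = resolve(path_before)
try:
    found_before(abc=True)  # re-runs MyTask.__init__ with the task's kwargs
    error = None
except TypeError as exc:
    error = exc

# Workaround: make the instance's __name__ match its global variable name.
demo.my_task.__name__ = "my_task"
path_after = dotted_path(demo.my_task)
found_after = resolve(path_after)
result = found_after("hi")
```

Before the rename, the path resolves to the class, so the task's arguments end up in `__init__`. After it, the path resolves to the configured instance. This holds only as long as `__name__` and the global variable name are kept in sync by hand.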

I'm sure your real scenario is more complicated, but in this example (assuming I couldn't change MyTask) I'd probably write something like this:

@tiger.task
def my_task(*args, **kwargs):
    return MyTask(echo=3)(*args, **kwargs)

@AlecRosenbaum
Contributor

In short, I don't foresee us adding support for this type of usage primarily because I'm not aware of a reliable way to get an importable path to the task when they're defined like this.

For now I'm going to close the issue but feel free to reopen if there's a good way to solve the problem I posed above that I'm just not aware of. Alternatively if you can provide more details on the real use case that invalidates my approach above, I'll try to respond to any additional comments when I get a chance.
