Support passing lambdas into tasks #99

Open
emlys opened this issue Nov 1, 2023 · 3 comments
Labels: enhancement (New feature or request), on hold

Comments

@emlys
Member

emlys commented Nov 1, 2023

Callables passed into TaskGraph.add_task currently have to be named functions defined in the global scope. There are many cases where it would be convenient to pass in a lambda function defined in-line. I'm specifically thinking about raster_map, e.g.

taskgraph.add_task(
    func=pygeoprocessing.raster_map,
    kwargs=dict(
        op=<...>,
        rasters=rasters,
        target_path=target_path),
    ...

It would be convenient to write op=lambda x: ..., but that breaks when n_workers > 0 because the args must be pickled, and pickle cannot serialize lambdas or locally defined objects.
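To illustrate the pickling limitation, here is a minimal sketch using only the standard library. The helper can_pickle and the function make_local_op are hypothetical names made up for this example; they are not part of taskgraph.

```python
import math
import pickle


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        # The exact exception type varies by Python version and object kind.
        return False


def make_local_op():
    # A function defined inside another function is a "local object".
    def local_op(x):
        return x + 1
    return local_op


# Functions are pickled by reference (module + name), so they must be
# importable by name when they are unpickled.
print(can_pickle(math.sqrt))        # named, importable function
print(can_pickle(lambda x: x + 1))  # lambda: no importable name
print(can_pickle(make_local_op()))  # local function: same problem
```

Only the first call should report success, which is why the args currently have to be named, globally defined functions when worker processes are in use.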

Taskgraph could support lambdas and local callables by using multiprocess, a fork of Python's multiprocessing that supports pickling more types. I briefly tried replacing multiprocessing with multiprocess in Task.py and it worked: the test suite passed and I was able to pass lambdas into taskgraph. But there could be other implications of using multiprocess, such as conflicts with multiprocessing if both were used.

It's also worth noting that, while Python supports pickling functools.partial objects, taskgraph raises an error when given one because it relies on the __name__ attribute, which partials don't have. It would be nice to support partials too.

Traceback (most recent call last):
  File "/Users/emily/miniconda3/envs/main/lib/python3.10/site-packages/taskgraph/Task.py", line 625, in add_task
    new_task = Task(
  File "/Users/emily/miniconda3/envs/main/lib/python3.10/site-packages/taskgraph/Task.py", line 1003, in __init__
    scrubbed_value = _scrub_task_args(arg, self._target_path_list)
  File "/Users/emily/miniconda3/envs/main/lib/python3.10/site-packages/taskgraph/Task.py", line 1459, in _scrub_task_args
    return '%s:%s' % (base_value.__name__, source_code)
AttributeError: 'functools.partial' object has no attribute '__name__'. Did you mean: '__ne__'?
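
The traceback can be reproduced in isolation. The sketch below shows that a partial pickles fine but lacks __name__, and outlines one possible way to derive a name by unwrapping it. callable_name is a hypothetical helper for illustration only; it is not taskgraph's code and not a proposed patch.

```python
import functools
import math
import pickle


def callable_name(func):
    """Hypothetical helper: best-effort name for a callable."""
    # Unwrap (possibly nested) partials to reach the underlying function.
    while isinstance(func, functools.partial):
        func = func.func
    # Fall back to repr() for callables that still have no __name__.
    return getattr(func, '__name__', repr(func))


pow_of_two = functools.partial(math.pow, 2)

# Partials survive a pickle round trip as long as the wrapped
# function and the bound arguments are themselves picklable.
restored = pickle.loads(pickle.dumps(pow_of_two))
print(restored(3))

# ...but they have no __name__, which is what triggers the AttributeError.
print(hasattr(pow_of_two, '__name__'))
print(callable_name(pow_of_two))
```

Note that unwrapping discards the bound arguments, so anything that uses the name to identify a task would presumably also need to account for the partial's args and keywords.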
emlys added the enhancement label Nov 1, 2023
@dcdenu4
Member

dcdenu4 commented Nov 6, 2023

Related to #83

@emlys
Member Author

emlys commented Nov 9, 2023

We talked about this today and decided to wait until after the taskgraph 1.0 release. Taskgraph has been stable for a while now, and we want to do the release before mixing anything up.

emlys added the on hold label Nov 9, 2023
@emlys
Member Author

emlys commented Nov 9, 2023

There might be more minimal ways to add support for lambdas, like patching multiprocessing without using multiprocess, or passing in lambdas as strings.
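
As a rough sketch of the "lambdas as strings" idea: a string always pickles, so a globally defined function could rebuild the callable from its source on the worker side. apply_op_from_source is a hypothetical name, and this assumes the source string is trusted, since it goes through eval.

```python
import pickle


def apply_op_from_source(op_source, value):
    """Rebuild a callable from its source string, then call it.

    Assumption: op_source is a trusted Python expression that evaluates
    to a callable. eval() must never be given untrusted input.
    """
    op = eval(op_source)
    return op(value)


op_source = "lambda x: x * 2"

# A plain string survives pickling, unlike the lambda it describes.
restored_source = pickle.loads(pickle.dumps(op_source))
print(apply_op_from_source(restored_source, 21))
```

A side effect of this approach is that the lambda cannot close over local variables; anything it needs would have to be passed in as an argument.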
