Try out Dask-Jobqueue for computing your graphs on a cluster with minimal code modification.
See this post by @ericmjl for a tutorial on how to use this package.
Items passed to your function from a bag can only be iterated over once (they are itertools.chain objects). You would have to call list(items) on them to iterate over them multiple times.
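A minimal sketch of the single-pass behaviour, using a plain itertools.chain object in place of what a bag would pass in. The function name summarize and the sample numbers are made up for illustration; no Dask is needed to see the effect.

```python
from itertools import chain


def summarize(items):
    """Compute a mean, which needs two passes over the data (sum and len)."""
    # Materialize the one-shot iterator so it can be reused below.
    items = list(items)
    total = sum(items)
    count = len(items)
    return total / count


# A chain object is exhausted after the first full pass.
one_shot = chain([1, 2], [3, 4])
first_pass = list(one_shot)
second_pass = list(one_shot)  # nothing left to yield

# Passing a fresh chain to a function that calls list() first works fine.
mean = summarize(chain([1, 2], [3, 4]))
```

The cost of list(items) is that the whole partition is held in memory at once, so it is only worth doing when a second pass is actually needed.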
The following gave me an endlessly looping "port in use" error:
    from dask import delayed
    from dask.distributed import Client
    with Client() as client:
        client.compute(delayed(my_func)(), sync=True)
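Embedding it in a __main__ check resolves this (most likely because worker processes re-import the script and each one would otherwise try to start its own Client):
    from dask import delayed
    from dask.distributed import Client
    if __name__ == '__main__':
        with Client() as client:
            client.compute(delayed(my_func)(), sync=True)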
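A common mistake is calling delayed(func).compute() for a func that doesn't take any parameters, instead of delayed(func)().compute(). Without the extra pair of parentheses the function is never called, and you get the function object back rather than its result.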
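A small sketch of the difference, assuming dask is installed and using the default local scheduler. The function answer is a made-up stand-in for any function that takes no parameters.

```python
from dask import delayed


def answer():
    """Hypothetical zero-argument function used only for illustration."""
    return 42


# Wraps the function object itself; nothing has been called yet.
result_wrong = delayed(answer).compute()

# The extra () builds a delayed call, which compute() then executes.
result_right = delayed(answer)().compute()

print(callable(result_wrong))
print(result_right)
```

If a computed value turns out to be a function rather than data, the missing call parentheses are a likely cause.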