I'm adding extra tasks on the optimizing step to save some task results to disk. The extra tasks are run on the dask single-machine scheduler, while they are not run on the distributed scheduler. It might not be a bug, but the intended behavior. In that case, how should have I done it?
Thanks!
Minimal Complete Verifiable Example:
Example 1:
import dask
def optimize(dsk, keys):
print("Optimizing graph")
dsk = dask.utils.ensure_dict(dsk)
# Adding an extra task to the graph
dsk["extra_task"] = (print, "Running extra task")
return dsk
dask.config.set(delayed_optimize=optimize)
dask.delayed(print)("Running task").compute()
Running example 1 prints:
Optimizing graph
Running extra task
Running task
Example 2:
Start dask-scheduler and a dask-worker:
dask-scheduler & dask-worker localhost:8786 &
Add to example 1 the following:
import dask
import distributed
client = distributed.Client("localhost:8786")
def optimize(dsk, keys):
...
Example 2 prints "Optimizing graph" in the python process running the file, and "Running task" in the dask-worker process, but it doesn't print "Running extra task".
Environment:
- Dask and distributed versions: 2021.04.0
- Python version: 3.8.8
- Operating System: Ubuntu 20.04
- Install method (conda, pip, source): conda
New conda environment from environment-3.8.yaml. Cloned and "pip -e" installed dask and distributed, tags 2021.04.0.
I'm adding extra tasks on the optimizing step to save some task results to disk. The extra tasks are run on the dask single-machine scheduler, while they are not run on the distributed scheduler. It might not be a bug, but the intended behavior. In that case, how should have I done it?
Thanks!
Minimal Complete Verifiable Example:
Example 1:
Running example 1 prints:
Example 2:
Start dask-scheduler and a dask-worker:
Add to example 1 the following:
Example 2 prints "Optimizing graph" in the python process running the file, and "Running task" in the dask-worker process, but it doesn't print "Running extra task".
Environment:
New conda environment from environment-3.8.yaml. Cloned and "pip -e" installed dask and distributed, tags 2021.04.0.