Different behaviour in distributed vs single-machine scheduler: missing task #4750

@maurosilber

Description

I'm adding extra tasks in the optimization step to save some task results to disk. The extra tasks run under dask's single-machine scheduler, but they are not run by the distributed scheduler. This might not be a bug but the intended behavior; in that case, how should I have done it?

Thanks!

Minimal Complete Verifiable Example:

Example 1:

import dask

def optimize(dsk, keys):
    print("Optimizing graph")
    dsk = dask.utils.ensure_dict(dsk)
    # Adding an extra task to the graph
    dsk["extra_task"] = (print, "Running extra task")
    return dsk

dask.config.set(delayed_optimize=optimize)

dask.delayed(print)("Running task").compute()

Running example 1 prints:

Optimizing graph
Running extra task
Running task

Example 2:

Start dask-scheduler and a dask-worker:

dask-scheduler & dask-worker localhost:8786 &

Then add the following to example 1:

import dask
import distributed

client = distributed.Client("localhost:8786")

def optimize(dsk, keys):
    ...

Example 2 prints "Optimizing graph" in the Python process running the script and "Running task" in the dask-worker process, but it never prints "Running extra task".
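A plausible cause (my guess from the observed behaviour, not confirmed from the scheduler source): the graph is culled down to the tasks reachable from the requested keys before scheduling, and "extra_task" has no dependents, so the distributed scheduler drops it. The same effect can be reproduced locally with `dask.optimization.cull`:

```python
from dask.optimization import cull

# Toy graph: one requested task plus an unreachable side-effect task
dsk = {
    "task": (str, 1),
    "extra_task": (str, 2),
}

# cull() keeps only the tasks reachable from the requested keys
culled, dependencies = cull(dsk, ["task"])
print(sorted(culled))  # "extra_task" is dropped
```

Here `culled` contains only `"task"`, because nothing in the requested keys depends on `"extra_task"`.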

Environment:

  • Dask and distributed versions: 2021.04.0
  • Python version: 3.8.8
  • Operating System: Ubuntu 20.04
  • Install method (conda, pip, source): conda

New conda environment from environment-3.8.yaml. Cloned dask and distributed at tag 2021.04.0 and installed both with `pip install -e`.
