In my use of dask, where I have some very long running tasks, I often encounter the following situation:
- submit long running tasks (as part of a compute graph or otherwise)
- an error occurs or I change my mind and cancel the future or release my Client
- the Scheduler transitions the tasks from processing->released->forgotten
- any tasks that were already executing keep running on the worker, but since the scheduler now sees that worker as available, it assigns new work to it, even when other workers are idle
- the tasks that are actually still executing appear nowhere in the dashboard or on the workers.html page; even the Call Stacks view in the worker dashboard does not show the old, cancelled task that is still running
- the cluster appears to have work and some idle workers, yet it looks hung because nothing is proceeding
Since I run dask workers with `--nthreads 1`, I could simply restart any worker that is still processing a task when the client cancels it or disconnects from the scheduler. However, when I tried to do this with a WorkerPlugin, I did not see any updates when the task transitioned from processing to released. Is this intentional?
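For reference, this is roughly the kind of plugin I registered: a minimal sketch using the worker plugin hook names (`setup`/`transition`/`teardown`). The class name and attributes are illustrative, not part of dask; I am exercising the hook directly here rather than on a live cluster.

```python
# Sketch of a worker-side plugin that records task state transitions.
# In real use this would subclass distributed.WorkerPlugin and be
# attached with client.register_worker_plugin(...); here the hooks are
# shown standalone so the call shape is clear without a cluster.

class TransitionLogger:
    """Records every (key, start_state, finish_state) reported to it."""

    def __init__(self):
        self.events = []

    def setup(self, worker):
        # Called once when the plugin is attached to a worker.
        self.worker = worker

    def transition(self, key, start, finish, **kwargs):
        # Note: a worker plugin sees *worker-side* states (e.g.
        # "executing", "memory", "released"), not the scheduler's
        # processing -> released -> forgotten transitions described
        # above -- which may be why I never saw the update I expected.
        self.events.append((key, start, finish))

    def teardown(self, worker):
        pass


if __name__ == "__main__":
    # Simulate the call the worker would make when a task finishes.
    plugin = TransitionLogger()
    plugin.transition("my-task-1", "executing", "released")
    print(plugin.events)
```

On a real cluster I attached it with `client.register_worker_plugin(TransitionLogger())`, but the `transition` hook never fired for the cancelled, still-running task.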