Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Raise informative warning when rescheduling an unknown task #2916

merged 2 commits into from Aug 3, 2019


Copy link

@jrbourbeau jrbourbeau commented Aug 2, 2019

When tasks are quickly asked to be rescheduled, and also quickly canceled, it can lead to the task key being removed from the scheduler, but before the task can be canceled, the task begins running on the worker. Then when Reschedule is raised, the scheduler no longer has a key for the rescheduled task. Here's a (contrived) example of this behavior:

from distributed import Client, Reschedule

if __name__ == "__main__":

    def f(x):
        raise Reschedule

    with Client(processes=False, threads_per_worker=1) as client:
        for i in range(100):
            z = client.submit(f, i)

which raises a KeyError when the scheduler attempts to reschedule the (already canceled) task:

distributed.core - ERROR - 'f-eb18a99a0c19477232670692e952f322'
Traceback (most recent call last):
  File "/Users/jbourbeau/github/Quansight-Labs/distributed/distributed/", line 416, in handle_comm
    result = await result
  File "/Users/jbourbeau/github/Quansight-Labs/distributed/distributed/", line 1489, in add_worker
    await self.handle_worker(comm=comm, worker=address)
  File "/Users/jbourbeau/github/Quansight-Labs/distributed/distributed/", line 2404, in handle_worker
    await self.handle_stream(comm=comm, extra={"worker": worker})
  File "/Users/jbourbeau/github/Quansight-Labs/distributed/distributed/", line 477, in handle_stream
    handler(**merge(extra, msg))
  File "/Users/jbourbeau/github/Quansight-Labs/distributed/distributed/", line 4366, in reschedule
    ts = self.tasks[key]
KeyError: 'f-eb18a99a0c19477232670692e952f322'

This PR adds an informative warning if attempting to reschedule a key not found on the scheduler and aborts the rescheduling process.

Copy link
Member Author

@jrbourbeau jrbourbeau commented Aug 2, 2019

Hm not sure why there's a seg fault for the Python 3.5 Travis build ( I wanna say it's unrelated to the changes here


Copy link
Member Author

@jrbourbeau jrbourbeau commented Aug 3, 2019

Seems the failure was transient (thanks to whoever restarted the build)


@mrocklin mrocklin merged commit 20ba1a7 into dask:master Aug 3, 2019
2 checks passed
Copy link

@mrocklin mrocklin commented Aug 3, 2019

Thanks @jrbourbeau !


@jrbourbeau jrbourbeau deleted the quick-reschedule branch Aug 3, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
None yet
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

2 participants