-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tackle dask issues #518
Tackle dask issues #518
Conversation
@@ -111,12 +111,13 @@ def start_task_on_dataset_split( | |||
deps = [d.done_event for d in dependencies] if dependencies is not None else [] | |||
if not all(deps): | |||
raise ValueError("Can't wait for an unstarted Module.") | |||
self.done_event = Event(name="-".join(map(str, self.task_id)), client=client) | |||
self.done_event = Event(name=self.task_id, client=client) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For some reason, here it was a "-" and not a "_". Not sure why and if it was causing any issues.
@JosephMarinier @Dref360 I was unsuccessful at solving the issue. Still, LMK what you think of this PR. I think we should at least merge the Dask Dashboard support, and the |
586f0dc
to
7a7be05
Compare
Description:
We are now often seeing
distributed.comm.core.CommClosedError: in <TCP (closed) ConnectionPool.broadcast local=tcp://127.0.0.1:62774 remote=tcp://127.0.0.1:62755>: Stream is closed
errors in the logs, which causes tasks to be lost. They are often trigger whenself.client.run(ArtifactManager.clear_cache)
is called.dask
/distributed
; however, I can't get the router tests to work with it.Checklist:
You should check all boxes before the PR is ready. If a box does not apply, check it to acknowledge it.
ran
pre-commit run --all-files
at the end.our users.
README
files and our wiki for any big design decisions, if relevant.