Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

local: extend started_cbs AFTER state is initialized #4460

Merged
merged 1 commit into from Feb 9, 2019

Conversation

@crepererum
Copy link
Contributor

@crepererum crepererum commented Feb 5, 2019

  • Tests added / passed
  • Passes flake8 dask

I have observed a case where an exception seems to be raised very early and therefore state is not initialized in the finally block, leading to an UnboundLocalError and swallowing the original exception (which would have been helpful to have). I think in general, the state here should be initialized first, no matter what the root cause of my local problem (which I will report once I have found it).

I don't know if this needs a test or what a good test would be to trigger this issue.

@mrocklin
Copy link
Member

@mrocklin mrocklin commented Feb 5, 2019

Thanks for catching this @crepererum . Do you happen to have a minimal example for this that could serve as a test? Otherwise I'm afraid that future maintainers might reverse this decision without realizing it.

Also cc @jcrist for review

@crepererum crepererum force-pushed the fix/local_race_condition branch from 79f45f5 to 9fd528d Feb 5, 2019
@crepererum
Copy link
Contributor Author

@crepererum crepererum commented Feb 5, 2019

@mrocklin I've managed to design a regression test :)

dask/tests/test_threaded.py Outdated Show resolved Hide resolved
@crepererum crepererum force-pushed the fix/local_race_condition branch from 9fd528d to 531bcf4 Feb 6, 2019
@mrocklin
Copy link
Member

@mrocklin mrocklin commented Feb 6, 2019

Merging this in tomorrow if there are no further comments (cc @jcrist )

@mrocklin mrocklin merged commit 2376da2 into dask:master Feb 9, 2019
2 checks passed
@mrocklin
Copy link
Member

@mrocklin mrocklin commented Feb 9, 2019

Thanks for the fix @crepererum ! This is in.

@rainwoodman
Copy link
Contributor

@rainwoodman rainwoodman commented Mar 4, 2019

After this exact PR is merged, we start to see errors like this raised from cachey:

dask/cachey#15

I suspect it is because we are now registering the same key twice into cachey, which it doesn't currently support. But how could this happen?

@@ -405,15 +405,16 @@ def get_async(apply_async, num_workers, dsk, result, cache=None,
started_cbs = []
succeeded = False
try:
keyorder = order(dsk)

state = start_state_from_dask(dsk, cache=cache, sortkey=keyorder.get)
Copy link
Member

@jcrist jcrist Mar 4, 2019

Sorry for the late review. This change breaks caching because the start callback needs to run before we initialize state, otherwise we don't replace tasks with computed keys properly, resulting in an invalid state (we'll compute dependencies for cached task x, then the callback will replace task x with its computed value, invalidating the computed dependencies).

@jcrist
Copy link
Member

@jcrist jcrist commented Mar 4, 2019

I have observed a case where an exception seems to be raised very early and therefore state is not initialized in the finally block, leading to an UnboundLocalError and swallowing the original exception (which would have been helpful to have). I think in general, the state here should be initialized first, no matter what the root cause of my local problem (which I will report once I have found it).

I'm not sure I fully understand the issue here, but a different fix would just be to set state = {} before the try block, and go on from there. No unbound local, and keeps things working as is.

rainwoodman added a commit to rainwoodman/dask that referenced this issue Mar 4, 2019
jorge-pessoa pushed a commit to jorge-pessoa/dask that referenced this issue May 14, 2019
jorge-pessoa pushed a commit to jorge-pessoa/dask that referenced this issue May 14, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants