Cached function executed in parallel yields a corrupted cache and unnecessary recomputations #490

Closed
lesteve opened this issue Feb 3, 2017 · 0 comments
Labels: bug

@lesteve lesteve commented Feb 3, 2017

From scikit-learn/scikit-learn#7990 (comment).

Snippet reproducing the problem:

from joblib import Parallel, delayed, Memory

mem = Memory('/tmp/joblib', verbose=10)


def func():
    complex_obj = ['a'*1000] * 1000
    return complex_obj


func_cached = mem.cache(func)

Parallel(n_jobs=-1)(delayed(func_cached)() for i in range(100))

Run with:

rm -rf /tmp/joblib/ && ipython /tmp/test.py 2>&1 | tee log

Plenty of warnings like this:

WARNING:root:[MemorizedFunc(func=<function func at 0x7f41f372d048>, cachedir='/tmp/joblib/joblib')]: Exception
 Traceback (most recent call last):
  File "/home/lesteve/dev/joblib/joblib/memory.py", line 514, in _cached_call
    verbose=self._verbose)
  File "/home/lesteve/dev/joblib/joblib/memory.py", line 138, in _load_output
    result = numpy_pickle.load(filename, mmap_mode=mmap_mode)
  File "/home/lesteve/dev/joblib/joblib/numpy_pickle.py", line 578, in load
    obj = _unpickle(fobj, filename, mmap_mode)
  File "/home/lesteve/dev/joblib/joblib/numpy_pickle.py", line 508, in _unpickle
    obj = unpickler.load()
  File "/home/lesteve/miniconda3/lib/python3.5/pickle.py", line 1037, in load
    raise EOFError
EOFError
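
One plausible reading of the traceback above is that a worker tried to load a cache file that another worker had created but not yet finished writing. The sketch below illustrates that assumption with the standard library only; it is not joblib's code, and `write_atomically`, `load`, and `demo` are hypothetical names introduced for illustration. It shows that unpickling a freshly opened, still-empty file raises EOFError, and that writing to a temporary file followed by a rename keeps readers from ever seeing an incomplete file.

```python
import os
import pickle
import tempfile


def write_atomically(path, obj):
    """Pickle obj to a temp file in the same directory, then rename it into place.

    Readers see either no file or a complete one, never a partial write.
    """
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(obj, f)
        # os.replace overwrites the destination in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if dumping or renaming failed.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load(path):
    with open(path, "rb") as f:
        return pickle.load(f)


def demo():
    expected = ["a" * 1000] * 1000
    with tempfile.TemporaryDirectory() as cache_dir:
        path = os.path.join(cache_dir, "output.pkl")

        # Simulate a concurrent writer that has opened the file
        # but has not written any bytes yet.
        open(path, "wb").close()
        try:
            load(path)
            saw_eof = False
        except EOFError:
            saw_eof = True

        # With the temp-file-plus-rename approach the final path
        # only ever holds a complete pickle.
        write_atomically(path, expected)
        return saw_eof, load(path) == expected


if __name__ == "__main__":
    print(demo())
```

Whether this matches the actual failure in `Memory` would need to be confirmed against the joblib source; the sketch only shows why a non-atomic write is enough to produce this kind of error when several processes share one cache directory.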