Codecov Report
@@ Coverage Diff @@
## master #914 +/- ##
==========================================
- Coverage 95.4% 94.66% -0.75%
==========================================
Files 45 45
Lines 6460 6462 +2
==========================================
- Hits 6163 6117 -46
- Misses 297 345 +48
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #914 +/- ##
==========================================
- Coverage 95.42% 95.41% -0.01%
==========================================
Files 45 45
Lines 6494 6462 -32
==========================================
- Hits 6197 6166 -31
+ Misses 297 296 -1
Continue to review full report at Codecov.
|
|
Also, @mrocklin feel free to chime in and share your thoughts :) |
|
@pierreglaser I just tried this patch with the following code:

import numpy as np
from dask.distributed import Client, LocalCluster
from joblib import Parallel, delayed, parallel_backend

def sum_values(array, scalar):
    return (array + scalar).sum()

cluster = LocalCluster()
client = Client(cluster)

with parallel_backend("dask"):
    results = Parallel()(delayed(sum_values)(np.zeros(i), 5) for i in range(50000, 50010))

print(results)

And now it does not hang and finishes as expected :) 🎉🎉 It used to hang forever until I hit Ctrl-C. If anyone can double-check this, it would be really helpful |
|
Thanks for the feedback -- I'm looking at it. |
|
@pierreglaser sorry, edited my previous comment. It DOES work now :) (was testing the wrong branch) |
|
Great! |
|
It also works for me with the snippet provided by @julioasotodv. 🎉🎉 (cc @samronsin). |
|
I've also tried the snippet provided by @ogrisel on #852 (that originally comes from
Python packages used:
backcall==0.1.0
bokeh==0.13.0
Click==7.0
cloudpickle==1.2.1
dask==1.2.2
decorator==4.4.0
-e git+https://github.com/jjerphan/distributed.git@6ea010bcf21db7445bd26286966f59a7e75ab390#egg=distributed
HeapDict==1.0.0
ipython==7.7.0
ipython-genutils==0.2.0
jedi==0.14.1
Jinja2==2.10.1
-e git+https://github.com/pierreglaser/joblib.git@f0e687901adf309dc7e7f47af5a8d901383e67eb#egg=joblib
MarkupSafe==1.1.1
msgpack==0.6.1
numpy==1.15.4
packaging==19.0
pandas==0.23.4
parso==0.5.1
pexpect==4.7.0
pickleshare==0.7.5
prompt-toolkit==2.0.9
psutil==5.6.3
ptyprocess==0.6.0
Pygments==2.4.2
pyparsing==2.4.1.1
python-dateutil==2.8.0
pytz==2019.1
PyYAML==5.1.1
scikit-learn==0.20.3
scipy==1.1.0
six==1.12.0
sortedcontainers==2.1.0
tblib==1.4.0
toolz==0.10.0
tornado==5.1.1
traitlets==4.3.2
wcwidth==0.1.7
zict==1.0.0 |
|
I have tested this PR and verified it also fixes the problem I reported in dask/dask#2665 |
|
@pierreglaser, is this PR still waiting on something or is it ready to merge? |
|
It's missing a review by a core dev. @ogrisel should be back from vacation in a week or so. |
|
OK, I was trying to see if you could do a fully async / non-blocking version, but it sounds too cumbersome and would probably require a significant refactoring of the joblib backends. I will push a what's new entry to re-trigger CI and then merge this PR. |
|
Thanks @ogrisel and @pierreglaser! 😀 |
|
Thanks @ogrisel and @pierreglaser! 😀 — bis |
|
Thanks all
|
Release 0.14.0
Improved the load balancing between workers to avoid stragglers caused by an excessively large batch size when the task duration varies significantly (because of the combined use of joblib.Parallel and joblib.Memory with a partially warmed cache, for instance). joblib/joblib#899
Add official support for Python 3.8: fixed protocol number in Hasher and updated tests.
Fix a deadlock when using the dask backend (when scattering large numpy arrays). joblib/joblib#914
Warn users that they should never use joblib.load with files from untrusted sources. Fix security-related API change introduced in numpy 1.16.3 that would prevent using joblib with recent numpy versions. joblib/joblib#879
Upgrade to cloudpickle 1.1.1, which adds support for the upcoming Python 3.8 release among other things. joblib/joblib#878
Fix semaphore availability checker to avoid spawning resource trackers on module import. joblib/joblib#893
Fix the oversubscription protection to only protect against nested Parallel calls. This allows joblib to be run in background threads. joblib/joblib#934
Fix ValueError (negative dimensions) when pickling large numpy arrays on Windows. joblib/joblib#920
Upgrade to loky 2.6.0, which adds support for setting environment variables in child processes before loading any module. joblib/joblib#940
Fix the oversubscription protection for native libraries using threadpools (OpenBLAS, MKL, Blis and OpenMP runtimes). The maximal number of threads can now be set in children using the inner_max_num_threads in parallel_backend. It defaults to cpu_count() // n_jobs.
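For readers wondering what the cpu_count() // n_jobs default in the last entry means in practice, here is a small arithmetic sketch. It does not call joblib: default_inner_threads is a hypothetical helper written for illustration, and the floor of one thread is an assumption that the release note does not state.

```python
import os


def default_inner_threads(n_jobs, n_cpus=None):
    """Hypothetical helper: thread budget per child process.

    Mirrors the `cpu_count() // n_jobs` rule quoted in the release note.
    The lower bound of 1 is an assumption added for this sketch.
    """
    if n_cpus is None:
        # os.cpu_count() can return None on some platforms; fall back to 1.
        n_cpus = os.cpu_count() or 1
    return max(n_cpus // n_jobs, 1)


# With 8 CPUs and 4 worker processes, each child would get 2 threads,
# so the total (4 * 2 = 8) does not exceed the number of CPUs.
threads_per_child = default_inner_threads(n_jobs=4, n_cpus=8)
total_threads = 4 * threads_per_child
```

The point of the integer division is that workers times inner threads stays at or below the CPU count, which is what prevents oversubscription when each worker also runs a threaded BLAS or OpenMP routine.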
Builds on top of #910
Fixes #852
(I'm still not 100% sure of what's going on here.) This PR ensures the synchronous execution of distributed scatter operations by not running joblib's callbacks directly in the distributed client event loop. In this situation, the operations can safely be made blocking.
I tried running the sklearn example in #852 and it runs fine. Hopefully the results are correct too.
@ogrisel
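To make the idea in the description more concrete, here is a minimal sketch of the general pattern, not the actual joblib or distributed code. It uses the standard library's asyncio as a stand-in for the client event loop, and scatter, blocking_scatter and run_demo are hypothetical names invented for this example. A callback that blocks on an asynchronous operation would deadlock if it ran inside the loop thread itself, because the loop could never make progress on the operation. Run from another thread, the same blocking wait is safe.

```python
import asyncio
import threading


async def scatter(data):
    # Hypothetical stand-in for an asynchronous scatter operation.
    await asyncio.sleep(0.01)
    return f"scattered:{len(data)}"


def blocking_scatter(loop, data):
    # Safe only when called from a thread other than the one running `loop`:
    # the coroutine is scheduled on the loop and this thread waits for it.
    future = asyncio.run_coroutine_threadsafe(scatter(data), loop)
    return future.result(timeout=5)


def run_demo():
    # The event loop runs in its own background thread.
    loop = asyncio.new_event_loop()
    loop_thread = threading.Thread(target=loop.run_forever, daemon=True)
    loop_thread.start()
    results = []

    def callback():
        # Runs in a worker thread, not in the event loop thread,
        # so waiting on the scatter result cannot stall the loop.
        results.append(blocking_scatter(loop, [0] * 100))

    worker = threading.Thread(target=callback)
    worker.start()
    worker.join(timeout=10)

    # Stop the loop from outside its thread, then clean up.
    loop.call_soon_threadsafe(loop.stop)
    loop_thread.join(timeout=5)
    loop.close()
    return results


demo_results = run_demo()
```

If callback were instead scheduled with loop.call_soon_threadsafe(callback), the call to future.result() would block the loop thread and the scatter coroutine would never run, which is the kind of hang reported in #852.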