DOC: Document threads_per_worker #7181
Conversation
Can one of the admins verify this patch? Admins can comment.

cc @ncclementi
Unit Test Results

See the test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

15 files ±0, 15 suites ±0, 5h 51m 4s ⏱️ (-43m 47s)

For more details on these failures, see this check. Results for commit 43c3b0d. Comparison against base commit 8f25111.
This LGTM. I see a good number of tests failing that are definitely not related to this PR. @jrbourbeau, are these known issues? Can we completely ignore them?
Thanks @larsoner!

> I see a good amount of tests failing that definitely are not related to this PR, @jrbourbeau are these known issues? Can we completely ignore them?

Yeah, the CI failures here are unrelated to the changes in this PR.
    Number of threads per each worker
        The number of threads in each worker's
        :class:`python:ThreadPoolExecutor` where tasks are executed.
        See also :doc:`dask:array-best-practices` for more information.
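The docstring above describes ``threads_per_worker`` as the size of each worker's task-execution thread pool. As a rough sketch of that behavior, using only the standard library rather than dask's actual worker internals, a worker with ``threads_per_worker=2`` executes its tasks in a pool like this:

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative only: a dask worker with threads_per_worker=2 runs its tasks
# in a ThreadPoolExecutor of this size (this is a sketch, not dask's code).
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(lambda x: x * x, range(4)))

print(results)  # [0, 1, 4, 9]
```

Raising ``max_workers`` here corresponds to raising ``threads_per_worker``: more tasks run concurrently within one worker process, sharing its memory.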
This doesn't seem like the right place to point to array best practices docs.
> See also :doc:`dask:array-best-practices` for more information.
I'm also wondering if this is still an issue, since recent versions of distributed should limit numpy's concurrency by default:
distributed/distributed/distributed.yaml (lines 191 to 194 in 5dccad4):

    # Numpy configuration
    OMP_NUM_THREADS: 1
    MKL_NUM_THREADS: 1
    OPENBLAS_NUM_THREADS: 1
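For context on what the config snippet above does: these environment variables only take effect if they are set before numpy (or any other BLAS-backed library) is first imported, which is why distributed applies them in each worker's environment at startup. A minimal sketch of setting the same defaults by hand:

```python
import os

# Sketch: pin common BLAS/OpenMP thread pools to one thread per process.
# These must be set before numpy is first imported to have any effect;
# distributed applies the same defaults in each worker process.
for var in ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS"):
    os.environ[var] = "1"

print({v: os.environ[v] for v in
       ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS")})
```

Without this, each numpy operation may spawn as many BLAS threads as there are cores, which multiplied by ``threads_per_worker`` leads to the thread oversubscription discussed in this thread.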
In dask/dask#9588 you mentioned you're using dask=2022.05.0. I'm curious whether you run into the same thread oversubscription issue with the latest dask=2022.10.0 release.
> This doesn't seem like the right place to point to array best practices docs.
How about ``Notes`` instead? It would be nice to have it in this class docstring somewhere so that people thinking about threads can find complementary information. I think I would have found the link if it had been somewhere on the same page as ``threads_per_worker``...
> I'm also wondering if this is still an issue since recent versions of distributed should limit numpy's concurrency by default
Based on a similar suggestion/question earlier from @ncclementi, I checked and indeed it's fixed for me in 2022.10.0 :)
@@ -87,7 +87,7 @@
 Specifically for connectivity problems (e.g., timeout exceptions in the worker
 logs), you will need to diagnose your networking infrastructure, which is more
 complicated than can be described here. Commonly, it may involve logging into
 the machine running the affected worker
-(although you can :ref:`ipylaunch`).
+(although you can ``ipylaunch``).
Thanks for taking the time to clean this up 👍
Closes dask/dask#9588

- Tests added / passed
- ``pre-commit run --all-files``

I also worked around a couple of other warnings I saw while running ``make html``, but I could remove those if not desired.