[Dask][macOS] distributed.nanny - WARNING - Restarting worker and shows no worker #4625
Comments
I just ran the training 100 times in a row. Does the hang happen every time for you? |
For the version |
@orcahmlee are you using your system's |
I followed the instructions to install the build tools (build-from-sources, apple-clang). I installed
brew install cmake
brew install libomp
git clone --recursive https://github.com/microsoft/LightGBM.git
cd LightGBM/python-package
python setup.py install
The information of
$ gcc -v
Configured with: --prefix=/Library/Developer/CommandLineTools/usr --with-gxx-include-dir=/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/4.2.1
Apple clang version 12.0.5 (clang-1205.0.22.9)
Target: x86_64-apple-darwin20.6.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin
$ brew info libomp
libomp: stable 12.0.1 (bottled)
LLVM's OpenMP runtime library
https://openmp.llvm.org/
/usr/local/Cellar/libomp/12.0.1 (9 files, 1.5MB) *
Poured from bottle on 2021-09-17 at 11:43:33
From: https://github.com/Homebrew/homebrew-core/blob/HEAD/Formula/libomp.rb
License: MIT
==> Dependencies
Build: cmake ✘
==> Analytics
install: 72,407 (30 days), 257,039 (90 days), 1,153,144 (365 days)
install-on-request: 9,466 (30 days), 31,639 (90 days), 136,392 (365 days)
build-error: 0 (30 days) |
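For comparing setups across machines, the stable version can be read from the first line of the `brew info libomp` output quoted above. The following is a minimal sketch, assuming the output keeps the `<formula>: stable <version>` shape shown in this comment. The function name `parse_brew_stable_version` is hypothetical and not part of Homebrew or LightGBM.

```python
import re


def parse_brew_stable_version(brew_info_output):
    """Return the stable version from `brew info <formula>` text, or None.

    Assumes the first non-empty line looks like
    "libomp: stable 12.0.1 (bottled)", as in the output quoted above.
    """
    lines = brew_info_output.strip().splitlines()
    if not lines:
        return None
    # Formula name, then ": stable ", then the version token.
    match = re.match(r"^[\w@.+-]+: stable (?P<version>\S+)", lines[0])
    if match is None:
        return None
    return match.group("version")


sample = "libomp: stable 12.0.1 (bottled)\nLLVM's OpenMP runtime library"
print(parse_brew_stable_version(sample))  # prints 12.0.1
```

Other Homebrew releases may format this line differently, in which case the function returns None instead of guessing.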
@orcahmlee I'm pretty sure |
Thanks for this information (#4229); it led me to downgrade to Since I downgraded to BTW, I just saw In short, in this case on my machine:
|
Just to be sure, it doesn't work with |
No, it doesn't work with This time I didn't see the
|
Thanks very much for working with us on this. I think what you found in #4625 (comment) is consistent with the ongoing investigation happening over in #4229 (comment)... using LightGBM with You can subscribe to #4229 to track and contribute to the investigation of this issue. |
Thanks so much. |
It seems like this issue can be closed, since the root cause was the ongoing issues with newer versions of Thanks very much for the thorough reports @orcahmlee ! |
Thanks for your contribution. |
This issue has been automatically locked since there has not been any recent activity since it was closed. |
Description
I recently tried using lightgbm.dask with lightgbm==3.2.1, but I randomly hit the error LightGBMError: Socket recv error, code: 54, as detailed in #4116 (comment).

Thanks to @jameslamb's suggestion, I tried installing lightgbm from source (54facc4) to verify whether I would keep hitting LightGBMError: Socket recv error, code: 54. However, I hit a new situation.

I ran a simple regression example in JupyterLab on macOS with lightgbm==3.2.1 and lightgbm==3.2.1.99 (54facc4). There are no rich logs I can paste, but I will describe what is happening as clearly as possible.

3.2.1
array -> dict -> _train_part
~60 MiB per worker

3.2.1.99 (54facc4)
array -> dict -> _train_part
~60 MiB per worker
deserialized_find_n_ports -> find_n_ports -> array -> dict
no-worker
distributed.nanny - WARNING - Restarting worker and wait forever

Reproducible example
Output from Jupyter
Environment info
LightGBM version or commit hash: 54facc4
Command(s) you used to install LightGBM:
git clone --recursive https://github.com/microsoft/LightGBM.git
cd LightGBM/python-package
python setup.py install
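To make reports like this easier to compare, the installed package versions can be collected alongside the commit hash. The sketch below uses only the standard library. The package list (`lightgbm`, `dask`, `distributed`) is an assumption based on the libraries named in this issue, and the helper names `collect_versions` and `environment_report` are hypothetical.

```python
import platform
import sys
from importlib import metadata


def collect_versions(packages=("lightgbm", "dask", "distributed")):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in this interpreter's environment.
            versions[name] = None
    return versions


def environment_report():
    """Combine interpreter, OS, and package versions into one dict."""
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    report.update(collect_versions())
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

A source build may report a development version string (for example `3.2.1.99` here), so the commit hash is still worth stating separately.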
Additional Comments
None