Opinion on LightGBM scalability #1225
@Laurae2 It seems you are using a very sparse dataset.
@guolinke Do you have any large dense dataset with enough features (something like 1M x 1K, or smaller)? Higgs has too few features.
@Laurae2 I think you can generate some new features in the Higgs dataset.
I started one run today using the multiplication (mul) of all features, giving a 33.3 GB matrix. I will come back later with the results.
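For reference, a minimal sketch (not the exact script used in this thread) of how such a dense matrix could be built by taking all pairwise feature products of Higgs; the file name, column layout, and dtype are assumptions:

```python
# Hypothetical sketch: densify Higgs by adding all pairwise feature products.
# Assumption: HIGGS.csv has the label in column 0 and 28 raw features after it.
import itertools

import numpy as np
import pandas as pd

higgs = pd.read_csv("HIGGS.csv", header=None)
label = higgs.iloc[:, 0].to_numpy()
X = higgs.iloc[:, 1:].to_numpy(dtype=np.float32)

# All pairwise products: 28 features -> 28 * 29 / 2 = 406 extra dense columns.
pairs = list(itertools.combinations_with_replacement(range(X.shape[1]), 2))
X_mul = np.empty((X.shape[0], len(pairs)), dtype=np.float32)
for k, (i, j) in enumerate(pairs):
    X_mul[:, k] = X[:, i] * X[:, j]

X_dense = np.hstack([X, X_mul])  # large, fully dense design matrix
```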
@guolinke The difference is huge with CPU pinning on the 33.3 GB Higgs matrix.
@Laurae2 So for dense datasets, the pinned solution is better.
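For anyone reproducing this: a hedged sketch of pinning the training process before LightGBM spawns its OpenMP threads. The use of psutil and the choice of cores are assumptions, not what was actually used in this thread:

```python
# Hypothetical sketch: pin this process (and the OpenMP threads LightGBM will
# spawn) to a fixed set of logical CPUs. Works on Linux and Windows via psutil.
import os

import psutil

n_physical = psutil.cpu_count(logical=False)
proc = psutil.Process(os.getpid())
# Assumes the first n_physical logical CPU ids map to distinct physical cores,
# which is common but not guaranteed on every machine.
proc.cpu_affinity(list(range(n_physical)))

# On Linux with OpenMP builds, an alternative is environment-level pinning:
#   OMP_PROC_BIND=true OMP_PLACES=cores python train.py
```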
@guolinke What's your opinion on the current LightGBM scalability? Visual Studio builds behave strangely without CPU affinity: they run faster than when CPU affinity is set up.
https://medium.com/@Laurae2/getting-the-most-of-xgboost-and-lightgbm-speed-compiler-cpu-pinning-374c38d82b86
I did some tests using R, and Visual Studio is still the best compiler for LightGBM, as we saw before.
Note that when the max depth/leaves is increased to a large number, training takes forever with MinGW builds.
Does Visual Studio have optimizations on Windows that make it faster without CPU pinning/affinity? It seems strange that a roaming thread is faster than a pinned one.
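For context, a sketch of the kind of run being compared across compilers; the parameter values below are placeholders, not the thread's actual benchmark settings:

```python
# Hypothetical benchmark setup: a large num_leaves increases per-iteration
# work, which is where the compiler and CPU-pinning differences show up.
import lightgbm as lgb

train_set = lgb.Dataset(X_dense, label=label)  # from the densification sketch above

params = {
    "objective": "binary",
    "num_leaves": 4095,         # large leaf count, the slow case under MinGW
    "max_depth": -1,            # no depth cap
    "num_threads": n_physical,  # match the pinned core count
}
booster = lgb.train(params, train_set, num_boost_round=50)
```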