Opinion on LightGBM scalability #1225

Closed
Laurae2 opened this issue Jan 30, 2018 · 6 comments

Laurae2 (Contributor) commented Jan 30, 2018

@guolinke What's your opinion on LightGBM's current scalability? LightGBM compiled with Visual Studio behaves strangely without CPU affinity: it runs faster than when CPU affinity is set up.

https://medium.com/@Laurae2/getting-the-most-of-xgboost-and-lightgbm-speed-compiler-cpu-pinning-374c38d82b86

I ran some tests using R, and Visual Studio is still the best compiler for LightGBM, as we saw before.

[image: benchmark results]

Note that when the max depth/number of leaves is increased to a large value, training takes forever with MinGW.

Does Visual Studio have optimizations on Windows that make it perform better without CPU pinning/affinity? It seems strange that threads roaming across cores are faster than pinned ones.

[image: benchmark results]
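
For anyone who wants to poke at this, a minimal sketch of the kind of timing run I mean (not my actual benchmark harness; it assumes the LightGBM Python package and synthetic data, and uses the standard OpenMP variables `OMP_PROC_BIND`/`OMP_PLACES`, which govern pinning of LightGBM's OpenMP threads):

```python
# Pinned-vs-roaming timing sketch (hypothetical harness, not the code behind
# the charts above). The OpenMP variables must be set before the OpenMP
# runtime loads, i.e. before importing lightgbm.
import os

os.environ["OMP_PROC_BIND"] = "true"   # "false" for the roaming (unpinned) run
os.environ["OMP_PLACES"] = "cores"

import time
import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(42)
X = rng.standard_normal((200_000, 100))                        # dense toy data
y = (X[:, 0] + 0.5 * rng.standard_normal(200_000) > 0).astype(int)

train_set = lgb.Dataset(X, label=y)
params = {"objective": "binary", "num_leaves": 255, "verbosity": -1}

start = time.perf_counter()
lgb.train(params, train_set, num_boost_round=50)
print(f"50 rounds in {time.perf_counter() - start:.1f}s")
```

Running it twice, once per `OMP_PROC_BIND` value, gives the two timings to compare.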

guolinke (Collaborator) commented:

@Laurae2 It seems you are using a very sparse dataset.
Do you have results on a dense dataset?

Laurae2 (Contributor, Author) commented Jan 30, 2018

@guolinke Do you have any large dense dataset with enough features (something like 1M rows × 1K features, or smaller)?

Higgs has too few features.

guolinke (Collaborator) commented Jan 30, 2018

@Laurae2 I think you can generate some new features in the Higgs dataset, e.g. the sum/difference/product/quotient of pairs of features.
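
Something along these lines (an illustrative sketch only; `pairwise_products` is a made-up helper, and the real Higgs dataset has 28 features and about 11M rows, so the expanded matrix becomes very wide):

```python
# Illustrative sketch: widen a dense matrix with the pairwise products of
# its columns (the "mul" case; sum/sub/div are analogous). With d original
# features this adds d*(d-1)/2 new columns.
from itertools import combinations

import numpy as np

def pairwise_products(X: np.ndarray) -> np.ndarray:
    """Hypothetical helper, not part of LightGBM."""
    products = [X[:, i] * X[:, j] for i, j in combinations(range(X.shape[1]), 2)]
    return np.column_stack([X] + products)

X = np.random.rand(1_000, 28)   # stand-in for the 28 Higgs features
X_wide = pairwise_products(X)
print(X_wide.shape)             # (1000, 406): 28 original + 378 products
```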

Laurae2 (Contributor, Author) commented Feb 3, 2018

I started a run today using the products of all feature pairs, a 33.3 GB matrix. I will come back later with results.

Laurae2 (Contributor, Author) commented Feb 3, 2018

@guolinke The difference is huge with CPU pinning on the 33.3 GB Higgs matrix.

[image: benchmark results]

guolinke (Collaborator) commented Feb 4, 2018

@Laurae2 So for a dense dataset, the pinned solution is better.
I guess the reason is memory bandwidth (and latency): on dense data, LightGBM needs high memory bandwidth, so pinning threads helps, while on sparse data memory bandwidth is not the bottleneck.
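
As a rough back-of-envelope supporting this (assumed numbers, not measurements): LightGBM bins features into at most 255 bins by default, so the binned dense matrix is about one byte per value, and histogram construction has to stream most of it at every tree level.

```python
# Back-of-envelope for the bandwidth argument (an assumed cost model, not a
# measurement): dense histogram construction reads roughly one binned byte
# per (sample, feature) at each level of the tree.
n_samples = 11_000_000   # HIGGS rows
n_features = 406         # after the pairwise-product expansion above
bytes_per_value = 1      # default max_bin=255 fits in a uint8 bin index
levels = 8               # e.g. trees grown to depth 8

gb_per_tree = n_samples * n_features * bytes_per_value * levels / 1e9
print(f"~{gb_per_tree:.0f} GB streamed per tree")   # ~36 GB
```

At tens of GB streamed per tree, sustained memory bandwidth dominates and pinning keeps each thread's data local, whereas sparse data touches far fewer bytes, so bandwidth stops being the bottleneck.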

Laurae2 closed this as completed Feb 6, 2018
lock bot locked as resolved and limited conversation to collaborators Mar 12, 2020