some questions #1
Comments
The number of iterations differs depending on the hyperparameters used, so results must be taken with caution.
Using the best iteration.
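(For reference: rather than scanning the full log afterwards, the best iteration can be tracked during training. A minimal sketch with LightGBM's early stopping, assuming `train` and `test` are `lgb.Dataset` objects as in the scripts below; the 50-round patience is illustrative:)

```r
library(lightgbm)

model <- lgb.train(params = list(objective = "binary", metric = "auc"),
                   data = train,
                   nrounds = 800,
                   valids = list(test = test),
                   early_stopping_rounds = 50)  # stop once test AUC stalls for 50 rounds
model$best_iter  # index of the best iteration, no log post-processing needed
```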
I think NAs are the reason for such a difference. I have some preprocessing that sets them to 0 for both xgboost and LightGBM. All values are shifted to be positive, except NAs. Then they are converted to sparse matrices:
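(The preprocessing script itself is not shown here; a minimal sketch of what is described, assuming the Matrix package and an arbitrary positive shift:)

```r
library(Matrix)

# toy data: NAs mixed with values that may be negative
x <- matrix(c(1.5, -2, NA, 0.3, NA, -0.7), nrow = 3)

# shift all observed values to be positive, then overwrite NAs with 0
x <- x - min(x, na.rm = TRUE) + 1    # observed values are now >= 1
x[is.na(x)] <- 0                     # NAs become 0, colliding with true zeros
x_sparse <- Matrix(x, sparse = TRUE) # sparse matrix fed to xgboost / LightGBM
```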
As there are many NAs/0s in both Bosch and Higgs in my case, not handling NAs per feature on both sides hurts the metric. When I presented my results, I expected LightGBM to do better on Higgs (because it is a synthetic dataset), but my preprocessing hurt its performance, while xgboost is able to exploit that preprocessing to get an edge. On Higgs, LightGBM converged too fast and could not use the extra NA information that xgboost can use.
I have some preprocessing for NAs which hurts performance.
As the trees grow deeper, making use of NAs becomes essential. You can see the issue in the convergence table below (image not reproduced here). N.B.: the parameters reported are the indexes you can find under "Hyperparameters used" at https://sites.google.com/view/lauraepp/benchmarks
@Laurae2
@guolinke I will re-run benchmarks later with:
Setup I will use:
I will give up Higgs because you told me in microsoft/LightGBM#512 that LightGBM parallelizes over columns; I got a similar issue with xgboost. All of the runs I will use
@Laurae2 Can you update the accuracy results, since LightGBM is capable of handling missing values now?
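(For context, the missing value handling is controlled by the `use_missing` and `zero_as_missing` parameters; a hedged sketch of toggling them from R, with illustrative values and assuming `train` is an `lgb.Dataset`:)

```r
params <- list(objective = "binary",
               metric = "auc",
               use_missing = TRUE,       # native NA handling (the new behavior)
               zero_as_missing = FALSE)  # if TRUE, zeros and unshown sparse entries count as missing
model <- lgb.train(params = params, data = train, nrounds = 800)
```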
@guolinke Do you know which exact commit you want me to use for the benchmarks? My current xgboost benchmarks will finish next week, and then I will be able to do full runs on LightGBM (ETA 1 week for full runs). Some results below.

Bosch with simple test:
Setup:
Parameters:
LightGBM run:

```r
# SET YOUR WORKING DIRECTORY
library(R.utils)
setwd("D:/Data Science/Bosch_mini")

library(lightgbm)

train <- lgb.Dataset("bosch_train_lgb.data")
test <- lgb.Dataset("bosch_test_lgb.data")
gc(verbose = FALSE)

set.seed(11111)
Laurae::timer_func_print({temp_model <- lgb.train(params = list(num_threads = 4,
                                                                learning_rate = 0.02,
                                                                max_depth = 6,
                                                                num_leaves = 63,
                                                                max_bin = 255,
                                                                min_gain_to_split = 1,
                                                                min_sum_hessian_in_leaf = 1,
                                                                min_data_in_leaf = 1,
                                                                bin_construct_sample_cnt = 1000000L),
                                                  data = train,
                                                  nrounds = 800,
                                                  valids = list(test = test),
                                                  objective = "binary",
                                                  metric = "auc",
                                                  verbose = 2)})

library(data.table)
perf <- as.numeric(rbindlist(temp_model$record_evals$test$auc))
max(perf)
which.max(perf)
```

xgboost run:

```r
# SET YOUR WORKING DIRECTORY
library(R.utils)
setwd("D:/Data Science/Bosch_mini")

library(xgboost)

train <- xgb.DMatrix("bosch_train_xgb.data")
test <- xgb.DMatrix("bosch_test_xgb.data")
gc(verbose = FALSE)

set.seed(11111)
Laurae::timer_func_print({temp_model <- xgb.train(params = list(nthread = 4,
                                                                eta = 0.02,
                                                                max_depth = 6,
                                                                max_leaves = 63,
                                                                max_bin = 255,
                                                                gamma = 1,
                                                                min_child_weight = 1,
                                                                objective = "binary:logistic",
                                                                booster = "gbtree",
                                                                tree_method = "hist",
                                                                grow_policy = "lossguide"),
                                                  data = train,
                                                  watchlist = list(test = test),
                                                  eval_metric = "auc",
                                                  nrounds = 800,
                                                  verbose = 2)})

max(temp_model$evaluation_log$test_auc)
which.max(temp_model$evaluation_log$test_auc)
```

Raw log:

xgboost fast histogram depthwise:

[14:35:40] amalgamation/../src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 24 extra nodes, 0 pruned nodes, max_depth=6
[800] test-auc:0.719032
The function ran in 662934.079 milliseconds.
[1] 662934.1
> max(temp_model$evaluation_log$test_auc)
[1] 0.71948
> which.max(temp_model$evaluation_log$test_auc)
[1] 603

xgboost fast histogram lossguide:

[14:49:44] amalgamation/../src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 24 extra nodes, 0 pruned nodes, max_depth=6
[800] test-auc:0.719032
The function ran in 656224.842 milliseconds.
[1] 656224.8
> max(temp_model$evaluation_log$test_auc)
[1] 0.71948
> which.max(temp_model$evaluation_log$test_auc)
[1] 603

LightGBM master: devtools::install_github("Microsoft/LightGBM/R-package")
[LightGBM] [Info] No further splits with positive gain, best gain: -inf
[LightGBM] [Info] Trained a tree with leaves=19 and max_depth=6
[800]: test's auc:0.716801
The function ran in 720430.291 milliseconds.
[1] 720430.3
> perf <- as.numeric(rbindlist(temp_model$record_evals$test$auc))
> max(perf)
[1] 0.7176366
> which.max(perf)
[1] 427

LightGBM v2.0: devtools::install_github("Microsoft/LightGBM/R-package@v2.0")
[LightGBM] [Info] No further splits with positive gain, best gain: -inf
[LightGBM] [Info] Trained a tree with leaves=18 and max_depth=6
[800]: test's auc:0.714139
The function ran in 673870.612 milliseconds.
[1] 673870.6
> perf <- as.numeric(rbindlist(temp_model$record_evals$test$auc))
> max(perf)
[1] 0.7147095
> which.max(perf)
[1] 763

LightGBM v1.0: devtools::install_github("Microsoft/LightGBM/R-package@v1")
[LightGBM] [Info] No further splits with positive gain, best gain: -inf, leaves: 24
[800]: test's auc:0.716018
The function ran in 642500.495 milliseconds.
[1] 642500.5
> perf <- as.numeric(rbindlist(temp_model$record_evals$test$auc))
> max(perf)
[1] 0.7167827
> which.max(perf)
[1] 716
@Laurae2 Strange, I would expect master to be much faster than v2.0 and v1.
@Laurae2
@guolinke For LightGBM, I created the binned dataset before calling the training function. For xgboost, I can't control when the binned dataset is created (for fast histogram, it is created every time). I will re-run with your suggestion.
@Laurae2 another issue is the
@Laurae2 Did you re-create the lgb.Dataset with the master branch? Its Dataset structure is far different from v1 and v2.0.
@guolinke I recreate the datasets every time I change the xgboost/LightGBM version. LightGBM v2 datasets cause crashes with LightGBM v1. I will update again soon with:
xgboost depthwise vs lossguide AUC strangeness below: https://public.tableau.com/views/gbt_benchmarks/AUC-Data?:showVizHome=no
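(Since the binary dataset format differs across versions, each version gets a freshly built file; a minimal sketch of that regeneration step, with the source file name hypothetical:)

```r
library(lightgbm)
library(data.table)

raw <- as.matrix(fread("bosch_train.csv"))          # hypothetical raw source file
dtrain <- lgb.Dataset(raw[, -1], label = raw[, 1])  # rebuild from raw data, label in column 1
lgb.Dataset.save(dtrain, "bosch_train_lgb.data")    # binary file tied to this LightGBM build
```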
@Laurae2 BTW, can you also try without setting
@guolinke Using $construct() seems to lead to a whole different dataset, interestingly (the number of bins changes, and the performance after one iteration using MSE also changes significantly). I will follow your recommendation when I test LightGBM on the larger tests. For now, this is what I will do for the short test (should be done within the hour) for LightGBM:
By the way, do you know how to override LightGBM's feature selection (bypassing feature removal during training)? I would like to keep the number of features constant when the data is sparse, for a new large test I would like to add (about 10,000 selected features). Or should I let LightGBM choose features? It seems I can reproduce the number of bins / features selected, so I think this is not too much of an issue.
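(For reference, a sketch of forcing the binned dataset to be materialized up front rather than lazily at training time; `lgb.Dataset.construct` is the exported equivalent of the `$construct()` method mentioned above, and the `max_bin` value is illustrative:)

```r
library(lightgbm)

dtrain <- lgb.Dataset("bosch_train_lgb.data",
                      params = list(max_bin = 255))
lgb.Dataset.construct(dtrain)  # bin the features now, before lgb.train is called
```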
@Laurae2 I think the construct may have some bugs... It is hard to override LightGBM's feature selection; there is a lot of related code.
@Laurae2
@Laurae2 I also worry about the CPU usage of the MinGW version.
@guolinke With MinGW + LightGBM I'm at 90% CPU; with MinGW + xgboost, CPU usage is slightly lower, at around 80%.
@Laurae2 OK. I will paste some of my test results tomorrow (about 12 hours from now).
@guolinke
@Laurae2 It is OK to use.
New results, using NEW (custom callback):
OLD (no custom callback):
xgboost depthwise:

[18:30:30] amalgamation/../src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 24 extra nodes, 0 pruned nodes, max_depth=6
[800] test-auc:0.719032
The function ran in 694043.139 milliseconds.
[1] 694043.1
> max(temp_model$evaluation_log$test_auc)
[1] 0.7194797
> which.max(temp_model$evaluation_log$test_auc)
[1] 603

xgboost lossguide:

[18:17:27] amalgamation/../src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 24 extra nodes, 0 pruned nodes, max_depth=6
[800] test-auc:0.719032
The function ran in 696006.274 milliseconds.
[1] 696006.3
> max(temp_model$evaluation_log$test_auc)
[1] 0.7194797
> which.max(temp_model$evaluation_log$test_auc)
[1] 603

LightGBM v1:

[LightGBM] [Info] No further splits with positive gain, best gain: -inf, leaves: 24
[800]: test's auc:0.716018
The function ran in 670116.550 milliseconds.
[1] 670116.5
> perf <- as.numeric(rbindlist(temp_model$record_evals$test$auc))
> max(perf)
[1] 0.7167827
> which.max(perf)
[1] 716

LightGBM v2:

[LightGBM] [Info] No further splits with positive gain, best gain: -inf, leaves: 19
[800]: test's auc:0.71364
The function ran in 675590.982 milliseconds.
[1] 675591
> perf <- as.numeric(rbindlist(temp_model$record_evals$test$auc))
> max(perf)
[1] 0.7146346
> which.max(perf)
[1] 636

LightGBM master:

[LightGBM] [Info] No further splits with positive gain, best gain: -inf
[LightGBM] [Info] Trained a tree with leaves=23 and max_depth=6
[800]: test's auc:0.717726
The function ran in 655582.692 milliseconds.
[1] 655582.7
> perf <- as.numeric(rbindlist(temp_model$record_evals$test$auc))
> max(perf)
[1] 0.7180479
> which.max(perf)
[1] 796
@Laurae2 Why does the best iteration change so much in LightGBM?
@guolinke I think it is because the dataset becomes a lot different when using $construct().
OK. v2.0 result:
master result:
master is far faster than v2.0.
R version:

Script:
v2.0 result:
master result:
@guolinke When my bigger server with the 20 cores becomes available, I will re-test.
@Laurae2
@Laurae2
@guolinke Weird, because I wipe the binary datasets and recreate them manually every time. Note that I'm using a custom R installation with gcc 7.1, unlike the default R with gcc 4.9. I wonder if LightGBM behaves differently on gcc 7.1 vs gcc 4.9.
@Laurae2 Is your result on the master branch much faster than the v2.0 branch now?
@guolinke I'll have access to my laptop in about 11 hours. At that time I will be able to check the speed of v2.0 and master. (I doubt my 20-core server will be available.)
@guolinke I think MinGW is better for a low number of cores, while VS might be better (untested) with more cores. Using 4 threads on an i7-4600U, no CPU throttling. I will test with more cores when my 20-core server is free.
Configuration:
v2.0 CLI (O3)

[LightGBM] [Info] No further splits with positive gain, best gain: -inf
[LightGBM] [Info] Trained a tree with leaves=27 and max_depth=6
[LightGBM] [Info] Iteration:799, valid_1 binary_logloss : 0.0305089
[LightGBM] [Info] 645.962530 seconds elapsed, finished iteration 799
[LightGBM] [Info] No further splits with positive gain, best gain: -inf
[LightGBM] [Info] Trained a tree with leaves=36 and max_depth=6
[LightGBM] [Info] Iteration:800, valid_1 binary_logloss : 0.0305101
[LightGBM] [Info] 646.770656 seconds elapsed, finished iteration 800
[LightGBM] [Info] Finished training

v2.0 R (O2)

[LightGBM] [Info] No further splits with positive gain, best gain: -inf
[LightGBM] [Info] Trained a tree with leaves=13 and max_depth=6
[799]: test's binary_logloss:0.0305097
[LightGBM] [Info] No further splits with positive gain, best gain: -inf
[LightGBM] [Info] Trained a tree with leaves=20 and max_depth=6
[800]: test's binary_logloss:0.0305099
The function ran in 673568.532 milliseconds.
[1] 673568.5

master CLI

[LightGBM] [Info] No further splits with positive gain, best gain: -inf
[LightGBM] [Info] Trained a tree with leaves=13 and max_depth=6
[LightGBM] [Info] Iteration:799, valid_1 binary_logloss : 0.0304543
[LightGBM] [Info] 587.834080 seconds elapsed, finished iteration 799
[LightGBM] [Info] No further splits with positive gain, best gain: -inf
[LightGBM] [Info] Trained a tree with leaves=16 and max_depth=6
[LightGBM] [Info] Iteration:800, valid_1 binary_logloss : 0.0304542
[LightGBM] [Info] 588.445944 seconds elapsed, finished iteration 800
[LightGBM] [Info] Finished training

master R

[LightGBM] [Info] No further splits with positive gain, best gain: -inf
[LightGBM] [Info] Trained a tree with leaves=21 and max_depth=6
[799]: test's binary_logloss:0.0304331
[LightGBM] [Info] No further splits with positive gain, best gain: -inf
[LightGBM] [Info] Trained a tree with leaves=23 and max_depth=6
[800]: test's binary_logloss:0.0304333
The function ran in 607209.395 milliseconds.
[1] 607209.4

master CLI Visual Studio 2017

[LightGBM] [Info] No further splits with positive gain, best gain: -inf
[LightGBM] [Info] Trained a tree with leaves=16 and max_depth=6
[LightGBM] [Info] Iteration:799, valid_1 binary_logloss : 0.030431
[LightGBM] [Info] 615.327239 seconds elapsed, finished iteration 799
[LightGBM] [Info] No further splits with positive gain, best gain: -inf
[LightGBM] [Info] Trained a tree with leaves=11 and max_depth=6
[LightGBM] [Info] Iteration:800, valid_1 binary_logloss : 0.0304305
[LightGBM] [Info] 615.863466 seconds elapsed, finished iteration 800
[LightGBM] [Info] Finished training
2x 10 core CPU (Dual Xeon Ivy Bridge, 3.3/2.7 GHz), 40 threads: master = microsoft/LightGBM@1d5867b
When using many cores, Visual Studio is significantly faster, but MinGW is not good. So far, the master branch is really fast (though I don't see the wide gap you have). Core scaling is much better with the master branch.

v2.0 R (O2), CPU usage approx 25%

[LightGBM] [Info] No further splits with positive gain, best gain: -inf
[LightGBM] [Info] Trained a tree with leaves=13 and max_depth=6
[799]: test's binary_logloss:0.0305097
[LightGBM] [Info] No further splits with positive gain, best gain: -inf
[LightGBM] [Info] Trained a tree with leaves=20 and max_depth=6
[800]: test's binary_logloss:0.0305099
The function ran in 244542.445 milliseconds.
[1] 244542.4

master R (O2), CPU usage approx 55%

[LightGBM] [Info] No further splits with positive gain, best gain: -inf
[LightGBM] [Info] Trained a tree with leaves=14 and max_depth=6
[799]: test's binary_logloss:0.0304216
[LightGBM] [Info] No further splits with positive gain, best gain: -inf
[LightGBM] [Info] Trained a tree with leaves=24 and max_depth=6
[800]: test's binary_logloss:0.0304225
The function ran in 174157.502 milliseconds.
[1] 174157.5

master CLI (O3) with 40 threads, CPU usage approx 45%

[LightGBM] [Info] No further splits with positive gain, best gain: -inf
[LightGBM] [Info] Trained a tree with leaves=13 and max_depth=6
[LightGBM] [Info] Iteration:799, valid_1 binary_logloss : 0.0304543
[LightGBM] [Info] 164.173974 seconds elapsed, finished iteration 799
[LightGBM] [Info] No further splits with positive gain, best gain: -inf
[LightGBM] [Info] Trained a tree with leaves=16 and max_depth=6
[LightGBM] [Info] Iteration:800, valid_1 binary_logloss : 0.0304542
[LightGBM] [Info] 164.336424 seconds elapsed, finished iteration 800
[LightGBM] [Info] Finished training

master CLI Visual Studio 2017 with 40 threads, CPU usage approx 100%

[LightGBM] [Info] No further splits with positive gain, best gain: -inf
[LightGBM] [Info] Trained a tree with leaves=16 and max_depth=6
[LightGBM] [Info] Iteration:799, valid_1 binary_logloss : 0.030431
[LightGBM] [Info] 139.108287 seconds elapsed, finished iteration 799
[LightGBM] [Info] No further splits with positive gain, best gain: -inf
[LightGBM] [Info] Trained a tree with leaves=11 and max_depth=6
[LightGBM] [Info] Iteration:800, valid_1 binary_logloss : 0.0304305
[LightGBM] [Info] 139.214094 seconds elapsed, finished iteration 800
[LightGBM] [Info] Finished training

master CLI (O3) with 20 threads, CPU usage approx 33%

[LightGBM] [Info] No further splits with positive gain, best gain: -inf
[LightGBM] [Info] Trained a tree with leaves=13 and max_depth=6
[LightGBM] [Info] Iteration:799, valid_1 binary_logloss : 0.0304543
[LightGBM] [Info] 224.811707 seconds elapsed, finished iteration 799
[LightGBM] [Info] No further splits with positive gain, best gain: -inf
[LightGBM] [Info] Trained a tree with leaves=16 and max_depth=6
[LightGBM] [Info] Iteration:800, valid_1 binary_logloss : 0.0304542
[LightGBM] [Info] 225.045553 seconds elapsed, finished iteration 800
[LightGBM] [Info] Finished training

master CLI Visual Studio 2017 with 20 threads, CPU usage approx 50%

[LightGBM] [Info] No further splits with positive gain, best gain: -inf
[LightGBM] [Info] Trained a tree with leaves=16 and max_depth=6
[LightGBM] [Info] Iteration:799, valid_1 binary_logloss : 0.030431
[LightGBM] [Info] 162.647329 seconds elapsed, finished iteration 799
[LightGBM] [Info] No further splits with positive gain, best gain: -inf
[LightGBM] [Info] Trained a tree with leaves=11 and max_depth=6
[LightGBM] [Info] Iteration:800, valid_1 binary_logloss : 0.0304305
[LightGBM] [Info] 162.792439 seconds elapsed, finished iteration 800
[LightGBM] [Info] Finished training
@guolinke Can I run the long tests on microsoft/LightGBM@3089f0b with Visual Studio 2017, or is there a specific commit you would like me to benchmark? My full xgboost runs (including the exact method, which takes forever) are ending this week.
@Laurae2
@guolinke I'll use microsoft/LightGBM@a8673bd (latest master branch) then.
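(The same `devtools` pattern used above for the v1 and v2.0 tags also accepts a commit hash, which pins the benchmark build to an exact revision:)

```r
# install the R package from a specific commit instead of a branch or tag
devtools::install_github("Microsoft/LightGBM/R-package@a8673bd")
```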
@guolinke One of my servers has finished running my benchmarks. I'll post here again when I get time to create a dashboard you can explore (probably tomorrow).
@Laurae2 BTW, I find that the multi-threaded speed of LightGBM in a VM (Azure) is about 2x-3x slower than on a "real" machine with the same CPU.
@guolinke Here are the new benchmarks, tested on an i7-7700K and a 20-core Xeon: https://sites.google.com/view/lauraepp/new-benchmarks On VMs, I noticed that if the host machine is not rebooted frequently, CPU performance sinks. I remember a server I did not reboot for 1 year whose performance dropped to 50% of the original (a Cinebench R15 score of 350 instead of 700); a simple reboot brought it back to 100%.
@Laurae2
@guolinke I improved the chart layout if you want to look at the details; I had put too many charts on single pages, so I have now separated them.
I find the accuracy on Bosch has quite a big gap. Do you know why? The two-round split finding for NAs?
BTW, when I run it with the CLI version, the gap does not seem as big.