Results #1

Open
sh1ng opened this issue Feb 20, 2019 · 4 comments

@sh1ng
Owner

sh1ng commented Feb 20, 2019

Criteo dataset

40M dataset - 5000 trees

python3 src/criteo_speed_test.py xgboost ; python3 src/criteo_speed_test.py lightgbm; python3 src/criteo_speed_test.py arboretum
reading data....
startring benchmark xgboost
2388.4275090694427
roc auc train:0.8894494708973515 cv:0.782368838905777
reading data....
startring benchmark lightgbm
[LightGBM] [Warning] Starting from the 2.1.2 version, default value for the "boost_from_average" parameter in "binary" objective is true.
This may cause significantly different results comparing to the previous versions of LightGBM.
Try to set boost_from_average=false, if your old models produce bad results
[LightGBM] [Info] Number of positive: 1031045, number of negative: 30968955
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 3872
[LightGBM] [Info] Number of data: 32000000, number of used features: 39
[LightGBM] [Info] Using GPU Device: GeForce GTX 1070, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 12
[LightGBM] [Info] 38 dense feature groups (1220.70 MB) transfered to GPU in 0.967707 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.032220 -> initscore=-3.402412
[LightGBM] [Info] Start training from score -3.402412
2522.5609214305878
roc auc train:0.8973754627179997 cv:0.7764168713748687
reading data....
startring benchmark arboretum
feature 0 has been reduced to 15 bits 
feature 1 has been reduced to 13 bits 
feature 2 has been reduced to 10 bits 
feature 3 has been reduced to 15 bits 
feature 4 has been reduced to 12 bits 
feature 5 has been reduced to 10 bits 
feature 6 has been reduced to 9 bits 
feature 7 has been reduced to 14 bits 
feature 8 has been reduced to 10 bits 
feature 9 has been reduced to 4 bits 
feature 10 has been reduced to 8 bits 
feature 11 has been reduced to 19 bits 
feature 12 has been reduced to 11 bits 
max feature size 19 
Total bytes 8513978368 available 2080964608 
Memory usage estimation 220 per record 7040000000 in total 
copied features data 13 from 13 
copied category features 1 from 26 


roc auc train:0.8108035778617945 cv:0.7864887547868098
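
The `pavg` and `initscore` values in the LightGBM log above are consistent with the positive-class rate and its log-odds, computed from the logged positive and negative counts. A minimal sketch of that arithmetic follows; the function name `log_odds` is hypothetical and this is not LightGBM's own code, only a check against the numbers printed in the log.

```python
import math
from typing import Tuple


def log_odds(positives: int, negatives: int) -> Tuple[float, float]:
    """Return (positive rate, log-odds of the positive rate)."""
    total = positives + negatives
    pavg = positives / total
    # Log-odds of the average label; assumed to match what the log calls initscore.
    return pavg, math.log(pavg / (1.0 - pavg))


# Counts taken from the "Number of positive / number of negative" log line.
pavg, initscore = log_odds(1031045, 30968955)
print(f"pavg={pavg:.6f} initscore={initscore:.6f}")
```

The total of the two counts is 32,000,000, which matches the "Number of data" line for this run.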

10M dataset - 5000 trees

reading data....
startring benchmark xgboost
662.0523405075073
roc auc train:0.965875942403954 cv:0.756494298707683
reading data....
startring benchmark lightgbm
[LightGBM] [Warning] Starting from the 2.1.2 version, default value for the "boost_from_average" parameter in "binary" objective is true.
This may cause significantly different results comparing to the previous versions of LightGBM.
Try to set boost_from_average=false, if your old models produce bad results
[LightGBM] [Info] Number of positive: 247552, number of negative: 7752448
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 3844
[LightGBM] [Info] Number of data: 8000000, number of used features: 39
[LightGBM] [Info] Using GPU Device: GeForce GTX 1070, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 12
[LightGBM] [Info] 38 dense feature groups (305.18 MB) transfered to GPU in 0.311981 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.030944 -> initscore=-3.444143
[LightGBM] [Info] Start training from score -3.444143
805.6181969642639
roc auc train:0.9904928384666543 cv:0.7458449429346047
reading data....
startring benchmark arboretum
feature 0 has been reduced to 14 bits 
feature 1 has been reduced to 13 bits 
feature 2 has been reduced to 9 bits 
feature 3 has been reduced to 14 bits 
feature 4 has been reduced to 12 bits 
feature 5 has been reduced to 9 bits 
feature 6 has been reduced to 9 bits 
feature 7 has been reduced to 13 bits 
feature 8 has been reduced to 9 bits 
feature 9 has been reduced to 4 bits 
feature 10 has been reduced to 8 bits 
feature 11 has been reduced to 19 bits 
feature 12 has been reduced to 10 bits 
max feature size 19 
Total bytes 8513978368 available 7124615168 
Memory usage estimation 180 per record 1440000000 in total 
copied features data 13 from 13 
copied category features 26 from 26 
roc auc train:0.8339605258327059 cv:0.7800383186995146
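
The arboretum "Memory usage estimation" lines in both runs are consistent with total = bytes per record × number of training rows. The sketch below checks that arithmetic and compares the estimate with the "available" figure from the same log. It assumes arboretum trains on the same 32,000,000 and 8,000,000 rows that LightGBM reports for the two runs; the helper name `estimate_total_bytes` is hypothetical and not part of arboretum.

```python
def estimate_total_bytes(bytes_per_record: int, rows: int) -> int:
    """Total estimated bytes, assuming a fixed per-record cost."""
    return bytes_per_record * rows


# (label, bytes per record, training rows [assumed], available bytes from the log)
runs = [
    ("40M", 220, 32_000_000, 2_080_964_608),
    ("10M", 180, 8_000_000, 7_124_615_168),
]

for label, per_record, rows, available in runs:
    total = estimate_total_bytes(per_record, rows)
    fits = total <= available
    print(f"{label}: estimated={total} available={available} fits={fits}")
```

In the 40M run the estimate exceeds the available bytes, and the log shows only 1 of 26 category features copied; in the 10M run the estimate fits and all 26 are copied. The log does not state whether the two are connected.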
@sh1ng
Owner Author

sh1ng commented Feb 21, 2019

[619]	train-auc:0.845484	eval-auc:0.774196
456.7466940879822
roc auc train:0.8490025621479161 cv:0.7741655373640315

@sh1ng
Owner Author

sh1ng commented Feb 21, 2019

1k

roc auc train:0.8002206696190001 cv:0.7734866977285518

2k

roc auc train:0.812934567821127 cv:0.7774226899799649

@sh1ng
Owner Author

sh1ng commented Feb 21, 2019

2k - eta=0.2

roc auc train:0.8291160398329509 cv:0.7782857646696295

@sh1ng
Owner Author

sh1ng commented Feb 21, 2019

4k - eta=0.2

roc auc train:0.8517063654176573 cv:0.7775750212924546
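
To compare runs side by side, the `roc auc train:... cv:...` lines can be parsed and the train/cv gap computed as a rough overfitting indicator. This is a minimal sketch: the helper `parse_auc` and the run labels are hypothetical, and only the result lines from the 40M and 10M sections of the first comment are included.

```python
import re
from typing import Optional, Tuple

# Matches the result line printed by the benchmark runs in this thread.
AUC_LINE = re.compile(r"roc auc train:([0-9.]+) cv:([0-9.]+)")


def parse_auc(line: str) -> Optional[Tuple[float, float]]:
    """Return (train_auc, cv_auc) from a result line, or None if it does not match."""
    match = AUC_LINE.search(line)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


results = {
    "40M xgboost": "roc auc train:0.8894494708973515 cv:0.782368838905777",
    "40M lightgbm": "roc auc train:0.8973754627179997 cv:0.7764168713748687",
    "40M arboretum": "roc auc train:0.8108035778617945 cv:0.7864887547868098",
    "10M xgboost": "roc auc train:0.965875942403954 cv:0.756494298707683",
    "10M lightgbm": "roc auc train:0.9904928384666543 cv:0.7458449429346047",
    "10M arboretum": "roc auc train:0.8339605258327059 cv:0.7800383186995146",
}

for name, line in results.items():
    parsed = parse_auc(line)
    if parsed is None:
        continue
    train_auc, cv_auc = parsed
    print(f"{name:14s} train={train_auc:.4f} cv={cv_auc:.4f} gap={train_auc - cv_auc:.4f}")
```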