
Benchmarks for model builders from XGBoost and LightGBM models #36


Merged
18 commits merged into IntelPython:master on Oct 9, 2020

Conversation

RukhovichIV

No description provided.

RukhovichIV (Author) commented Oct 2, 2020

@PetrovKP, @Alexsandruss, @ShvetsKS,
Added LightGBM and XGBoost benchmarks as a new "lib".
I tried to use common functions from the other libs (e.g. from ./xgboost/bench.py), but they aren't available from the running scope, so I made a new file (./modelbuilders/bench.py) consisting of copied, slightly shortened versions of the minimum required functions from ./xgboost/bench.py. I only added my own get_accuracy() function, because it's much shorter than what I found there.
I also pushed an MR with new configs to our repo; everything works great together.
I also added the single-precision-histogram and enable-experimental-json-serialization parameters to the XGBoost benchmarks.
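
For context, here is a rough sketch of what a short get_accuracy() helper like the one mentioned above could look like. The task-based metric switch is my assumption for illustration, not necessarily the exact logic added in this PR:

```python
import numpy as np


def get_accuracy(y_true, y_pred, task='classification'):
    # Hypothetical helper: percentage of exact matches for
    # classification, RMSE for regression. The function actually
    # added in the PR may differ.
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    if task == 'classification':
        return 100.0 * float(np.mean(y_true == y_pred))
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
```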

.gitignore Outdated
@@ -11,3 +11,4 @@ __work*
# Datasets
dataset
*.csv
*.npy

Contributor

Add a newline at EOF.

Author

done

@@ -0,0 +1,509 @@
import argparse

Contributor

Copyright header?

Author

added

'lgbm_predict', 'lgbm_to_daal', 'daal_compute'],
times=[t_creat_train, t_creat_test, t_train, t_lgbm_pred, t_trans, t_daal_pred],
accuracy_type=metric_name, accuracies=[0, 0, train_metric, test_metric_xgb, 0, test_metric_daal],
data=[X_train, X_test, X_train, X_test, X_train, X_test])

Contributor

Add a newline at EOF.

Author

added too
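
For readers following along, a minimal sketch of what a print_output() helper with the call signature shown above could look like; the JSON record layout here is an assumption, and the real helper lives in the benchmark's bench.py:

```python
import json


def print_output(library, algorithm, stages, times, accuracy_type,
                 accuracies, data):
    # Sketch: emit one JSON record per benchmark stage, pairing each
    # stage name with its measured time and accuracy. The helper in
    # the PR may report additional fields.
    records = []
    for stage, t, acc, arr in zip(stages, times, accuracies, data):
        records.append({
            'library': library,
            'algorithm': algorithm,
            'stage': stage,
            'time[s]': t,
            accuracy_type: acc,
            'input_shape': list(getattr(arr, 'shape', ())),
        })
    print(json.dumps(records, indent=4))
```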

import json


def columnwise_score(y, yp, score_func):

Contributor

The bench.py file should be similar in all folders (sklearn, daal4py, etc.).

Author

Thank you. Added the same bench.py as in the other folders.
I also added a utils.py file with the functions that I use in both of my benchmarks.
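
As a reference for what a shared bench.py provides, here is a minimal sketch of a columnwise_score() like the one in the snippet above, assuming it averages a metric over output columns (the real implementation may differ):

```python
import numpy as np


def columnwise_score(y, yp, score_func):
    # Sketch: score each output column separately and average the
    # results; 1-D targets are scored directly.
    y = np.asarray(y)
    yp = np.asarray(yp)
    if y.ndim == 1:
        return score_func(y, yp)
    scores = [score_func(y[:, i], yp[:, i]) for i in range(y.shape[1])]
    return float(np.mean(scores))
```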

'lgbm_predict', 'lgbm_to_daal', 'daal_compute'],
times=[t_creat_train, t_creat_test, t_train, t_lgbm_pred, t_trans, t_daal_pred],
accuracy_type=metric_name, accuracies=[0, 0, train_metric, test_metric_xgb, 0, test_metric_daal],
data=[X_train, X_test, X_train, X_test, X_train, X_test])

Contributor

Add newline

Author

done


print_output(library='modelbuilders', algorithm=f'xgboost_{task}_and_modelbuilder',
stages=['xgb_train_dmatrix_create', 'xgb_test_dmatrix_create', 'xgb_training', 'xgb_prediction',
'xgb_to_daal_conv', 'daal_prediction'],

Contributor

Use flake8 to correct formatting:
pip/conda install flake8
flake8 <folder or file names>
You can ignore the 'line too long' warning if the line is complicated.

Author

Done for every file that I added or changed.

Author

I also used autopep8 formatting, so every line is at most 100 characters now.

help='Count DMatrix creation in time measurements')
parser.add_argument('--single-precision-histogram', default=False, action='store_true',
help='Build histograms instead of double precision')
parser.add_argument('--enable-experimental-json-serialization', default=True,

Contributor

default=False is better if this feature affects performance.

Author

It's true by default in XGBoost, so, as we discussed, I decided to leave it as is.
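
Since the flag keeps XGBoost's default of True, it cannot be a plain store_true switch; the value has to be parsed from the command line. A minimal sketch of one way to wire both flags (the str2bool helper and help strings are my assumptions, not the PR's exact code):

```python
import argparse


def str2bool(value):
    # Hypothetical parser for an explicit boolean argument.
    return str(value).lower() in ('true', '1', 'yes')


parser = argparse.ArgumentParser()
parser.add_argument('--single-precision-histogram', default=False,
                    action='store_true',
                    help='Use single precision to build histograms')
parser.add_argument('--enable-experimental-json-serialization',
                    default=True, type=str2bool, metavar='BOOL',
                    help='Use experimental JSON serialization '
                         '(True by default, matching XGBoost)')

# Example: override only the histogram precision flag.
args = parser.parse_args(['--single-precision-histogram'])

# Forward both values into the XGBoost parameter dict.
xgb_params = {
    'single_precision_histogram': args.single_precision_histogram,
    'enable_experimental_json_serialization':
        args.enable_experimental_json_serialization,
}
print(xgb_params)
```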

Alexsandruss merged commit 296a991 into IntelPython:master on Oct 9, 2020.