
Early stopping for XGBoost + Update Readme #63

Merged
merged 74 commits into ray-project:master
Sep 1, 2020

Conversation

inventormc
Collaborator

@inventormc inventormc commented Jul 31, 2020

This PR supports early stopping for XGBoost. We leverage the incremental learning capabilities of XGBoost:

Note that this does not necessarily improve performance; rather, it allows us to break the training process into multiple parts.

from xgboost import XGBClassifier
import sklearn.metrics

# x_tr, y_tr, x_te, y_te are assumed to be pre-split train/test data
clf = XGBClassifier(n_estimators=10, nthread=8)
base_model = None
for i in range(20):
    z = clf.fit(x_tr, y_tr, xgb_model=base_model)  # continue from the previous booster
    y_pr = z.predict(x_te)
    print(sklearn.metrics.mean_squared_error(y_te, y_pr))
    base_model = z.get_booster()  # hand the trained booster to the next fit call

resolves #58

@inventormc inventormc added the enhancement (New feature or request) and wip (Work in progress) labels Jul 31, 2020
@inventormc
Collaborator Author

microsoft/LightGBM#3057: init_model doesn't exist in the latest stable release yet.
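
For reference, a minimal sketch of what the LightGBM path could look like once init_model lands in a stable release (assuming LightGBM >= 3.0; the synthetic data is only for illustration):

import lightgbm as lgb
from sklearn.datasets import make_classification

x_tr, y_tr = make_classification(n_samples=200, random_state=0)

clf = lgb.LGBMClassifier(n_estimators=10)
booster = None
for i in range(20):
    # init_model resumes training from the previous booster,
    # mirroring xgb_model in the XGBoost snippet above
    clf.fit(x_tr, y_tr, init_model=booster)
    booster = clf.booster_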

Comment on lines 97 to 106
if self.is_lgbm:
    self.saved_models[i] = self.estimator[i].fit(
        X_train, y_train, init_model=self.saved_models[i])
elif self.is_xgb:
    self.estimator[i].fit(
        X_train, y_train, xgb_model=self.saved_models[i])
    self.saved_models[i] = self.estimator[i].get_booster()
else:
    self.estimator[i].partial_fit(X_train, y_train,
                                  np.unique(self.y))
Collaborator

What if you just put this in another trainable? Would it make sense there?

Collaborator Author

Why do we need to put it into another trainable?

Collaborator

I guess it's probably going to be unsustainable to have five different special cases for different libraries, but right now it's probably fine.
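
To make the trade-off concrete, here is a hypothetical sketch of the "separate trainable" alternative being discussed: one base class plus a small subclass per library, instead of if/elif branches in a single method (class and attribute names mirror the snippet above and are illustrative only):

import numpy as np

class _Trainable:
    # default path: sklearn-style incremental learning
    def _train_step(self, i, X_train, y_train):
        self.estimator[i].partial_fit(X_train, y_train, np.unique(self.y))

class _XGBoostTrainable(_Trainable):
    # XGBoost path: pass the previous booster via xgb_model
    def _train_step(self, i, X_train, y_train):
        self.estimator[i].fit(X_train, y_train, xgb_model=self.saved_models[i])
        self.saved_models[i] = self.estimator[i].get_booster()

class _LightGBMTrainable(_Trainable):
    # LightGBM path: pass the previous model via init_model
    def _train_step(self, i, X_train, y_train):
        self.saved_models[i] = self.estimator[i].fit(
            X_train, y_train, init_model=self.saved_models[i])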

@richardliaw richardliaw changed the title Early stopping for other estimators Early stopping for XGBoost Aug 31, 2020
Collaborator

@richardliaw richardliaw left a comment

OK this looks good to me. I've added a couple updates to make things clearer.

@@ -112,8 +118,14 @@ def _train(self):
                test,
                train_indices=train)
            if self._can_partial_fit():
                self.estimator_list[i].partial_fit(X_train, y_train,
                                                   np.unique(self.y))
            if is_xgboost_model(self.main_estimator):
Collaborator Author

Is this supposed to be under the case where we can partial_fit? I think xgboost doesn't have that method, so maybe this should be outside.
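
For context, the capability check presumably follows sklearn's convention that incremental estimators expose a partial_fit method; a quick sketch (the actual helper in this PR may differ):

from sklearn.linear_model import SGDClassifier
from xgboost import XGBClassifier

def can_partial_fit(estimator):
    # sklearn's incremental-learning convention: supported estimators
    # expose partial_fit; XGBoost's sklearn wrapper does not
    return callable(getattr(estimator, "partial_fit", None))

print(can_partial_fit(SGDClassifier()))  # True
print(can_partial_fit(XGBClassifier()))  # False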

Collaborator

OK nice, I wrote a couple tests here.

self.estimator = estimator

if not self._can_early_stop() and max_iters is not None:
Collaborator Author

I think there are actually more cases we need to be checking here. Here are all the cases:

  • User wants to early stop, and it can be done
    • need to make sure the user sets max_iters, so raise a warning if this isn't done
  • User wants to early stop, and it cannot be done
    • directly throw an error, regardless of what max_iters is
    • maybe it would be nice to remove the warning that would come up in the if not self._can_early_stop() and max_iters > 1: check, since there will be duplicate errors
  • User does not want to early stop, and it can be done
    • if max_iters isn't set, do nothing
    • if max_iters is set, raise a warning that it is ignored because user didn't enable early stop
  • User does not want to early stop, and it cannot be done
    • if max_iters isn't set, do nothing
    • if max_iters is set, raise a warning that it is ignored because user didn't enable early stop

Right now

Collaborator

@richardliaw richardliaw Sep 1, 2020

It is now reduced to this:

  • User wants to early stop, and it cannot be done
    • directly throw an error.
  • User wants to early stop, and it can be done, and max_iters is not set
    • raise a warning that it should be set.
  • regardless of whether it can be done, if user does not want to early stop and if max_iters is set:
    • raise a warning that it is ignored / set to 1.
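
A sketch of validation logic matching that reduced set of cases (argument names and messages are illustrative, not the PR's actual code):

import warnings

def validate_early_stopping_args(early_stopping, can_early_stop, max_iters):
    if early_stopping:
        if not can_early_stop:
            # wants to early stop, cannot be done: hard error
            raise ValueError("Early stopping is not supported for this estimator.")
        if max_iters is None:
            # can be done, but max_iters was not set: warn that it should be
            warnings.warn("early_stopping is enabled but max_iters is not set.")
        return max_iters
    if max_iters is not None and max_iters > 1:
        # early stopping disabled: max_iters is ignored / forced to 1
        warnings.warn("max_iters is ignored because early stopping is not enabled.")
    return 1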

@richardliaw richardliaw changed the title Early stopping for XGBoost Early stopping for XGBoost + Update Readme Sep 1, 2020
@richardliaw richardliaw merged commit 483b700 into ray-project:master Sep 1, 2020