
[dask] Add support for early stopping in Dask interface #3712

Closed
jameslamb opened this issue Jan 3, 2021 · 15 comments

@jameslamb
Collaborator

Summary

DaskLGBMClassifier and DaskLGBMRegressor in the Python package should support early stopping.

Motivation

Early stopping is generally useful with gradient boosting algorithms, to avoid wasted training iterations or unnecessary growth in the model size once desirable performance has been achieved. This feature is available in the non-Dask interfaces for LightGBM, and should be available with the Dask one.

Description

This should mimic the approach XGBoost took (https://github.com/dmlc/xgboost/blob/516a93d25c3b6899558700430ffc99a29ea21e1a/python-package/xgboost/dask.py#L1386), where eval_set contains Dask collections (Dask Array or Dask DataFrame).
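
A hedged sketch of what the user-facing call might look like, assuming the Dask estimators end up accepting the same `eval_set` / `early_stopping_rounds` arguments as the sklearn interface (those parameter names are carried over from the non-Dask API and are an assumption here, not a confirmed design):

```python
# Illustrative only: assumes DaskLGBMRegressor will accept eval_set /
# early_stopping_rounds like the sklearn interface; at the time of this
# issue the Dask estimators do not support these arguments.
import dask.array as da
from distributed import Client
from lightgbm import DaskLGBMRegressor

client = Client()

X = da.random.random((10_000, 20), chunks=(1_000, 20))
y = da.random.random((10_000,), chunks=(1_000,))
X_valid = da.random.random((2_000, 20), chunks=(1_000, 20))
y_valid = da.random.random((2_000,), chunks=(1_000,))

model = DaskLGBMRegressor(n_estimators=500)
model.fit(
    X,
    y,
    eval_set=[(X_valid, y_valid)],  # Dask collections, per the proposal
    early_stopping_rounds=10,
)
```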

@jameslamb
Collaborator Author

Closing in favor of tracking this in #2302 with the other Dask feature requests. Please leave a comment here if you'd like to work on this.

@ffineis
Contributor

ffineis commented Jan 12, 2021

Hey, I'd like to take this if that's cool - I'm planning to mainly base this on the changes called out in #3515 and try to bring it in line with xgboost.dask.

@jameslamb
Collaborator Author

sure, thank you! I'm really close to having a small reproducible example for the random "cannot bind to port XXXX" issue, will link that here when I've written it up.

@ffineis
Contributor

ffineis commented Jan 12, 2021

> sure, thank you! I'm really close to having a small reproducible example for the random "cannot bind to port XXXX" issue, will link that here when I've written it up.

Omigod lifesaver, thank you!

jameslamb referenced this issue Jan 17, 2021
…twork (fixes #3753) (#3766)

* starting work

* fixed port-binding issue on localhost

* minor cleanup

* updates

* getting closer

* definitely working for LocalCluster

* it works, it works

* docs

* add tests

* removing testing-only files

* linting

* Apply suggestions from code review

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* remove duplicated code

* remove unnecessary listen()

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
@jameslamb
Collaborator Author

@ffineis do you think you'll have time to work on this this week? We're planning to do a 3.2.0 release of LightGBM in the next week or two. I didn't include this in my list of must-have Dask features for the next release (#3872 (comment)), but I'd love to try to get this change in if we can since it can have such a big impact on training runtime.

If you don't have time this week, could I take this back from you and try it out?

Thanks so much for all your help with the Dask module so far!!

@ffineis
Contributor

ffineis commented Feb 6, 2021 via email

@jameslamb
Collaborator Author

oh yeah no problem, thanks!

@jmoralez
Collaborator

jmoralez commented Feb 8, 2021

Hi. I've been playing around with this and have a working (although horrible) implementation. The approach I took was to use the futures of the persisted collections instead of turning the collections into lists of delayeds; this avoids recomputing the training set when it also appears in the eval_set (because both futures share the same key). Would you guys be interested in discussing this approach?
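
A minimal sketch of the key-matching idea described here, not LightGBM's actual implementation; it only shows that persisting a collection yields futures whose keys can identify reuse of the training data:

```python
import dask.array as da
from distributed import Client, futures_of

client = Client()

# persist the training data; each chunk becomes a future on the cluster
X = client.persist(da.random.random((1_000, 10), chunks=(100, 10)))
train_keys = {f.key for f in futures_of(X)}

# an eval_set entry built from the same persisted collection produces
# futures with identical keys, so it can be detected instead of recomputed
eval_keys = {f.key for f in futures_of(X)}
assert eval_keys == train_keys
```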

@jameslamb
Collaborator Author

Interesting! I'll leave it to @ffineis to comment on that. Right now, I think our highest priority is supporting early stopping, and it would be ok if the first implementation of that merged to master computes the training data twice.

@ffineis
Contributor

ffineis commented Feb 8, 2021

Hey @jmoralez, thanks for the ideation! Honestly, don't feel bad - I think any implementation of early stopping will be pretty hairy given that we're using lists of delayed partitions instead of distributed lgbm.Datasets. I'm a fan of how xgboost.dask attempts to accomplish what you've mentioned via id - I'm just going to check whether each X in an (X, y) eval set shares the same id as data. If so, this will instruct _train_part (the local worker training function) to just use the local data for any eval set parts that had originally called for data, so I'm hoping we'll avoid re-computing the entire training set.

This method works when the entirety of an eval X equals data, but I'm guessing your approach goes even further in that it would also cover cases where an eval X is a subset of the partitions of the training data? I'd say, in the spirit of how the PRs have been going (modular, one change at a time) - hold off until a naive early stopping approach gets merged, then open an issue and follow up with an improvement PR - does that work?
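
A hypothetical sketch of the id-based check described above; the helper name and sentinel value are made up for illustration, and _train_part is referenced only as the discussion names it:

```python
def _mark_shared_eval_sets(data, eval_set):
    """Flag eval_set entries whose X is literally the training data,
    so the per-worker training function can reuse its local partitions
    instead of recomputing them."""
    marked = []
    for eval_X, eval_y in eval_set:
        if id(eval_X) == id(data):
            marked.append(("__train__", eval_y))  # hypothetical sentinel
        else:
            marked.append((eval_X, eval_y))
    return marked
```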

@jmoralez
Collaborator

jmoralez commented Feb 8, 2021

Yeah that sounds fair. Is there a plan to create an lgb.DaskDataset?

@ffineis
Contributor

ffineis commented Feb 12, 2021

@jmoralez my guess is yes - I think so; check out #2302? @jameslamb has a running list of Dask features somewhere.

PR for this coming tomorrow night.

@jameslamb
Collaborator Author

yes but I haven't written it up. Will do that right now. I like what xgboost did with DaskDMatrix. https://github.com/dmlc/xgboost/blob/9a0399e8981b2279d921fe2312f7ab1b880fd3c3/python-package/xgboost/dask.py#L185
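
For context, a purely illustrative sketch of the DaskDMatrix-style pattern being referenced; LightGBM has no DaskDataset at this point, and every name below is hypothetical:

```python
class DaskDataset:
    """Hold references to Dask collections, deferring construction of
    per-worker lightgbm.Dataset objects until training time."""

    def __init__(self, data, label=None, weight=None):
        self.data = data      # Dask Array or Dask DataFrame of features
        self.label = label    # Dask collection of targets
        self.weight = weight  # optional sample weights
```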

@jameslamb
Collaborator Author

I've added #3944 to track the idea of a DaskDataset. That's a fairly invasive change, so I don't think it should be done before the 3.2.0 release (#3872).

@StrikerRUS
Collaborator

Closing for now due to the lack of active work on this feature and the absence of any open PRs.
