Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FEA Add staged_predict to HistGradientBoosting #16985

Merged
merged 3 commits into from May 17, 2020

Conversation

haochunchang
Copy link
Contributor

@haochunchang haochunchang commented Apr 21, 2020

Reference Issues/PRs

Try Fixes #16063

What does this implement/fix? Explain your changes.

Add staged_predict by emulating BaseGradientBoosting. Including private functions.

  • Add associated tests
  • Update documentation (Currently, it was copied from BaseGradientBoosting.)
  • Also add to HistGradientBoostingClassifier

Any other comments?

This is my first PR for a feature request.
I would be very appreciated for any advices or collaborations.

@NicolasHug
Copy link
Member

thanks for the PR @haochunchang , please ping me when you are ready to get reviews or if you need any help

@haochunchang
Copy link
Contributor Author

haochunchang commented Apr 25, 2020

I have also checked with an example: test_staged_predict_example.txt (adopted from here).
Although I did not convert it into a jupyter notebook with binder, the plot seems reasonable.

test

In short, I moved part of the original _raw_predict to _predict_stages.
And staged_predict calls _staged_raw_predict that uses _predict_stages by supplying part of predictors_.

I wasn't sure whether the change is what staged_predict want, but it seems it can incrementally predict the target.

I would be very appreciated for @NicolasHug or anyone who reviews this code and gives any directions for checking :)

@haochunchang haochunchang changed the title [WIP]Add staged_predict() function to HistGradientBoostingRegressor [MRG]Add staged_predict() function to HistGradientBoostingRegressor Apr 25, 2020
Copy link
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @haochunchang , it looks in good shape!

I made a few comments, mostly on the style.

Do you also want to add support for the classifier?

@haochunchang haochunchang changed the title [MRG]Add staged_predict() function to HistGradientBoostingRegressor [WIP]Add staged_predict() function to HistGradientBoostingRegressor Apr 25, 2020
@NicolasHug
Copy link
Member

About the tests, should we add separated tests like

hmm, use your best judgement. It'd be fine eitherway. Also note that we don't need the first one if you do the second

@haochunchang haochunchang changed the title [WIP]Add staged_predict() function to HistGradientBoostingRegressor [WIP]Add staged_predict() function to HistGradientBoosting Apr 25, 2020
@haochunchang
Copy link
Contributor Author

haochunchang commented Apr 25, 2020

For simplicity, I added only one test for testing staged_predict.
It tests both Classifier and Regressor by parametrize:

  • Predictions at each stage equals newly-trained predictions (also test max_iter = 1).
  • The number of iterations in staged_predict is equal to gb.n_iter_.

@haochunchang haochunchang changed the title [WIP]Add staged_predict() function to HistGradientBoosting [MRG]Add staged_predict() function to HistGradientBoosting Apr 25, 2020
@haochunchang
Copy link
Contributor Author

I would be very appreciated for @NicolasHug or anyone who reviews this code and gives any directions for checking :)

Copy link
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for your patience @haochunchang , we've been busy with the release

This mostly looks good, I'd just suggest to also add staged_predict_proba and staged_decision_function which should be pretty easy now that you have it all laid out.

@haochunchang haochunchang changed the title [MRG]Add staged_predict() function to HistGradientBoosting [WIP]Add staged_predict() function to HistGradientBoosting May 10, 2020
@haochunchang haochunchang force-pushed the add_staged_predict branch 3 times, most recently from 2bfe890 to 700ba6e Compare May 11, 2020 01:24
@haochunchang haochunchang changed the title [WIP]Add staged_predict() function to HistGradientBoosting [MRG]Add staged_predict() function to HistGradientBoosting May 11, 2020
@haochunchang
Copy link
Contributor Author

Good effort for the release! @NicolasHug
I have also added staged_predict_proba and staged_decision_function and applied your valuable suggestions.
Thanks for your time :)

Copy link
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @haochunchang , almost there

Copy link
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @haochunchang

@NicolasHug
Copy link
Member

maybe @amueller @thomasjpfan can review this one?

@NicolasHug
Copy link
Member

Thanks @haochunchang !

Sorry we forgot to mention earlier: we'll need a what's new entry:

Please add an entry to the change log at doc/whats_new/v024.rst. Like the other entries there, please reference this pull request with :issue: and credit yourself with :user:. Also since it's new you'll need to create a :mod:sklearn.ensemble section.

You can take inspiration from the 0.23 what'snew if you need

Co-authored-by: Nicolas Hug <contact@nicolas-hug.com>
doc/whats_new/v0.24.rst Outdated Show resolved Hide resolved
@NicolasHug NicolasHug changed the title [MRG]Add staged_predict() function to HistGradientBoosting FEA Add staged_predict to HistGradientBoosting May 17, 2020
@NicolasHug NicolasHug merged commit f58c2bc into scikit-learn:master May 17, 2020
@NicolasHug
Copy link
Member

Thanks for the work and for your patience @haochunchang !

Just a side note for you next contributions: you don't need to force-push changes, we're always going to squash and merge anyway. When you force push it makes things a little bit harder for us if we have already checked-out your branch ;)

@haochunchang
Copy link
Contributor Author

Got it!
Thanks @NicolasHug and @thomasjpfan for the review and all the suggestions :)

@haochunchang haochunchang deleted the add_staged_predict branch May 17, 2020 14:56
viclafargue pushed a commit to viclafargue/scikit-learn that referenced this pull request Jun 26, 2020
jayzed82 pushed a commit to jayzed82/scikit-learn that referenced this pull request Oct 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature Request] Add staged_predict() function to HistGradientBoostingRegressor
3 participants