Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH Monotonic Contraints for Tree-based models #13649

Merged
merged 203 commits into from
Jul 1, 2023

Conversation

samronsin
Copy link
Contributor

@samronsin samronsin commented Apr 15, 2019

Continuation of PR #7266 addressing issue #6656 and #18982.

TODO:

@NicolasHug
Copy link
Member

Very interested in this, please ping me when you have something!

@samronsin
Copy link
Contributor Author

I added a few tests to check that the implementation actually enforced the monotonicity constraints, but they are failing...
Unless my tests are wrong, this suggests that the current implementation is not correct.

@samronsin
Copy link
Contributor Author

Two issues:

  • montonicity for classification is with respect to class 0 => switch to class 1 ?
  • a small fraction of rows (~5% for 1000 rows, which is hidden in the current tests...) fail to satisfy the monotonicity constraints -- I suspect this could be due to unfortunate float64 to float32 conversions at during the predict call

@NicolasHug
Copy link
Member

@samronsin LMK when you need reviews.

I've implemented monotonic constraints for the hist-GBDTs in #15582

It's still under review so your comments are more than welcome there. I think you'll also find some tests worth using. They helped me a great deal in debugging my code

@NicolasHug
Copy link
Member

NicolasHug commented Jan 29, 2020

montonicity for classification is with respect to class 0 => switch to class 1 ?

Yes. Maybe this be done by just reversing the constraint? Not sure

a small fraction of rows (~5% for 1000 rows, which is hidden in the current tests...) fail to satisfy the monotonicity constraints -- I suspect this could be due to unfortunate float64 to float32 conversions at during the predict call

You don't seem to be constraining the values of the children based on the average value of a pair of siblings. You'll need to introduce lower and upper bounds to the nodes values for monotonic constraints to be properly enforced (see https://github.com/scikit-learn/scikit-learn/pull/15582/files#diff-9bd2ee07fb6817a0aee54f959d32de8aR139).

@samronsin
Copy link
Contributor Author

I take back what I just told you @ogrisel, training with monotonic constraints and missing values fails to produce monotonic trees for classifications (although it seems to work as is for regressions).

Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thank you very much @samronsin!

sklearn/ensemble/_forest.py Outdated Show resolved Hide resolved
sklearn/tree/_criterion.pyx Show resolved Hide resolved
Copy link
Member

@jjerphan jjerphan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thank you for your notable contribution, your persistence, and your patience, @samronsin.

@lorentzenchr: are you interested in having a look at this PR before merging it?

@samronsin
Copy link
Contributor Author

Thank you @jjerphan, @ogrisel and @NicolasHug for your invaluable help on this PR ! Very happy to see this work landing eventually !

@jjerphan
Copy link
Member

At yesterday's monthly meeting, we mentioned that it would be appropriate to test data structure consistency between HGBT's and DecisionTree's when constraints of monotonicity are used.

I think we better do it in a subsequent PR.

I will merge this PR on Friday if no-one objects in the meantime.

Copy link
Member

@lorentzenchr lorentzenchr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor comments.

doc/whats_new/v1.4.rst Outdated Show resolved Hide resolved
sklearn/ensemble/_forest.py Show resolved Hide resolved
sklearn/tree/_classes.py Outdated Show resolved Hide resolved
sklearn/tree/_classes.py Outdated Show resolved Hide resolved
sklearn/tree/_classes.py Outdated Show resolved Hide resolved
sklearn/tree/_classes.py Outdated Show resolved Hide resolved
@jjerphan
Copy link
Member

I would wait for @lorentzenchr's approval before merging.

@lorentzenchr lorentzenchr merged commit b88b539 into scikit-learn:main Jul 1, 2023
28 checks passed
@jjerphan
Copy link
Member

jjerphan commented Jul 1, 2023

🎉

punndcoder28 pushed a commit to punndcoder28/scikit-learn that referenced this pull request Jul 29, 2023
Co-authored-by: Pat O'Reilly <patrick.oreilly256@gmail.com>
Co-authored-by: dsleo <leooleds@gmail.com>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
REDVM pushed a commit to REDVM/scikit-learn that referenced this pull request Nov 16, 2023
Co-authored-by: Pat O'Reilly <patrick.oreilly256@gmail.com>
Co-authored-by: dsleo <leooleds@gmail.com>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
@samronsin samronsin deleted the monotonic-trees branch June 6, 2024 10:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet