BUG max_depth=1 should be decision stump in HistGradientBoosti… #16182
You should look at the tests which are failing. I assume that there are a lot of places where we check the number of nodes or the depth, and where the count will be off now.
There are a few places where the depth is stored with the original definition in mind. For example, the
This needs a test to assert that the predictors are stumps with max_depth==1.
@thomasjpfan, the test "test_max_depth" in test_grower.py asserts the tree depth to be 1 when max_depth is set to 1. If we need something more, can you please guide me on what needs to be tested?
The nodes stored in the predictor use the previous definition of depth. For this PR, one needs to go through the comments and docstrings in the _hist module to make sure they use the updated definition of depth.
I don't think that's true, the predictor nodes use the same convention as the grower nodes since we just do
However, yes, the docstrings should be updated.
On master, the root node is set to depth 0 and the children are set accordingly, which means the node's depth was set using the tree-depth definition: "The depth of a node is the length of the path to its root". It looks like it was impossible to create stumps before this PR.
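To make the two depth conventions discussed above concrete, here is a minimal sketch. The `Node` class and the `grow` and `tree_depth_in_edges` helpers are hypothetical and are not scikit-learn's grower; the sketch only assumes that the root sits at depth 0 and that a node may be split while its depth is below `max_depth`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical toy node, not scikit-learn's TreeNode."""
    depth: int
    left_child: Optional["Node"] = None
    right_child: Optional["Node"] = None


def grow(max_depth: int) -> Node:
    """Grow a full binary tree, splitting a node only while depth < max_depth."""
    def split(node: Node) -> None:
        if node.depth >= max_depth:
            return  # the node stays a leaf
        node.left_child = Node(depth=node.depth + 1)
        node.right_child = Node(depth=node.depth + 1)
        split(node.left_child)
        split(node.right_child)

    root = Node(depth=0)  # root at depth 0: depth counts edges from the root
    split(root)
    return root


def tree_depth_in_edges(node: Node) -> int:
    """Number of edges on the longest root-to-leaf path."""
    if node.left_child is None:
        return 0
    return 1 + max(tree_depth_in_edges(node.left_child),
                   tree_depth_in_edges(node.right_child))


stump = grow(max_depth=1)
edges = tree_depth_in_edges(stump)  # edge-counting convention
nodes_on_path = edges + 1           # node-counting convention
```

Under the edge-counting convention a stump has depth 1, while counting nodes along the same path gives 2. This is why a check that requires `max_depth` to be strictly greater than 1 rules out stumps when depth is measured in edges.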
Thanks @SanthoshBala18, looks good.
Please also update the docstrings of TreeGrower and HistGradientBoostingClassifier.
Let's also add a non-regression test to make sure that we have a stump when we pass max_depth=1, with e.g.

    def assert_is_stump(grower):
        for leaf in (grower.root.left_child, grower.root.right_child):
            assert leaf.left_child is None
            assert leaf.right_child is None

You can use that in test_max_depth, or create a similar new test.
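As a rough illustration of how the suggested helper behaves, the sketch below exercises it against stand-in objects. `make_node` and the `SimpleNamespace` "growers" are hypothetical placeholders that model only the `root`, `left_child` and `right_child` attributes the helper touches; a real test would build a `TreeGrower` with max_depth=1 instead.

```python
from types import SimpleNamespace


def make_node(left=None, right=None):
    # Hypothetical stand-in for a grower node; only the two child
    # attributes read by assert_is_stump are modelled.
    return SimpleNamespace(left_child=left, right_child=right)


def assert_is_stump(grower):
    # Helper from the review comment: both children of the root must be leaves.
    for leaf in (grower.root.left_child, grower.root.right_child):
        assert leaf.left_child is None
        assert leaf.right_child is None


# A stump: the root has two children and neither of them is split further.
stump = SimpleNamespace(root=make_node(make_node(), make_node()))
assert_is_stump(stump)

# A deeper tree: the left child of the root has children of its own.
deeper = SimpleNamespace(
    root=make_node(make_node(make_node(), make_node()), make_node())
)
try:
    assert_is_stump(deeper)
    rejected = False
except AssertionError:
    rejected = True
```

Note that the helper assumes the root has been split at least once; on an unsplit root, `grower.root.left_child` would be `None` and the attribute access would fail with an `AttributeError` rather than an `AssertionError`.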
@@ -689,8 +689,7 @@ class HistGradientBoostingRegressor(RegressorMixin, BaseHistGradientBoosting):
  than 1. If None, there is no maximum limit.
  max_depth : int or None, optional (default=None)
  The maximum depth of each tree. The depth of a tree is the number of
- nodes to go from the root to the deepest leaf. Must be strictly greater
- than 1. Depth isn't constrained by default.
+ nodes to go from the root to the deepest leaf. Depth isn't constrained by default.
- nodes to go from the root to the deepest leaf. Depth isn't constrained by default.
+ edges to go from the root to the deepest leaf. Depth isn't constrained by default.
Please also update the classifier's docstring
@NicolasHug Thanks for the review. I have updated the docstrings and also added a test accordingly.
Thanks @SanthoshBala18, LGTM
Please add an entry to the change log at doc/whats_new/v0.23.rst. Like the other entries there, please reference this pull request with :issue: and credit yourself with :user:.
thanks @SanthoshBala18
doc/whats_new/v0.23.rst
@@ -90,6 +90,9 @@ Changelog
  :user:`Reshama Shaikh <reshamas>`, and
  :user:`Chiara Marmo <cmarmo>`.

+ - |Fix| Fixed HistGradientBoosting to allow max_depth=1
- - |Fix| Fixed HistGradientBoosting to allow max_depth=1
+ - |Fix| Changed the convention for `max_depth` parameter of :class:`ensemlble.HistGradientBoostingClassifier` and :class:`ensemlble.HistGradientBoostingRegressor`. The depth now corresponds to the number of edges to go from the root to the deepest leaf. Stumps (trees with one split) are now allowed.
(skip lines where appropriate ;) )
Sorry, my bad. I will update the changes properly next time :)
doc/whats_new/v0.23.rst
@@ -90,6 +90,13 @@ Changelog
  :user:`Reshama Shaikh <reshamas>`, and
  :user:`Chiara Marmo <cmarmo>`.

+ - |Fix| Changed the convention for `max_depth` parameter of
+   :class:`ensemlble.HistGradientBoostingClassifier` and
- :class:`ensemlble.HistGradientBoostingClassifier` and
+ :class:`ensemble.HistGradientBoostingClassifier` and
Same below ;)
LGTM
Thank you @SanthoshBala18 for the PR!
Reference Issues/PRs
Fixes #16124
What does this implement/fix? Explain your changes.
Modified the validation in TreeGrower to allow max_depth value as 1.
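For illustration only, the relaxed check might look like the sketch below. `validate_max_depth` is a hypothetical standalone helper and the error message wording is an assumption; the actual validation lives inside TreeGrower and may differ in detail.

```python
def validate_max_depth(max_depth):
    """Hypothetical helper mirroring the relaxed check described above."""
    # None means the depth is unconstrained. Otherwise, with depth counted
    # in edges, 1 is the smallest meaningful value: a single split (a stump).
    if max_depth is not None and max_depth < 1:
        raise ValueError(
            "max_depth={} should not be smaller than 1".format(max_depth)
        )
    return max_depth


accepted = [validate_max_depth(d) for d in (None, 1, 5)]
```

With this version of the check, `max_depth=1` is accepted and only values below 1 are rejected.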