-
Notifications
You must be signed in to change notification settings - Fork 861
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Fix] Fix tests on scikit-learn upgrade #3872
Conversation
Job PR-3872-af7d6e3 is done. |
@@ -510,7 +510,7 @@ def fit(self, X, y, sample_weight=None): | |||
bootstrap_indices = np.arange(len(y)) | |||
|
|||
est_weights = np.bincount(bootstrap_indices, minlength=len(y)) | |||
y_train_leaves = est.y_train_leaves_ | |||
y_train_leaves = est.tree_.apply(X) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This situation is not so simple. While this does indeed appear to fix the problem, the issue goes deeper and should be investigated.
Specifically, according to multi-inheritance rules in Python, BaseTreeQuantileRegressor.fit
should be called when DecisionTreeQuantileRegressor(BaseTreeQuantileRegressor, DecisionTreeRegressor)
calls super().fit
.
This is the case in scikit-learn 1.3.2, which is why est.y_train_leaves
exists, because the final line in the .fit
call sets it: self.y_train_leaves_ = self.tree_.apply(X)
However, in scikit-learn 1.4, this mysteriously changes, and BaseTreeQuantileRegressor.fit
is never called. You can verify this by adding raise AssertionError
to the start of BaseTreeQuantileRegressor.fit
. With scikit-learn 1.3.2 it raises the AssertionError, in scikit-learn 1.4 it never errors, meaning the code is never entered, despite the intention of the code to be entered.
We can confirm this by removing entirely the class from the code:
class DecisionTreeQuantileRegressor(BaseTreeQuantileRegressor, DecisionTreeRegressor):
->
class DecisionTreeQuantileRegressor(DecisionTreeRegressor)
This edit works without error. (Note: It also works for scikit-learn 1.3.2)
We can go even further:
super(RandomForestQuantileRegressor, self).__init__(
DecisionTreeQuantileRegressor(),
->
super(RandomForestQuantileRegressor, self).__init__(
DecisionTreeRegressor(),
This also works with identical results, removing the entire Quantile regressor tree logic, although we may want to keep it in case we wanted to fit a tree rather than a forest.
I have been unable to determine why the fit
is not called in scikit-learn==1.4
. My understanding of Python leads me to believe the code should be entered, and yet it isn't. I also haven't figured out what fundamentally changed in scikit-learn==1.4
to warrant this.
The current solution in this PR is merely a bandaid and likely isn't a proper solution, as it replicates what should have been set during est.fit(...)
, but doing so later on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I observed the same things while working on the fix.
So for now should we just cap the scikit-learn
version to 1.3.2
to get the CI running, until we figure the cause for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Job PR-3872-851b7e2 is done. |
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-154.us-west-2.compute.internal>
Description of changes:
Upgrading to scikit-learn 1.4.0 has brought on some changes in the API. PR to address the same.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.