Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Fix] Fix tests on scikit-learn upgrade #3872

Merged
merged 2 commits into from
Jan 23, 2024

Conversation

prateekdesai04
Copy link
Contributor

Description of changes:
Upgrading to scikit-learn 1.4.0 has brought on some changes in the API. PR to address the same.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Copy link

Job PR-3872-af7d6e3 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-3872/af7d6e3/index.html

@@ -510,7 +510,7 @@ def fit(self, X, y, sample_weight=None):
bootstrap_indices = np.arange(len(y))

est_weights = np.bincount(bootstrap_indices, minlength=len(y))
y_train_leaves = est.y_train_leaves_
y_train_leaves = est.tree_.apply(X)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This situation is not so simple. While this does indeed appear to fix the problem, the issue goes deeper and should be investigated.

Specifically, according to multi-inheritance rules in Python, BaseTreeQuantileRegressor.fit should be called when DecisionTreeQuantileRegressor(BaseTreeQuantileRegressor, DecisionTreeRegressor) calls super().fit.

This is the case in scikit-learn 1.3.2, which is why est.y_train_leaves exists, because the final line in the .fit call sets it: self.y_train_leaves_ = self.tree_.apply(X)

However, in scikit-learn 1.4, this mysteriously changes, and BaseTreeQuantileRegressor.fit is never called. You can verify this by adding raise AssertionError to the start of BaseTreeQuantileRegressor.fit. With scikit-learn 1.3.2 it raises the AssertionError, in scikit-learn 1.4 it never errors, meaning the code is never entered, despite the intention of the code to be entered.

We can confirm this by removing entirely the class from the code:

class DecisionTreeQuantileRegressor(BaseTreeQuantileRegressor, DecisionTreeRegressor):

->

class DecisionTreeQuantileRegressor(DecisionTreeRegressor)

This edit works without error. (Note: It also works for scikit-learn 1.3.2)

We can go even further:

        super(RandomForestQuantileRegressor, self).__init__(
            DecisionTreeQuantileRegressor(),

->

        super(RandomForestQuantileRegressor, self).__init__(
            DecisionTreeRegressor(),

This also works with identical results, removing the entire Quantile regressor tree logic, although we may want to keep it in case we wanted to fit a tree rather than a forest.

I have been unable to determine why the fit is not called in scikit-learn==1.4. My understanding of Python leads me to believe the code should be entered, and yet it isn't. I also haven't figured out what fundamentally changed in scikit-learn==1.4 to warrant this.

The current solution in this PR is merely a bandaid and likely isn't a proper solution, as it replicates what should have been set during est.fit(...), but doing so later on.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I observed the same things while working on the fix.
So for now should we just cap the scikit-learn version to 1.3.2 to get the CI running, until we figure the cause for this?

Copy link
Contributor

@Innixma Innixma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@Innixma Innixma merged commit 267f738 into autogluon:master Jan 23, 2024
15 checks passed
Copy link

Job PR-3872-851b7e2 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-3872/851b7e2/index.html

@Innixma Innixma added this to the 1.0.1 Release milestone Feb 13, 2024
@Innixma Innixma modified the milestones: 1.0.1 Release, 1.1 Release Apr 5, 2024
@prateekdesai04 prateekdesai04 self-assigned this Apr 5, 2024
LennartPurucker pushed a commit to LennartPurucker/autogluon that referenced this pull request Jun 1, 2024
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-154.us-west-2.compute.internal>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants