[Fix] Fix tests on scikit-learn upgrade #3872

prateekdesai04 · 2024-01-22T13:59:26Z

Description of changes:
Upgrading to scikit-learn 1.4.0 introduced some API changes. This PR addresses them.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

github-actions · 2024-01-22T16:51:20Z

Job PR-3872-af7d6e3 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-3872/af7d6e3/index.html

Innixma · 2024-01-23T02:26:24Z

tabular/src/autogluon/tabular/models/rf/rf_quantile.py

@@ -510,7 +510,7 @@ def fit(self, X, y, sample_weight=None):
                bootstrap_indices = np.arange(len(y))

            est_weights = np.bincount(bootstrap_indices, minlength=len(y))
-            y_train_leaves = est.y_train_leaves_
+            y_train_leaves = est.tree_.apply(X)


This situation is not so simple. While this does indeed appear to fix the problem, the issue goes deeper and should be investigated.

Specifically, according to Python's multiple-inheritance rules (the method resolution order), BaseTreeQuantileRegressor.fit should be called when DecisionTreeQuantileRegressor(BaseTreeQuantileRegressor, DecisionTreeRegressor) calls super().fit.

This is the case in scikit-learn 1.3.2, which is why est.y_train_leaves_ exists: the final line of that .fit call sets it with self.y_train_leaves_ = self.tree_.apply(X)
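To make the expected resolution order concrete, here is a minimal, self-contained sketch. The classes only borrow their names from this thread; they are simplified stand-ins, not the scikit-learn or AutoGluon implementations, and the leaf indices are faked.

```python
class DecisionTreeRegressor:
    """Simplified stand-in, not the scikit-learn class."""

    def fit(self, X, y):
        # Pretend that fitting builds a tree.
        self.tree_ = list(zip(X, y))
        return self


class BaseTreeQuantileRegressor:
    """Simplified stand-in for the quantile base class."""

    def fit(self, X, y):
        # super() continues along the MRO of type(self), so for the subclass
        # below this reaches DecisionTreeRegressor.fit.
        super().fit(X, y)
        # Faked leaf assignment: one leaf index per training row.
        self.y_train_leaves_ = list(range(len(X)))
        return self


class DecisionTreeQuantileRegressor(BaseTreeQuantileRegressor, DecisionTreeRegressor):
    def fit(self, X, y):
        # Resolves to BaseTreeQuantileRegressor.fit, the next class in the MRO.
        return super().fit(X, y)


mro_names = [cls.__name__ for cls in DecisionTreeQuantileRegressor.__mro__]
est = DecisionTreeQuantileRegressor().fit([[0.0], [1.0]], [1.0, 2.0])
```

When fit is called directly on the subclass, both parent fit methods run in MRO order and y_train_leaves_ ends up set, which matches the 1.3.2 behaviour described above.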

However, in scikit-learn 1.4 this mysteriously changes, and BaseTreeQuantileRegressor.fit is never called. You can verify this by adding raise AssertionError at the start of BaseTreeQuantileRegressor.fit. With scikit-learn 1.3.2 the AssertionError is raised; with scikit-learn 1.4 it never is, meaning the method is never entered even though the code is written so that it should be.
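A less destructive variant of the raise AssertionError check is to log calls instead of aborting. The sketch below is an assumption-laden illustration: record_calls, Tree, QuantileTree and both build_via_* callers are hypothetical names, not scikit-learn or AutoGluon code. The second caller, which skips the public fit, is there only to show what the log looks like when an override is bypassed; whether scikit-learn 1.4 does anything similar is not established in this thread.

```python
import functools


def record_calls(cls, method_name, log):
    """Wrap cls.method_name so that every call appends a label to log."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        log.append(f"{cls.__name__}.{method_name}")
        return original(self, *args, **kwargs)

    setattr(cls, method_name, wrapper)


class Tree:
    """Hypothetical stand-in estimator with a public and a private fit."""

    def fit(self, X, y):
        return self._fit(X, y)

    def _fit(self, X, y):
        self.fitted_ = True
        return self


class QuantileTree(Tree):
    def fit(self, X, y):
        super().fit(X, y)
        # Faked leaf assignment: one leaf index per training row.
        self.y_train_leaves_ = list(range(len(X)))
        return self


def build_via_public_fit(est, X, y):
    # Caller that goes through the overridable public method.
    return est.fit(X, y)


def build_via_private_fit(est, X, y):
    # Hypothetical caller that skips the public method entirely.
    return est._fit(X, y)


log = []
record_calls(QuantileTree, "fit", log)

X, y = [[0.0], [1.0]], [1.0, 2.0]
a = build_via_public_fit(QuantileTree(), X, y)
calls_public = list(log)

log.clear()
b = build_via_private_fit(QuantileTree(), X, y)
calls_private = list(log)
```

An empty log after a fit run would confirm the override is not reached, and comparing logs for several wrapped methods could help narrow down which entry point the caller actually uses.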

We can confirm this by removing BaseTreeQuantileRegressor from the class's bases entirely:

class DecisionTreeQuantileRegressor(BaseTreeQuantileRegressor, DecisionTreeRegressor): -> class DecisionTreeQuantileRegressor(DecisionTreeRegressor)

This edit works without error. (Note: It also works for scikit-learn 1.3.2)

We can go even further:

super(RandomForestQuantileRegressor, self).__init__( DecisionTreeQuantileRegressor(), -> super(RandomForestQuantileRegressor, self).__init__( DecisionTreeRegressor(),

This also works with identical results and removes the entire quantile regressor tree logic, although we may want to keep that logic in case we ever want to fit a single tree rather than a forest.

I have been unable to determine why the fit is not called in scikit-learn==1.4. My understanding of Python leads me to believe the code should be entered, and yet it isn't. I also haven't figured out what fundamentally changed in scikit-learn==1.4 to warrant this.

The current solution in this PR is merely a band-aid and likely isn't a proper fix: it recomputes, later on, what should already have been set during est.fit(...).
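For illustration, the band-aid can be written so that it only recomputes the leaves when fit did not set them. This is a sketch, not the change made in this PR: get_train_leaves is a hypothetical helper, and _FakeTree and _FakeEstimator are stand-ins so the example runs without scikit-learn.

```python
class _FakeTree:
    """Hypothetical stand-in for a fitted tree_ object."""

    def apply(self, X):
        # Faked: assign each row to leaf 0 or 1 based on its position.
        return [i % 2 for i in range(len(X))]


class _FakeEstimator:
    """Hypothetical stand-in for a fitted tree estimator."""

    def __init__(self, leaves=None):
        self.tree_ = _FakeTree()
        if leaves is not None:
            # Mimics the case where fit did set the attribute.
            self.y_train_leaves_ = leaves


def get_train_leaves(est, X):
    """Return est.y_train_leaves_ if present, else recompute it from tree_."""
    leaves = getattr(est, "y_train_leaves_", None)
    if leaves is None:
        leaves = est.tree_.apply(X)
    return leaves


X = [[0.0], [1.0], [2.0]]
from_fit = get_train_leaves(_FakeEstimator(leaves=[5, 6, 7]), X)
recomputed = get_train_leaves(_FakeEstimator(), X)
```

This keeps both scikit-learn versions working but still hides the underlying question of why fit is skipped, so it has the same drawback as the one-line change above.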

prateekdesai04 · 2024-01-23

Yes, I observed the same thing while working on the fix.
So for now, should we just cap the scikit-learn version at 1.3.2 to get the CI running until we figure out the cause?
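As a rough sketch of what such a cap means at runtime, the check below compares a version string against an inclusive upper bound of 1.3.2. The helper names are hypothetical, and the parsing is deliberately simplified (it only looks at the leading numeric components and ignores pre-release suffixes); the actual cap in this PR lives in the package's dependency specification, which is not shown in this thread.

```python
import re


def release_tuple(version):
    """Parse the leading numeric components, e.g. '1.4.0' -> (1, 4, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def within_cap(version, max_version="1.3.2"):
    """True if version is at or below max_version (inclusive cap)."""
    return release_tuple(version) <= release_tuple(max_version)


ok_old = within_cap("1.3.2")
ok_new = within_cap("1.4.0")
```

In a real project the same comparison could be applied to sklearn.__version__, though a dependency pin is usually the simpler place to enforce it.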

Innixma

LGTM!

github-actions · 2024-01-23T20:46:32Z

Job PR-3872-851b7e2 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-3872/851b7e2/index.html

Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-154.us-west-2.compute.internal>

fix tests on scikit-learn upgrade

af7d6e3

prateekdesai04 requested review from tonyhoo, shchur and Innixma January 22, 2024 14:12

Innixma reviewed Jan 23, 2024

changing cap for scikit-learn and reverting y_train_leaves calculation

851b7e2

prateekdesai04 requested a review from Innixma January 23, 2024 18:02

Innixma approved these changes Jan 23, 2024

Innixma merged commit 267f738 into autogluon:master Jan 23, 2024
15 checks passed

prateekdesai04 mentioned this pull request Jan 25, 2024

[Tabular] Removing scikit-learn upgrade cap and handling failures in DecisionTreeRegressor #3881

Merged

Innixma added this to the 1.0.1 Release milestone Feb 13, 2024

Innixma modified the milestones: 1.0.1 Release, 1.1 Release Apr 5, 2024

prateekdesai04 self-assigned this Apr 5, 2024

LennartPurucker pushed a commit to LennartPurucker/autogluon that referenced this pull request Jun 1, 2024

[Fix] Fix tests on scikit-learn upgrade (autogluon#3872)

6f63bd2

Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-154.us-west-2.compute.internal>
