Update predict() / predict_proba() of RF to match sklearn #3609

hcho3 · 2021-03-12T16:48:10Z

Closes #3347.

Make the predict() and predict_proba() functions of RF match those in the scikit-learn RF.

- Eliminate the parameter output_class. Instead, predict() will always produce class predictions, and predict_proba() will always produce probability predictions. (This applies to binary and multi-class classifiers. Regressors will only have predict().)
- Remove the threshold parameter from predict_proba().
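To make the intended contract concrete, here is a minimal sketch of the two methods after this change. It is not cuML code: `ToyClassifier` and its precomputed `probs` table are hypothetical stand-ins for a fitted forest, used only to show that predict_proba() always returns per-class probabilities and predict() always returns class labels.

```python
from typing import List, Sequence


class ToyClassifier:
    """Hypothetical stand-in for a fitted RF classifier (not cuML code)."""

    def __init__(self, classes: Sequence[int], probs: List[List[float]]):
        self.classes_ = list(classes)
        # Assumption for illustration: per-row class probabilities are
        # precomputed instead of being derived from real trees.
        self._probs = probs

    def predict_proba(self, X) -> List[List[float]]:
        # Always returns one probability per class for each row of X.
        return [list(self._probs[i]) for i in range(len(X))]

    def predict(self, X) -> List[int]:
        # Always returns class labels: the most probable class for each row.
        return [
            self.classes_[max(range(len(p)), key=p.__getitem__)]
            for p in self.predict_proba(X)
        ]


X = [[0.1], [0.9], [0.5]]
clf = ToyClassifier(
    classes=[0, 1, 2],
    probs=[[0.7, 0.2, 0.1], [0.1, 0.3, 0.6], [0.2, 0.5, 0.3]],
)
labels = clf.predict(X)        # class labels only
proba = clf.predict_proba(X)   # probabilities only, no output_class switch
```

With this shape there is no flag that changes the return type of either method, which is the scikit-learn behavior this PR aims to match.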

JohnZed

LGTM. Just one small question.

JohnZed · 2021-03-12T23:33:12Z

python/cuml/dask/ensemble/randomforestclassifier.py

@@ -271,7 +271,7 @@ def fit(self, X, y, convert_dtype=False):
                  convert_dtype=convert_dtype)
        return self

-    def predict(self, X, output_class=True, algo='auto', threshold=0.5,
+    def predict(self, X, algo='auto', threshold=0.5,


should we remove this threshold too?

(it's not in sklearn)

hcho3 · 2021-03-13

This one is fine, since predict() produces class predictions.
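As a rough illustration of why a threshold still makes sense for predict(), the sketch below turns positive-class probabilities into binary labels. The helper name `classify_binary` is hypothetical, and the strict `>` comparison at the boundary is an assumption; the actual boundary handling in cuML is not stated in this thread.

```python
from typing import List, Sequence


def classify_binary(positive_proba: Sequence[float],
                    threshold: float = 0.5) -> List[int]:
    """Map positive-class probabilities to 0/1 labels.

    Assumption: a row is labeled 1 only when its probability is strictly
    greater than the threshold.
    """
    return [1 if p > threshold else 0 for p in positive_proba]


default_labels = classify_binary([0.2, 0.6, 0.8])
strict_labels = classify_binary([0.2, 0.6, 0.8], threshold=0.7)
```

A threshold only matters when probabilities are converted into classes, so it has no role in predict_proba(), which returns the probabilities unchanged.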

JohnZed · 2021-03-15T02:58:52Z

@gpucibot merge

Closes #3682

In #3609, I unintentionally broke the function `score()` in the random forest. This PR restores the functionality. In addition, I added `score()` to one of the unit tests to ensure that `score()` does not break again.

Authors:
- Philip Hyunsu Cho (https://github.com/hcho3)

Approvers:
- John Zedlewski (https://github.com/JohnZed)

URL: #3685
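For context on the `score()` regression described above, a classifier's score is typically the accuracy of its class predictions, so it depends directly on what predict() returns. The sketch below assumes that definition; the `accuracy` helper is hypothetical and is not the cuML implementation.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of rows whose predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot score an empty set of labels")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Labels as predict() would return them after this PR (class labels only).
result = accuracy([0, 1, 1, 0], [0, 1, 0, 0])
```

A check of this kind in a unit test would fail if score() stopped receiving class labels from predict().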

Update predict() / predict_proba() of RF to match sklearn

72e9e0b

hcho3 requested a review from a team as a code owner March 12, 2021 16:48

github-actions bot added the "Cython / Python" label Mar 12, 2021

hcho3 added the "3 - Ready for Review", "breaking", and "bug" labels Mar 12, 2021

Add a small test

740e9fa

hcho3 mentioned this pull request Mar 12, 2021

[BUG] Confusing behavior of predict() / predict_proba() of RF when output_class is specified #3347

Closed

Fix style

fa789b9

JohnZed reviewed Mar 12, 2021

JohnZed approved these changes Mar 15, 2021

rapids-bot bot merged commit f5d86b9 into rapidsai:branch-0.19 Mar 15, 2021

hcho3 deleted the revise_predict_interface branch March 15, 2021 03:00

This was referenced Mar 31, 2021

[BUG] RandomForestClassifier score failing due to output_class handling #3682

Closed

Restore the functionality of RF score() #3685

Merged
