Update predict() / predict_proba() of RF to match sklearn #3609

Merged (3 commits) on Mar 15, 2021

@hcho3 (Contributor) commented Mar 12, 2021

Closes #3347.

Make the predict() and predict_proba() methods of the RF estimators match those of the scikit-learn RF.

  • Eliminate the output_class parameter. Instead, predict() always returns class predictions and predict_proba() always returns probabilities. (This applies to binary and multi-class classifiers. Regressors have only predict().)
  • Remove the threshold parameter from predict_proba().
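
A minimal sketch of the resulting contract, written in plain Python rather than against cuML itself: predict_proba() always returns per-class probabilities and predict() always returns class labels, with no output_class switch. The class `ToyForestClassifier`, the two toy trees, and the averaging rule are assumptions made for illustration and are not taken from the cuML implementation.

```python
from typing import Callable, List, Sequence

Row = Sequence[float]
Tree = Callable[[Row], List[float]]  # maps one row to per-class probabilities


class ToyForestClassifier:
    """Hypothetical stand-in for a fitted forest (not cuML's implementation)."""

    def __init__(self, trees: Sequence[Tree]):
        self.trees = list(trees)

    def predict_proba(self, X: Sequence[Row]) -> List[List[float]]:
        # Always returns probabilities: the per-class average over all trees.
        out = []
        for row in X:
            per_tree = [tree(row) for tree in self.trees]
            n_classes = len(per_tree[0])
            out.append([sum(p[c] for p in per_tree) / len(per_tree)
                        for c in range(n_classes)])
        return out

    def predict(self, X: Sequence[Row]) -> List[int]:
        # Always returns class labels: the index of the largest probability.
        return [max(range(len(p)), key=p.__getitem__)
                for p in self.predict_proba(X)]


# Two toy "trees" over three classes, each splitting on the first feature.
def tree_a(row: Row) -> List[float]:
    return [0.5, 0.25, 0.25] if row[0] < 1.0 else [0.0, 0.25, 0.75]


def tree_b(row: Row) -> List[float]:
    return [0.5, 0.25, 0.25] if row[0] < 2.0 else [0.0, 1.0, 0.0]


forest = ToyForestClassifier([tree_a, tree_b])
X = [[0.5], [1.5], [2.5]]
proba = forest.predict_proba(X)   # one probability vector per row
labels = forest.predict(X)        # one class label per row
```

Deriving predict() from predict_proba() keeps the two methods consistent by construction, which is the behavior callers coming from scikit-learn would expect.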

hcho3 requested a review from a team as a code owner on March 12, 2021 16:48
github-actions bot added the "Cython / Python" label on Mar 12, 2021
hcho3 added the "3 - Ready for Review", "breaking", and "bug" labels on Mar 12, 2021

@JohnZed (Contributor) left a comment

LGTM. Just one small question.

@@ -271,7 +271,7 @@ def fit(self, X, y, convert_dtype=False):
                      convert_dtype=convert_dtype)
         return self

-    def predict(self, X, output_class=True, algo='auto', threshold=0.5,
+    def predict(self, X, algo='auto', threshold=0.5,

JohnZed (Contributor):

should we remove this threshold too?

(it's not in sklearn)

hcho3 (Contributor, Author):

This one is fine, since predict() produces class predictions.
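
To illustrate why a threshold belongs on the label-producing path only, here is a minimal sketch for the binary case. The helper `labels_from_proba` and the strict `>` comparison are assumptions for illustration; how cuML applies `threshold` internally is not shown in this thread.

```python
from typing import List, Sequence


def labels_from_proba(positive_proba: Sequence[float],
                      threshold: float = 0.5) -> List[int]:
    """Hypothetical helper: turn P(class 1) into 0/1 labels."""
    # Assumption: a row is labeled 1 when its probability exceeds the threshold.
    return [1 if p > threshold else 0 for p in positive_proba]


positive_proba = [0.1, 0.4, 0.6, 0.9]   # what predict_proba() would report
default_labels = labels_from_proba(positive_proba)        # threshold=0.5
strict_labels = labels_from_proba(positive_proba, 0.8)    # stricter cut-off
```

The probabilities themselves do not change with the threshold; only the labels derived from them do, which is why the parameter has no role in predict_proba().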

@JohnZed (Contributor) commented Mar 15, 2021

@gpucibot merge

rapids-bot merged commit f5d86b9 into rapidsai:branch-0.19 on Mar 15, 2021
hcho3 deleted the revise_predict_interface branch on March 15, 2021 03:00
rapids-bot pushed a commit that referenced this pull request on Apr 1, 2021:

Closes #3682

In #3609, I unintentionally broke the function `score()` in the random forest. This PR restores the functionality. In addition, I added `score()` to one of the unit tests to ensure that `score()` does not break again.

Authors:
  - Philip Hyunsu Cho (https://github.com/hcho3)

Approvers:
  - John Zedlewski (https://github.com/JohnZed)

URL: #3685
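
The kind of regression check described in the follow-up commit above, exercising score() in a unit test, might look like the sketch below. It assumes score() follows the scikit-learn classifier convention of returning mean accuracy; `ToyClassifier` and the test function are hypothetical and are not the test added in #3685.

```python
from typing import List, Sequence


class ToyClassifier:
    """Hypothetical model: predicts 1 when the first feature is positive."""

    def predict(self, X: Sequence[Sequence[float]]) -> List[int]:
        return [1 if row[0] > 0 else 0 for row in X]

    def score(self, X: Sequence[Sequence[float]], y: Sequence[int]) -> float:
        # Accuracy: fraction of rows whose predicted label equals the true label.
        preds = self.predict(X)
        return sum(int(p == t) for p, t in zip(preds, y)) / len(y)


def test_score_matches_manual_accuracy() -> None:
    model = ToyClassifier()
    X = [[-1.0], [2.0], [3.0], [-4.0]]
    y = [0, 1, 0, 0]
    # Recompute accuracy by hand from predict() and compare with score().
    manual = sum(int(p == t) for p, t in zip(model.predict(X), y)) / len(y)
    assert model.score(X, y) == manual == 0.75


test_score_matches_manual_accuracy()
```

Because score() is built on predict(), a test like this fails as soon as a change to the predict() signature breaks the call inside score().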
Linked issue: [BUG] Confusing behavior of predict() / predict_proba() of RF when output_class is specified