[MRG] Ensure that ROC curve starts at (0, 0) #10093

Merged
merged 4 commits into from Nov 10, 2017

3 participants
@qinhanmin2014
Member

qinhanmin2014 commented Nov 9, 2017

Reference Issues/PRs

Fixes #9790
See also #9850

What does this implement/fix? Explain your changes.

Currently, when the first point of the ROC curve lies on the y-axis, we don't add the point (0, 0), which is inconsistent with the documentation, some papers, and some R packages.
Reference:
(1)scikit-learn doc
thresholds : array, shape = [n_thresholds]
Decreasing thresholds on the decision function used to compute fpr and tpr. thresholds[0] represents no instances being predicted and is arbitrarily set to max(y_score) + 1.
(2)@jnothman's comment
our concern should be that in all cases, the curve includes a point that represents predicting nothing in the positive class, and that every further point represents predicting more than nothing, for every threshold at which this changes the fpr or tpr, until all are predicted.
(3)An introduction to ROC analysis (cited more than 7000 times)
(4)R package ROCR

library(ROCR)
pred <- prediction(c(0.1, 0.4, 0.35, 0.8), c(0, 0, 1, 1))
perf <- performance(pred,"tpr","fpr")
plot(perf)
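
For comparison, here is a minimal pure-Python sketch of the behaviour this PR aims for, using the same scores and labels as the ROCR snippet above. The helper `roc_points` is hypothetical and written only for illustration; it is not scikit-learn's implementation. It assumes binary labels in {0, 1} and follows the documented convention of using `max(y_score) + 1` as the arbitrary first threshold.

```python
def roc_points(y_true, y_score):
    """Return (fpr, tpr, thresholds) lists for binary labels in {0, 1}.

    Illustrative helper only, not scikit-learn's implementation.
    """
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    # Visit samples in order of decreasing score.
    order = sorted(range(len(y_score)), key=lambda i: -y_score[i])

    # Start at (0, 0): nothing is predicted positive when the threshold
    # is above every score (arbitrarily max(y_score) + 1).
    fpr, tpr, thresholds = [0.0], [0.0], [max(y_score) + 1]
    tp = fp = 0
    for rank, i in enumerate(order):
        if y_true[i] == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample of a tied score group.
        is_last = rank == len(order) - 1
        if is_last or y_score[order[rank + 1]] != y_score[i]:
            fpr.append(fp / n_neg)
            tpr.append(tp / n_pos)
            thresholds.append(y_score[i])
    return fpr, tpr, thresholds


fpr, tpr, thresholds = roc_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(list(zip(fpr, tpr)))
```

With this data, the first point produced by an actual score threshold is (0, 0.5), which lies on the y-axis; the explicit (0, 0) entry is what keeps the curve anchored at the origin.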

Any other comments?

Member

jnothman commented Nov 9, 2017

Looks good at a glance. Please add to what's new. Can this change affect the auc? If so, document carefully.

Also, perhaps add that reference on roc analysis to the docs

qinhanmin2014 added some commits Nov 9, 2017

Member

qinhanmin2014 commented Nov 9, 2017

@jnothman Thanks a lot for the instant review :)
I have updated the doc and what's new accordingly. Since the fix only adds a vertical line segment (overlapping the y-axis) at the beginning of the curve, I believe it will not affect roc_auc_score.
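
The reasoning can be checked with a small sketch, assuming the area is computed with the trapezoidal rule. The helper `trapezoid_area` and the sample curve are hypothetical, chosen only so that the first point sits on the y-axis.

```python
def trapezoid_area(x, y):
    """Area under the piecewise-linear curve through (x[i], y[i])."""
    return sum(
        (x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2
        for i in range(len(x) - 1)
    )


# Hypothetical curve whose first point lies on the y-axis.
fpr = [0.0, 0.5, 0.5, 1.0]
tpr = [0.5, 0.5, 1.0, 1.0]

area_without_origin = trapezoid_area(fpr, tpr)
# Prepending (0, 0) adds a segment of zero width, so it adds zero area.
area_with_origin = trapezoid_area([0.0] + fpr, [0.0] + tpr)
print(area_without_origin, area_with_origin)
```

The extra segment runs from (0, 0) to (0, 0.5), so its width along the fpr axis is zero and both areas come out the same.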

Contributor

massich commented Nov 9, 2017

LGTM

@@ -160,6 +161,11 @@ Metrics
- Fixed a bug due to floating point error in :func:`metrics.roc_auc_score` with
non-integer sample weights. :issue:`9786` by :user:`Hanmin Qin <qinhanmin2014>`.
- Fixed a bug where :func:`metrics.roc_curve` sometimes starts on y-axis instead
of (0, 0), which is inconsistent with the document and other implementations.

@jnothman

jnothman Nov 9, 2017

Member

Note that this does not affect auc

Member

qinhanmin2014 commented Nov 10, 2017

@jnothman Thanks. Comment addressed.

Member

jnothman commented Nov 10, 2017

I don't think this is controversial: I'll take Joan's +1 on this... Let's merge. Thanks!

@jnothman jnothman merged commit 3e85359 into scikit-learn:master Nov 10, 2017

3 of 4 checks passed

continuous-integration/appveyor/pr Waiting for AppVeyor build to complete
ci/circleci Your tests passed on CircleCI!
continuous-integration/travis-ci/pr The Travis CI build passed
lgtm analysis: Python No alert changes

@qinhanmin2014 qinhanmin2014 deleted the qinhanmin2014:roc_curve branch Nov 14, 2017

maskani-moh added a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017

jwjohnson314 pushed a commit to jwjohnson314/scikit-learn that referenced this pull request Dec 18, 2017
