
Added pos_label parameter to roc_auc_score function #2616

Open
wants to merge 1 commit into scikit-learn:master from ilblackdragon:roc_auc-add-pos_label

5 participants

@ilblackdragon

To be able to run roc_auc_score on binary targets that aren't {0, 1} or {-1, 1}.
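For illustration, the intended usage would look roughly like this (a sketch based on the docstring example added in this PR; the pos_label keyword only exists with this patch applied):

>>> import numpy as np
>>> from sklearn.metrics import roc_auc_score
>>> y_true = np.array(['No', 'No', 'Yes', 'Yes'])  # binary targets that are not {0, 1} or {-1, 1}
>>> y_scores = np.array([0.1, 0.4, 0.35, 0.8])
>>> roc_auc_score(y_true, y_scores, pos_label='Yes')
0.75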

@coveralls

Coverage Status

Coverage remained the same when pulling 6ffd0be on ilblackdragon:roc_auc-add-pos_label into f642aee on scikit-learn:master.

@jaquesgrobler

+1 for merge :+1:

@larsmans commented on the diff
sklearn/metrics/metrics.py
@@ -365,7 +365,7 @@ def auc_score(y_true, y_score):
return roc_auc_score(y_true, y_score)
-def roc_auc_score(y_true, y_score):
+def roc_auc_score(y_true, y_score, pos_label=None):
"""Compute Area Under the Curve (AUC) from prediction scores
Note: this implementation is restricted to the binary classification task.
@larsmans Owner

Is this still true?

@arjoly Owner
arjoly added a note

Yes, since PR #2460 is still waiting to be merged.

@arjoly How will #2460 handle the binary case? Will it still return one value for the "positive" class, or return the ROC for both classes (i.e. no need for pos_label in that case)?

@arjoly Owner
arjoly added a note

How will #2460 handle the binary case?

As it stands at the moment, I haven't changed the logic around positive label handling.

Will it still return one value for the "positive" class, or return the ROC for both classes (i.e. no need for pos_label in that case)?

It detects whether y_true and y_score are in multilabel-indicator format. In that case, there isn't any ambiguity about the number of classes/labels. The format check can easily be done by checking the number of dimensions of y_true/y_score. Note that it doesn't handle the problematic multiclass task.

Depending on the chosen averaging option, you will get one value for all binary tasks or one for each task.
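For illustration only, the dimension-based format check described above might look something like this (a minimal sketch, not the actual code of #2460; the helper name is made up):

import numpy as np

def _is_multilabel_indicator(y):
    # 2-D arrays of shape (n_samples, n_labels) are treated as
    # multilabel-indicator targets; 1-D arrays are plain binary targets.
    y = np.asarray(y)
    return y.ndim == 2 and y.shape[1] > 1

If both y_true and y_score pass such a check, a per-label AUC can be computed column by column and then combined according to the chosen averaging option.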

@arjoly
Owner

I think that you should have a look at PR #2610 by @jnothman. Should we switch to a labels argument instead of a pos_label one? Note that you should add tests for your new feature. Have a look at sklearn/metrics/tests/test_metrics.py.
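For example, a minimal test along those lines (a sketch only; the test name and exact assertions are placeholders) could be added to sklearn/metrics/tests/test_metrics.py:

import numpy as np
from numpy.testing import assert_almost_equal
from sklearn.metrics import roc_auc_score

def test_roc_auc_score_pos_label():
    # Same data as the docstring example, but with string labels.
    y_true = np.array(['No', 'No', 'Yes', 'Yes'])
    y_scores = np.array([0.1, 0.4, 0.35, 0.8])
    assert_almost_equal(roc_auc_score(y_true, y_scores, pos_label='Yes'), 0.75)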

Commits on Nov 27, 2013
  1. @ilblackdragon

    Added pos_label parameter to roc_auc_score function to be able to run it on binary targets that aren't {0, 1} or {-1, 1}.

    ilblackdragon authored
Showing with 10 additions and 3 deletions.
  1. +10 −3 sklearn/metrics/metrics.py
sklearn/metrics/metrics.py (13 changed lines)
@@ -365,7 +365,7 @@ def auc_score(y_true, y_score):
return roc_auc_score(y_true, y_score)
-def roc_auc_score(y_true, y_score):
+def roc_auc_score(y_true, y_score, pos_label=None):
"""Compute Area Under the Curve (AUC) from prediction scores
Note: this implementation is restricted to the binary classification task.
@@ -374,12 +374,16 @@ def roc_auc_score(y_true, y_score):
----------
y_true : array, shape = [n_samples]
- True binary labels.
+ True binary labels in range {0, 1} or {-1, 1}. If labels are not
+ binary, pos_label should be explicitly given.
y_score : array, shape = [n_samples]
Target scores, can either be probability estimates of the positive
class, confidence values, or binary decisions.
+ pos_label : int
+ Label considered as positive and others are considered negative.
+
Returns
-------
auc : float
@@ -403,11 +407,14 @@ def roc_auc_score(y_true, y_score):
>>> y_scores = np.array([0.1, 0.4, 0.35, 0.8])
>>> roc_auc_score(y_true, y_scores)
0.75
+ >>> y_true = np.array(['No', 'No', 'Yes', 'Yes'])
+ >>> roc_auc_score(y_true, y_scores, pos_label='Yes')
+ 0.75
"""
if len(np.unique(y_true)) != 2:
raise ValueError("AUC is defined for binary classification only")
- fpr, tpr, tresholds = roc_curve(y_true, y_score)
+ fpr, tpr, tresholds = roc_curve(y_true, y_score, pos_label=pos_label)
return auc(fpr, tpr, reorder=True)
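For reference, the patched function is a thin wrapper: it forwards pos_label to roc_curve (which already accepts that keyword) and integrates the resulting curve with auc. A hand-rolled equivalent, under that assumption, would be:

import numpy as np
from sklearn.metrics import roc_curve, auc

y_true = np.array(['No', 'No', 'Yes', 'Yes'])
y_scores = np.array([0.1, 0.4, 0.35, 0.8])

# roc_curve already supports pos_label; the PR simply passes it through.
fpr, tpr, thresholds = roc_curve(y_true, y_scores, pos_label='Yes')
auc(fpr, tpr, reorder=True)  # 0.75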