Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG] Adds multiclass ROC AUC #12789

Merged
merged 123 commits into from Jul 17, 2019
Merged

Conversation

@thomasjpfan
Copy link
Member

thomasjpfan commented Dec 14, 2018

Reference Issues/PRs

Resolves #3298
Continues #7663, #10481, #12311

Adds common test for multiclass label permutations, addressing some of #12309

What does this implement/fix? Explain your changes.

  • The One-vs-One implementation is 100% based on Hand and Till: https://link.springer.com/article/10.1023/A:1010920819831. The weighting by prevalence does not seem to be in literature, but it looks like a simple extension of Hand and Till.

  • The One-vs-Rest implementation is functionally the same as the multi-label case.

  • The plot_roc.py example has been refactored to bring the descriptions closer to the code, allowing a user to read through it without losing too much context.

  • y_score in test_multiclass_sample_weight_invariance was normalized, because roc_auc_score checks for this condition.

Any other comments?

Currently, roc_auc_score does not support sample_weight with multiclass="ovo". This comes from the fact that the binary_metric calls in _average_multiclass_ovo_score become dependent on each other when the masked sample weights are passed to binary_metric.

maskani-moh and others added 30 commits Jan 16, 2018
Copy link
Member

jnothman left a comment

Apart from that API decision, I'm quite happy with where this, especially the example, has come, and would be happy to Approve.

should be either equal to ``None`` or ``1.0`` as AUC ROC partial
computation currently is not supported for multiclass.
multiclass : string, 'ovr' or 'ovo', optional(default='ovr')

This comment has been minimized.

Copy link
@jnothman

jnothman Jun 20, 2019

Member
Suggested change
multiclass : string, 'ovr' or 'ovo', optional(default='ovr')
multiclass : string, 'ovr' or 'ovo', optional (default='ovr')

I do wonder if we should make the user intentionally specify ovo or ovr. Unlike say the average parameter of P/R/F, we don't need the user to tell us that the data is to be treated as multiclass (we can use the number of columns in y_score for that) but I suspect we should be trying to encourage literacy in the idea that there are several ways to extend roc_auc to multiclass.

This comment has been minimized.

Copy link
@thomasjpfan

thomasjpfan Jun 24, 2019

Author Member

I am okay with this. We can set the default to None and raising an error before going down the multi-class code path.

This comment has been minimized.

Copy link
@jnothman

jnothman Jun 25, 2019

Member

We can set the default to None and raising an error before going down the multi-class code path.

Do others have opinions on this? @amueller?

This comment has been minimized.

Copy link
@thomasjpfan

thomasjpfan Jun 25, 2019

Author Member

Since we are moving to strings as keywords more, multiclass would most likely be better as 'error' for raising an exception, or 'warn' if we decide on just warning.

Also to be consistent with the rest of sklearn, we should use multi_class.

This comment has been minimized.

Copy link
@jnothman

jnothman Jun 28, 2019

Member

Let's just do it. We can always change it later to have a default value.

@thomasjpfan

This comment has been minimized.

Copy link
Member Author

thomasjpfan commented Jun 24, 2019

How did we pick the default again? I usually feel OVR is easier to grasp and it would make the default multi-class behavior be consistent with the multi-label behavior.

@amueller The default is ovr in roc_auc_score. This PR defines both ovo and ovr in SCORERS, do you want to remove ovo from SCORERS?

thomasjpfan added 2 commits Jun 24, 2019
thomasjpfan added 2 commits Jul 2, 2019
thomasjpfan added 2 commits Jul 2, 2019
@jnothman

This comment has been minimized.

Copy link
Member

jnothman commented Jul 2, 2019

Change of heart? CI is unhappy.

@thomasjpfan

This comment has been minimized.

Copy link
Member Author

thomasjpfan commented Jul 2, 2019

Change of heart? CI is unhappy.

It should be happy now. Had to add roc_auc_score to METRIC_UNDEFINED_MULTICLASS, since it does not support multiclass by default. (Also added ovr_roc_auc and weighted_ovr_roc_auc to the common test)

@jnothman

This comment has been minimized.

Copy link
Member

jnothman commented Jul 3, 2019

thomasjpfan added 2 commits Jul 3, 2019
@thomasjpfan

This comment has been minimized.

Copy link
Member Author

thomasjpfan commented Jul 4, 2019

test_explained_variance_components_10_20 fail is unrelated.

@amueller

This comment has been minimized.

Copy link
Member

amueller commented Jul 17, 2019

is there an issue for the test failures?

Copy link
Member

amueller left a comment

looks good apart from nitpicks

return _average_binary_score(
_binary_roc_auc_score, y_true, y_score, average,
sample_weight=sample_weight)
else:

This comment has been minimized.

Copy link
@amueller

amueller Jul 17, 2019

Member

when is this else active? can you add a comment maybe (unless I'm just being very slow)

This comment has been minimized.

Copy link
@thomasjpfan

thomasjpfan Jul 17, 2019

Author Member

multilabel-indicator

sample_weight=sample_weight)


def _multiclass_roc_auc_score(binary_metric, y_true, y_score, labels,

This comment has been minimized.

Copy link
@amueller

amueller Jul 17, 2019

Member

isn't binary_metric alwasy _binary_roc_auc_score?

This comment has been minimized.

Copy link
@thomasjpfan

thomasjpfan Jul 17, 2019

Author Member

_binary_roc_auc_score is only defined in the scope of roc_auc_score

labels : array, shape = [n_classes] or None, optional (default=None)
List of labels to index ``y_score`` used for multiclass. If ``None``,
the lexicon order of ``y_true`` is used to index ``y_score``.

This comment has been minimized.

Copy link
@amueller

amueller Jul 17, 2019

Member
Suggested change
the lexicon order of ``y_true`` is used to index ``y_score``.
the lexical order of ``y_true`` is used to index ``y_score``.
thomasjpfan added 2 commits Jul 17, 2019
@amueller amueller merged commit dc9955b into scikit-learn:master Jul 17, 2019
17 checks passed
17 checks passed
LGTM analysis: C/C++ No code changes detected
Details
LGTM analysis: JavaScript No code changes detected
Details
LGTM analysis: Python No new or fixed alerts
Details
ci/circleci: deploy Your tests passed on CircleCI!
Details
ci/circleci: doc Your tests passed on CircleCI!
Details
ci/circleci: doc-min-dependencies Your tests passed on CircleCI!
Details
ci/circleci: lint Your tests passed on CircleCI!
Details
codecov/patch 100% of diff hit (target 96.85%)
Details
codecov/project 96.86% (+<.01%) compared to d1c52f4
Details
scikit-learn.scikit-learn Build #20190717.44 succeeded
Details
scikit-learn.scikit-learn (Linux py35_conda_openblas) Linux py35_conda_openblas succeeded
Details
scikit-learn.scikit-learn (Linux py35_ubuntu_atlas) Linux py35_ubuntu_atlas succeeded
Details
scikit-learn.scikit-learn (Linux pylatest_conda_mkl_pandas) Linux pylatest_conda_mkl_pandas succeeded
Details
scikit-learn.scikit-learn (Linux32 py35_ubuntu_atlas_32bit) Linux32 py35_ubuntu_atlas_32bit succeeded
Details
scikit-learn.scikit-learn (Windows py35_pip_openblas_32bit) Windows py35_pip_openblas_32bit succeeded
Details
scikit-learn.scikit-learn (Windows py37_conda_mkl) Windows py37_conda_mkl succeeded
Details
scikit-learn.scikit-learn (macOS pylatest_conda_mkl) macOS pylatest_conda_mkl succeeded
Details
@GaelVaroquaux

This comment has been minimized.

Copy link
Member

GaelVaroquaux commented Jul 17, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
7 participants
You can’t perform that action at this time.