Add multioutput-multiclass support to metrics #3453

Open
arjoly opened this Issue Jul 20, 2014 · 9 comments

Owner

arjoly commented Jul 20, 2014

Some estimators, such as trees, support multioutput-multiclass targets. However, there isn't yet any metric to assess those tasks. Here is a list of metrics that could easily be extended to handle this format:

  1. accuracy_score or subset accuracy and zero_one_loss or subset zero-one loss
  2. hamming_loss

Ideally, there would be one pull request each for points (1) and (2).
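For illustration, a rough sketch of how these two metrics generalize to 2-D multiclass targets, in plain NumPy (hypothetical helper names, not the scikit-learn API):

```python
import numpy as np

def subset_accuracy(y_true, y_pred):
    """Fraction of samples whose entire output row is predicted exactly."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean(np.all(y_true == y_pred, axis=1))

def hamming_loss_mo(y_true, y_pred):
    """Fraction of individual outputs that are mispredicted."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean(y_true != y_pred)

# A multioutput-multiclass example: 3 samples, 2 outputs, 3 classes.
y_true = np.array([[0, 2], [1, 1], [2, 0]])
y_pred = np.array([[0, 2], [1, 0], [2, 0]])
print(subset_accuracy(y_true, y_pred))  # 2 of 3 rows fully correct
print(hamming_loss_mo(y_true, y_pred))  # 1 of 6 outputs wrong
```

The subset zero-one loss would simply be `1 - subset_accuracy(y_true, y_pred)`.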

@arjoly arjoly added Easy labels Jul 20, 2014

Owner

mblondel commented Jul 22, 2014

```python
>>> type_of_target(np.array([[0, 1], [1, 1]]))
'multilabel-indicator'
```

Are multilabel-indicator and binary-multioutput (= multiclass-multioutput with only 2 classes) always the same? Or are there metrics where we need to differentiate them?
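For context, the distinction hinges on whether the 2-D target contains only the values {0, 1}. A simplified sketch of that check (a toy stand-in, not the real `type_of_target` implementation):

```python
import numpy as np

def simplified_target_type(y):
    """Toy version of the distinction type_of_target draws for 2-D targets."""
    y = np.asarray(y)
    # Only 0/1 values: indistinguishable from a multilabel indicator matrix.
    if set(np.unique(y)) <= {0, 1}:
        return "multilabel-indicator"
    return "multiclass-multioutput"

print(simplified_target_type([[0, 1], [1, 1]]))  # only 0/1 values
print(simplified_target_type([[0, 2], [1, 1]]))  # a third class appears
```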

Owner

jnothman commented Jul 22, 2014

Our conclusion in designing type_of_target was that multilabel and binary-multioutput are identical evaluation problems (even if they may exploit different learning paradigms in some cases).


Owner

jnothman commented Jul 22, 2014

I guess what you're asking, though, is: must a multiclass-multioutput metric necessarily return the same result as its multilabel equivalent when presented with the degenerate binary-multioutput case?

Well, I bloody well hope so.


Owner

arjoly commented Jul 22, 2014

Yes, it should give the same result.

Owner

mblondel commented Jul 22, 2014

Yes, that was my question. Thanks for the clarification.

Owner

arjoly commented Jul 22, 2014

To be more precise, this will be the case for accuracy_score, hamming_loss and zero_one_loss.
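A quick sanity check of that claim on the degenerate binary case, computing each metric by hand in NumPy: the multilabel and binary-multioutput readings of the same array yield identical numbers.

```python
import numpy as np

# Binary 2-D target: readable as either multilabel-indicator
# or binary-multioutput; the metrics below don't care which.
y_true = np.array([[0, 1], [1, 1]])
y_pred = np.array([[0, 1], [1, 0]])

# Subset accuracy / subset zero-one loss: exact row match either way.
subset_acc = np.mean(np.all(y_true == y_pred, axis=1))
zero_one = 1.0 - subset_acc

# Hamming loss: wrong labels (multilabel) == wrong outputs (multioutput).
hamming = np.mean(y_true != y_pred)

print(subset_acc, zero_one, hamming)  # 0.5 0.5 0.25
```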

Contributor

akshayah3 commented Aug 3, 2014

@arjoly I would like to work on this one. Could you point out how I can add those to trees, and where exactly they should go?

Owner

arjoly commented Aug 4, 2014

> Could you point out how I can add those to trees, and where exactly they should go?

This issue is related to metrics.

The enhancement will be coded in sklearn/metrics/classification.py, with specific tests in sklearn/metrics/tests/test_classification.py and general tests for multioutput-multiclass in sklearn/metrics/tests/test_common.py.

mitar commented May 29, 2017

Any chance of getting this in? There is an open pull request for it: #3681.
