Add sample weight support to more metrics #3450

Open
arjoly opened this Issue Jul 20, 2014 · 26 comments

Owner

arjoly commented Jul 20, 2014

Most metrics now support sample_weight, except for the following ones:

  • confusion_matrix
  • hamming_loss
  • hinge_loss
  • jaccard_similarity_score
  • log_loss
  • matthews_corrcoef_score
  • median_absolute_error

Note that there is already a general test in sklearn/metrics/tests/test_common.py for sample_weight.

Ideally, it would be one pull request per metric to ease review.
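
For reference, most of the metrics that already support sample_weight follow the same idiom: compute per-sample scores, then take a weighted average. A minimal sketch of that pattern (illustrative only; weighted_accuracy is a made-up helper, not a scikit-learn function):

import numpy as np

def weighted_accuracy(y_true, y_pred, sample_weight=None):
    # Per-sample 0/1 correctness scores, averaged with the sample
    # weights; sample_weight=None reduces to the unweighted metric.
    score = np.asarray(y_true) == np.asarray(y_pred)
    return np.average(score, weights=sample_weight)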

Contributor

jatinshah commented Jul 28, 2014

I want to get involved in contributing to scikit-learn. This seems straightforward, so I would like to take it up. Let me know if this works and whether this is still something we want implemented.

Owner

arjoly commented Jul 28, 2014

You can take any metric and add sample_weight support. Thanks @jatinshah !

Owner

jnothman commented Jul 28, 2014

But for any multilabel metrics, you may be better off waiting for #3395, which I should make an effort to wrap up.


Contributor

jatinshah commented Jul 29, 2014

Thanks, @jnothman. I will keep it in mind.

@arjoly, here is a pull request for jaccard_similarity_score: #3497. Let me know if there are any issues; I will work on a few others as well.

Owner

arjoly commented Jul 30, 2014

Thanks to @jatinshah, jaccard_similarity_score supports sample_weight!

Contributor

jatinshah commented Aug 5, 2014

I am working on adding sample_weight to log_loss. The implementation is straightforward, but tests are failing.

log_loss takes y_true as labels and y_pred as probabilities. However, the tests for sample_weight assume that both y_true and y_pred are labels. Is there a simple way to resolve this issue?
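
One way to thread the weight through is to compute per-sample losses and hand sample_weight to the final average. A sketch for the binary case (weighted_log_loss is a made-up name, not the actual patch):

import numpy as np

def weighted_log_loss(y_true, y_proba, sample_weight=None, eps=1e-15):
    # y_true: 0/1 labels; y_proba: predicted P(y = 1).
    # Clip probabilities to avoid log(0), as log_loss itself does.
    p = np.clip(np.asarray(y_proba, dtype=float), eps, 1 - eps)
    y = np.asarray(y_true, dtype=float)
    per_sample = -(y * np.log(p) + (1 - y) * np.log(1 - p))
    return np.average(per_sample, weights=sample_weight)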

Owner

arjoly commented Aug 5, 2014

I would say that the tests need to be enhanced/modified to take this new case into account.

Contributor

jatinshah commented Aug 5, 2014

OK, modified the tests. Here is the pull request: #3531

Contributor

jatinshah commented Aug 5, 2014

For matthews_corrcoef_score, I need to replace numpy.corrcoef with a weighted correlation function. Since numpy does not have a weighted version, I am planning to place the function in utils/wcorrcoef.py and related unit tests in utils/tests/test_wcorrcoef.py.

Does that work? Is there a better way to organize the code?

import numpy as np

def wcorrcoef(X, Y, w):
    # Weighted Pearson correlation: weighted means, then weighted
    # (co)variances, then the usual ratio.
    mX = np.average(X, weights=w)
    mY = np.average(Y, weights=w)
    covXY = np.average((X - mX) * (Y - mY), weights=w)
    covXX = np.average((X - mX) * (X - mX), weights=w)
    covYY = np.average((Y - mY) * (Y - mY), weights=w)
    return covXY / np.sqrt(covXX * covYY)
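
With uniform weights this should agree with np.corrcoef, which gives a convenient sanity check for the unit tests:

import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0])
Y = np.array([1.5, 1.9, 3.2, 4.1])

# With uniform weights, wcorrcoef should match the off-diagonal
# entry of np.corrcoef up to floating point error.
print(wcorrcoef(X, Y, np.ones_like(X)))
print(np.corrcoef(X, Y)[0, 1])
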
Owner

jnothman commented Aug 5, 2014

IMO keep it locally in metrics, unless you get a patch supporting sample weights accepted to numpy or scipy, in which case it would go in sklearn.utils.fixes instead. Then again, I don't know how complicated the implementation is.


Owner

arjoly commented Aug 6, 2014

Sample weight support was added to log_loss thanks to @jatinshah in #3531.

Contributor

jatinshah commented Aug 6, 2014

I am trying to add sample_weight to matthews_corrcoef_score and I noticed some strange things. roc_auc_score and average_precision_score are not tested in test_common.py for binary inputs: they are added to METRIC_UNDEFINED_MULTICLASS, and these functions are skipped in both the binary and multiclass unit tests.

Second, I am unable to pass the average and sample_weight parameters to either roc_auc_score or average_precision_score.

>>> from sklearn.metrics import roc_auc_score
>>> import numpy as np
>>> y_true = np.array([0, 0, 1, 1])
>>> y_scores = np.array([0.1, 0.4, 0.35, 0.8])
>>> roc_auc_score(y_true, y_scores, average='micro')
TypeError: roc_auc_score() got an unexpected keyword argument 'average'
Owner

arjoly commented Aug 6, 2014

I am trying to add sample_weight to matthews_corrcoef_score and I noticed some strange things. roc_auc_score and average_precision_score are not tested in test_common.py for binary inputs.

Those metrics don't support multi-class, either by definition (samples-averaged metrics) or because support is not implemented.

Owner

arjoly commented Aug 6, 2014

The micro-averaged roc auc doesn't support binary output (only multilabel output), but the binary (traditional) roc auc supports it:

In [2]: from sklearn.metrics import roc_auc_score

In [3]: import numpy as np

In [4]: y_true = np.array([0, 0, 1, 1])

In [5]: y_scores = np.array([0.1, 0.4, 0.35, 0.8])

In [6]: roc_auc_score(y_true, y_scores)
Out[6]: 0.75
Owner

arjoly commented Aug 6, 2014

Second, I am unable to change average and sample_weight parameters for both roc_auc_score and average_precision_score.

In order to use the average argument, you need multilabel data.
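
For example, with multilabel indicator input the average argument applies (the score values below are made up for illustration):

import numpy as np
from sklearn.metrics import roc_auc_score

# Multilabel indicator format: rows are samples, columns are labels.
y_true = np.array([[0, 1], [1, 1], [1, 0], [0, 1]])
y_score = np.array([[0.1, 0.8], [0.7, 0.6], [0.9, 0.2], [0.4, 0.7]])

print(roc_auc_score(y_true, y_score, average='micro'))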

Contributor

SaurabhJha commented Oct 19, 2014

Here is a pull request for median_absolute_error. #3784

kubami commented Nov 30, 2014

I am working on this issue, currently focusing on hamming_loss.
At first I thought it might not make sense for this metric to be weighted. Now I think it would be useful to be able to weight specific transitions in the hamming distance:

If we define the normal hamming distance as

\sum_t \delta(y_t, z_t)

we can add a transition weight w as follows:

\sum_t w(y_t, z_t) \, \delta(y_t, z_t)

It would be nicer to add it to scipy.spatial.distance (http://docs.scipy.org/doc/scipy-0.14.0/reference/spatial.distance.html) as whamming(u, v, w), and then use it in hamming_loss here. What do you think?

See an example of this being used in http://aclweb.org/anthology/N/N13/N13-1102.pdf
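
A rough sketch of what such a whamming could look like (hypothetical; not an existing scipy function):

import numpy as np

def whamming(u, v, weight):
    # weight: a callable mapping a label pair (u_t, v_t) to a cost;
    # weight = lambda a, b: 1 recovers the plain hamming distance.
    u, v = np.asarray(u), np.asarray(v)
    costs = np.array([weight(a, b) for a, b in zip(u, v)])
    return np.sum(costs * (u != v)) / len(u)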

Owner

arjoly commented Dec 2, 2014

If I understand correctly, what you suggest is to weight each dimension (here, each label) differently. This issue is about sample-wise weights.

For instance,

>>> hamming_loss(np.array([[0, 1], [1, 1]]), np.zeros((2, 2)))
0.75

While with sample_weight=[0.5, 1], it should give

>>> hamming_loss(np.array([[0, 1], [1, 1]]), np.zeros((2, 2)), sample_weight=[0.5, 1])
0.625 # (1 / 2 * 0.5 + 2 / 2 * 1) / 2
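
A minimal sketch matching that arithmetic (multilabel case; note the example above normalizes by n_samples, while another convention would divide by the sum of the weights):

import numpy as np

def weighted_hamming_loss(y_true, y_pred, sample_weight=None):
    # Fraction of wrong labels per sample, then a weighted average.
    per_sample = np.mean(np.asarray(y_true) != np.asarray(y_pred), axis=1)
    if sample_weight is None:
        return np.mean(per_sample)
    # Normalizing by n_samples as in the worked example above:
    # (0.5 * 0.5 + 1.0 * 1.0) / 2 = 0.625
    return np.sum(np.asarray(sample_weight) * per_sample) / len(per_sample)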

Bernardo Vecchia Stein commented Dec 24, 2014

I'd like to begin contributing to scikit-learn, and I'm interested in adding sample weights to the confusion matrix. However, I'd like to clarify how sample weights are applied here. Is it done by simply applying the weight to each sample's corresponding cell (sample i with weight wi is a true positive, so I add wi to the TP cell instead of adding 1), or is there anything else that should be done?

Owner

jnothman commented Dec 24, 2014

Yes, I think that's the correct understanding.

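A minimal sketch of that weighting scheme (weighted_confusion_matrix is a hypothetical helper, not the merged scikit-learn implementation):

import numpy as np

def weighted_confusion_matrix(y_true, y_pred, sample_weight=None):
    # Each sample adds its weight (instead of 1) to the
    # (true label, predicted label) cell.
    labels = np.unique(np.concatenate([y_true, y_pred]))
    index = {label: i for i, label in enumerate(labels)}
    if sample_weight is None:
        sample_weight = np.ones(len(y_true))
    cm = np.zeros((len(labels), len(labels)))
    rows = [index[t] for t in y_true]
    cols = [index[p] for p in y_pred]
    np.add.at(cm, (rows, cols), sample_weight)
    return cm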

Bernardo Vecchia Stein commented Dec 25, 2014

Thanks @jnothman! I have finished changing the code; however, the tests in test_common.py are failing because they expect the result from the metric to be a float, not an array (as is the case for the confusion matrix). How should I proceed? Should I write tests exclusively for the confusion matrix, or is there something else we can do?

Owner

jnothman commented Dec 24, 2014

I think the confusion matrix is currently excluded from the other invariance tests. Yes, it might require its own explicit testing, or there may be a cunning way to fix up the existing test.


Member

raghavrv commented Jan 12, 2015

@jatinshah Do you plan to go ahead with adding sample weight support to matthews_corrcoef_score? If not, would you mind if I proceed with your implementation and submit a PR for the same?

Contributor

jatinshah commented Jan 13, 2015

@ragv I don't intend to work on it at the moment. Please go ahead and submit a PR.

Member

raghavrv commented Jan 17, 2015

Thanks! That now completes the todo list (counting the PRs raised for the respective metrics).

Contributor

achab commented Oct 19, 2015

To the other participants of the Paris sprint: I'm working on this one.

@achab added a commit to achab/scikit-learn that referenced this issue Oct 19, 2015

@achab fixed issue #3450 for hamming loss bf8f3f1

@agramfort added a commit that referenced this issue Oct 21, 2015

@agramfort Merge pull request #5465 from massil94/dev-massil
[MRG+1] partly fixed issue #3450 for hamming loss
b6a5ed2

@glouppe added a commit to glouppe/scikit-learn that referenced this issue Oct 26, 2015

@achab and @glouppe fixed issue #3450 for hamming loss c1deb59