# [MRG+2] ENH multiclass balanced accuracy #10587

## Conversation

### jnothman added some commits Feb 5, 2018

 ENH multiclass balanced accuracy 
Includes computationally simpler implementation and logically simpler description.
 a09e7ac 
 COSMIT 
 362c3cc 
 COSMIT 
 d5a065c 
 Try fix tests on earliest dependencies 
 da8d27b 
 Improve ignore_warnings scope 
 11ad2d7 
 Fix use of ignore_warnings as context manager 
 3d9919b 
### jnothman commented Feb 5, 2018

 Ahh... passing tests.

### jnothman added some commits Feb 5, 2018

 corrected -> adjusted 
 62021a2 
 DOC 
 2239eac 
 DOC TeX 
 df2ebbc 
 DOC 
 05a98e4 
 DOC 
 23e3976 

### jnothman reviewed Feb 5, 2018

 @@ -1357,6 +1357,8 @@ functions or non-estimator constructors.

 equal weight by giving each sample a weight inversely related to its class's prevalence in the training data: `n_samples / (n_classes * np.bincount(y))`. **Note** however that this rebalancing does not take the weight of samples in each class into account.

#### jnothman Feb 5, 2018 • edited

Perhaps we should have a "weight-balanced" option for class_weight. It would be interesting to see if that improved imbalanced boosting.

#### jnothman Feb 6, 2018

Apparently my phone wrote "weight-loss card" (!) there. Amended.
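For reference, the `n_samples / (n_classes * np.bincount(y))` heuristic quoted in the hunk above can be sketched in plain Python. This is only an illustration, not scikit-learn's code: `balanced_class_weights` is a hypothetical helper, and it counts samples without looking at sample weights, which is the limitation the note describes.

```python
from collections import Counter


def balanced_class_weights(y):
    """Hypothetical helper: weight each class by n_samples / (n_classes * count)."""
    counts = Counter(y)
    n_samples = len(y)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


y = [0, 0, 0, 0, 0, 0, 1, 1]
weights = balanced_class_weights(y)

# Multiplying each class's weight by its sample count gives the class's total weight.
# Every class ends up with the same total, n_samples / n_classes.
counts = Counter(y)
totals = {label: weights[label] * counts[label] for label in weights}
```

A "weight-balanced" variant, as floated above, would presumably replace the per-class counts with per-class sums of sample weights; that is an assumption about the proposal, not something this PR implements.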

 .. math::

    \texttt{balanced-accuracy}(y, \hat{y}) = \frac{1}{2} \left(\frac{\sum_i 1(\hat{y}_i = 1 \land y_i = 1)}{\sum_i 1(y_i = 1)} + \frac{\sum_i 1(\hat{y}_i = 0 \land y_i = 0)}{\sum_i 1(y_i = 0)}\right)

    \hat{w}_i = \frac{w_i}{\sum_j{1(y_j = y_i) w_j}}

#### jnothman Feb 5, 2018

Should I give the equation assuming w_i=1?

I think it's fine if we leave the general formula.
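To make the general formula concrete, here is a minimal pure-Python sketch of balanced accuracy as ordinary accuracy computed under the normalised weights `w_hat`. The function name and the handling of `sample_weight=None` are assumptions made for illustration; this is not the code under review, and it assumes every class has positive total weight.

```python
def balanced_accuracy(y_true, y_pred, sample_weight=None):
    """Sketch: accuracy under weights rescaled so each true class sums to 1."""
    if sample_weight is None:
        # Assumption: missing weights mean w_i = 1 for every sample.
        sample_weight = [1.0] * len(y_true)

    # Total weight of each true class: sum_j 1(y_j = c) * w_j
    class_total = {}
    for label, w in zip(y_true, sample_weight):
        class_total[label] = class_total.get(label, 0.0) + w

    # w_hat_i = w_i / sum_j 1(y_j = y_i) * w_j
    w_hat = [w / class_total[label] for label, w in zip(y_true, sample_weight)]

    # Weighted accuracy: total normalised weight of correct predictions,
    # divided by the total normalised weight.
    correct = sum(wh for wh, t, p in zip(w_hat, y_true, y_pred) if t == p)
    return correct / sum(w_hat)


y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
score = balanced_accuracy(y_true, y_pred)
```

With `w_i = 1`, class 0 has recall 3/4 and class 1 has recall 1/2, so the score is their mean, 0.625.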

 DOC what's new 
 906d066 

### maskani-moh reviewed Feb 6, 2018

sklearn/metrics/classification.py Outdated
### jnothman commented Feb 6, 2018 • edited

 While I'm interested in your critique of the docs and implementation, @maskani-moh, I'd mostly like you to verify that this interpretation of balanced accuracy, as accuracy with sample weights assigned to give equal total weight to each class, makes the choice of a multiclass generalisation clear.
 Fix typo 
 34d9ba3 

### glemaitre reviewed Feb 7, 2018

sklearn/metrics/classification.py Outdated
 Simpler implementation using confusion_matrix 
 301d475 
### glemaitre commented Feb 8, 2018

 The implementation with the confusion matrix seems really straightforward. It looks like an average of the TPR per class. The generalization from binary to multi-class looks good to me. I don't see a case where it would not be correct.
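The "average of the TPR per class" reading can be sketched without any library. The helpers below are hypothetical stand-ins written for this illustration, not scikit-learn's `confusion_matrix` or the code in this PR. Skipping classes that have no true samples is an assumption of the sketch.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Hypothetical helper: rows are true labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def mean_per_class_recall(matrix):
    """Average of diagonal / row sum, skipping rows with no true samples."""
    recalls = [row[i] / sum(row) for i, row in enumerate(matrix) if sum(row) > 0]
    return sum(recalls) / len(recalls)


y_true = ["a", "a", "a", "a", "b", "b", "c", "c", "c", "c"]
y_pred = ["a", "a", "a", "b", "b", "c", "c", "c", "a", "b"]
cm = confusion_matrix(y_true, y_pred, labels=["a", "b", "c"])
score = mean_per_class_recall(cm)
```

Here the per-class recalls are 3/4, 1/2 and 2/4, so the score is their unweighted mean regardless of how many samples each class has.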

### glemaitre reviewed Feb 8, 2018

doc/modules/model_evaluation.rst Outdated

### glemaitre reviewed Feb 8, 2018

doc/modules/model_evaluation.rst Outdated
sklearn/metrics/classification.py Outdated
doc/modules/model_evaluation.rst Outdated
 Address comments from guillaume 
 1dcc881 

### glemaitre commented Feb 13, 2018

 LGTM. @maskani-moh Could you have a look and tell us WYT?

### glemaitre referenced this pull request Apr 19, 2018

Open

#### FIX use balanced accuracy from scikit-learn #128

 Merge branch 'master' into balacc-multiclass 
 28a034d 
### jnothman commented Jul 26, 2018

 This should be quick to review if someone (other than @glemaitre who has given his +1) is keen to throw it into 0.20.

### qinhanmin2014 reviewed Jul 26, 2018 • edited

LGTM at a glance. I need (and promise) to double check the code and refs tomorrow.
Some small comments, feel free to ignore if you think the current version is fine.
My LGTM on the PR is based on the fact that the function is there. Honestly, I don't like the idea of including such a function, which can simply be implemented using recall.
Tagging 0.20.

doc/modules/model_evaluation.rst Outdated
doc/modules/model_evaluation.rst Outdated
doc/modules/model_evaluation.rst Outdated
sklearn/metrics/classification.py Outdated

### qinhanmin2014 reviewed Jul 26, 2018

    assert balanced == pytest.approx(macro_recall)
    adjusted = balanced_accuracy_score(y_true, y_pred, adjusted=True)
    chance = balanced_accuracy_score(y_true, np.full_like(y_true, y_true[0]))
    assert adjusted == (balanced - chance) / (1 - chance)

#### qinhanmin2014 Jul 26, 2018

Any reason we can't use == when adjusted=False?
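The relationship the test checks can be sketched in plain Python. The function names are hypothetical, and the sketch assumes every class in `y_true` has at least one sample, so that a constant prediction scores `1 / n_classes`. It is not the implementation in this PR.

```python
def mean_recall(y_true, y_pred):
    """Unweighted mean of per-class recall (balanced accuracy without weights)."""
    recalls = []
    for label in sorted(set(y_true)):
        preds = [p for t, p in zip(y_true, y_pred) if t == label]
        recalls.append(sum(p == label for p in preds) / len(preds))
    return sum(recalls) / len(recalls)


def adjusted_mean_recall(y_true, y_pred):
    """Rescale so a constant prediction scores 0 and a perfect one scores 1."""
    balanced = mean_recall(y_true, y_pred)
    chance = 1 / len(set(y_true))  # score of predicting a single class everywhere
    return (balanced - chance) / (1 - chance)


y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 0]
constant = [y_true[0]] * len(y_true)

balanced = mean_recall(y_true, y_pred)
adjusted = adjusted_mean_recall(y_true, y_pred)
adjusted_constant = adjusted_mean_recall(y_true, constant)
```

Because the scores are floating-point averages, two routes to the same value can differ in the last bits, which is one plausible reason to compare with a tolerance (as `pytest.approx` does) rather than with `==`.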

### qinhanmin2014 reviewed Jul 26, 2018

doc/modules/model_evaluation.rst Outdated

### qinhanmin2014 approved these changes Jul 27, 2018

LGTM apart from the comments above.

doc/whats_new/v0.20.rst Outdated

### jnothman commented Jul 27, 2018

 > Honestly, I don't like the idea of including such a function, which can be simply implemented by recall.

 The adjusted metric can't just be implemented by recall. But really, we've had years of people asking for balanced accuracy, and not realising that they could implement it with recall....
### jnothman commented Jul 27, 2018

 I don't have time to fix these up right away...
### qinhanmin2014 commented Jul 27, 2018

 @jnothman Do you mind if I push some cosmetic changes and merge this one?
### jnothman commented Jul 27, 2018

 I don't mind if you're confident about them
 mostly formatting, I guess you won't be unhappy :) 
 9cf3979 

### qinhanmin2014 approved these changes Jul 27, 2018

LGTM, thanks @jnothman

### qinhanmin2014 merged commit e888c0d into scikit-learn:master Jul 27, 2018 4 of 5 checks passed

### jnothman commented Jul 29, 2018

 Removing those backslashes broke CircleCI on master.

