
Macro vs micro-averaging switched up in user guide #28585

Open
uhoenig opened this issue Mar 6, 2024 · 10 comments

Comments

@uhoenig

uhoenig commented Mar 6, 2024

Describe the issue linked to the documentation

Hi guys,
In the "ROC curve using micro-averaged OvR" part of the doc (https://scikit-learn.org/stable/auto_examples/model_selection/plot_roc.html#roc-curve-using-micro-averaged-ovr)

it says:
"In a multi-class classification setup with highly imbalanced classes, micro-averaging is preferable over macro-averaging. In such cases, one can alternatively use a weighted macro-averaging, not demoed here."

I believe it should say:
In a multi-class classification setup with highly imbalanced classes, macro-averaging is preferable over micro-averaging. In such cases, one can alternatively use a weighted macro-averaging, not demoed here.

If correct, I believe it could spare users some confusion. Thanks for all your work, I'm just trying to help :) !!!

Suggest a potential alternative/fix

I believe it should say:
In a multi-class classification setup with highly imbalanced classes, macro-averaging is preferable over micro-averaging. In such cases, one can alternatively use a weighted macro-averaging, not demoed here.

@uhoenig uhoenig added Documentation Needs Triage Issue requires triage labels Mar 6, 2024
@adrinjalali
Member

cc @ogrisel @lorentzenchr @GaelVaroquaux who might have a better intuition here.

@fkdosilovic
Contributor

@uhoenig You are right.

In micro averaging, all examples are treated equally (we compare predictions and ground truth for each example and compute the necessary metrics), while for macro averaging all classes are treated equally (we compare prediction and ground truth of examples for each class, compute the metrics for each class, and average those class metrics to get the macro average). See slide 41.

As such, in a multi-class setting with imbalanced classes, if the overall performance of a classifier is important to us, we should opt for macro-based evaluation.
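To make the distinction concrete, here is a small self-contained sketch (toy data, plain Python rather than scikit-learn code) showing how micro-averaged recall is dominated by the majority class, while macro-averaged recall gives each class equal weight:

```python
# Imbalanced 3-class toy problem: class 0 dominates, and the classifier
# predicts the majority class for every sample.
y_true = [0] * 90 + [1] * 5 + [2] * 5
y_pred = [0] * 100

classes = sorted(set(y_true))

def recall_for(cls):
    # Per-class recall: correct predictions for this class / actual members.
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    actual = sum(t == cls for t in y_true)
    return tp / actual

# Micro recall: pool all samples first -> dominated by the majority class.
micro = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Macro recall: average the per-class recalls -> each class counts equally.
macro = sum(recall_for(c) for c in classes) / len(classes)

print(micro)  # 0.9
print(macro)  # 0.333...
```

The classifier is useless on the minority classes, yet the micro average still reports 0.9; the macro average of 1/3 exposes the failure.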

On that note, it seems that a few sentences above, it should also be macro instead of micro:

Micro-averaging aggregates the contributions from all the classes (using numpy.ravel) to compute the average metrics as follows:

should be

Macro-averaging aggregates the contributions from all the classes (using numpy.ravel) to compute the average metrics as follows:
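For reference, the ravel trick the quoted sentence refers to can be sketched as follows (toy data, plain Python standing in for numpy.ravel, with hypothetical classifier scores): one-hot encode the labels, then flatten labels and scores row-major so every (sample, class) pair becomes one entry of a single binary problem.

```python
y_true = [0, 1, 2, 0]
n_classes = 3

# "Label-binarized" ground truth, shape (n_samples, n_classes).
y_onehot = [[1 if c == t else 0 for c in range(n_classes)] for t in y_true]

# Hypothetical per-class scores from some classifier.
y_score = [
    [0.7, 0.2, 0.1],
    [0.3, 0.5, 0.2],
    [0.2, 0.2, 0.6],
    [0.6, 0.3, 0.1],
]

# Equivalent of numpy.ravel on both arrays (row-major flattening).
flat_true = [v for row in y_onehot for v in row]
flat_score = [s for row in y_score for s in row]

print(flat_true)  # [1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0]
```

A binary ROC curve computed on `flat_true` / `flat_score` pools every (sample, class) pair into one binary task, which is what the pooled OvR averaging in the example does.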

@glemaitre glemaitre removed the Needs Triage Issue requires triage label Mar 11, 2024
@glemaitre
Member

Micro-averaging aggregates all instances (samples from all classes) to compute the metric, so a data point from a "minority" or "majority" class has the same impact.

Macro-averaging groups samples by class and aggregates afterwards (using the mean). Therefore, you increase the importance of data points from under-represented classes, because you consider them as important as those from highly populated classes.

With these aspects in mind, I cannot say that one metric is particularly better than the other; it all boils down to the application and the setup where the classifier will be used.
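This re-weighting can be shown directly with made-up per-class recalls and supports (all numbers below are hypothetical, not from the scikit-learn example):

```python
# Hypothetical per-class recalls and supports for a 3-class problem.
recalls = {0: 0.95, 1: 0.40, 2: 0.50}
support = {0: 90, 1: 5, 2: 5}
n = sum(support.values())

# Macro: each class counts equally, regardless of how many samples it has.
macro = sum(recalls.values()) / len(recalls)

# Support-weighted: each class counts in proportion to its size, so the
# majority class dominates.
weighted = sum(recalls[c] * support[c] / n for c in recalls)

print(round(macro, 3))     # 0.617
print(round(weighted, 3))  # 0.9
```

The same per-class numbers yield very different aggregate scores depending on the weighting, which is exactly why the choice depends on the application.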

@glemaitre
Member

I assume that we should just remove the statement and instead make explicit what the consequences are of using one or the other type of averaging.

@GaelVaroquaux
Member

GaelVaroquaux commented Mar 11, 2024 via email

@GaelVaroquaux
Member

GaelVaroquaux commented Mar 11, 2024 via email

@jmarintur
Contributor

I'd love to contribute to this issue. It may also be worth mentioning that in a multi-class classification setup with balanced classes, both macro and micro averaging could produce comparable results when evaluating ROC curves (as can be seen for the Iris plants dataset in the documentation).

@uhoenig
Author

uhoenig commented Mar 12, 2024

I appreciate the discussion regarding the documentation on micro and macro averaging for classification in multiclass scenarios. I concur with glemaitre's viewpoint on neutrally explaining the effects of choosing either averaging method, as it allows users to make informed decisions based on their specific needs without swaying them towards one option.

However, I would like to emphasize the value of providing guidance, particularly on handling imbalanced datasets. Real-world datasets are seldom perfectly balanced, so tips on effective strategies (such as stratified sampling, class weighting, and the appropriate use of macro averaging in multiclass settings, all of which this library supports) are not just useful but crucial for achieving reliable results. These recommendations align with the library's consistent efforts to equip users to tackle imbalanced classes effectively.

Removing explicit statements about the superiority of one method over another is sensible. Yet, maintaining practical advice reflects the realities users face and supports the library's broader goal of fostering effective and informed machine learning practices.

@lorentzenchr
Member

I would like to emphasize the value of providing guidance, particularly regarding handling imbalanced datasets.

Agreed. In that case the recommendation is:

  • Choose a metric as close as possible to the business/use case outcome you hope to achieve.
  • To compare (classification) models and measure predictive performance, use consistent scoring rules. They are not negatively affected by imbalanced classes, nor do you need to choose between micro vs macro averaging.
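As an illustration of the second point, log loss is one such consistent scoring rule: it scores each predicted probability vector directly, with no micro/macro choice to make. A minimal sketch with toy data (the labels and probabilities below are made up):

```python
import math

y_true = [0, 1, 2, 0]

# Hypothetical predicted class probabilities, one row per sample.
y_prob = [
    [0.7, 0.2, 0.1],
    [0.3, 0.5, 0.2],
    [0.2, 0.2, 0.6],
    [0.6, 0.3, 0.1],
]

# Mean negative log-probability assigned to the true class of each sample.
log_loss = -sum(math.log(p[t]) for t, p in zip(y_true, y_prob)) / len(y_true)

print(round(log_loss, 4))  # 0.5179
```

Because every sample contributes its own term regardless of class membership, there is no per-class aggregation step at which an averaging strategy would have to be chosen.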

@lorentzenchr
Member

PR welcome to fix the documentation.


7 participants