Macro vs micro-averaging switched up in user guide #28585
cc @ogrisel @lorentzenchr @GaelVaroquaux who might have a better intuition here.
@uhoenig You are right. In micro averaging, all examples are treated equally (we compare predictions and ground truth for each example and compute the necessary metrics), while in macro averaging all classes are treated equally (we compare predictions and ground truth of the examples for each class, compute the metrics for each class, and average those class metrics to get the macro average). See slide 41. As such, in a multi-class setting with imbalanced classes, if the overall performance of a classifier is important to us, we should opt for macro-based evaluation. On that note, it seems that a few sentences above it should also be macro instead of micro:

"Micro-averaging aggregates the contributions from all the classes (using numpy.ravel) to compute the average metrics as follows:"

should be

"Macro-averaging aggregates the contributions from all the classes (using numpy.ravel) to compute the average metrics as follows:"
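To make the distinction concrete, here is a minimal sketch (the three-class data is made up purely for illustration) of how a degenerate majority-class classifier scores well under micro averaging but poorly under macro averaging, using scikit-learn's f1_score:

```python
import numpy as np
from sklearn.metrics import f1_score

# Illustrative imbalanced 3-class problem: class 0 holds 90 of 100 samples.
y_true = np.array([0] * 90 + [1] * 5 + [2] * 5)
y_pred = np.zeros(100, dtype=int)  # degenerate classifier: always predict class 0

# Micro averaging pools every (prediction, ground truth) pair,
# so the majority class dominates the score.
micro = f1_score(y_true, y_pred, average="micro")  # ≈ 0.9

# Macro averaging computes F1 per class first, then takes an unweighted mean,
# so the two classes the model never predicts drag the score down.
macro = f1_score(y_true, y_pred, average="macro", zero_division=0)  # ≈ 0.32
```

The gap between the two numbers is exactly the point being debated: micro averaging rewards getting the majority class right, macro averaging exposes that two of the three classes are never predicted.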
Micro-averaging aggregates all instances (samples from all classes) to compute the metric, so a data point from a "minority" class and one from a "majority" class have the same impact. Macro-averaging groups samples by class and aggregates (using the mean) afterwards. It therefore increases the importance of data points from under-represented classes, because it considers them as important as those of highly populated classes. With these aspects in mind, I cannot say that one metric is particularly better than the other; it all boils down to the application and the setup where the classifier will be used.
I assume that we should just remove the statement and instead make explicit the consequences of using one type of averaging or the other.
"With these aspects in mind, I cannot say that one metric is particularly better than the other; it all boils down to the application and the setup where the classifier will be used."

Yes, indeed.
On that note, it seems that a few sentences above it should also be macro instead of micro:
Micro-averaging aggregates the contributions from all the classes (using numpy.ravel) to compute the average metrics as follows:
should be
Macro-averaging aggregates the contributions from all the classes (using numpy.ravel) to compute the average metrics as follows:
It seems you're right: macro-averaging does class-level averaging (this is why it's called macro), and micro-averaging does instance-level averaging.
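As a sketch of the two computations side by side (assuming the Iris dataset and a logistic regression, roughly as in the linked example; the details may differ from the actual documentation code):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
y_score = LogisticRegression(max_iter=1000).fit(X_train, y_train).predict_proba(X_test)
y_onehot = label_binarize(y_test, classes=[0, 1, 2])

# Micro-averaging: numpy.ravel flattens the (n_samples, n_classes) indicator and
# score matrices, so a single ROC curve is computed over all (sample, class) pairs
# -- instance-level averaging.
fpr, tpr, _ = roc_curve(y_onehot.ravel(), y_score.ravel())
micro_roc_auc = auc(fpr, tpr)

# Macro-averaging: one one-vs-rest ROC AUC per class, then an unweighted mean
# -- class-level averaging.
macro_roc_auc = roc_auc_score(y_test, y_score, multi_class="ovr", average="macro")
```

So the use of numpy.ravel is specific to micro-averaging, which is why the sentence quoted above reads as mislabeled.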
I'd love to contribute to this issue. It may also be worth mentioning that in a multi-class classification setup with balanced classes, both macro and micro averaging can produce comparable results when evaluating ROC curves (as can be seen for the Iris plants dataset in the documentation).
I appreciate the discussion regarding the documentation on micro and macro averaging for multiclass classification. I concur with glemaitre's viewpoint that the effects of choosing either averaging method should be explained neutrally, as that allows users to make informed decisions based on their specific needs without swaying them towards one option. However, I would like to emphasize the value of providing guidance, particularly on handling imbalanced datasets. Real-world datasets are seldom perfectly balanced, so tips on effective strategies that this library already supports (stratified sampling, class weighting, and the appropriate use of macro averaging in multiclass settings) are not just useful but crucial for achieving reliable results. Such recommendations align with the library's consistent efforts to equip users to tackle imbalanced classes effectively. Removing explicit statements about the superiority of one method over another is sensible; yet maintaining practical advice reflects the realities users face and supports the library's broader goal of fostering effective and informed machine learning practices.
Agreed. In that case the recommendation is:
PR welcome to fix the documentation.
Describe the issue linked to the documentation
Hi guys,
In the "ROC curve using micro-averaged OvR" part of the doc (https://scikit-learn.org/stable/auto_examples/model_selection/plot_roc.html#roc-curve-using-micro-averaged-ovr)
it says:
"In a multi-class classification setup with highly imbalanced classes, micro-averaging is preferable over macro-averaging. In such cases, one can alternatively use a weighted macro-averaging, not demoed here."
I believe it should say:
In a multi-class classification setup with highly imbalanced classes, macro-averaging is preferable over micro-averaging. In such cases, one can alternatively use a weighted macro-averaging, not demoed here.
If correct, I believe fixing it could spare users some confusion. Thanks for all your work, I'm just trying to help :)
Suggest a potential alternative/fix
I believe it should say:
In a multi-class classification setup with highly imbalanced classes, macro-averaging is preferable over micro-averaging. In such cases, one can alternatively use a weighted macro-averaging, not demoed here.
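For reference, the "weighted macro-averaging" mentioned in that sentence corresponds to average="weighted" in scikit-learn's classification metrics: per-class scores are still computed separately, but the mean is weighted by each class's support. A minimal sketch with made-up labels:

```python
from sklearn.metrics import f1_score

# Toy labels, for illustration only: class 0 has more support than classes 1 and 2.
y_true = [0, 0, 0, 0, 1, 2]
y_pred = [0, 0, 0, 1, 1, 2]

# Unweighted mean of the per-class F1 scores.
macro = f1_score(y_true, y_pred, average="macro")
# Same per-class F1 scores, but the mean is weighted by each class's support,
# so the majority class counts for more.
weighted = f1_score(y_true, y_pred, average="weighted")
```

Weighted averaging sits between the two extremes discussed above: per-class like macro, but support-sensitive like micro.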