New feature request: Matthews correlation coefficient #43

Open
aalexandersson opened this issue Aug 6, 2019 · 1 comment

Please add Matthews correlation coefficient (MCC) as an additional statistic for the confusion table:

      TP * TN - FP * FN
MCC = -----------------------------------------------------
      [(TP + FP) * (FN + TN) * (FP + TN) * (TP + FN)]^(1/2)

The MCC is useful as an overall measure of linkage quality. It is better suited than Accuracy and the F1-score to imbalanced data because it takes all four confusion-table categories (TP, TN, FP, and FN) into account in proportion to their sizes. In practice, I find that most linkage data are imbalanced, consisting mostly of TN.
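
For illustration, here is a minimal sketch in Python of how the statistic could be computed from the four counts. The `mcc` function name and the convention of returning 0 when the denominator is zero are my own illustrative choices, not taken from any particular package:

```python
import math

def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-table counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (fn + tn) * (fp + tn) * (tp + fn))
    # If any marginal sum is zero the denominator is undefined;
    # returning 0 in that case is a common convention (illustrative choice).
    return numerator / denominator if denominator else 0.0

# Mostly-TN imbalanced example, as is typical for linkage data:
# Accuracy is (5 + 985) / 1000 = 0.99, yet MCC is only ~0.49.
print(mcc(tp=5, tn=985, fp=5, fn=5))  # 0.4949...
```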

Wikipedia: https://en.wikipedia.org/wiki/Matthews_correlation_coefficient
Matthews's article (1975): https://doi.org/10.1016/0005-2795(75)90109-9

Matthews, page 445:

"A correlation of:
   C =  1 indicates perfect agreement,
   C =  0 is expected for a prediction no better than random, and
   C = -1 indicates total disagreement between prediction and observation".

Mentioned in Tharwat's article (2018): https://doi.org/10.1016/j.aci.2018.08.003
Recommended by Luque et al (2019): https://doi.org/10.1016/j.patcog.2019.02.023

Anders


aalexandersson commented Nov 13, 2021

Recommended by Canbek et al. (2021): https://rdcu.be/cvT7d

Conclusion:

In conclusion, this study proposes a new comprehensive benchmarking method to analyze the robustness of performance metrics and ranks 15 performance metrics in the literature. Researchers can use MCC as the most robust metric for general objective purposes to be on the safe side.

Full reference:
Canbek, G., Taskaya Temizel, T. & Sagiroglu, S. BenchMetrics: a systematic benchmarking method for binary classification performance metrics. Neural Computing and Applications 33, 14623–14650 (2021). https://doi.org/10.1007/s00521-021-06103-6
