New feature request: Matthews correlation coefficient #43

Open
aalexandersson opened this issue Aug 6, 2019 · 1 comment

Please add Matthews correlation coefficient (MCC) as an additional statistic for the confusion table:

      TP * TN - FP * FN
MCC = -----------------------------------------------------
      [(TP + FP) * (FN + TN) * (FP + TN) * (TP + FN)]^(1/2)

The MCC is useful as an overall measure of linkage quality. It is better suited than Accuracy and the F1-score to imbalanced data because it takes all four confusion-table categories (TP, TN, FP, and FN) into account in proportion to their sizes. In practice, I find that most linkage data are imbalanced, consisting mostly of TN.
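
For illustration, here is a minimal sketch in Python of how the statistic could be computed from the four counts. The `mcc` function name and the convention of returning 0 when the denominator is zero are my own illustrative choices, not taken from any particular package:

```python
import math

def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-table counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (fn + tn) * (fp + tn) * (tp + fn))
    # If any marginal sum is zero the denominator is undefined;
    # returning 0 in that case is a common convention (illustrative choice).
    return numerator / denominator if denominator else 0.0

# Mostly-TN imbalanced example, as is typical for linkage data:
# Accuracy is (5 + 985) / 1000 = 0.99, yet MCC is only ~0.49.
print(mcc(tp=5, tn=985, fp=5, fn=5))  # 0.4949...
```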

Wikipedia: https://en.wikipedia.org/wiki/Matthews_correlation_coefficient
Matthews's article (1975): https://doi.org/10.1016/0005-2795(75)90109-9

Matthews, page 445:

"A correlation of:
   C =  1 indicates perfect agreement,
   C =  0 is expected for a prediction no better than random, and
   C = -1 indicates total disagreement between prediction and observation".

Mentioned in Tharwat's article (2018): https://doi.org/10.1016/j.aci.2018.08.003
Recommended by Luque et al (2019): https://doi.org/10.1016/j.patcog.2019.02.023

Anders


aalexandersson commented Nov 13, 2021

Recommended by Canbek et al. (2021): https://rdcu.be/cvT7d

Conclusion:

In conclusion, this study proposes a new comprehensive benchmarking method to analyze the robustness of performance metrics and ranks 15 performance metrics in the literature. Researchers can use MCC as the most robust metric for general objective purposes to be on the safe side.

Full reference:
Canbek, G., Taskaya Temizel, T. & Sagiroglu, S. BenchMetrics: a systematic benchmarking method for binary classification performance metrics. Neural Computing and Applications 33, 14623–14650 (2021). https://doi.org/10.1007/s00521-021-06103-6
