Description
For the multi-class case, the micro average option seems to result in mathematically equivalent definitions for precision_score and recall_score (and, as a result, an f1_score and fbeta_score that are equivalent to accuracy_score).
Am I missing something? Here is my argument:
For the multi-class setting, let p_m and r_m denote the micro-averaged precision and recall, respectively. If we define
p_m = tp / (tp + fp)
r_m = tp / (tp + fn)
where tp, fp, tn, fn are the global totals of true positives, false positives, true negatives, and false negatives (this seems to be my understanding of how it is implemented), then by definition fp = fn: for an incorrect prediction (true class A, predicted class B), the sample is a false negative with respect to class A and a false positive with respect to class B, so summing over all classes, both totals equal the number of incorrect predictions. Since tp is the number of correct predictions, tp + fp = tp + fn = the total number of samples, and hence p_m and r_m (and therefore the micro-averaged F-scores) are all the same as accuracy.
Here is the code that got me thinking about this.
from sklearn.metrics import precision_score, recall_score, f1_score, accuracy_score
import random

# Random 3-class labels
y_pred = [random.randint(0, 2) for i in range(100)]
y_true = [random.randint(0, 2) for i in range(100)]

print(precision_score(y_true, y_pred, average='micro'))
print(recall_score(y_true, y_pred, average='micro'))
print(f1_score(y_true, y_pred, average='micro'))
print(accuracy_score(y_true, y_pred))
0.34000000000000002
0.34000000000000002
0.34000000000000002
0.34000000000000002
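
To check the fp = fn step directly (rather than only the final scores), here is a minimal sketch of my own, not part of the report above, that recovers the global counts from the confusion matrix; the fixed seed and the use of confusion_matrix/numpy are just assumptions for illustration:

# Sketch (assumption: recompute global tp/fp/fn from sklearn's confusion_matrix;
# rows are true labels, columns are predicted labels)
import random
import numpy as np
from sklearn.metrics import confusion_matrix

random.seed(0)  # fixed seed so the check is reproducible
y_pred = [random.randint(0, 2) for i in range(100)]
y_true = [random.randint(0, 2) for i in range(100)]

cm = confusion_matrix(y_true, y_pred)
tp = np.diag(cm).sum()                      # global true positives = correct predictions
fp = (cm.sum(axis=0) - np.diag(cm)).sum()   # per-class false positives, summed over classes
fn = (cm.sum(axis=1) - np.diag(cm)).sum()   # per-class false negatives, summed over classes

print(fp == fn)                                           # True: both equal the misclassified count
print(tp / (tp + fp), tp / (tp + fn), tp / len(y_true))   # three identical values

Both fp and fn reduce to cm.sum() - tp (the number of misclassified samples), which is exactly why the micro-averaged precision, recall, and accuracy coincide.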