## F1 Score

In [1]:
import numpy as np

In [2]:
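def f1_score(precision, recall):
    # F1 is the harmonic mean of precision and recall
    try:
        return 2 * precision * recall / (precision + recall)
    except ZeroDivisionError:
        # precision and recall are both 0, so define F1 as 0
        return 0.0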

In [3]:
precision = 0.5
recall = 0.5
f1_score(precision, recall)

0.5

In [4]:
precision = 0.1
recall = 0.9
f1_score(precision, recall)

0.18000000000000002

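With the arithmetic mean, the result here would still be 0.5, but the harmonic mean gives 0.18, far below 0.5. This is the advantage of the harmonic mean: when either precision or recall is very small, the F1 Score is also very small.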
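The comparison described above can be checked directly. The following is a minimal sketch; the helper names `arithmetic_mean` and `harmonic_mean` are introduced here for illustration only and are not part of the notebook.

```python
def arithmetic_mean(a, b):
    # Plain average: a high value can fully offset a low one
    return (a + b) / 2


def harmonic_mean(a, b):
    # Same formula as f1_score: dominated by the smaller of the two values
    if a + b == 0:
        return 0.0
    return 2 * a * b / (a + b)


# The two (precision, recall) pairs used in the cells above
for p, r in [(0.5, 0.5), (0.1, 0.9)]:
    print(p, r, arithmetic_mean(p, r), harmonic_mean(p, r))
```

Both pairs have the same arithmetic mean, yet only the balanced pair keeps a harmonic mean of 0.5.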

In [5]:
precision = 0.0
recall = 1.0
f1_score(precision, recall)

0.0

## Using Real Data

In [9]:
from sklearn import datasets

digits = datasets.load_digits()
X = digits.data
y = digits.target.copy()

y[digits.target==9] = 1
y[digits.target!=9] = 0

In [10]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=666)

In [11]:
from sklearn.linear_model import LogisticRegression

log_reg = LogisticRegression()
log_reg.fit(X_train, y_train)
log_reg.score(X_test, y_test)

0.9755555555555555

In [12]:
y_predict = log_reg.predict(X_test)

In [13]:
from sklearn.metrics import confusion_matrix
# Confusion matrix
confusion_matrix(y_test, y_predict)

array([[403,   2],
       [  9,  36]], dtype=int64)

In [14]:
from sklearn.metrics import precision_score
# Precision
precision_score(y_test, y_predict)

0.9473684210526315

In [15]:
from sklearn.metrics import recall_score
# Recall
recall_score(y_test, y_predict)

0.8
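As a cross-check, the precision and recall reported above can be recovered by hand from the confusion matrix. The sketch below hard-codes the four counts from the `confusion_matrix` output and assumes scikit-learn's binary layout, where rows are true labels and columns are predicted labels; the variable names are illustrative.

```python
# Counts copied from the confusion matrix output above
# Layout: [[TN, FP],
#          [FN, TP]]
tn, fp = 403, 2
fn, tp = 9, 36

# Precision: of everything predicted positive, how much is truly positive
precision = tp / (tp + fp)

# Recall: of everything truly positive, how much was found
recall = tp / (tp + fn)

print(precision, recall)
```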

### Computing the F1 Score

In [16]:
from sklearn.metrics import f1_score

f1_score(y_test, y_predict)

0.8674698795180723

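The F1 Score (about 86.7%) is well below the classification accuracy of 97.6%, and also below the precision of 94.7%. This is because the F1 Score combines precision and recall, and recall here is only 80%. On skewed data like this, the F1 Score therefore reflects the classifier's performance better than accuracy does.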
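To see where the gap between accuracy and F1 comes from, both can be recomputed from the confusion matrix counts shown earlier. This is a small sketch with the counts hard-coded; the variable names are illustrative.

```python
# Counts copied from the confusion matrix output: [[TN, FP], [FN, TP]]
tn, fp = 403, 2
fn, tp = 9, 36

# Accuracy counts true negatives as successes, and they dominate this skewed data
accuracy = (tp + tn) / (tp + tn + fp + fn)

# F1 expressed in counts: true negatives do not appear at all
f1 = 2 * tp / (2 * tp + fp + fn)

print(accuracy, f1)
```

Accuracy is inflated by the 403 correctly rejected negatives, while F1 depends only on how well the positive class (digit 9) is handled.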