### * F1 Score

In [1]:
import numpy as np

In [2]:
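def f1_score(precision, recall):
    # F1 is the harmonic mean of precision and recall
    try:
        return 2 * precision * recall / (precision + recall)
    except ZeroDivisionError:
        # precision + recall == 0, so define F1 as 0.0
        return 0.0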
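
Written out, the function above computes the harmonic mean of precision and recall. The second form only holds when both values are nonzero, which is why the function returns 0.0 when precision + recall is 0:

$$
F_1 = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}} = \frac{2}{\frac{1}{\text{precision}} + \frac{1}{\text{recall}}}
$$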

### * Precision and recall after being combined by the F1 Score (harmonic mean)

In [3]:
precision = 0.5
recall = 0.5
f1_score(precision, recall)

0.5

In [4]:
precision = 0.1
recall = 0.9
f1_score(precision, recall)

0.18000000000000002

In [5]:
precision = 0.0
recall = 1.0
f1_score(precision, recall)

0.0
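
The three results above show that F1 drops sharply when precision and recall are far apart. As a minimal sketch of why the harmonic mean is used, the block below compares it with the plain arithmetic mean on the same three pairs. The helper names `arithmetic_mean` and `harmonic_f1` are illustrative and not part of the original notebook.

```python
def arithmetic_mean(precision, recall):
    # Plain average: ignores how unbalanced the two values are
    return (precision + recall) / 2


def harmonic_f1(precision, recall):
    # Harmonic mean: dominated by the smaller of the two values
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Same (precision, recall) pairs as in the cells above
pairs = [(0.5, 0.5), (0.1, 0.9), (0.0, 1.0)]
for p, r in pairs:
    print(f"precision={p}, recall={r}: "
          f"mean={arithmetic_mean(p, r):.2f}, F1={harmonic_f1(p, r):.2f}")
```

The arithmetic mean is 0.50 for all three pairs, while F1 is 0.50, 0.18 and 0.00 respectively, so only the balanced pair keeps a high score.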

### * Prepare the dataset first

In [6]:
from sklearn import datasets

digits = datasets.load_digits()
X = digits.data
y = digits.target.copy()
# y is a copy of target: without copy(), the assignments below would modify digits.target in place

y[digits.target == 9] = 1
y[digits.target != 9] = 0
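# This turns the digits dataset, whose classes are roughly evenly distributed, into a highly skewed one.
# The new dataset has only two classes, and samples labeled 1 are the minority.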

In [7]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=666)

### * Using LogisticRegression

In [8]:
from sklearn.linear_model import LogisticRegression

log_reg = LogisticRegression()
log_reg.fit(X_train, y_train)
log_reg.score(X_test, y_test)

0.9755555555555555

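Note: the result above is the accuracy, which can look high on a skewed dataset even when the minority class is predicted poorly.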
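
To illustrate why accuracy alone can mislead on skewed data, here is a minimal sketch using hypothetical labels (90 negatives and 10 positives, not the digits data) and a baseline that always predicts the majority class. All variable names are illustrative.

```python
# Hypothetical labels (not the digits data): 90 negatives, 10 positives
y_true = [0] * 90 + [1] * 10
# Baseline that always predicts the majority class 0
y_pred = [0] * len(y_true)

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
# Guard each ratio against a zero denominator and fall back to 0.0
precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
f1 = (2 * precision * recall / (precision + recall)
      if (precision + recall) > 0 else 0.0)

print(accuracy, f1)
```

The baseline reaches an accuracy of 0.9 without finding a single positive sample, and its F1 is 0.0.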

In [9]:
y_predict = log_reg.predict(X_test)

#### * Computing the F1 Score with scikit-learn's f1_score()

In [10]:
from sklearn.metrics import f1_score
f1_score(y_test, y_predict)

0.8674698795180723

#### The f1_score() above is computed from precision and recall, so its value is affected by both

In [11]:
from sklearn.metrics import precision_score
precision_score(y_test, y_predict)

0.9473684210526315

In [12]:
from sklearn.metrics import recall_score
recall_score(y_test, y_predict)

0.8

The F1 score above is about 0.87, below the precision of about 0.95, because it is pulled down by the recall of 0.8.
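
As a quick check, the F1 value reported by scikit-learn can be reproduced by hand from the precision and recall printed above. This is a small sketch that hard-codes those two outputs; the variable names are illustrative.

```python
# Values copied from the precision_score and recall_score outputs above
precision = 0.9473684210526315
recall = 0.8

# Harmonic mean of the two, as in the hand-written f1_score function
f1 = 2 * precision * recall / (precision + recall)
# Arithmetic mean, for comparison only
mean = (precision + recall) / 2

print(round(f1, 4), round(mean, 4))
```

The harmonic mean comes out at about 0.8675, slightly below the arithmetic mean of about 0.8737, which reflects the pull toward the lower of the two values.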