# Using Evaluation Metrics in Model Selection

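We often want to use metrics such as AUC when doing model selection with GridSearchCV or cross_val_score. Fortunately, scikit-learn makes this very simple through the scoring parameter, which is accepted by both GridSearchCV and cross_val_score. You only need to pass a string naming the evaluation metric you want. As an example, suppose we want to evaluate an SVM classifier on the "nine vs. rest" task of the digits dataset using the AUC score. To change the score from the default (accuracy) to AUC, pass "roc_auc" as the value of the scoring parameter: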

In [1]:
# the default scoring for classification is accuracy
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.datasets import load_digits

digits = load_digits()
print("Default scoring: {}".format(cross_val_score(SVC(), digits.data, digits.target == 9)))

# passing scoring="accuracy" explicitly does not change the results
explicit_accuracy = cross_val_score(SVC(), digits.data, digits.target == 9, scoring="accuracy")
# format each fold's score with three decimals, then join them into one string
formatted_accuracy = ", ".join("{:.3f}".format(score) for score in explicit_accuracy)
print("Explicit accuracy scoring: [{}]".format(formatted_accuracy))

roc_auc = cross_val_score(SVC(), digits.data, digits.target == 9, scoring="roc_auc")
# same formatting for the per-fold AUC values
formatted_auc = ", ".join("{:.3f}".format(score) for score in roc_auc)
print("AUC scoring: [{}]".format(formatted_auc))

Default scoring: [0.975      0.99166667 1.         0.99442897 0.98050139]
Explicit accuracy scoring: [0.975, 0.992, 1.000, 0.994, 0.981]
AUC scoring: [0.997, 0.999, 1.000, 1.000, 0.984]
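To make clearer what scoring="roc_auc" does, the following is a minimal sketch that recomputes the per-fold AUC by hand. It assumes the default splitter for a classifier is an unshuffled 5-fold StratifiedKFold and that the scorer ranks samples by the SVC's decision_function; the variable names (manual_auc, builtin_auc) are only illustrative.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

digits = load_digits()
X, y = digits.data, digits.target == 9

# assumption: this matches the default split used for classifiers
cv = StratifiedKFold(n_splits=5)

manual_auc = []
for train_idx, test_idx in cv.split(X, y):
    model = SVC().fit(X[train_idx], y[train_idx])
    # AUC needs a continuous score per sample, not hard class predictions
    fold_scores = model.decision_function(X[test_idx])
    manual_auc.append(roc_auc_score(y[test_idx], fold_scores))

builtin_auc = cross_val_score(SVC(), X, y, scoring="roc_auc", cv=cv)

print("Manual AUC per fold:", np.round(manual_auc, 3))
print("scoring='roc_auc':  ", np.round(builtin_auc, 3))
```

The two lines should agree, which shows that the string is just shorthand for fitting on each training fold and scoring the held-out fold with the chosen metric.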


Similarly, we can change the metric that GridSearchCV uses to pick the best parameters:

In [2]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(digits.data, digits.target == 9, random_state=0)

In [3]:

from sklearn.model_selection import GridSearchCV
from sklearn.metrics import roc_auc_score
# we deliberately use a rather poor grid to illustrate the point:
param_grid = {'gamma': [0.0001, 0.01, 0.1, 1, 10]}
# using the default scoring of accuracy:
grid = GridSearchCV(SVC(), param_grid=param_grid)
grid.fit(X_train, y_train)
print("Grid-Search with accuracy:")
print("Best parameters:", grid.best_params_)
print("Best cross-validation score (accuracy): {:.3f}".format(grid.best_score_))
print("Test set AUC: {:.3f}".format(roc_auc_score(y_test, grid.decision_function(X_test))))
print("Test set accuracy: {:.3f}".format(grid.score(X_test, y_test)))

Grid-Search with accuracy:
Best parameters: {'gamma': 0.0001}
Best cross-validation score (accuracy): 0.976
Test set AUC: 0.992
Test set accuracy: 0.973


In [4]:
# using AUC scoring instead:
grid = GridSearchCV(SVC(), param_grid=param_grid, scoring="roc_auc")
grid.fit(X_train, y_train)
print("\nGrid-Search with AUC")
print("Best parameters:", grid.best_params_)
print("Best cross-validation score (AUC): {:.3f}".format(grid.best_score_))
print("Test set AUC: {:.3f}".format(roc_auc_score(y_test, grid.decision_function(X_test))))
# with scoring="roc_auc", grid.score also reports AUC rather than accuracy
print("Test set score from grid.score (AUC): {:.3f}".format(grid.score(X_test, y_test)))


Grid-Search with AUC
Best parameters: {'gamma': 0.01}
Best cross-validation score (AUC): 0.998
Test set AUC: 1.000
Test set score from grid.score (AUC): 1.000


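When using accuracy, the selected parameter is gamma=0.0001, while with AUC it is gamma=0.01. In both cases the best cross-validation score is close to the corresponding test-set score. The parameter setting found with AUC also gives a higher test-set AUC (1.000 versus 0.992). Note that GridSearchCV.score applies the metric passed as scoring, so the last value printed in the AUC run (1.000) is again an AUC, not an accuracy. The output therefore does not show whether the AUC-selected setting is also more accurate; for that, accuracy has to be computed separately from grid.predict(X_test).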
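Because the score method of GridSearchCV follows the scoring argument, test-set accuracy for an AUC-tuned search has to be computed explicitly. A minimal sketch, reusing the same split and grid as above; accuracy_score and the variable names are additions for illustration only:

```python
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target == 9, random_state=0)

param_grid = {'gamma': [0.0001, 0.01, 0.1, 1, 10]}
grid = GridSearchCV(SVC(), param_grid=param_grid, scoring="roc_auc")
grid.fit(X_train, y_train)

# grid.score applies the scorer chosen via `scoring`, i.e. AUC here
score_from_grid = grid.score(X_test, y_test)
test_auc = roc_auc_score(y_test, grid.decision_function(X_test))
# accuracy must be computed from the hard class predictions
test_accuracy = accuracy_score(y_test, grid.predict(X_test))

print("grid.score (AUC): {:.3f}".format(score_from_grid))
print("Test set AUC: {:.3f}".format(test_auc))
print("Test set accuracy: {:.3f}".format(test_accuracy))
```

The first two printed values should be identical, while the third is the quantity that can be compared fairly against the accuracy-tuned search.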

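For classification, the most important values for the scoring parameter are accuracy (the default), roc_auc (area under the ROC curve), average_precision (area under the precision-recall curve), and f1, f1_macro, f1_micro and f1_weighted (the binary f1-score and its differently averaged variants). For regression, the most commonly used values are r2 (the R^2 score), neg_mean_squared_error (mean squared error) and neg_mean_absolute_error (mean absolute error). The error metrics carry a neg_ prefix because scikit-learn scorers follow a greater-is-better convention, so the errors are returned negated.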
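The regression strings work the same way as the classification ones. The sketch below is only an illustration: the synthetic data from make_regression and the Ridge model are assumptions chosen to keep it self-contained, not part of the example above.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# synthetic data, used purely for illustration
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

r2_scores = cross_val_score(Ridge(), X, y, scoring="r2")
neg_mse = cross_val_score(Ridge(), X, y, scoring="neg_mean_squared_error")

# scorers are "greater is better", so flip the sign to recover the usual MSE
mse = -neg_mse

print("R^2 per fold:", np.round(r2_scores, 3))
print("MSE per fold:", np.round(mse, 1))
```

Keeping the sign convention in mind matters mostly when reading best_score_ from a grid search: a value closer to zero is better for the negated error metrics.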

In [5]:
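# get_scorer_names is the public way to list the valid scoring strings;
# the private SCORERS dict was deprecated and later removed from scikit-learn
from sklearn.metrics import get_scorer_names
print("Available scorers: \n {}".format(sorted(get_scorer_names())))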

Available scorers: 
 ['accuracy', 'adjusted_mutual_info_score', 'adjusted_rand_score', 'average_precision', 'balanced_accuracy', 'completeness_score', 'explained_variance', 'f1', 'f1_macro', 'f1_micro', 'f1_samples', 'f1_weighted', 'fowlkes_mallows_score', 'homogeneity_score', 'jaccard', 'jaccard_macro', 'jaccard_micro', 'jaccard_samples', 'jaccard_weighted', 'matthews_corrcoef', 'max_error', 'mutual_info_score', 'neg_brier_score', 'neg_log_loss', 'neg_mean_absolute_error', 'neg_mean_absolute_percentage_error', 'neg_mean_gamma_deviance', 'neg_mean_poisson_deviance', 'neg_mean_squared_error', 'neg_mean_squared_log_error', 'neg_median_absolute_error', 'neg_negative_likelihood_ratio', 'neg_root_mean_squared_error', 'normalized_mutual_info_score', 'positive_likelihood_ratio', 'precision', 'precision_macro', 'precision_micro', 'precision_samples', 'precision_weighted', 'r2', 'rand_score', 'recall', 'recall_macro', 'recall_micro', 'recall_samples', 'recall_weighted', 'roc_auc', 'roc_auc_ovo'