ROC/AUC in Python

```python
import numpy as np
from sklearn import metrics

y = np.array([1, 1, 2, 2])                # true labels; class 2 is the positive class
scores = np.array([0.1, 0.4, 0.35, 0.8])  # classifier scores
fpr, tpr, thresholds = metrics.roc_curve(y, scores, pos_label=2)
print(metrics.auc(fpr, tpr))  # 0.75
```
Performance Metrics
To understand a model's generalization ability, we need some metric to quantify it; that is what performance metrics are for.
For a binary classifier, crossing every prediction against the actual label yields four possible outcomes:
TP: predicted 1 and correct, i.e. actually 1
FP: predicted 1 but wrong, i.e. actually 0
FN: predicted 0 but wrong, i.e. actually 1
TN: predicted 0 and correct, i.e. actually 0
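The four counts above can be tallied directly from the labels; a minimal sketch (the helper name `confusion_counts` and the toy labels are ours):

```python
import numpy as np

def confusion_counts(y_true, y_pred):
    """Tally TP, FP, FN, TN for binary labels (1 = positive, 0 = negative)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_pred == 1) & (y_true == 1)))  # predicted 1, actually 1
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))  # predicted 1, actually 0
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))  # predicted 0, actually 1
    tn = int(np.sum((y_pred == 0) & (y_true == 0)))  # predicted 0, actually 0
    return tp, fp, fn, tn

print(confusion_counts([1, 0, 1, 0, 1], [1, 1, 0, 0, 1]))  # (2, 1, 1, 1)
```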
Accuracy
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Class imbalance makes accuracy misleading.
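Why imbalance breaks accuracy can be seen with a toy example (the 99:1 split and the always-negative model are our assumptions):

```python
import numpy as np

# 990 negatives, 10 positives; a degenerate model that always predicts 0
y_true = np.array([0] * 990 + [1] * 10)
y_pred = np.zeros_like(y_true)

accuracy = np.mean(y_true == y_pred)
print(accuracy)  # 0.99, yet not a single positive is ever caught
```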
Precision
Precision = TP / (TP + FP)
Recall
Recall = TP / (TP + FN)
The higher the recall, the larger the fraction of actually bad users that gets caught: better to wrongly flag a thousand than to let one slip through.
F1 score = 2 * precision * recall / (precision + recall)
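These three metrics can be checked with scikit-learn (the toy labels are ours):

```python
from sklearn.metrics import f1_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1, 0, 1]

print(precision_score(y_true, y_pred))  # TP=3, FP=2 -> 0.6
print(recall_score(y_true, y_pred))     # TP=3, FN=1 -> 0.75
print(f1_score(y_true, y_pred))         # 2 * 0.6 * 0.75 / 1.35 = 0.666...
```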
ROC/AUC concepts
Sensitivity = TP / (TP + FN)
Specificity = TN / (FP + TN)
True positive rate (TPR) = sensitivity = TP / (TP + FN)
False positive rate (FPR) = 1 - specificity = FP / (FP + TN)
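Both rates follow directly from the confusion counts; a minimal sketch (the helper name `tpr_fpr` and the toy labels are ours):

```python
import numpy as np

def tpr_fpr(y_true, y_pred):
    """TPR = TP/(TP+FN), FPR = FP/(FP+TN) for binary labels (1 = positive)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    return tp / (tp + fn), fp / (fp + tn)

print(tpr_fpr([1, 1, 0, 0], [1, 0, 1, 0]))  # (0.5, 0.5)
```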
ROC curve
Its axes are the true positive rate and the false positive rate.
The higher the TPR and the lower the FPR (i.e. the steeper the ROC curve), the better the model performs.
Because TPR is computed only over the positives and FPR only over the negatives, the ROC curve is insensitive to class imbalance.
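That insensitivity can be checked empirically (a toy sketch; the scores and the 10x duplication of the negatives are our construction):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

y = np.array([1, 1, 0, 0])
scores = np.array([0.8, 0.3, 0.35, 0.1])

# Duplicate every negative 10x: the class balance changes, the AUC does not.
y_imb = np.concatenate([y, np.tile(y[y == 0], 9)])
s_imb = np.concatenate([scores, np.tile(scores[y == 0], 9)])

print(roc_auc_score(y, scores))     # 0.75
print(roc_auc_score(y_imb, s_imb))  # 0.75
```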
AUC: Area Under the (ROC) Curve
How to judge an AUC value
An AUC of 0.5 corresponds to random guessing and an AUC of 1.0 to a perfect ranking; the closer the AUC is to 1, the better the model.