
A bug in the code makes the cross-validation AUC abnormally high #1

Open
QingGo opened this issue Apr 30, 2019 · 1 comment

Comments

@QingGo

QingGo commented Apr 30, 2019

Following the author's code all the way to the end, the AUC reported at the cross-validation step is indeed quite high, reaching 0.79. However, when I split the training and validation sets myself and trained a model with the same parameters, the results were disappointing.

from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, roc_curve, auc, roc_auc_score

voting = VotingClassifier(estimators = estimators, voting='soft')
X_train_new, X_val, y_train_new, y_val = train_test_split(X_train,y_train,test_size=0.2,random_state=0)
voting.fit(X_train_new, y_train_new)
y_train_predit = voting.predict(X_train_new)
y_val_predit = voting.predict(X_val)

print(classification_report(y_train_new, y_train_predit))
print(roc_auc_score(y_train_new, y_train_predit))
print(classification_report(y_val, y_val_predit))
print(roc_auc_score(y_val, y_val_predit))

The output is as follows:

              precision    recall  f1-score   support

           0       0.94      1.00      0.97     21212
           1       1.00      0.00      0.00      1247

   micro avg       0.94      0.94      0.94     22459
   macro avg       0.97      0.50      0.49     22459
weighted avg       0.95      0.94      0.92     22459

0.5008019246190858
              precision    recall  f1-score   support

           0       0.94      1.00      0.97      5306
           1       0.00      0.00      0.00       309

   micro avg       0.94      0.94      0.94      5615
   macro avg       0.47      0.50      0.49      5615
weighted avg       0.89      0.94      0.92      5615

0.49962306822465136

The model's AUC is barely 0.5, and recall for the positive class is essentially 0. Since this is an imbalanced dataset with relatively few defaulters, the model is probably classifying every sample as 0: accuracy looks high, but such a model is meaningless.
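To make this concrete, here is a minimal sketch (with a hypothetical 95/5 class split, not the actual data) showing that a model which always predicts the majority class gets high accuracy but only a chance-level AUC:

```python
# Hypothetical illustration: on an imbalanced dataset, always predicting
# the majority class yields high accuracy but a chance-level (0.5) AUC.
y_true = [0] * 95 + [1] * 5   # 95% negatives, 5% positives (made-up split)
y_pred = [0] * 100            # a "model" that predicts 0 for everyone

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# AUC = P(score of a positive > score of a negative), ties counting 0.5.
# With a constant score every positive/negative pair is a tie, so AUC = 0.5.
pairs = [(sp, sn) for tp, sp in zip(y_true, y_pred) if tp == 1
                  for tn, sn in zip(y_true, y_pred) if tn == 0]
auc = sum(1.0 if sp > sn else 0.5 if sp == sn else 0.0
          for sp, sn in pairs) / len(pairs)

print(accuracy)  # 0.95
print(auc)       # 0.5
```

This is why accuracy alone is misleading here and AUC/recall are the metrics to watch.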

So where did things go wrong, and why was the AUC from the earlier cross-validation so high? Reading the code, I found a bug, right here:

cv = StratifiedKFold(n_splits=3, shuffle=True)

def estimate(estimator, name='estimator'):
    auc = cross_val_score(estimator, X_train, y_train, scoring='roc_auc', cv=cv).mean()
    accuracy = cross_val_score(estimator, X_train, y_train, scoring='accuracy', cv=cv).mean()
    recall = cross_val_score(estimator, X_train, y_train, scoring='recall', cv=cv).mean()

    print("{}: auc:{:f}, recall:{:f}, accuracy:{:f}".format(name, auc, recall, accuracy))

The author passes a StratifiedKFold instance as the cv parameter of cross_val_score. Reading the source shows that if you pass a number instead, cross_val_score also defaults to StratifiedKFold(cv) to split the dataset, but without shuffle=True. Besides, running three separate cross-validations just to compute three metrics doesn't make much sense either, so I tried rewriting the code as follows:

def estimate(estimator, name='estimator'):
    scoring = {'roc_auc': 'roc_auc',
               'accuracy': 'accuracy',
               'recall': 'recall'}
    
    scoring_result_dict = cross_validate(estimator, X_train, y_train, scoring=scoring, cv=3, return_estimator=True)
    auc = scoring_result_dict['test_roc_auc'].mean()
    accuracy = scoring_result_dict['test_accuracy'].mean()
    recall = scoring_result_dict['test_recall'].mean()
    print(scoring_result_dict)
    print("{}: auc:{:f}, recall:{:f}, accuracy:{:f}".format(name, auc, recall, accuracy))

With this version the AUC comes out at only about 0.5, consistent with the results above. I also tried passing cv = StratifiedKFold(n_splits=3, shuffle=True), and the AUC was likewise only about 0.5. My guess is that shuffle=True is what inflated the AUC, but I haven't found the exact cause yet.

The author put a lot of work into data cleaning and feature engineering, which is quite instructive, but the final model-tuning part feels a bit rough by comparison.

@thulorry

The problem should be your use of y_val_predit: when computing AUC you should use y_pred[:,1], i.e. the predicted score for class 1.
Try changing it to y_val_predit = voting.predict_proba(X_val)[:,1].
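A minimal sketch of why this matters, using a hand-rolled AUC and made-up probabilities (no sklearn, hypothetical numbers): AUC is a ranking metric, so it needs continuous scores (predict_proba[:, 1]), not the 0/1 labels that .predict() returns.

```python
# Sketch of the fix suggested above: feeding hard labels into an AUC
# computation throws away the model's ranking information.
y_true = [0, 0, 0, 0, 1, 1]
proba  = [0.10, 0.20, 0.30, 0.45, 0.40, 0.55]   # hypothetical P(y=1) scores
labels = [1 if p >= 0.5 else 0 for p in proba]  # what .predict() returns

def auc(y, s):
    # AUC = P(score of a positive > score of a negative), ties counting 0.5.
    pairs = [(sp, sn) for yp, sp in zip(y, s) if yp == 1
                      for yn, sn in zip(y, s) if yn == 0]
    return sum(1.0 if sp > sn else 0.5 if sp == sn else 0.0
               for sp, sn in pairs) / len(pairs)

print(auc(y_true, labels))  # 0.75  -- ranking collapsed to two groups
print(auc(y_true, proba))   # 0.875 -- full ranking preserved
```

With hard labels every sample inside a predicted class ties, so the AUC degrades; in the degenerate all-zeros case it collapses to exactly 0.5, matching the numbers reported in this issue.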
