Install PyCaret

pip install pycaret

# Pycaret Classification Tutorial for Intermediate

## Table of Contents
### 1. `Library` : Import the packages we will use.
### 2. `Load Dataset` : Load the dataset prepared in advance.
### 3. `Setup Environment` : Set up the environment PyCaret needs.
### 4. `Compare Models` : Compare a range of models to find the ones that suit the data.
### 5. `Create & Tune Model` : Create the models and tune them.
### 6. `Ensemble Model` : Try several ensembles: Bagging, Boosting, Voting, and Stacking.
### 7. `Predict` : Predict on the validation set to pick out the good models.
### 8. `Save Best Models` : Save the best-performing models. (Loading is omitted.)
### 9. `Finalize Best Models` : Retrain with train_size=1, i.e. on all of the data.
### 10. `Predict & Submit` : Predict the test data and submit.

## `1. Library`

In [1]:
import pandas as pd
from pycaret.classification import *
from time import time
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import LabelEncoder

data_path = '../data/'

## `2. Load Dataset`
- In this Intermediate tutorial we start from the raw data and let setup handle the data processing.

In [2]:
train = pd.read_csv(data_path+'final_train.csv')
test = pd.read_csv(data_path+'final_test.csv')

display(train.head())
display(test.head())

Unnamed: 0,index,QaA,QaE,QbA,QbE,QcA,QcE,QdA,QdE,QeA,...,wr_12_0,wr_12_1,wr_13_0,wr_13_1,wf_01_0,wf_01_1,wf_02_0,wf_02_1,wf_03_0,wf_03_1
0,0,0.563033,1.931109,0.557456,2.106971,0.574759,2.067591,0.549449,2.070962,0.538197,...,0.0,1.0,0.0,1.0,1.0,0.0,1.0,0.0,1.0,0.0
1,1,0.500437,2.011416,0.627831,2.101794,0.570634,2.211346,0.544993,2.196815,0.537259,...,0.0,1.0,0.0,1.0,1.0,0.0,1.0,0.0,1.0,0.0
2,2,0.494825,2.127356,0.487898,2.116312,0.499742,2.070592,0.561308,2.210925,0.538739,...,0.0,1.0,0.0,1.0,1.0,0.0,1.0,0.0,0.0,1.0
3,3,0.563033,1.977486,0.561204,2.168581,0.530138,2.066955,0.548998,2.206644,0.537259,...,0.0,1.0,0.0,1.0,1.0,0.0,1.0,0.0,1.0,0.0
4,4,0.564892,2.05835,0.487898,2.023194,0.574759,1.990961,0.537439,2.07554,0.537259,...,0.0,1.0,0.0,1.0,1.0,0.0,0.0,1.0,1.0,0.0


Unnamed: 0,index,QaA,QaE,QbA,QbE,QcA,QcE,QdA,QdE,QeA,...,wr_12_0,wr_12_1,wr_13_0,wr_13_1,wf_01_0,wf_01_1,wf_02_0,wf_02_1,wf_03_0,wf_03_1
0,0,0.563033,736,0.522095,2941,0.570634,4621,0.549449,4857,0.538197,...,0.0,1.0,0.0,1.0,1.0,0.0,1.0,0.0,1.0,0.0
1,1,0.563033,514,0.522095,1952,0.570634,1552,0.548998,821,0.534566,...,1.0,0.0,1.0,0.0,1.0,0.0,1.0,0.0,1.0,0.0
2,2,0.563033,500,0.522095,2507,0.530138,480,0.537439,614,0.538197,...,0.0,1.0,0.0,1.0,1.0,0.0,1.0,0.0,1.0,0.0
3,3,0.564892,669,0.487898,1050,0.574759,1435,0.537439,2252,0.538739,...,0.0,1.0,0.0,1.0,1.0,0.0,1.0,0.0,1.0,0.0
4,4,0.540892,499,0.487898,1243,0.574759,845,0.537439,1666,0.538197,...,0.0,1.0,0.0,1.0,1.0,0.0,1.0,0.0,1.0,0.0


## `3. Getting Started with PyCaret : Setup Environment`
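- session_id : Plays the same role as random_state; it fixes the seed so that results are reproducible.
- normalize : When True, rescales the numeric features. The default method is z-score (zero mean, unit variance).
- transformation : When True, transforms the features so that their distribution is closer to Gaussian.
- transformation_method : Only takes effect when transformation is True. The default is 'yeo-johnson'; it can be changed to 'quantile'.
- ignore_low_variance : When True, removes categorical features whose variance is too low to be informative (for example, nearly every row shares one level).
- combine_rare_levels : When True, merges the rarely occurring levels of a categorical feature into a single level.
- rare_level_threshold : Default 0.1. Levels whose share of the data falls below this threshold are the ones merged.
- remove_multicollinearity : When True, for each pair of multicollinear features, drops the one that is less correlated with the target.
- multicollinearity_threshold : The correlation above which two features are treated as multicollinear. The default is 0.9; this notebook passes 0.90 explicitly.
- group_features : Lets you group features that share related characteristics.
- fix_imbalance : When True, applies SMOTE to the training data.
- log_experiment : When True, the run is logged to MLflow.
- experiment_name : The name under which the run is logged in MLflow.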
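
The `remove_multicollinearity` rule described above can be sketched in plain pandas. This is an illustration of the idea only, not PyCaret's internal code: the helper `drop_collinear` and the toy columns `x1`, `x2`, `x3` are hypothetical names introduced here.

```python
import numpy as np
import pandas as pd


def drop_collinear(df: pd.DataFrame, target: str, threshold: float = 0.90) -> list:
    """Return the features to drop: from each highly correlated pair,
    the one that is less correlated with the target."""
    features = [c for c in df.columns if c != target]
    corr = df[features].corr().abs()                     # feature-to-feature correlation
    target_corr = df[features].corrwith(df[target]).abs()

    to_drop = set()
    for i, a in enumerate(features):
        for b in features[i + 1:]:
            if a in to_drop or b in to_drop:
                continue
            if corr.loc[a, b] > threshold:
                # Keep the feature that tracks the target more closely.
                to_drop.add(a if target_corr[a] < target_corr[b] else b)
    return sorted(to_drop)


# Hypothetical toy data: x2 is a noisy copy of x1, x3 is independent.
rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
x2 = x1 + rng.normal(scale=0.05, size=500)
x3 = rng.normal(size=500)
voted = (x1 + 0.5 * x3 > 0).astype(int)
toy = pd.DataFrame({"x1": x1, "x2": x2, "x3": x3, "voted": voted})

dropped = drop_collinear(toy, target="voted", threshold=0.90)
print(dropped)  # exactly one of x1 / x2; x3 is kept
```

Lowering the threshold (0.90 here, as in the setup call of this notebook) removes more features, so it is worth checking how many columns survive before comparing models.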

In [4]:
clf = setup(data = train, target='voted',
            session_id = 20210302,
#             categorical_features = cat_columns,
#             normalize = True,
#             transformation = True, transformation_method='yeo-johnson',
            ignore_low_variance = True,
            combine_rare_levels = True, rare_level_threshold = 0.1,
            remove_multicollinearity = True, multicollinearity_threshold = 0.90,
#             fix_imbalance = True,
#             log_experiment=True, experiment_name='Pycaret_Intermediate_Jayhong',
            silent = True
           )

Setup Succesfully Completed!


Unnamed: 0,Description,Value
0,session_id,20210302
1,Target Type,Binary
2,Label Encoded,"0: 0, 1: 1"
3,Original Data,"(45532, 94)"
4,Missing Values,False
5,Numeric Features,58
6,Categorical Features,35
7,Ordinal Features,False
8,High Cardinality Features,False
9,High Cardinality Method,


## `4. Compare Models`

In [5]:
top5_models = compare_models(n_select=5, fold=5,  sort = 'AUC')

Unnamed: 0,Model,Accuracy,AUC,Recall,Prec.,F1,Kappa,MCC,TT (Sec)
0,CatBoost Classifier,0.6972,0.7664,0.665,0.7525,0.706,0.3963,0.3995,28.7816
1,Gradient Boosting Classifier,0.6957,0.7662,0.6566,0.755,0.7024,0.3942,0.3983,18.7764
2,Light Gradient Boosting Machine,0.6951,0.7642,0.6513,0.7572,0.7002,0.3936,0.3982,0.8677
3,Linear Discriminant Analysis,0.6917,0.763,0.6699,0.7415,0.7038,0.3842,0.3863,0.4433
4,Logistic Regression,0.6879,0.761,0.6856,0.7279,0.7061,0.374,0.3748,0.665
5,Extra Trees Classifier,0.6928,0.761,0.6562,0.7507,0.7002,0.3881,0.3919,3.1403
6,Ada Boost Classifier,0.6924,0.7575,0.6642,0.7456,0.7025,0.3863,0.3891,4.1167
7,Extreme Gradient Boosting,0.6755,0.7426,0.6685,0.7185,0.6926,0.35,0.351,4.9026
8,Naive Bayes,0.6703,0.7253,0.6583,0.716,0.6859,0.3402,0.3416,0.0639
9,Quadratic Discriminant Analysis,0.6641,0.7151,0.6262,0.7238,0.6703,0.3314,0.336,0.1392
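
Because the table is sorted by AUC, it helps to recall what that number measures: the chance that a randomly chosen positive example receives a higher score than a randomly chosen negative one. A minimal pure-Python sketch of that definition follows; the labels and scores are made up for illustration, and the function `roc_auc` is a hypothetical helper, not a PyCaret call.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.
    A tie between a positive and a negative score counts as half a win."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted probabilities of the positive class.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.5, 0.1]

auc = roc_auc(y_true, scores)
print(round(auc, 4))
```

Unlike Accuracy, this value does not depend on a decision threshold, which is one reason to rank models by it when the final submission is a probability.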


## `5. Create & Tune Model`

In [6]:
tuned_models = []

In [7]:
model_catboost = create_model('catboost', fold = 5)
model_catboost = tune_model(model_catboost, fold=5, optimize = 'AUC', choose_better = True)
tuned_models.append(model_catboost)

Unnamed: 0,Accuracy,AUC,Recall,Prec.,F1,Kappa,MCC
0,0.7029,0.7723,0.6718,0.7574,0.7121,0.4075,0.4106
1,0.69,0.7616,0.6543,0.7474,0.6978,0.3825,0.3861
2,0.6972,0.774,0.65,0.7614,0.7013,0.3982,0.4033
3,0.6989,0.7726,0.6597,0.7582,0.7055,0.4006,0.4047
4,0.6971,0.7712,0.6597,0.7553,0.7042,0.3966,0.4005
Mean,0.6972,0.7704,0.6591,0.7559,0.7042,0.3971,0.401
SD,0.0042,0.0045,0.0073,0.0047,0.0048,0.0082,0.0082
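
The Mean and SD rows summarise the five folds. As a quick check, the sketch below recomputes them for the AUC column of the table above, assuming the SD is a population standard deviation (divide by n), which is what the reported value is consistent with. Since the per-fold values are already rounded to four decimals, the recomputed mean can differ from the reported 0.7704 in the last digit.

```python
from statistics import fmean, pstdev

# AUC column of the table above, folds 0 to 4.
fold_auc = [0.7723, 0.7616, 0.7740, 0.7726, 0.7712]

mean_auc = fmean(fold_auc)
sd_auc = pstdev(fold_auc)  # population SD: divides by n, not n - 1

print(round(mean_auc, 4), round(sd_auc, 4))
```

A small SD relative to the gap between two models is what makes a difference in mean AUC worth acting on; here most of the candidates sit within one SD of each other.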


In [8]:
model_gbc = create_model('gbc', fold = 5)
model_gbc = tune_model(model_gbc, fold=5, optimize = 'AUC', choose_better = True)
tuned_models.append(model_gbc)

Unnamed: 0,Accuracy,AUC,Recall,Prec.,F1,Kappa,MCC
0,0.6831,0.7515,0.6761,0.7257,0.7,0.3651,0.3662
1,0.6731,0.7437,0.6641,0.7172,0.6896,0.3454,0.3466
2,0.6884,0.7561,0.6658,0.7387,0.7004,0.3777,0.3799
3,0.6892,0.7558,0.6801,0.7324,0.7053,0.3775,0.3787
4,0.6821,0.7456,0.6643,0.7301,0.6956,0.3646,0.3664
Mean,0.6832,0.7505,0.6701,0.7288,0.6982,0.3661,0.3675
SD,0.0058,0.0051,0.0067,0.0072,0.0053,0.0118,0.012
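
With `choose_better = True`, `tune_model` hands back whichever of the untuned and tuned models scores better on the `optimize` metric. The sketch below mimics that comparison for the Gradient Boosting Classifier. It assumes that the table above is the `tune_model` output and takes the untuned mean AUC from the `compare_models` table; `pick_better` is a hypothetical helper, not part of PyCaret.

```python
def pick_better(candidates, higher_is_better=True):
    """Return the (name, score) pair with the best cross-validated score."""
    chooser = max if higher_is_better else min
    return chooser(candidates.items(), key=lambda item: item[1])


# Mean 5-fold AUC of the two candidates (see the assumptions in the text above).
gbc_scores = {"gbc_untuned": 0.7662, "gbc_tuned": 0.7505}

best_name, best_auc = pick_better(gbc_scores)
print(best_name, best_auc)
```

Under these assumptions the untuned model would be the one appended to `tuned_models`, so a lower score in a tuning table does not necessarily mean a worse model is carried forward.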


In [9]:
model_lightgbm = create_model('lightgbm', fold = 5)
model_lightgbm = tune_model(model_lightgbm, fold=5, optimize = 'AUC', choose_better = True)
tuned_models.append(model_lightgbm)

Unnamed: 0,Accuracy,AUC,Recall,Prec.,F1,Kappa,MCC
0,0.6995,0.7661,0.6621,0.7577,0.7067,0.4014,0.4052
1,0.6838,0.755,0.6322,0.7502,0.6862,0.3721,0.3777
2,0.6914,0.7673,0.6397,0.7582,0.6939,0.3872,0.3929
3,0.6992,0.766,0.6562,0.7608,0.7047,0.4017,0.4062
4,0.6971,0.7649,0.6542,0.7585,0.7025,0.3973,0.4018
Mean,0.6942,0.7639,0.6489,0.7571,0.6988,0.3919,0.3968
SD,0.006,0.0045,0.0111,0.0036,0.0077,0.0112,0.0106


In [10]:
model_lda = create_model('lda', fold = 5)
model_lda = tune_model(model_lda, fold=5, optimize = 'AUC', choose_better = True)
tuned_models.append(model_lda)

Unnamed: 0,Accuracy,AUC,Recall,Prec.,F1,Kappa,MCC
0,0.6944,0.7636,0.6816,0.7393,0.7093,0.3884,0.3898
1,0.6839,0.7536,0.6652,0.7322,0.6971,0.3682,0.3701
2,0.69,0.7683,0.6566,0.7461,0.6985,0.3821,0.3855
3,0.6933,0.7658,0.6677,0.7449,0.7042,0.3877,0.3902
4,0.6939,0.7641,0.6714,0.7438,0.7058,0.3886,0.3908
Mean,0.6911,0.7631,0.6685,0.7413,0.703,0.383,0.3853
SD,0.0039,0.005,0.0082,0.0051,0.0045,0.0078,0.0078


In [11]:
model_lr = create_model('lr', fold = 5)
model_lr = tune_model(model_lr, fold=5, optimize = 'AUC', choose_better = True)
tuned_models.append(model_lr)

Unnamed: 0,Accuracy,AUC,Recall,Prec.,F1,Kappa,MCC
0,0.6907,0.7627,0.6386,0.7577,0.693,0.3858,0.3916
1,0.6836,0.752,0.6308,0.7508,0.6856,0.372,0.3777
2,0.6865,0.7664,0.615,0.7657,0.6822,0.3801,0.3891
3,0.6939,0.7623,0.6319,0.7672,0.693,0.3935,0.4009
4,0.6939,0.7624,0.6419,0.7609,0.6963,0.3922,0.398
Mean,0.6897,0.7612,0.6316,0.7605,0.69,0.3847,0.3914
SD,0.0041,0.0048,0.0093,0.0059,0.0053,0.008,0.0081


## `6. Ensemble Model`

In [12]:
prediction_models = []

### Bagging

In [13]:
bag_catboost_10 = ensemble_model(model_catboost, n_estimators = 10, fold=5, optimize = 'AUC')
prediction_models.append(bag_catboost_10)

Unnamed: 0,Accuracy,AUC,Recall,Prec.,F1,Kappa,MCC
0,0.6995,0.7705,0.6727,0.7516,0.71,0.4001,0.4027
1,0.6875,0.7602,0.6538,0.7438,0.6959,0.3773,0.3806
2,0.695,0.7731,0.6558,0.7545,0.7017,0.3928,0.3969
3,0.7005,0.7715,0.6677,0.756,0.7091,0.4029,0.4062
4,0.6977,0.7696,0.6648,0.7533,0.7063,0.3973,0.4006
Mean,0.696,0.769,0.663,0.7518,0.7046,0.3941,0.3974
SD,0.0046,0.0045,0.0072,0.0043,0.0052,0.009,0.0089


In [14]:
bag_catboost_50 = ensemble_model(model_catboost, n_estimators = 50, fold=5, optimize = 'AUC')
prediction_models.append(bag_catboost_50)

Unnamed: 0,Accuracy,AUC,Recall,Prec.,F1,Kappa,MCC
0,0.6998,0.7705,0.6733,0.7518,0.7104,0.4007,0.4033
1,0.6891,0.7607,0.6575,0.7442,0.6981,0.3802,0.3833
2,0.6988,0.7736,0.6558,0.7605,0.7043,0.4007,0.4053
3,0.6994,0.7723,0.664,0.7565,0.7072,0.4011,0.4047
4,0.6981,0.7704,0.6646,0.7542,0.7065,0.3983,0.4017
Mean,0.697,0.7695,0.663,0.7534,0.7053,0.3962,0.3997
SD,0.004,0.0046,0.0062,0.0054,0.0041,0.0081,0.0083
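
The objects returned here are scikit-learn `BaggingClassifier` instances, as the printout of `prediction_models` later in this notebook shows. The same idea can be tried outside PyCaret with a minimal sketch: fit several copies of a base model on bootstrap samples and average their predictions. The synthetic data and the shallow decision tree below are stand-ins chosen for speed, not the tutorial's data or its CatBoost model.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the tutorial data.
X, y = make_classification(n_samples=600, n_features=10, random_state=0)

base = DecisionTreeClassifier(max_depth=3, random_state=0)
# The base estimator is passed positionally because its keyword name
# differs between scikit-learn versions.
bagged = BaggingClassifier(base, n_estimators=10, random_state=0)

# 5-fold cross-validated AUC, mirroring fold=5 and optimize='AUC' above.
auc_folds = cross_val_score(bagged, X, y, cv=5, scoring="roc_auc")
print(auc_folds.mean())
```

Bagging mainly reduces variance, so it tends to help unstable base learners more than an already well-regularised booster, which may be why going from 10 to 50 estimators changes the tables above so little.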


### Boosting

In [15]:
boo_catboost = ensemble_model(model_catboost, method = 'Boosting', fold=5, optimize='AUC')
prediction_models.append(boo_catboost)

Unnamed: 0,Accuracy,AUC,Recall,Prec.,F1,Kappa,MCC
0,0.6979,0.7679,0.6781,0.7462,0.7106,0.3961,0.3981
1,0.6874,0.7587,0.6604,0.74,0.6979,0.3761,0.3788
2,0.6914,0.7663,0.6609,0.7459,0.7008,0.3846,0.3876
3,0.6956,0.766,0.6758,0.7441,0.7083,0.3917,0.3937
4,0.6952,0.7662,0.6717,0.7455,0.7067,0.3912,0.3935
Mean,0.6935,0.765,0.6694,0.7443,0.7049,0.3879,0.3903
SD,0.0037,0.0032,0.0074,0.0023,0.0047,0.0069,0.0067
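
With `method = 'Boosting'` the returned object is a scikit-learn `AdaBoostClassifier` with 10 estimators, according to the printout of `prediction_models` later in this notebook. A minimal stand-alone sketch of the same wrapper follows; it uses synthetic data and a decision stump as the base learner, both of which are assumptions made to keep the example small.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the tutorial data.
X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Each round reweights the rows the previous learners got wrong.
stump = DecisionTreeClassifier(max_depth=1, random_state=0)
boosted = AdaBoostClassifier(stump, n_estimators=10, random_state=0)
boosted.fit(X_train, y_train)

test_accuracy = boosted.score(X_test, y_test)
print(f"hold-out accuracy: {test_accuracy:.3f}")
```

AdaBoost is designed around weak learners, so wrapping a strong model such as CatBoost in it is not guaranteed to help; the fold table above is the place to check.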


### Blending
#### blend_models(estimator_list: list, fold: Optional[Union[int, Any]] = None, round: int = 4, choose_better: bool = False, optimize: str = 'Accuracy', method: str = 'auto', weights: Optional[List[float]] = None, fit_kwargs: Optional[dict] = None, groups: Optional[Union[str, Any]] = None, verbose: bool = True)

In [16]:
blend_3_soft = blend_models(estimator_list=[model_catboost, model_lightgbm, model_lda], method='soft', fold=5, optimize='AUC')
prediction_models.append(blend_3_soft)

Unnamed: 0,Accuracy,AUC,Recall,Prec.,F1,Kappa,MCC
0,0.699,0.7697,0.6718,0.7514,0.7094,0.3992,0.4019
1,0.6913,0.7594,0.6592,0.7466,0.7002,0.3846,0.3877
2,0.698,0.7741,0.6569,0.7585,0.7041,0.399,0.4033
3,0.7025,0.7725,0.6697,0.758,0.7112,0.407,0.4103
4,0.698,0.769,0.6666,0.7528,0.707,0.3978,0.4009
Mean,0.6978,0.7689,0.6649,0.7535,0.7064,0.3975,0.4008
SD,0.0036,0.0051,0.0058,0.0044,0.0039,0.0072,0.0073
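
`method='soft'` means the blended prediction is built from the models' predicted probabilities rather than from their class votes. A minimal NumPy sketch of the difference follows; the probabilities are invented for illustration and the 0.5 cut-off is an assumption.

```python
import numpy as np

# Hypothetical P(voted = 1) from three fitted models for four rows.
proba = np.array([
    [0.97, 0.33, 0.41, 0.65],   # model A
    [0.98, 0.35, 0.38, 0.74],   # model B
    [0.94, 0.33, 0.41, 0.51],   # model C
])

# Soft voting: average the probabilities, then threshold.
soft_score = proba.mean(axis=0)
soft_label = (soft_score >= 0.5).astype(int)

# Weighted soft voting, as the `weights` argument would allow.
weighted_score = np.average(proba, axis=0, weights=[2, 1, 1])

# Hard voting: each model casts one vote and the majority wins.
hard_votes = (proba >= 0.5).astype(int)
hard_label = (hard_votes.sum(axis=0) > proba.shape[0] / 2).astype(int)

print(soft_score.round(4), soft_label, hard_label)
```

Soft voting keeps a continuous score, which is what an AUC-based comparison needs; hard voting only yields labels.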


In [17]:
blend_4_soft = blend_models(estimator_list=[model_catboost, model_lightgbm, model_lda, model_lr], method='soft', fold=5, optimize='AUC')
prediction_models.append(blend_4_soft)

Unnamed: 0,Accuracy,AUC,Recall,Prec.,F1,Kappa,MCC
0,0.6955,0.7688,0.6612,0.752,0.7037,0.3932,0.3967
1,0.6902,0.7585,0.6529,0.7484,0.6974,0.383,0.3868
2,0.6966,0.7731,0.648,0.7616,0.7002,0.3971,0.4024
3,0.7014,0.771,0.6611,0.7614,0.7077,0.4057,0.4099
4,0.698,0.7684,0.6617,0.7556,0.7055,0.3984,0.4021
Mean,0.6963,0.768,0.657,0.7558,0.7029,0.3955,0.3996
SD,0.0037,0.005,0.0056,0.0052,0.0037,0.0074,0.0077


In [18]:
blend_5_soft = blend_models(estimator_list=tuned_models, method='soft', fold=5, optimize='AUC')
prediction_models.append(blend_5_soft)

Unnamed: 0,Accuracy,AUC,Recall,Prec.,F1,Kappa,MCC
0,0.6977,0.7691,0.6647,0.7535,0.7063,0.3974,0.4008
1,0.69,0.7587,0.6532,0.748,0.6974,0.3827,0.3864
2,0.6969,0.7733,0.6495,0.7613,0.7009,0.3976,0.4027
3,0.7011,0.7712,0.6623,0.7602,0.7079,0.4049,0.4089
4,0.6977,0.7684,0.6608,0.7556,0.705,0.3978,0.4016
Mean,0.6967,0.7681,0.6581,0.7557,0.7035,0.3961,0.4001
SD,0.0036,0.005,0.0058,0.0048,0.0038,0.0073,0.0074


### Stacking
#### stack_models(estimator_list: list, meta_model=None, fold: Optional[Union[int, Any]] = None, round: int = 4, method: str = 'auto', restack: bool = True, choose_better: bool = False, optimize: str = 'Accuracy', fit_kwargs: Optional[dict] = None, groups: Optional[Union[str, Any]] = None, verbose: bool = True) → Any

In [20]:
stack_3_soft = stack_models(estimator_list=[model_catboost, model_lightgbm, model_lda],
                            meta_model=model_lr,
                           fold = 5,
                           optimize = 'AUC',
                           choose_better= True)
prediction_models.append(stack_3_soft)

Unnamed: 0,Accuracy,AUC,Recall,Prec.,F1,Kappa,MCC
0,0.7043,0.7725,0.6188,0.7951,0.6959,0.4168,0.4292
1,0.6915,0.7589,0.5975,0.7869,0.6793,0.3925,0.4064
2,0.6953,0.7715,0.5861,0.8037,0.6778,0.402,0.4201
3,0.7007,0.7698,0.6123,0.793,0.6911,0.4099,0.4229
4,0.6989,0.7709,0.6103,0.7913,0.6891,0.4066,0.4195
Mean,0.6981,0.7687,0.605,0.794,0.6866,0.4056,0.4196
SD,0.0044,0.005,0.0117,0.0055,0.007,0.0081,0.0074
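
Stacking differs from blending in that a meta-model learns how to combine the base models, using their out-of-fold predictions as its inputs. The sketch below shows the pattern with scikit-learn's `StackingClassifier` as a stand-in; it is not a claim about what `stack_models` does internally, and the data and base models are placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the tutorial data.
X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

stack = StackingClassifier(
    estimators=[
        ("lda", LinearDiscriminantAnalysis()),
        ("nb", GaussianNB()),
        ("tree", DecisionTreeClassifier(max_depth=3, random_state=0)),
    ],
    final_estimator=LogisticRegression(),  # the meta-model
    cv=5,  # base-model predictions fed to the meta-model are out-of-fold
)
stack.fit(X_train, y_train)

test_proba = stack.predict_proba(X_test)[:, 1]
print(test_proba[:5].round(4))
```

The out-of-fold step matters: training the meta-model on in-sample base predictions would let it reward whichever base model overfits the most.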


In [21]:
stack_3_best = stack_models(estimator_list=[model_lightgbm, model_lda,model_lr],
                            meta_model=model_catboost,
                           fold = 5,
                           optimize = 'AUC',
                           choose_better= True)
prediction_models.append(stack_3_best)

Unnamed: 0,Accuracy,AUC,Recall,Prec.,F1,Kappa,MCC
0,0.7001,0.7701,0.6408,0.7719,0.7003,0.4053,0.4123
1,0.6872,0.7615,0.6236,0.7612,0.6856,0.3805,0.388
2,0.7018,0.7746,0.6279,0.7837,0.6972,0.4104,0.4202
3,0.7051,0.7733,0.6419,0.7797,0.7041,0.4156,0.4234
4,0.6961,0.7693,0.6293,0.7727,0.6937,0.3984,0.4067
Mean,0.698,0.7697,0.6327,0.7739,0.6962,0.402,0.4101
SD,0.0061,0.0046,0.0073,0.0077,0.0063,0.0122,0.0125


## `7. Predict`

In [24]:
prediction_models

[BaggingClassifier(base_estimator=<catboost.core.CatBoostClassifier object at 0x000001E602791EE0>,
                   bootstrap=True, bootstrap_features=False, max_features=1.0,
                   max_samples=1.0, n_estimators=10, n_jobs=-1, oob_score=False,
                   random_state=20210302, verbose=0, warm_start=False),
 BaggingClassifier(base_estimator=<catboost.core.CatBoostClassifier object at 0x000001E602791EE0>,
                   bootstrap=True, bootstrap_features=False, max_features=1.0,
                   max_samples=1.0, n_estimators=50, n_jobs=-1, oob_score=False,
                   random_state=20210302, verbose=0, warm_start=False),
 AdaBoostClassifier(algorithm='SAMME.R',
                    base_estimator=<catboost.core.CatBoostClassifier object at 0x000001E602791EE0>,
                    learning_rate=1.0, n_estimators=10, random_state=20210302),
 VotingClassifier(estimators=[('Cat Boost Classifier E E E_0',
                               <catboost.core.CatBoost

In [22]:
for model in prediction_models:
    print(model.__class__.__name__)
    display(predict_model(model))

BaggingClassifier


Unnamed: 0,Model,Accuracy,AUC,Recall,Prec.,F1,Kappa,MCC
0,Bagging Classifier,0.6891,0.7628,0.6572,0.7444,0.698,0.3802,0.3833


Unnamed: 0,QaA,QaE,QbA,QcA,QcE,QdA,QdE,QeA,QeE,QfA,...,wr_04_0_1.0,wr_06_0_1.0,wr_09_0_0.0,wr_11_1_0.0,wr_13_0_0.0,wf_03_0_1.0,wf_03_1_0.0,voted,Label,Score
0,0.564892,1.984928,0.627831,0.574759,2.020359,0.549449,1.977486,0.537259,2.048587,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.9734
1,0.540892,1.999249,0.561204,0.574759,2.126176,0.549449,2.081516,0.538197,2.080165,0.523605,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,0,0.3309
2,0.564892,2.061202,0.487898,0.530138,2.191196,0.549449,2.163461,0.537259,2.226536,0.515232,...,1.0,1.0,0.0,1.0,0.0,1.0,1.0,1,0,0.4156
3,0.540892,1.986719,0.561204,0.530138,2.073033,0.537439,2.167986,0.538197,2.130750,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.6528
4,0.564892,1.996939,0.522095,0.574759,2.036587,0.549449,2.087927,0.596230,2.318136,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.7369
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
13655,0.563033,2.002854,0.487898,0.530138,2.068097,0.548998,2.095770,0.538197,2.091561,0.522300,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,0,0,0.4981
13656,0.564892,1.941796,0.522095,0.530138,1.967494,0.549449,2.081067,0.537259,2.125878,0.523605,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.8638
13657,0.494825,1.915445,0.557456,0.499742,2.017649,0.549449,2.066700,0.537259,2.048587,0.585859,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,0,0,0.4329
13658,0.564892,1.939581,0.487898,0.499742,2.068851,0.537439,2.306922,0.537259,2.086973,0.523605,...,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0,1,0.5012
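
In this output `Score` behaves like the predicted probability of `voted = 1`, and `Label` like that score cut at 0.5. That reading is inferred from the rows shown, not from the library documentation. The sketch below reproduces the `Label` column for the first five rows of the table above under that assumption.

```python
# First five hold-out rows of the Bagging Classifier output above.
voted = [1, 1, 1, 1, 1]
score = [0.9734, 0.3309, 0.4156, 0.6528, 0.7369]

threshold = 0.5  # assumed cut-off; it reproduces the Label column for these rows
label = [int(s >= threshold) for s in score]

# Share of these five rows where the predicted label matches the truth.
accuracy = sum(l == v for l, v in zip(label, voted)) / len(voted)
print(label, accuracy)
```

Rows such as 13658, with a score within a few thousandths of 0.5, show why the label can flip between models while the AUC barely moves.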


BaggingClassifier


Unnamed: 0,Model,Accuracy,AUC,Recall,Prec.,F1,Kappa,MCC
0,Bagging Classifier,0.6891,0.7629,0.6573,0.7443,0.6981,0.3802,0.3833


Unnamed: 0,QaA,QaE,QbA,QcA,QcE,QdA,QdE,QeA,QeE,QfA,...,wr_04_0_1.0,wr_06_0_1.0,wr_09_0_0.0,wr_11_1_0.0,wr_13_0_0.0,wf_03_0_1.0,wf_03_1_0.0,voted,Label,Score
0,0.564892,1.984928,0.627831,0.574759,2.020359,0.549449,1.977486,0.537259,2.048587,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.9735
1,0.540892,1.999249,0.561204,0.574759,2.126176,0.549449,2.081516,0.538197,2.080165,0.523605,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,0,0.3223
2,0.564892,2.061202,0.487898,0.530138,2.191196,0.549449,2.163461,0.537259,2.226536,0.515232,...,1.0,1.0,0.0,1.0,0.0,1.0,1.0,1,0,0.4045
3,0.540892,1.986719,0.561204,0.530138,2.073033,0.537439,2.167986,0.538197,2.130750,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.6378
4,0.564892,1.996939,0.522095,0.574759,2.036587,0.549449,2.087927,0.596230,2.318136,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.7352
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
13655,0.563033,2.002854,0.487898,0.530138,2.068097,0.548998,2.095770,0.538197,2.091561,0.522300,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,0,1,0.5044
13656,0.564892,1.941796,0.522095,0.530138,1.967494,0.549449,2.081067,0.537259,2.125878,0.523605,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.8632
13657,0.494825,1.915445,0.557456,0.499742,2.017649,0.549449,2.066700,0.537259,2.048587,0.585859,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,0,0,0.4575
13658,0.564892,1.939581,0.487898,0.499742,2.068851,0.537439,2.306922,0.537259,2.086973,0.523605,...,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0,1,0.5017


AdaBoostClassifier


Unnamed: 0,Model,Accuracy,AUC,Recall,Prec.,F1,Kappa,MCC
0,Ada Boost Classifier,0.6869,0.7599,0.6637,0.7375,0.6987,0.3747,0.377


Unnamed: 0,QaA,QaE,QbA,QcA,QcE,QdA,QdE,QeA,QeE,QfA,...,wr_04_0_1.0,wr_06_0_1.0,wr_09_0_0.0,wr_11_1_0.0,wr_13_0_0.0,wf_03_0_1.0,wf_03_1_0.0,voted,Label,Score
0,0.564892,1.984928,0.627831,0.574759,2.020359,0.549449,1.977486,0.537259,2.048587,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.7021
1,0.540892,1.999249,0.561204,0.574759,2.126176,0.549449,2.081516,0.538197,2.080165,0.523605,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,0,0.4774
2,0.564892,2.061202,0.487898,0.530138,2.191196,0.549449,2.163461,0.537259,2.226536,0.515232,...,1.0,1.0,0.0,1.0,0.0,1.0,1.0,1,0,0.4874
3,0.540892,1.986719,0.561204,0.530138,2.073033,0.537439,2.167986,0.538197,2.130750,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.5393
4,0.564892,1.996939,0.522095,0.574759,2.036587,0.549449,2.087927,0.596230,2.318136,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.5275
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
13655,0.563033,2.002854,0.487898,0.530138,2.068097,0.548998,2.095770,0.538197,2.091561,0.522300,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,0,0,0.4920
13656,0.564892,1.941796,0.522095,0.530138,1.967494,0.549449,2.081067,0.537259,2.125878,0.523605,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.5548
13657,0.494825,1.915445,0.557456,0.499742,2.017649,0.549449,2.066700,0.537259,2.048587,0.585859,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,0,0,0.4927
13658,0.564892,1.939581,0.487898,0.499742,2.068851,0.537439,2.306922,0.537259,2.086973,0.523605,...,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0,0,0.4902


VotingClassifier


Unnamed: 0,Model,Accuracy,AUC,Recall,Prec.,F1,Kappa,MCC
0,Voting Classifier,0.6885,0.762,0.659,0.7424,0.6982,0.3787,0.3816


Unnamed: 0,QaA,QaE,QbA,QcA,QcE,QdA,QdE,QeA,QeE,QfA,...,wr_04_0_1.0,wr_06_0_1.0,wr_09_0_0.0,wr_11_1_0.0,wr_13_0_0.0,wf_03_0_1.0,wf_03_1_0.0,voted,Label,Score
0,0.564892,1.984928,0.627831,0.574759,2.020359,0.549449,1.977486,0.537259,2.048587,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.9663
1,0.540892,1.999249,0.561204,0.574759,2.126176,0.549449,2.081516,0.538197,2.080165,0.523605,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,0,0.3428
2,0.564892,2.061202,0.487898,0.530138,2.191196,0.549449,2.163461,0.537259,2.226536,0.515232,...,1.0,1.0,0.0,1.0,0.0,1.0,1.0,1,0,0.4023
3,0.540892,1.986719,0.561204,0.530138,2.073033,0.537439,2.167986,0.538197,2.130750,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.6421
4,0.564892,1.996939,0.522095,0.574759,2.036587,0.549449,2.087927,0.596230,2.318136,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.7320
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
13655,0.563033,2.002854,0.487898,0.530138,2.068097,0.548998,2.095770,0.538197,2.091561,0.522300,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,0,1,0.5202
13656,0.564892,1.941796,0.522095,0.530138,1.967494,0.549449,2.081067,0.537259,2.125878,0.523605,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.8751
13657,0.494825,1.915445,0.557456,0.499742,2.017649,0.549449,2.066700,0.537259,2.048587,0.585859,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,0,0,0.4440
13658,0.564892,1.939581,0.487898,0.499742,2.068851,0.537439,2.306922,0.537259,2.086973,0.523605,...,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0,0,0.4778


VotingClassifier


Unnamed: 0,Model,Accuracy,AUC,Recall,Prec.,F1,Kappa,MCC
0,Voting Classifier,0.6877,0.76,0.653,0.7445,0.6958,0.3778,0.3812


Unnamed: 0,QaA,QaE,QbA,QcA,QcE,QdA,QdE,QeA,QeE,QfA,...,wr_04_0_1.0,wr_06_0_1.0,wr_09_0_0.0,wr_11_1_0.0,wr_13_0_0.0,wf_03_0_1.0,wf_03_1_0.0,voted,Label,Score
0,0.564892,1.984928,0.627831,0.574759,2.020359,0.549449,1.977486,0.537259,2.048587,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.9595
1,0.540892,1.999249,0.561204,0.574759,2.126176,0.549449,2.081516,0.538197,2.080165,0.523605,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,0,0.3412
2,0.564892,2.061202,0.487898,0.530138,2.191196,0.549449,2.163461,0.537259,2.226536,0.515232,...,1.0,1.0,0.0,1.0,0.0,1.0,1.0,1,0,0.3931
3,0.540892,1.986719,0.561204,0.530138,2.073033,0.537439,2.167986,0.538197,2.130750,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.6060
4,0.564892,1.996939,0.522095,0.574759,2.036587,0.549449,2.087927,0.596230,2.318136,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.7332
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
13655,0.563033,2.002854,0.487898,0.530138,2.068097,0.548998,2.095770,0.538197,2.091561,0.522300,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,0,1,0.5182
13656,0.564892,1.941796,0.522095,0.530138,1.967494,0.549449,2.081067,0.537259,2.125878,0.523605,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.8731
13657,0.494825,1.915445,0.557456,0.499742,2.017649,0.549449,2.066700,0.537259,2.048587,0.585859,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,0,0,0.4561
13658,0.564892,1.939581,0.487898,0.499742,2.068851,0.537439,2.306922,0.537259,2.086973,0.523605,...,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0,0,0.4866


VotingClassifier


Unnamed: 0,Model,Accuracy,AUC,Recall,Prec.,F1,Kappa,MCC
0,Voting Classifier,0.6875,0.7605,0.6538,0.7437,0.6959,0.3772,0.3805


Unnamed: 0,QaA,QaE,QbA,QcA,QcE,QdA,QdE,QeA,QeE,QfA,...,wr_04_0_1.0,wr_06_0_1.0,wr_09_0_0.0,wr_11_1_0.0,wr_13_0_0.0,wf_03_0_1.0,wf_03_1_0.0,voted,Label,Score
0,0.564892,1.984928,0.627831,0.574759,2.020359,0.549449,1.977486,0.537259,2.048587,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.9624
1,0.540892,1.999249,0.561204,0.574759,2.126176,0.549449,2.081516,0.538197,2.080165,0.523605,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,0,0.3421
2,0.564892,2.061202,0.487898,0.530138,2.191196,0.549449,2.163461,0.537259,2.226536,0.515232,...,1.0,1.0,0.0,1.0,0.0,1.0,1.0,1,0,0.3791
3,0.540892,1.986719,0.561204,0.530138,2.073033,0.537439,2.167986,0.538197,2.130750,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.6140
4,0.564892,1.996939,0.522095,0.574759,2.036587,0.549449,2.087927,0.596230,2.318136,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.7376
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
13655,0.563033,2.002854,0.487898,0.530138,2.068097,0.548998,2.095770,0.538197,2.091561,0.522300,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,0,1,0.5097
13656,0.564892,1.941796,0.522095,0.530138,1.967494,0.549449,2.081067,0.537259,2.125878,0.523605,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.8725
13657,0.494825,1.915445,0.557456,0.499742,2.017649,0.549449,2.066700,0.537259,2.048587,0.585859,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,0,0,0.4524
13658,0.564892,1.939581,0.487898,0.499742,2.068851,0.537439,2.306922,0.537259,2.086973,0.523605,...,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0,0,0.4871


CatBoostClassifier


Unnamed: 0,Model,Accuracy,AUC,Recall,Prec.,F1,Kappa,MCC
0,CatBoost Classifier,0.6898,0.7641,0.6544,0.7469,0.6976,0.3819,0.3855


Unnamed: 0,QaA,QaE,QbA,QcA,QcE,QdA,QdE,QeA,QeE,QfA,...,wr_04_0_1.0,wr_06_0_1.0,wr_09_0_0.0,wr_11_1_0.0,wr_13_0_0.0,wf_03_0_1.0,wf_03_1_0.0,voted,Label,Score
0,0.564892,1.984928,0.627831,0.574759,2.020359,0.549449,1.977486,0.537259,2.048587,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.9781
1,0.540892,1.999249,0.561204,0.574759,2.126176,0.549449,2.081516,0.538197,2.080165,0.523605,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,0,0.3426
2,0.564892,2.061202,0.487898,0.530138,2.191196,0.549449,2.163461,0.537259,2.226536,0.515232,...,1.0,1.0,0.0,1.0,0.0,1.0,1.0,1,0,0.3835
3,0.540892,1.986719,0.561204,0.530138,2.073033,0.537439,2.167986,0.538197,2.130750,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.6753
4,0.564892,1.996939,0.522095,0.574759,2.036587,0.549449,2.087927,0.596230,2.318136,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.7256
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
13655,0.563033,2.002854,0.487898,0.530138,2.068097,0.548998,2.095770,0.538197,2.091561,0.522300,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,0,0,0.4963
13656,0.564892,1.941796,0.522095,0.530138,1.967494,0.549449,2.081067,0.537259,2.125878,0.523605,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.8646
13657,0.494825,1.915445,0.557456,0.499742,2.017649,0.549449,2.066700,0.537259,2.048587,0.585859,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,0,0,0.4594
13658,0.564892,1.939581,0.487898,0.499742,2.068851,0.537439,2.306922,0.537259,2.086973,0.523605,...,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0,0,0.4660


CatBoostClassifier


Unnamed: 0,Model,Accuracy,AUC,Recall,Prec.,F1,Kappa,MCC
0,CatBoost Classifier,0.6898,0.7641,0.6544,0.7469,0.6976,0.3819,0.3855


Unnamed: 0,QaA,QaE,QbA,QcA,QcE,QdA,QdE,QeA,QeE,QfA,...,wr_04_0_1.0,wr_06_0_1.0,wr_09_0_0.0,wr_11_1_0.0,wr_13_0_0.0,wf_03_0_1.0,wf_03_1_0.0,voted,Label,Score
0,0.564892,1.984928,0.627831,0.574759,2.020359,0.549449,1.977486,0.537259,2.048587,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.9781
1,0.540892,1.999249,0.561204,0.574759,2.126176,0.549449,2.081516,0.538197,2.080165,0.523605,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,0,0.3426
2,0.564892,2.061202,0.487898,0.530138,2.191196,0.549449,2.163461,0.537259,2.226536,0.515232,...,1.0,1.0,0.0,1.0,0.0,1.0,1.0,1,0,0.3835
3,0.540892,1.986719,0.561204,0.530138,2.073033,0.537439,2.167986,0.538197,2.130750,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.6753
4,0.564892,1.996939,0.522095,0.574759,2.036587,0.549449,2.087927,0.596230,2.318136,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.7256
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
13655,0.563033,2.002854,0.487898,0.530138,2.068097,0.548998,2.095770,0.538197,2.091561,0.522300,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,0,0,0.4963
13656,0.564892,1.941796,0.522095,0.530138,1.967494,0.549449,2.081067,0.537259,2.125878,0.523605,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.8646
13657,0.494825,1.915445,0.557456,0.499742,2.017649,0.549449,2.066700,0.537259,2.048587,0.585859,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,0,0,0.4594
13658,0.564892,1.939581,0.487898,0.499742,2.068851,0.537439,2.306922,0.537259,2.086973,0.523605,...,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0,0,0.4660


In [23]:
for model in tuned_models:
    print(model.__class__.__name__)
    display(predict_model(model))

CatBoostClassifier


Unnamed: 0,Model,Accuracy,AUC,Recall,Prec.,F1,Kappa,MCC
0,CatBoost Classifier,0.6898,0.7641,0.6544,0.7469,0.6976,0.3819,0.3855


Unnamed: 0,QaA,QaE,QbA,QcA,QcE,QdA,QdE,QeA,QeE,QfA,...,wr_04_0_1.0,wr_06_0_1.0,wr_09_0_0.0,wr_11_1_0.0,wr_13_0_0.0,wf_03_0_1.0,wf_03_1_0.0,voted,Label,Score
0,0.564892,1.984928,0.627831,0.574759,2.020359,0.549449,1.977486,0.537259,2.048587,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.9781
1,0.540892,1.999249,0.561204,0.574759,2.126176,0.549449,2.081516,0.538197,2.080165,0.523605,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,0,0.3426
2,0.564892,2.061202,0.487898,0.530138,2.191196,0.549449,2.163461,0.537259,2.226536,0.515232,...,1.0,1.0,0.0,1.0,0.0,1.0,1.0,1,0,0.3835
3,0.540892,1.986719,0.561204,0.530138,2.073033,0.537439,2.167986,0.538197,2.130750,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.6753
4,0.564892,1.996939,0.522095,0.574759,2.036587,0.549449,2.087927,0.596230,2.318136,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.7256
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
13655,0.563033,2.002854,0.487898,0.530138,2.068097,0.548998,2.095770,0.538197,2.091561,0.522300,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,0,0,0.4963
13656,0.564892,1.941796,0.522095,0.530138,1.967494,0.549449,2.081067,0.537259,2.125878,0.523605,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.8646
13657,0.494825,1.915445,0.557456,0.499742,2.017649,0.549449,2.066700,0.537259,2.048587,0.585859,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,0,0,0.4594
13658,0.564892,1.939581,0.487898,0.499742,2.068851,0.537439,2.306922,0.537259,2.086973,0.523605,...,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0,0,0.4660


GradientBoostingClassifier


Unnamed: 0,Model,Accuracy,AUC,Recall,Prec.,F1,Kappa,MCC
0,Gradient Boosting Classifier,0.6874,0.761,0.6527,0.7442,0.6955,0.3772,0.3806


Unnamed: 0,QaA,QaE,QbA,QcA,QcE,QdA,QdE,QeA,QeE,QfA,...,wr_04_0_1.0,wr_06_0_1.0,wr_09_0_0.0,wr_11_1_0.0,wr_13_0_0.0,wf_03_0_1.0,wf_03_1_0.0,voted,Label,Score
0,0.564892,1.984928,0.627831,0.574759,2.020359,0.549449,1.977486,0.537259,2.048587,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.9740
1,0.540892,1.999249,0.561204,0.574759,2.126176,0.549449,2.081516,0.538197,2.080165,0.523605,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,0,0.3459
2,0.564892,2.061202,0.487898,0.530138,2.191196,0.549449,2.163461,0.537259,2.226536,0.515232,...,1.0,1.0,0.0,1.0,0.0,1.0,1.0,1,0,0.3231
3,0.540892,1.986719,0.561204,0.530138,2.073033,0.537439,2.167986,0.538197,2.130750,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.6462
4,0.564892,1.996939,0.522095,0.574759,2.036587,0.549449,2.087927,0.596230,2.318136,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.7552
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
13655,0.563033,2.002854,0.487898,0.530138,2.068097,0.548998,2.095770,0.538197,2.091561,0.522300,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,0,0,0.4758
13656,0.564892,1.941796,0.522095,0.530138,1.967494,0.549449,2.081067,0.537259,2.125878,0.523605,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.8702
13657,0.494825,1.915445,0.557456,0.499742,2.017649,0.549449,2.066700,0.537259,2.048587,0.585859,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,0,0,0.4374
13658,0.564892,1.939581,0.487898,0.499742,2.068851,0.537439,2.306922,0.537259,2.086973,0.523605,...,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0,0,0.4890


LGBMClassifier


Unnamed: 0,Model,Accuracy,AUC,Recall,Prec.,F1,Kappa,MCC
0,Light Gradient Boosting Machine,0.688,0.7603,0.6485,0.7475,0.6945,0.379,0.383


Unnamed: 0,QaA,QaE,QbA,QcA,QcE,QdA,QdE,QeA,QeE,QfA,...,wr_04_0_1.0,wr_06_0_1.0,wr_09_0_0.0,wr_11_1_0.0,wr_13_0_0.0,wf_03_0_1.0,wf_03_1_0.0,voted,Label,Score
0,0.564892,1.984928,0.627831,0.574759,2.020359,0.549449,1.977486,0.537259,2.048587,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.9837
1,0.540892,1.999249,0.561204,0.574759,2.126176,0.549449,2.081516,0.538197,2.080165,0.523605,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,0,0.3533
2,0.564892,2.061202,0.487898,0.530138,2.191196,0.549449,2.163461,0.537259,2.226536,0.515232,...,1.0,1.0,0.0,1.0,0.0,1.0,1.0,1,0,0.4105
3,0.540892,1.986719,0.561204,0.530138,2.073033,0.537439,2.167986,0.538197,2.130750,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.7426
4,0.564892,1.996939,0.522095,0.574759,2.036587,0.549449,2.087927,0.596230,2.318136,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.6925
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
13655,0.563033,2.002854,0.487898,0.530138,2.068097,0.548998,2.095770,0.538197,2.091561,0.522300,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,0,1,0.5299
13656,0.564892,1.941796,0.522095,0.530138,1.967494,0.549449,2.081067,0.537259,2.125878,0.523605,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.8922
13657,0.494825,1.915445,0.557456,0.499742,2.017649,0.549449,2.066700,0.537259,2.048587,0.585859,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,0,0,0.4363
13658,0.564892,1.939581,0.487898,0.499742,2.068851,0.537439,2.306922,0.537259,2.086973,0.523605,...,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0,0,0.4257


LinearDiscriminantAnalysis


Unnamed: 0,Model,Accuracy,AUC,Recall,Prec.,F1,Kappa,MCC
0,Linear Discriminant Analysis,0.6818,0.7543,0.6631,0.7302,0.695,0.364,0.3658


Unnamed: 0,QaA,QaE,QbA,QcA,QcE,QdA,QdE,QeA,QeE,QfA,...,wr_04_0_1.0,wr_06_0_1.0,wr_09_0_0.0,wr_11_1_0.0,wr_13_0_0.0,wf_03_0_1.0,wf_03_1_0.0,voted,Label,Score
0,0.564892,1.984928,0.627831,0.574759,2.020359,0.549449,1.977486,0.537259,2.048587,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.9371
1,0.540892,1.999249,0.561204,0.574759,2.126176,0.549449,2.081516,0.538197,2.080165,0.523605,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,0,0.3325
2,0.564892,2.061202,0.487898,0.530138,2.191196,0.549449,2.163461,0.537259,2.226536,0.515232,...,1.0,1.0,0.0,1.0,0.0,1.0,1.0,1,0,0.4130
3,0.540892,1.986719,0.561204,0.530138,2.073033,0.537439,2.167986,0.538197,2.130750,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.5084
4,0.564892,1.996939,0.522095,0.574759,2.036587,0.549449,2.087927,0.596230,2.318136,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.7778
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
13655,0.563033,2.002854,0.487898,0.530138,2.068097,0.548998,2.095770,0.538197,2.091561,0.522300,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,0,1,0.5344
13656,0.564892,1.941796,0.522095,0.530138,1.967494,0.549449,2.081067,0.537259,2.125878,0.523605,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.8685
13657,0.494825,1.915445,0.557456,0.499742,2.017649,0.549449,2.066700,0.537259,2.048587,0.585859,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,0,0,0.4362
13658,0.564892,1.939581,0.487898,0.499742,2.068851,0.537439,2.306922,0.537259,2.086973,0.523605,...,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0,1,0.5418


LogisticRegression


Unnamed: 0,Model,Accuracy,AUC,Recall,Prec.,F1,Kappa,MCC
0,Logistic Regression,0.68,0.7505,0.6268,0.7473,0.6818,0.365,0.3707


Unnamed: 0,QaA,QaE,QbA,QcA,QcE,QdA,QdE,QeA,QeE,QfA,...,wr_04_0_1.0,wr_06_0_1.0,wr_09_0_0.0,wr_11_1_0.0,wr_13_0_0.0,wf_03_0_1.0,wf_03_1_0.0,voted,Label,Score
0,0.564892,1.984928,0.627831,0.574759,2.020359,0.549449,1.977486,0.537259,2.048587,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.9375
1,0.540892,1.999249,0.561204,0.574759,2.126176,0.549449,2.081516,0.538197,2.080165,0.523605,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,0,0.3297
2,0.564892,2.061202,0.487898,0.530138,2.191196,0.549449,2.163461,0.537259,2.226536,0.515232,...,1.0,1.0,0.0,1.0,0.0,1.0,1.0,1,0,0.3685
3,0.540892,1.986719,0.561204,0.530138,2.073033,0.537439,2.167986,0.538197,2.130750,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,0,0.4980
4,0.564892,1.996939,0.522095,0.574759,2.036587,0.549449,2.087927,0.596230,2.318136,0.568371,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.7368
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
13655,0.563033,2.002854,0.487898,0.530138,2.068097,0.548998,2.095770,0.538197,2.091561,0.522300,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,0,1,0.5091
13656,0.564892,1.941796,0.522095,0.530138,1.967494,0.549449,2.081067,0.537259,2.125878,0.523605,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,1,1,0.8684
13657,0.494825,1.915445,0.557456,0.499742,2.017649,0.549449,2.066700,0.537259,2.048587,0.585859,...,0.0,1.0,0.0,1.0,1.0,1.0,1.0,0,0,0.4922
13658,0.564892,1.939581,0.487898,0.499742,2.068851,0.537439,2.306922,0.537259,2.086973,0.523605,...,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0,1,0.5154


## `8. Save Best Models`

### 1. Tuned Catboost Classifier - AUC : 0.7641

In [25]:
save_model(model_catboost,'Pycaret_Intermediate_catboost_20210302')

Transformation Pipeline and Model Succesfully Saved


### 2. bag_catboost_50 - AUC : 0.7629

In [26]:
save_model(bag_catboost_50,'Pycaret_Intermediate_bag_catboost_50_20210302')

Transformation Pipeline and Model Succesfully Saved


### 3. blend_3_soft - AUC : 0.762

In [27]:
save_model(blend_3_soft,'Pycaret_Intermediate_blend_3model_20210302')

Transformation Pipeline and Model Succesfully Saved
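
This notebook skips the load step. As a minimal sketch, the three pipelines saved above could be restored in a later session roughly as follows. It assumes `save_model` wrote each file to the working directory with a `.pkl` suffix; `pickle_path` and `load_saved_pipelines` are hypothetical helper names, not part of PyCaret.

```python
from pathlib import Path

# Names passed to save_model() above; PyCaret appends ".pkl" on disk.
SAVED_MODEL_NAMES = [
    'Pycaret_Intermediate_catboost_20210302',
    'Pycaret_Intermediate_bag_catboost_50_20210302',
    'Pycaret_Intermediate_blend_3model_20210302',
]


def pickle_path(model_name, directory='.'):
    """Return the file that save_model(model, model_name) is assumed to write."""
    return Path(directory) / f'{model_name}.pkl'


def load_saved_pipelines(model_names, directory='.'):
    """Reload every saved pipeline whose pickle exists; skip the others."""
    # Imported lazily so pickle_path() works without PyCaret installed.
    from pycaret.classification import load_model

    loaded = {}
    for name in model_names:
        if pickle_path(name, directory).exists():
            # load_model expects the name without the ".pkl" extension.
            loaded[name] = load_model(str(Path(directory) / name))
    return loaded
```

Calling `load_saved_pipelines(SAVED_MODEL_NAMES)` returns a dict keyed by model name. Each value can then be passed to `predict_model(..., data=test)` in the same way as the in-memory models.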


## `9. Finalize Best Models`

In [28]:
final_tuend_cat = finalize_model(model_catboost)

In [29]:
final_bag_catboost_50 = finalize_model(bag_catboost_50)

In [30]:
final_blend_3_soft = finalize_model(blend_3_soft)

## `10. Predict & Submit`
- Cat : Public 0.7720590281 / Private 0.7717540655
- Bagging : Public 0.7721665517 / Private 0.7727374822
- Voting : Public 0.664457165 / Private 0.6820266892 (the model appears to have overfit)
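
The scores above are easiest to read next to the validation AUC values from the previous section. The sketch below assumes the leaderboard metric is ROC AUC, which the notebook does not state explicitly. It computes AUC as the fraction of (positive, negative) pairs that the scores rank correctly. `roc_auc` is a hypothetical helper, and the labels and scores are toy values, not competition data.

```python
def roc_auc(labels, scores):
    """Pairwise ROC AUC: share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This is O(n_pos * n_neg), so it is
    only meant for small illustrative inputs.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError('need at least one positive and one negative label')
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example (made-up values).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))  # 0.75

# AUC depends only on the ordering of the scores, so a monotonic
# transformation such as squaring leaves it unchanged.
print(roc_auc(toy_labels, [s ** 2 for s in toy_scores]))  # 0.75
```

Because only the ranking matters, a drop like the one seen for the voting model means its scores order the test rows differently from how they ordered the validation rows. It is not a matter of the scores being on a different scale.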

In [33]:
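# Predict the test set with each finalized model and write one submission file per model.
for model in [final_tuend_cat, final_bag_catboost_50, final_blend_3_soft]:
    prediction = predict_model(model, data=test)
    # Start from a fresh copy of the submission template for every model.
    sample_submission = pd.read_csv(data_path+'sample_submission.csv')
    # Submit the predicted score rather than the hard label.
    sample_submission['voted'] = prediction['Score']
    display(sample_submission.head())
    # The estimator's class name keeps the three output files distinct.
    sample_submission.to_csv(f'Pycaret_Classification_Intermediate_03022021_{model.__class__.__name__}.csv',index=False)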

Unnamed: 0,index,voted
0,0,0.6597
1,1,0.8929
2,2,0.5535
3,3,0.1696
4,4,0.7993


Unnamed: 0,index,voted
0,0,0.6713
1,1,0.8918
2,2,0.5477
3,3,0.1698
4,4,0.7911


Unnamed: 0,index,voted
0,0,0.7882
1,1,0.5849
2,2,0.7267
3,3,0.4804
4,4,0.4817
