### Models to Apply (8 in total)
- Logistic Regression
- LDA
- QDA
- KNN
- Decision Tree
- Random Forest
- XGBoost
- LightGBM

### Oversampling Methods to Apply (4 in total)
- Original data without oversampling
- SMOTE
- ADASYN
- Distribution-SMOTE

### ▶ Print evaluation metrics for all 8 × 4 = 32 combinations of oversampling method and model
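
The 8 × 4 grid can be enumerated explicitly, which is a convenient way to check that no combination is missed. The sketch below is only illustrative; the label strings are shorthand chosen here, not identifiers taken from the notebook's code.

```python
from itertools import product

# Short labels for the 8 models and 4 data variants listed above
# (the label strings are illustrative shorthand).
models = ["LR", "LDA", "QDA", "KNN", "DT", "RF", "XGB", "LightGBM"]
samplers = ["None", "SMOTE", "ADASYN", "Distribution-SMOTE"]

# Every (oversampling method, model) pair that gets evaluated
combinations = list(product(samplers, models))

print(len(combinations))
for sampler, model in combinations[:3]:
    print(f"{sampler} + {model}")
```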

---

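## (2) Modeling Outline
1. **Define the evaluation function:** create a function that prints the evaluation metrics
    

2. **Apply the models to each oversampling method:** print the results for each of the 32 cases<br>
    **2.(1) No oversampling**<br>
    2.(1)-1. Data preparation: load the data for the given oversampling method, separate the features from the target, and split into train and test sets<br>
    2.(1)-2. Model fitting: fit the 8 models to the data for the given oversampling method<br>
    2.(1)-3. Performance evaluation: for each fitted model, compute the confusion matrix, accuracy, precision, recall, F1-score, AUC, and geometric mean<br>
    
    **2.(2) SMOTE**<br>
    2.(2)-1. Data preparation<br>
    2.(2)-2. Model fitting<br>
    2.(2)-3. Performance evaluation<br>
    
    **2.(3) ADASYN**<br>
    2.(3)-1. Data preparation<br>
    2.(3)-2. Model fitting<br>
    2.(3)-3. Performance evaluation<br>
    
    **2.(4) Distribution-SMOTE**<br>
    2.(4)-1. Data preparation<br>
    2.(4)-2. Model fitting<br>
    2.(4)-3. Performance evaluation<br>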


---

In [1]:
import warnings
warnings.filterwarnings(action='ignore')

## 1. Define the Evaluation Function

In [2]:
import pandas as pd
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, roc_auc_score
from imblearn.metrics import geometric_mean_score

def get_clf_eval(y_test, y_pred):
    # Confusion matrix: rows are the true class, columns the predicted class
    confmat = pd.DataFrame(confusion_matrix(y_test, y_pred),
                           index=['True[0]', 'True[1]'],
                           columns=['Predict[0]', 'Predict[1]'])
    accuracy = accuracy_score(y_test, y_pred)
    precision = precision_score(y_test, y_pred)
    recall = recall_score(y_test, y_pred)
    f1 = f1_score(y_test, y_pred)
    # Note: AUC is computed from hard class predictions, not predicted probabilities
    AUC = roc_auc_score(y_test, y_pred)
    g_means = geometric_mean_score(y_test, y_pred)
    print(confmat)
    print("\nAccuracy : {:.3f} \nPrecision : {:.3f} \nRecall : {:.3f} \nf1-score : {:.3f} \nAUC : {:.3f} \nGeometric mean : {:.3f} \n".format(
        accuracy, precision, recall, f1, AUC, g_means))
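
To make the printed metrics concrete, the sketch below recomputes them by hand from the confusion-matrix counts of the "no oversampling + KNN" result reported in section 2.(1)-3. It is a plain-Python illustration rather than a replacement for the library calls. It also shows that, because `roc_auc_score` receives hard 0/1 predictions here, the reported AUC equals the mean of sensitivity and specificity.

```python
from math import sqrt

# Confusion-matrix counts copied from the "no oversampling + KNN" output
# in section 2.(1)-3 (rows = true class, columns = predicted class).
tn, fp = 62474, 3818
fn, tp = 4573, 4735

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)        # sensitivity, true positive rate
specificity = tn / (tn + fp)   # true negative rate
f1 = 2 * precision * recall / (precision + recall)
g_mean = sqrt(recall * specificity)
# With hard 0/1 predictions the ROC curve has a single operating point,
# so the area under it reduces to the mean of sensitivity and specificity.
auc_from_labels = (recall + specificity) / 2

for label, value in [("accuracy", accuracy), ("precision", precision),
                     ("recall", recall), ("f1-score", f1),
                     ("AUC", auc_from_labels), ("geometric mean", g_mean)]:
    print(f"{label}: {value:.3f}")
```

The rounded values match the notebook output for that model (0.889, 0.554, 0.509, 0.530, 0.726, 0.692).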

## 2. Applying the Models to Each Oversampling Method

## 2.(1) No Oversampling

### 2.(1)-1. Data Preparation

In [3]:
import pandas as pd
df = pd.read_csv("Loan_data.csv")

In [4]:
df

Unnamed: 0,Id,Income,Age,Experience,Married/Single,House_Ownership,Car_Ownership,Profession,CITY,STATE,CURRENT_JOB_YRS,CURRENT_HOUSE_YRS,Risk_Flag
0,1,1303834,23,3,single,rented,no,Mechanical_engineer,Rewa,Madhya_Pradesh,3,13,0
1,2,7574516,40,10,single,rented,no,Software_Developer,Parbhani,Maharashtra,9,13,0
2,3,3991815,66,4,married,rented,no,Technical_writer,Alappuzha,Kerala,4,10,0
3,4,6256451,41,2,single,rented,yes,Software_Developer,Bhubaneswar,Odisha,2,12,1
4,5,5768871,47,11,single,rented,no,Civil_servant,Tiruchirappalli[10],Tamil_Nadu,3,14,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...
251995,251996,8154883,43,13,single,rented,no,Surgeon,Kolkata,West_Bengal,6,11,0
251996,251997,2843572,26,10,single,rented,no,Army_officer,Rewa,Madhya_Pradesh,6,11,0
251997,251998,4522448,46,7,single,rented,no,Design_Engineer,Kalyan-Dombivli,Maharashtra,7,12,0
251998,251999,6507128,45,0,single,rented,no,Graphic_Designer,Pondicherry,Puducherry,0,10,0


In [5]:
# Split into features and target
X = df.drop(['Id','Risk_Flag'], axis=1)
y = df.Risk_Flag

In [6]:
# Label-encode the categorical variables

from sklearn.preprocessing import LabelEncoder

en = LabelEncoder()
category_cols = ['Married/Single','House_Ownership','Car_Ownership', 'Profession', 'CITY', 'STATE']
for cols in category_cols:
    X[cols] = en.fit_transform(X[cols])
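
Label encoding maps each distinct category to an integer; `LabelEncoder` assigns the integers in sorted order of the class labels. Because the cell above reuses one encoder for every column, the fitted mapping of each earlier column is overwritten, which makes the integer codes harder to decode later. The sketch below mimics the mapping in plain Python and keeps one mapping per column. The helper name `fit_label_mapping` and the sample values are illustrative and not part of the notebook.

```python
def fit_label_mapping(values):
    """Map each distinct value to an integer, in sorted order."""
    return {category: code for code, category in enumerate(sorted(set(values)))}

# A few sample values in the style of the loan data (illustrative only)
columns = {
    "Married/Single": ["single", "single", "married", "single"],
    "House_Ownership": ["rented", "owned", "rented", "norent_noown"],
}

# Keep one mapping per column so the integer codes can be decoded later
mappings = {name: fit_label_mapping(values) for name, values in columns.items()}
encoded = {name: [mappings[name][v] for v in values]
           for name, values in columns.items()}

print(mappings["Married/Single"])
print(encoded["House_Ownership"])
```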

In [7]:
# Split into train and test sets
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3, random_state = 101)

### 2.(1)-2. Model Fitting

In [8]:
from sklearn.linear_model import LogisticRegression
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from xgboost.sklearn import XGBClassifier
from lightgbm import LGBMClassifier
import time

models_X = []
models_X.append(('LR', LogisticRegression(max_iter=5000)))  # logistic regression classifier
models_X.append(('LDA', LinearDiscriminantAnalysis()))  # LDA model
models_X.append(('QDA', QuadraticDiscriminantAnalysis()))  # QDA model
models_X.append(('KNN', KNeighborsClassifier()))  # KNN model
models_X.append(('DT', DecisionTreeClassifier()))  # decision tree model
models_X.append(('RF', RandomForestClassifier()))  # random forest model
models_X.append(('XGB', XGBClassifier()))  # XGBoost model
models_X.append(('Light_GBM', LGBMClassifier()))  # LightGBM model

for name, model in models_X:
    start = time.time()
    model.fit(X_train, y_train)
    elapsed = time.time() - start  # fitting time only; scoring is not timed
    msg = "%s - train_score : %.3f, test score : %.3f, time : %.5f s" % (name, model.score(X_train, y_train), model.score(X_test, y_test), elapsed)
    print(msg)

LR - train_score : 0.877, test score : 0.877, time : 0.95596 s
LDA - train_score : 0.877, test score : 0.877, time : 0.47928 s
QDA - train_score : 0.877, test score : 0.877, time : 0.20683 s
KNN - train_score : 0.901, test score : 0.889, time : 2.38774 s
DT - train_score : 0.937, test score : 0.879, time : 2.78954 s
RF - train_score : 0.937, test score : 0.899, time : 50.36346 s
XGB - train_score : 0.894, test score : 0.887, time : 13.92591 s
Light_GBM - train_score : 0.881, test score : 0.879, time : 1.57509 s
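
The identical 0.877 scores for LR, LDA, and QDA are worth a second look. A minimal check, using the test-set class counts that appear in the confusion matrices of section 2.(1)-3, shows that 0.877 is exactly the accuracy of a classifier that always predicts class 0, so accuracy alone says little on this imbalanced data.

```python
# Test-set class counts taken from the confusion matrices in section 2.(1)-3
n_negative = 66292  # Risk_Flag == 0
n_positive = 9308   # Risk_Flag == 1

total = n_negative + n_positive
positive_rate = n_positive / total

# Accuracy of a trivial classifier that predicts 0 for every applicant
majority_baseline_accuracy = n_negative / total

print(f"positive rate: {positive_rate:.3f}")
print(f"majority-class accuracy: {majority_baseline_accuracy:.3f}")
```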


### 2.(1)-3. Performance Evaluation

In [9]:
# Evaluate every fitted model on the test set
for name, model in models_X:
    print("----------No oversampling + %s model----------" % name)
    get_clf_eval(y_test, model.predict(X_test))

----------No oversampling + LR model----------
         Predict[0]  Predict[1]
True[0]       66292           0
True[1]        9308           0

Accuracy : 0.877
Precision : 0.000
Recall : 0.000
f1-score : 0.000
AUC : 0.500
Geometric mean : 0.000

----------No oversampling + LDA model----------
         Predict[0]  Predict[1]
True[0]       66292           0
True[1]        9308           0

Accuracy : 0.877
Precision : 0.000
Recall : 0.000
f1-score : 0.000
AUC : 0.500
Geometric mean : 0.000

----------No oversampling + QDA model----------
         Predict[0]  Predict[1]
True[0]       66292           0
True[1]        9308           0

Accuracy : 0.877
Precision : 0.000
Recall : 0.000
f1-score : 0.000
AUC : 0.500
Geometric mean : 0.000

----------No oversampling + KNN model----------
         Predict[0]  Predict[1]
True[0]       62474        3818
True[1]        4573        4735

Accuracy : 0.889
Precision : 0.554
Recall : 0.509
f1-score : 0.530
AUC : 0.726
Geometric mean : 0.692

----------No oversampling + DT model----------
         Predict[0]

## 2.(2) SMOTE

### 2.(2)-1. Data Preparation

In [10]:
X_s = pd.read_csv("x_smote.csv")
y_s = pd.read_csv("y_smote.csv")

In [11]:
X_s.drop(['Unnamed: 0'], axis=1, inplace=True)
X_s

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10
0,8196231,37,14,1,2,1,20,63,1,7,13
1,2869872,56,19,1,2,0,19,86,1,11,12
2,7361429,58,2,1,2,0,50,292,12,2,12
3,5921974,25,3,0,2,0,50,194,6,3,11
4,1583605,45,13,1,2,0,5,249,13,13,14
...,...,...,...,...,...,...,...,...,...,...,...
309419,8928667,38,15,1,2,0,9,256,2,3,10
309420,7761009,38,3,1,2,0,33,164,4,3,12
309421,4002405,68,6,1,2,0,31,65,13,3,13
309422,8517978,58,7,1,2,0,8,160,12,4,10


In [12]:
y_s = y_s['0']
y_s

0         0
1         0
2         1
3         0
4         0
         ..
309419    1
309420    1
309421    1
309422    1
309423    1
Name: 0, Length: 309424, dtype: int64

In [13]:
from sklearn.model_selection import train_test_split

X_train_s, X_test_s, y_train_s, y_test_s = train_test_split(X_s, y_s, test_size = 0.3, random_state = 101)
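
Note that the split here is applied to data that has already been oversampled, so synthetic minority samples also land in the test set, and the scores in this section are measured on a different, balanced test set than in section 2.(1). A common alternative is to split first and oversample only the training portion. The sketch below illustrates that ordering in plain Python on toy data, with simple random duplication standing in for SMOTE; the function name `oversample_minority` and the toy data are hypothetical.

```python
import random

def oversample_minority(rows, labels, seed=0):
    """Duplicate random minority rows until both classes are the same size.

    Random duplication is a stand-in for SMOTE, used only to keep the
    sketch dependency-free. Assumes class 1 is the minority class.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    keep = list(range(len(labels))) + extra
    return [rows[i] for i in keep], [labels[i] for i in keep]

# Toy data: 100 samples, 10% positive (hypothetical, assumed already shuffled)
rows = [[i] for i in range(100)]
labels = [1 if i % 10 == 0 else 0 for i in range(100)]

# 1) Split first, so the test set contains only original samples
X_test, y_test = rows[:30], labels[:30]
X_train, y_train = rows[30:], labels[30:]

# 2) Oversample the training portion only
X_train_bal, y_train_bal = oversample_minority(X_train, y_train)

n_pos = sum(y_train_bal)
n_neg = len(y_train_bal) - n_pos
print(n_pos, n_neg)              # balanced training classes
print(len(X_test), sum(y_test))  # test set keeps its original imbalance
```

With this ordering, every test row is an original observation and the test set keeps the class ratio of the raw data.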

### 2.(2)-2. Model Fitting

In [14]:
models_s = []
models_s.append(('LR', LogisticRegression(max_iter=5000)))  # logistic regression classifier
models_s.append(('LDA', LinearDiscriminantAnalysis()))  # LDA model
models_s.append(('QDA', QuadraticDiscriminantAnalysis()))  # QDA model
models_s.append(('KNN', KNeighborsClassifier()))  # KNN model
models_s.append(('DT', DecisionTreeClassifier()))  # decision tree model
models_s.append(('RF', RandomForestClassifier()))  # random forest model
models_s.append(('XGB', XGBClassifier()))  # XGBoost model
models_s.append(('Light_GBM', LGBMClassifier(boost_from_average=False)))  # LightGBM model

for name, model in models_s:
    start = time.time()
    model.fit(X_train_s, y_train_s)
    elapsed = time.time() - start  # fitting time only; scoring is not timed
    msg = "%s - train_score : %.3f, test score : %.3f, time : %.5f s" % (name, model.score(X_train_s, y_train_s), model.score(X_test_s, y_test_s), elapsed)
    print(msg)

LR - train_score : 0.501, test score : 0.499, time : 0.45080 s
LDA - train_score : 0.550, test score : 0.550, time : 0.57846 s
QDA - train_score : 0.557, test score : 0.557, time : 0.24734 s
KNN - train_score : 0.908, test score : 0.891, time : 2.35414 s
DT - train_score : 0.957, test score : 0.905, time : 1.96118 s
RF - train_score : 0.957, test score : 0.923, time : 55.54366 s
XGB - train_score : 0.889, test score : 0.881, time : 19.05337 s
Light_GBM - train_score : 0.803, test score : 0.798, time : 2.19603 s


### 2.(2)-3. Performance Evaluation

In [15]:
# Evaluate every fitted model on the SMOTE test set
for name, model in models_s:
    print("----------SMOTE + %s model----------" % name)
    get_clf_eval(y_test_s, model.predict(X_test_s))

----------SMOTE + LR model----------
         Predict[0]  Predict[1]
True[0]       46299           0
True[1]       46529           0

Accuracy : 0.499
Precision : 0.000
Recall : 0.000
f1-score : 0.000
AUC : 0.500
Geometric mean : 0.000

----------SMOTE + LDA model----------
         Predict[0]  Predict[1]
True[0]       24111       22188
True[1]       19567       26962

Accuracy : 0.550
Precision : 0.549
Recall : 0.579
f1-score : 0.564
AUC : 0.550
Geometric mean : 0.549

----------SMOTE + QDA model----------
         Predict[0]  Predict[1]
True[0]       20867       25432
True[1]       15688       30841

Accuracy : 0.557
Precision : 0.548
Recall : 0.663
f1-score : 0.600
AUC : 0.557
Geometric mean : 0.547

----------SMOTE + KNN model----------
         Predict[0]  Predict[1]
True[0]       40056        6243
True[1]        3847       42682

Accuracy : 0.891
Precision : 0.872
Recall : 0.917
f1-score : 0.894
AUC : 0.891
Geometric mean : 0.891

----------SMOTE + DT model----------
         Predict[0]  Predict[1]
True[0]       39989        6310
True[1]        2543

## 2.(3) ADASYN

### 2.(3)-1. Data Preparation

In [16]:
X_a = pd.read_csv("x_adasyn.csv")
y_a = pd.read_csv("y_adasyn.csv")

In [17]:
X_a

Unnamed: 0.1,Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10
0,0,8196231,37,14,1,2,1,20,63,1,7,13
1,1,2869872,56,19,1,2,0,19,86,1,11,12
2,2,7361429,58,2,1,2,0,50,292,12,2,12
3,3,5921974,25,3,0,2,0,50,194,6,3,11
4,4,1583605,45,13,1,2,0,5,249,13,13,14
...,...,...,...,...,...,...,...,...,...,...,...,...
307024,307024,9768782,61,10,1,2,1,46,113,2,5,11
307025,307025,9768782,61,10,1,2,1,46,113,2,5,11
307026,307026,9768782,61,10,1,2,1,46,113,2,5,11
307027,307027,9768782,61,10,1,2,1,46,113,2,5,11


In [18]:
X_a.drop(['Unnamed: 0'], axis=1, inplace=True)
X_a

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10
0,8196231,37,14,1,2,1,20,63,1,7,13
1,2869872,56,19,1,2,0,19,86,1,11,12
2,7361429,58,2,1,2,0,50,292,12,2,12
3,5921974,25,3,0,2,0,50,194,6,3,11
4,1583605,45,13,1,2,0,5,249,13,13,14
...,...,...,...,...,...,...,...,...,...,...,...
307024,9768782,61,10,1,2,1,46,113,2,5,11
307025,9768782,61,10,1,2,1,46,113,2,5,11
307026,9768782,61,10,1,2,1,46,113,2,5,11
307027,9768782,61,10,1,2,1,46,113,2,5,11
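
The last rows displayed above are identical, which suggests that the ADASYN file contains repeated rows. If copies of the same row fall on both sides of the train/test split, test scores can look better than they would on unseen data. A quick way to quantify this is to count repeated rows. The sketch below does so for a handful of the rows visible above; with pandas, `X_a.duplicated().sum()` would give the count for the full file.

```python
from collections import Counter

# Rows copied from the displayed head and tail of X_a
rows = [
    (8196231, 37, 14, 1, 2, 1, 20, 63, 1, 7, 13),
    (2869872, 56, 19, 1, 2, 0, 19, 86, 1, 11, 12),
    (9768782, 61, 10, 1, 2, 1, 46, 113, 2, 5, 11),
    (9768782, 61, 10, 1, 2, 1, 46, 113, 2, 5, 11),
    (9768782, 61, 10, 1, 2, 1, 46, 113, 2, 5, 11),
    (9768782, 61, 10, 1, 2, 1, 46, 113, 2, 5, 11),
]

counts = Counter(rows)
# Number of rows that repeat an earlier row
n_duplicates = sum(c - 1 for c in counts.values())
print(n_duplicates)
```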


In [19]:
y_a = y_a['0']
y_a

0         0
1         0
2         1
3         0
4         0
         ..
307024    1
307025    1
307026    1
307027    1
307028    1
Name: 0, Length: 307029, dtype: int64

In [20]:
from sklearn.model_selection import train_test_split

X_train_a, X_test_a, y_train_a, y_test_a = train_test_split(X_a, y_a, test_size = 0.3, random_state = 101)

### 2.(3)-2. Model Fitting

In [21]:
models_a = []
models_a.append(('LR', LogisticRegression(max_iter=5000)))  # logistic regression classifier
models_a.append(('LDA', LinearDiscriminantAnalysis()))  # LDA model
models_a.append(('QDA', QuadraticDiscriminantAnalysis()))  # QDA model
models_a.append(('KNN', KNeighborsClassifier()))  # KNN model
models_a.append(('DT', DecisionTreeClassifier()))  # decision tree model
models_a.append(('RF', RandomForestClassifier()))  # random forest model
models_a.append(('XGB', XGBClassifier()))  # XGBoost model
models_a.append(('Light_GBM', LGBMClassifier()))  # LightGBM model

for name, model in models_a:
    start = time.time()
    model.fit(X_train_a, y_train_a)
    elapsed = time.time() - start  # fitting time only; scoring is not timed
    msg = "%s - train_score : %.3f, test score : %.3f, time : %.5f s" % (name, model.score(X_train_a, y_train_a), model.score(X_test_a, y_test_a), elapsed)
    print(msg)

LR - train_score : 0.504, test score : 0.503, time : 0.47672 s
LDA - train_score : 0.543, test score : 0.544, time : 0.68903 s
QDA - train_score : 0.571, test score : 0.571, time : 0.51856 s
KNN - train_score : 0.899, test score : 0.880, time : 3.48614 s
DT - train_score : 0.957, test score : 0.903, time : 2.41355 s
RF - train_score : 0.957, test score : 0.922, time : 82.63604 s
XGB - train_score : 0.877, test score : 0.871, time : 21.80919 s
Light_GBM - train_score : 0.803, test score : 0.799, time : 3.02433 s


### 2.(3)-3. Performance Evaluation

In [22]:
# Evaluate every fitted model on the ADASYN test set
for name, model in models_a:
    print("----------ADASYN + %s model----------" % name)
    get_clf_eval(y_test_a, model.predict(X_test_a))

----------ADASYN + LR model----------
         Predict[0]  Predict[1]
True[0]       46314           0
True[1]       45795           0

Accuracy : 0.503
Precision : 0.000
Recall : 0.000
f1-score : 0.000
AUC : 0.500
Geometric mean : 0.000

----------ADASYN + LDA model----------
         Predict[0]  Predict[1]
True[0]       26120       20194
True[1]       21770       24025

Accuracy : 0.544
Precision : 0.543
Recall : 0.525
f1-score : 0.534
AUC : 0.544
Geometric mean : 0.544

----------ADASYN + QDA model----------
         Predict[0]  Predict[1]
True[0]       29358       16956
True[1]       22588       23207

Accuracy : 0.571
Precision : 0.578
Recall : 0.507
f1-score : 0.540
AUC : 0.570
Geometric mean : 0.567

----------ADASYN + KNN model----------
         Predict[0]  Predict[1]
True[0]       39399        6915
True[1]        4094       41701

Accuracy : 0.880
Precision : 0.858
Recall : 0.911
f1-score : 0.883
AUC : 0.881
Geometric mean : 0.880

----------ADASYN + DT model----------
         Predict[0]  Predict[1]
True[0]       39964        6350
True[1]

## 2.(4) Distribution-SMOTE

### 2.(4)-1. Data Preparation

In [23]:
X_d = pd.read_csv("X_smova.csv")
y_d = pd.read_csv("y_smova.csv")

In [24]:
X_d

Unnamed: 0.1,Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10
0,0,8.196231e+06,37.000000,14.000000,1.000000,2.000000,1.000000,20.000000,63.000000,1.000000,7.000000,13.000000
1,1,2.869872e+06,56.000000,19.000000,1.000000,2.000000,0.000000,19.000000,86.000000,1.000000,11.000000,12.000000
2,2,7.361429e+06,58.000000,2.000000,1.000000,2.000000,0.000000,50.000000,292.000000,12.000000,2.000000,12.000000
3,3,5.921974e+06,25.000000,3.000000,0.000000,2.000000,0.000000,50.000000,194.000000,6.000000,3.000000,11.000000
4,4,1.583605e+06,45.000000,13.000000,1.000000,2.000000,0.000000,5.000000,249.000000,13.000000,13.000000,14.000000
...,...,...,...,...,...,...,...,...,...,...,...,...
309419,309419,7.826657e+06,30.000000,3.000000,1.000000,2.000000,0.000000,16.000000,66.000000,2.000000,3.000000,10.000000
309420,309420,8.981043e+06,57.000000,8.000000,1.000000,2.000000,1.000000,47.000000,201.000000,11.000000,8.000000,12.000000
309421,309421,6.299623e+05,48.145033,15.698746,0.433751,2.000000,0.000000,24.963742,173.990159,19.794986,12.132498,13.433751
309422,309422,2.117133e+06,67.555042,4.304676,0.217626,0.435252,0.174101,39.730221,300.983823,10.569062,3.348201,10.696402


In [25]:
X_d.drop(['Unnamed: 0'], axis=1, inplace=True)
X_d

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10
0,8.196231e+06,37.000000,14.000000,1.000000,2.000000,1.000000,20.000000,63.000000,1.000000,7.000000,13.000000
1,2.869872e+06,56.000000,19.000000,1.000000,2.000000,0.000000,19.000000,86.000000,1.000000,11.000000,12.000000
2,7.361429e+06,58.000000,2.000000,1.000000,2.000000,0.000000,50.000000,292.000000,12.000000,2.000000,12.000000
3,5.921974e+06,25.000000,3.000000,0.000000,2.000000,0.000000,50.000000,194.000000,6.000000,3.000000,11.000000
4,1.583605e+06,45.000000,13.000000,1.000000,2.000000,0.000000,5.000000,249.000000,13.000000,13.000000,14.000000
...,...,...,...,...,...,...,...,...,...,...,...
309419,7.826657e+06,30.000000,3.000000,1.000000,2.000000,0.000000,16.000000,66.000000,2.000000,3.000000,10.000000
309420,8.981043e+06,57.000000,8.000000,1.000000,2.000000,1.000000,47.000000,201.000000,11.000000,8.000000,12.000000
309421,6.299623e+05,48.145033,15.698746,0.433751,2.000000,0.000000,24.963742,173.990159,19.794986,12.132498,13.433751
309422,2.117133e+06,67.555042,4.304676,0.217626,0.435252,0.174101,39.730221,300.983823,10.569062,3.348201,10.696402
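
Unlike the SMOTE and ADASYN files, some rows here contain fractional values even in columns that appear to be label-encoded categories, for example 0.433751 in column `3`. Assuming the column order matches the feature frame built in section 2.(1)-1 (so that columns `3` to `8` are the six label-encoded variables), a simple check such as the one below flags those rows. Whether such values should be mapped back to valid category codes is a modeling decision that this notebook does not address.

```python
# Two rows copied from the displayed tail of X_d (columns 0..10)
rows = {
    309420: [8.981043e+06, 57.0, 8.0, 1.0, 2.0, 1.0, 47.0, 201.0, 11.0, 8.0, 12.0],
    309421: [6.299623e+05, 48.145033, 15.698746, 0.433751, 2.0, 0.0,
             24.963742, 173.990159, 19.794986, 12.132498, 13.433751],
}

# Assumption: columns 3..8 hold the label-encoded categorical variables
categorical_columns = range(3, 9)

def fractional_categoricals(row):
    """Return the categorical column indices whose value is not a whole number."""
    return [c for c in categorical_columns if not float(row[c]).is_integer()]

flagged = {index: fractional_categoricals(row) for index, row in rows.items()}
for index, cols in flagged.items():
    print(index, cols)
```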


In [26]:
y_d = y_d['0']
y_d

0         0
1         0
2         1
3         0
4         0
         ..
309419    1
309420    1
309421    1
309422    1
309423    1
Name: 0, Length: 309424, dtype: int64

In [27]:
from sklearn.model_selection import train_test_split

X_train_d, X_test_d, y_train_d, y_test_d = train_test_split(X_d, y_d, test_size = 0.3, random_state = 101)

### 2.(4)-2. Model Fitting

In [28]:
models_d = []
models_d.append(('LR', LogisticRegression(max_iter=5000)))  # logistic regression classifier
models_d.append(('LDA', LinearDiscriminantAnalysis()))  # LDA model
models_d.append(('QDA', QuadraticDiscriminantAnalysis()))  # QDA model
models_d.append(('KNN', KNeighborsClassifier()))  # KNN model
models_d.append(('DT', DecisionTreeClassifier()))  # decision tree model
models_d.append(('RF', RandomForestClassifier()))  # random forest model
models_d.append(('XGB', XGBClassifier()))  # XGBoost model
models_d.append(('Light_GBM', LGBMClassifier()))  # LightGBM model

for name, model in models_d:
    start = time.time()
    model.fit(X_train_d, y_train_d)
    elapsed = time.time() - start  # fitting time only; scoring is not timed
    msg = "%s - train_score : %.3f, test score : %.3f, time : %.5f s" % (name, model.score(X_train_d, y_train_d), model.score(X_test_d, y_test_d), elapsed)
    print(msg)

LR - train_score : 0.501, test score : 0.499, time : 0.48117 s
LDA - train_score : 0.535, test score : 0.536, time : 0.55950 s
QDA - train_score : 0.558, test score : 0.561, time : 0.28427 s
KNN - train_score : 0.915, test score : 0.901, time : 2.75273 s
DT - train_score : 0.958, test score : 0.926, time : 2.78638 s
RF - train_score : 0.958, test score : 0.940, time : 63.98812 s
XGB - train_score : 0.921, test score : 0.917, time : 25.08321 s
Light_GBM - train_score : 0.855, test score : 0.849, time : 2.79054 s


### 2.(4)-3. Performance Evaluation

In [37]:
# Evaluate every fitted model on the Distribution-SMOTE test set
for name, model in models_d:
    print("----------Distribution-SMOTE + %s model----------" % name)
    get_clf_eval(y_test_d, model.predict(X_test_d))

----------Distribution-SMOTE + LR model----------
         Predict[0]  Predict[1]
True[0]       46299           0
True[1]       46529           0

Accuracy : 0.499
Precision : 0.000
Recall : 0.000
f1-score : 0.000
AUC : 0.500
Geometric mean : 0.000

----------Distribution-SMOTE + LDA model----------
         Predict[0]  Predict[1]
True[0]       23538       22761
True[1]       20332       26197

Accuracy : 0.536
Precision : 0.535
Recall : 0.563
f1-score : 0.549
AUC : 0.536
Geometric mean : 0.535

----------Distribution-SMOTE + QDA model----------
         Predict[0]  Predict[1]
True[0]       16378       29921
True[1]       10841       35688

Accuracy : 0.561
Precision : 0.544
Recall : 0.767
f1-score : 0.637
AUC : 0.560
Geometric mean : 0.521

----------Distribution-SMOTE + KNN model----------
         Predict[0]  Predict[1]
True[0]       40173        6126
True[1]        3064       43465

Accuracy : 0.901
Precision : 0.876
Recall : 0.934
f1-score : 0.904
AUC : 0.901
Geometric mean : 0.900

----------Distribution-SMOTE + DT model----------
         Predict[0]  Predict[1]
True[0]       40907        5392
True[1]