# Model metrics summaries
All metrics were taken after hyperparameter tuning.

Because the **target variable appeared imbalanced**, the working assumption was that **oversampling through SMOTE** would improve model performance.
To test this, performance metrics were logged for each model both without and with SMOTE.
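
One way to read the before/after comparison is to put the minority-class rows side by side. The sketch below copies the `Attrited Customer` precision, recall, and f1-score from the reports in this document and prints the change after SMOTE. The dictionary layout and the helper name `smote_deltas` are illustrative assumptions, not part of the original notebook.

```python
# Minority-class ("Attrited Customer") scores copied from the reports in this
# document: (precision, recall, f1) without SMOTE, then with SMOTE.
reports = {
    "Logistic regression":  ((0.76, 0.52, 0.62), (0.54, 0.83, 0.65)),
    "Random Forest":        ((0.93, 0.76, 0.83), (0.88, 0.84, 0.86)),
    "AdaBoost":             ((0.90, 0.83, 0.86), (0.86, 0.88, 0.87)),
    "K-Nearest Neighbours": ((0.83, 0.52, 0.64), (0.53, 0.81, 0.64)),
    "Decision Tree":        ((0.81, 0.80, 0.81), (0.71, 0.84, 0.77)),
    "Gradient Boost":       ((0.93, 0.87, 0.90), (0.86, 0.89, 0.88)),
}


def smote_deltas(reports):
    """Return {model: (d_precision, d_recall, d_f1)}, SMOTE minus no SMOTE."""
    return {
        model: tuple(
            round(after - before, 2) for before, after in zip(no_smote, smote)
        )
        for model, (no_smote, smote) in reports.items()
    }


deltas = smote_deltas(reports)
print(f"{'model':<22}{'precision':>10}{'recall':>10}{'f1':>10}")
for model, (d_precision, d_recall, d_f1) in deltas.items():
    print(f"{model:<22}{d_precision:>+10.2f}{d_recall:>+10.2f}{d_f1:>+10.2f}")
```

In these reports, recall on the attrited class rises for every model while precision falls, so whether SMOTE counts as an improvement depends on which of the two matters more for the use case.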

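**Disclaimer**: For ease of use, SMOTE (`imblearn.over_sampling.SMOTE`) was applied once, before pipeline creation. Resampling outside the pipeline can cause data leakage during cross-validation, because synthetic samples may be interpolated from rows that later land in a validation fold. Additional tests should be run with `imblearn.pipeline`, which applies the resampling step only when fitting on the training portion of each fold.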


## Evaluation
- **Train & Test Scores**: The accuracy on the training and test sets.
- **ROC AUC**: Area under the ROC curve.
- **Classification Report**: A per-class breakdown of precision, recall, f1-score, and support, followed by overall accuracy and the macro and weighted averages.
- **Cross Validation**: An additional layer of validation, confirming that the selected parameters perform well across different subsets of the training data. Reported as mean test accuracy, mean train accuracy, mean fit time, and mean score time.
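
To make the relationship between the rows of a classification report concrete, the sketch below recomputes a few of them by hand. It uses the two-decimal precision, recall, and support values from the "Logistic regression / No SMOTE" report in this document. Because the inputs are already rounded, the results agree with the report only to about two decimals. The variable and function names are illustrative.

```python
# (precision, recall, support) per class, copied from the
# "Logistic regression / No SMOTE" report in this document.
rows = {
    "Attrited Customer": (0.76, 0.52, 327),
    "Existing Customer": (0.91, 0.97, 1699),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


total_support = sum(support for _, _, support in rows.values())

# Macro average: every class counts equally, regardless of size.
macro_recall = sum(recall for _, recall, _ in rows.values()) / len(rows)

# Weighted average: every class counts in proportion to its support.
weighted_precision = sum(p * s for p, _, s in rows.values()) / total_support
weighted_recall = sum(r * s for _, r, s in rows.values()) / total_support

for name, (precision, recall, _) in rows.items():
    print(f"{name}: f1 = {f1(precision, recall):.2f}")
print(f"macro recall       = {macro_recall:.3f}")
print(f"weighted precision = {weighted_precision:.3f}")
print(f"weighted recall    = {weighted_recall:.3f}")
```

The support-weighted recall is the fraction of all samples classified correctly, which is why it lines up with the accuracy row and the test score. The gap between macro recall and weighted recall reflects how much the majority class dominates the weighted figures.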

## Logistic Regression


### No SMOTE
```
Train Score:  0.906
Test Score:  0.897
ROC_AUC:  0.916
                   precision    recall  f1-score   support

Attrited Customer       0.76      0.52      0.62       327
Existing Customer       0.91      0.97      0.94      1699

         accuracy                           0.90      2026
        macro avg       0.84      0.75      0.78      2026
     weighted avg       0.89      0.90      0.89      2026

Mean Test Accuracy: 0.906
Mean Train Accuracy: 0.906
Mean Fit Time: 0.096
Mean Score Time: 0.015

```


### SMOTE
```
Train Score:  0.853
Test Score:  0.858
ROC_AUC:  0.914
                   precision    recall  f1-score   support

Attrited Customer       0.54      0.83      0.65       327
Existing Customer       0.96      0.86      0.91      1699

         accuracy                           0.86      2026
        macro avg       0.75      0.84      0.78      2026
     weighted avg       0.89      0.86      0.87      2026

Mean Test Accuracy: 0.851
Mean Train Accuracy: 0.851
Mean Fit Time: 0.147
Mean Score Time: 0.011

```

## Random Forest

### No SMOTE
```
Train Score:  1.0
Test Score:  0.951
ROC_AUC:  0.989
                   precision    recall  f1-score   support

Attrited Customer       0.93      0.76      0.83       327
Existing Customer       0.95      0.99      0.97      1699

         accuracy                           0.95      2026
        macro avg       0.94      0.87      0.90      2026
     weighted avg       0.95      0.95      0.95      2026

Mean Test Accuracy: 0.957
Mean Train Accuracy: 1.0
Mean Fit Time: 4.469
Mean Score Time: 0.092


```


### SMOTE
```
Train Score:  1.0
Test Score:  0.956
ROC_AUC:  0.987
                   precision    recall  f1-score   support

Attrited Customer       0.88      0.84      0.86       327
Existing Customer       0.97      0.98      0.97      1699

         accuracy                           0.96      2026
        macro avg       0.92      0.91      0.92      2026
     weighted avg       0.96      0.96      0.96      2026

Mean Test Accuracy: 0.957
Mean Train Accuracy: 1.0
Mean Fit Time: 6.593
Mean Score Time: 0.076

```

## AdaBoost

### No SMOTE
```
Train Score:  0.971
Test Score:  0.958
ROC_AUC:  0.986
                   precision    recall  f1-score   support

Attrited Customer       0.90      0.83      0.86       327
Existing Customer       0.97      0.98      0.97      1699

         accuracy                           0.96      2026
        macro avg       0.93      0.91      0.92      2026
     weighted avg       0.96      0.96      0.96      2026
    
Mean Test Accuracy: 0.963
Mean Train Accuracy: 0.971
Mean Fit Time: 1.784
Mean Score Time: 0.047

```

### SMOTE
```
Train Score:  0.968
Test Score:  0.958
ROC_AUC:  0.985
                   precision    recall  f1-score   support

Attrited Customer       0.86      0.88      0.87       327
Existing Customer       0.98      0.97      0.97      1699

         accuracy                           0.96      2026
        macro avg       0.92      0.93      0.92      2026
     weighted avg       0.96      0.96      0.96      2026

Mean Test Accuracy: 0.958
Mean Train Accuracy: 0.97
Mean Fit Time: 8.957
Mean Score Time: 0.088

```

## K-Nearest Neighbours

### No SMOTE
```
Train Score:  1.0
Test Score:  0.906
ROC_AUC:  0.925
                   precision    recall  f1-score   support

Attrited Customer       0.83      0.52      0.64       327
Existing Customer       0.91      0.98      0.95      1699

         accuracy                           0.91      2026
        macro avg       0.87      0.75      0.79      2026
     weighted avg       0.90      0.91      0.90      2026

Mean Test Accuracy: 0.911
Mean Train Accuracy: 1.0
Mean Fit Time: 0.075
Mean Score Time: 0.09

```


### SMOTE
```
Train Score:  0.941
Test Score:  0.853
ROC_AUC:  0.882
                   precision    recall  f1-score   support

Attrited Customer       0.53      0.81      0.64       327
Existing Customer       0.96      0.86      0.91      1699

         accuracy                           0.85      2026
        macro avg       0.74      0.84      0.77      2026
     weighted avg       0.89      0.85      0.86      2026

Mean Test Accuracy: 0.863
Mean Train Accuracy: 0.939
Mean Fit Time: 0.126
Mean Score Time: 0.104

```

## Decision Tree

### No SMOTE
```
Train Score:  0.967
Test Score:  0.938
ROC_AUC:  0.93
                   precision    recall  f1-score   support

Attrited Customer       0.81      0.80      0.81       327
Existing Customer       0.96      0.96      0.96      1699

         accuracy                           0.94      2026
        macro avg       0.89      0.88      0.88      2026
     weighted avg       0.94      0.94      0.94      2026

Mean Test Accuracy: 0.939
Mean Train Accuracy: 0.969
Mean Fit Time: 0.09
Mean Score Time: 0.011

```

### SMOTE
```
Train Score:  0.97
Test Score:  0.92
ROC_AUC:  0.923
                   precision    recall  f1-score   support

Attrited Customer       0.71      0.84      0.77       327
Existing Customer       0.97      0.93      0.95      1699

         accuracy                           0.92      2026
        macro avg       0.84      0.89      0.86      2026
     weighted avg       0.93      0.92      0.92      2026

Mean Test Accuracy: 0.929
Mean Train Accuracy: 0.972
Mean Fit Time: 0.363
Mean Score Time: 0.012

```

## Gradient Boost

### No SMOTE
```
Train Score:  1.0
Test Score:  0.968
ROC_AUC:  0.992
                   precision    recall  f1-score   support

Attrited Customer       0.93      0.87      0.90       327
Existing Customer       0.98      0.99      0.98      1699

         accuracy                           0.97      2026
        macro avg       0.95      0.93      0.94      2026
     weighted avg       0.97      0.97      0.97      2026

Mean Test Accuracy: 0.974
Mean Train Accuracy: 1.0
Mean Fit Time: 13.788
Mean Score Time: 0.025

```


### SMOTE
```
Train Score:  0.966
Test Score:  0.96
ROC_AUC:  0.988
                   precision    recall  f1-score   support

Attrited Customer       0.86      0.89      0.88       327
Existing Customer       0.98      0.97      0.98      1699

         accuracy                           0.96      2026
        macro avg       0.92      0.93      0.93      2026
     weighted avg       0.96      0.96      0.96      2026

Mean Test Accuracy: 0.957
Mean Train Accuracy: 0.968
Mean Fit Time: 4.899
Mean Score Time: 0.015

```

# Model Performance graph

In [None]:
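# Metrics per model configuration, stored in one dict keyed by configuration
# so that each entry no longer overwrites the variables of the previous one.
model_metrics = {
    "logreg no preprocessing": {
        "R2": 0.89,
        "AUC": 0.899,
        "F1": 0.89,
        "Accuracy_avg": 0.75,
        "Precision_avg": 0.83,
        "Recall_avg": 0.71,
        "Runtime_avg": 1.4,
    },
    "logreg standard scaler": {
        "R2": 0.905,
        "AUC": 0.917,
        "F1": 0.9,
        "Accuracy_avg": 0.79,
        "Precision_avg": 0.84,
        "Recall_avg": 0.75,
        "Runtime_avg": 0.07,
    },
    # "logreg SMOTE": no values recorded yet
    # The recorded values below match "logreg standard scaler" exactly.
    "forest standard scaler": {
        "R2": 0.905,
        "AUC": 0.917,
        "F1": 0.9,
        "Accuracy_avg": 0.79,
        "Precision_avg": 0.84,
        "Recall_avg": 0.75,
        "Runtime_avg": 0.07,
    },
}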