# Machine Learning Model: XGBoost 

### Background 

This notebook fits an XGBoost gradient boosting classifier to training data balanced with random oversampling (ROS) and with SMOTE, in order to build a predictive model. The results are then evaluated using the following metrics:

1. Accuracy score
2. Confusion matrix
3. ROC AUC
4. Feature importance
5. Mean squared error (MSE)
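
As a rough illustration of how most of these metrics can be computed with scikit-learn, the sketch below uses small made-up arrays (`y_true` and `y_prob` are hypothetical, not this notebook's data). It uses `balanced_accuracy_score`, which is the accuracy variant imported later in the notebook, and it omits feature importance because that requires a fitted model.

```python
import numpy as np
from sklearn.metrics import (
    balanced_accuracy_score,
    confusion_matrix,
    mean_squared_error,
    roc_auc_score,
)

# Hypothetical ground-truth labels and predicted probabilities for class 1
y_true = np.array([0, 0, 0, 0, 1, 1])
y_prob = np.array([0.1, 0.4, 0.2, 0.7, 0.8, 0.3])

# Convert probabilities to hard class predictions with a 0.5 threshold
y_pred = (y_prob >= 0.5).astype(int)

# Balanced accuracy averages the recall of each class
bal_acc = balanced_accuracy_score(y_true, y_pred)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true, y_pred)

# ROC AUC is computed from the probabilities, not the hard predictions
auc = roc_auc_score(y_true, y_prob)

# For 0/1 labels, MSE equals the fraction of misclassified samples
mse = mean_squared_error(y_true, y_pred)

print(bal_acc, auc, mse)
print(cm)
```

Because the classes here are imbalanced (four negatives, two positives), balanced accuracy and plain accuracy would differ, which is one reason to prefer the balanced variant on oversampled problems like this one.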

### Objective 

The purpose is to evaluate how well the model predicts the likelihood that account holders of a financial institution will take out a personal loan. Additionally, the model's performance will be compared against two other machine learning algorithms: Logistic Regression and Random Forest Classifier.
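
One way to set up such a comparison is to fit each candidate on the same training split and score it on the same held-out split. The sketch below does this for the two baseline algorithms on synthetic data from `make_classification`. The dataset, the split, the `_demo` variable names, and the hyperparameters are all assumptions made for illustration; an XGBoost model could be added to the `models` dictionary in the same way.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced binary data standing in for the loan dataset
X_demo, y_demo = make_classification(
    n_samples=1000, n_features=10, weights=[0.9, 0.1], random_state=42
)
X_train_demo, X_test_demo, y_train_demo, y_test_demo = train_test_split(
    X_demo, y_demo, stratify=y_demo, random_state=42
)

# Candidate models, keyed by a display name
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

# Fit each model on the same split and record its held-out score
scores = {}
for name, model in models.items():
    model.fit(X_train_demo, y_train_demo)
    y_pred_demo = model.predict(X_test_demo)
    scores[name] = balanced_accuracy_score(y_test_demo, y_pred_demo)

for name, score in scores.items():
    print(f"{name}: balanced accuracy = {score:.3f}")
```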

---


## Initial Imports & Dependencies

In [1]:
# Importing Libraries & Dependencies
from pathlib import Path
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

%matplotlib inline

In [None]:
# Import Machine Learning Model: XGBoost
from xgboost import XGBClassifier
from xgboost import plot_importance

In [4]:
# Import scikit-learn metrics
from sklearn.metrics import balanced_accuracy_score
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report
from sklearn.metrics import mean_squared_error
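# Note: plot_roc_curve was removed in scikit-learn 1.2; newer versions provide RocCurveDisplay.from_estimator instead
from sklearn.metrics import plot_roc_curve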
from sklearn.metrics import roc_auc_score

In [5]:
# Import Warnings
import warnings
warnings.filterwarnings('ignore')
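
Ignoring every warning can also hide deprecation notices from pandas, scikit-learn, or XGBoost. A narrower alternative, sketched here with the standard library only, is to silence a single category and to scope the filter to a block with `warnings.catch_warnings()`. The `noisy` function is a hypothetical stand-in for a library call.

```python
import warnings


def noisy():
    # Hypothetical stand-in for a library call that emits a warning
    warnings.warn("this API will change", FutureWarning)
    return 1


# Suppress only FutureWarning, and only inside this block
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=FutureWarning)
    result = noisy()

# Record warnings instead of printing them, to show they are still raised
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    noisy()
    n_caught = len(caught)
    caught_category = caught[0].category

print(result, n_caught, caught_category.__name__)
```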

---

## Load Datasets

In [8]:
# Import the SMOTE, random oversampled (ROS), and scaled feature sets
# Loads X_train_smote, X_train_ros, X_train_scaled, X_test_scaled
X_train_smote = np.loadtxt('resources/X_train_smote.csv', delimiter=',')
X_train_ros = np.loadtxt('resources/X_train_ros.csv', delimiter=',')
X_train_scaled = np.loadtxt('resources/X_train_scaled.csv', delimiter=',')
X_test_scaled = np.loadtxt('resources/X_test_scaled.csv', delimiter=',')

# Import the SMOTE and random oversampled (ROS) targets
# Loads y_train_smote, y_train_ros
# Note: read_csv(squeeze=True) was removed in pandas 2.0; .squeeze('columns') is the equivalent
y_train_smote = pd.read_csv('resources/y_train_smote.csv', sep=',', header=0).squeeze('columns')
y_train_ros = pd.read_csv('resources/y_train_ros.csv', sep=',', header=0).squeeze('columns')


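# Loads y_train, y_test (the first CSV column holds the index)
# .squeeze('columns') replaces read_csv(squeeze=True), which was removed in pandas 2.0
y_train = pd.read_csv('resources/y_train.csv', sep=',', header=0, index_col=0).squeeze('columns')
y_test = pd.read_csv('resources/y_test.csv', sep=',', header=0, index_col=0).squeeze('columns')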
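
For reference, files in this layout can be written and read back as sketched below, using a temporary directory and made-up arrays. How the `resources/` files were actually produced is not shown in this notebook, so the `np.savetxt` and `to_csv` calls, along with the `_demo` names, are assumptions.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd

# Made-up feature matrix and target vector
X_demo = np.array([[0.1, 1.2], [0.3, -0.4], [1.5, 0.0]])
y_demo = pd.Series([0, 1, 0], name="target")

with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)

    # Features: plain comma-separated numbers with no header row
    np.savetxt(tmp_dir / "X_demo.csv", X_demo, delimiter=",")
    X_loaded = np.loadtxt(tmp_dir / "X_demo.csv", delimiter=",")

    # Target: one-column CSV with a header row and the index in the first column
    y_demo.to_csv(tmp_dir / "y_demo.csv")
    y_loaded = pd.read_csv(tmp_dir / "y_demo.csv", index_col=0).squeeze("columns")

# The features come back as a 2-D array and the target as a Series
print(X_loaded.shape, type(y_loaded).__name__)
```

A quick shape check like the final line is a cheap way to confirm that each feature matrix and its target have the same number of rows before fitting a model.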