# Loan Default Prediction - Modeling

For the modeling stage, we want to showcase the following machine learning tools:

<b>Keras/TensorFlow:</b> Keras is user-friendly and works well for deep learning. Because our dataset is fairly large and moderately complex, a neural network might capture the underlying patterns well. We will start with a simple feedforward neural network and gradually increase complexity if needed.


<b>PyTorch:</b> If possible, we may also look into PyTorch, which is well suited to custom model architectures and gives us more flexibility. It is a bit more challenging than Keras, but it is powerful. We would begin with a basic architecture and explore more complex layers and structures if needed.

At the end we will compare the two. 


#### Imports

In [20]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.utils.class_weight import compute_class_weight
from imblearn.over_sampling import SMOTE

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from sklearn.metrics import roc_auc_score, classification_report, confusion_matrix



ImportError: cannot import name '_MissingValues' from 'sklearn.utils._param_validation' (C:\Users\darde\anaconda3\Lib\site-packages\sklearn\utils\_param_validation.py)

In [4]:
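# Note: index_col=0 turns the first CSV column into the index. In the output below
# that column is Age, so Age is not part of the feature matrix built later.
df = pd.read_csv("../data/loan_data_cleaned.csv", index_col=0)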

df.head()

| Age (index) | Income | LoanAmount | CreditScore | MonthsEmployed | NumCreditLines | InterestRate | LoanTerm | DTIRatio | Education | EmploymentType | MaritalStatus | HasMortgage | HasDependents | LoanPurpose | HasCoSigner | Default |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 56 | 85994 | 50587 | 520 | 80 | 4 | 15.23 | 36 | 0.44 | 0 | 0 | 0 | 1 | 1 | 4 | 1 | 0 |
| 69 | 50432 | 124440 | 458 | 15 | 1 | 4.81 | 60 | 0.68 | 2 | 0 | 1 | 0 | 0 | 4 | 1 | 0 |
| 46 | 84208 | 129188 | 451 | 26 | 3 | 21.17 | 24 | 0.31 | 2 | 3 | 0 | 1 | 1 | 0 | 0 | 1 |
| 32 | 31713 | 44799 | 743 | 0 | 3 | 7.07 | 24 | 0.23 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 |
| 60 | 20437 | 9139 | 633 | 8 | 4 | 6.51 | 48 | 0.73 | 0 | 3 | 0 | 0 | 1 | 0 | 0 | 0 |


### Train-Test Split

In [5]:
# we are doing an 80/20 split

X = df.drop('Default', axis=1)  
y = df['Default']  

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=99)

## Neural Network with Keras/TensorFlow
We'll start by building a simple feedforward neural network (multilayer perceptron) for classification.

Our model has an input layer, three hidden layers (64, 32, and 16 units), and an output layer. We use ReLU for the hidden layers and sigmoid for the output layer since this is a binary classification problem.
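To get a feel for how small this network is, we can count its trainable parameters by hand. The sketch below assumes 15 input features (the columns shown in df.head() other than the Age index and the Default target) and the layer sizes used in the code cell further down (64, 32, 16, 1). The helper name dense_param_count is made up for this illustration.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases for a stack of fully connected layers.

    layer_sizes lists the input width followed by each layer's unit count.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus one bias per unit
    return total

# Assumption: 15 input features, read off the df.head() output.
n_features = 15
total_params = dense_param_count([n_features, 64, 32, 16, 1])
print(total_params)
```

If the real feature count differs, only the first layer's share of the total changes.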

We will train the model for 10 epochs (an epoch is one full pass through the training set) and adjust depending on performance.

Note:
batch_size: This controls how many samples are processed before the model's internal parameters are updated.
validation_split=0.2: This holds out 20% of the training data for validation during training, which helps monitor the model's performance on data it is not fitted on.
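As a rough illustration of how these two settings interact, the sketch below estimates how many parameter updates happen in one epoch. The training-set size of 1000 is hypothetical and the helper name updates_per_epoch is ours. Keras takes the validation rows from the end of the data before shuffling; the rounding here is a simplification and may differ by a row from what Keras does internally.

```python
import math

def updates_per_epoch(n_samples, validation_split, batch_size):
    """Approximate number of parameter updates in one epoch."""
    n_val = int(n_samples * validation_split)   # rows held out for validation
    n_train = n_samples - n_val                 # rows actually used for fitting
    return math.ceil(n_train / batch_size)      # the last batch may be smaller

# Hypothetical training-set size, chosen only to keep the arithmetic easy to follow.
steps = updates_per_epoch(n_samples=1000, validation_split=0.2, batch_size=32)
print(steps)
```

A smaller batch size means more, noisier updates per epoch; a larger one means fewer, smoother updates.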

In [7]:

model = Sequential()


model.add(Dense(64, input_dim=X_train.shape[1], activation='relu'))  

model.add(Dense(32, activation='relu'))
model.add(Dense(16, activation='relu'))

model.add(Dense(1, activation='sigmoid'))

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

history = model.fit(X_train, y_train, validation_split=0.2, epochs=10, batch_size=32)


y_pred_prob = model.predict(X_test)
y_pred_class = (y_pred_prob > 0.5).astype("int32")

# Evaluate the model
auc = roc_auc_score(y_test, y_pred_prob)
print(f"AUC-ROC Score: {auc}")


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
AUC-ROC Score: 0.5


#### Keras/TensorFlow Evaluation



In [10]:

auc = roc_auc_score(y_test, y_pred_prob)
print(f"AUC-ROC Score: {auc}")

print("Classification Report:\n", classification_report(y_test, y_pred_class))

print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred_class))

AUC-ROC Score: 0.5
Classification Report:
               precision    recall  f1-score   support

           0       0.88      1.00      0.94     45167
           1       0.00      0.00      0.00      5903

    accuracy                           0.88     51070
   macro avg       0.44      0.50      0.47     51070
weighted avg       0.78      0.88      0.83     51070

Confusion Matrix:
 [[45167     0]
 [ 5903     0]]




##### Observation
This shows that the model performs poorly: it does not distinguish between defaults and non-defaults at all.

The model is biased towards the majority class (class 0) because there are many more non-defaults than defaults. It learned to predict non-default every time, which gives high accuracy (0.88) but zero recall for the default cases.

To fix this we will use class weights, which assign more weight to the minority class to encourage the model to pay more attention to it.
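The gap between accuracy and recall can be read directly off the confusion matrix printed above. The short calculation below uses only those four counts; the variable names are ours.

```python
# Counts copied from the confusion matrix printed above:
# rows are true classes (0, 1), columns are predicted classes (0, 1).
tn, fp = 45167, 0
fn, tp = 5903, 0

total = tn + fp + fn + tp
accuracy = (tp + tn) / total
recall_default = tp / (tp + fn)      # share of real defaults that were caught
majority_share = (tn + fp) / total   # accuracy of always predicting class 0

print(f"accuracy:         {accuracy:.4f}")
print(f"recall (class 1): {recall_default:.4f}")
print(f"majority share:   {majority_share:.4f}")
```

Accuracy equals the majority-class share here, which is why accuracy alone is a misleading metric for this dataset.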

### Class Weights

In [17]:
class_weights = compute_class_weight('balanced', classes=np.array([0, 1]), y=y_train)
class_weights_dict = {0: class_weights[0], 1: class_weights[1]}

print("Class Weights: ", class_weights_dict)

Class Weights:  {0: 0.5657796340713578, 1: 4.300568421052631}
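The 'balanced' option in scikit-learn weights each class by n_samples / (n_classes * class_count). The sketch below reimplements that formula in plain Python on hypothetical toy labels so the arithmetic is easy to follow; the helper name balanced_class_weights is ours and is not part of scikit-learn.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in sorted(counts.items())}

# Hypothetical toy labels: nine non-defaults and one default.
toy_labels = [0] * 9 + [1]
weights = balanced_class_weights(toy_labels)
print(weights)
```

Under this formula the ratio of the two weights equals the inverse ratio of the class counts, so the weights printed above (about 4.30 versus 0.57) imply roughly 7.6 non-defaults for every default in the training set.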


Next, we continue training the model with these class weights and check loss and accuracy on the test set before looking at the predictions in detail.

In [18]:
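# Note: this continues training the already-fitted model from its current weights;
# rebuild the model first if a fresh start is wanted.
history = model.fit(X_train, y_train, validation_split=0.2, epochs=10, batch_size=32, class_weight=class_weights_dict)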
loss, accuracy = model.evaluate(X_test, y_test)
print(f"Test Loss: {loss}")
print(f"Test Accuracy: {accuracy}")

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test Loss: 0.6734201908111572
Test Accuracy: 0.8844135403633118


In [19]:
# Prediction
y_pred_prob = model.predict(X_test)
y_pred_class = (y_pred_prob > 0.5).astype("int32")

#Evaluation
auc = roc_auc_score(y_test, y_pred_prob)
print(f"AUC-ROC Score: {auc}")
print("Classification Report:\n", classification_report(y_test, y_pred_class))
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred_class))

AUC-ROC Score: 0.5
Classification Report:
               precision    recall  f1-score   support

           0       0.88      1.00      0.94     45167
           1       0.00      0.00      0.00      5903

    accuracy                           0.88     51070
   macro avg       0.44      0.50      0.47     51070
weighted avg       0.78      0.88      0.83     51070

Confusion Matrix:
 [[45167     0]
 [ 5903     0]]




#### Observation
The class weights did not improve the model. It is still no better than random guessing:

- AUC-ROC score: 0.5
- Recall and precision for class 1 (defaults) are still 0.
- Confusion matrix: all instances are classified as non-defaults.
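An AUC of exactly 0.5 is what we would see if the model produced the same probability for every row, since every positive/negative pair is then a tie. The sketch below shows this with a small rank-based AUC function on hypothetical labels and scores; auc_from_scores is our own helper, not the scikit-learn implementation. In the notebook, a quick way to test this would be to look at np.unique(y_pred_prob). If the predictions do turn out to be constant, the cause may lie upstream of the class imbalance (for example in how the inputs are scaled), though that is a guess this notebook does not verify.

```python
def auc_from_scores(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win, which matches the usual rank-based definition.
    """
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical labels and scores, for illustration only.
y_true = [0, 0, 0, 1, 1]
constant_scores = [0.1] * 5               # the same output for every row
varied_scores = [0.1, 0.2, 0.3, 0.4, 0.9]

print(auc_from_scores(y_true, constant_scores))
print(auc_from_scores(y_true, varied_scores))
```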

<b>Possible Reasons:</b>

- The class imbalance might still be too strong, so even with class weights the model struggles to predict defaults.
- Our architecture may not be well suited to this problem.
- Our features might not be sufficient for the model.

<b>Next Steps</b>

- Increase the class weights further.
- Use SMOTE.
- Add more layers or neurons.
- Use an ensemble model such as XGBoost or Random Forest.
- Lower the threshold for classifying defaults (try 0.3 instead of 0.5).
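To make the threshold idea concrete, the sketch below applies two cut-offs to a handful of hypothetical predicted probabilities. The helper classify and the numbers are ours, for illustration only.

```python
def classify(probabilities, threshold):
    """Label a row as default (1) when its predicted probability exceeds the threshold."""
    return [1 if p > threshold else 0 for p in probabilities]

# Hypothetical predicted default probabilities, for illustration only.
probs = [0.05, 0.25, 0.35, 0.45, 0.60]

preds_at_05 = classify(probs, 0.5)
preds_at_03 = classify(probs, 0.3)
print(preds_at_05)
print(preds_at_03)
```

Lowering the threshold trades precision for recall, but it only helps if the predicted probabilities actually differ between rows; it does not change the AUC.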

### SMOTE 
We will try SMOTE (Synthetic Minority Over-sampling Technique) next, in the hope that it balances the dataset better than class weights did.
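The core idea behind SMOTE is to create new minority-class rows by interpolating between an existing minority row and one of its nearest minority neighbors. The sketch below shows only that interpolation step on two hypothetical rows; it is a simplified illustration, not the imblearn implementation, and smote_like_sample is a made-up helper. In the notebook itself we would expect to use the SMOTE class imported earlier, calling its fit_resample method on the training data only so that the test set stays untouched.

```python
import random

def smote_like_sample(point, neighbor, rng):
    """Create a synthetic point on the line segment between two minority samples."""
    lam = rng.random()  # interpolation factor in [0, 1)
    return [a + lam * (b - a) for a, b in zip(point, neighbor)]

# Hypothetical minority-class rows with two features each.
rng = random.Random(99)
minority_a = [1.0, 10.0]
minority_b = [3.0, 20.0]

synthetic = smote_like_sample(minority_a, minority_b, rng)
print(synthetic)
```

Because the synthetic point always lies between two real minority rows, categorical columns encoded as integers can end up with fractional values, which is worth keeping in mind for this dataset.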