# Student Loan: Neural Network

In this chapter we will use the **keras** package to predict student loan prepayments.  In particular, we will use dense feed-forward neural networks.

## Importing Packages

Let's begin by importing some initial packages that we will need.

In [None]:
import pandas as pd
import numpy as np
import sklearn

## Reading-In Data

Now we can read-in our data.

In [None]:
df_train = pd.read_csv('../data/student_loan.csv')

## Feature Selection

Next, let's select our features and organize our labels.  Notice that we are excluding `cosign` and `repay_status` because they are categorical variables.

In [None]:
lst_features = [
    'loan_age', 'income_annual', 'upb',
    'monthly_payment', 'fico', 'origbalance',
    'mos_to_repay', 'mos_to_balln',
]
df_X = df_train[lst_features]

In [None]:
df_y = df_train['paid_label']

## Holdout Set

We will want to create a holdout set to measure out-of-sample performance.  The following code uses `train_test_split()` to do that.

In [None]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(df_X, df_y, test_size=0.20, random_state=0)

## Normalization

Next, let's perform a Gaussian normalization of our features (subtract the mean, then divide by the standard deviation) so that they are all the same order of magnitude and have similar variability.

In [None]:
mu = X_train.mean()
std = X_train.std()

Notice that we are scaling both the training set and testing set with the mean and standard deviation of the training set.  It is important not to normalize the test set with its own mean and standard deviation, because doing so would leak information from the test data into the evaluation.

In [None]:
X_train_scaled = (X_train - mu) / std
X_test_scaled = (X_test - mu) / std
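As a small illustration of reusing the training statistics, the sketch below standardizes a made-up two-column data set.  The `toy_` names and all values are hypothetical and are not taken from the student loan data.  The scaled training columns end up with mean 0 and standard deviation 1, while the scaled test columns generally do not, which is expected.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, for illustration only.
toy_train = pd.DataFrame({"fico": [600.0, 650.0, 700.0, 750.0],
                          "upb": [10000.0, 20000.0, 30000.0, 40000.0]})
toy_test = pd.DataFrame({"fico": [640.0, 720.0],
                         "upb": [15000.0, 50000.0]})

# Statistics come from the training rows only.
toy_mu = toy_train.mean()
toy_std = toy_train.std()  # pandas uses the sample standard deviation (ddof=1)

# The same mu and std are applied to both sets.
toy_train_scaled = (toy_train - toy_mu) / toy_std
toy_test_scaled = (toy_test - toy_mu) / toy_std

print(toy_train_scaled.mean().round(6))
print(toy_train_scaled.std().round(6))
print(toy_test_scaled)
```

`sklearn.preprocessing.StandardScaler` follows the same fit-on-train, transform-both pattern, but it divides by the population standard deviation (`ddof=0`), so its output differs slightly from the pandas version.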

## Setting Random Seeds

Fitting neural networks involves a lot of random number generation.  To ensure that we get reproducible results, let's create a user-defined function that seeds the various random number generators that get used.  In order to do that we'll need to import a couple of other packages.

In [None]:
import random
import tensorflow as tf



In [None]:
def set_seeds(seed=100):
    random.seed(seed)
    np.random.seed(seed)
    tf.random.set_seed(seed)
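To see what reseeding buys us, here is a minimal sketch that draws random numbers twice, reseeding before each draw.  The helper `set_seeds_demo()` is a hypothetical stand-in for `set_seeds()` that leaves out the TensorFlow call so the sketch runs without it.

```python
import random
import numpy as np

def set_seeds_demo(seed=100):
    # Same idea as set_seeds(), minus the TensorFlow generator.
    random.seed(seed)
    np.random.seed(seed)

# First draw after seeding.
set_seeds_demo()
first = (random.random(), np.random.rand(3))

# Reseed and draw again: the streams restart from the same state.
set_seeds_demo()
second = (random.random(), np.random.rand(3))

print(first[0] == second[0])
print(np.array_equal(first[1], second[1]))
```

Both comparisons should come out equal, which is why we call `set_seeds()` before building each network.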

## Neural Network

We can now fit our initial neural network.  Let's begin by importing some of the functions we will need from **keras** and **sklearn**.

In [None]:
from keras.layers import Dense
from keras.models import Sequential
from keras.optimizers import Adam
from sklearn.metrics import accuracy_score
from sklearn.metrics import f1_score
from sklearn.metrics import precision_score
from sklearn.metrics import recall_score

Next, we'll set the random seeds with our user-defined function.

In [None]:
set_seeds()

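The next step is to construct the model.  We instantiate it with the `Sequential()` constructor and then add two hidden layers with 16 and 8 units, followed by a single-unit output layer with a sigmoid activation that produces the prepayment probability.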

In [None]:
model = Sequential()
model.add(Dense(units=16, input_dim=len(df_X.columns), activation='relu'))
model.add(Dense(units=8, activation='relu'))
model.add(Dense(units=1, activation='sigmoid'))
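To get a feel for the size of this network, the following sketch counts its trainable parameters by hand.  Each dense layer has one weight per input-unit pair plus one bias per unit.  The helper `dense_param_count()` is a hypothetical name used only here; the total should agree with what `model.summary()` reports.

```python
def dense_param_count(n_inputs, n_units):
    # Weights (n_inputs * n_units) plus one bias per unit.
    return n_inputs * n_units + n_units

n_features = 8            # number of entries in lst_features
layer_units = [16, 8, 1]  # two hidden layers and the output layer

total = 0
n_in = n_features
for units in layer_units:
    n_params = dense_param_count(n_in, units)
    print(f"Dense({units}): {n_params} parameters")
    total += n_params
    n_in = units  # this layer's outputs feed the next layer

print("Total:", total)
```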

The learning process is defined in the compilation step.

In [None]:
model.compile(loss='binary_crossentropy', optimizer=Adam(learning_rate=0.001), metrics=['accuracy'])

Once the construction and compilation are complete we are ready for the actual learning/fitting to happen.  (I played with the class weights until I got reasonable results.)

In [None]:
%%time
model.fit(X_train_scaled, y_train, epochs=10, verbose=True, batch_size=256, class_weight={0:1, 1:1.25});

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
CPU times: user 1min 2s, sys: 3.79 s, total: 1min 5s
Wall time: 43.7 s
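The effect of `class_weight` can be sketched in plain **numpy**: each observation's cross-entropy term is multiplied by the weight of its class before averaging, so errors on prepayments (class 1) count 25% more.  This is an illustration of the idea on made-up numbers, not a reproduction of the internal **keras** computation; `weighted_bce()` and the toy arrays are hypothetical.

```python
import numpy as np

def weighted_bce(y_true, p_hat, class_weight):
    y_true = np.asarray(y_true, dtype=float)
    # Clip to avoid log(0).
    p_hat = np.clip(np.asarray(p_hat, dtype=float), 1e-7, 1 - 1e-7)
    # Per-observation binary cross-entropy.
    per_sample = -(y_true * np.log(p_hat) + (1 - y_true) * np.log(1 - p_hat))
    # Look up each observation's weight from its class.
    weights = np.where(y_true == 1, class_weight[1], class_weight[0])
    return np.mean(weights * per_sample)

# Hypothetical labels and predicted probabilities.
y_toy = [0, 0, 0, 1]
p_toy = [0.1, 0.2, 0.1, 0.3]

unweighted = weighted_bce(y_toy, p_toy, {0: 1, 1: 1})
weighted = weighted_bce(y_toy, p_toy, {0: 1, 1: 1.25})
print(unweighted, weighted)
```

The weighted loss is larger because the single missed prepayment is penalized more heavily, which nudges the optimizer toward predicting the minority class more often.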


We can use the `.evaluate()` method of the model to check the out-of-sample performance.  It returns the loss followed by the accuracy.

In [None]:
model.evaluate(X_test_scaled, y_test)



[0.0585552453994751, 0.9873527884483337]

Next, let's check the out-of-sample precision, recall, and f1 score.

In [None]:
test_predictions = np.where(model.predict(X_test_scaled) > 0.5, 1, 0)
print("F1:       ", f1_score(y_test, test_predictions))
print("Precision:", precision_score(y_test, test_predictions))
print("Recall:   ", recall_score(y_test, test_predictions))

F1:        0.37920489296636084
Precision: 0.9361207897793263
Recall:    0.2377581120943953
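The F1 score is the harmonic mean of precision and recall, so a low recall drags it down even when precision is high.  As a quick check, the sketch below recomputes F1 from the two values printed above; it should match the reported F1 up to floating-point rounding.

```python
# Values copied from the output above.
precision = 0.9361207897793263
recall = 0.2377581120943953

# Harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(f1)
```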


We can check the ratio of the predicted number of prepayments to the actual number of prepayments.  As you can see, the model predicts only about 25% of the actual number of prepayments.

In [None]:
print(test_predictions.sum() / y_test.sum())

0.25398230088495577
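This ratio is tied to the metrics above.  The predicted count is TP + FP and the actual count is TP + FN, so their ratio equals recall divided by precision.  The sketch below checks that identity using the precision and recall printed earlier.

```python
# Values copied from the precision/recall output above.
precision = 0.9361207897793263  # TP / (TP + FP)
recall = 0.2377581120943953     # TP / (TP + FN)

# (TP + FP) / (TP + FN) = recall / precision
predicted_to_actual = recall / precision
print(predicted_to_actual)
```

In other words, the model under-predicts the number of prepayments mainly because its recall is low, not because its positive predictions are wrong.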


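Finally, let's check the expected balance ratio, which is the unpaid balance weighted by the predicted prepayment probabilities, divided by the unpaid balance of the loans that actually prepaid.  It is about 118%.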

In [None]:
np.sum(X_test['upb'] * np.ravel(model.predict(X_test_scaled))) / np.sum(X_test['upb'] * y_test)



1.1768956596208677
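Note that this ratio uses the predicted probabilities rather than the thresholded 0/1 predictions, which is why it can exceed 100% even though the model predicts few prepayments outright.  Here is a minimal sketch of the same calculation on four made-up loans; the `_toy` arrays are hypothetical.

```python
import numpy as np

# Hypothetical loans: unpaid balance, actual outcome, predicted probability.
upb_toy = np.array([10000.0, 20000.0, 5000.0, 15000.0])
y_toy = np.array([1, 0, 0, 1])
p_toy = np.array([0.8, 0.3, 0.1, 0.6])

# Balance expected to prepay according to the model.
expected_prepaid = np.sum(upb_toy * p_toy)
# Balance that actually prepaid.
actual_prepaid = np.sum(upb_toy * y_toy)

ratio = expected_prepaid / actual_prepaid
print(expected_prepaid, actual_prepaid, ratio)
```

A ratio below 1 means the model understates the prepaid balance and a ratio above 1 means it overstates it, so values close to 100% are what we are after.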

## Dropout

In this section we implement dropout regularization.  We begin by first resetting the random seeds.

In [None]:
set_seeds()

Notice that in the construction of our network, we add a `Dropout` layer after each hidden layer.

In [None]:
from keras.layers import Dropout
model = Sequential()
model.add(Dense(units=16, input_dim=len(df_X.columns), activation='relu'))
model.add(Dropout(rate=0.3, seed=0))
model.add(Dense(units=8, activation='relu'))
model.add(Dropout(rate=0.3, seed=0))
model.add(Dense(units=1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer=Adam(learning_rate=0.001), metrics=['accuracy'])
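For intuition, here is a sketch of what a dropout layer does during training: it zeroes each activation with probability `rate` and rescales the survivors by `1 / (1 - rate)` so the expected activation is unchanged.  At prediction time the layer passes its input through untouched.  The function `dropout_forward()` is a hypothetical **numpy** illustration, not the **keras** implementation.

```python
import numpy as np

def dropout_forward(activations, rate, rng):
    keep_prob = 1.0 - rate
    # Keep each unit with probability keep_prob.
    mask = rng.random(activations.shape) < keep_prob
    # Rescale survivors so the expected value is preserved.
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
acts = np.ones((1000, 16))  # pretend output of the 16-unit hidden layer
dropped = dropout_forward(acts, rate=0.3, rng=rng)

frac_zero = np.mean(dropped == 0)
mean_val = dropped.mean()
print(frac_zero, mean_val)
```

Roughly 30% of the entries are zeroed, and the overall mean stays close to 1.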

Next, we fit the network.

In [None]:
%%time
model.fit(X_train_scaled, y_train, epochs=10, verbose=True, batch_size=256, class_weight={0:1, 1:1.25});

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
CPU times: user 1min 11s, sys: 4.42 s, total: 1min 16s
Wall time: 49.6 s


Now we can check the out-of-sample measures of fit.  Dropout regularization doesn't seem to improve these metrics; in fact, the F1 score decreases from about 0.38 to 0.31.

In [None]:
test_predictions = np.where(model.predict(X_test_scaled) > 0.5, 1, 0)
print("F1:       ", f1_score(y_test, test_predictions))
print("Precision:", precision_score(y_test, test_predictions))
print("Recall:   ", recall_score(y_test, test_predictions))

F1:        0.31407407407407406
Precision: 0.9636363636363636
Recall:    0.18761061946902655


There is also a further reduction in the number of prepayments that are predicted, from about 25% to about 19% of the actual number.

In [None]:
test_predictions = np.where(model.predict(X_test_scaled) > 0.5, 1, 0)
test_predictions.sum() / y_test.sum()



0.19469026548672566

The expected loan balance ratio increases from about 118% to 130%.

In [None]:
np.sum(X_test['upb'] * np.ravel(model.predict(X_test_scaled))) / np.sum(X_test['upb'] * y_test)



1.302716041226891

All in all, dropout regularization doesn't seem to improve model performance.

## Regularization

In this section we implement ridge (`l2`) regularization.  (I tried lasso regularization and it was an epic fail.)  As before, we begin by resetting the random seeds.

In [None]:
set_seeds()

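We implement `l2` regularization by populating the `activity_regularizer` argument of the `Dense()` layer constructor.  Note that `activity_regularizer` penalizes the layer's outputs; to penalize the weights themselves, as in classical ridge regression, one would populate `kernel_regularizer` instead.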

In [None]:
from keras.regularizers import l2
model = Sequential()
model.add(Dense(units=16, input_dim=len(df_X.columns), activation='relu', activity_regularizer=l2(0.0005)))
model.add(Dense(units=8, activation='relu', activity_regularizer=l2(0.0005)))
model.add(Dense(units=1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer=Adam(learning_rate=0.001), metrics=['accuracy'])
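The `l2` penalty is the regularization factor times a sum of squares.  The sketch below, on randomly generated toy arrays, contrasts that sum taken over a layer's outputs (the activity penalty used above) with the same sum taken over its weights (the kernel penalty).  All names here are hypothetical, and any bookkeeping **keras** does on its side, such as averaging over the batch, is not reproduced.

```python
import numpy as np

def l2_penalty(values, factor):
    # factor * sum of squared entries
    return factor * np.sum(np.square(values))

rng = np.random.default_rng(0)
X_toy = rng.normal(size=(4, 8))              # 4 observations, 8 features
W_toy = rng.normal(scale=0.1, size=(8, 16))  # weights of a 16-unit layer
b_toy = np.zeros(16)

# ReLU output of the layer.
activations = np.maximum(X_toy @ W_toy + b_toy, 0.0)

activity_term = l2_penalty(activations, 0.0005)  # penalizes outputs
kernel_term = l2_penalty(W_toy, 0.0005)          # penalizes weights
print(activity_term, kernel_term)
```

The activity penalty depends on the input data while the kernel penalty does not, so the two can push the fit in different directions.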

Next, we fit the model.

In [None]:
%%time
model.fit(X_train_scaled, y_train, epochs=10, verbose=True, batch_size=256, class_weight={0:1, 1:1.25});

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
CPU times: user 1min 6s, sys: 4.6 s, total: 1min 10s
Wall time: 44.3 s


Let's check the out-of-sample metrics.  There is a marginal improvement in F1 score relative to the initial network, from about 0.38 to 0.43.

In [None]:
test_predictions = np.where(model.predict(X_test_scaled) > 0.5, 1, 0)
print("F1:       ", f1_score(y_test, test_predictions))
print("Precision:", precision_score(y_test, test_predictions))
print("Recall:   ", recall_score(y_test, test_predictions))

F1:        0.43484102104791755
Precision: 0.9024163568773235
Recall:    0.28643067846607667


There is also a marginal improvement in the number of prepayments predicted, from about 25% to about 32% of the actual number.

In [None]:
test_predictions = np.where(model.predict(X_test_scaled) > 0.5, 1, 0)
test_predictions.sum() / y_test.sum()



0.31740412979351035

However, the expected loan balance ratio is worse: it increases from about 118% to 121%.

In [None]:
np.sum(X_test['upb'] * np.ravel(model.predict(X_test_scaled))) / np.sum(X_test['upb'] * y_test)



1.2114536969706748

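All in all, `l2` regularization doesn't seem to help much: the gains in F1 and in the number of predicted prepayments are marginal, and the expected loan balance ratio gets slightly worse.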