# Lending Club Data - Machine Learning

The question that this project seeks to answer is whether macroeconomic factors at the time that a loan is originated have a measurable impact on the likelihood that borrowers will default. To evaluate this question, the impact of incorporating macroeconomic variables into Machine Learning models will be evaluated.

First, the data will be cleaned so that it contains only relevant data <i>included in the dataset provided by Lending Club and known at the time that the loan is originated</i>. This will provide a baseline for the ability to predict loan outcomes without using external data. Second, logistic regression, SVM, and KNN will be used to evaluate how well the data can be used to predict loan defaults. Third, the data will be modified to include a number of macroeconomic features. Finally, the same machine learning methods will be employed on the modified data to evaluate whether adding macroeconomic features improves model performance.

### Data/Library import

The necessary libraries are imported. Also, the dataset to be used is imported from a .pkl file, which is available in the following shared Google Drive location (in .7z format):

https://drive.google.com/file/d/1RCgoJYONVQJek5zrlShaGEzjrk99rAIk/view

The data has been pre-processed to remove all loans that have not completed their entire term length. For example, loans that have 36-month terms are only included if they originated at least 36 months prior to the date that the data was updated.

In [1]:
# Import the necessary libraries
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import fredapi
import datetime

from sklearn import preprocessing
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.metrics import f1_score
from sklearn.metrics import roc_curve
from sklearn.metrics import auc

from sklearn.model_selection import GridSearchCV
from sklearn import linear_model
from sklearn.metrics import confusion_matrix


# Import the dataset containing completed loans
df = pd.read_pickle(r'..\data\completed_loan_dataset.pkl')

### Data Clean-Up

The data includes some information that has was pulled in for previous analyses, including some macroeconomic variables. These are stripped for the initial analysis, which is intended to test only how well predictions using the data provided by Lending Club would perform.

In [3]:
# Create a duplicate dataframe
lc_data = df.copy()

# From the duplicate dataframe, drop the columns that were not included in the original dataset
lc_data = lc_data.drop(['bom_value', 'eom_value', 'month_avg', 'monthly_bks', 'lag_time', 'loan_end_date', 'loan_ended_flag',
                       'sp_500_diff', 'unemployment_rate', 'date'], axis=1)

# Remove all columns with too many null values. If more than 1% of the values in a column are null, then remove it
lc_data = lc_data[[c for c in lc_data if lc_data[c].isnull().sum() <= .01 * len(lc_data)]]

# Less than 0.2% of the remaining rows have null values. These are removed
lc_data = lc_data.dropna()

# For several of the columns, there is no reason to think that they should be a predictor of default rates or there
# is no way that they are actionable prior to origination (such as issue date or collections recoveries). Additionally, the
# interest rate is determined by grade and sub_grade, so grade and sub_grade are removed.
lc_data = lc_data.drop(['id', 'url', 'recoveries', 'collection_recovery_fee', 'last_pymnt_d', 'last_pymnt_amnt',
                        'debt_settlement_flag', 'grade', 'sub_grade', 'installment', 'loan_status', 'loan_amnt',
                        'funded_amnt_inv', 'pymnt_plan', 'policy_code', 'application_type', 'out_prncp',
                        'out_prncp_inv', 'last_credit_pull_d', 'hardship_flag', 'title', 'disbursement_method',
                       'issue_d', 'earliest_cr_line', 'zip_code', 'addr_state', 'total_pymnt_inv', 'total_pymnt',
                       'total_rec_late_fee', 'total_rec_prncp', 'total_rec_int', 'acc_now_delinq', 'delinq_amnt'], axis=1)

The columns are printed below to evaluate what is left in the data. 'charge_off_flag' is the outcome variable - it is one if the loan charged off, and zero otherwise. The other features are datapoints known to Lending Club when the loan was originated.

In [4]:
lc_data.columns

Index(['funded_amnt', 'term', 'int_rate', 'home_ownership', 'annual_inc',
       'verification_status', 'purpose', 'dti', 'delinq_2yrs',
       'fico_range_low', 'fico_range_high', 'inq_last_6mths', 'open_acc',
       'pub_rec', 'revol_bal', 'revol_util', 'total_acc',
       'initial_list_status', 'last_fico_range_high', 'last_fico_range_low',
       'collections_12_mths_ex_med', 'chargeoff_within_12_mths',
       'pub_rec_bankruptcies', 'tax_liens', 'orig_month', 'charge_off_flag'],
      dtype='object')

All of the columns containing categories ("object" columns) are converted to dummy variables.

In [5]:
lc_data = pd.get_dummies(lc_data, columns=list(lc_data.select_dtypes(include=["object"])))

The data is ready to be split into X and y arrays, with X being all of the columns except for charge_off_flag and y being the charge_off_flag column.

In [6]:
X = lc_data.drop(['charge_off_flag', 'orig_month'], axis=1).values
y = lc_data['charge_off_flag'].values

In [7]:
Xtrain, Xtest, ytrain, ytest = train_test_split(X, y, test_size=0.3)

The X data is standardized so that datapoints of a greater scale do not have outsize influence on the results.

In [8]:
Xtrain = preprocessing.StandardScaler().fit_transform(Xtrain)
Xtest = preprocessing.StandardScaler().fit_transform(Xtest)

### Application of Machine Learning Models

A KNN model with several different parameters is tested. The dataset is relatively sparse, so KNN is not expected to perform well.

In [9]:
knn_parameters = {"n_neighbors": [3, 5], "algorithm": ["auto", "ball_tree", "kd_tree"], 
               "weights": ["uniform", "distance"]}
clf = GridSearchCV(estimator=KNeighborsClassifier(), param_grid=knn_parameters, cv=3, n_jobs=-1, 
                   scoring="f1", verbose=10)

print('Starting KNN training...')
clf.fit(Xtrain, ytrain)
print('Predicting on the train set...')
y_val_knn = clf.predict(Xtrain)
print('Predicting on the test set...')
y_test_knn = clf.predict(Xtest)

Starting KNN training...
Fitting 3 folds for each of 12 candidates, totalling 36 fits


[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=-1)]: Done   5 tasks      | elapsed: 174.7min
[Parallel(n_jobs=-1)]: Done  10 tasks      | elapsed: 268.6min
[Parallel(n_jobs=-1)]: Done  17 tasks      | elapsed: 433.7min
[Parallel(n_jobs=-1)]: Done  24 tasks      | elapsed: 555.2min
[Parallel(n_jobs=-1)]: Done  33 out of  36 | elapsed: 791.2min remaining: 71.9min
[Parallel(n_jobs=-1)]: Done  36 out of  36 | elapsed: 816.6min finished


Predicting on the train set...
Predicting on the test set...


The confusion matrix below indicates that the KNN model is useful for predicting defaults, even using only the variables in the Lending Club dataset. A commonly used statistic in lending is the proportion of non-defaulting borrowers who would be rejected based on a rule change for every defaulting borrower who is rejected (good-to-bad ratio). The good-to-bad ratio for this analasis is close to exactly one, meaning that rejecting would-be borrowers based on KNN predicting their default would result in one non-defaulting borrower rejected for every defaulting borrower rejected. This is almost certainly a good trade-off, as the value of a non-defaulted loan is typically much smaller than the cost of a default.

In [10]:
knn_cm_df = pd.DataFrame(confusion_matrix(ytest, y_test_knn))
knn_cm_df.columns = ['No Default Predicted', 'Default Predicted']
knn_cm_df['ind'] = ['No Default Occurred', 'Default Occurred']
knn_cm_df = knn_cm_df.set_index('ind')
knn_cm_df.index.name = None
print('Confusion matrix: \n\n{}'.format(knn_cm_df))
knn_gb_ratio = knn_cm_df.loc['No Default Occurred', 'Default Predicted'] / knn_cm_df.loc['Default Occurred', 'Default Predicted']
print('\nGood-to-Bad ratio: {}'.format(knn_gb_ratio))

Confusion matrix: 

                     No Default Predicted  Default Predicted
No Default Occurred                 69880               4121
Default Occurred                     8117               4423

Good-to-Bad ratio: 0.931720551661768


Next, SVM is performed on the data, similarly with several specifications.

In [11]:
svc_parameters = {"kernel": ["rbf"], "gamma": [0.001, 0.0001], "C": [1, 10, 100, 1000]}
clf = GridSearchCV(estimator=SVC(), param_grid=svc_parameters, cv=3, n_jobs=-1, 
                   scoring="f1", verbose=10)

print('Starting SVM training...')
clf.fit(Xtrain, ytrain)
print('Predicting on the train set...')
y_val_svm = clf.predict(Xtrain)
print('Predicting on the test set...')
y_test_svm = clf.predict(Xtest)

Starting SVM training...
Fitting 3 folds for each of 8 candidates, totalling 24 fits


[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=-1)]: Done   5 tasks      | elapsed: 57.0min
[Parallel(n_jobs=-1)]: Done  10 tasks      | elapsed: 85.9min
[Parallel(n_jobs=-1)]: Done  17 tasks      | elapsed: 157.7min
[Parallel(n_jobs=-1)]: Done  20 out of  24 | elapsed: 241.4min remaining: 48.3min
[Parallel(n_jobs=-1)]: Done  24 out of  24 | elapsed: 339.8min finished


Predicting on the train set...
Predicting on the test set...


The confusion matrix below shows that SVM performed significantly better than KNN, accurately predicting approximately 2 defaults for every 1 false positive.

In [12]:
svm_cm_df = pd.DataFrame(confusion_matrix(ytest, y_test_svm))
svm_cm_df.columns = ['No Default Predicted', 'Default Predicted']
svm_cm_df['ind'] = ['No Default Occurred', 'Default Occurred']
svm_cm_df = svm_cm_df.set_index('ind')
svm_cm_df.index.name = None
print('Confusion matrix: \n\n{}'.format(svm_cm_df))
gb_ratio = svm_cm_df.loc['No Default Occurred', 'Default Predicted'] / svm_cm_df.loc['Default Occurred', 'Default Predicted']
print('\nGood-to-Bad ratio: {}'.format(gb_ratio))

Confusion matrix: 

                     No Default Predicted  Default Predicted
No Default Occurred                 71776               2225
Default Occurred                     8314               4226

Good-to-Bad ratio: 0.5265026029342168


Next, a random forest model is tested with several different specifications.

In [14]:
forest_parameters = {"n_estimators": [5, 10, 50], "max_depth": [2, 5, 7, 9]}
clf = GridSearchCV(estimator=RandomForestClassifier(), param_grid=forest_parameters, cv=3, n_jobs=-1, 
                   scoring="f1", verbose=10)

print('Starting Random Forest training...')
clf.fit(Xtrain, ytrain)
print('Predicting on the train set...')
y_val_forest = clf.predict(Xtrain)
print('Predicting on the test set...')
y_test_forest = clf.predict(Xtest)

Starting Random Forest training...
Fitting 3 folds for each of 12 candidates, totalling 36 fits


[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=-1)]: Done   5 tasks      | elapsed:    5.8s
[Parallel(n_jobs=-1)]: Done  10 tasks      | elapsed:   12.8s
[Parallel(n_jobs=-1)]: Done  17 tasks      | elapsed:   21.7s
[Parallel(n_jobs=-1)]: Done  24 tasks      | elapsed:   34.2s
[Parallel(n_jobs=-1)]: Done  33 out of  36 | elapsed:   57.1s remaining:    5.1s
[Parallel(n_jobs=-1)]: Done  36 out of  36 | elapsed:  1.2min finished


Predicting on the train set...
Predicting on the test set...


The random forest model has results that are similar to, but very slightly worse than, the SVM results.

In [15]:
for_cm_df = pd.DataFrame(confusion_matrix(ytest, y_test_forest))
for_cm_df.columns = ['No Default Predicted', 'Default Predicted']
for_cm_df['ind'] = ['No Default Occurred', 'Default Occurred']
for_cm_df = for_cm_df.set_index('ind')
for_cm_df.index.name = None
print('Confusion matrix: \n\n{}'.format(for_cm_df))
gb_ratio = for_cm_df.loc['No Default Occurred', 'Default Predicted'] / for_cm_df.loc['Default Occurred', 'Default Predicted']
print('\nGood-to-Bad ratio: {}'.format(gb_ratio))

Confusion matrix: 

                     No Default Predicted  Default Predicted
No Default Occurred                 71765               2236
Default Occurred                     8481               4059

Good-to-Bad ratio: 0.5508745996550874


All models are quite good at predicting defaults.

### Importing New Features

Next, macroeconomic features are imported. Bankruptcy data that was drafted for an earlier analysis is pulled in from a .pkl file, which is available in the Google Drive location linked below. The remainder of the data was sourced from the Federal Reserve Economic Data (FRED). The API_KEY variable below has been replaced with an empty string so that the private API key is not shared.

BK Data .pkl: https://drive.google.com/open?id=1hZc5-S451fGmIY1s-D2IEqVg_rTcsRzv

In [32]:
# Import the .pkl file containing the bankruptcy data
bk_data = pd.read_pickle(r'C:\Users\Mark\Desktop\springboard_projects\data\monthly_bk_data.pkl')

# Convert BKs to numeric
bk_data['monthly_bks'] = pd.to_numeric(bk_data['monthly_bks'], errors='coerce')
bk_data = bk_data.reset_index()
bk_data.columns = ['date', 'bks']

# Pull in data using the FRED API
# Set up the API. The API key is removed from the published version.
API_KEY = ''
fred = fredapi.Fred(api_key=API_KEY)

In addition to the 3 features already included in the data previously imported, a number of new features will be incorporated and evaluated. All of them are sourced from the FRED (Federal Reserve Economic Database) using the fredapi library and adjusted to only end-of-month values.

In [18]:
# TED spread
ted_spread = fred.get_series('TEDRATE').to_frame().reset_index()
ted_spread.columns = ['date', 'ted_spread']
ted_spread['date'] = ted_spread['date'].apply(lambda x: x.replace(day=1))
ted_spread = ted_spread.groupby([ted_spread['date']]).mean()

# St. Louis Fed Financial Stress Index
financial_stress = fred.get_series('STLFSI').to_frame().reset_index()
financial_stress.columns = ['date', 'financial_stress']
financial_stress['date'] = financial_stress['date'].apply(lambda x: x.replace(day=1))
financial_stress = financial_stress.groupby([financial_stress['date']]).mean()

# Civilian labor force participation rate
participation_rate = fred.get_series('LNS11300060').to_frame().reset_index()
participation_rate.columns = ['date', 'participation_rate']

# Total vehicle sales
vehicle_sales = fred.get_series('TOTALSA').to_frame().reset_index()
vehicle_sales.columns = ['date', 'vehicle_sales']
# Make the variable the month-over-month change instead of the in-period value.
vehicle_sales['vehicle_sales'] = (vehicle_sales['vehicle_sales'] - vehicle_sales['vehicle_sales'].shift(1)) / vehicle_sales['vehicle_sales'].shift(1)

# Consumer Sentiment
consumer_sentiment = fred.get_series('UMCSENT').to_frame().reset_index()
consumer_sentiment.columns = ['date', 'consumer_sentiment']

# Unemployment Rate
unemployment_data = fred.get_series('UNRATE').to_frame().reset_index()
unemployment_data.columns = ['date', 'unemployment_rate']

# S&P 500 Data
sp_data = fred.get_series('SP500').to_frame().reset_index()
sp_data.columns = ['date', 'sp']
sp_data['date'] = sp_data['date'].apply(lambda x: x.replace(day=1))
sp_data = sp_data.groupby([sp_data['date']]).mean()
# Make the variable the month-over-month change instead of the in-period value.
sp_data['sp'] = (sp_data['sp'] - sp_data['sp'].shift(1)) / sp_data['sp'].shift(1)

The features are merged into the Lending Club data so that they can be used as features in the Machine Learning models.

In [19]:
# Merge the new features into the LC data
lc_data = lc_data.merge(right=ted_spread, left_on='orig_month', right_on='date', how='left', suffixes=('', ''))
if 'date' in lc_data.columns:
    lc_data = lc_data.drop(['date'], axis=1)
lc_data = lc_data.merge(right=financial_stress, left_on='orig_month', right_on='date', how='left', suffixes=('', ''))
if 'date' in lc_data.columns:
    lc_data = lc_data.drop(['date'], axis=1)
lc_data = lc_data.merge(right=participation_rate, left_on='orig_month', right_on='date', how='left', suffixes=('', ''))
if 'date' in lc_data.columns:
    lc_data = lc_data.drop(['date'], axis=1)
lc_data = lc_data.merge(right=vehicle_sales, left_on='orig_month', right_on='date', how='left', suffixes=('', ''))
if 'date' in lc_data.columns:
    lc_data = lc_data.drop(['date'], axis=1)
lc_data = lc_data.merge(right=consumer_sentiment, left_on='orig_month', right_on='date', how='left', suffixes=('', ''))
if 'date' in lc_data.columns:
    lc_data = lc_data.drop(['date'], axis=1)
lc_data = lc_data.merge(right=unemployment_data, left_on='orig_month', right_on='date', how='left', suffixes=('', ''))
if 'date' in lc_data.columns:
    lc_data = lc_data.drop(['date'], axis=1)
lc_data = lc_data.merge(right=sp_data, left_on='orig_month', right_on='date', how='left', suffixes=('', ''))
if 'date' in lc_data.columns:
    lc_data = lc_data.drop(['date'], axis=1)
lc_data = lc_data.merge(right=bk_data, left_on='orig_month', right_on='date', how='left', suffixes=('', ''))
if 'date' in lc_data.columns:
    lc_data = lc_data.drop(['date'], axis=1)

### Re-Run Machine Learning Models with Macroeconomic Data

The same steps that were previously performed to ready the data for ML applications is performed below. First, all object-type variables are split into dummy variables.

In [20]:
lc_data = pd.get_dummies(lc_data, columns=list(lc_data.select_dtypes(include=["object"])))

The data is again split into predictive features (X) and the outcome variable (y) and split into train and test sets.

In [21]:
X = lc_data.drop(['charge_off_flag', 'orig_month'], axis=1).values
y = lc_data['charge_off_flag'].values

In [22]:
Xtrain, Xtest, ytrain, ytest = train_test_split(X, y, test_size=0.3)

In [23]:
Xtrain = preprocessing.StandardScaler().fit_transform(Xtrain)
Xtest = preprocessing.StandardScaler().fit_transform(Xtest)

KNN is performed on the modified dataset, incorporating the macroeconomic variables.

In [24]:
knn_parameters = {"n_neighbors": [3, 5], "algorithm": ["auto", "ball_tree", "kd_tree"], 
               "weights": ["uniform", "distance"]}
clf = GridSearchCV(estimator=KNeighborsClassifier(), param_grid=knn_parameters, cv=3, n_jobs=-1, 
                   scoring="f1", verbose=10)

print('Starting KNN training...')
clf.fit(Xtrain, ytrain)
print('Predicting on the train set...')
y_val_knn = clf.predict(Xtrain)
print('Predicting on the test set...')
y_test_knn = clf.predict(Xtest)

Starting KNN training...
Fitting 3 folds for each of 12 candidates, totalling 36 fits


[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=-1)]: Done  36 out of  36 | elapsed: 1154.0min finished


Predicting on the train set...
Predicting on the test set...


The results are very slightly better for KNN - the good-to-bad ratio went from .93 to .92.

In [26]:
knn_m_cm_df = pd.DataFrame(confusion_matrix(ytest, y_test_knn))
knn_m_cm_df.columns = ['No Default Predicted', 'Default Predicted']
knn_m_cm_df['ind'] = ['No Default Occurred', 'Default Occurred']
knn_m_cm_df = knn_m_cm_df.set_index('ind')
knn_m_cm_df.index.name = None
print('Confusion matrix: \n\n{}'.format(knn_m_cm_df))
gb_ratio = knn_m_cm_df.loc['No Default Occurred', 'Default Predicted'] / knn_m_cm_df.loc['Default Occurred', 'Default Predicted']
print('\nGood-to-Bad ratio: {}'.format(gb_ratio))

Confusion matrix: 

                     No Default Predicted  Default Predicted
No Default Occurred                 70167               3749
Default Occurred                     8546               4079

Good-to-Bad ratio: 0.9190978180926698


Next, SVM is performed on the data.

In [27]:
svc_parameters = {"kernel": ["rbf"], "gamma": [0.001, 0.0001], "C": [1, 10, 100, 1000]}
clf = GridSearchCV(estimator=SVC(), param_grid=svc_parameters, cv=3, n_jobs=-1, 
                   scoring="f1", verbose=10)

print('Starting SVM training...')
clf.fit(Xtrain, ytrain)
print('Predicting on the train set...')
y_val_svm = clf.predict(Xtrain)
print('Predicting on the test set...')
y_test_svm = clf.predict(Xtest)

Starting SVM training...
Fitting 3 folds for each of 8 candidates, totalling 24 fits


[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=-1)]: Done   5 tasks      | elapsed: 64.2min
[Parallel(n_jobs=-1)]: Done  10 tasks      | elapsed: 97.1min
[Parallel(n_jobs=-1)]: Done  17 tasks      | elapsed: 169.7min
[Parallel(n_jobs=-1)]: Done  20 out of  24 | elapsed: 296.0min remaining: 59.2min
[Parallel(n_jobs=-1)]: Done  24 out of  24 | elapsed: 436.5min finished


Predicting on the train set...
Predicting on the test set...


The results for SVM also improved very slightly, from .53 to .52

In [28]:
svm_m_cm_df = pd.DataFrame(confusion_matrix(ytest, y_test_svm))
svm_m_cm_df.columns = ['No Default Predicted', 'Default Predicted']
svm_m_cm_df['ind'] = ['No Default Occurred', 'Default Occurred']
svm_m_cm_df = svm_m_cm_df.set_index('ind')
svm_m_cm_df.index.name = None
print('Confusion matrix: \n\n{}'.format(svm_m_cm_df))
gb_ratio = svm_m_cm_df.loc['No Default Occurred', 'Default Predicted'] / svm_m_cm_df.loc['Default Occurred', 'Default Predicted']
print('\nGood-to-Bad ratio: {}'.format(gb_ratio))

Confusion matrix: 

                     No Default Predicted  Default Predicted
No Default Occurred                 71486               2430
Default Occurred                     7927               4698

Good-to-Bad ratio: 0.5172413793103449


Next, the Random Forest model is re-run on the new data.

In [29]:
forest_parameters = {"n_estimators": [5, 10, 50], "max_depth": [2, 5, 7, 9]}
clf = GridSearchCV(estimator=RandomForestClassifier(), param_grid=forest_parameters, cv=3, n_jobs=-1, 
                   scoring="f1", verbose=10)

print('Starting Random Forest training')
clf.fit(Xtrain, ytrain)
print('Predicting on the train set...')
y_val_forest = clf.predict(Xtrain)
print('Predicting on the test set...')
y_test_forest = clf.predict(Xtest)

Starting Random Forest training
Fitting 3 folds for each of 12 candidates, totalling 36 fits


[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=-1)]: Done   5 tasks      | elapsed:   10.3s
[Parallel(n_jobs=-1)]: Done  10 tasks      | elapsed:   17.9s
[Parallel(n_jobs=-1)]: Done  17 tasks      | elapsed:   29.0s
[Parallel(n_jobs=-1)]: Done  24 tasks      | elapsed:   45.0s
[Parallel(n_jobs=-1)]: Done  33 out of  36 | elapsed:  1.2min remaining:    6.3s
[Parallel(n_jobs=-1)]: Done  36 out of  36 | elapsed:  1.5min finished


Predicting on the train set...
Predicting on the test set...


The results for Random Forest are very good - the model improved from having a good-to-bad ratio of .55 to one of .49, the lowest of any prediction by a small but significant margin.

In [30]:
rf_m_cm_df = pd.DataFrame(confusion_matrix(ytest, y_test_forest))
rf_m_cm_df.columns = ['No Default Predicted', 'Default Predicted']
rf_m_cm_df['ind'] = ['No Default Occurred', 'Default Occurred']
rf_m_cm_df = rf_m_cm_df.set_index('ind')
rf_m_cm_df.index.name = None
print('Confusion matrix: \n\n{}'.format(rf_m_cm_df))
gb_ratio = rf_m_cm_df.loc['No Default Occurred', 'Default Predicted'] / rf_m_cm_df.loc['Default Occurred', 'Default Predicted']
print('\nGood-to-Bad ratio: {}'.format(gb_ratio))

Confusion matrix: 

                     No Default Predicted  Default Predicted
No Default Occurred                 71713               2203
Default Occurred                     8138               4487

Good-to-Bad ratio: 0.49097392467127254


### Conclusion

Each of the methods tested showed an improvement with incorporation of macroeconomic data - for the time period tested, at least, incorporating macroeconomic variables into decision-making would have resulted in improvements in loan performance. The magnitude of the improvements would be quite large: 4,487 bad loans would have never been made, at the cost of only 2,203 good loans, which could easily represent a 7-figure cost savings.