# Predicting Credit Approval with Machine Learning

Notebook Author: Matthew D. Kearns


<img src='img/credit-card.png' alt="Credit Card" height="100" width="200" align=left>

This notebook uses a Random Forest classifier (RFC) to predict credit card approval and evaluates it with leave-one-out cross-validation (LOOCV).

Notebook Contents:

   - [Load Cleaned Data](#data)
   - [Initialize RFC Model](#model)
   - [Train RFC Model on Data w/ LOOCV](#train)
   - [Report Accuracy](#report)

In [1]:
# imports
import pandas as pd
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold

##### Load Cleaned Data
<a id='data'></a>

In [2]:
df = pd.read_csv('./customer-data/clean.csv')
df = df.iloc[:,1:] # drop the first column (saved index values)
df = df.sample(frac=1) # shuffle rows
df.head()

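|  | 1 | 2 | 7 | 10 | 13 | 14 | 0_a | 0_b | 3_l | 3_u | ... | 8_f | 8_t | 9_f | 9_t | 11_f | 11_t | 12_g | 12_p | 12_s | 15_+ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 501 | 24.75 | 3.0 | 1.835 | 19 | 0 | 500 | 1 | 0 | 0 | 1 | ... | 0 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 1 |
| 255 | 20.0 | 11.045 | 2.0 | 0 | 136 | 0 | 0 | 1 | 0 | 1 | ... | 1 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 |
| 671 | 27.83 | 1.0 | 3.0 | 0 | 176 | 537 | 0 | 1 | 0 | 0 | ... | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 |
| 81 | 27.67 | 1.5 | 2.0 | 0 | 368 | 0 | 1 | 0 | 0 | 1 | ... | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 |
| 473 | 22.75 | 11.5 | 0.415 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | ... | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 |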


In [3]:
# split data into its X and y components
X, y = df.values[:,:-1], df.values[:,-1]
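
The positional split above only works while the label stays in the last column. As a minimal sketch, the label can instead be selected by name. This assumes the last column, `15_+`, is the approval label, as the positional split implies. The `toy` frame and its values are made up for illustration and only borrow column names from the preview.

```python
import pandas as pd

# Tiny stand-in frame: the values are invented, the column names mirror the preview.
toy = pd.DataFrame({
    "1": [24.75, 20.0, 27.83],
    "2": [3.0, 11.045, 1.0],
    "0_a": [1, 0, 0],
    "15_+": [1, 0, 0],
})

label = "15_+"  # assumed to be the approval indicator

# Select features and label by column name instead of by position.
X_toy = toy.drop(columns=label).to_numpy()
y_toy = toy[label].to_numpy()
```

With the label in the last column, this gives the same arrays as the positional slice, and it keeps working if the column order changes.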

##### Initialize RFC Model
<a id='model'></a>

In [4]:
clf = RandomForestClassifier(n_estimators=50)

##### Train RFC Model on Data w/ LOOCV
<a id='train'></a>

In [5]:
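# leave-one-out CV: one fold per sample, so each test set holds a single row
folds = len(X)
kf = KFold(n_splits=folds)

ave = 0  # running sum of per-fold accuracies (each is 0 or 1)

for train_index, test_index in kf.split(X):
    clf.fit(X[train_index], y[train_index])
    ave += clf.score(X[test_index], y[test_index])

ave /= folds  # mean accuracy over all folds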
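
The loop above can also be written with scikit-learn's `LeaveOneOut` splitter and `cross_val_score`. The sketch below is an alternative formulation, not the notebook's own code. It runs on small synthetic data from `make_classification` rather than the credit data, and the names `X_demo`, `y_demo`, and `demo_clf` as well as the fixed `random_state` are assumptions added for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold, LeaveOneOut, cross_val_score

# Synthetic stand-in data (not the credit data), kept small so the sketch runs quickly.
X_demo, y_demo = make_classification(n_samples=40, n_features=8, random_state=0)

# random_state is an added assumption so that repeated runs build the same forest.
demo_clf = RandomForestClassifier(n_estimators=10, random_state=0)

# LeaveOneOut yields the same splits as an unshuffled KFold with one fold per sample.
loo_splits = [(tr.tolist(), te.tolist()) for tr, te in LeaveOneOut().split(X_demo)]
kf_splits = [(tr.tolist(), te.tolist())
             for tr, te in KFold(n_splits=len(X_demo)).split(X_demo)]
same_splits = loo_splits == kf_splits

# cross_val_score refits a clone of the estimator per fold and returns one score per fold.
scores = cross_val_score(demo_clf, X_demo, y_demo, cv=LeaveOneOut())
loocv_accuracy = scores.mean()
```

Because every test fold contains one sample, each entry of `scores` is either 0 or 1, and their mean is the fraction of held-out samples classified correctly.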

##### Report Accuracy
<a id='report'></a>

In [7]:
print('Random Forest Model w/ LOOCV:', ave)

Random Forest Model w/ LOOCV: 0.8720588235294118
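
Since every leave-one-out fold scores either 0 or 1, the reported mean is a ratio of correct predictions to samples. The sketch below recovers that ratio in lowest terms from the printed value and attaches a rough uncertainty estimate. The denominator bound of 1000 is an assumption about the dataset size, and the binomial standard error treats the held-out predictions as independent, which is only an approximation for LOOCV.

```python
from fractions import Fraction
from math import sqrt

reported = 0.8720588235294118  # accuracy printed above

# Closest fraction with a denominator of at most 1000 (assumed bound on dataset size).
ratio = Fraction(reported).limit_denominator(1000)
correct, total = ratio.numerator, ratio.denominator

# Rough binomial standard error, treating held-out predictions as independent.
# This is an approximation: LOOCV folds share almost all of their training data.
std_err = sqrt(reported * (1 - reported) / total)

print(f"{correct}/{total} correct, approx. standard error {std_err:.3f}")
# prints: 593/680 correct, approx. standard error 0.013
```

The recovered ratio is in lowest terms, so it is consistent with 680 evaluated rows but does not prove that count.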
