# Neural Network - Bank Risk Mitigation

We welcome you all to the case-based project of this course. This project has 2 case studies.
The first case study (described below - 30 points) covers concepts taught in Part 1 (first 8 hours
of Neural networks basics).
 
1st case study - Project 1:
 
The case study is from an open source dataset from Kaggle. 

Link to the Kaggle project site:
https://www.kaggle.com/barelydedicated/bank-customer-churn-modeling

Given a Bank customer, can we build a classifier which can determine whether they will leave or
not using Neural networks?
 
Case file: 

bank.csv
 
The points distribution for this case is as follows:
1. Read the dataset in a new python notebook.
2. Drop the columns which are unique for all users like IDs (2.5 points)
3. Distinguish the feature and target set (2.5 points)
4. Divide the data set into Train and test sets
5. Normalize the train and test data (2.5 points)
6. Initialize & build the model (10 points)
7. Optimize the model (5 points)
8. Predict the results using 0.5 as a threshold (5 points)
9. Print the Accuracy score and confusion matrix (2.5 points)

 

In [2]:
from google.colab import drive
drive.mount('/gdrive')

Mounted at /gdrive


In [0]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import OneHotEncoder

In [4]:
%tensorflow_version 2.x
import tensorflow as tf
tf.random.set_seed(42)

TensorFlow 2.x selected.


In [5]:
import keras

Using TensorFlow backend.


###### 1. Read the dataset in a new python notebook.

In [0]:
bank = pd.read_csv('/gdrive/My Drive/Colab Notebooks/bank.csv')

In [0]:
#bank = pd.read_csv("bank.csv")

In [8]:
bank.head()

Unnamed: 0,RowNumber,CustomerId,Surname,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited
0,1,15634602,Hargrave,619,France,Female,42,2,0.0,1,1,1,101348.88,1
1,2,15647311,Hill,608,Spain,Female,41,1,83807.86,1,0,1,112542.58,0
2,3,15619304,Onio,502,France,Female,42,8,159660.8,3,1,0,113931.57,1
3,4,15701354,Boni,699,France,Female,39,1,0.0,2,0,0,93826.63,0
4,5,15737888,Mitchell,850,Spain,Female,43,2,125510.82,1,1,1,79084.1,0


###### 2. Drop the columns which are unique for all users like IDs

In [9]:
bank.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 14 columns):
RowNumber          10000 non-null int64
CustomerId         10000 non-null int64
Surname            10000 non-null object
CreditScore        10000 non-null int64
Geography          10000 non-null object
Gender             10000 non-null object
Age                10000 non-null int64
Tenure             10000 non-null int64
Balance            10000 non-null float64
NumOfProducts      10000 non-null int64
HasCrCard          10000 non-null int64
IsActiveMember     10000 non-null int64
EstimatedSalary    10000 non-null float64
Exited             10000 non-null int64
dtypes: float64(2), int64(9), object(3)
memory usage: 1.1+ MB


In [10]:
bank.nunique()

RowNumber          10000
CustomerId         10000
Surname             2932
CreditScore          460
Geography              3
Gender                 2
Age                   70
Tenure                11
Balance             6382
NumOfProducts          4
HasCrCard              2
IsActiveMember         2
EstimatedSalary     9999
Exited                 2
dtype: int64

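RowNumber and CustomerId are unique for every row, and Surname (2932 distinct values) identifies a particular customer rather than describing their behaviour. None of the three carries useful signal for the classifier, hence we will remove all these columns.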

In [0]:
bank.drop(columns=['RowNumber','CustomerId','Surname'],inplace=True)

In [12]:
bank.head()

Unnamed: 0,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited
0,619,France,Female,42,2,0.0,1,1,1,101348.88,1
1,608,Spain,Female,41,1,83807.86,1,0,1,112542.58,0
2,502,France,Female,42,8,159660.8,3,1,0,113931.57,1
3,699,France,Female,39,1,0.0,2,0,0,93826.63,0
4,850,Spain,Female,43,2,125510.82,1,1,1,79084.1,0


We can see here that the data no longer has any column which alone can uniquely identify a customer.
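The check behind this step can be sketched on a toy frame. The values below are made up and only the column names mirror bank.csv; the rule "as many distinct values as rows" is an assumption for illustration. It would not flag Surname, and it could wrongly flag a continuous column whose values happen to be all distinct, so the result still needs a manual look.

```python
import pandas as pd

# Toy frame with made-up values; only the column names mirror bank.csv.
toy = pd.DataFrame({
    "RowNumber": [1, 2, 3, 4],
    "CustomerId": [15634602, 15647311, 15619304, 15701354],
    "Geography": ["France", "Spain", "France", "France"],
    "Exited": [1, 0, 1, 0],
})

# A column with as many distinct values as there are rows is unique per row
# and therefore behaves like an identifier.
n_unique = toy.nunique()
id_like = n_unique[n_unique == len(toy)].index.tolist()

# Drop the identifier-like columns and keep the rest.
cleaned = toy.drop(columns=id_like)
print(id_like)
print(cleaned.columns.tolist())
```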

###### 3. Feature and Target

'Exited' is the target column: it records whether the customer left the bank, which is what we want to predict. We will keep it in y and put all the remaining columns in the feature set X.

In [0]:
X=bank.iloc[:,:-1]

In [0]:
y=bank.iloc[:,-1]

In [15]:
X.shape

(10000, 10)

In [16]:
X.head()

Unnamed: 0,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary
0,619,France,Female,42,2,0.0,1,1,1,101348.88
1,608,Spain,Female,41,1,83807.86,1,0,1,112542.58
2,502,France,Female,42,8,159660.8,3,1,0,113931.57
3,699,France,Female,39,1,0.0,2,0,0,93826.63
4,850,Spain,Female,43,2,125510.82,1,1,1,79084.1


In [17]:
y.shape

(10000,)

In [18]:
y.head()

0    1
1    0
2    1
3    0
4    0
Name: Exited, dtype: int64
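The cells above split features and target by position, which relies on 'Exited' being the last column. A minimal sketch of the same split done by column name, on a made-up three-row frame (the names `TARGET`, `X_toy` and `y_toy` are illustrative only):

```python
import pandas as pd

# Made-up rows; the column names follow the notebook's data.
toy = pd.DataFrame({
    "CreditScore": [619, 608, 502],
    "Age": [42, 41, 42],
    "Exited": [1, 0, 1],
})

TARGET = "Exited"

# Everything except the target becomes the feature set.
X_toy = toy.drop(columns=[TARGET])
# The target column on its own becomes the label vector.
y_toy = toy[TARGET]

print(X_toy.shape, y_toy.shape)
```

Selecting by name keeps working if the column order of the file changes.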

In [0]:
# Geography (3 unique values) and Gender (2 unique values) are stored as strings, so we need to encode them as numbers.

In [0]:
X=pd.get_dummies(X,drop_first=True)

In [21]:
X.head()

Unnamed: 0,CreditScore,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Geography_Germany,Geography_Spain,Gender_Male
0,619,42,2,0.0,1,1,1,101348.88,0,0,0
1,608,41,1,83807.86,1,0,1,112542.58,0,1,0
2,502,42,8,159660.8,3,1,0,113931.57,0,0,0
3,699,39,1,0.0,2,0,0,93826.63,0,0,0
4,850,43,2,125510.82,1,1,1,79084.1,0,1,0
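What `drop_first=True` does can be seen on a small made-up frame: each string column is expanded into one indicator column per category, and the first category (in sorted order) is dropped because it is implied when all the other indicators are 0. This is only a sketch on toy values, not the notebook's data.

```python
import pandas as pd

# Made-up rows using the same categories as the notebook's columns.
toy = pd.DataFrame({
    "Geography": ["France", "Spain", "Germany", "France"],
    "Gender": ["Female", "Male", "Female", "Male"],
})

# France and Female are the first categories alphabetically, so they are dropped.
encoded = pd.get_dummies(toy, drop_first=True)

# Cast to int so the result prints as 0/1 regardless of the pandas version.
encoded = encoded.astype(int)
print(encoded)
```

A row of all zeros therefore stands for a female customer from France.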


###### 4. Divide the data set into Train and test sets

In [0]:
X_train, X_test, y_train, y_test = train_test_split(X,y)

In [43]:
X_train.shape

(7500, 11)

In [44]:
y_train.shape

(7500,)

In [45]:
X_test.shape

(2500, 11)

In [46]:
y_test.shape

(2500,)

In [47]:
X_train.head()

Unnamed: 0,CreditScore,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Geography_Germany,Geography_Spain,Gender_Male
6445,679,30,1,112543.42,1,1,1,179435.21,0,0,0
272,811,34,1,149297.19,2,1,1,186339.74,1,0,0
8511,643,28,9,160858.13,2,1,0,27149.27,1,0,1
9092,781,38,2,117810.79,1,0,1,65632.33,0,0,1
8409,749,38,9,129378.32,1,1,1,13549.34,0,1,1


In [48]:
y_train

6445    0
272     0
8511    0
9092    1
8409    0
       ..
9946    0
1938    0
2989    1
2323    0
4157    0
Name: Exited, Length: 7500, dtype: int64

In [49]:
X_test.head()

Unnamed: 0,CreditScore,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Geography_Germany,Geography_Spain,Gender_Male
5211,749,22,4,94762.16,2,1,1,42241.54,1,0,1
368,636,34,2,40105.51,2,0,1,53512.16,1,0,1
917,646,45,3,47134.75,1,1,1,57236.44,0,0,0
224,671,45,6,99564.22,1,1,1,108872.45,1,0,1
9460,744,35,7,0.0,2,1,1,43036.6,0,1,1


In [50]:
y_test

5211    0
368     0
917     0
224     1
9460    0
       ..
9563    0
7698    1
1778    0
1330    0
8439    0
Name: Exited, Length: 2500, dtype: int64
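The split above uses the defaults of `train_test_split`, so it changes on every run and the share of exited customers can differ slightly between the two sets. A hedged sketch of a reproducible, stratified variant on synthetic data (the 20% positive rate and the seed are assumptions chosen for illustration):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic data: 1000 rows, 3 features, 20% positive labels (assumed rate).
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(1000, 3))
y_toy = np.array([1] * 200 + [0] * 800)

# random_state makes the split repeatable; stratify keeps the class ratio
# the same in the train and test parts.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.25, stratify=y_toy, random_state=42
)

print(X_tr.shape, X_te.shape)
print(y_tr.mean(), y_te.mean())
```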

###### 5. Normalize the train and test data

In [0]:
sc = StandardScaler()

In [0]:
ScaledX_train = sc.fit_transform(X_train)

In [0]:
ScaledX_test = sc.transform(X_test)

In [54]:
ScaledX_train[:,1]

array([-0.85195555, -0.47100033, -1.04243315, ...,  0.57662651,
        1.14805934,  2.19568618])

In [55]:
ScaledX_test[:,1]

array([-1.61386598, -0.47100033,  0.57662651, ...,  0.1956713 ,
        0.67186532, -1.61386598])
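The reason for calling `fit_transform` on the training data and only `transform` on the test data is that the mean and standard deviation must come from the training set alone. A small sketch with made-up numbers:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up values standing in for two numeric columns.
train = np.array([[600.0, 30.0], [700.0, 40.0], [800.0, 50.0]])
test = np.array([[650.0, 35.0]])

scaler = StandardScaler()
# Learn mean and standard deviation from the training rows, then scale them.
train_scaled = scaler.fit_transform(train)
# Reuse the training statistics for the test rows; nothing is re-fitted here.
test_scaled = scaler.transform(test)

print(scaler.mean_)
print(train_scaled.mean(axis=0), train_scaled.std(axis=0))
print(test_scaled)
```

The scaled training columns have mean 0 and standard deviation 1; the test rows generally do not, because they are measured against the training statistics.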

In [0]:
y_train = y_train.values

In [0]:
y_test = y_test.values

In [0]:
# Convert the labels to one-hot encoding: the target is a single 0/1 value, but the softmax output layer has 2 units, so the labels need 2 columns as well.
trainY = tf.keras.utils.to_categorical(y_train, num_classes=2)
testY = tf.keras.utils.to_categorical(y_test, num_classes=2)
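For a 0/1 target, the one-hot encoding puts a 1 in column 0 for class 0 and a 1 in column 1 for class 1. The sketch below reproduces that layout with plain NumPy on made-up labels, so it can be inspected without TensorFlow; it is an illustration of the encoding, not a replacement for `to_categorical`.

```python
import numpy as np

# Made-up labels: 1 = exited, 0 = stayed.
labels = np.array([1, 0, 1, 0, 0])

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing it with the labels yields one encoded row per label.
one_hot = np.eye(2, dtype="float32")[labels]

# argmax over the columns recovers the original labels.
recovered = one_hot.argmax(axis=1)

print(one_hot)
print(recovered)
```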


###### 6. Initialize & build the model

Adding 3 hidden layers to see the performance of a deep neural network

In [92]:
# Clear out tensorflow memory
tf.keras.backend.clear_session()

# Initialize Sequential model
model = tf.keras.models.Sequential()

# Normalize the data
model.add(tf.keras.layers.BatchNormalization())

#Add 1st hidden layer
model.add(tf.keras.layers.Dense(900))
model.add(tf.keras.layers.LeakyReLU())
model.add(tf.keras.layers.BatchNormalization())
#model.add(tf.keras.layers.Dropout(0.3))    

#Add 2nd hidden layer
model.add(tf.keras.layers.Dense(800))
model.add(tf.keras.layers.LeakyReLU())
model.add(tf.keras.layers.BatchNormalization())
#model.add(tf.keras.layers.Dropout(0.2))

#Add 3rd hidden layer
model.add(tf.keras.layers.Dense(500))
model.add(tf.keras.layers.LeakyReLU())
model.add(tf.keras.layers.BatchNormalization())
#model.add(tf.keras.layers.Dropout(0.4))

# Add the output layer: 2 units with softmax, so the two class probabilities sum to 1
model.add(tf.keras.layers.Dense(2, activation='softmax'))

# Compile the model
model.compile(optimizer='sgd', loss='categorical_crossentropy',metrics=['accuracy'])

#Execute the model
model.fit(ScaledX_train, trainY, validation_data=(ScaledX_test, testY), epochs=10, batch_size=32)  

Train on 7500 samples, validate on 2500 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7f9fd17067b8>

###### 7. Optimize the model

The loss changes very little from one epoch to the next, which suggests the model has almost stopped learning, a symptom consistent with the vanishing gradient problem. We need an optimizer that takes larger effective steps, hence we will use SGD with momentum.
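As a rough illustration of why momentum helps, the sketch below minimises the toy function f(w) = 0.5 * w² with and without a momentum term. It uses the classical momentum update (velocity = momentum * velocity - lr * gradient), not the Nesterov variant or the learning-rate decay used in the notebook, and the function, starting point and step count are assumptions chosen for the example.

```python
def minimise(lr, momentum, steps=50):
    """Run gradient descent on f(w) = 0.5 * w**2, starting from w = 5.0."""
    w, velocity = 5.0, 0.0
    for _ in range(steps):
        grad = w  # derivative of 0.5 * w**2
        # The velocity accumulates past gradients, so steps in a consistent
        # direction grow larger than lr * grad alone.
        velocity = momentum * velocity - lr * grad
        w = w + velocity
    return w


plain = minimise(lr=0.01, momentum=0.0)
with_momentum = minimise(lr=0.01, momentum=0.9)

# Distance from the minimum at w = 0 after the same number of steps.
print(abs(plain), abs(with_momentum))
```

With the same learning rate and number of steps, the run with momentum ends much closer to the minimum.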

In [94]:
# Clear out tensorflow memory
tf.keras.backend.clear_session()

# Initialize Sequential model
model = tf.keras.models.Sequential()

# Normalize the data
model.add(tf.keras.layers.BatchNormalization())

#Add 1st hidden layer
model.add(tf.keras.layers.Dense(900))
model.add(tf.keras.layers.LeakyReLU())
model.add(tf.keras.layers.BatchNormalization())

#Add 2nd hidden layer
model.add(tf.keras.layers.Dense(800))
model.add(tf.keras.layers.LeakyReLU())
model.add(tf.keras.layers.BatchNormalization())

#Add 3rd hidden layer
model.add(tf.keras.layers.Dense(500))
model.add(tf.keras.layers.LeakyReLU())
model.add(tf.keras.layers.BatchNormalization())

# Add the output layer: 2 units with softmax, so the two class probabilities sum to 1
model.add(tf.keras.layers.Dense(2, activation='softmax'))

# Compile the model
#model.compile(optimizer='sgd', loss='categorical_crossentropy',metrics=['accuracy'])
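# SGD with Nesterov momentum; learning_rate is the current name of the deprecated lr argument
sgd_optim = tf.keras.optimizers.SGD(learning_rate=.01, decay=0.001,momentum=0.9, nesterov=True)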
model.compile(optimizer=sgd_optim, loss='categorical_crossentropy', metrics=['accuracy'])

#Execute the model
model.fit(ScaledX_train, trainY, validation_data=(ScaledX_test, testY), epochs=5,batch_size=32)  


Train on 7500 samples, validate on 2500 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x7f9fd1373cf8>

Adding more hidden layers (5 instead of 3) and changing the optimizer to Adam

In [95]:
# Clear out tensorflow memory
tf.keras.backend.clear_session()

# Initialize Sequential model
model = tf.keras.models.Sequential()

# Normalize the data
model.add(tf.keras.layers.BatchNormalization())

#Add 1st hidden layer
model.add(tf.keras.layers.Dense(900))
model.add(tf.keras.layers.LeakyReLU())
model.add(tf.keras.layers.BatchNormalization())

#Add 2nd hidden layer
model.add(tf.keras.layers.Dense(800))
model.add(tf.keras.layers.LeakyReLU())
model.add(tf.keras.layers.BatchNormalization())

#Add 3rd hidden layer
model.add(tf.keras.layers.Dense(500))
model.add(tf.keras.layers.LeakyReLU())
model.add(tf.keras.layers.BatchNormalization())

#Add 4th hidden layer
model.add(tf.keras.layers.Dense(300))
model.add(tf.keras.layers.LeakyReLU())
model.add(tf.keras.layers.BatchNormalization())

#Add 5th hidden layer
model.add(tf.keras.layers.Dense(200))
model.add(tf.keras.layers.LeakyReLU())
model.add(tf.keras.layers.BatchNormalization())

# Add the output layer: 2 units with softmax, so the two class probabilities sum to 1
model.add(tf.keras.layers.Dense(2, activation='softmax'))

# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy',metrics=['accuracy'])
#sgd_optim = tf.keras.optimizers.SGD(lr=.01, decay=0.001,momentum=0.9, nesterov=True)
#model.compile(optimizer=sgd_optim, loss='categorical_crossentropy', metrics=['accuracy'])

#Execute the model
model.fit(ScaledX_train, trainY, validation_data=(ScaledX_test, testY), epochs=10,batch_size=32)  

Train on 7500 samples, validate on 2500 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7f9fc90c4a58>

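###### 8. Predict the results using 0.5 as a threshold

model.predict returns the softmax probabilities of the two classes; it does not apply a threshold itself. Using 0.5 as a threshold means predicting Exited = 1 whenever the second column (the probability of class 1) is greater than 0.5.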

In [98]:
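# Note: the model was fitted on ScaledX_train; the unscaled X_train passed here
# most likely explains why every row of the output saturates to [0., 1.].
model.predict(X_train)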

array([[0., 1.],
       [0., 1.],
       [0., 1.],
       ...,
       [0., 1.],
       [0., 1.],
       [0., 1.]], dtype=float32)

In [102]:
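# Note: X_test is unscaled here, whereas the model was fitted on scaled data;
# ScaledX_test is the input that matches training.
predicted= model.predict(X_test)
predicted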

array([[0., 1.],
       [0., 1.],
       [0., 1.],
       ...,
       [0., 1.],
       [0., 1.],
       [0., 1.]], dtype=float32)
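Turning class probabilities into 0/1 predictions with a 0.5 threshold can be sketched as follows. The probabilities are made up for illustration; with a real model they would come from calling predict on the scaled test features.

```python
import numpy as np

# Made-up softmax outputs: column 0 = P(stayed), column 1 = P(exited).
probs = np.array(
    [[0.9, 0.1],
     [0.3, 0.7],
     [0.5, 0.5],
     [0.2, 0.8]],
    dtype="float32",
)

THRESHOLD = 0.5

# Predict "exited" (1) only when P(exited) is strictly above the threshold.
predicted_labels = (probs[:, 1] > THRESHOLD).astype(int)
print(predicted_labels)
```

A probability of exactly 0.5 is assigned to class 0 here because the comparison is strict; that tie-breaking choice is an assumption of this sketch.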

###### 9. Print the Accuracy score and confusion matrix

In [100]:
score = model.evaluate(ScaledX_test, testY)
score



[0.3499383521914482, 0.85]

In [101]:
model.metrics_names

['loss', 'accuracy']

In [106]:
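from sklearn.metrics import confusion_matrix

# Column 0 of the one-hot arrays is 1 for "not exited", so in this matrix
# label 1 means the customer stayed and label 0 means the customer exited.
confusion_matrix(testY[:,0],predicted[:,0])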

array([[ 516,    0],
       [1984,    0]])
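For reference, the sketch below shows how accuracy and the confusion matrix are usually computed from 0/1 labels, where 1 means the customer exited. The label arrays are made up; in the notebook they would be y_test and the thresholded predictions.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Made-up true and predicted labels: 1 = exited, 0 = stayed.
y_true = np.array([0, 0, 0, 1, 1, 0, 1, 0])
y_pred = np.array([0, 0, 1, 1, 0, 0, 1, 0])

# Rows are true classes, columns are predicted classes:
# [[true negatives, false positives],
#  [false negatives, true positives]]
cm = confusion_matrix(y_true, y_pred)
acc = accuracy_score(y_true, y_pred)

print(cm)
print(acc)
```

Accuracy is the sum of the diagonal of the matrix divided by the number of samples, so the two outputs can be checked against each other.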