# Bank Churn Prediction
Given a bank customer, can we build a neural-network classifier that predicts whether they will leave the bank?

The dataset contains 10,000 rows and 14 columns, including CustomerId, CreditScore, Geography, Gender, Age, Tenure and Balance. Know your data: https://www.kaggle.com/barelydedicated/bank-customer-churn-modeling

 

Context:
Service businesses such as banks have to worry about churn, i.e. customers leaving for another service provider. It is important to understand which aspects of the service influence a customer's decision to leave, so that management can focus its improvement efforts on those priorities.

Steps and Milestones (100%):

- Setup Environment and Load Necessary Packages (5%)
- Data Preparation (40%)
  - Loading Data (5%)
  - Cleaning Data (10%)
  - Data Representation & Feature Engineering (If Any) (15%)
  - Creating Train and Validation Set (10%)
- Model Creation (30%)
  - Write & Configure Model (10%)
  - Compile Model (10%)
  - Build Model & Checking Summary (10%)
- Training and Evaluation (25%)
  - Run Multiple Experiments (10%)
  - Reason & Visualize Model Performance (5%)
  - Evaluate Model on Test Set (10%)

Learning Outcomes:

- Neural Networks for Predictive Analytics
- Fine-tuning Model
- Data Preparation
- Feature Engineering
- Visualization

 

The points distribution for this case is as follows:

- Read the data set
- Drop the columns that are unique to each user, such as IDs (2.5 points)
- Separate the feature set from the target (2.5 points)
- Divide the data set into training and test sets (2.5 points)
- Normalize the train and test data (5 points)
- Initialize and build the model (10 points)
- Predict the results using 0.5 as a threshold (5 points)
- Print the accuracy score and confusion matrix (2.5 points)

### Specifying the TensorFlow version
Running `import tensorflow` will import the default version (currently 1.x). You can use 2.x by running a cell with the `tensorflow_version` magic **before** you run `import tensorflow`.

In [1]:
# %tensorflow_version 2.x

### Import TensorFlow
Once you have specified a version via this magic, you can run `import tensorflow` as normal and verify which version was imported as follows:

In [181]:
import tensorflow as tf
print(tf.__version__)

2.0.0


### Set random seed

In [182]:
tf.random.set_seed(42)

### Import dataset
- Import Bank Churn Dataset
- Importing the dataset using the pandas library

In [183]:
import pandas as pd
import numpy as np
df = pd.read_csv("R6_data/bank_churn_data.csv", index_col='RowNumber')
df.head()

RowNumber,CustomerId,Surname,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited
1,15634602,Hargrave,619,France,Female,42,2,0.0,1,1,1,101348.88,1
2,15647311,Hill,608,Spain,Female,41,1,83807.86,1,0,1,112542.58,0
3,15619304,Onio,502,France,Female,42,8,159660.8,3,1,0,113931.57,1
4,15701354,Boni,699,France,Female,39,1,0.0,2,0,0,93826.63,0
5,15737888,Mitchell,850,Spain,Female,43,2,125510.82,1,1,1,79084.1,0


In [184]:
df.describe(include='all').transpose()

Column,count,unique,top,freq,mean,std,min,25%,50%,75%,max
CustomerId,10000,,,,15690900.0,71936.2,15565700.0,15628500.0,15690700.0,15753200.0,15815700.0
Surname,10000,2932.0,Smith,32.0,,,,,,,
CreditScore,10000,,,,650.529,96.6533,350.0,584.0,652.0,718.0,850.0
Geography,10000,3.0,France,5014.0,,,,,,,
Gender,10000,2.0,Male,5457.0,,,,,,,
Age,10000,,,,38.9218,10.4878,18.0,32.0,37.0,44.0,92.0
Tenure,10000,,,,5.0128,2.89217,0.0,3.0,5.0,7.0,10.0
Balance,10000,,,,76485.9,62397.4,0.0,0.0,97198.5,127644.0,250898.0
NumOfProducts,10000,,,,1.5302,0.581654,1.0,1.0,1.0,2.0,4.0
HasCrCard,10000,,,,0.7055,0.45584,0.0,0.0,1.0,1.0,1.0


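### Drop the columns that are unique to each user
- Drop CustomerId, which is an identifier and carries no predictive signal
- Drop Surname as well: with 2,932 distinct values it is too high-cardinality to encode usefully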

In [185]:
df_drop = df.drop(['CustomerId', 'Surname'], axis=1)  # Surname has too many distinct values to one-hot encode usefully
df_drop.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 10000 entries, 1 to 10000
Data columns (total 11 columns):
CreditScore        10000 non-null int64
Geography          10000 non-null object
Gender             10000 non-null object
Age                10000 non-null int64
Tenure             10000 non-null int64
Balance            10000 non-null float64
NumOfProducts      10000 non-null int64
HasCrCard          10000 non-null int64
IsActiveMember     10000 non-null int64
EstimatedSalary    10000 non-null float64
Exited             10000 non-null int64
dtypes: float64(2), int64(7), object(2)
memory usage: 937.5+ KB


### Encoding and scaling values

In [186]:
# Label-encode the categorical (object-dtype) columns
from sklearn.preprocessing import LabelEncoder

# Boolean mask marking the object-dtype (categorical) columns
categorical_feature_mask = df_drop.dtypes == object
# Use the mask to collect the categorical column names into a list
categorical_cols = df_drop.columns[categorical_feature_mask].tolist()
# Fit a fresh encoder per column so each column gets its own integer mapping
df_drop[categorical_cols] = df_drop[categorical_cols].apply(lambda col: LabelEncoder().fit_transform(col))
print(df_drop.info())


<class 'pandas.core.frame.DataFrame'>
Int64Index: 10000 entries, 1 to 10000
Data columns (total 11 columns):
CreditScore        10000 non-null int64
Geography          10000 non-null int64
Gender             10000 non-null int64
Age                10000 non-null int64
Tenure             10000 non-null int64
Balance            10000 non-null float64
NumOfProducts      10000 non-null int64
HasCrCard          10000 non-null int64
IsActiveMember     10000 non-null int64
EstimatedSalary    10000 non-null float64
Exited             10000 non-null int64
dtypes: float64(2), int64(9)
memory usage: 937.5 KB
None


In [187]:
df_drop.head()

RowNumber,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited
1,619,0,0,42,2,0.0,1,1,1,101348.88,1
2,608,2,0,41,1,83807.86,1,0,1,112542.58,0
3,502,0,0,42,8,159660.8,3,1,0,113931.57,1
4,699,0,0,39,1,0.0,2,0,0,93826.63,0
5,850,2,0,43,2,125510.82,1,1,1,79084.1,0
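
Label encoding maps the three Geography values to 0, 1 and 2, which implies an ordering that the countries do not have. One common alternative is one-hot encoding. The sketch below applies `pd.get_dummies` to a small toy frame; the frame `toy` and its values are made up for illustration and are not part of this notebook's pipeline.

```python
import pandas as pd

# Toy frame standing in for df_drop before encoding (values are illustrative)
toy = pd.DataFrame({
    "CreditScore": [619, 608, 502, 699],
    "Geography": ["France", "Spain", "France", "Germany"],
})

# One indicator column per country; the original Geography column is dropped
encoded = pd.get_dummies(toy, columns=["Geography"])
geo_cols = [c for c in encoded.columns if c.startswith("Geography_")]

print(encoded.columns.tolist())
# Each row has exactly one active indicator
print(encoded[geo_cols].astype(int).sum(axis=1).tolist())
```

If this encoding were used here, the feature count would grow from 10 to 12, so the network's `input_shape` would have to change accordingly.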


In [234]:
from scipy.stats import zscore

# Standardize every column to zero mean and unit variance
df_scaled = df_drop.apply(zscore)
X_columns = df_scaled.columns.tolist()[0:10]  # CreditScore through EstimatedSalary

X = df_scaled[X_columns].values  # scaled features
Y = np.array(df_drop['Exited'])  # unscaled target: 1 = exited, 0 = stayed

print(Y)
print(X)

[1 0 1 ... 1 1 0]
[[-0.32622142 -0.90188624 -1.09598752 ...  0.64609167  0.97024255
   0.02188649]
 [-0.44003595  1.51506738 -1.09598752 ... -1.54776799  0.97024255
   0.21653375]
 [-1.53679418 -0.90188624 -1.09598752 ...  0.64609167 -1.03067011
   0.2406869 ]
 ...
 [ 0.60498839 -0.90188624 -1.09598752 ... -1.54776799  0.97024255
  -1.00864308]
 [ 1.25683526  0.30659057  0.91241915 ...  0.64609167 -1.03067011
  -0.12523071]
 [ 1.46377078 -0.90188624 -1.09598752 ...  0.64609167 -1.03067011
  -1.07636976]]
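
`scipy.stats.zscore` computes (x - mean) / std per column using the population standard deviation (`ddof=0`), whereas `describe()` reports the sample standard deviation (`ddof=1`). As a sanity check, the first scaled CreditScore value (-0.32622142) can be approximately reproduced from the rounded summary statistics printed earlier. The small residual comes from that rounding.

```python
import math

n = 10000              # number of rows
mean = 650.529         # CreditScore mean from describe() (rounded)
std_sample = 96.6533   # CreditScore std from describe(), ddof=1 (rounded)

# Convert the sample std (ddof=1) to the population std (ddof=0) used by zscore
std_pop = std_sample * math.sqrt((n - 1) / n)

z = (619 - mean) / std_pop  # the first customer's CreditScore is 619
print(z)
```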


### Create train and test data
- use train_test_split to get train and test set
- set a random_state
- test_size: 0.20

In [235]:
from sklearn.model_selection import train_test_split
print(Y)

Y = Y.astype('bool_')
print(Y.dtype)

X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.20, random_state=8)

[1 0 1 ... 1 1 0]
bool
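
The task list asks to normalize the train and test data, while the cell above z-scores the whole frame before splitting, so test rows contribute to the mean and standard deviation. A minimal sketch of the alternative, learning the scaling parameters from the training rows only, is shown below with NumPy on synthetic data. The arrays `X_train_raw` and `X_test_raw` are hypothetical stand-ins for unscaled splits. scikit-learn's `StandardScaler` offers the same fit-on-train, transform-both pattern.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-ins for unscaled train/test feature matrices (illustrative only)
X_train_raw = rng.normal(loc=650.0, scale=95.0, size=(80, 3))
X_test_raw = rng.normal(loc=650.0, scale=95.0, size=(20, 3))

# Learn the scaling parameters from the training rows only
mu = X_train_raw.mean(axis=0)
sigma = X_train_raw.std(axis=0)

# Apply the same parameters to both splits
X_train_std = (X_train_raw - mu) / sigma
X_test_std = (X_test_raw - mu) / sigma

print(X_train_std.mean(axis=0).round(6), X_train_std.std(axis=0).round(6))
```

The test split will generally not have exactly zero mean and unit variance after this transform, which is expected.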


In [236]:
from tensorflow.keras.utils import to_categorical

# One-hot encode the binary target: stayed -> [1, 0], exited -> [0, 1]
y_train = to_categorical(y_train, 2, dtype='int')
y_test = to_categorical(y_test, 2, dtype='int')
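
To make the shape of the one-hot target concrete, here is a small sketch in plain NumPy rather than `to_categorical`. The `labels` array is a toy example. It also shows why later cells call `argmax(axis=1)` to get back integer class labels.

```python
import numpy as np

labels = np.array([True, False, True, False])  # toy boolean targets: True = exited

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(2, dtype=int)[labels.astype(int)]
print(one_hot)

# argmax along axis 1 recovers the integer class labels
recovered = one_hot.argmax(axis=1)
print(recovered)
```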

### Initialize a sequential model
- Define a sequential model

In [237]:
import tensorflow as tf
from tensorflow.keras.layers import Dense

# Initialize an empty Sequential model; layers are added in the next cell
model = tf.keras.Sequential()


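### Add layers
- Use a Dense hidden layer of 18 units with an input shape of 10 (one per feature), followed by a Dense hidden layer of 20 units, both with ReLU activation
- Finish with a Dense output layer of 2 units (one per class) and apply softmax so the outputs are class probabilities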

In [238]:
#Add Dense layer for prediction - Keras declares weights and bias automatically
model.add(Dense(18, activation='relu', input_shape=(10,)))
model.add(Dense(20, activation='relu'))
model.add(Dense(2, activation='softmax'))

### Compile the model
- Use SGD as Optimizer
- Use categorical_crossentropy as loss function
- Use accuracy as metrics

In [239]:
#Compile the model - add Loss and Gradient Descent optimizer
model.compile(optimizer='sgd', loss='categorical_crossentropy',metrics=['accuracy'])
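
As a rough illustration of what the softmax output and the `categorical_crossentropy` loss compute for a single sample, here is a plain-Python sketch. The scores `[2.0, 0.0]` are hypothetical, and the functions are simplified stand-ins, not the Keras implementations. Keras reports the mean of this per-sample loss.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = [v - max(logits) for v in logits]  # shift for numerical stability
    exps = [math.exp(v) for v in shifted]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(y_true, y_prob):
    """Negative log-probability assigned to the true class (one-hot y_true)."""
    return -sum(t * math.log(p) for t, p in zip(y_true, y_prob))

probs = softmax([2.0, 0.0])  # hypothetical output-layer scores
loss_right = categorical_crossentropy([1, 0], probs)  # true class is the likely one
loss_wrong = categorical_crossentropy([0, 1], probs)  # true class is the unlikely one
print(probs, loss_right, loss_wrong)
```

The loss is small when the model puts high probability on the true class and grows as that probability shrinks.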

### Summarize the model
- Check model layers
- Understand number of trainable parameters

In [240]:
model.summary()

Model: "sequential_69"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_272 (Dense)            (None, 18)                198       
_________________________________________________________________
dense_273 (Dense)            (None, 20)                380       
_________________________________________________________________
dense_274 (Dense)            (None, 2)                 42        
Total params: 620
Trainable params: 620
Non-trainable params: 0
_________________________________________________________________
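
The parameter counts in the summary can be checked by hand: a Dense layer has one weight per input-unit pair plus one bias per unit. The helper `dense_params` below is a hypothetical name used only for this check.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

layer_sizes = [10, 18, 20, 2]  # input features, two hidden layers, output classes
counts = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
print(counts, sum(counts))  # [198, 380, 42] 620
```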


### Fit the model
- Pass the training features and labels as the training data
- Epochs: 100
- Pass the test features and labels as validation data

In [241]:
model.fit(X_train, y_train, epochs=100, validation_data=(X_test,y_test))

Train on 8000 samples, validate on 2000 samples
Epoch 1/100
...
Epoch 100/100


<tensorflow.python.keras.callbacks.History at 0x1a48db5400>

### Compare the prediction with the actual label
- Print the actual (one-hot) label of the first test row for comparison with the model's predictions

In [242]:
y_test[0:1]

array([[1, 0]])

In [243]:
score = model.evaluate(X_test, y_test,verbose=1)

print(score)



[0.34554733192920684, 0.8635]


In [248]:
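from sklearn import metrics

# Rounding the softmax outputs applies the 0.5 threshold to each class probability
y_pred = np.round(model.predict(X_test))
print(y_pred.shape)
print(y_pred[0:10])

# Rows are actual classes, columns are predicted classes (0 = stayed, 1 = exited)
cm = metrics.confusion_matrix(y_test.argmax(axis=1), y_pred.argmax(axis=1))
cm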

(2000, 2)
[[1. 0.]
 [1. 0.]
 [1. 0.]
 [1. 0.]
 [1. 0.]
 [0. 1.]
 [1. 0.]
 [1. 0.]
 [1. 0.]
 [1. 0.]]


array([[1531,   56],
       [ 217,  196]])
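
Accuracy alone can hide how the model treats the minority class. As a quick check, the sketch below derives accuracy, precision and recall for the "exited" class from the four counts in the confusion matrix above. The variable names are chosen here for illustration.

```python
# Counts copied from the confusion matrix above (rows = actual, columns = predicted)
tn, fp = 1531, 56
fn, tp = 217, 196

total = tn + fp + fn + tp
accuracy = (tn + tp) / total
precision = tp / (tp + fp)  # share of predicted churners who actually churned
recall = tp / (tp + fn)     # share of actual churners the model caught

print(f"accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f}")
```

The accuracy agrees with the 0.8635 reported by `model.evaluate`, while the recall shows that the model misses more than half of the customers who actually left.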

### Grid search with GridSearchCV

In [249]:
def create_model():
  model_2 = tf.keras.Sequential()
  model_2.add(Dense(20, activation='relu', input_shape=(10,)))
  model_2.add(Dense(30, activation='relu'))
  model_2.add(Dense(20, activation='relu'))
  model_2.add(Dense(2, activation='softmax'))
  model_2.compile(optimizer='sgd', loss='categorical_crossentropy',metrics=['accuracy'])
  return model_2

In [250]:
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import GridSearchCV

# Wrap the Keras model so scikit-learn can clone and fit it
model_KC = KerasClassifier(build_fn=create_model)

# Hyperparameters to search: 2 batch sizes x 3 epoch counts = 6 combinations
batches = [100, 1000]
epochs = [1, 10, 50]
param_grid = dict(epochs=epochs, batch_size=batches)

In [251]:
gs = GridSearchCV(model_KC, param_grid=param_grid,cv=5,scoring='accuracy')
grid_result = gs.fit(X_train,y_train.argmax(axis=1))

Train on 6400 samples
...
Train on 8000 samples
Epoch 1/50
...
Epoch 50/50


In [252]:
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means,stds,params):
    print("%f (%f) with: %r" % (mean, stdev, param))

Best: 0.834375 using {'batch_size': 100, 'epochs': 50}
0.796875 (0.012203) with: {'batch_size': 100, 'epochs': 1}
0.801375 (0.017050) with: {'batch_size': 100, 'epochs': 10}
0.834375 (0.010990) with: {'batch_size': 100, 'epochs': 50}
0.539500 (0.170285) with: {'batch_size': 1000, 'epochs': 1}
0.796750 (0.012446) with: {'batch_size': 1000, 'epochs': 10}
0.797000 (0.012459) with: {'batch_size': 1000, 'epochs': 50}
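
The size of this search can be worked out from the grid and the fold count. The sketch below uses only the standard library and the values set in the cells above: six parameter combinations, each fitted once per fold on four fifths of the 8000 training rows. With its default `refit=True`, `GridSearchCV` then fits the best combination once more on all 8000 rows.

```python
from itertools import product

batches = [100, 1000]
epochs = [1, 10, 50]
cv_folds = 5
n_train = 8000

combos = list(product(batches, epochs))  # every (batch_size, epochs) pair
n_fits = len(combos) * cv_folds          # one fit per combination per fold
rows_per_fit = n_train * (cv_folds - 1) // cv_folds  # each fit sees 4/5 of the rows

print(len(combos), n_fits, rows_per_fit)
```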


In [253]:
df_drop.head()

RowNumber,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited
1,619,0,0,42,2,0.0,1,1,1,101348.88,1
2,608,2,0,41,1,83807.86,1,0,1,112542.58,0
3,502,0,0,42,8,159660.8,3,1,0,113931.57,1
4,699,0,0,39,1,0.0,2,0,0,93826.63,0
5,850,2,0,43,2,125510.82,1,1,1,79084.1,0


In [254]:
# Rebuild features and target from df_drop (label-encoded, not z-scored)
df_1 = df_drop.copy(deep=True)
Y_cv = df_1['Exited']
X_cv = df_1.drop(['Exited'], axis=1).values

In [255]:
from sklearn.model_selection import StratifiedKFold
from sklearn.model_selection import cross_val_score
model_KC_CV = KerasClassifier(build_fn=create_model,epochs=150, batch_size=10, verbose=0)

kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=7)
results = cross_val_score(model_KC_CV, X_cv, Y_cv, cv=kfold, scoring='accuracy')
print(results.mean())

0.7963001779001779
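
One way to put the cross-validated score in context is to compare it with the accuracy of always predicting "stayed". The sketch below takes the class counts from the test-set confusion matrix earlier in the notebook, which is an assumption: the class balance of the full dataset may differ slightly.

```python
# Test-set class counts taken from the confusion matrix earlier in the notebook
stayed = 1531 + 56   # actual class 0
exited = 217 + 196   # actual class 1

# Accuracy of a trivial model that always predicts "stayed"
baseline = stayed / (stayed + exited)

cv_accuracy = 0.7963001779001779  # mean from cross_val_score above
holdout_accuracy = 0.8635         # from model.evaluate on the test set

print(f"baseline={baseline:.4f}")
print(f"cv gain over baseline={cv_accuracy - baseline:+.4f}")
print(f"holdout gain over baseline={holdout_accuracy - baseline:+.4f}")
```

The cross-validated model sits within a fraction of a point of this baseline, while the first model beats it by about seven points. `X_cv` is built from `df_drop`, which is label-encoded but not z-scored, unlike the `X` used for the first model. That difference may be one reason for the gap, although the notebook does not test it.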