# Churn Prediction

- Data and business problem: the aim is to predict customer churn for a bank, i.e. which customers are going to leave the bank's service.

- The dataset is small (for learning purposes) and contains 10000 rows and 14 columns.

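- Encode the categorical columns, Geography and Gender: label encoding for both, plus one-hot encoding for Geography. (This notebook does it in one step with `pd.get_dummies`, which one-hot encodes both columns.)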

- Do Feature Scaling (Standard Scaling)

- Use 80 percent of the data for training and 20 percent for testing

- Let's use two hidden layers with 6 neurons each, and an output layer. Batch_size=32, epochs=10 (the training cell in this notebook runs 50 epochs).


In [48]:
import pandas as pd
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import keras
from keras.models import Sequential
from keras.layers import Dense
from sklearn.metrics import confusion_matrix, classification_report
import numpy as np
import tensorflow as tf

df = pd.read_csv('Datasets/Churn_Modelling.csv')



In [2]:
df.head()

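|   | RowNumber | CustomerId | Surname | CreditScore | Geography | Gender | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Exited |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 15634602 | Hargrave | 619 | France | Female | 42 | 2 | 0.0 | 1 | 1 | 1 | 101348.88 | 1 |
| 1 | 2 | 15647311 | Hill | 608 | Spain | Female | 41 | 1 | 83807.86 | 1 | 0 | 1 | 112542.58 | 0 |
| 2 | 3 | 15619304 | Onio | 502 | France | Female | 42 | 8 | 159660.8 | 3 | 1 | 0 | 113931.57 | 1 |
| 3 | 4 | 15701354 | Boni | 699 | France | Female | 39 | 1 | 0.0 | 2 | 0 | 0 | 93826.63 | 0 |
| 4 | 5 | 15737888 | Mitchell | 850 | Spain | Female | 43 | 2 | 125510.82 | 1 | 1 | 1 | 79084.1 | 0 |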


## One Hot Encoding

In [2]:
X = df.iloc[:, 3:13]
y = df.iloc[:, 13]
X = pd.get_dummies(X).values  # one-hot encodes both Gender and Geography
print(X)

[[619.  42.   2. ...   0.   1.   0.]
 [608.  41.   1. ...   1.   1.   0.]
 [502.  42.   8. ...   0.   1.   0.]
 ...
 [709.  36.   7. ...   0.   1.   0.]
 [772.  42.   3. ...   0.   0.   1.]
 [792.  28.   4. ...   0.   1.   0.]]
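To make the encoding step concrete, here is a minimal sketch of what `pd.get_dummies` does to a frame that mixes numeric and text columns. The three-row frame is a hypothetical sample that only borrows the dataset's column names, and `dtype=int` is an assumption added so the indicator columns print as 0/1 regardless of pandas version.

```python
import pandas as pd

# Hypothetical three-row sample using the dataset's column names.
toy = pd.DataFrame({
    "CreditScore": [619, 608, 502],
    "Geography": ["France", "Spain", "France"],
    "Gender": ["Female", "Female", "Male"],
})

# Numeric columns pass through unchanged; every distinct value of a text
# column becomes its own 0/1 indicator column named "<column>_<value>".
encoded = pd.get_dummies(toy, dtype=int)

print(encoded.columns.tolist())
print(encoded.values)
```

In the notebook, the 8 numeric features plus the indicator columns for Geography and Gender add up to the 13 columns reported by `X_train.shape`.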


In [3]:
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [4]:
sc = StandardScaler()
X_train = sc.fit_transform(x_train)
X_test = sc.transform(x_test)
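The scaler is fitted on the training rows only and then reused on the test rows, so no test-set statistics leak into preprocessing. The sketch below reproduces that idea with plain NumPy on hypothetical toy numbers; it is an illustration of the arithmetic, not the `StandardScaler` implementation itself.

```python
import numpy as np

# Hypothetical toy data: 4 training rows and 2 test rows, 2 features each.
train = np.array([[600.0, 30.0], [700.0, 40.0], [650.0, 50.0], [850.0, 40.0]])
test = np.array([[700.0, 40.0], [500.0, 20.0]])

# "fit": learn the per-column mean and standard deviation from training rows only.
mean = train.mean(axis=0)
std = train.std(axis=0)  # population standard deviation (ddof=0)

# "transform": apply the training statistics to both sets.
train_scaled = (train - mean) / std
test_scaled = (test - mean) / std

print(train_scaled.mean(axis=0))  # approximately 0 for each column
print(train_scaled.std(axis=0))   # approximately 1 for each column
print(test_scaled)                # scaled with the training statistics
```

The test set generally does not end up with zero mean and unit variance, which is expected.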

In [5]:
print(X_train.shape)

(8000, 13)


## Define Model

Two hidden layers with 6 neurons each, and an output layer. Batch_size=32, epochs=10 (the training cell below runs 50 epochs).

In [12]:
model = Sequential()

model.add(Dense(6, kernel_initializer='uniform', activation='relu', input_shape=(13,)))

model.add(Dense(6, kernel_initializer='uniform', activation='relu'))

model.add(Dense(1, kernel_initializer='uniform', activation='sigmoid'))
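As a sanity check on the architecture, the number of trainable parameters can be worked out by hand: a dense layer has one weight per input-unit pair plus one bias per unit. The helper name `dense_params` below is hypothetical, used only for this sketch.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# 13 input features -> 6 units -> 6 units -> 1 output unit
layer_sizes = [13, 6, 6, 1]
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)

print(per_layer)  # [84, 42, 7]
print(total)      # 133
```

Calling `model.summary()` on the model defined above should report matching counts.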

## Compile Model

In [14]:
model.compile(loss=keras.losses.binary_crossentropy,
              optimizer='adam',
              metrics=['accuracy'])
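Binary cross-entropy penalises confident wrong probabilities much more than confident right ones. The following NumPy sketch computes the standard formula, $-\frac{1}{n}\sum_i \left[y_i \log p_i + (1-y_i)\log(1-p_i)\right]$, on hypothetical values; the clipping constant `eps` is an assumption added to avoid `log(0)` and may differ from what Keras uses internally.

```python
import numpy as np

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip so that log() never receives exactly 0 or 1 (assumed epsilon).
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    losses = -(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob))
    return float(np.mean(losses))

# Hypothetical predictions for one churner (1) and one non-churner (0).
confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])

print(confident_right)  # about 0.105, i.e. -log(0.9)
print(confident_wrong)  # about 2.303, i.e. -log(0.1)
```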

## Train Model

In [39]:
model.fit(X_train, y_train, batch_size=32, epochs=50, validation_split=0.2)

Train on 6400 samples, validate on 1600 samples
Epoch 1/50
...
Epoch 50/50


<keras.callbacks.History at 0x1471ace90>
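The sample counts in the training log follow directly from `validation_split=0.2` and `batch_size=32`. A small arithmetic sketch (variable names are illustrative):

```python
import math

n_train_rows = 8000       # rows in X_train
validation_split = 0.2
batch_size = 32

n_val = int(n_train_rows * validation_split)  # rows held out for validation
n_fit = n_train_rows - n_val                  # rows actually used for weight updates
steps_per_epoch = math.ceil(n_fit / batch_size)

print(n_fit, n_val)       # 6400 1600
print(steps_per_epoch)    # 200
```

According to the Keras documentation, the validation rows are taken from the end of the arrays before any shuffling, so the held-out 1600 rows are the last 1600 rows of `X_train`.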

## Evaluating the Model

In [40]:
y_pred = model.predict(X_test)

In [41]:
# Check the class balance to choose a decision threshold
labels = y.value_counts()
print(labels)

0    7963
1    2037
Name: Exited, dtype: int64


In [42]:
threshold = labels[1]/len(y)

In [43]:
print(threshold)  # the fraction of customers with Exited == 1

0.2037


In [44]:
# Use the class-1 rate instead of 0.5 as the cutoff, so more customers are flagged as exited
y_pred = (y_pred > threshold)
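To see the effect of lowering the cutoff, the sketch below applies both the conventional 0.5 threshold and the 0.2037 threshold to a handful of hypothetical probabilities, shaped `(n, 1)` like the output of `model.predict` for a single sigmoid unit.

```python
import numpy as np

# Hypothetical predicted churn probabilities, one column like model.predict output.
probs = np.array([[0.05], [0.15], [0.25], [0.45], [0.80]])

default_labels = (probs > 0.5).astype(int).ravel()
lowered_labels = (probs > 0.2037).astype(int).ravel()

print(default_labels)  # [0 0 0 0 1]
print(lowered_labels)  # [0 0 1 1 1]
```

The lower cutoff flags more customers as likely churners, which tends to raise recall on class 1 at the cost of more false positives.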

In [45]:
cm = confusion_matrix(y_test, y_pred)
print(cm)

[[1118  489]
 [ 102  291]]
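scikit-learn lays out the confusion matrix with true labels on the rows and predicted labels on the columns, so for a binary problem it reads `[[TN, FP], [FN, TP]]`. The sketch below unpacks the matrix printed above; the row sums should match the `support` column of the classification report further down.

```python
import numpy as np

# Confusion matrix values from the output above.
cm = np.array([[1118, 489],
               [102, 291]])

# Row-major flattening gives TN, FP, FN, TP for a binary problem.
tn, fp, fn, tp = cm.ravel()

actual_stayed = tn + fp      # customers whose true label is 0
actual_exited = fn + tp      # customers whose true label is 1
predicted_exited = fp + tp   # customers the model flagged as 1

print(tn, fp, fn, tp)                     # 1118 489 102 291
print(actual_stayed, actual_exited)       # 1607 393
print(predicted_exited)                   # 780
```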


## Evaluate Model Metrics

In [46]:
# Functions for metrics adapted from https://datascience.stackexchange.com/questions/45165/how-to-get-accuracy-f1-precision-and-recall-for-a-keras-model
# scikit-learn's confusion_matrix has true labels on rows and predictions on columns,
# so cm[1][0] holds the false negatives and cm[0][1] holds the false positives.
def recall(cm):
    true_positives = cm[1][1]
    false_negatives = cm[1][0]
    return true_positives / (true_positives + false_negatives)

def precision(cm):
    true_positives = cm[1][1]
    false_positives = cm[0][1]
    return true_positives / (true_positives + false_positives)

def f1(cm):
    precision_val = precision(cm)
    recall_val = recall(cm)
    return 2 * (precision_val * recall_val) / (precision_val + recall_val)

def accuracy(y_pred, cm):
    TP, TN = cm[1][1], cm[0][0]
    return (TP + TN) / len(y_pred)

In [47]:
print(f'Recall: {recall(cm)}')
print(f'Precision: {precision(cm)}')
print(f'F1-Score: {f1(cm)}')
print(f'Accuracy: {accuracy(y_pred, cm)}')

Recall: 0.7404580152671756
Precision: 0.3730769230769231
F1-Score: 0.4961636828644502
Accuracy: 0.7045


Alternatively, scikit-learn's `classification_report` prints precision, recall and F1 for each class:

In [50]:
print(classification_report(y_test, y_pred))

              precision    recall  f1-score   support

           0       0.92      0.70      0.79      1607
           1       0.37      0.74      0.50       393

    accuracy                           0.70      2000
   macro avg       0.64      0.72      0.64      2000
weighted avg       0.81      0.70      0.73      2000
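The `macro avg` and `weighted avg` rows can be reproduced by hand from the confusion matrix, which helps when reading the report: the macro average treats both classes equally, while the weighted average weights each class by its support. A sketch for the precision column, using the counts from the confusion matrix above:

```python
# Counts from the confusion matrix above: [[TN, FP], [FN, TP]].
tn, fp, fn, tp = 1118, 489, 102, 291

# Per-class precision: correct predictions of a class / all predictions of that class.
precision_0 = tn / (tn + fn)
precision_1 = tp / (tp + fp)

# Support: number of true instances of each class in the test set.
support_0 = tn + fp
support_1 = fn + tp
total = support_0 + support_1

macro_precision = (precision_0 + precision_1) / 2
weighted_precision = (precision_0 * support_0 + precision_1 * support_1) / total

print(round(precision_0, 2), round(precision_1, 2))  # 0.92 0.37
print(round(macro_precision, 2))                     # 0.64
print(round(weighted_precision, 2))                  # 0.81
```

Because class 0 makes up most of the test set, the weighted average sits much closer to the class-0 precision than the macro average does.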

