## Churn prediction
Churn prediction addresses the problem of customers or employees leaving a company for various reasons. A company first wants to predict who is likely to leave, and then determine how to reduce that attrition. This small example illustrates the problem using the customers of a large international bank, some of whom decided to leave the bank (`Exited`).
The first model here is an attempt to build a predictive neural network with Keras, without additional features. The dataset is from Kaggle (Churn prediction).

This is an exploration of applying neural networks to predict customer churn as a binary classification problem. To evaluate each model, we use stratified 10-fold cross-validation.
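
To make the evaluation scheme concrete, here is a minimal pure-Python sketch of stratified k-fold splitting, where every fold keeps roughly the same class ratio as the full label set. The helper name `stratified_folds` and the toy labels are illustrative assumptions; the notebook itself relies on scikit-learn's `StratifiedKFold`, which may assign samples to folds differently.

```python
import random
from collections import defaultdict

def stratified_folds(labels, n_splits, seed=7):
    """Assign each sample index to one of n_splits folds, class by class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(n_splits)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % n_splits].append(idx)
    return folds

# Toy labels (hypothetical): 8 customers stayed (0), 4 exited (1).
labels = [0] * 8 + [1] * 4
folds = stratified_folds(labels, n_splits=4)
for k, test_idx in enumerate(folds):
    exited = sum(labels[i] for i in test_idx)
    print(f"fold {k}: size={len(test_idx)}, exited={exited}")
```

Each fold serves once as the held-out test set while the remaining folds are used for training, and the reported score is the mean over folds.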

In [2]:
import pandas as pd
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline

seed = 7
np.random.seed(seed)

Using TensorFlow backend.


In [3]:
df = pd.read_csv("/home/tri/Downloads/Churn_Modelling.csv")
# Drop the row counter and surname columns.
df.drop(['RowNumber', 'Surname'], axis=1, inplace=True)
# Encode the categorical text columns as integer codes.
fields = ['Geography', 'Gender']
for f in fields:
    df[f] = LabelEncoder().fit_transform(df[f])
df.head()

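|   | CustomerId | CreditScore | Geography | Gender | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Exited |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 15634602 | 619 | 0 | 0 | 42 | 2 | 0.0 | 1 | 1 | 1 | 101348.88 | 1 |
| 1 | 15647311 | 608 | 2 | 0 | 41 | 1 | 83807.86 | 1 | 0 | 1 | 112542.58 | 0 |
| 2 | 15619304 | 502 | 0 | 0 | 42 | 8 | 159660.8 | 3 | 1 | 0 | 113931.57 | 1 |
| 3 | 15701354 | 699 | 0 | 0 | 39 | 1 | 0.0 | 2 | 0 | 0 | 93826.63 | 0 |
| 4 | 15737888 | 850 | 2 | 0 | 43 | 2 | 125510.82 | 1 | 1 | 1 | 79084.1 | 0 |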
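
The integer codes in the `Geography` and `Gender` columns come from label encoding. As a rough illustration of what that step does, the sketch below maps each distinct value to its position in the sorted list of distinct values, which is how scikit-learn's `LabelEncoder` orders its classes. The function name `label_encode` and the sample country names are assumptions made for the example; the raw column contents are not shown in this notebook.

```python
def label_encode(values):
    """Map each distinct value to an integer, following sorted order."""
    classes = sorted(set(values))
    mapping = {value: code for code, value in enumerate(classes)}
    return [mapping[v] for v in values], classes

# Hypothetical raw values, used only for illustration.
geography = ["France", "Spain", "France", "Germany", "Spain"]
codes, classes = label_encode(geography)
print(classes)  # ['France', 'Germany', 'Spain']
print(codes)    # [0, 2, 0, 1, 2]
```

Integer codes imply an ordering (0 < 1 < 2) that a dense network treats as numeric. For nominal columns, one-hot encoding is a common alternative.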


In [4]:
def create_baseline():
    # create model
    model = Sequential()
    model.add(Dense(11, input_dim=11, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
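
The network ends in a single sigmoid unit and is trained with binary cross-entropy. The sketch below spells out both formulas in plain Python for a handful of made-up values. The logits, labels, and the clipping constant `eps` are assumptions chosen for illustration, not values taken from the notebook or from Keras internals.

```python
import math

def sigmoid(z):
    """Squash a real-valued score into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Hypothetical raw outputs of the last layer for three customers.
logits = [2.0, -1.0, 0.0]
labels = [1, 0, 1]
probs = [sigmoid(z) for z in logits]
print([round(p, 3) for p in probs])
print(round(binary_crossentropy(labels, probs), 4))
```

The loss is small when the predicted probability is close to the true label and grows quickly for confident wrong predictions.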

In [5]:
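# Baseline without encoding the target or scaling the features
# Columns 0-10 are the features (CustomerId through EstimatedSalary); column 11 is Exited.
x = df.values[:, 0:11].astype(float)
y = df.values[:, 11]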
est = KerasClassifier(build_fn=create_baseline, nb_epoch=10, batch_size=20, verbose=0)
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
res = cross_val_score(est, x, y, cv=kfold)
print("Baseline: %.2f%% (%.2f%%)" % (res.mean()*100, res.std()*100))

Baseline: 44.06% (29.03%)


In [6]:
# pipeline without label encoding
np.random.seed(seed)
estimators = [('standardize', StandardScaler()),\
              ('mlp', KerasClassifier(build_fn=create_baseline, nb_epoch=10, batch_size=20, verbose=0))]
pipeline = Pipeline(estimators)
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
results = cross_val_score(pipeline, x, y, cv=kfold)
print("Standardized: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))

Standardized: 85.25% (0.75%)
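
One plausible reason for the gap between the baseline and the standardized run is feature scale: `x` includes `CustomerId`, whose values are in the tens of millions, next to columns such as `Age`. The sketch below applies the same z-score transform that `StandardScaler` uses (subtract the mean, divide by the population standard deviation) to two columns copied from the `df.head()` output. The helper name `standardize` is an assumption, and five rows are far too few to estimate real column statistics.

```python
from statistics import mean, pstdev

def standardize(column):
    """Return z-scores: (value - mean) / population standard deviation."""
    mu = mean(column)
    sigma = pstdev(column)
    return [(v - mu) / sigma for v in column]

# Values copied from the five rows printed by df.head().
customer_id = [15634602, 15647311, 15619304, 15701354, 15737888]
age = [42, 41, 42, 39, 43]

print("raw ranges:", max(customer_id) - min(customer_id), max(age) - min(age))
for name, column in [("CustomerId", customer_id), ("Age", age)]:
    z = standardize(column)
    print(name, [round(v, 2) for v in z])
```

After the transform both columns have zero mean and unit variance, so no single feature dominates the first layer's weighted sums.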


In [9]:
# with label encoding
encoded_y  =  LabelEncoder().fit_transform(y)
estimators = [('standardize', StandardScaler()),\
              ('mlp', KerasClassifier(build_fn=create_baseline, nb_epoch=10, batch_size=20, verbose=0))]
pipeline = Pipeline(estimators)
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
results = cross_val_score(pipeline, x, encoded_y, cv=kfold)
print("Standardized: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))

Standardized: 84.86% (0.98%)


Credit for this code goes to Kaggler phillipo: https://www.kaggle.com/filippoo/deep-learning-az-ann

In [11]:
def create_smaller():
    model = Sequential()
    model.add(Dense(5, input_dim=11, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

estimators = [('standardize', StandardScaler()),\
              ('mlp', KerasClassifier(build_fn= create_smaller, nb_epoch=10, batch_size=20, verbose=0))]
pipeline = Pipeline(estimators)
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
results = cross_val_score(pipeline, x, encoded_y, cv=kfold)
print("Smaller: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))

Smaller: 83.61% (1.45%)


In [12]:
def create_larger():
    # create model
    model = Sequential()
    model.add(Dense(11, input_dim=11, activation='relu'))
    model.add(Dense(5, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
estimators = [('standardize', StandardScaler()),\
              ('mlp', KerasClassifier(build_fn= create_larger, nb_epoch=10, batch_size=20, verbose=0))]
pipeline = Pipeline(estimators)
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
results = cross_val_score(pipeline, x, encoded_y, cv=kfold)
print("Larger: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))

Larger: 85.27% (0.96%)
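
To put the three architectures side by side, the sketch below counts the trainable parameters (weights plus biases) of each stack of `Dense` layers from the layer widths used in `create_baseline`, `create_smaller`, and `create_larger`. The helper name `dense_param_count` is an assumption introduced for this example.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases for a stack of fully connected layers.

    layer_sizes lists the input width followed by each layer's unit count.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus bias vector
    return total

architectures = {
    "baseline": [11, 11, 1],
    "smaller": [11, 5, 1],
    "larger": [11, 11, 5, 1],
}
for name, sizes in architectures.items():
    print(name, dense_param_count(sizes))
# baseline 144
# smaller 66
# larger 198
```

All three models are tiny, and their cross-validated accuracies differ by less than two percentage points, with standard deviations of roughly one point.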


In addition, we need to implement hyperparameter tuning.
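
A hyperparameter search could be sketched as an exhaustive grid: enumerate every combination of candidate values, score each one, and keep the best. Everything below is an assumption for illustration. The search space is hypothetical, and `dummy_evaluate` is a placeholder that returns a made-up score so the sketch runs on its own. In this notebook it would instead build the pipeline from `params` and return the mean of `cross_val_score`.

```python
from itertools import product

def grid_search(param_grid, evaluate):
    """Try every combination in param_grid and return the best (score, params)."""
    names = sorted(param_grid)
    best = None
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if best is None or score > best[0]:
            best = (score, params)
    return best

# Hypothetical search space; these values are illustrative, not tuned.
param_grid = {
    "hidden_units": [5, 11, 22],
    "batch_size": [10, 20, 40],
    "epochs": [10, 50],
}

def dummy_evaluate(params):
    # Placeholder score, NOT a real model evaluation. Replace with the mean
    # cross-validated accuracy of a pipeline built from `params`.
    return -abs(params["hidden_units"] - 11) - abs(params["batch_size"] - 20)

best_score, best_params = grid_search(param_grid, dummy_evaluate)
print(best_score, best_params)
```

scikit-learn's `GridSearchCV` provides the same loop with cross-validation built in, which is likely the more practical choice once the search space is settled.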