Given a bank customer, build a neural network-based classifier that can determine whether
they will leave or not in the next 6 months.
Dataset Description: The case study uses an open-source dataset from Kaggle.
The dataset contains 10,000 sample points and 14 columns, including
CustomerId, CreditScore, Geography, Gender, Age, Tenure, Balance, and the target, Exited.

Perform the following steps:
1. Read the dataset.
2. Separate the feature set from the target and divide the dataset into training and test sets.
3. Normalize the train and test data.
4. Initialize and build the model. Identify points of improvement and implement them.
5. Print the accuracy score and confusion matrix (5 points).

In [28]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from keras.models import Sequential
from keras.layers import Dense, Dropout
from sklearn.metrics import confusion_matrix

In [29]:
df=pd.read_csv('Churn_Modelling.csv',index_col='RowNumber')
df.head()

RowNumber,CustomerId,Surname,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited
1,15634602,Hargrave,619,France,Female,42,2,0.0,1,1,1,101348.88,1
2,15647311,Hill,608,Spain,Female,41,1,83807.86,1,0,1,112542.58,0
3,15619304,Onio,502,France,Female,42,8,159660.8,3,1,0,113931.57,1
4,15701354,Boni,699,France,Female,39,1,0.0,2,0,0,93826.63,0
5,15737888,Mitchell,850,Spain,Female,43,2,125510.82,1,1,1,79084.1,0


In [30]:
df.describe()

,CustomerId,CreditScore,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited
count,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0
mean,15690940.0,650.5288,38.9218,5.0128,76485.889288,1.5302,0.7055,0.5151,100090.239881,0.2037
std,71936.19,96.653299,10.487806,2.892174,62397.405202,0.581654,0.45584,0.499797,57510.492818,0.402769
min,15565700.0,350.0,18.0,0.0,0.0,1.0,0.0,0.0,11.58,0.0
25%,15628530.0,584.0,32.0,3.0,0.0,1.0,0.0,0.0,51002.11,0.0
50%,15690740.0,652.0,37.0,5.0,97198.54,1.0,1.0,1.0,100193.915,0.0
75%,15753230.0,718.0,44.0,7.0,127644.24,2.0,1.0,1.0,149388.2475,0.0
max,15815690.0,850.0,92.0,10.0,250898.09,4.0,1.0,1.0,199992.48,1.0


In [31]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 10000 entries, 1 to 10000
Data columns (total 13 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   CustomerId       10000 non-null  int64  
 1   Surname          10000 non-null  object 
 2   CreditScore      10000 non-null  int64  
 3   Geography        10000 non-null  object 
 4   Gender           10000 non-null  object 
 5   Age              10000 non-null  int64  
 6   Tenure           10000 non-null  int64  
 7   Balance          10000 non-null  float64
 8   NumOfProducts    10000 non-null  int64  
 9   HasCrCard        10000 non-null  int64  
 10  IsActiveMember   10000 non-null  int64  
 11  EstimatedSalary  10000 non-null  float64
 12  Exited           10000 non-null  int64  
dtypes: float64(2), int64(8), object(3)
memory usage: 1.1+ MB


In [32]:
df.isnull().sum()

CustomerId         0
Surname            0
CreditScore        0
Geography          0
Gender             0
Age                0
Tenure             0
Balance            0
NumOfProducts      0
HasCrCard          0
IsActiveMember     0
EstimatedSalary    0
Exited             0
dtype: int64

In [33]:
df.shape

(10000, 13)

In [34]:
X_col=df.columns.tolist()[2:12]
y_col=df.columns.tolist()[-1:]
print(X_col)
print(y_col)

['CreditScore', 'Geography', 'Gender', 'Age', 'Tenure', 'Balance', 'NumOfProducts', 'HasCrCard', 'IsActiveMember', 'EstimatedSalary']
['Exited']


In [35]:
X=df[X_col].values
y=df[y_col].values

In [36]:
from sklearn.preprocessing import LabelEncoder
#Transforming geography column (string labels to integer codes)
x_col_transform=LabelEncoder()
X[:,1]=x_col_transform.fit_transform(X[:,1])

In [37]:
#Transforming gender column
X[:,2]=x_col_transform.fit_transform(X[:,2])

In [38]:
pipeline = Pipeline(
    [
        ('Categorizer', ColumnTransformer(
            [
                ("Gender One-Hot Encoder", OneHotEncoder(categories = 'auto', drop = 'first'), [2]),
                ("Geography One-Hot Encoder", OneHotEncoder(categories = 'auto', drop = 'first'), [1])
            ], 
            remainder = 'passthrough', n_jobs = 1)),
        ('Normalizer', StandardScaler())
    ]
)
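
To make the effect of `drop = 'first'` concrete, here is a minimal pure-Python sketch of one-hot encoding with the first category dropped. The helper `one_hot_drop_first` and the sample values are illustrative assumptions, not part of the notebook; the sketch only mirrors the idea of sorting the categories and omitting the first one.

```python
def one_hot_drop_first(values):
    """One-hot encode values, omitting the column for the first sorted category."""
    categories = sorted(set(values))
    kept = categories[1:]  # the first category becomes the all-zeros row
    rows = [[1 if v == c else 0 for c in kept] for v in values]
    return rows, kept

# Assumed sample values, for illustration only
rows, kept = one_hot_drop_first(["France", "Spain", "France", "Germany"])
print(kept)  # ['Germany', 'Spain']
print(rows)  # [[0, 0], [0, 1], [0, 0], [1, 0]]
```

A column with k categories therefore contributes k - 1 features, which avoids a redundant, perfectly collinear column.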

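In [39]:
#Split first so that the test set stays unseen during preprocessing
X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.2,random_state=1)

In [40]:
#Fit the encoders and scaler on the training data only, then apply them to the test data
X_train = pipeline.fit_transform(X_train)
X_test = pipeline.transform(X_test)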

In [41]:
#Initialize ANN
classifier = Sequential()

In [42]:
#Add input layer and hidden layer
classifier.add(Dense(6, activation = 'relu', input_shape = (X_train.shape[1], )))
classifier.add(Dropout(rate = 0.1))

In [43]:
#Add second layer
classifier.add(Dense(6, activation = 'relu'))
classifier.add(Dropout(rate = 0.1))

In [44]:
#Add output layer
classifier.add(Dense(1, activation = 'sigmoid'))

In [45]:
#our network summary
classifier.summary()

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_3 (Dense)             (None, 6)                 72        
                                                                 
 dropout_2 (Dropout)         (None, 6)                 0         
                                                                 
 dense_4 (Dense)             (None, 6)                 42        
                                                                 
 dropout_3 (Dropout)         (None, 6)                 0         
                                                                 
 dense_5 (Dense)             (None, 1)                 7         
                                                                 
Total params: 121 (484.00 Byte)
Trainable params: 121 (484.00 Byte)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
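
The parameter counts in the summary can be checked by hand: a Dense layer has one weight per input-unit pair plus one bias per unit. The sketch below assumes 11 input features (8 pass-through columns, 1 Gender dummy, and 2 Geography dummies, i.e. three Geography categories), which is consistent with the 72 parameters reported for the first layer. The helper name `dense_params` is hypothetical.

```python
def dense_params(n_inputs: int, n_units: int) -> int:
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# Assumption: 8 pass-through columns + 1 Gender dummy + 2 Geography dummies
n_features = 8 + 1 + 2

layer_units = [6, 6, 1]  # two hidden layers and the sigmoid output
counts = []
n_in = n_features
for units in layer_units:
    counts.append(dense_params(n_in, units))
    n_in = units  # the next layer's inputs are this layer's outputs

print(counts)       # [72, 42, 7]
print(sum(counts))  # 121
```

Dropout layers add no parameters, which is why they show 0 in the summary.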


In [46]:
#Compile the model: Adam optimizer, binary cross-entropy loss, accuracy metric
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])

In [47]:
#Fitting the Neural Network
history = classifier.fit(X_train, y_train, batch_size = 32, epochs = 200, validation_split = 0.1, verbose = 2)

Epoch 1/200
225/225 - 1s - loss: 0.6251 - accuracy: 0.7282 - val_loss: 0.5484 - val_accuracy: 0.7788 - 1s/epoch - 6ms/step
Epoch 2/200
225/225 - 1s - loss: 0.5075 - accuracy: 0.7993 - val_loss: 0.5063 - val_accuracy: 0.7788 - 669ms/epoch - 3ms/step
Epoch 3/200
225/225 - 1s - loss: 0.4731 - accuracy: 0.7993 - val_loss: 0.4911 - val_accuracy: 0.7788 - 647ms/epoch - 3ms/step
Epoch 4/200
225/225 - 1s - loss: 0.4576 - accuracy: 0.7993 - val_loss: 0.4813 - val_accuracy: 0.7788 - 500ms/epoch - 2ms/step
Epoch 5/200
225/225 - 1s - loss: 0.4466 - accuracy: 0.7993 - val_loss: 0.4768 - val_accuracy: 0.7788 - 519ms/epoch - 2ms/step
Epoch 6/200
225/225 - 0s - loss: 0.4456 - accuracy: 0.7993 - val_loss: 0.4749 - val_accuracy: 0.7788 - 461ms/epoch - 2ms/step
Epoch 7/200
225/225 - 1s - loss: 0.4403 - accuracy: 0.7994 - val_loss: 0.4693 - val_accuracy: 0.7812 - 515ms/epoch - 2ms/step
Epoch 8/200
225/225 - 1s - loss: 0.4376 - accuracy: 0.8067 - val_loss: 0.4660 - val_accuracy: 0.7925 - 517ms/epoch - 2ms/step

Epoch 66/200
225/225 - 0s - loss: 0.3607 - accuracy: 0.8521 - val_loss: 0.3728 - val_accuracy: 0.8363 - 424ms/epoch - 2ms/step
Epoch 67/200
225/225 - 1s - loss: 0.3566 - accuracy: 0.8521 - val_loss: 0.3733 - val_accuracy: 0.8363 - 581ms/epoch - 3ms/step
Epoch 68/200
225/225 - 1s - loss: 0.3611 - accuracy: 0.8507 - val_loss: 0.3720 - val_accuracy: 0.8363 - 547ms/epoch - 2ms/step
Epoch 69/200
225/225 - 1s - loss: 0.3540 - accuracy: 0.8519 - val_loss: 0.3731 - val_accuracy: 0.8325 - 608ms/epoch - 3ms/step
Epoch 70/200
225/225 - 1s - loss: 0.3617 - accuracy: 0.8515 - val_loss: 0.3715 - val_accuracy: 0.8338 - 609ms/epoch - 3ms/step
Epoch 71/200
225/225 - 1s - loss: 0.3574 - accuracy: 0.8515 - val_loss: 0.3737 - val_accuracy: 0.8338 - 554ms/epoch - 2ms/step
Epoch 72/200
225/225 - 1s - loss: 0.3570 - accuracy: 0.8514 - val_loss: 0.3699 - val_accuracy: 0.8400 - 567ms/epoch - 3ms/step
Epoch 73/200
225/225 - 1s - loss: 0.3611 - accuracy: 0.8504 - val_loss: 0.3706 - val_accuracy: 0.8350 - 630ms/e

Epoch 131/200
225/225 - 1s - loss: 0.3535 - accuracy: 0.8535 - val_loss: 0.3659 - val_accuracy: 0.8375 - 649ms/epoch - 3ms/step
Epoch 132/200
225/225 - 1s - loss: 0.3494 - accuracy: 0.8525 - val_loss: 0.3656 - val_accuracy: 0.8388 - 622ms/epoch - 3ms/step
Epoch 133/200
225/225 - 1s - loss: 0.3552 - accuracy: 0.8510 - val_loss: 0.3658 - val_accuracy: 0.8400 - 651ms/epoch - 3ms/step
Epoch 134/200
225/225 - 1s - loss: 0.3522 - accuracy: 0.8512 - val_loss: 0.3652 - val_accuracy: 0.8375 - 621ms/epoch - 3ms/step
Epoch 135/200
225/225 - 1s - loss: 0.3521 - accuracy: 0.8561 - val_loss: 0.3658 - val_accuracy: 0.8388 - 536ms/epoch - 2ms/step
Epoch 136/200
225/225 - 1s - loss: 0.3530 - accuracy: 0.8521 - val_loss: 0.3651 - val_accuracy: 0.8313 - 514ms/epoch - 2ms/step
Epoch 137/200
225/225 - 0s - loss: 0.3518 - accuracy: 0.8537 - val_loss: 0.3645 - val_accuracy: 0.8400 - 465ms/epoch - 2ms/step
Epoch 138/200
225/225 - 1s - loss: 0.3544 - accuracy: 0.8532 - val_loss: 0.3652 - val_accuracy: 0.8350 -

Epoch 195/200
225/225 - 1s - loss: 0.3544 - accuracy: 0.8519 - val_loss: 0.3649 - val_accuracy: 0.8338 - 621ms/epoch - 3ms/step
Epoch 196/200
225/225 - 1s - loss: 0.3518 - accuracy: 0.8558 - val_loss: 0.3658 - val_accuracy: 0.8313 - 631ms/epoch - 3ms/step
Epoch 197/200
225/225 - 1s - loss: 0.3537 - accuracy: 0.8542 - val_loss: 0.3667 - val_accuracy: 0.8313 - 702ms/epoch - 3ms/step
Epoch 198/200
225/225 - 1s - loss: 0.3556 - accuracy: 0.8522 - val_loss: 0.3664 - val_accuracy: 0.8325 - 625ms/epoch - 3ms/step
Epoch 199/200
225/225 - 1s - loss: 0.3467 - accuracy: 0.8556 - val_loss: 0.3662 - val_accuracy: 0.8363 - 610ms/epoch - 3ms/step
Epoch 200/200
225/225 - 1s - loss: 0.3576 - accuracy: 0.8533 - val_loss: 0.3655 - val_accuracy: 0.8363 - 661ms/epoch - 3ms/step
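
The log suggests that val_loss stops improving well before epoch 200, so one possible point of improvement is early stopping. The sketch below illustrates the patience rule in plain Python; the function `early_stop_epoch` and the `toy` series are hypothetical and are not taken from the full training log. Keras offers a comparable mechanism through `keras.callbacks.EarlyStopping` (with `monitor` and `patience` arguments), which could be passed to `fit` via `callbacks`.

```python
def early_stop_epoch(val_losses, patience=10, min_delta=0.0):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Illustrative values only: the loss improves, then plateaus
toy = [0.55, 0.50, 0.45, 0.40, 0.37, 0.366, 0.367, 0.366, 0.368, 0.367]
print(early_stop_epoch(toy, patience=3))  # 9
```

A larger patience tolerates noisier validation curves at the cost of a few extra epochs.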


In [48]:
y_pred = classifier.predict(X_test)
print(y_pred[:5])

[[0.04599902]
 [0.11147252]
 [0.06912845]
 [0.07401666]
 [0.18085948]]


In [49]:
#Convert predicted probabilities to class labels using a cutoff value of 0.5
y_pred = (y_pred > 0.5).astype(int)
print(y_pred[:5])

[[0]
 [0]
 [0]
 [0]
 [0]]


In [50]:
#Build the confusion matrix (rows: actual class, columns: predicted class)
cm = confusion_matrix(y_test, y_pred)
print(cm)

[[1529   56]
 [ 219  196]]


In [51]:
#Accuracy of our NN
print(((cm[0][0] + cm[1][1])* 100) / len(y_test), '% of data was classified correctly')

86.25 % of data was classified correctly
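
Accuracy alone can hide how the model treats the minority class. As a rough follow-up, the sketch below derives precision and recall for the churn class, plus the accuracy of always predicting "stays", directly from the confusion matrix printed above. The variable names are illustrative; the numbers are the ones in the matrix.

```python
# Confusion matrix from the notebook: rows are actual, columns are predicted
cm = [[1529, 56],
      [219, 196]]
(tn, fp), (fn, tp) = cm

total = tn + fp + fn + tp
accuracy = (tn + tp) / total
precision = tp / (tp + fp)    # of predicted churners, how many actually left
recall = tp / (tp + fn)       # of actual churners, how many were caught
baseline = (tn + fp) / total  # accuracy of always predicting "stays"

print(f"accuracy  = {accuracy:.4f}")   # 0.8625
print(f"precision = {precision:.4f}")  # 0.7778
print(f"recall    = {recall:.4f}")     # 0.4723
print(f"baseline  = {baseline:.4f}")   # 0.7925
```

Read this way, the network beats the majority-class baseline by about 7 points but misses more than half of the customers who actually left, so recall on the churn class is a natural target for improvement (for example, by lowering the 0.5 cutoff or weighting the classes).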
