This project models customer churn on a fictional data set provided by a bank. Given each customer's features, the goal is to predict whether the customer will stay with the bank or close the account and leave.

### Steps taken to build the project
1. Import the libraries (tensorflow.keras, numpy, pandas, sklearn)
2. Read the CSV file and clean the data
3. Split the data into train and test sets and standardize the features
4. Build the ANN: add the input layer (weights are initialized randomly) and the hidden layers with activation functions
5. Select the optimizer, loss, and performance metrics, then compile the model
6. Train the model with model.fit
7. Predict on the test set
8. Evaluate the model
9. Adjust the optimization parameters or the model if needed

In [3]:
import tensorflow as tf
from tensorflow import keras # Keras ships inside TensorFlow 2.x
from tensorflow.keras import Sequential # Sequential: a model built as a linear stack of layers
from tensorflow.keras.layers import Flatten, Dense # Dense: a regular fully connected layer (Flatten is imported but not used below)

In [4]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

In [5]:
data = pd.read_csv('Customer_Churn_Modelling.csv')

In [6]:
data.head()

Unnamed: 0,RowNumber,CustomerId,Surname,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited
0,1,15634602,Hargrave,619,France,Female,42,2,0.0,1,1,1,101348.88,1
1,2,15647311,Hill,608,Spain,Female,41,1,83807.86,1,0,1,112542.58,0
2,3,15619304,Onio,502,France,Female,42,8,159660.8,3,1,0,113931.57,1
3,4,15701354,Boni,699,France,Female,39,1,0.0,2,0,0,93826.63,0
4,5,15737888,Mitchell,850,Spain,Female,43,2,125510.82,1,1,1,79084.1,0


In [8]:
X = data.drop(labels=['CustomerId', 'Surname', 'RowNumber', 'Exited'], axis=1)
y = data['Exited']

In [9]:
X.head()

Unnamed: 0,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary
0,619,France,Female,42,2,0.0,1,1,1,101348.88
1,608,Spain,Female,41,1,83807.86,1,0,1,112542.58
2,502,France,Female,42,8,159660.8,3,1,0,113931.57
3,699,France,Female,39,1,0.0,2,0,0,93826.63
4,850,Spain,Female,43,2,125510.82,1,1,1,79084.1


An ANN works on numerical data, not on strings, so the Geography and Gender columns must first be mapped to numbers, using either a label encoder or a one-hot encoder.

In [11]:
from sklearn.preprocessing import LabelEncoder 
# LabelEncoder maps non-numerical labels (as long as they are
# hashable and comparable) to integer labels 0, 1, 2, ...

In [23]:
label1 = LabelEncoder()
X['Geography'] = label1.fit_transform(X['Geography'])
X.head()
# 0 = France, 1 = Germany, 2 = Spain

Unnamed: 0,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary
0,619,0,0,42,2,0.0,1,1,1,101348.88
1,608,2,0,41,1,83807.86,1,0,1,112542.58
2,502,0,0,42,8,159660.8,3,1,0,113931.57
3,699,0,0,39,1,0.0,2,0,0,93826.63
4,850,2,0,43,2,125510.82,1,1,1,79084.1


In [22]:
label = LabelEncoder()
X['Gender'] = label.fit_transform(X['Gender'])
X.head() # 0-female and 1-male

Unnamed: 0,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary
0,619,0,0,42,2,0.0,1,1,1,101348.88
1,608,2,0,41,1,83807.86,1,0,1,112542.58
2,502,0,0,42,8,159660.8,3,1,0,113931.57
3,699,0,0,39,1,0.0,2,0,0,93826.63
4,850,2,0,43,2,125510.82,1,1,1,79084.1


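Geography is a categorical feature with no natural order, so its integer codes are converted to one-hot columns, either with sklearn or with pandas get_dummies. With drop_first=True the first category (code 0, France) is dropped and is represented by zeros in both remaining columns.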

In [24]:
X = pd.get_dummies(X, drop_first=True, columns=['Geography'])
X.head()

Unnamed: 0,CreditScore,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Geography_1,Geography_2
0,619,0,42,2,0.0,1,1,1,101348.88,0,0
1,608,0,41,1,83807.86,1,0,1,112542.58,0,1
2,502,0,42,8,159660.8,3,1,0,113931.57,0,0
3,699,0,39,1,0.0,2,0,0,93826.63,0,0
4,850,0,43,2,125510.82,1,1,1,79084.1,0,1
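
As a variation on the cell above, the label-encoding step for Geography could be skipped, because pandas `get_dummies` accepts string columns directly. The sketch below uses a small made-up frame (the name `toy` and its rows are illustrative, not taken from the data set) and passes `dtype=int` so the result shows 0/1, since recent pandas versions otherwise return booleans.

```python
import pandas as pd

# Illustrative frame that only reuses the notebook's column name.
toy = pd.DataFrame({"Geography": ["France", "Spain", "Germany", "France"]})

# drop_first=True drops the first category in sorted order ("France"),
# so a French customer is encoded as zeros in both remaining columns.
encoded = pd.get_dummies(toy, columns=["Geography"], drop_first=True, dtype=int)
print(encoded)
```

The resulting columns are named after the categories (`Geography_Germany`, `Geography_Spain`) instead of the integer codes, which makes later inspection of the features easier to read.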


### Feature Standardisation

In [25]:
from sklearn.preprocessing import StandardScaler #Standardize features by removing the mean and scaling to unit variance

In [27]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state = 0, stratify = y)

In [28]:
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
X_train

array([[-1.24021723, -1.09665089,  0.77986083, ...,  1.64099027,
        -0.57812007, -0.57504086],
       [ 0.75974873,  0.91186722, -0.27382717, ..., -1.55587522,
         1.72974448, -0.57504086],
       [-1.72725557, -1.09665089, -0.9443559 , ...,  1.1038111 ,
        -0.57812007, -0.57504086],
       ...,
       [-0.51484098,  0.91186722,  0.87565065, ..., -1.01507508,
         1.72974448, -0.57504086],
       [ 0.73902369, -1.09665089, -0.36961699, ..., -1.47887193,
        -0.57812007, -0.57504086],
       [ 0.95663657,  0.91186722, -1.32751517, ...,  0.50945854,
        -0.57812007,  1.73900686]])
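
To make the scaling step concrete, here is a minimal NumPy sketch of what standardization does: the mean and standard deviation are computed on the training rows only and then reused for the test rows, which mirrors `fit_transform` on the train set and `transform` on the test set. The arrays `train` and `test` are made-up values for illustration.

```python
import numpy as np

# Made-up feature matrices: 3 training rows and 1 test row, 2 features each.
train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
test = np.array([[2.0, 40.0]])

# Statistics come from the training data only, so no information
# about the test set leaks into preprocessing.
mean = train.mean(axis=0)
std = train.std(axis=0)  # population standard deviation (ddof=0)

train_scaled = (train - mean) / std
test_scaled = (test - mean) / std  # reuses the training mean and std

print(train_scaled)
print(test_scaled)
```

After scaling, each training column has mean 0 and unit variance; the test columns generally do not, because they are scaled with the training statistics.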

### Building ANN

In [29]:
model = Sequential()
model.add(Dense(X.shape[1], activation='relu', input_dim = X.shape[1]))
model.add(Dense(128, activation='relu')) # hidden layer
model.add(Dense(1, activation = 'sigmoid')) # output layer: one unit giving the probability that the customer exits
# Without an activation function each layer is only a linear (degree-one polynomial) function of its input, so the whole network could model nothing but linear relationships.
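
The remark about activation functions can be checked numerically. The sketch below, with hand-picked illustrative weights (none of these values come from the trained model), shows that two stacked layers without an activation give exactly the same output as one linear layer, while inserting ReLU between them changes the result.

```python
import numpy as np

# Hand-picked inputs and weights, purely illustrative.
x = np.array([[1.0, 2.0], [-1.0, 0.5]])
W1 = np.array([[1.0, -1.0], [0.5, 2.0]])
b1 = np.array([0.0, -1.0])
W2 = np.array([[1.0], [-2.0]])
b2 = np.array([0.5])

# Two layers with no activation in between.
stacked = (x @ W1 + b1) @ W2 + b2

# The same mapping written as a single linear layer.
W = W1 @ W2
b = b1 @ W2 + b2
single = x @ W + b
collapses = np.allclose(stacked, single)

# With ReLU after the first layer the network is no longer linear.
with_relu = np.maximum(x @ W1 + b1, 0.0) @ W2 + b2
differs = not np.allclose(with_relu, single)

print(collapses, differs)
```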

In [31]:
model.compile(optimizer='adam', loss = 'binary_crossentropy', metrics=['accuracy'])
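
The `binary_crossentropy` loss pairs naturally with the single sigmoid output, since it scores a predicted probability against a 0/1 label. The following is a minimal NumPy sketch of the formula, not the Keras implementation; the helper name and the `eps` clipping constant are assumptions added here to avoid taking the log of zero.

```python
import numpy as np

def binary_crossentropy(y_true, p, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip probabilities away from 0 and 1 (eps is an assumed constant).
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    return float(np.mean(-(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))))

# Illustrative labels and predicted probabilities.
uncertain = binary_crossentropy([1, 0], [0.5, 0.5])
confident = binary_crossentropy([1, 0], [0.9, 0.1])
print(uncertain, confident)
```

Confident, correct predictions give a lower loss than predictions of 0.5, which is what the optimizer exploits during training.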

In [34]:
model.fit(X_train, y_train.to_numpy(), batch_size = 10, epochs = 10, verbose =1)

Train on 8000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x237b75e7888>

In [35]:
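# predict_classes was removed in TensorFlow 2.6; thresholding the sigmoid output at 0.5 gives the same class labels
y_pred = (model.predict(X_test) > 0.5).astype('int32')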

In [36]:
y_pred

array([[0],
       [0],
       [0],
       ...,
       [0],
       [1],
       [0]])

In [37]:
y_test

1344    1
8167    0
4747    0
5004    1
3124    1
       ..
9107    0
8249    0
8337    0
6279    1
412     0
Name: Exited, Length: 2000, dtype: int64

In [38]:
model.evaluate(X_test, y_test.to_numpy())



[0.3435504441261292, 0.8605]

In [39]:
from sklearn.metrics import confusion_matrix, accuracy_score

In [40]:
confusion_matrix(y_test, y_pred)

array([[1545,   48],
       [ 231,  176]], dtype=int64)

In [41]:
accuracy_score(y_test, y_pred)

0.8605
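
Accuracy alone can hide how well the churners themselves are detected. As a rough sketch, the confusion matrix printed above can be unpacked into per-class metrics; the layout assumed here is the sklearn convention (rows are true classes, columns are predicted classes), and the variable names are illustrative.

```python
import numpy as np

# Confusion matrix copied from the output above.
cm = np.array([[1545, 48],
               [231, 176]])

# For a 2x2 matrix, ravel() yields: true neg, false pos, false neg, true pos.
tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()
precision = tp / (tp + fp)      # share of predicted churners who really left
recall = tp / (tp + fn)         # share of actual churners that were caught
baseline = (tn + fp) / cm.sum() # accuracy of always predicting "stays"

print(accuracy, precision, recall, baseline)
```

On these numbers the model catches about 43% of the customers who actually left (recall), and about 79% of its churn predictions are correct (precision). Always predicting that a customer stays would already score 0.7965 on this test set, so the 0.8605 accuracy overstates how well churn is detected.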