# Bank Customer Classification

It is no secret that customer retention is a top priority for many companies: acquiring new customers can be several times more expensive than retaining existing ones. A churn model predicts which customers are likely to leave the bank.

In this dataset, we have to consider which factors may play a role in someone leaving a bank. To do that, we look at each column and judge whether it will matter when classifying a new customer. The information about a customer is contained in columns 0 through 12 (RowNumber through EstimatedSalary), while the output (whether the customer exited or not) is stored in column 13 (Exited).
Once the identifier columns (RowNumber, CustomerId, Surname) are set aside, ten features remain for each customer:

- CreditScore
- Geography
- Gender
- Age
- Tenure
- Balance
- NumOfProducts
- HasCrCard
- IsActiveMember
- EstimatedSalary

The label for each record is 1 if the customer left the bank and 0 if the customer stayed.

The data can be downloaded from: https://www.kaggle.com/aakash50897/churn-modellingcsv

## Loading the data
To load the data and format it nicely, we will use two very useful packages, pandas and NumPy.

In [28]:
# Import pandas and NumPy
import pandas as pd
import numpy as np

# Load the dataset and preview the first five rows
df = pd.read_csv('Churn_Modelling.csv')
df[:5]

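|   | RowNumber | CustomerId | Surname | CreditScore | Geography | Gender | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Exited |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 15634602 | Hargrave | 619 | France | Female | 42 | 2 | 0.0 | 1 | 1 | 1 | 101348.88 | 1 |
| 1 | 2 | 15647311 | Hill | 608 | Spain | Female | 41 | 1 | 83807.86 | 1 | 0 | 1 | 112542.58 | 0 |
| 2 | 3 | 15619304 | Onio | 502 | France | Female | 42 | 8 | 159660.8 | 3 | 1 | 0 | 113931.57 | 1 |
| 3 | 4 | 15701354 | Boni | 699 | France | Female | 39 | 1 | 0.0 | 2 | 0 | 0 | 93826.63 | 0 |
| 4 | 5 | 15737888 | Mitchell | 850 | Spain | Female | 43 | 2 | 125510.82 | 1 | 1 | 1 | 79084.1 | 0 |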


## Removing the unnecessary columns

In [29]:
# RowNumber, CustomerId and Surname only identify a customer, so drop them
df = df.drop(['RowNumber', 'CustomerId', 'Surname'], axis=1)

## Splitting the data into features and targets (labels)
Now, we'll split the data into features (X) and targets (y).


In [30]:
# Features: every column except the target
X = df.drop(columns=['Exited'])

# Target: 1 if the customer exited, 0 otherwise
Y = df.Exited

### Encoding categorical data…
Next, we encode the string-valued features (Geography and Gender) as numbers, since a machine learning algorithm can only work with numeric input.

In [31]:
# One-hot encode Geography, then drop the original string column
X = pd.concat([X, pd.get_dummies(X['Geography'])], axis=1)
X = X.drop(['Geography'], axis=1)
# Do the same for Gender
X = pd.concat([X, pd.get_dummies(X['Gender'])], axis=1)
X = X.drop(['Gender'], axis=1)
X[:10]

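|   | CreditScore | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | France | Germany | Spain | Female | Male |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 619 | 42 | 2 | 0.0 | 1 | 1 | 1 | 101348.88 | 1 | 0 | 0 | 1 | 0 |
| 1 | 608 | 41 | 1 | 83807.86 | 1 | 0 | 1 | 112542.58 | 0 | 0 | 1 | 1 | 0 |
| 2 | 502 | 42 | 8 | 159660.8 | 3 | 1 | 0 | 113931.57 | 1 | 0 | 0 | 1 | 0 |
| 3 | 699 | 39 | 1 | 0.0 | 2 | 0 | 0 | 93826.63 | 1 | 0 | 0 | 1 | 0 |
| 4 | 850 | 43 | 2 | 125510.82 | 1 | 1 | 1 | 79084.1 | 0 | 0 | 1 | 1 | 0 |
| 5 | 645 | 44 | 8 | 113755.78 | 2 | 1 | 0 | 149756.71 | 0 | 0 | 1 | 0 | 1 |
| 6 | 822 | 50 | 7 | 0.0 | 2 | 1 | 1 | 10062.8 | 1 | 0 | 0 | 0 | 1 |
| 7 | 376 | 29 | 4 | 115046.74 | 4 | 1 | 0 | 119346.88 | 0 | 1 | 0 | 1 | 0 |
| 8 | 501 | 44 | 4 | 142051.07 | 2 | 0 | 1 | 74940.5 | 1 | 0 | 0 | 0 | 1 |
| 9 | 684 | 27 | 2 | 134603.88 | 1 | 1 | 1 | 71725.73 | 1 | 0 | 0 | 0 | 1 |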
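
To make the encoding step concrete, here is a minimal sketch of what `pd.get_dummies` produces on a toy frame. The three rows are made up for illustration and are not taken from the dataset. Passing `dtype=int` is an assumption added here so that the indicator columns come out as 0/1, since recent pandas versions return booleans by default.

```python
import pandas as pd

# Toy data, made up for illustration only
toy = pd.DataFrame({
    "Geography": ["France", "Spain", "Germany"],
    "Gender": ["Female", "Male", "Female"],
    "Age": [42, 41, 29],
})

# One indicator column per category; dtype=int gives 0/1 instead of booleans
encoded = pd.concat(
    [
        toy.drop(columns=["Geography", "Gender"]),
        pd.get_dummies(toy["Geography"], dtype=int),
        pd.get_dummies(toy["Gender"], dtype=int),
    ],
    axis=1,
)
print(encoded.columns.tolist())
print(encoded)
```

Within each group the indicator columns always sum to 1, so one column per group is redundant. If that matters for a given model, `pd.get_dummies(..., drop_first=True)` is one way to drop it.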


## Splitting the data into Train and Test 

In [32]:
from sklearn.model_selection import train_test_split
xTrain, xTest, yTrain, yTest = train_test_split(X, Y, test_size = 0.2, random_state = 1)
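
As a quick sanity check on what `test_size = 0.2` means, the sketch below splits a made-up array of 100 samples. The `stratify` argument is not used in the cell above; it is shown here only as an optional variation that keeps the class ratio the same in both splits, which can help when one class is rarer than the other.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data, made up for illustration: 100 samples, 3 features, 20% positives
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(100, 3))
y_toy = np.array([1] * 20 + [0] * 80)

# 80/20 split; stratify keeps the share of positives equal in both parts
X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.2, random_state=1, stratify=y_toy
)

print(X_tr.shape, X_te.shape)
print(y_tr.mean(), y_te.mean())
```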

## Feature Scaling

In [33]:

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
xTrain = sc.fit_transform(xTrain)
xTest = sc.transform(xTest)
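
The reason for calling `fit_transform` on the training set but only `transform` on the test set is that the mean and standard deviation must come from the training data alone. The sketch below reproduces that logic by hand with NumPy on made-up numbers. It is an illustration of the idea, not a replacement for `StandardScaler`.

```python
import numpy as np

# Toy data, made up for illustration: two features on very different scales
train = np.array([[600.0, 30.0], [700.0, 40.0], [800.0, 50.0]])
test = np.array([[700.0, 60.0]])

# Statistics are learned from the training data only
mean = train.mean(axis=0)
std = train.std(axis=0)  # population standard deviation (ddof=0)

# The same training statistics are applied to both sets
train_scaled = (train - mean) / std
test_scaled = (test - mean) / std

print(train_scaled.mean(axis=0), train_scaled.std(axis=0))
print(test_scaled)
```

After scaling, each training column has mean 0 and standard deviation 1. The test row is expressed relative to the training distribution, so its values are not guaranteed to fall in any particular range.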

## Defining the model architecture
Here's where we use Keras to build our neural network.

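How many nodes a hidden layer actually needs remains an open question.
There is no firm rule, but a common heuristic, and often a reasonable starting point, is to set the number of nodes in a hidden layer to the average of the number of nodes in the input and output layers.

- With 11 inputs and 1 output, the average is (11 + 1) / 2 = 6, so the hidden layer gets 6 units. After one-hot encoding, this notebook has 13 input features, for which the same rule would give 7; the code below uses hidden layers of 13 and 6 units.
- `kernel_initializer='uniform'` initializes the layer weights from a uniform distribution.
- The hidden layers use the rectifier activation function (ReLU).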
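
The averaging heuristic above is simple enough to write as a helper. The function name `hidden_units_by_average` is hypothetical, and rounding down for odd sums is an assumption made here, since the text does not say how to round.

```python
def hidden_units_by_average(n_inputs: int, n_outputs: int) -> int:
    """Average of the input and output layer sizes, rounded down."""
    return (n_inputs + n_outputs) // 2


# The case worked through in the text: 11 inputs, 1 output
print(hidden_units_by_average(11, 1))  # 6

# With 13 inputs (the feature count after one-hot encoding) and 1 output
print(hidden_units_by_average(13, 1))  # 7
```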

In [34]:
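import keras
from keras.models import Sequential
from keras.layers import Dense

classifier = Sequential()
# First hidden layer: 13 units, fed by the 13 input features
classifier.add(Dense(activation='relu', input_dim=13, units=13, kernel_initializer='uniform'))
# Second hidden layer: 6 units
classifier.add(Dense(activation='relu', units=6, kernel_initializer='uniform'))
# Output layer: a single sigmoid unit giving the probability that the customer exits
classifier.add(Dense(activation='sigmoid', units=1, kernel_initializer='uniform'))

# Binary cross-entropy is the usual loss for a two-class problem with a sigmoid output
classifier.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])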
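
To get a feel for the size of this network, the number of trainable parameters can be worked out by hand. A fully connected layer with a bias term has one weight per input-output pair plus one bias per output. The helper name `dense_params` is hypothetical. The totals should agree with what `classifier.summary()` reports, assuming the default bias terms are kept.

```python
def dense_params(n_in: int, n_out: int) -> int:
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out


# Layer widths of the model above: 13 inputs -> 13 -> 6 -> 1
layer_sizes = [13, 13, 6, 1]
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]

print(per_layer)       # [182, 84, 7]
print(sum(per_layer))  # 273
```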



In [36]:
classifier.fit(xTrain, yTrain, batch_size=10, epochs=50)

Epoch 1/50
...
Epoch 50/50


<keras.callbacks.History at 0x7fe9040027f0>
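
The test split created earlier has not been used yet. A natural next step is to turn the sigmoid outputs into class labels and compare them with the true labels. The sketch below shows that logic on made-up numbers: `probs` is a hypothetical stand-in for the output of `classifier.predict(xTest)`, and the 0.5 threshold is an assumption rather than something fixed by the notebook.

```python
import numpy as np

# Hypothetical sigmoid outputs and true labels, made up for illustration
probs = np.array([0.10, 0.80, 0.35, 0.60, 0.05, 0.40])
y_true = np.array([0, 1, 0, 0, 0, 1])

# Predict "exited" (1) when the probability exceeds the threshold
y_pred = (probs > 0.5).astype(int)

# Confusion matrix entries
tp = int(np.sum((y_pred == 1) & (y_true == 1)))  # exited, predicted exited
tn = int(np.sum((y_pred == 0) & (y_true == 0)))  # stayed, predicted stayed
fp = int(np.sum((y_pred == 1) & (y_true == 0)))  # stayed, predicted exited
fn = int(np.sum((y_pred == 0) & (y_true == 1)))  # exited, predicted stayed

accuracy = (tp + tn) / len(y_true)
print(tp, tn, fp, fn, accuracy)
```

For churn, false negatives (customers who leave but were predicted to stay) are often the costlier mistake, so it can be worth looking at the four counts separately instead of at accuracy alone.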