# Bank Customer Churn Prediction Model
## Using Keras Tuner in Google Colab

Hyperparameters Tuned:
- Number of Hidden Layers
- Number of Neurons in Hidden Layer
- Learning Rate
- Number of epochs

### Test whether the GPU is working

In [2]:
import tensorflow as tf
tf.test.gpu_device_name()

'/device:GPU:0'

### Which GPU are we using?

In [3]:
from tensorflow.python.client import device_lib
device_lib.list_local_devices()

[name: "/device:CPU:0"
 device_type: "CPU"
 memory_limit: 268435456
 locality {
 }
 incarnation: 17573428563776177508, name: "/device:GPU:0"
 device_type: "GPU"
 memory_limit: 14674281152
 locality {
   bus_id: 1
   links {
   }
 }
 incarnation: 2447474448896226743
 physical_device_desc: "device: 0, name: Tesla T4, pci bus id: 0000:00:04.0, compute capability: 7.5"]

### RAM Information

In [4]:
!cat /proc/meminfo

MemTotal:       13333580 kB
MemFree:         9565812 kB
MemAvailable:   12032576 kB
Buffers:           81180 kB
Cached:          2396224 kB
SwapCached:            0 kB
Active:          1267768 kB
Inactive:        2069136 kB
Active(anon):     685096 kB
Inactive(anon):    10608 kB
Active(file):     582672 kB
Inactive(file):  2058528 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:               812 kB
Writeback:             0 kB
AnonPages:        859500 kB
Mapped:           584348 kB
Shmem:             11272 kB
Slab:             184016 kB
SReclaimable:     132880 kB
SUnreclaim:        51136 kB
KernelStack:        5056 kB
PageTables:         8684 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     6666788 kB
Committed_AS:    3808516 kB
VmallocTotal:   34359738367 kB
VmallocUsed:           0 kB
VmallocChunk:          0 kB
Percpu:             1048 kB
AnonHugePages:   
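
The `/proc/meminfo` figures are reported in kB. As a small sketch, the headline values can be converted to GiB in plain Python. The `meminfo_text` sample is copied from the output above, and `parse_meminfo` is a hypothetical helper name, not part of any library.

```python
# Sample copied from the /proc/meminfo output above (values in kB).
meminfo_text = """\
MemTotal:       13333580 kB
MemFree:         9565812 kB
MemAvailable:   12032576 kB
"""

def parse_meminfo(text):
    """Return a dict mapping each field name to its value in kB."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:  # skip blank or truncated lines that carry no number
            values[name.strip()] = int(parts[0])
    return values

mem_kb = parse_meminfo(meminfo_text)
mem_gib = {name: kb / 1024**2 for name, kb in mem_kb.items()}
for name, gib in mem_gib.items():
    print(f"{name}: {gib:.2f} GiB")
```

In a live Colab session the same helper could be applied to the text read from `/proc/meminfo` instead of the pasted sample.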

### CPU Information

In [5]:
! cat /proc/cpuinfo

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 79
model name	: Intel(R) Xeon(R) CPU @ 2.20GHz
stepping	: 0
microcode	: 0x1
cpu MHz		: 2199.998
cache size	: 56320 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 1
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap xsaveopt arat md_clear arch_capabilities
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs taa
bogomips	: 4399.99
clflush size	: 64
cache_alignment	: 64
address sizes	: 46 bits physical, 48 b

# Artificial Neural Network with Hyperparameter Optimization using Keras Tuner

### Data Preprocessing

In [7]:
# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

In [8]:
dataset = pd.read_csv('/content/drive/MyDrive/Google Colab Notebooks/ANN/Churn_Modelling/Churn_Modelling.csv')
dataset.head()

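| | RowNumber | CustomerId | Surname | CreditScore | Geography | Gender | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Exited |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 15634602 | Hargrave | 619 | France | Female | 42 | 2 | 0.0 | 1 | 1 | 1 | 101348.88 | 1 |
| 1 | 2 | 15647311 | Hill | 608 | Spain | Female | 41 | 1 | 83807.86 | 1 | 0 | 1 | 112542.58 | 0 |
| 2 | 3 | 15619304 | Onio | 502 | France | Female | 42 | 8 | 159660.8 | 3 | 1 | 0 | 113931.57 | 1 |
| 3 | 4 | 15701354 | Boni | 699 | France | Female | 39 | 1 | 0.0 | 2 | 0 | 0 | 93826.63 | 0 |
| 4 | 5 | 15737888 | Mitchell | 850 | Spain | Female | 43 | 2 | 125510.82 | 1 | 1 | 1 | 79084.1 | 0 |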


In [9]:
dataset.shape

(10000, 14)

In [10]:
# RowNumber, CustomerId and Surname carry no predictive information,
# so these columns are excluded from the feature matrix.
X = dataset.iloc[:, 3:13]
y = dataset.iloc[:, 13]
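
To make the positional slicing concrete, here is a minimal sketch that applies the same index ranges to the list of column names from the `dataset.head()` output. The variable names `feature_columns` and `target_column` are illustrative only.

```python
# Column names in the order shown by dataset.head() above.
columns = ["RowNumber", "CustomerId", "Surname", "CreditScore", "Geography",
           "Gender", "Age", "Tenure", "Balance", "NumOfProducts", "HasCrCard",
           "IsActiveMember", "EstimatedSalary", "Exited"]

feature_columns = columns[3:13]   # same positions as dataset.iloc[:, 3:13]
target_column = columns[13]       # same position as dataset.iloc[:, 13]

print(feature_columns)
print(target_column)
```

The slice starts at position 3, which skips the three identifier columns, and stops before position 13, which holds the `Exited` target.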

In [11]:
# Create dummy variables
geography=pd.get_dummies(X["Geography"],drop_first=True)
gender=pd.get_dummies(X['Gender'],drop_first=True)
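
The effect of `drop_first=True` can be sketched without pandas. The helper below is a hypothetical illustration, assuming categories are ordered alphabetically and the first one is omitted; it is not the pandas implementation, and the input list is made up for the example.

```python
def dummies_drop_first(values):
    """One-hot encode `values`, omitting the first category in sorted order."""
    categories = sorted(set(values))
    kept = categories[1:]  # the dropped category is implied by all zeros
    return [{cat: int(v == cat) for cat in kept} for v in values]

# Illustrative Geography values (not the real column).
geography_sample = ["France", "Spain", "France", "Germany"]
encoded = dummies_drop_first(geography_sample)
for row in encoded:
    print(row)
```

With three categories only two indicator columns remain: a row with zeros in both `Germany` and `Spain` stands for `France`, which avoids a redundant, perfectly correlated column.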

In [12]:
# Concatenate the Data Frames
X=pd.concat([X,geography,gender],axis=1)

In [13]:
# Drop Unnecessary columns
X=X.drop(['Geography','Gender'],axis=1)

In [14]:
X.head()

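| | CreditScore | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Germany | Spain | Male |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 619 | 42 | 2 | 0.0 | 1 | 1 | 1 | 101348.88 | 0 | 0 | 0 |
| 1 | 608 | 41 | 1 | 83807.86 | 1 | 0 | 1 | 112542.58 | 0 | 1 | 0 |
| 2 | 502 | 42 | 8 | 159660.8 | 3 | 1 | 0 | 113931.57 | 0 | 0 | 0 |
| 3 | 699 | 39 | 1 | 0.0 | 2 | 0 | 0 | 93826.63 | 0 | 0 | 0 |
| 4 | 850 | 43 | 2 | 125510.82 | 1 | 1 | 1 | 79084.1 | 0 | 1 | 0 |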


In [15]:
# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)
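
As a rough sketch of what an 80/20 split does to the 10,000 rows, the following stdlib-only example shuffles row indices and cuts them in two. `split_indices` is a hypothetical helper; it does not reproduce scikit-learn's shuffling, so the selected rows differ from those chosen by `train_test_split(..., random_state=0)`.

```python
import random

def split_indices(n_rows, test_size, seed):
    """Shuffle row indices and cut them into train and test parts."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_rows * test_size))
    return indices[n_test:], indices[:n_test]

train_idx, test_idx = split_indices(n_rows=10000, test_size=0.2, seed=0)
print(len(train_idx), len(test_idx))
```

With `test_size = 0.2`, 8,000 rows end up in the training set and 2,000 in the test set, and no row appears in both.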

In [16]:
# In an ANN, neuron weights are multiplied by the inputs, so the inputs should be brought to a common scale.
# Scaling keeps the weighted sums, and the gradients computed during backpropagation, in a similar range across features.
# As a result, training usually converges faster.

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
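
The fit-on-train, transform-on-test pattern can be sketched for a single feature with the standard library. The helper names and the credit-score values are illustrative assumptions; the sketch uses the population standard deviation, which matches the formula `(x - mean) / std` that standardization applies.

```python
from statistics import fmean, pstdev

def fit_scaler(train_values):
    """Return the mean and population standard deviation of the training data."""
    return fmean(train_values), pstdev(train_values)

def transform(values, mean, std):
    """Standardize values with statistics computed elsewhere."""
    return [(v - mean) / std for v in values]

train_scores = [619, 608, 502, 699]   # illustrative CreditScore values
test_scores = [850]

mean, std = fit_scaler(train_scores)
train_scaled = transform(train_scores, mean, std)
test_scaled = transform(test_scores, mean, std)  # reuses the training statistics
print(train_scaled, test_scaled)
```

Reusing the training mean and standard deviation for the test data is what calling `transform` rather than `fit_transform` on `X_test` achieves: no information from the test set leaks into the scaling.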

In [17]:
X_train

array([[ 0.16958176, -0.46460796,  0.00666099, ..., -0.5698444 ,
         1.74309049, -1.09168714],
       [-2.30455945,  0.30102557, -1.37744033, ...,  1.75486502,
        -0.57369368,  0.91601335],
       [-1.19119591, -0.94312892, -1.031415  , ..., -0.5698444 ,
        -0.57369368, -1.09168714],
       ...,
       [ 0.9015152 , -0.36890377,  0.00666099, ..., -0.5698444 ,
        -0.57369368,  0.91601335],
       [-0.62420521, -0.08179119,  1.39076231, ..., -0.5698444 ,
         1.74309049, -1.09168714],
       [-0.28401079,  0.87525072, -1.37744033, ...,  1.75486502,
        -0.57369368, -1.09168714]])

### Defining the Model

In [51]:
# The Keras Tuner has four tuners available - RandomSearch, Hyperband, BayesianOptimization, and Sklearn. 
import tensorflow
from tensorflow import keras
from tensorflow.keras import layers
# Note: newer Keras Tuner releases are imported as `keras_tuner` instead of `kerastuner`.
from kerastuner.tuners import RandomSearch, Hyperband

In [42]:
def build_model(hp):
    # Initialising the ANN. This will create an empty neural network
    model = keras.Sequential()

    # Create 2 to 20 hidden layers with 32 to 512 neurons in each layer
    for i in range(hp.Int('num_layers', 2, 20)):
        model.add(layers.Dense(units=hp.Int('units_' + str(i), min_value=32, max_value=512, step=32), activation='relu'))
    
    # Create output layer. As this is a binary classification problem, so sigmoid AF is used in the output layer
    model.add(layers.Dense(1, activation='sigmoid'))
    
    # Use learning rate as hyperparameter in compiling ANN
    hp_learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4])
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=hp_learning_rate), loss='binary_crossentropy', metrics=['accuracy'])
    
    return model
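
To get a feel for how large this search space is, the sketch below counts the possible configurations, assuming both bounds of each `hp.Int` range are inclusive (as the `min_value` and `max_value` entries in the search space summary suggest). `configurations_for` is a hypothetical helper name.

```python
units_choices = list(range(32, 512 + 1, 32))   # hp.Int(min_value=32, max_value=512, step=32)
learning_rates = [1e-2, 1e-3, 1e-4]            # hp.Choice values
layer_counts = range(2, 20 + 1)                # hp.Int('num_layers', 2, 20)

def configurations_for(num_layers):
    """Distinct settings when exactly `num_layers` hidden layers are active."""
    return len(units_choices) ** num_layers * len(learning_rates)

total = sum(configurations_for(n) for n in layer_counts)
print(len(units_choices), configurations_for(2), total)
```

Each layer has 16 possible widths, so even the smallest two-layer case has 768 configurations. A handful of tuner trials therefore samples only a tiny part of the space.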

### Instantiate the RandomSearch Tuner and perform Hypertuning

In [45]:
# With max_trials=5 and executions_per_trial=3, the tuner trains 5 * 3 = 15 models in total:
# each of the 5 sampled configurations is trained 3 times to reduce the variance of the results.
# Checkpoints and trial results are stored in the given directory while training is running.

tuner_rs = RandomSearch(
    build_model,
    objective='val_accuracy',
    max_trials=5,
    executions_per_trial=3,
    directory='ANN_Project',
    project_name='Churn_RS')
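
The idea behind random search can be sketched in a few lines: draw configurations at random from the ranges defined in `build_model`, then train each one several times. The sampler below is an illustrative assumption (uniform sampling with a fixed seed) and is not Keras Tuner's implementation; `sample_config` is a hypothetical name.

```python
import random

def sample_config(rng):
    """Draw one configuration from the same ranges used in build_model."""
    num_layers = rng.randint(2, 20)
    return {
        "num_layers": num_layers,
        "units": [rng.choice(range(32, 513, 32)) for _ in range(num_layers)],
        "learning_rate": rng.choice([1e-2, 1e-3, 1e-4]),
    }

max_trials, executions_per_trial = 5, 3
rng = random.Random(0)
trials = [sample_config(rng) for _ in range(max_trials)]
training_runs = max_trials * executions_per_trial
print(training_runs)
```

Five sampled configurations trained three times each give the 15 training runs performed by this tuner.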

In [44]:
tuner_rs.search_space_summary()

Search space summary
Default search space size: 4
num_layers (Int)
{'default': None, 'conditions': [], 'min_value': 2, 'max_value': 20, 'step': 1, 'sampling': None}
units_0 (Int)
{'default': None, 'conditions': [], 'min_value': 32, 'max_value': 512, 'step': 32, 'sampling': None}
units_1 (Int)
{'default': None, 'conditions': [], 'min_value': 32, 'max_value': 512, 'step': 32, 'sampling': None}
learning_rate (Choice)
{'default': 0.01, 'conditions': [], 'values': [0.01, 0.001, 0.0001], 'ordered': True}


In [46]:
tuner_rs.search(X_train, y_train, epochs=5, validation_data=(X_test, y_test))

Trial 5 Complete [00h 00m 12s]
val_accuracy: 0.8618333339691162

Best val_accuracy So Far: 0.8629999955495199
Total elapsed time: 00h 01m 10s
INFO:tensorflow:Oracle triggered exit


In [47]:
# The objective is 'val_accuracy' and the direction is 'max'.
# results_summary() lists up to the 10 best trials.
tuner_rs.results_summary()

Results summary
Results in ANN_Project/Churn
Showing 10 best trials
Objective(name='val_accuracy', direction='max')
Trial summary
Hyperparameters:
num_layers: 17
units_0: 480
units_1: 128
learning_rate: 0.001
units_2: 128
units_3: 96
units_4: 96
units_5: 224
units_6: 512
units_7: 96
units_8: 320
units_9: 128
units_10: 32
units_11: 32
units_12: 32
units_13: 32
units_14: 32
units_15: 32
units_16: 32
Score: 0.8629999955495199
Trial summary
Hyperparameters:
num_layers: 12
units_0: 320
units_1: 192
learning_rate: 0.0001
units_2: 96
units_3: 416
units_4: 288
units_5: 480
units_6: 384
units_7: 96
units_8: 384
units_9: 32
units_10: 416
units_11: 320
units_12: 192
units_13: 480
units_14: 480
units_15: 288
units_16: 96
Score: 0.8629999955495199
Trial summary
Hyperparameters:
num_layers: 10
units_0: 320
units_1: 256
learning_rate: 0.001
units_2: 64
units_3: 160
units_4: 96
units_5: 256
units_6: 416
units_7: 384
units_8: 224
units_9: 480
units_10: 320
units_11: 448
units_12: 320
units_13: 448
unit

From the RandomSearch tuning, the best trial:
- Reached a validation accuracy of 0.8629999955495199
- Has 17 hidden layers
- Has 480 neurons in layer 1, 128 neurons in layer 2, and so on
- Uses a learning rate of 0.001

### Instantiate the Hyperband Tuner and perform Hypertuning

In [50]:
# The Hyperband tuning algorithm uses adaptive resource allocation and early stopping to quickly converge on a high-performing model.
# This is done using a sports-championship-style bracket.
# The algorithm trains a large number of models for a few epochs and carries forward only the top-performing models (roughly the best 1/factor) to the next round.
# Hyperband determines the number of models to train in a bracket by computing 1 + log_factor(max_epochs) and rounding it up to the nearest integer.
tuner_hb = Hyperband(build_model,
                     objective='val_accuracy',
                     max_epochs=10,
                     factor=3,
                     directory='ANN_Project',
                     project_name='Churn_HB')
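
The bracket idea can be illustrated with a simplified elimination loop. This is a sketch under stated assumptions, not Keras Tuner's scheduling: the accuracies are hypothetical and held fixed, whereas a real tuner re-measures each survivor after training it for more epochs. `successive_reduction` is a made-up helper name.

```python
def successive_reduction(scores, factor=3):
    """Return the surviving candidates after each round of top-1/factor selection."""
    survivors = sorted(scores, key=scores.get, reverse=True)
    rounds = [survivors]
    while len(survivors) > 1:
        keep = max(1, len(survivors) // factor)
        survivors = survivors[:keep]
        rounds.append(survivors)
    return rounds

# Hypothetical validation accuracies for nine candidate models.
scores = {f"model_{i}": acc for i, acc in enumerate(
    [0.80, 0.86, 0.79, 0.84, 0.85, 0.81, 0.83, 0.82, 0.78])}
rounds = successive_reduction(scores)
for round_number, survivors in enumerate(rounds):
    print(round_number, survivors)
```

With `factor=3`, nine candidates shrink to three and then to one, so most of the training budget goes to the configurations that looked best early on.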

In [52]:
# Create a callback that stops training early when the validation loss has not improved for 5 consecutive epochs.
stop_early = tensorflow.keras.callbacks.EarlyStopping(monitor='val_loss', patience=5)
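
The patience rule can be sketched in plain Python: training stops once the monitored loss has failed to improve for `patience` epochs in a row. `early_stop_epoch` is a hypothetical helper, the loss values are invented, and the sketch ignores options such as a minimum improvement threshold.

```python
def early_stop_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop, or None if it never does."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0          # improvement resets the counter
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Hypothetical validation losses: the best value is reached at epoch 2.
val_losses = [0.50, 0.40, 0.41, 0.42, 0.43, 0.44, 0.45, 0.30]
stop_epoch = early_stop_epoch(val_losses, patience=5)
print(stop_epoch)
```

Here the loss stops improving after epoch 2, so with `patience=5` training ends at epoch 7 and the later, lower value at epoch 8 is never reached.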

In [53]:
tuner_hb.search_space_summary()

Search space summary
Default search space size: 4
num_layers (Int)
{'default': None, 'conditions': [], 'min_value': 2, 'max_value': 20, 'step': 1, 'sampling': None}
units_0 (Int)
{'default': None, 'conditions': [], 'min_value': 32, 'max_value': 512, 'step': 32, 'sampling': None}
units_1 (Int)
{'default': None, 'conditions': [], 'min_value': 32, 'max_value': 512, 'step': 32, 'sampling': None}
learning_rate (Choice)
{'default': 0.01, 'conditions': [], 'values': [0.01, 0.001, 0.0001], 'ordered': True}


In [54]:
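# Both validation_data and validation_split are passed here; Keras uses only one validation source per fit call,
# so passing just one of them makes the validation set explicit.
tuner_hb.search(X_train, y_train, epochs=5, validation_data=(X_test, y_test), validation_split=0.2, callbacks=[stop_early])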

Trial 30 Complete [00h 00m 04s]
val_accuracy: 0.815625011920929

Best val_accuracy So Far: 0.862500011920929
Total elapsed time: 00h 01m 37s
INFO:tensorflow:Oracle triggered exit


In [55]:
tuner_hb.results_summary()

Results summary
Results in ANN_Project/Churn_HB
Showing 10 best trials
Objective(name='val_accuracy', direction='max')
Trial summary
Hyperparameters:
num_layers: 19
units_0: 192
units_1: 288
learning_rate: 0.0001
units_2: 224
units_3: 160
units_4: 64
units_5: 192
units_6: 384
units_7: 160
units_8: 352
units_9: 480
units_10: 192
units_11: 256
units_12: 448
units_13: 288
units_14: 384
units_15: 512
units_16: 416
units_17: 192
units_18: 512
units_19: 448
tuner/epochs: 10
tuner/initial_epoch: 0
tuner/bracket: 0
tuner/round: 0
Score: 0.862500011920929
Trial summary
Hyperparameters:
num_layers: 7
units_0: 512
units_1: 64
learning_rate: 0.001
tuner/epochs: 10
tuner/initial_epoch: 4
tuner/bracket: 2
tuner/round: 2
units_2: 32
units_3: 32
units_4: 32
units_5: 32
units_6: 32
tuner/trial_id: 1695351795952ea81b4f4a612f5bf382
Score: 0.8600000143051147
Trial summary
Hyperparameters:
num_layers: 4
units_0: 64
units_1: 352
learning_rate: 0.001
units_2: 128
units_3: 448
units_4: 288
units_5: 448
units_

From the Hyperband tuning, the best trial:
- Reached a validation accuracy of 0.862500011920929
- Has 19 hidden layers
- Has 192 neurons in layer 1, 288 neurons in layer 2, and so on
- Uses a learning rate of 0.0001
- Was trained for 10 epochs


### Train the model based on best Hyperparameters obtained

In [56]:
# Get the optimal hyperparameters
best_hps=tuner_hb.get_best_hyperparameters(num_trials=1)[0]

In [58]:
# Build the model with the optimal hyperparameters and train it on the training data for 50 epochs
model = tuner_hb.hypermodel.build(best_hps)
history = model.fit(X_train, y_train, epochs=50, validation_split=0.2)

val_acc_per_epoch = history.history['val_accuracy']
best_epoch = val_acc_per_epoch.index(max(val_acc_per_epoch)) + 1
print('Best epoch: %d' % (best_epoch,))

Epoch 1/50
...
Epoch 50/50
Best epoch: 9
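
The best-epoch lookup relies on `list.index`, which returns the first position of the maximum. A minimal sketch with hypothetical accuracy values shows how ties are resolved:

```python
val_acc_per_epoch = [0.80, 0.84, 0.86, 0.85, 0.86]  # hypothetical values

# list.index returns the first position of the maximum, so ties go to the earlier epoch.
best_epoch = val_acc_per_epoch.index(max(val_acc_per_epoch)) + 1
print('Best epoch: %d' % (best_epoch,))
```

The `+ 1` converts the zero-based list position into the one-based epoch number that Keras prints.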


In [59]:
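# Re-instantiate the hypermodel and train it for the optimal number of epochs found above.
# Note: this cell fits on (X_test, y_test), so the evaluation below scores data the model has already seen;
# fitting on (X_train, y_train) instead keeps the test set held out.
hypermodel = tuner_hb.hypermodel.build(best_hps)
hypermodel.fit(X_test, y_test, epochs=best_epoch)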

Epoch 1/9
Epoch 2/9
Epoch 3/9
Epoch 4/9
Epoch 5/9
Epoch 6/9
Epoch 7/9
Epoch 8/9
Epoch 9/9


<tensorflow.python.keras.callbacks.History at 0x7fc887e033d0>

In [61]:
eval_result = hypermodel.evaluate(X_test, y_test)
print("[Test Loss, Test Accuracy]:", eval_result)

[Test Loss, Test Accuracy]: [0.1707678586244583, 0.940500020980835]


## Result
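- The tuned model reports 94.05% accuracy on the test set. Because that model was fit on the test set itself (`hypermodel.fit(X_test, y_test, ...)`), this figure is not a held-out estimate.
- The best validation accuracy observed during tuning was about 86.3% (RandomSearch) and 86.25% (Hyperband).
- Without hypertuning, accuracy was 86.40%.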