In [1]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


# Project 1 Introduction to Neural Networks and Deep Learning

The case study is from an open-source dataset from Kaggle.

Link to the Kaggle project site: https://www.kaggle.com/barelydedicated/bank-customer-churn-modeling

Given a bank customer, can we use neural networks to build a classifier that determines whether they will leave or not?

bank.csv

The points distribution for this case is as follows:

1. Read the dataset
2. Drop the columns which are unique for all users like IDs (2.5 points)
3. Distinguish the feature and target set (2.5 points)
4. Divide the data set into train and test sets
5. Normalize the train and test data (2.5 points)
6. Initialize & build the model (10 points)
7. Optimize the model (5 points)
9. Predict the results using 0.5 as a threshold (5 points)
10. Print the Accuracy score and confusion matrix (2.5 points)

In [0]:
import numpy as np
import pandas as pd

# 1. Read the dataset

In [0]:
bank_data=pd.read_csv("/content/drive/My Drive/Colab Notebooks/LAB/Residency_VI_Ext_LAB/bank.csv")

In [4]:
bank_data.head()

Unnamed: 0,RowNumber,CustomerId,Surname,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited
0,1,15634602,Hargrave,619,France,Female,42,2,0.0,1,1,1,101348.88,1
1,2,15647311,Hill,608,Spain,Female,41,1,83807.86,1,0,1,112542.58,0
2,3,15619304,Onio,502,France,Female,42,8,159660.8,3,1,0,113931.57,1
3,4,15701354,Boni,699,France,Female,39,1,0.0,2,0,0,93826.63,0
4,5,15737888,Mitchell,850,Spain,Female,43,2,125510.82,1,1,1,79084.1,0


In [5]:
bank_data.shape

(10000, 14)

In [6]:
bank_data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 14 columns):
RowNumber          10000 non-null int64
CustomerId         10000 non-null int64
Surname            10000 non-null object
CreditScore        10000 non-null int64
Geography          10000 non-null object
Gender             10000 non-null object
Age                10000 non-null int64
Tenure             10000 non-null int64
Balance            10000 non-null float64
NumOfProducts      10000 non-null int64
HasCrCard          10000 non-null int64
IsActiveMember     10000 non-null int64
EstimatedSalary    10000 non-null float64
Exited             10000 non-null int64
dtypes: float64(2), int64(9), object(3)
memory usage: 1.1+ MB


In [0]:
bank_data_1 = bank_data.drop(['RowNumber', 'CustomerId', 'Surname'], axis = 1)

In [8]:
bank_data_1.shape

(10000, 11)

# Let's convert Geography and Gender to categorical codes.

In [0]:
bank_data_1["Geography"] = bank_data_1["Geography"].astype('category')
bank_data_1["Gender"] = bank_data_1["Gender"].astype('category')

In [0]:
bank_data_1["Geography"] = bank_data_1["Geography"].cat.codes
bank_data_1["Gender"] = bank_data_1["Gender"].cat.codes

In [11]:
bank_data_1.head()

Unnamed: 0,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited
0,619,0,0,42,2,0.0,1,1,1,101348.88,1
1,608,2,0,41,1,83807.86,1,0,1,112542.58,0
2,502,0,0,42,8,159660.8,3,1,0,113931.57,1
3,699,0,0,39,1,0.0,2,0,0,93826.63,0
4,850,2,0,43,2,125510.82,1,1,1,79084.1,0


In [12]:
bank_data_1.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 11 columns):
CreditScore        10000 non-null int64
Geography          10000 non-null int8
Gender             10000 non-null int8
Age                10000 non-null int64
Tenure             10000 non-null int64
Balance            10000 non-null float64
NumOfProducts      10000 non-null int64
HasCrCard          10000 non-null int64
IsActiveMember     10000 non-null int64
EstimatedSalary    10000 non-null float64
Exited             10000 non-null int64
dtypes: float64(2), int64(7), int8(2)
memory usage: 722.8 KB


### Observations: After dropping the irrelevant features, we are left with 10 features and a target. We have also converted the Gender and Geography features to integer category codes.
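The integer codes produced by `cat.codes` follow the sorted order of the category names, so the network sees Geography as an ordered number even though the countries have no natural order. The sketch below shows that mapping on a made-up toy frame, followed by one-hot encoding with `pd.get_dummies` as a possible alternative. The alternative is an assumption for illustration only and is not what this notebook uses.

```python
import pandas as pd

# Toy frame (illustrative values, not the real bank.csv)
toy = pd.DataFrame({"Geography": ["France", "Spain", "Germany", "France"]})

# cat.codes numbers the categories in sorted order
geo = toy["Geography"].astype("category")
code_map = dict(zip(geo.cat.categories, range(len(geo.cat.categories))))
print(code_map)

# Alternative (an assumption, not used in this notebook): one column per category
one_hot = pd.get_dummies(toy["Geography"], prefix="Geography")
print(one_hot.columns.tolist())
```

This matches the table above, where France is coded 0 and Spain is coded 2.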

# 3. Distinguish the feature and the target set

In [0]:
X = bank_data_1.drop(['Exited'], axis=1)

In [0]:
y = bank_data_1["Exited"]

In [15]:
X.head(2)

Unnamed: 0,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary
0,619,0,0,42,2,0.0,1,1,1,101348.88
1,608,2,0,41,1,83807.86,1,0,1,112542.58


In [16]:
y.head(2)

0    1
1    0
Name: Exited, dtype: int64

# 4. Divide the data set into Train and test sets.

In [0]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 7)

In [18]:
X_train.shape

(8000, 10)

In [19]:
y_train.shape

(8000,)

In [20]:
X_test.shape

(2000, 10)

In [21]:
y_test.shape

(2000,)

#### Split the data into train (80%) and test (20%) sets: 8000 records and 10 features in the train set, 2000 records and 10 features in the test set.

# 5. Normalize the train and test data

In [24]:
import tensorflow as tf
print(tf.__version__)

1.15.0


In [0]:
from scipy import stats
X_train_std = stats.zscore(X_train) 
X_test_std = stats.zscore(X_test)

In [0]:
y_train_cat = tf.keras.utils.to_categorical(y_train)
y_test_cat = tf.keras.utils.to_categorical(y_test)

In [27]:
y_train_cat[:3]

array([[1., 0.],
       [1., 0.],
       [0., 1.]], dtype=float32)

Observations: The features are on very different scales, so normalizing them should give better results. We used z-scores to normalize the features and converted both the train and test labels into one-hot vectors.
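Note that `stats.zscore` is applied to the train and test sets separately above, so each set is scaled with its own mean and standard deviation. A common alternative is to compute the statistics on the training set only and reuse them for the test set. Below is a minimal NumPy sketch of that idea on made-up arrays. The helper names `fit_zscore` and `apply_zscore` are hypothetical and not part of any library.

```python
import numpy as np

def fit_zscore(train):
    """Return per-column mean and standard deviation of the training data."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)  # population std (ddof=0), the scipy.stats.zscore default
    return mean, std

def apply_zscore(data, mean, std):
    """Scale data with statistics computed elsewhere (here: on the train set)."""
    return (data - mean) / std

# Made-up values standing in for two feature columns
train = np.array([[600.0, 30.0], [700.0, 40.0], [800.0, 50.0]])
test = np.array([[650.0, 35.0], [900.0, 60.0]])

mean, std = fit_zscore(train)
train_std = apply_zscore(train, mean, std)
test_std = apply_zscore(test, mean, std)  # test reuses the train statistics
print(test_std)
```

With this approach the test set is transformed exactly as new, unseen customers would be at prediction time.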



# 6. Initialize & build the model

#### Build a neural network in Keras with a binary cross-entropy loss function and the SGD optimizer. The output layer has 2 neurons, one per one-hot class.

In [0]:
model_1 = tf.keras.models.Sequential()

In [29]:
#Hidden layer: 10 ReLU units fed by the 10 input features
model_1.add(tf.keras.layers.Dense(10, input_dim = 10, activation='relu'))

Instructions for updating:
If using Keras pass *_constraint arguments to layers.


In [0]:
#Add Dense Layer which provides 2 outputs after applying sigmoid (Output Layer, one unit per one-hot class)
model_1.add(tf.keras.layers.Dense(2, activation='sigmoid'))

In [31]:
#Compile the model
model_1.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['accuracy'])

Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where


In [32]:
model_1.fit(X_train_std, y_train_cat, 
          validation_data=(X_test_std, y_test_cat), 
          epochs=30,
          batch_size=35)

Train on 8000 samples, validate on 2000 samples
Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30
Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Epoch 25/30
Epoch 26/30
Epoch 27/30
Epoch 28/30
Epoch 29/30
Epoch 30/30


<tensorflow.python.keras.callbacks.History at 0x7f7ec8d99358>

In [33]:
model_1.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense (Dense)                (None, 10)                110       
_________________________________________________________________
dense_1 (Dense)              (None, 2)                 22        
Total params: 132
Trainable params: 132
Non-trainable params: 0
_________________________________________________________________


Observations: As this is a binary classification problem, we used binary cross-entropy for the loss and sigmoid activation in the output layer.

We started with ReLU activation in the first layer and chose the best activation using grid search.

In the same way, we started with the SGD optimizer and found the best optimizer using grid search.

The accuracy is around 82%.
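The parameter counts in the model summary above can be checked by hand: a dense layer has one weight per input-output pair plus one bias per output unit. Here is a small sketch of that arithmetic, where `dense_params` is a hypothetical helper and not a Keras function.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

hidden = dense_params(10, 10)  # 10 features into 10 hidden units
output = dense_params(10, 2)   # 10 hidden units into 2 output units
print(hidden, output, hidden + output)  # 110 22 132
```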

# 7. Optimize the model

In [34]:
from sklearn.model_selection import GridSearchCV
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from keras.optimizers import Nadam
from keras.optimizers import sgd
from keras.layers import Dropout
from keras.constraints import maxnorm

Using TensorFlow backend.


## Let's first find the best optimizer among 'SGD', 'RMSprop', 'Adagrad', 'Adadelta', 'Adam', 'Adamax', 'Nadam'

In [35]:
def create_model(optimizer='adam'):
  #Initialize Sequential model
  model_2 = Sequential()

  #Hidden layer: 10 ReLU units fed by the 10 input features
  model_2.add(Dense(10, input_dim = 10, activation='relu'))

  #Output layer: 1 unit with sigmoid activation
  model_2.add(Dense(1, activation='sigmoid'))

  #Compile the model
  model_2.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy'])

  return model_2

model_2 = KerasClassifier(build_fn=create_model, epochs=30, batch_size=35, verbose=0)


# define the grid search parameters
optimizer = ['SGD', 'RMSprop', 'Adagrad', 'Adadelta', 'Adam', 'Adamax', 'Nadam']
param_grid = dict(optimizer=optimizer)

grid = GridSearchCV(estimator=model_2, param_grid=param_grid, n_jobs=-1, scoring="accuracy", cv=2)
grid_result = grid.fit(X_train_std, y_train)

# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))

means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']

for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

Best: 0.855375 using {'optimizer': 'Nadam'}
0.829375 (0.001625) with: {'optimizer': 'SGD'}
0.849000 (0.003250) with: {'optimizer': 'RMSprop'}
0.825625 (0.003125) with: {'optimizer': 'Adagrad'}
0.838125 (0.012375) with: {'optimizer': 'Adadelta'}
0.852375 (0.004875) with: {'optimizer': 'Adam'}
0.838250 (0.010750) with: {'optimizer': 'Adamax'}
0.855375 (0.006125) with: {'optimizer': 'Nadam'}


Observations: The best optimizer is Nadam, with a cross-validated accuracy of 85.54%.

That is about 2.6 percentage points above SGD (82.94%) in the same search.

Note: scikit-learn and Keras represent multiclass targets differently, so we do not use the one-hot target when GridSearchCV is called with scoring="accuracy". With the one-hot target, the search fails with the error "ValueError: Classification metrics can't handle a mix of multilabel-indicator and binary targets". In those searches we therefore pass the original target variable, without the categorical transformation.
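The mismatch described in the note comes down to array shapes: the one-hot target is a 2-D array, while scikit-learn's classification metrics expect a 1-D vector of class labels for a binary problem. Here is a small NumPy sketch, on made-up labels, of moving between the two forms.

```python
import numpy as np

y = np.array([0, 0, 1, 0, 1])          # 1-D labels, the form passed with scoring="accuracy"

# One-hot form, equivalent to what to_categorical produces for two classes
y_cat = np.eye(2, dtype="float32")[y]   # shape (5, 2)

# argmax along the class axis recovers the 1-D labels
y_back = y_cat.argmax(axis=1)
print(y_cat.shape, y_back)
```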



## Best learning rate

In [37]:
# Tune Learning Rate
from keras.optimizers import Nadam

# Function to create model, required for KerasClassifier
def create_model(learn_rate=0.01):
  #Initialize Sequential model
  model_4 = Sequential()
  #Hidden layer: 10 ReLU units fed by the 10 input features
  model_4.add(Dense(10, input_dim = 10, activation='relu'))
  #Output layer: 2 sigmoid units, one per one-hot class
  model_4.add(Dense(2, activation='sigmoid'))
  #Compile the model with Nadam at the given learning rate
  optimizer = Nadam(lr=learn_rate)
  model_4.compile(optimizer = optimizer, loss = 'binary_crossentropy', metrics = ['accuracy'])
  return model_4

# create model
model_4 = KerasClassifier(build_fn=create_model, epochs=30, batch_size=30, verbose=0)

# define the grid search parameters
learn_rate = [0.001, 0.01, 0.1, 0.2, 0.3]
param_grid = dict(learn_rate=learn_rate)

grid = GridSearchCV(estimator=model_4, param_grid=param_grid, n_jobs=1, cv=2)
grid_result = grid.fit(X_train_std, y_train_cat)

# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))


Best: 0.855312 using {'learn_rate': 0.01}
0.850687 (0.004688) with: {'learn_rate': 0.001}
0.855312 (0.006937) with: {'learn_rate': 0.01}
0.843125 (0.000375) with: {'learn_rate': 0.1}
0.824562 (0.020937) with: {'learn_rate': 0.2}
0.815312 (0.005687) with: {'learn_rate': 0.3}


Observation: The best learning rate is 0.01, with an accuracy of 85.53%.

This is essentially the same as the result of the optimizer search (85.54%), so tuning the learning rate brings no real gain in accuracy.



## Best Batch Size and Number of Epochs

In [38]:
# Tune Batch Size and Number of Epochs

# Function to create model, required for KerasClassifier
def create_model():
  #Initialize Sequential model
  model_3 = Sequential()
  
  #Hidden layer: 30 units with softmax activation
  model_3.add(Dense(30, input_dim = 10, activation='softmax'))

  #Dropout
  model_3.add(Dropout(0.2))

  #Output layer: 1 unit with sigmoid activation
  model_3.add(Dense(1, activation='sigmoid'))

  #Compile the model
  optimizer = Nadam(lr=0.01)
  model_3.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy'])
  
  return model_3

# create model
model_3 = KerasClassifier(build_fn=create_model, verbose=0)

# define the grid search parameters
batch_size = [10, 20, 40, 60, 80, 100]
epochs = [10, 50, 100]
param_grid = dict(batch_size=batch_size, epochs=epochs)

grid = GridSearchCV(estimator=model_3, param_grid=param_grid, n_jobs=1, scoring="accuracy", cv=2)
grid_result = grid.fit(X_train_std, y_train)

# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))



Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
Best: 0.857625 using {'batch_size': 10, 'epochs': 10}
0.857625 (0.005375) with: {'batch_size': 10, 'epochs': 10}
0.850375 (0.005375) with: {'batch_size': 10, 'epochs': 50}
0.843500 (0.008250) with: {'batch_size': 10, 'epochs': 100}
0.855125 (0.004625) with: {'batch_size': 20, 'epochs': 10}
0.849125 (0.004375) with: {'batch_size': 20, 'epochs': 50}
0.852625 (0.004375) with: {'batch_size': 20, 'epochs': 100}
0.856000 (0.002000) with: {'batch_size': 40, 'epochs': 10}
0.855000 (0.004000) with: {'batch_size': 40, 'epochs': 50}
0.851625 (0.005375) with: {'batch_size': 40, 'epochs': 100}
0.855750 (0.005500) with: {'batch_size': 60, 'epochs': 10}
0.855125 (0.005375) with: {'batch_size': 60, 'epochs': 50}
0.851250 (0.007500) with: {'batch_size': 60, 'epochs': 100}
0.855625 (0.004125) with: {'batch_size': 80, 'epochs': 10}
0.853125 (0.006375) with: {'batch_size': 80, 'epochs': 50}


Observations:
The best batch size is 10 and the best number of epochs is 10, with an accuracy of 85.76%.
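Grid search fits one model per parameter combination and per cross-validation fold, so its cost grows quickly with the size of the grid. Here is a sketch of the count for the batch size and epochs grid above, using only the standard library.

```python
from itertools import product

batch_size = [10, 20, 40, 60, 80, 100]
epochs = [10, 50, 100]
cv_folds = 2  # cv=2 in the GridSearchCV call

combinations = list(product(batch_size, epochs))
total_fits = len(combinations) * cv_folds
print(len(combinations), total_fits)  # 18 36
```

With the default `refit=True`, scikit-learn also trains one more model on the full training set using the best combination.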

Now let's build our final model with all the best parameters we have identified.


# Final Model based on the best optimizer, learning rate, batch size and epochs

In [39]:
model_Final = Sequential()

#First hidden layer: 30 units with softmax activation
model_Final.add(Dense(30, input_dim = 10, activation='softmax'))

#Dropout
model_Final.add(Dropout(0.2))

#Second hidden layer: 30 units with softmax activation
model_Final.add(Dense(30, activation='softmax'))

#Dropout
model_Final.add(Dropout(0.2))

#Output layer: 2 sigmoid units, one per one-hot class
model_Final.add(Dense(2, activation='sigmoid'))

#Compile the model
optimizer = Nadam(lr=0.01)
model_Final.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy'])

model_Final.fit(X_train_std, y_train_cat, 
        validation_data=(X_test_std, y_test_cat), 
        epochs=10,
        batch_size=10)

Train on 8000 samples, validate on 2000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7f7ebca03b00>

In [64]:
model_Final.summary()

Model: "sequential_50"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_99 (Dense)             (None, 30)                330       
_________________________________________________________________
dropout_38 (Dropout)         (None, 30)                0         
_________________________________________________________________
dense_100 (Dense)            (None, 30)                930       
_________________________________________________________________
dropout_39 (Dropout)         (None, 30)                0         
_________________________________________________________________
dense_101 (Dense)            (None, 2)                 62        
Total params: 1,322
Trainable params: 1,322
Non-trainable params: 0
_________________________________________________________________


# 9. Predict the results using 0.5 as a threshold

# Prediction for Model_1

In [0]:
y_pred_1 = model_1.predict(X_test_std)

In [43]:
print ("Prediction: ", y_pred_1[:10])

Prediction:  [[0.87802994 0.17836979]
 [0.5547338  0.41941348]
 [0.92430997 0.07918701]
 [0.67053413 0.3698973 ]
 [0.821352   0.20328718]
 [0.85741746 0.13361809]
 [0.9575169  0.04068792]
 [0.96654046 0.02547622]
 [0.9507282  0.04378313]
 [0.5357713  0.5494094 ]]


In [0]:
y_pred_1_threshold = (model_1.predict_proba(X_test_std) >= 0.5)

# Prediction for Final Model

In [0]:
y_pred_Final = model_Final.predict(X_test_std)

In [47]:
print ("Prediction: ", y_pred_Final[:10])

Prediction:  [[0.09267643 0.90823936]
 [0.8048177  0.19680128]
 [0.96092814 0.03905696]
 [0.4446677  0.5561017 ]
 [0.9536138  0.04605022]
 [0.96698797 0.03259215]
 [0.9480878  0.05187801]
 [0.97791576 0.02167287]
 [0.9717783  0.02807376]
 [0.08878946 0.9121022 ]]


In [0]:
y_pred_Final_threshold = (model_Final.predict_proba(X_test_std) >= 0.5)

# 10. Print the Accuracy score and confusion matrix

# Print the Accuracy score and confusion matrix for Model_1 (before hyperparameter tuning)

In [50]:
# Accuracy score for predictions without threshold

from sklearn import metrics
print("Accuracy score for predictions with no specified threshold for model_1: ", metrics.accuracy_score(y_test_cat, y_pred_1.round()))
print("Accuracy score for predictions with specified threshold 0.5 for model_1: ", metrics.accuracy_score(y_test_cat, y_pred_1_threshold.round()))


Accuracy score for predictions with no specified threshold for model_1:  0.804
Accuracy score for predictions with specified threshold 0.5 for model_1:  0.804
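Because `y_test_cat` and the rounded predictions are two-column arrays, `accuracy_score` treats them as multilabel data and counts a row as correct only when both columns match (subset accuracy). This can come out lower than the accuracy implied by an argmax-based confusion matrix. Rounding is also itself a 0.5 cut-off, which is why both printed scores agree. Here is a NumPy sketch of the two calculations on made-up values.

```python
import numpy as np

y_true = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])
probs = np.array([[0.9, 0.2],    # both columns agree with the label
                  [0.6, 0.7],    # both columns >= 0.5
                  [0.4, 0.45],   # both columns < 0.5
                  [0.1, 0.8]])   # both columns agree with the label

rounded = (probs >= 0.5).astype(int)

# Exact-match accuracy: every column in the row must be right
subset_acc = np.all(rounded == y_true, axis=1).mean()

# Argmax accuracy: only the larger column decides the class
argmax_acc = (probs.argmax(axis=1) == y_true.argmax(axis=1)).mean()
print(subset_acc, argmax_acc)  # 0.5 0.75
```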


In [52]:
print ("Confusion Matrix for predictions with no specified threshold for Model_1")
pd.DataFrame(metrics.confusion_matrix(y_test_cat.argmax(axis=1), y_pred_1.argmax(axis=1)),
                 columns=['pred_neg', 'pred_pos'], index=['neg', 'pos'])

Confusion Matrix for predictions with no specified threshold for Model_1


Unnamed: 0,pred_neg,pred_pos
neg,1543,46
pos,312,99


In [53]:
print ("Confusion Matrix for predictions with specified threshold 0.5 for model_1")
pd.DataFrame(metrics.confusion_matrix(y_test_cat.argmax(axis=1), y_pred_1_threshold.argmax(axis=1)),
                 columns=['pred_neg', 'pred_pos'], index=['neg', 'pos'])


Confusion Matrix for predictions with specified threshold 0.5 for model_1


Unnamed: 0,pred_neg,pred_pos
neg,1555,34
pos,328,83
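The two confusion matrices for model_1 differ because `argmax` is applied to different arrays: raw probabilities in the first, booleans from the 0.5 threshold in the second. When both columns land on the same side of 0.5, the boolean row is a tie and `argmax` returns index 0, so the row is counted as a negative prediction. Here is a sketch using two rows from the printed model_1 predictions plus one made-up row.

```python
import numpy as np

probs = np.array([[0.5547338, 0.41941348],   # from the printed predictions
                  [0.5357713, 0.5494094],    # from the printed predictions: both >= 0.5
                  [0.30, 0.45]])             # made-up row: both < 0.5

from_probs = probs.argmax(axis=1)            # larger probability wins
from_bools = (probs >= 0.5).argmax(axis=1)   # ties resolve to index 0
print(from_probs, from_bools)  # [0 1 1] [0 0 0]
```

This is consistent with the thresholded matrix showing fewer positive predictions (34 + 83) than the argmax matrix (46 + 99).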


In [55]:
from sklearn.metrics import classification_report
print ("Classification Report for predictions with no specified threshold")
print(classification_report(y_test_cat, y_pred_1.round()))

Classification Report for predictions with no specified threshold
              precision    recall  f1-score   support

           0       0.83      0.97      0.89      1589
           1       0.73      0.29      0.41       411

   micro avg       0.82      0.83      0.82      2000
   macro avg       0.78      0.63      0.65      2000
weighted avg       0.81      0.83      0.79      2000
 samples avg       0.82      0.83      0.82      2000





In [56]:
from sklearn.metrics import classification_report
print ("Classification Report for predictions with specified threshold 0.5")
print(classification_report(y_test_cat, y_pred_1_threshold))


Classification Report for predictions with specified threshold 0.5
              precision    recall  f1-score   support

           0       0.83      0.97      0.89      1589
           1       0.73      0.29      0.41       411

   micro avg       0.82      0.83      0.82      2000
   macro avg       0.78      0.63      0.65      2000
weighted avg       0.81      0.83      0.79      2000
 samples avg       0.82      0.83      0.82      2000





# Print the Accuracy score and confusion matrix for Final Model

In [58]:
# Accuracy score for predictions without threshold

from sklearn import metrics
print("Accuracy score for predictions with no specified threshold for model_Final: ", metrics.accuracy_score(y_test_cat, y_pred_Final.round()))
print("Accuracy score for predictions with specified threshold 0.5 for model_Final: ", metrics.accuracy_score(y_test_cat, y_pred_Final_threshold.round()))


Accuracy score for predictions with no specified threshold for model_Final:  0.8565
Accuracy score for predictions with specified threshold 0.5 for model_Final:  0.8565


In [59]:
print ("Confusion Matrix for predictions with no specified threshold for Model_Final")
pd.DataFrame(metrics.confusion_matrix(y_test_cat.argmax(axis=1), y_pred_Final.argmax(axis=1)),
                 columns=['pred_neg', 'pred_pos'], index=['neg', 'pos'])

Confusion Matrix for predictions with no specified threshold for Model_Final


Unnamed: 0,pred_neg,pred_pos
neg,1515,74
pos,211,200


In [60]:
print ("Confusion Matrix for predictions with specified threshold 0.5 for model_Final")
pd.DataFrame(metrics.confusion_matrix(y_test_cat.argmax(axis=1), y_pred_Final_threshold.argmax(axis=1)),
                 columns=['pred_neg', 'pred_pos'], index=['neg', 'pos'])


Confusion Matrix for predictions with specified threshold 0.5 for model_Final


Unnamed: 0,pred_neg,pred_pos
neg,1515,74
pos,211,200
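The headline metrics can be recomputed directly from the four cells of this confusion matrix, which is a useful sanity check on the classification report. Here is a plain-Python sketch using the values shown above.

```python
tn, fp, fn, tp = 1515, 74, 211, 200  # values from the confusion matrix above

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)   # of the predicted exits, how many really exited
recall = tp / (tp + fn)      # of the real exits, how many were caught
f1 = 2 * precision * recall / (precision + recall)
print(round(accuracy, 4), round(precision, 2), round(recall, 2), round(f1, 2))
# 0.8575 0.73 0.49 0.58
```

The accuracy here (0.8575) is slightly above the 0.8565 printed earlier. That is likely because `accuracy_score` on the two-column arrays requires both columns to match.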


In [61]:
from sklearn.metrics import classification_report
print ("Classification Report for predictions with no specified threshold")
print(classification_report(y_test_cat, y_pred_Final.round()))

Classification Report for predictions with no specified threshold
              precision    recall  f1-score   support

           0       0.88      0.95      0.91      1589
           1       0.73      0.49      0.58       411

   micro avg       0.86      0.86      0.86      2000
   macro avg       0.80      0.72      0.75      2000
weighted avg       0.85      0.86      0.85      2000
 samples avg       0.86      0.86      0.86      2000



In [62]:
from sklearn.metrics import classification_report
print ("Classification Report for predictions with specified threshold 0.5")
print(classification_report(y_test_cat, y_pred_Final_threshold))

Classification Report for predictions with specified threshold 0.5
              precision    recall  f1-score   support

           0       0.88      0.95      0.91      1589
           1       0.73      0.49      0.58       411

   micro avg       0.86      0.86      0.86      2000
   macro avg       0.80      0.72      0.75      2000
weighted avg       0.85      0.86      0.85      2000
 samples avg       0.86      0.86      0.86      2000



# Conclusion: After hyperparameter tuning, accuracy improved from about 80% to 85.6%, and overall precision, recall and F1-score improved as well. These results illustrate the gains obtained by tuning through grid search. With this model, we can predict reasonably well which customers will stay with the bank and which will exit.
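For context, the test set holds 1589 customers who stayed and 411 who exited (the support column in the reports), so a model that always predicts "stayed" already reaches a baseline accuracy. Here is a quick check of that figure, which helps put the 80% and 85.6% results in perspective.

```python
stayed, exited = 1589, 411  # support values from the classification reports

# Accuracy of always predicting the majority class ("stayed")
baseline_accuracy = stayed / (stayed + exited)
print(round(baseline_accuracy, 4))  # 0.7945
```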