# Bank Customer Churn Prediction with an ANN

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


### ABSTRACT

This project predicts whether a customer will leave a bank (churn). Each customer record includes credit score, country, gender, age, tenure (how long they have been with the bank), account balance, number of products used, credit card ownership, activity status, and estimated salary. We train an Artificial Neural Network (ANN) on these features to classify each customer as likely to stay or likely to leave. The goal is a tool that helps the bank identify and retain customers at risk of leaving, which in turn should improve customer satisfaction and the bank's overall performance.

Importing necessary libraries

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

Reading the dataset

In [None]:
df=pd.read_csv("/content/drive/MyDrive/ML 1/Churn_Predictions.csv")

Displaying the first few rows of the dataset

In [None]:
df.head()

Unnamed: 0,RowNumber,CustomerId,Surname,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited
0,1,15634602,Hargrave,619,France,Female,42,2,0.0,1,1,1,101348.88,1
1,2,15647311,Hill,608,Spain,Female,41,1,83807.86,1,0,1,112542.58,0
2,3,15619304,Onio,502,France,Female,42,8,159660.8,3,1,0,113931.57,1
3,4,15701354,Boni,699,France,Female,39,1,0.0,2,0,0,93826.63,0
4,5,15737888,Mitchell,850,Spain,Female,43,2,125510.82,1,1,1,79084.1,0


Checking data types of each column

In [None]:
df.dtypes

RowNumber            int64
CustomerId           int64
Surname             object
CreditScore          int64
Geography           object
Gender              object
Age                  int64
Tenure               int64
Balance            float64
NumOfProducts        int64
HasCrCard            int64
IsActiveMember       int64
EstimatedSalary    float64
Exited               int64
dtype: object

Checking for missing values

In [None]:
df.isna().any()

RowNumber          False
CustomerId         False
Surname            False
CreditScore        False
Geography          False
Gender             False
Age                False
Tenure             False
Balance            False
NumOfProducts      False
HasCrCard          False
IsActiveMember     False
EstimatedSalary    False
Exited             False
dtype: bool

Checking for columns with zeros

In [None]:
(df==0).any()

RowNumber          False
CustomerId         False
Surname            False
CreditScore        False
Geography          False
Gender             False
Age                False
Tenure              True
Balance             True
NumOfProducts      False
HasCrCard           True
IsActiveMember      True
EstimatedSalary    False
Exited              True
dtype: bool

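Replacing zero values in 'Tenure' and 'Balance' with the column mean (this treats zeros as missing values, although a zero tenure or a zero balance may be a legitimate observation)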

In [None]:
df['Tenure']=df['Tenure'].replace([0],df['Tenure'].mean())
df['Balance']=df['Balance'].replace([0],df['Balance'].mean())
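
Before imputing, it can help to check how many zeros each column holds and what mean replacement actually does to them. A minimal sketch on a small hypothetical frame (the values are made up for illustration and are not rows from the dataset):

```python
import pandas as pd

# Hypothetical toy data; not taken from Churn_Predictions.csv.
toy = pd.DataFrame({"Tenure": [0, 2, 4], "Balance": [0.0, 0.0, 300.0]})

# Count zeros per column before deciding how to treat them.
zero_counts = (toy == 0).sum()

# Mean replacement as in the cell above: the mean is computed with the zeros included.
balance_mean = toy["Balance"].mean()
imputed = toy["Balance"].replace(0, balance_mean)

print(zero_counts.to_dict())  # {'Tenure': 1, 'Balance': 2}
print(imputed.tolist())       # [100.0, 100.0, 300.0]
```

Because the zeros pull the mean down, every replaced entry ends up with the same, fairly low value. If many rows are affected, this can flatten a signal that the model might otherwise use.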

Rechecking for zero values

In [None]:
(df==0).any()

RowNumber          False
CustomerId         False
Surname            False
CreditScore        False
Geography          False
Gender             False
Age                False
Tenure             False
Balance            False
NumOfProducts      False
HasCrCard           True
IsActiveMember      True
EstimatedSalary    False
Exited              True
dtype: bool

Data Preprocessing: Label Encoding for 'Geography' and 'Gender' columns

In [None]:
from sklearn.preprocessing import LabelEncoder

le=LabelEncoder()
df['Geography']=le.fit_transform(df['Geography'])
df['Gender']=le.fit_transform(df['Gender'])
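
LabelEncoder assigns integer codes in sorted order, so a nominal column such as 'Geography' acquires an artificial ordering. One common alternative is one-hot encoding. The sketch below uses a hypothetical toy frame, and adopting it in this notebook would change the number of feature columns (and therefore `input_dim` in the model):

```python
import pandas as pd

# Hypothetical toy column; the country names match those seen in df.head().
toy = pd.DataFrame({"Geography": ["France", "Spain", "Germany", "France"]})

# One indicator column per category; no ordering is implied between countries.
one_hot = pd.get_dummies(toy, columns=["Geography"], dtype=int)

print(one_hot.columns.tolist())
# ['Geography_France', 'Geography_Germany', 'Geography_Spain']
print(one_hot.iloc[1].tolist())  # the 'Spain' row -> [0, 0, 1]
```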

In [None]:
df.head()

Unnamed: 0,RowNumber,CustomerId,Surname,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited
0,1,15634602,Hargrave,619,0,0,42,2.0,76485.889288,1,1,1,101348.88,1
1,2,15647311,Hill,608,2,0,41,1.0,83807.86,1,0,1,112542.58,0
2,3,15619304,Onio,502,0,0,42,8.0,159660.8,3,1,0,113931.57,1
3,4,15701354,Boni,699,0,0,39,1.0,76485.889288,2,0,0,93826.63,0
4,5,15737888,Mitchell,850,2,0,43,2.0,125510.82,1,1,1,79084.1,0


Splitting the dataset into features (X) and target variable (y)

In [None]:
X=df.iloc[:,3:-1].values
y=df.iloc[:,-1].values

In [None]:
X.shape

(10000, 10)

Splitting the dataset into training and testing sets

In [None]:
from sklearn.model_selection import train_test_split

X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.2)
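
Without a fixed seed, every run produces a different split, so the reported accuracy will vary between runs. A minimal sketch of a reproducible, stratified split on hypothetical toy arrays (the seed value 42 and the 80/20 class ratio are assumptions chosen for illustration):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced labels: 80 zeros and 20 ones.
X_toy = np.arange(200).reshape(100, 2)
y_toy = np.array([0] * 80 + [1] * 20)

# random_state fixes the shuffle; stratify keeps the class ratio in both splits.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.2, random_state=42, stratify=y_toy
)

print(len(y_tr), len(y_te), int(y_te.sum()))  # 80 20 4
```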

Feature Scaling using StandardScaler

In [None]:
from sklearn.preprocessing import StandardScaler

sc=StandardScaler()
X_train=sc.fit_transform(X_train)
X_test=sc.transform(X_test)  # reuse the training-set mean and std; do not refit on the test set
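
Scaling statistics are normally learned from the training split only and then reused on the test split. Refitting on the test data would put the two splits on different scales. A minimal sketch with hypothetical one-column arrays shows the difference:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: training mean is 5 and standard deviation is 5.
train = np.array([[0.0], [10.0]])
test = np.array([[5.0], [20.0]])

scaler = StandardScaler()
train_scaled = scaler.fit_transform(train)  # learns mean and std from train
test_scaled = scaler.transform(test)        # applies the same mean and std

# Refitting on the test set hides the fact that 20 lies far outside the training range.
refit_scaled = StandardScaler().fit_transform(test)

print(test_scaled.ravel().tolist())   # [0.0, 3.0]
print(refit_scaled.ravel().tolist())  # [-1.0, 1.0]
```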

Building the Artificial Neural Network (ANN) Model

In [None]:
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD

In [None]:
ann=Sequential()

First hidden layer: 10 neurons with ReLU activation (input_dim=10 declares the 10 input features)

In [None]:
ann.add(Dense(10,input_dim=10,activation='relu'))

Second hidden layer: 8 neurons with ReLU activation

In [None]:
ann.add(Dense(8,activation='relu'))

Output Layer: 1 neuron with Sigmoid activation

In [None]:
ann.add(Dense(1,activation='sigmoid'))
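
The size of this network can be checked by hand: each Dense layer holds one weight per input-output pair plus one bias per output neuron. A small sketch of that arithmetic for the three layers above, written in plain Python without Keras:

```python
# Layer widths: 10 input features, then Dense(10), Dense(8), Dense(1).
layer_sizes = [10, 10, 8, 1]

# Parameters per Dense layer = weights (n_in * n_out) + biases (n_out).
params_per_layer = [
    n_in * n_out + n_out
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
total_params = sum(params_per_layer)

print(params_per_layer, total_params)  # [110, 88, 9] 207
```

The same total should appear in the output of `ann.summary()`.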

Compiling the ANN

In [None]:
optimizer=SGD(learning_rate=0.01)
ann.compile(optimizer=optimizer,loss='binary_crossentropy',metrics=['accuracy'])
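
Binary cross-entropy penalizes confident wrong predictions much more heavily than uncertain ones. The sketch below computes the mean loss in plain Python. The helper name `binary_crossentropy` and the clipping constant `eps` are assumptions for illustration, not the Keras implementation:

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -(t*log(p) + (1-t)*log(1-p)) over all samples."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

uncertain = binary_crossentropy([1, 0], [0.5, 0.5])  # equals ln(2)
confident = binary_crossentropy([1, 0], [0.9, 0.1])  # correct and confident
wrong = binary_crossentropy([1, 0], [0.1, 0.9])      # confident but wrong

print(confident < uncertain < wrong)  # True
```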

Training the ANN

In [None]:
ann.fit(X_train,y_train,epochs=100)

Epoch 1/100
...
Epoch 78

<keras.src.callbacks.History at 0x7eeaa4068ee0>

Making predictions on the test set

In [None]:
y_pred=ann.predict(X_test)
y_pred=(y_pred>0.5)
y_pred



array([[False],
       [False],
       [False],
       ...,
       [False],
       [False],
       [False]])

Evaluating the model on the test set

In [None]:
loss,accuracy=ann.evaluate(X_test,y_test)



Accuracy calculation using scikit-learn

In [None]:
from sklearn.metrics import accuracy_score

print("Testing Accuracy:", accuracy_score(y_test,y_pred))

Testing Accuracy: 0.8605
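
Accuracy alone can look good even when a model misses most churners, if the 'Exited' class is a minority. Whether that applies here depends on the class balance, which this notebook does not report. A minimal sketch of precision and recall on hypothetical labels (the helper name `confusion_counts` and the toy data are made up for illustration):

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

# Hypothetical example: 3 churners out of 10, only one of them detected.
y_true_toy = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred_toy = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true_toy, y_pred_toy)
accuracy = (tp + tn) / len(y_true_toy)
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print(accuracy, precision, round(recall, 3))  # 0.8 1.0 0.333
```

On the real test set, `sklearn.metrics.confusion_matrix(y_test, y_pred)` and `classification_report(y_test, y_pred)` give the same quantities.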
