# **Task 3: Customer Churn Prediction (Bank Customers)**

# **Introduction & Problem Statement**

Customer churn occurs when existing customers stop using a company’s services. For a bank, predicting churn matters because retaining an existing customer is generally cheaper than acquiring a new one.

In this task, we build a classification model using an Artificial Neural Network (ANN) that predicts whether a customer will leave the bank, based on demographic and financial information.

# **Import Libraries**

In [None]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import tensorflow
import keras
from keras.layers import Dense
from keras.models import Sequential
import pickle
import sklearn

# **Import DataSet**

In [None]:
data=pd.read_csv("/content/Churn_Modelling.csv")

In [None]:
data.head(3)

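| | CreditScore | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Exited |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 619 | 42 | 2 | 0.0 | 1 | 1 | 1 | 101348.88 | 1 |
| 1 | 608 | 41 | 1 | 83807.86 | 1 | 0 | 1 | 112542.58 | 0 |
| 2 | 502 | 42 | 8 | 159660.8 | 3 | 1 | 0 | 113931.57 | 1 |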


# **Check Missing Values**

In [None]:
data.isnull().sum()

| Column | Missing values |
|---|---|
| CreditScore | 0 |
| Age | 0 |
| Tenure | 0 |
| Balance | 0 |
| NumOfProducts | 0 |
| HasCrCard | 0 |
| IsActiveMember | 0 |
| EstimatedSalary | 0 |
| Exited | 0 |


The check confirms that no column contains missing values, so no imputation is needed before modelling.

# **Input & Output**

In [None]:
X=data.drop("Exited",axis=1)
y=data["Exited"]

In [None]:
X

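| | CreditScore | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary |
|---|---|---|---|---|---|---|---|---|
| 0 | 619 | 42 | 2 | 0.00 | 1 | 1 | 1 | 101348.88 |
| 1 | 608 | 41 | 1 | 83807.86 | 1 | 0 | 1 | 112542.58 |
| 2 | 502 | 42 | 8 | 159660.80 | 3 | 1 | 0 | 113931.57 |
| 3 | 699 | 39 | 1 | 0.00 | 2 | 0 | 0 | 93826.63 |
| 4 | 850 | 43 | 2 | 125510.82 | 1 | 1 | 1 | 79084.10 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 9995 | 771 | 39 | 5 | 0.00 | 2 | 1 | 0 | 96270.64 |
| 9996 | 516 | 35 | 10 | 57369.61 | 1 | 1 | 1 | 101699.77 |
| 9997 | 709 | 36 | 7 | 0.00 | 1 | 0 | 1 | 42085.58 |
| 9998 | 772 | 42 | 3 | 75075.31 | 2 | 1 | 0 | 92888.52 |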


# **Splitting the Data into Training and Testing Sets**

In [None]:
X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.2,random_state=42)

# **Scale Data**
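ANNs train more reliably when all inputs are on a comparable scale, so the features are standardized with StandardScaler (zero mean, unit variance per feature). The scaler is fitted on the training set only, and the same fitted scaler is reused later to transform the test set.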

In [None]:
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
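
To make the scaling step concrete, here is a minimal NumPy sketch of what standardization computes. The toy arrays `train` and `test` are hypothetical values, not rows from the dataset. The point is that the mean and standard deviation come from the training rows only and are then reused for the test rows.

```python
import numpy as np

# Hypothetical toy data: two features on very different scales
train = np.array([[600.0, 0.0], [700.0, 50000.0], [800.0, 100000.0]])
test = np.array([[650.0, 25000.0]])

# Statistics are computed on the training rows only
mean = train.mean(axis=0)
std = train.std(axis=0)  # population std (ddof=0), as StandardScaler uses

train_scaled = (train - mean) / std
test_scaled = (test - mean) / std  # reuse the training statistics

print(train_scaled.mean(axis=0), train_scaled.std(axis=0))
print(test_scaled)
```

After scaling, each training feature has mean 0 and standard deviation 1. The test row is expressed in the same units without its own values influencing the statistics.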

In [None]:
# Save the scaler using Pickle
with open('scaler.pkl', 'wb') as f:
    pickle.dump(scaler, f)

# **Make Architecture Of ANN Model**

In [None]:
ann=Sequential()

In [None]:
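# Input layer: one value per feature (8 features after dropping "Exited")
ann.add(keras.Input(shape=(8,)))
# Three hidden layers with ReLU activations
ann.add(Dense(8,activation="relu"))
ann.add(Dense(6,activation="relu"))
ann.add(Dense(4,activation="relu"))
# Single sigmoid unit outputs the churn probability
ann.add(Dense(1,activation="sigmoid"))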
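
As a rough sanity check on the size of this network, the sketch below counts the trainable parameters implied by the layer widths above (8 inputs, then 8, 6, 4 and 1 units). The helper `dense_params` is a hypothetical name introduced for illustration. The total should agree with what `ann.summary()` reports.

```python
# Layer widths as defined above: 8 inputs -> 8 -> 6 -> 4 -> 1
layer_sizes = [8, 8, 6, 4, 1]

def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out

# Pair each layer width with the next one to get (inputs, outputs) per Dense layer
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)

print(per_layer, total)  # [72, 54, 28, 5] 159
```

With only 159 parameters, the model is very small compared with the number of training rows.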


# **Compile the Model**

In [None]:
ann.compile(optimizer="adam",loss="binary_crossentropy",metrics=["accuracy"])
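
To illustrate what the `binary_crossentropy` loss measures, here is a minimal pure-Python sketch of the formula. The function below is a hand-written illustration, not the Keras implementation. The clipping constant `eps` and the example labels and probabilities are assumptions chosen for the demo.

```python
import math

def binary_crossentropy(y_true, p_pred, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Hypothetical labels (1 = churned) and predicted churn probabilities
confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])
uninformed = binary_crossentropy([1, 0], [0.5, 0.5])

print(confident_right, confident_wrong, uninformed)
```

Confident correct predictions give a loss near 0.105 and confident wrong ones give about 2.303. Predicting 0.5 for everyone gives ln 2 ≈ 0.693, which is a useful reference point when reading the loss values in the training log.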

# **Train The Model**

In [None]:
ann.fit(X_train,y_train,batch_size=100,epochs=50)

Epoch 1/50
80/80 ━━━━━━━━━━━━━━━━━━━━ 3s 3ms/step - accuracy: 0.4086 - loss: 0.7654
Epoch 2/50
80/80 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.7736 - loss: 0.6080
Epoch 3/50
80/80 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.7961 - loss: 0.4865
Epoch 4/50
80/80 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.7865 - loss: 0.4770
Epoch 5/50
80/80 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.7949 - loss: 0.4453
Epoch 6/50
80/80 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.7893 - loss: 0.4492
Epoch 7/50
80/80 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.7960 - loss: 0.4316
Epoch 8/50
80/80 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8081 - loss: 0.4126
Epoch 9/50
80/80 ━━━━━━━━━━━━━━━━━━━━
...

<keras.src.callbacks.history.History at 0x7fa1085f8310>
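
The `80/80` counter in the log is the number of batches per epoch. The sketch below reproduces it, along with the `63/63` and `250/250` counters printed by the later `predict` calls. It assumes the CSV holds 10,000 rows (the log does not state the row count) and that `predict` uses the Keras default batch size of 32.

```python
import math

n_rows = 10_000                 # assumption: total rows in the CSV
n_test = n_rows * 20 // 100     # test_size=0.2
n_train = n_rows - n_test

fit_steps = math.ceil(n_train / 100)           # batch_size=100 in ann.fit
predict_test_steps = math.ceil(n_test / 32)    # default batch size in predict
predict_train_steps = math.ceil(n_train / 32)

print(fit_steps, predict_test_steps, predict_train_steps)  # 80 63 250
```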

# **Make Prediction**

In [None]:
X_test = scaler.transform(X_test)

In [None]:
pred=ann.predict(X_test)

63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step


In [None]:
pred

array([[0.05926437],
       [0.03218027],
       [0.07540065],
       ...,
       [0.6453797 ],
       [0.15301114],
       [0.1760318 ]], dtype=float32)

# **Testing Accuracy**

In [None]:
pred=ann.predict(X_test)
# Label a customer as churned (1) when the predicted probability exceeds 0.5
pred_data=(pred[:,0]>0.5).astype(int)

63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step


In [None]:
from sklearn.metrics import accuracy_score

In [None]:
accuracy_score(y_test,pred_data)*100

85.0
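
Accuracy on its own can look good even when a model misses most churners, if churners are a minority of the customers. The sketch below uses hypothetical labels (not taken from this dataset) and a hand-written helper, `confusion_counts`, to show how accuracy and recall can disagree. On the real data, `confusion_matrix` or `classification_report` from `sklearn.metrics` could be applied to `y_test` and `pred_data` in the same spirit.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 = churned."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

# Hypothetical case: 2 churners among 10 customers, model predicts "stays" for all
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [0] * 10

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
recall = tp / (tp + fn) if (tp + fn) else 0.0  # share of churners caught

print(accuracy, recall)  # 0.8 0.0
```

Here a model that never flags anyone still reaches 80% accuracy while catching no churners, which is why recall on the churn class is worth reporting next to accuracy.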

# **Training Accuracy**

In [None]:
pred=ann.predict(X_train)
# Apply the same 0.5 threshold to the training-set probabilities
pred_data1=(pred[:,0]>0.5).astype(int)

250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step


In [None]:
accuracy_score(y_train,pred_data1)*100

85.925

# **Save Model**

In [None]:
import pickle
with open('Churn_model.pkl', 'wb') as f:
    pickle.dump(ann, f)

# **Conclusion – Key Insights**


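The ANN reaches 85.0% accuracy on the test set and 85.925% on the training set. The small gap between the two suggests the model is not overfitting.

An ANN can capture non-linear relationships between features that simpler linear models may miss, although no baseline model was trained in this task for comparison.

The model can help banks identify at-risk customers and target them with retention strategies.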