### Customer Churn dataset
Customer churn refers to the phenomenon where customers discontinue their relationship with a company.


_In this notebook we will train an MLP with backpropagation to get a first idea of __backpropagation__; the next notebooks cover it in more depth._

Get the data:

In [118]:
import kagglehub
import os
import shutil

# Download the latest version of the customer churn dataset
# kagglehub stores it in its cache, here "C:\Users\Arun\.cache\kagglehub\datasets\muhammadshahidazeem\customer-churn-dataset\versions\1"
path = kagglehub.dataset_download("muhammadshahidazeem/customer-churn-dataset")

print("Path to dataset files:", path)

my_path = r"C:\Users\Arun\Documents\Documents\Deep Learning\1_Introduction_To_ANNs\datasets\customer_churn"
os.makedirs(my_path, exist_ok=True)

for file_name in os.listdir(path):
    src = os.path.join(path,file_name)
    dst = os.path.join(my_path,file_name)
    shutil.copy(src=src, dst=dst)

Path to dataset files: C:\Users\Arun\.cache\kagglehub\datasets\muhammadshahidazeem\customer-churn-dataset\versions\1


In [119]:
import pandas as pd
data = pd.read_csv(r"datasets\customer_churn\customer_churn_dataset-training-master.csv")
data.head()

Unnamed: 0,CustomerID,Age,Gender,Tenure,Usage Frequency,Support Calls,Payment Delay,Subscription Type,Contract Length,Total Spend,Last Interaction,Churn
0,2.0,30.0,Female,39.0,14.0,5.0,18.0,Standard,Annual,932.0,17.0,1.0
1,3.0,65.0,Female,49.0,1.0,10.0,8.0,Basic,Monthly,557.0,6.0,1.0
2,4.0,55.0,Female,14.0,4.0,6.0,18.0,Basic,Quarterly,185.0,3.0,1.0
3,5.0,58.0,Male,38.0,21.0,7.0,7.0,Standard,Monthly,396.0,29.0,1.0
4,6.0,23.0,Male,32.0,20.0,5.0,8.0,Basic,Monthly,617.0,20.0,1.0


_Getting info:_

In [120]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 440833 entries, 0 to 440832
Data columns (total 12 columns):
 #   Column             Non-Null Count   Dtype  
---  ------             --------------   -----  
 0   CustomerID         440832 non-null  float64
 1   Age                440832 non-null  float64
 2   Gender             440832 non-null  object 
 3   Tenure             440832 non-null  float64
 4   Usage Frequency    440832 non-null  float64
 5   Support Calls      440832 non-null  float64
 6   Payment Delay      440832 non-null  float64
 7   Subscription Type  440832 non-null  object 
 8   Contract Length    440832 non-null  object 
 9   Total Spend        440832 non-null  float64
 10  Last Interaction   440832 non-null  float64
 11  Churn              440832 non-null  float64
dtypes: float64(9), object(3)
memory usage: 40.4+ MB


_Using describe method:_

In [121]:
data.describe()

Unnamed: 0,CustomerID,Age,Tenure,Usage Frequency,Support Calls,Payment Delay,Total Spend,Last Interaction,Churn
count,440832.0,440832.0,440832.0,440832.0,440832.0,440832.0,440832.0,440832.0,440832.0
mean,225398.667955,39.373153,31.256336,15.807494,3.604437,12.965722,631.616223,14.480868,0.567107
std,129531.91855,12.442369,17.255727,8.586242,3.070218,8.258063,240.803001,8.596208,0.495477
min,2.0,18.0,1.0,1.0,0.0,0.0,100.0,1.0,0.0
25%,113621.75,29.0,16.0,9.0,1.0,6.0,480.0,7.0,0.0
50%,226125.5,39.0,32.0,16.0,3.0,12.0,661.0,14.0,1.0
75%,337739.25,48.0,46.0,23.0,6.0,19.0,830.0,22.0,1.0
max,449999.0,65.0,60.0,30.0,10.0,30.0,1000.0,30.0,1.0


_Check labels:_

In [122]:
print("Labels to classify:", data['Churn'].unique())

Labels to classify: [ 1.  0. nan]


_Find the rows with null values, then drop them:_

In [123]:
data[data.isna().any(axis=1)]

Unnamed: 0,CustomerID,Age,Gender,Tenure,Usage Frequency,Support Calls,Payment Delay,Subscription Type,Contract Length,Total Spend,Last Interaction,Churn
199295,,,,,,,,,,,,


In [124]:
data.drop(199295,axis=0,inplace=True)

_Drop 'CustomerID' too; it is of no use as a feature:_

In [126]:
data.drop('CustomerID',axis=1,inplace=True)

_This is a large dataset, so we will use only a portion of it:_

In [128]:
# separate X,y
X = data.iloc[:,:-1]
y = data['Churn']

In [156]:
from sklearn.model_selection import train_test_split
import numpy as np
X_train, X_remove, y_train, y_remove = train_test_split(X,y,train_size=0.03,shuffle=True,stratify=y)
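
The `stratify=y` argument keeps the churn ratio in the 3% sample close to the ratio in the full data. The sketch below illustrates the idea with NumPy on made-up labels. The helper `stratified_sample_indices` and the toy label array are hypothetical, and this is not how scikit-learn implements it internally.

```python
import numpy as np

def stratified_sample_indices(labels, fraction, seed=0):
    """Pick roughly `fraction` of the indices from each class separately."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    chosen = []
    for cls in np.unique(labels):
        cls_idx = np.flatnonzero(labels == cls)            # positions of this class
        n_take = max(1, int(round(fraction * len(cls_idx))))
        chosen.append(rng.choice(cls_idx, size=n_take, replace=False))
    return np.sort(np.concatenate(chosen))

# Toy labels (assumption): 600 churned (1) and 400 retained (0) customers
toy_labels = np.array([1] * 600 + [0] * 400)
sample_idx = stratified_sample_indices(toy_labels, fraction=0.1)
sample_ratio = toy_labels[sample_idx].mean()
print(len(sample_idx), sample_ratio)
```

Because each class is sampled separately, the share of churned customers in the sample matches the share in the toy population.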

_Correlation:_

In [132]:
## Feature importances/ Correlation
numerical_features = ['Age','Tenure','Usage Frequency','Support Calls','Payment Delay','Total Spend','Last Interaction','Churn']
corr_mat = data[numerical_features].corr()['Churn']
corr_mat.sort_values(ascending=False)

Churn               1.000000
Support Calls       0.574267
Payment Delay       0.312129
Age                 0.218394
Last Interaction    0.149616
Usage Frequency    -0.046101
Tenure             -0.051919
Total Spend        -0.429355
Name: Churn, dtype: float64

_Data Preparation_:
- Scaling and Encoding

In [133]:
print(X_train.columns)

Index(['Age', 'Gender', 'Tenure', 'Usage Frequency', 'Support Calls',
       'Payment Delay', 'Subscription Type', 'Contract Length', 'Total Spend',
       'Last Interaction'],
      dtype='object')


In [138]:
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

numerical_cols = ['Age','Tenure','Usage Frequency','Support Calls','Payment Delay','Total Spend','Last Interaction']
categorical_cols = ['Gender','Subscription Type','Contract Length']

prep_pipe = ColumnTransformer([
    ('scaler',StandardScaler(),numerical_cols),
    ('ohe',OneHotEncoder(),categorical_cols)
])

X_train_prepared = prep_pipe.fit_transform(X_train)

In [140]:
##See the result
column_names = (
    list(prep_pipe.named_transformers_['scaler'].get_feature_names_out(numerical_cols))
    + list(prep_pipe.named_transformers_['ohe'].get_feature_names_out(categorical_cols))
)

df_transformed = pd.DataFrame(X_train_prepared, columns=column_names)
df_transformed

Unnamed: 0,Age,Tenure,Usage Frequency,Support Calls,Payment Delay,Total Spend,Last Interaction,Gender_Female,Gender_Male,Subscription Type_Basic,Subscription Type_Premium,Subscription Type_Standard,Contract Length_Annual,Contract Length_Monthly,Contract Length_Quarterly
0,0.383276,-1.701330,-1.715750,-0.515331,-1.334050,-2.070395,1.793230,1.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0
1,-0.018312,-1.585217,0.024201,-0.841817,-0.246547,0.090399,1.096548,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0
2,1.828992,1.607888,0.024201,-0.188845,-1.092383,-2.203186,1.096548,1.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0
3,0.784864,1.143436,-0.787776,-0.188845,-1.575718,-1.731983,-0.877386,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0
4,-0.098630,0.446759,0.836178,2.096556,-0.246547,0.372290,1.677117,0.0,1.0,0.0,1.0,0.0,0.0,0.0,1.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
13219,-1.062440,-0.714370,0.488188,-1.168303,-0.971549,0.154467,0.283752,0.0,1.0,0.0,1.0,0.0,0.0,0.0,1.0
13220,-0.660852,-1.352991,-1.483757,-0.841817,-0.125713,-0.363027,-1.225727,1.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0
13221,-0.580535,1.607888,-0.091796,-0.188845,0.478456,0.031571,-1.341841,0.0,1.0,0.0,0.0,1.0,1.0,0.0,0.0
13222,-1.062440,-0.656314,-1.483757,-1.168303,0.478456,1.166561,-1.574068,0.0,1.0,1.0,0.0,0.0,1.0,0.0,0.0
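
To make the two preprocessing steps concrete, here is a small NumPy sketch of what standard scaling and one-hot encoding compute. The helpers `standardize` and `one_hot` are hypothetical stand-ins, not the scikit-learn implementation; the values are the `Age` and `Contract Length` entries of the five rows shown by `data.head()`.

```python
import numpy as np

def standardize(column):
    """Subtract the mean and divide by the population standard deviation."""
    column = np.asarray(column, dtype=float)
    return (column - column.mean()) / column.std()

def one_hot(values):
    """Return the sorted categories and one 0/1 column per category."""
    values = np.asarray(values)
    categories = np.unique(values)
    encoded = (values[:, None] == categories[None, :]).astype(float)
    return categories, encoded

ages = [30.0, 65.0, 55.0, 58.0, 23.0]
contracts = ["Annual", "Monthly", "Quarterly", "Monthly", "Monthly"]

scaled_ages = standardize(ages)
categories, contract_ohe = one_hot(contracts)
print(scaled_ages.round(3))
print(categories)
print(contract_ohe)
```

After scaling, the column has mean 0 and standard deviation 1; after encoding, each row has exactly one 1 among the category columns.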


### Build the architecture

In [141]:
import tensorflow
from tensorflow import keras
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Input

In [145]:
# Keras has two kinds of models: Sequential (layers stacked one after another) and non-sequential
model = Sequential()


# Input layer: one value per feature (15 features after preprocessing)
# It only declares the input shape; it has no weights and no activation function
model.add(Input(shape=(X_train_prepared.shape[1],)))

# Dense layer: each neuron is connected to every neuron in the previous layer
# Hidden layer with 15 neurons and sigmoid activation
model.add(Dense(15,activation='sigmoid'))

#Output layer
model.add(Dense(1,activation='sigmoid'))

In [146]:
##Summary about model
model.summary()

- _There are 225 weights and 15 biases in layer 1 (from input to hidden layer)_
- _There are 15 weights and 1 bias in layer 2 (from hidden layer to output layer)_
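
The counts above follow from the layer sizes: a dense layer has one weight per input-unit pair plus one bias per unit. A short check, where the helper name `dense_param_count` is made up for illustration:

```python
def dense_param_count(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

hidden_params = dense_param_count(15, 15)   # 225 weights + 15 biases
output_params = dense_param_count(15, 1)    # 15 weights + 1 bias
total_params = hidden_params + output_params
print(hidden_params, output_params, total_params)  # 240 16 256
```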

In [147]:
##Compile the model
model.compile(loss='binary_crossentropy',optimizer='Adam',metrics=['accuracy'])  
# binary_crossentropy is the log-loss function, suited to binary classification problems like this one
# optimizer='Adam': other optimizers exist; Adam is an adaptive variant of stochastic gradient descent and a common default
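
As a rough illustration of what the log-loss measures, the sketch below computes binary cross-entropy by hand with NumPy. The function name and the clipping constant `eps` are assumptions made for numerical safety in this sketch; the numbers are toy values, not model outputs.

```python
import numpy as np

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean log-loss; probabilities are clipped so log(0) never occurs."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1 - eps)
    losses = y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob)
    return float(-np.mean(losses))

confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])
print(confident_right, confident_wrong)
```

Confident correct predictions give a small loss, while confident wrong predictions are penalised heavily.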

In [149]:
#Train the model
model.fit(X_train_prepared,y_train,batch_size=50,epochs=50,verbose=1,validation_split=0.2)

Epoch 1/50
212/212 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7308 - loss: 0.5835 - val_accuracy: 0.8408 - val_loss: 0.4846
Epoch 2/50
212/212 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8652 - loss: 0.4238 - val_accuracy: 0.8733 - val_loss: 0.3653
Epoch 3/50
212/212 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8782 - loss: 0.3391 - val_accuracy: 0.8828 - val_loss: 0.3057
Epoch 4/50
212/212 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8846 - loss: 0.2979 - val_accuracy: 0.8900 - val_loss: 0.2744
Epoch 5/50
212/212 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8895 - loss: 0.2764 - val_accuracy: 0.8938 - val_loss: 0.2575
Epoch 6/50
212/212 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8910 - loss: 0.2643 - val_accuracy: 0.8960 - val_loss: 0.2473
Epoch 7/50
212/212

<keras.src.callbacks.history.History at 0x11d184da780>
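
Keras runs backpropagation inside `model.fit`. To get an idea of what one update does, here is a minimal NumPy sketch for a network of the same form (one sigmoid hidden layer, one sigmoid output, log-loss). It uses plain full-batch gradient descent rather than Adam, and the toy data, layer sizes, learning rate and the name `train_step` are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_step(X, y, params, lr=0.5):
    """One forward pass, one backward pass, one gradient-descent update."""
    W1, b1, W2, b2 = params
    n = X.shape[0]

    # Forward pass
    a1 = sigmoid(X @ W1 + b1)            # hidden activations, shape (n, hidden)
    y_hat = sigmoid(a1 @ W2 + b2)        # predicted probabilities, shape (n, 1)
    loss = -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

    # Backward pass (chain rule, output layer first)
    dz2 = (y_hat - y) / n                # gradient w.r.t. output pre-activation
    dW2 = a1.T @ dz2
    db2 = dz2.sum(axis=0)
    dz1 = (dz2 @ W2.T) * a1 * (1 - a1)   # propagate through the hidden sigmoid
    dW1 = X.T @ dz1
    db1 = dz1.sum(axis=0)

    new_params = (W1 - lr * dW1, b1 - lr * db1, W2 - lr * dW2, b2 - lr * db2)
    return new_params, float(loss)

# Toy data (assumption): 3 features, label depends on the first two
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(200, 3))
y_toy = (X_toy[:, :1] + X_toy[:, 1:2] > 0).astype(float)   # shape (200, 1)

params = (rng.normal(scale=0.5, size=(3, 4)), np.zeros(4),
          rng.normal(scale=0.5, size=(4, 1)), np.zeros(1))

losses = []
for _ in range(200):
    params, loss = train_step(X_toy, y_toy, params)
    losses.append(loss)
print(losses[0], losses[-1])
```

The gradient of the loss with respect to the output pre-activation reduces to `y_hat - y` because the sigmoid derivative cancels against the log-loss derivative, which is one reason this loss pairs well with a sigmoid output.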

In [159]:
## The trained parameters are stored in the model's "layers"
print(model.layers)

# Weights and bias of the output layer (second Dense layer)
print(model.layers[1].get_weights())

[<Dense name=dense_6, built=True>, <Dense name=dense_7, built=True>]
[array([[ 2.6344788],
       [ 3.02962  ],
       [-1.9124633],
       [-1.3744726],
       [ 1.1410298],
       [ 4.32654  ],
       [ 1.6716506],
       [ 1.9446043],
       [ 2.7251005],
       [ 3.0100431],
       [ 2.542097 ],
       [-1.7008858],
       [ 2.882056 ],
       [-1.4749913],
       [-1.3744146]], dtype=float32), array([0.10938305], dtype=float32)]


- _`model.layers[0]` holds the 225 weights and 15 biases of layer 1 (from input to hidden layer)_
- _`model.layers[1]`, printed above, holds the 15 weights and 1 bias of layer 2 (from hidden layer to output layer)_

_Prepare Test data:_

In [161]:
# Preparing test data for evaluations
test_data = pd.read_csv(r"C:\Users\Arun\Documents\Documents\Deep Learning\1_Introduction_To_ANNs\datasets\customer_churn\customer_churn_dataset-testing-master.csv")

X_test = test_data.iloc[:,1:-1]
y_test = test_data['Churn']

X_test_preped = prep_pipe.transform(X_test)

_Make predictions:_

In [162]:
y_pred_prob = model.predict(X_test_preped)
y_pred_prob

2012/2012 ━━━━━━━━━━━━━━━━━━━━ 1s 632us/step


array([[0.9999997 ],
       [1.        ],
       [0.88793147],
       ...,
       [0.99999994],
       [1.        ],
       [1.        ]], shape=(64374, 1), dtype=float32)

Since the output layer uses the 'sigmoid' activation function, each output is a probability.   
For binary classification, convert these probabilities to binary labels with a threshold of 0.5.

In [163]:
(y_pred_prob<0.5).sum()

np.int64(4112)

In [164]:
import numpy as np
y_pred = np.where(y_pred_prob>0.5, 1,0)

In [165]:
np.unique(y_pred,return_counts=True)

(array([0, 1]), array([ 4112, 60262]))

_Check accuracy:_

In [166]:
from sklearn.metrics import accuracy_score
accuracy_score(y_test,y_pred)

0.5339267406095629
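
Accuracy alone does not show which class the errors come from. One possible next step is to split the predictions into true/false positives and negatives. The helper `confusion_counts` below is a hypothetical NumPy sketch run on toy arrays, not on the notebook's data.

```python
import numpy as np

def confusion_counts(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary 0/1 labels."""
    y_true = np.asarray(y_true).astype(int).ravel()
    y_pred = np.asarray(y_pred).astype(int).ravel()   # flattens (n, 1) predictions
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    return tn, fp, fn, tp

# Toy example (assumption): predictions shaped (n, 1), like the thresholded model output
toy_true = np.array([0, 0, 0, 1, 1, 1])
toy_pred = np.array([[1], [1], [0], [1], [1], [1]])
tn, fp, fn, tp = confusion_counts(toy_true, toy_pred)
accuracy = (tn + tp) / (tn + fp + fn + tp)
print(tn, fp, fn, tp, accuracy)
```

With the notebook's variables the call would be `confusion_counts(y_test, y_pred)`. Comparing `y_train.mean()` with `y_test.mean()` is another quick check of whether the two files have a similar churn ratio.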

_I give up for now: the accuracy is not increasing (about 0.53 on the test set, against about 0.89 validation accuracy during training)._
_I am doing something wrong here and will check it out later._