### Data taken from https://archive.ics.uci.edu/dataset/563/iranian+churn+dataset, which comes from an Iranian telecom company's database. We aim to predict whether a customer stays with the company (churn = 0) or leaves it (churn = 1). For more information on the data columns and what they mean, please see the link above. The data is also available in the repository.

In [1]:
import numpy as np
import pandas as pd
import tensorflow as tf

In [2]:
dataset = pd.read_csv('classification_data.csv')

In [3]:
dataset

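| | Call Failure | Complains | Subscription Length | Charge Amount | Seconds of Use | Frequency of use | Frequency of SMS | Distinct Called Numbers | Age Group | Tariff Plan | Status | Age | Customer Value | Churn |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 8 | 0 | 38 | 0 | 4370 | 71 | 5 | 17 | 3 | 1 | 1 | 30 | 197.640 | 0 |
| 1 | 0 | 0 | 39 | 0 | 318 | 5 | 7 | 4 | 2 | 1 | 2 | 25 | 46.035 | 0 |
| 2 | 10 | 0 | 37 | 0 | 2453 | 60 | 359 | 24 | 3 | 1 | 1 | 30 | 1536.520 | 0 |
| 3 | 10 | 0 | 38 | 0 | 4198 | 66 | 1 | 35 | 1 | 1 | 1 | 15 | 240.020 | 0 |
| 4 | 3 | 0 | 38 | 0 | 2393 | 58 | 2 | 33 | 1 | 1 | 1 | 15 | 145.805 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 3145 | 21 | 0 | 19 | 2 | 6697 | 147 | 92 | 44 | 2 | 2 | 1 | 25 | 721.980 | 0 |
| 3146 | 17 | 0 | 17 | 1 | 9237 | 177 | 80 | 42 | 5 | 1 | 1 | 55 | 261.210 | 0 |
| 3147 | 13 | 0 | 18 | 4 | 3157 | 51 | 38 | 21 | 3 | 1 | 1 | 30 | 280.320 | 0 |
| 3148 | 7 | 0 | 11 | 2 | 4695 | 46 | 222 | 12 | 3 | 1 | 1 | 30 | 1077.640 | 0 |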


**Feature matrix and target vector from the data**

In [4]:
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values

**Columns that contain binary variables should be coded as 0 and 1. The "Complains" column already satisfies this. However, the "Tariff Plan" and "Status" columns are coded as 1 and 2, so we subtract 1 from each:**

In [5]:
X[:,9] = X[:,9] - 1
X[:,10] = X[:,10] - 1
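**The positional indices 9 and 10 only work as long as the column order stays the same. As a minimal sketch, the same shift can be done by column name on the DataFrame before extracting the arrays. The toy values below are made up for illustration; only the column names come from the data.**

```python
import pandas as pd

# Toy frame with the two 1/2-coded columns (values are invented for illustration).
toy = pd.DataFrame({
    "Complains": [0, 1, 0],
    "Tariff Plan": [1, 2, 1],
    "Status": [2, 1, 1],
})

# Shift the 1/2 codes to 0/1 by column name instead of by position.
binary_cols = ["Tariff Plan", "Status"]
toy[binary_cols] = toy[binary_cols] - 1

print(toy)
```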

**Train-test split**

In [6]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)
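**Churners are the minority class, so it can help to keep the class ratio identical in both parts. The sketch below is an optional variation, not what the notebook does: it passes `stratify` to `train_test_split` on synthetic data (the arrays `X_demo` and `y_demo` are made up).**

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 1000 rows, 15% positives (illustrative only).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(1000, 4))
y_demo = np.array([1] * 150 + [0] * 850)

# stratify=y_demo keeps the share of positives equal in train and test.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0, stratify=y_demo
)

print(y_tr.mean(), y_te.mean())
```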

**Scaling the features via a standard scaler**

In [7]:
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
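**Note that the scaler is fitted on the training set only and then reused on the test set, so no test information leaks into preprocessing. A small sketch on synthetic data (the arrays `train_demo` and `test_demo` are made up) shows what the two calls do:**

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in features (illustrative only).
rng = np.random.default_rng(0)
train_demo = rng.normal(loc=50.0, scale=10.0, size=(200, 3))
test_demo = rng.normal(loc=50.0, scale=10.0, size=(50, 3))

scaler = StandardScaler()
train_scaled = scaler.fit_transform(train_demo)  # learns mean/std from the training data
test_scaled = scaler.transform(test_demo)        # reuses those training statistics

# Training columns end up with mean ~0 and std ~1; test columns only approximately.
print(train_scaled.mean(axis=0).round(6), train_scaled.std(axis=0).round(6))
print(test_scaled.mean(axis=0).round(3))
```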

**The structure of our neural network classifier is defined below. You can add layers, change the number of nodes and/or the activation functions, and even add dropout regularization between the layers. The output layer uses a sigmoid activation function to ensure that the output is always between 0 and 1.**

In [8]:
ann = tf.keras.models.Sequential()
ann.add(tf.keras.layers.Dense(units=6, activation='relu'))
ann.add(tf.keras.layers.Dense(units=6, activation='relu'))
ann.add(tf.keras.layers.Dense(units=1, activation='sigmoid'))
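**To see why the sigmoid output stays between 0 and 1, here is a plain NumPy sketch of the forward pass of a 6-6-1 network. The weights are random rather than trained, and `n_features = 13` is an assumption based on the columns shown above; this is not the Keras implementation.**

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_features = 13  # assumption: number of feature columns in the table above

# Random (untrained) weights for two hidden layers of 6 units and one output unit.
W1, b1 = 0.5 * rng.normal(size=(n_features, 6)), np.zeros(6)
W2, b2 = 0.5 * rng.normal(size=(6, 6)), np.zeros(6)
W3, b3 = 0.5 * rng.normal(size=(6, 1)), np.zeros(1)

def forward(x):
    h1 = relu(x @ W1 + b1)
    h2 = relu(h1 @ W2 + b2)
    return sigmoid(h2 @ W3 + b3)  # squashes any real number into the unit interval

probs = forward(rng.normal(size=(5, n_features)))
print(probs.ravel())
```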

**Below, the model is compiled with the Adam optimizer. For binary classification problems we use a loss function such as binary_crossentropy.**

In [9]:
ann.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
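**Binary cross-entropy averages -[y log(p) + (1 - y) log(1 - p)] over the samples, so confident wrong predictions are penalized heavily. The helper below is a hand-written NumPy sketch of that formula, not the Keras function itself; the clipping constant `eps` is an assumption added to avoid log(0).**

```python
import numpy as np

def binary_crossentropy(y_true, p_pred, eps=1e-7):
    # Clip probabilities away from 0 and 1 so the logarithms stay finite.
    p = np.clip(p_pred, eps, 1.0 - eps)
    return float(np.mean(-(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))))

y_true = np.array([0, 0, 1, 1])
confident_right = np.array([0.1, 0.2, 0.9, 0.8])
confident_wrong = np.array([0.9, 0.8, 0.1, 0.2])

loss_right = binary_crossentropy(y_true, confident_right)
loss_wrong = binary_crossentropy(y_true, confident_wrong)
print(loss_right, loss_wrong)
```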

**We use a batch size of 32 and train for 100 epochs. You can modify both.**

In [10]:
ann.fit(X_train, y_train, batch_size = 32, epochs = 100)

Epoch 1/100
79/79 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - accuracy: 0.6386 - loss: 0.6743
Epoch 2/100
79/79 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8595 - loss: 0.4968
Epoch 3/100
79/79 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8917 - loss: 0.3795
Epoch 4/100
79/79 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.9005 - loss: 0.3108
Epoch 5/100
79/79 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9010 - loss: 0.2662
Epoch 6/100
79/79 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8991 - loss: 0.2378
Epoch 7/100
79/79 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8966 - loss: 0.2352
Epoch 8/100
79/79 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.9012 - loss: 0.2229
Epoch 9/100
79/79 ━━━━━━━━━━━━━━━━━

<keras.src.callbacks.history.History at 0x781201fcab70>
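**The "79/79" in the training log is the number of batches per epoch. The arithmetic can be checked as follows, assuming the dataset has 3150 rows (the count given on the UCI page, not printed in this notebook):**

```python
import math

n_rows = 3150      # assumption: row count of the UCI dataset
test_size = 0.2
batch_size = 32

n_test = math.ceil(n_rows * test_size)  # scikit-learn rounds the test share up
n_train = n_rows - n_test

steps_per_epoch = math.ceil(n_train / batch_size)  # batches per training epoch
predict_steps = math.ceil(n_test / batch_size)     # batches when predicting on the test set

print(n_train, n_test, steps_per_epoch, predict_steps)
```

**Under that assumption the sketch gives 2520 training rows, 630 test rows, 79 training batches and 20 prediction batches. The last batch of each epoch is smaller than 32.**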

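**The prediction phase. The network outputs a probability for each customer, which we convert to a class label with a 0.5 threshold. You can compare the predicted and actual values here.**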

In [11]:
y_pred = ann.predict(X_test)
y_pred = (y_pred > 0.5).astype(int)


20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step
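**One simple way to compare predictions with the true labels is to place them side by side. The sketch below uses made-up probabilities (`probs`) and labels (`labels`) in place of the network output and `y_test`:**

```python
import numpy as np

# Made-up probabilities with shape (n, 1), like the output of a sigmoid layer.
probs = np.array([[0.91], [0.12], [0.50], [0.67]])
labels = np.array([1, 0, 1, 0])

# Values strictly above 0.5 become class 1; exactly 0.5 becomes class 0.
preds = (probs > 0.5).astype(int)

# Column 0: predicted label, column 1: actual label.
side_by_side = np.column_stack((preds.ravel(), labels))
print(side_by_side)
```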


**Below we show the confusion matrix and accuracy score to evaluate the performance of the network. See if you can make it even more accurate.**

In [13]:
from sklearn.metrics import confusion_matrix, accuracy_score
cm = confusion_matrix(y_test, y_pred)
print(cm)
accuracy_score(y_test, y_pred)

[[510  11]
 [ 33  76]]


0.9301587301587302
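**Accuracy alone can hide how well the minority class is detected. As a sketch, precision, recall and F1 for the churn class can be derived from the confusion matrix printed above (rows are actual classes, columns are predicted classes). The majority-class baseline is included for comparison.**

```python
import numpy as np

# Confusion matrix printed above: [[TN, FP], [FN, TP]].
cm = np.array([[510, 11],
               [33, 76]])
tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()
precision = tp / (tp + fp)   # share of predicted churners who actually churned
recall = tp / (tp + fn)      # share of actual churners the model caught
f1 = 2 * precision * recall / (precision + recall)
baseline = (tn + fp) / cm.sum()  # accuracy of always predicting "stays"

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f} baseline={baseline:.3f}")
```

**Here recall is about 0.70, meaning roughly 30% of churners in the test set are missed even though accuracy is 0.93, and always predicting "stays" would already reach about 0.83.**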