In [1]:
##### Project:- Binary Classification on ‘Customer_Churn’ using Keras #####

In [None]:
##### Problem Statement:-
##### You are a data scientist at the telecom company “Leo”, whose customers are churning out to its competitors. You have to 
##### analyse the company's data, find insights, and stop customers from churning out to other telecom companies.

In [2]:
##### Domain:– Telecom

##### Domain Context:–
##### Customer churn, in simple terms, means that a customer has stopped doing business with the company. It is a common 
##### problem in the telecom industry. To reduce it, companies use predictive analysis to gauge the factors 
##### that lead a customer to leave the company. Churn prediction models help identify the customers who 
##### are most likely to churn out.

In [4]:
##### Let's import the necessary libraries.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import io
import os
import sys
import time
import json
import re
from IPython.display import display
from time import strftime, gmtime

from sklearn.preprocessing import StandardScaler, OneHotEncoder, MinMaxScaler, KBinsDiscretizer, LabelEncoder
##### Column Transformer
from sklearn.compose import ColumnTransformer

In [None]:
##### Let's import the dataset.

customer = pd.read_csv('customer_churn.csv')

In [3]:
##### A. Data Manipulation:-

In [None]:
##### a. Find the total number of male customers.

sum(customer['gender']=="Male")

In [None]:
##### b. Find the total number of customers whose Internet Service is ‘DSL’.

sum(customer['InternetService']=="DSL")
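
## The two counts above can also be read from value_counts(), which tallies every level of a column in one call.
## A minimal sketch on a made-up frame (the rows below are illustrative and are not taken from customer_churn.csv):

```python
import pandas as pd

# Hypothetical toy data standing in for the real customer frame.
toy = pd.DataFrame({
    "gender": ["Male", "Female", "Male", "Male"],
    "InternetService": ["DSL", "Fiber optic", "DSL", "No"],
})

# Boolean mask summed, as in the cells above.
male_count = int((toy["gender"] == "Male").sum())

# The same information for every level at once.
gender_counts = toy["gender"].value_counts()
dsl_count = int(toy["InternetService"].value_counts()["DSL"])
```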

In [None]:
##### c. Extract all the Female senior citizens whose Payment Method is Mailed check & store the result in ‘new_customer’.

new_customer=customer[(customer['gender']=='Female') &
(customer['SeniorCitizen']==1) & (customer['PaymentMethod']=='Mailed check')]
new_customer.head()

In [None]:
##### d. Extract all customers whose tenure is less than 10 months or whose total charges are less than $500 & store the 
#####    result in ‘new_customer’.

new_customer=customer[(customer['tenure']<10) | (customer['TotalCharges']<500)]
new_customer.head()
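
## The comparison on ‘TotalCharges’ only works if that column is numeric. In some telecom churn files the column is read as 
## text because a few entries are blank; whether that applies to customer_churn.csv is an assumption worth checking with 
## customer.dtypes. A sketch of the conversion on made-up rows:

```python
import pandas as pd

# Hypothetical rows; the blank string mimics a missing total charge.
toy = pd.DataFrame({
    "tenure": [1, 24, 0, 60],
    "TotalCharges": ["29.85", "1889.5", " ", "6500.0"],
})

# errors="coerce" turns unparseable entries into NaN instead of raising.
toy["TotalCharges"] = pd.to_numeric(toy["TotalCharges"], errors="coerce")

# Same filter as above. NaN < 500 evaluates to False, so the blank row is
# selected here only because its tenure is below 10.
subset = toy[(toy["tenure"] < 10) | (toy["TotalCharges"] < 500)]
```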

In [None]:
##### B. Data Visualization:-

In [None]:
##### a. Build a pie-chart to show the share of customers who are churning out.

names = customer["Churn"].value_counts().keys().tolist()
sizes= customer["Churn"].value_counts().tolist()

## We start by extracting the names of the levels in the Churn column, then the counts of each of those levels.

plt.pie(sizes,labels=names,autopct="%0.1f%%")
plt.show()

## Using plt.pie(), we are making the pie-chart. ‘autopct’ parameter is used to add the percentage distribution in the plot.
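
## The percentages that ‘autopct’ prints on the chart can be checked numerically with value_counts(normalize=True).
## A sketch on made-up labels (not the real Churn column):

```python
import pandas as pd

churn = pd.Series(["No", "No", "No", "Yes"], name="Churn")  # hypothetical labels

# normalize=True returns fractions instead of raw counts.
shares = churn.value_counts(normalize=True) * 100
labels = shares.index.tolist()
percentages = shares.round(1).tolist()
```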

In [None]:
##### b. Build a bar-plot to show the distribution of ‘Internet Service’.

plt.bar(customer['InternetService'].value_counts().keys().tolist(),
        customer['InternetService'].value_counts().tolist(), color='orange')
                                                                            
## We are creating the bar-plot using plt.bar()
                                                                            
plt.xlabel('Categories of Internet Service')
plt.ylabel('Count of categories')
plt.title('Distribution of Internet Service')
plt.show() 

## Going ahead, we are assigning the x-label, y-label and title to the plot.

In [None]:
##### C. Model Building:-

In [None]:
##### a. Build a sequential model using Keras to find out if the customer would churn or not, using ‘tenure’ as the feature and 
#####    ‘Churn’ as the dependent/target column:-

##### i. The visible/input layer should have 12 nodes with ‘Relu’ as activation function.
##### ii. This model would have 1 hidden layer with 8 nodes and ‘Relu’ as activation function
##### iii. Use ‘Adam’ as the optimization algorithm
##### iv. Fit the model on the train set, with number of epochs to be 150
##### v. Predict the values on the test set and build a confusion matrix
##### vi. Plot the ‘Accuracy vs Epochs’ graph

In [None]:
x=customer[['tenure']]
y=customer[['Churn']]

## We are starting off by extracting the target and feature columns.
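
## Keras expects a numeric target. If ‘Churn’ holds text labels such as "Yes"/"No" (an assumption about this file, worth 
## checking with customer['Churn'].unique()), it can be mapped to 1/0 before splitting. A sketch on made-up labels, where 
## ‘ChurnFlag’ is a hypothetical column name used only for this illustration:

```python
import pandas as pd

toy = pd.DataFrame({"Churn": ["No", "Yes", "No", "Yes", "No"]})  # hypothetical labels

# An explicit mapping keeps the meaning of 1 (churned) unambiguous.
toy["ChurnFlag"] = toy["Churn"].map({"No": 0, "Yes": 1})

# Any label outside the mapping would show up as NaN here.
unmapped = int(toy["ChurnFlag"].isna().sum())
```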

from sklearn.model_selection import train_test_split
x_train,x_test,y_train,y_test = train_test_split(x,y,test_size=0.30,random_state=42)

## Going ahead, we are dividing the data into train and test sets using train_test_split().
## Here, we are setting the test_size to be 0.30, which means 30% of the records go into the test set, while 70% of the records 
## go into the train set.
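
## To make the 70/30 split concrete, here is a rough numpy sketch of the idea behind a shuffled split. It is not 
## scikit-learn's implementation and will not reproduce the same rows as train_test_split() for the same seed; 
## split_indices is a hypothetical helper written only for this illustration:

```python
import numpy as np

def split_indices(n_rows, test_size=0.30, seed=42):
    """Shuffle row positions and cut them into train and test parts."""
    rng = np.random.RandomState(seed)
    order = rng.permutation(n_rows)
    n_test = int(round(test_size * n_rows))
    return order[n_test:], order[:n_test]

# With 10 rows, 3 go to the test part and 7 to the train part.
train_idx, test_idx = split_indices(10)
```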

from keras.models import Sequential
from keras.layers import Dense
model = Sequential()
model.add(Dense(12, input_dim=1, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

## After that, we create an instance of a sequential model by using Sequential().
## Going ahead, we add the input layer to our model. This layer receives the single feature (input_dim=1), has 12 nodes and uses 
## ‘relu’ as the activation function. After that, we add a hidden layer with 8 nodes and ‘relu’ as the activation function. 
## Finally, we add the output layer, which has just one node and uses ‘sigmoid’ as the activation function.
## We are using ‘sigmoid’ here because this is a binary classification problem and ‘sigmoid’ gives us a probability between 0 & 1.
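
## As a small illustration of why ‘sigmoid’ suits a binary target, the sketch below applies it to a few made-up raw 
## outputs and turns the resulting probabilities into classes with a 0.5 cut-off (the cut-off is a common default, 
## assumed here for illustration):

```python
import numpy as np

def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

raw_outputs = np.array([-4.0, 0.0, 4.0])  # made-up pre-activation values
probabilities = sigmoid(raw_outputs)

# Probabilities above 0.5 are read as class 1 (churn), the rest as class 0.
predicted_class = (probabilities > 0.5).astype(int)
```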

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

## Further, we compile the model. Here, we are using ‘binary_crossentropy’ as our loss function because this is a binary 
## classification problem.
## The optimizer is ‘adam’, and accuracy is the metric we track during training.
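
## For intuition about the loss named above, binary cross-entropy can be written out in a few lines of numpy. This is a 
## sketch of the formula, not Keras's own code; the clipping constant eps is an assumption added to avoid log(0):

```python
import numpy as np

def binary_crossentropy(y_true, p_pred, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all records."""
    p = np.clip(p_pred, eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))

y_true = np.array([1, 0, 1, 0])  # made-up labels

# Confident and correct predictions give a small loss...
confident_right = binary_crossentropy(y_true, np.array([0.9, 0.1, 0.9, 0.1]))
# ...while confident and wrong predictions are penalised heavily.
confident_wrong = binary_crossentropy(y_true, np.array([0.1, 0.9, 0.1, 0.9]))
```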

model.fit(x_train, y_train, epochs=150,validation_data=(x_test,y_test))

## Going ahead, we fit the model on the train set for 150 epochs and, through validation_data, evaluate it on the test set 
## after every epoch.
## This gives us a final validation accuracy of 75.64%. That is the value for the last epoch only, not the average across 
## all 150 epochs, so let’s also find the mean:-

import numpy as np
np.mean(model.history.history['val_acc'])

## So, the mean accuracy comes out to be 75.62%.
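
## Depending on the Keras version, the history key may be ‘val_accuracy’ instead of ‘val_acc’. The sketch below reads 
## whichever key is present and contrasts the final value with the mean. validation_accuracy is a hypothetical helper and 
## the history values are made up:

```python
import numpy as np

def validation_accuracy(history_dict):
    """Return the per-epoch validation accuracy under either key name."""
    for key in ("val_accuracy", "val_acc"):
        if key in history_dict:
            return list(history_dict[key])
    raise KeyError("no validation accuracy found in history")

# Made-up history for three epochs; real values come from model.history.history.
fake_history = {"val_acc": [0.70, 0.75, 0.80]}
per_epoch = validation_accuracy(fake_history)

final_value = per_epoch[-1]             # accuracy after the last epoch
mean_value = float(np.mean(per_epoch))  # average over all epochs
```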

y_pred=model.predict_classes(x_test)
from sklearn.metrics import confusion_matrix
confusion_matrix(y_test,y_pred)

## Further, we predict the values on ‘x_test’ and build a confusion matrix from the actual values and the predicted values.
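
## The predict_classes() call above exists only in older Keras releases; it has been removed from recent TensorFlow/Keras 
## versions. A common substitute is to threshold the output of model.predict() at 0.5. The sketch below does that on 
## made-up probabilities and counts the confusion matrix by hand:

```python
import numpy as np

# Made-up sigmoid outputs and true labels; with a real model the
# probabilities would come from model.predict(x_test).
probabilities = np.array([[0.10], [0.80], [0.40], [0.95]])
actual = np.array([0, 1, 1, 1])

# Threshold at 0.5 and flatten to a 1-D array of class labels.
predicted = (probabilities > 0.5).astype(int).ravel()

# Rows are actual classes, columns are predicted classes
# (the same layout sklearn.metrics.confusion_matrix returns).
matrix = np.zeros((2, 2), dtype=int)
for a, p in zip(actual, predicted):
    matrix[a, p] += 1
```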

from matplotlib import pyplot as plt
plt.plot(model.history.history['acc'])
plt.plot(model.history.history['val_acc'])
plt.show()

In [None]:
##### b. Build the 2nd model using same target and feature variables:-

##### i. Add a drop-out layer after the input layer with drop-out value of 0.3
##### ii. Add a drop-out layer after the hidden layer with drop-out value of 0.2
##### iii. Predict the values on the test set and build a confusion matrix
##### iv. Plot the ‘Accuracy vs Epochs’ graph

In [None]:
from keras.layers import Dropout  # Dropout was not imported in the earlier cells
model = Sequential()
model.add(Dense(12, input_dim=1, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(8, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(1, activation='sigmoid'))

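## Now, we are building our 2nd model, where we are adding a drop-out layer after the input layer and the hidden layer.
## A drop-out value of 0.3 means that, during training, each output of the input layer is dropped (set to zero) with 
## probability 0.3, so roughly 30% are dropped and 70% are kept.
## A drop-out value of 0.2 means that roughly 20% of the outputs of the hidden layer are dropped and 80% are kept.
## Drop-out is only active during training; nothing is dropped when the model predicts.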
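
## To see what a drop-out value does in practice, the numpy sketch below applies a rate of 0.3 to a large vector of ones. 
## About 30% of the values are set to zero and the survivors are scaled up by 1/(1 - rate). dropout is a hypothetical 
## helper written for this illustration, not the Keras layer itself:

```python
import numpy as np

def dropout(activations, rate, rng):
    """Zero each value with probability `rate` and rescale the survivors."""
    keep_mask = rng.random_sample(activations.shape) >= rate
    return activations * keep_mask / (1.0 - rate)

rng = np.random.RandomState(0)
activations = np.ones(100_000)

dropped = dropout(activations, rate=0.3, rng=rng)
fraction_zeroed = float(np.mean(dropped == 0))  # close to the rate, 0.3
```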

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=150,validation_data=(x_test,y_test))
y_pred = model.predict_classes(x_test)
from sklearn.metrics import confusion_matrix
confusion_matrix(y_test,y_pred)
from matplotlib import pyplot as plt
plt.plot(model.history.history['acc'])
plt.plot(model.history.history['val_acc'])
plt.show()

## After this, we have fit the model and predicted the values.
## So, we see that the 2nd model gives us a final validation accuracy of 73.41%. Now, let’s calculate the mean validation 
## accuracy across 150 epochs:

import numpy as np
np.mean(model.history.history['val_acc'])

## So, the mean accuracy comes out to be 73.42%.

## By looking at this graph, we can infer that the validation accuracy stays essentially constant at 73.41%. This tells us that 
## something is wrong with our model: an accuracy that never moves usually means the model predicts the same class for every 
## record in the test set.

## A probable explanation is that the drop-out is too aggressive for such a small network with a single feature, and 
## thus the model which we have built might be underfitting the data.
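
## One way to check whether a flat accuracy curve comes from constant predictions is to compare it with the share of the 
## majority class in the test labels. If the two match, the model is probably predicting that class for every record. 
## A sketch on made-up labels (not the real y_test):

```python
import numpy as np

# Made-up test labels: 3 churners out of 10 customers.
y_true = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1])

# A model that always predicts "no churn".
constant_prediction = np.zeros_like(y_true)
accuracy = float(np.mean(constant_prediction == y_true))

# Share of the larger class; a constant predictor cannot beat this.
majority_share = float(max(np.mean(y_true == 0), np.mean(y_true == 1)))
```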

In [None]:
##### c. Build the 3rd model using ‘Tenure’, ’Monthly Charges’ & ‘Total Charges’ as the features and ‘Churn’ as the dependent/
#####    target column:-

##### i. The visible/input layer should have 12 nodes with ‘Relu’ as activation function.
##### ii. This model would have 1 hidden layer with 8 nodes and ‘Relu’ as activation function
##### iii. Use ‘Adam’ as the optimization algorithm
##### iv. Fit the model on the train set, with number of epochs to be 150
##### v. Predict the values on the test set and build a confusion matrix
##### vi. Plot the ‘Accuracy vs Epochs’ graph

In [None]:
x=customer[['MonthlyCharges','tenure','TotalCharges']]#Features
y=customer[['Churn']]#Target

## This time, we are taking ‘Monthly Charges’, ‘Total Charges’ and ‘Tenure’ as the features and ‘Churn’ as the target.

from sklearn.model_selection import train_test_split
x_train,x_test,y_train,y_test = train_test_split(x,y,test_size=0.30,random_state=42)
model = Sequential()
model.add(Dense(12, input_dim=3, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=150,validation_data=(x_test,y_test))
y_pred = model.predict_classes(x_test)
from sklearn.metrics import confusion_matrix
confusion_matrix(y_test,y_pred)

from matplotlib import pyplot as plt
plt.plot(model.history.history['acc'])
plt.plot(model.history.history['val_acc'])
plt.show()

## Here, we divide the data into train and test sets, build the model on the train set and predict the values on the test
## set.

## So, we see that we get a final validation accuracy of 78.58%.

## But, when we look at this graph, we see that there is a constant fluctuation in the validation accuracy.

## So, let’s find out the mean validation accuracy across 150 epochs:
import numpy as np
np.mean(model.history.history['val_acc'])

## And this gives a mean validation accuracy of 74.24%
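
## One possible contributor to the fluctuation (an assumption, not something verified here) is that the three features sit 
## on very different scales. Standardising them, which is what the StandardScaler imported at the top does, is a common 
## thing to try. A numpy sketch of the idea on made-up rows:

```python
import numpy as np

# Made-up rows: monthly charges, tenure (months), total charges.
x_train_toy = np.array([[30.0, 1.0, 30.0],
                        [80.0, 24.0, 1900.0],
                        [100.0, 60.0, 6000.0]])
x_test_toy = np.array([[50.0, 12.0, 600.0]])

# Statistics come from the train rows only and are reused for the test rows.
mean = x_train_toy.mean(axis=0)
std = x_train_toy.std(axis=0)

x_train_scaled = (x_train_toy - mean) / std
x_test_scaled = (x_test_toy - mean) / std
```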

In [None]:
##### Conclusion:-

##### The first model gave us a mean validation accuracy of 75.62%, the second model 73.42% and the third model 
##### 74.24%.

##### The second model gave us the lowest accuracy, most probably because the two dropout layers we added were too aggressive.

##### There could be several reasons why the third model’s accuracy was lower than that of the first model. Most probably, one or 
##### more of the added features is of little significance, which reduces the accuracy.

##### It should also be kept in mind that these accuracy values are specific to the hyperparameters used during the model 
##### building process, such as the optimizer, activation functions and number of epochs. If we were to tweak these hyperparameters,
##### we could get different accuracy values for all three models.