<center><h1 style="font-size:280%; font-family:cursive; background:yellow; color:black; border-radius:10px 10px; padding:10px;">Predicting customer churn</h1></center>



<p style="font-size:150%; font-family:cursive;">Churn prediction means detecting which customers are likely to cancel a subscription, based on how they use the service. It is a critical prediction for many businesses because acquiring new clients often costs more than retaining existing ones. Once you can identify the customers at risk of cancelling, you need to know which marketing action to take for each of them to maximise the chance that they stay.
Different customers exhibit different behaviours and preferences, so they cancel their subscriptions for different reasons. It is therefore critical to communicate proactively with each of them, and to know which marketing action will be most effective for each customer and when.

<center><h1 style="font-size:200%; font-family:cursive; background:pink; color:black; border-radius:10px 10px; padding:10px;">Cycle of customer churn</h1></center>



<center><img src="https://miro.medium.com/max/456/1*Dvx1j18vyKyvLlIpxzVSmQ.png"></center>

<center><h1 style="font-size:280%; font-family:cursive; background:yellow; color:black; border-radius:10px 10px; padding:10px;">Why is it so important?</h1></center>



<p style="font-size:150%; font-family:cursive;">Customer churn is a common problem across businesses in many sectors. If you want to grow as a company, you have to invest in acquiring new clients. Every time a client leaves, a significant investment is lost, and both time and effort must go into replacing them. Being able to predict when a client is likely to leave, and to offer them incentives to stay, can save a business a great deal of money.
Understanding what keeps customers engaged is therefore extremely valuable: it helps you develop retention strategies and roll out operational practices aimed at keeping customers from walking out the door.
Churn is a fact of life for any subscription business, and even slight fluctuations in churn can have a significant impact on your bottom line. We need to know: “Is this customer going to leave us within X months?” Yes or no? It is a binary classification task.
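
To make the point about slight fluctuations concrete, here is a small sketch of how a fixed monthly churn rate compounds over a year. The function name, the starting customer count, and the two rates are hypothetical and chosen only for illustration; none of them come from the dataset used below.

```python
# Hypothetical figures for illustration only; they are not taken from the dataset.
def customers_remaining(start, monthly_churn_rate, months):
    """Customers left after `months` if a fixed fraction churns every month."""
    return start * (1 - monthly_churn_rate) ** months


for rate in (0.02, 0.03):
    left = customers_remaining(1000, rate, 12)
    print(f"monthly churn {rate:.0%}: {left:.0f} of 1000 customers remain after 12 months")
```

Because the loss compounds month after month, a one-point difference in the monthly rate grows into a noticeably larger gap by the end of the year.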

<center><h1 style="font-size:280%; font-family:cursive; background:yellow; color:black; border-radius:10px 10px; padding:10px;">What are the main challenges?</h1></center>



<p style="font-size:150%; font-family:cursive;">Churn prediction modelling techniques attempt to understand the customer behaviours and attributes that signal the risk and timing of customers leaving. It is no walk in the park, so here are four points to consider.

<li style="font-size:150%; font-family:cursive;">To retain customers who are about to leave, marketers and customer success teams must predict in advance which customers are going to churn and plan the marketing actions that will have the greatest retention impact on each one. The key is to be proactive and engage with these customers. While simple in theory, this “proactive retention” goal is extremely challenging in practice.</li>
<li style="font-size:150%; font-family:cursive;">The accuracy of the prediction is critical to the success of any proactive retention effort. If the marketer is unaware that a customer is about to churn (a false negative), no action will be taken to retain that customer.</li>
<li style="font-size:150%; font-family:cursive;">Conversely, if the model wrongly flags happy, active customers as at risk (false positives), special retention offers or incentives may be given to them, reducing revenue for no good reason.</li>
<li style="font-size:150%; font-family:cursive;">Your churn prediction model should rely on (almost) real-time data to quantify the risk of churning, not on static data. Static data will still identify some at-risk customers, but the predictions will be less accurate.</li>
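
The two kinds of prediction error described in the list above carry different costs. The sketch below puts rough numbers on that trade-off. The function name and every figure in it (value of a lost customer, cost of an offer, error counts) are assumptions made up for illustration.

```python
# All numbers are hypothetical; they only illustrate the trade-off.
def retention_campaign_cost(false_negatives, false_positives,
                            lost_customer_value, offer_cost):
    """Total cost of missed churners plus offers wasted on loyal customers."""
    missed = false_negatives * lost_customer_value   # churners nobody contacted
    wasted = false_positives * offer_cost            # incentives given needlessly
    return missed + wasted


# Model A misses more churners; model B raises more false alarms.
cost_a = retention_campaign_cost(false_negatives=50, false_positives=10,
                                 lost_customer_value=500, offer_cost=20)
cost_b = retention_campaign_cost(false_negatives=20, false_positives=100,
                                 lost_customer_value=500, offer_cost=20)
print(cost_a, cost_b)
```

Under these assumed costs, a missed churner is far more expensive than a wasted offer, so a model that raises more false alarms can still be the cheaper one. With different costs the conclusion may flip.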

In [None]:
# Import necessary libraries
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
%matplotlib inline


In [None]:
# Read the Data 
df=pd.read_csv("../input/telco-customer-churn/WA_Fn-UseC_-Telco-Customer-Churn.csv")

In [None]:
# Show the first 5 rows
df.head()


In [None]:
df.shape

In [None]:
# Remove the customerID column: it is a unique identifier with no predictive value
df.drop("customerID",axis=1,inplace=True)
df.dtypes

<center><h1 style="font-size:140%; font-family:cursive; background:Skyblue; color:black; border-radius:10px 10px; padding:10px;">A quick look at the dtypes above shows that TotalCharges is an object column, although it should be float. Let's check what is going on with this column</h1></center>

In [None]:
df.TotalCharges.values


<center><h1 style="font-size:140%; font-family:cursive; background:Skyblue; color:black; border-radius:10px 10px; padding:10px;">Let's convert it to numbers</h1></center>

In [None]:
# pd.to_numeric(df.TotalCharges)  # left commented out: raises ValueError on the blank strings in this column


<center><h1 style="font-size:140%; font-family:cursive; background:Skyblue; color:black; border-radius:10px 10px; padding:10px;">The conversion fails because some values are blank strings rather than numbers. Let's find those rows</h1></center>

In [None]:
pd.to_numeric(df.TotalCharges,errors="coerce").isnull()

In [None]:
df[pd.to_numeric(df.TotalCharges,errors="coerce").isnull()]

In [None]:
# Inspect one of the offending values (row label 488)
df.loc[488, "TotalCharges"]

<center><h1 style="font-size:140%; font-family:cursive; background:Skyblue; color:black; border-radius:10px 10px; padding:10px;">Remove rows with a blank TotalCharges</h1></center>

In [None]:
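# .copy() makes df1 an independent DataFrame, so later edits do not trigger SettingWithCopyWarning
df1=df[df.TotalCharges!=' '].copy()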
df1.shape

In [None]:
df1.dtypes

In [None]:
# Assign with bracket notation; attribute-style assignment is easy to get wrong
df1["TotalCharges"] = pd.to_numeric(df1["TotalCharges"])

In [None]:
df1.TotalCharges.dtypes
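
As an alternative to filtering on the literal `' '` string, the conversion and the row removal can be done together with `errors="coerce"`, which the cells above already use for detection. This is a minimal sketch on a made-up three-row frame (`toy` is a hypothetical name, not the Telco data):

```python
import pandas as pd

# Tiny made-up frame standing in for the real data
toy = pd.DataFrame({"TotalCharges": ["29.85", " ", "108.15"]})

# Unparseable entries (such as the blank string) become NaN instead of raising
toy["TotalCharges"] = pd.to_numeric(toy["TotalCharges"], errors="coerce")

# Drop the rows that could not be converted
clean = toy.dropna(subset=["TotalCharges"]).copy()
print(clean.dtypes)
```

This variant would also catch blank values made of several spaces or other non-numeric strings, which the exact `' '` comparison would miss.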

<center><h1 style="font-size:180%; font-family:cursive; background:pink; color:black; border-radius:10px 10px; padding:10px;">Data Visualization</h1></center>

In [None]:
MonthlyCharges_churn_no=df1[df1.Churn=="No"].MonthlyCharges
MonthlyCharges_churn_yes=df1[df1.Churn=="Yes"].MonthlyCharges
plt.xlabel("Monthly charges")
plt.ylabel("Number of customers")
plt.title("Monthly charges by churn status")
plt.hist([MonthlyCharges_churn_yes, MonthlyCharges_churn_no], rwidth=0.95, color=['green','red'],label=['Churn=Yes','Churn=No'])
plt.legend()

<center><h1 style="font-size:140%; font-family:cursive; background:Skyblue; color:black; border-radius:10px 10px; padding:10px;">Let's do the same for tenure</h1></center>

In [None]:
tenure_churn_no=df1[df1.Churn=="No"].tenure
tenure_churn_yes=df1[df1.Churn=="Yes"].tenure
plt.xlabel("Tenure")
plt.ylabel("Number of customers")
plt.title("Tenure by churn status")
plt.hist([tenure_churn_yes, tenure_churn_no], rwidth=0.95, color=['green','red'],label=['Churn=Yes','Churn=No'])
plt.legend()

In [None]:
for i in df1.columns:
    if df1[i].dtypes=="object":
        print(f'{i}: {df1[i].unique()}')


<center><h1 style="font-size:140%; font-family:cursive; background:Skyblue; color:black; border-radius:10px 10px; padding:10px;">Some columns contain the values 'No internet service' or 'No phone service', which can be replaced with a simple 'No'</h1></center>

In [None]:
df1.replace('No internet service','No',inplace=True)
df1.replace('No phone service','No',inplace=True)

In [None]:
# Replace "Yes" and "No" with 1 and 0
yes_no_columns = ['Partner','Dependents','PhoneService','MultipleLines','OnlineSecurity','OnlineBackup',
                  'DeviceProtection','TechSupport','StreamingTV','StreamingMovies','PaperlessBilling','Churn']
for col in yes_no_columns:
    # Assign the result back: replace(..., inplace=True) on a column selection
    # may not modify df1 in recent pandas versions
    df1[col] = df1[col].map({"Yes":1,"No":0})

In [None]:
for i in df1.columns:
    print(df1[i].unique())

In [None]:
# Encode gender as 1/0, assigning back rather than using a chained inplace replace
df1['gender'] = df1['gender'].map({'Female':1,'Male':0})


In [None]:
df1.gender.unique()


<center><h1 style="font-size:140%; font-family:cursive; background:Skyblue; color:black; border-radius:10px 10px; padding:10px;"> One hot encoding for categorical columns</h1></center>

In [None]:
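# dtype=int keeps the dummy columns as 0/1 integers; newer pandas versions default to bool
df2 = pd.get_dummies(data=df1, columns=['InternetService','Contract','PaymentMethod'], dtype=int)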
df2.columns
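
To see what one-hot encoding does to a single column, here is a sketch on a made-up four-row frame (`toy` and `encoded` are hypothetical names). Each distinct category becomes its own 0/1 column, and every row has exactly one of them set.

```python
import pandas as pd

# Made-up sample with the three contract types
toy = pd.DataFrame({"Contract": ["Month-to-month", "One year", "Two year", "One year"]})

# One new 0/1 column per category; the original column is removed
encoded = pd.get_dummies(toy, columns=["Contract"], dtype=int)
print(encoded)
```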

In [None]:
df2.head()

In [None]:
df2.dtypes

In [None]:
cols_to_scale = ['tenure','MonthlyCharges','TotalCharges']

from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
df2[cols_to_scale] = scaler.fit_transform(df2[cols_to_scale])
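
One caveat about the cell above: it fits the scaler on every row, so the minimum and maximum of the rows that later end up in the test set influence the scaling. A common alternative is to split first and fit the scaler on the training rows only. The sketch below shows that order on random stand-in data; all `toy_` names and the data itself are assumptions for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Random stand-in for three numeric columns
rng = np.random.default_rng(0)
toy_X = rng.uniform(0, 100, size=(50, 3))

# Split first ...
toy_train, toy_test = train_test_split(toy_X, test_size=0.2, random_state=5)

# ... then learn min/max from the training rows only and reuse them on the test rows
toy_scaler = MinMaxScaler()
toy_train_scaled = toy_scaler.fit_transform(toy_train)
toy_test_scaled = toy_scaler.transform(toy_test)
```

With this order, scaled test values can fall slightly outside the 0 to 1 range, which is expected.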

In [None]:
df2.head(5)

In [None]:
for col in df2.columns:
    print(f'{col} : {df2[col].unique()}')



<center><h1 style="font-size:140%; font-family:cursive; background:Skyblue; color:black; border-radius:10px 10px; padding:10px;"> Train Test Split</h1></center>

In [None]:
X = df2.drop('Churn',axis='columns')
y = df2['Churn']

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.2,random_state=5)
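
If the two classes are unevenly represented, a plain random split can leave the train and test sets with different churn rates. `train_test_split` accepts a `stratify` argument that keeps the class proportions the same in both parts. A minimal sketch on made-up labels (the `toy_` names and the 80/20 class mix are assumptions):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Made-up labels: 80 non-churners and 20 churners
toy_y = np.array([0] * 80 + [1] * 20)
toy_X = np.arange(100).reshape(-1, 1)

# stratify=toy_y preserves the 80/20 class mix in both splits
_, _, toy_y_train, toy_y_test = train_test_split(
    toy_X, toy_y, test_size=0.2, random_state=5, stratify=toy_y
)
print(toy_y_train.mean(), toy_y_test.mean())
```

In the notebook this would correspond to adding `stratify=y` to the existing call.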

In [None]:
X_train[:10]

In [None]:
import tensorflow as tf
from tensorflow import keras
model=keras.Sequential([
    # One input per feature column (26 here), read from the data instead of hard-coded
    keras.layers.Dense(26,input_shape=(X_train.shape[1],),activation="relu"),
    keras.layers.Dense(15,activation="relu"),
    keras.layers.Dense(1,activation="sigmoid")
])
# opt = keras.optimizers.Adam(learning_rate=0.01)
model.compile(optimizer="adam",loss="binary_crossentropy",metrics=["accuracy"])
model.fit(X_train,y_train,epochs=100)
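
If churners turn out to be a minority of the training rows (worth checking with `y_train.value_counts()`), the network may lean towards predicting "no churn". One common remedy is to weight the classes inversely to their frequency. The helper below is a hypothetical sketch of that heuristic in plain NumPy, not part of the original notebook.

```python
import numpy as np


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    weights = len(labels) / (len(classes) * counts)
    return {int(c): float(w) for c, w in zip(classes, weights)}


# Made-up labels: three non-churners, one churner
toy_weights = balanced_class_weights([0, 0, 0, 1])
print(toy_weights)
```

A dictionary of this shape can be passed to Keras through the `class_weight` argument of `model.fit`, so that errors on the rarer class count more in the loss.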

In [None]:
model.evaluate(X_test, y_test)

In [None]:
# Predict churn probabilities for the test data
y_predict = model.predict(X_test)
y_predict[:5]

In [None]:
# Turn each predicted probability into a class label using a 0.5 cutoff
y_p=[]
for i in y_predict:
    if i >= 0.5:
        y_p.append(1)
    else:
        y_p.append(0)
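
The 0.5 cutoff used above is a choice rather than a requirement. The sketch below applies a cutoff in vectorised form to a made-up array shaped like the output of `model.predict` and shows the effect of lowering it. The `toy_` names, the probabilities, and the 0.3 value are assumptions for illustration.

```python
import numpy as np

# Made-up probabilities with shape (n, 1), like the sigmoid output of the model
toy_probs = np.array([[0.91], [0.48], [0.35], [0.62], [0.07]])

# Same rule as the 0.5 cutoff, applied to the whole array at once
toy_labels_default = (toy_probs >= 0.5).astype(int).ravel()

# A lower cutoff flags more customers as likely churners
toy_labels_lower = (toy_probs >= 0.3).astype(int).ravel()

print(toy_labels_default, toy_labels_lower)
```

Lowering the cutoff trades more false alarms for fewer missed churners, which may be worthwhile when losing a customer costs more than an unnecessary offer.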

In [None]:
y_p[:10]

In [None]:
y_test[:10]

In [None]:
from sklearn.metrics import confusion_matrix, classification_report
print(classification_report(y_test,y_p))

In [None]:
import seaborn as sns
# Rows are the true labels, columns the predicted labels
conf_matrix=tf.math.confusion_matrix(labels=y_test,predictions=y_p)
plt.figure(figsize = (10,7))
sns.heatmap(conf_matrix, annot=True, fmt='d')
plt.xlabel('Predicted')
plt.ylabel('Truth')

In [None]:
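# Accuracy by hand: correct predictions (the diagonal of the confusion matrix)
# divided by all test samples. The counts are copied from one run's heatmap
# and may differ between runs.
round((873+210)/(198+126+873+210),2)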
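
Typing the counts in by hand goes stale as soon as the model is retrained. The same figure can be derived from the labels directly. This sketch uses made-up labels (`toy_true` and `toy_pred` are hypothetical); in the notebook the inputs would be `y_test` and `y_p`.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Made-up true and predicted labels
toy_true = [0, 0, 0, 1, 1, 0, 1, 0]
toy_pred = [0, 0, 1, 1, 0, 0, 1, 0]

# For binary labels, ravel() yields the counts in the order tn, fp, fn, tp
tn, fp, fn, tp = confusion_matrix(toy_true, toy_pred).ravel()

# Correct predictions divided by all predictions
manual_accuracy = (tn + tp) / (tn + fp + fn + tp)

# accuracy_score computes the same ratio
print(manual_accuracy, accuracy_score(toy_true, toy_pred))
```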