# Customer Churn Rate


**According to Wikipedia, [churn rate](https://en.wikipedia.org/wiki/Churn_rate) (sometimes called attrition rate), in its broadest sense, is a measure of the number of individuals or items moving out of a collective group over a specific period. It is one of two primary factors that determine the steady-state level of customers a business will support.**

The term is used in many contexts, but is most widely applied in business with respect to a contractual customer base, for example in businesses with a subscriber-based service model such as mobile telephone networks and pay TV operators. The term is also used to refer to participant turnover in peer-to-peer networks. Churn rate is an input into customer lifetime value modeling, and can be part of a simulator used to measure return on marketing investment using marketing mix modeling.
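
As a rough illustration of the definition above, churn rate over a period is commonly computed as the number of customers lost divided by the number of customers at the start of the period. The function name and the figures below are hypothetical and are not taken from the dataset.

```python
def churn_rate(customers_at_start: int, customers_lost: int) -> float:
    """Fraction of the starting customer base lost during the period."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    return customers_lost / customers_at_start


# Hypothetical figures: 1,000 subscribers at the start of the month, 50 of whom left.
monthly_rate = churn_rate(1000, 50)
print(f"Monthly churn rate: {monthly_rate:.1%}")
```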

---

# Objective

Predict customers who are likely to leave the network so that we can specifically target them and try retaining them.

# Benefit
Retaining an existing customer typically costs less than acquiring a new one. If we focus on customers who are likely to leave but have not yet left, we could save a substantial amount of money.

---

### Dataset info
This dataset contains information about telecom subscribers. Based on this information, we will build a model to identify customers who are most likely to leave the network for another service provider.

The dataset can be found [here](https://www.kaggle.com/blastchar/telco-customer-churn).

The data set includes information about:

- Customers who left within the last month – the column is called Churn
- Services that each customer has signed up for – phone, multiple lines, internet, online security, online backup, device protection, tech support, and streaming TV and movies
- Customer account information – how long they’ve been a customer, contract, payment method, paperless billing, monthly charges, and total charges
- Demographic info about customers – gender, age range, and if they have partners and dependents

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import pprint #better dictionary printing
import plotly.offline as py
py.init_notebook_mode(connected=True)
import plotly.graph_objs as go
import plotly.tools as tls
import plotly.figure_factory as ff
import plotly.express as px

# Data Overview

Let us get to know our data.

In [None]:
churn = pd.read_csv('../input/telco-customer-churn/WA_Fn-UseC_-Telco-Customer-Churn.csv')

In [None]:
churn.head()

In [None]:
churn.info()

In [None]:
print('Rows: ', churn.shape[0])
print('Columns: ', churn.shape[1])
print('\nFeatures:\n ', churn.columns.tolist())
print('\nUnique Value Count:\n ', churn.nunique())
print('\nMissing Value:\n ', churn.isnull().sum())

In [None]:
def values(cols):
    d = {}
    for col in cols:
        x = churn[col].unique()
        d[col] = x
    pprint.pprint(d)

In [None]:
print('Feature Values: \n')
values(churn.columns)

# Data Manipulation

In [None]:
churn.TotalCharges.min()

The `TotalCharges` column contains blank strings (a single space), which we will replace with NaN.

In [None]:
churn['TotalCharges'] = churn['TotalCharges'].replace(' ',np.nan)

In [None]:
churn.head()

In [None]:
churn = churn[churn['TotalCharges'].notnull()].copy() #copy so later column assignments do not act on a slice

In [None]:
churn['TotalCharges'] = churn['TotalCharges'].astype(float)
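
The replace, filter, and cast steps above can also be written with `pd.to_numeric`. The sketch below uses a small toy frame (hypothetical values) rather than the real CSV, and is only one possible way to do the same cleaning.

```python
import pandas as pd

# Toy stand-in for the TotalCharges column: numeric strings plus one blank entry.
toy = pd.DataFrame({"TotalCharges": ["29.85", " ", "108.15"]})

# errors="coerce" turns anything that cannot be parsed (here, " ") into NaN.
toy["TotalCharges"] = pd.to_numeric(toy["TotalCharges"], errors="coerce")

# Drop the rows that could not be parsed, mirroring the notnull() filter above.
toy = toy.dropna(subset=["TotalCharges"])
print(toy.dtypes)
```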

Replace 'No internet service' with 'No' in these columns:
- DeviceProtection
- OnlineBackup
- OnlineSecurity
- StreamingMovies
- StreamingTV
- TechSupport
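
A minimal sketch of this replacement is shown below. The helper name `collapse_no_internet` is hypothetical, the raw label spelling `'No internet service'` is an assumption about the CSV, and the toy frame only stands in for the real data.

```python
import pandas as pd

# Columns listed above.
internet_cols = ["DeviceProtection", "OnlineBackup", "OnlineSecurity",
                 "StreamingMovies", "StreamingTV", "TechSupport"]


def collapse_no_internet(df, cols, raw_label="No internet service"):
    """Return a copy of df with raw_label replaced by 'No' in the given columns."""
    out = df.copy()
    out[cols] = out[cols].replace({raw_label: "No"})
    return out


# Toy frame standing in for the real data.
toy = pd.DataFrame({col: ["Yes", "No", "No internet service"] for col in internet_cols})
cleaned = collapse_no_internet(toy, internet_cols)
print(cleaned["TechSupport"].unique())
```

On the notebook's data this would be applied as `churn = collapse_no_internet(churn, internet_cols)`.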

In [None]:
churn['SeniorCitizen'] = churn['SeniorCitizen'].replace({1: 'Yes', 0: 'No'})

In [None]:
churn['tenure'].min()

In [None]:
churn['tenure'].max()

In [None]:
def tenure_slabs(value):
    #bucket tenure (in months) into 12-month slabs; values above 72 return None
    if value <= 12:
        return 'ten_0-12'
    elif value <= 24:
        return 'ten_12-24'
    elif value <= 36:
        return 'ten_24-36'
    elif value <= 48:
        return 'ten_36-48'
    elif value <= 60:
        return 'ten_48-60'
    elif value <= 72:
        return 'ten_60-72'

In [None]:
churn['tenure_duration'] = churn['tenure'].apply(tenure_slabs) #to categorical column
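
The same binning can be sketched with `pd.cut`, which avoids a row-by-row `apply`. The bin edges and labels below are chosen to match `tenure_slabs`; the toy series is hypothetical.

```python
import pandas as pd

# Bin edges and labels chosen to match tenure_slabs; tenure is in months.
edges = [0, 12, 24, 36, 48, 60, 72]
labels = ["ten_0-12", "ten_12-24", "ten_24-36", "ten_36-48", "ten_48-60", "ten_60-72"]

toy_tenure = pd.Series([0, 1, 12, 13, 48, 72])

# right=True makes each bin (low, high]; include_lowest=True also keeps 0 in the first bin.
slabs = pd.cut(toy_tenure, bins=edges, labels=labels, right=True, include_lowest=True)
print(slabs.tolist())
```

Note that `pd.cut` returns an ordered categorical column rather than plain strings, which keeps the slabs in tenure order when plotting.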

# EDA

In [None]:
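def make_df(data, col):
    #percentage share of each category; columns are always named 'index' and `col`, whatever the pandas version
    df = data[col].value_counts(normalize = True).mul(100)
    df = df.rename_axis('index').reset_index(name = col)
    return df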

In [None]:
gen = make_df(churn, 'gender')
gen.head()

In [None]:
px.bar(gen, x = 'index', y = 'gender', title = 'Gender Distribution: Overall data')

The dataset has an almost equal number of male and female customers.

In [None]:
sen = make_df(churn, 'SeniorCitizen')
sen.head()

In [None]:
px.bar(sen, x = 'index', y = 'SeniorCitizen', title = 'Senior Citizen Distribution: Overall data')

~16% of the users are senior citizens.

In [None]:
labels = churn['Churn'].value_counts().keys().tolist()
vals = churn['Churn'].value_counts().values.tolist()

fig = go.Figure(data = go.Pie(labels = labels, values = vals))
fig.update_traces(hoverinfo = 'label+value', marker = dict(colors = ['rgb(124,185,232)', 'gold']), hole = .5)
fig.update(layout_title_text = 'Customer Churn Data: Overall data', layout_showlegend = True)
fig.show()

## Deep Dive

Let us now deep dive by separating the dataset based on customers who have left the telecom network and those who have stayed with the network.

In [None]:
churn.shape

In [None]:
y_churn = churn[churn['Churn'] == 'Yes'] #customers who have left the network
n_churn = churn[churn['Churn'] == 'No'] #customers who have stayed with the network

In [None]:
print('Number of people who left the telecom:', y_churn.shape[0])
print('Number of people who did not leave the telecom:', n_churn.shape[0])

In [None]:
def plot_pie(labels, values):
    fig = go.Figure(data = go.Pie(labels = labels, values = values))
    fig.update_traces(hoverinfo='label+value', marker = dict(colors = ['royalblue', 'gold']), hole = .5)
    fig.show()

def label(col, churned):
    #category labels of `col`, ordered by frequency, for churned (1) or retained (0) customers
    data = y_churn if churned == 1 else n_churn
    return data[col].value_counts().keys().tolist()

def values(col, churned):
    #category counts matching label(col, churned); this replaces the earlier values(cols) helper
    data = y_churn if churned == 1 else n_churn
    return data[col].value_counts().values.tolist()

In [None]:
from plotly.subplots import make_subplots

In [None]:
def make_pies(column, title):
    specs = [[{'type':'domain'}, {'type':'domain'}]]
    colors = ['rgb(124,185,232)','rgb(255,213,0)','rgb(25,77,0)','rgb(255,126,0)','rgb(153,255,102)']
    fig = make_subplots(rows = 1, cols = 2, specs = specs)
    fig.add_trace(go.Pie(labels = label(column, 1), values = values(column, 1), name = 'Churn', marker_colors = colors), 1,1)
    fig.add_trace(go.Pie(labels = label(column, 0), values = values(column, 0), name = 'Non-churn', marker_colors = colors), 1,2)
    fig.update_traces(hoverinfo = 'label+value', hole = 0.6)
    fig.update(layout_title_text = title +': Churn Vs. Non-churn customers')
    fig.update_layout(annotations = [dict(text = 'Churn',x=0.18, y=0.5, font_size=20, showarrow=False),
                                    dict(text = 'Non-churn',x=0.85, y=0.5, font_size=20, showarrow=False)])
    fig.show()

In [None]:
make_pies('gender', 'Gender')

In [None]:
make_pies('SeniorCitizen', 'Senior Citizen')

Let's compare churned and retained customers across the internet add-on services.

In [None]:
make_pies('DeviceProtection', 'Device Protection')

In [None]:
make_pies('OnlineBackup', 'Online Backup')

In [None]:
make_pies('OnlineSecurity', 'Online Security')

In [None]:
make_pies('StreamingMovies', 'Streaming Movies')

In [None]:
make_pies('StreamingTV','Streaming TV')

In [None]:
make_pies('TechSupport', 'Tech Support')

In [None]:
make_pies('Contract','Contract')

In [None]:
make_pies('Partner','Partner')

In [None]:
make_pies('Dependents', 'Dependents')

In [None]:
make_pies('PhoneService', 'Phone Service')

In [None]:
make_pies('PaperlessBilling', 'Paperless Billing')

In [None]:
make_pies('PaymentMethod', 'Payment Method')

In [None]:
def make_hist(column, title):
    #compare the distribution of `column` for retained and churned customers
    fig = go.Figure()
    fig.add_trace(go.Histogram(x = n_churn[column], name = 'Non-churn'))
    fig.add_trace(go.Histogram(x = y_churn[column], name = 'Churn'))
    fig.update_layout(title_text = title+': Churn Vs. Non-churn customers',
                      xaxis_title_text = 'Value', yaxis_title_text = 'Count',
                      bargap = 0.2,
                      bargroupgap = 0.1
                     )
    fig.show()

In [None]:
make_hist('TotalCharges', 'Total Charge')

In [None]:
make_hist('tenure_duration', 'Tenure Duration')

In [None]:
avg_charges = churn.groupby('tenure_duration').mean(numeric_only = True).reset_index() #average only the numeric columns

## Average monthly charges by tenure slab

In [None]:
fig = px.bar(avg_charges, x = 'tenure_duration', y = 'MonthlyCharges')
fig.show()

In [None]:
churn.columns.tolist()

# Preprocessing

In [None]:
churn.nunique()

In [None]:
category_cols = []
for col in churn.columns.tolist():
    if (churn[col].nunique() <= 6):
        category_cols.append(col)
print(category_cols)

We will now use LabelEncoder to encode our data in numeric form which is necessary for ML models.

In [None]:
from sklearn import preprocessing

churn2 = churn.copy()
le = preprocessing.LabelEncoder()
churn2[category_cols] = churn2[category_cols].apply(le.fit_transform)
churn2.head()
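
To make the encoding concrete, the sketch below shows what `LabelEncoder` does to a single column, using hypothetical toy values. Codes are assigned in sorted label order, so the integers carry no inherent ranking. Because the cell above reuses one encoder for every column via `apply`, its `classes_` attribute afterwards reflects only the last column fitted.

```python
from sklearn.preprocessing import LabelEncoder

# Toy values standing in for one categorical column.
contract = ["Month-to-month", "Two year", "One year", "Month-to-month"]

encoder = LabelEncoder()
codes = encoder.fit_transform(contract)

# classes_ holds the sorted unique labels; each code is an index into it.
print(list(encoder.classes_))
print(codes.tolist())
```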

We will scale numeric value columns.

In [None]:
numeric_cols = ['tenure', 'MonthlyCharges', 'TotalCharges']

In [None]:
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()

scaled_values = scaler.fit_transform(churn2[numeric_cols])
scaled_values = pd.DataFrame(scaled_values, columns = numeric_cols, index = churn2.index) #keep churn2's index so the later merge aligns rows
scaled_values.head()
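
For reference, `StandardScaler` subtracts each column's mean and divides by its population standard deviation. A minimal NumPy sketch of that calculation on hypothetical values:

```python
import numpy as np

# Hypothetical monthly charges, not taken from the dataset.
x = np.array([20.0, 50.0, 80.0, 110.0])

# Standardize: subtract the mean, divide by the population std (ddof=0).
z = (x - x.mean()) / x.std(ddof=0)

print(z.round(3))
```

After scaling, the column has mean 0 and standard deviation 1, which keeps features measured on different scales comparable for models such as logistic regression.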

In [None]:
scaled_values.isnull().sum()

In [None]:
churn2.isnull().sum()

In [None]:
churn2 = churn2.drop(columns = numeric_cols)
churn2 = churn2.merge(scaled_values, how = 'left', left_index = True, right_index = True)
churn2 = churn2.dropna()
churn2.head()

In [None]:
correlation = churn2.drop(columns = ['customerID']).corr() #customerID is a string identifier, not a feature
correlation

In [None]:
corr_col = correlation.columns.tolist()


In [None]:
fig = go.Figure(data = go.Heatmap(z = correlation,
                                 x = corr_col,
                                 y = corr_col)
               )

fig.update_layout(title = 'Correlation Matrix', width = 800, height = 800)
fig.update_xaxes(tickangle = 90)
fig.show()

# Modelling

All feature columns are now in numeric form, so we can start modelling.

In [None]:
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix,accuracy_score,classification_report
from sklearn.metrics import roc_auc_score,roc_curve

In [None]:
#every column except the target and the customer identifier is a feature
t_cols = [i for i in churn2.columns if i not in ('Churn', 'customerID')]

In [None]:
train_data = churn2[t_cols]
target = churn2['Churn']

In [None]:
x_train, x_test, y_train, y_test = train_test_split(train_data, target, test_size = 0.3, random_state = 1)

lr = LogisticRegression(solver = 'liblinear')

lr.fit(x_train, y_train)

In [None]:
predictions_lr = lr.predict(x_test)

In [None]:
probs = lr.predict_proba(x_test)

In [None]:
acc_lr = lr.score(x_test, y_test)
print('The accuracy of this model is:', round(acc_lr*100, 1), '%')
print('\n')

print('Classification Report:\n')
clf_report_lr = classification_report(y_test, predictions_lr)
print(clf_report_lr)
print('\n')

con_matrix = confusion_matrix(y_test, predictions_lr)
print('Confusion Matrix:\n')
print(con_matrix)
print('\n')

roc_auc = roc_auc_score(y_test, probs[:, 1]) #AUC from predicted churn probabilities, not hard labels
print('Area under the curve:',roc_auc)
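
AUC is normally computed from continuous scores, such as the second column of `predict_proba`, because hard 0/1 predictions collapse the ROC curve to a single operating point. The pure-Python sketch below illustrates the pairwise definition of AUC; the helper `pairwise_auc` and all values are hypothetical, and in practice `roc_auc_score` does this work.

```python
def pairwise_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted churn probabilities.
y_true = [0, 0, 1, 1, 1]
proba = [0.2, 0.6, 0.4, 0.7, 0.9]
hard = [int(p >= 0.5) for p in proba]  # what predict() would return at a 0.5 cutoff

auc_from_proba = pairwise_auc(y_true, proba)
auc_from_labels = pairwise_auc(y_true, hard)
print(auc_from_proba, auc_from_labels)
```

The two numbers differ because thresholding discards the ranking information within each predicted class.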

## Visualizing Model Performance 

In [None]:
#plotting confusion matrix
fig = go.Figure(data = go.Heatmap(z = con_matrix,
                                 x = ['Not Churn', 'Churn'],
                                 y = ['Not Churn', 'Churn'],
                                 colorscale = 'Cividis',
                                 showscale = False))
fig.update_layout(title = 'Confusion Matrix')
fig.show()
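
To read such a matrix numerically, the sketch below derives accuracy, precision, and recall from a 2x2 confusion matrix. The counts are hypothetical and are not the notebook's results. scikit-learn puts true classes on rows and predicted classes on columns, so in the heatmap above the y axis is the true class and the x axis is the predicted class.

```python
import numpy as np

# Hypothetical 2x2 confusion matrix in scikit-learn's layout:
# rows are true classes, columns are predicted classes, class 0 first.
cm = np.array([[90, 10],
               [20, 30]])

tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()
precision = tp / (tp + fp)   # of customers flagged as churn, how many churned
recall = tp / (tp + fn)      # of customers who churned, how many were flagged

print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
```

For a retention campaign, recall on the churn class is often the number to watch, since a missed churner is a customer who was never targeted.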

In [None]:
import scikitplot as skplt

In [None]:
#plotting ROC curve
skplt.metrics.plot_roc(y_test, probs, figsize = (8,8), title = 'ROC Curves: Logistic Regression Model')

# Decision Tree Model

In [None]:
from sklearn.tree import DecisionTreeClassifier

In [None]:
tree = DecisionTreeClassifier()

In [None]:
tree.fit(x_train, y_train)

In [None]:
predict_tree = tree.predict(x_test)

In [None]:
def model_metrics(algo, x, y, preds):
    score = algo.score(x,y)
    print('The accuracy of this model is:', round(score*100, 1), '%')
    print('\n')
    
    print('Classification Report:\n')
    clf_report = classification_report(y, preds)
    print(clf_report) #report for this model, not the earlier logistic regression
    print('\n')
    
    con_matrix = confusion_matrix(y, preds)
    print('Confusion Matrix:\n')
    print(con_matrix)
    print('\n')
    
    roc_auc = roc_auc_score(y, algo.predict_proba(x)[:, 1]) #score with churn probabilities rather than hard labels
    print('Area under the curve:',roc_auc)
    
    return con_matrix

In [None]:
c = model_metrics(tree, x_test, y_test, predict_tree )

In [None]:
def plot_confusion(con_mat, model_name):
    fig = go.Figure(data = go.Heatmap(z = con_mat,
                                      x = ['Not Churn','Churn'],
                                      y = ['Not Churn','Churn'],
                                      colorscale = 'Cividis',
                                      showscale = False
                                     ))
    fig.update_layout(title = 'Confusion Matrix: '+ model_name)
    fig.show()

In [None]:
plot_confusion(c, 'Decision Tree')

# Recursive Feature Elimination

In [None]:
from sklearn.feature_selection import RFE

lr = LogisticRegression(solver = 'liblinear')
rfe = RFE(lr, n_features_to_select = 10) #keep the 10 highest-ranked features
rfe = rfe.fit(x_train, y_train.values.ravel())

rfe.support_
rfe.ranking_
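
As a self-contained illustration of what `support_` and `ranking_` contain, the sketch below runs RFE on synthetic data. The dataset shape, the choice of three features, and the variable names are assumptions for the example, not the notebook's setup.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

# Synthetic data standing in for the churn features (8 features, 3 informative).
X, y = make_classification(n_samples=300, n_features=8, n_informative=3,
                           n_redundant=0, random_state=0)

# RFE repeatedly fits the estimator and drops the weakest feature (step=1)
# until n_features_to_select remain.
selector = RFE(LogisticRegression(solver="liblinear"), n_features_to_select=3, step=1)
selector.fit(X, y)

print(selector.support_)   # boolean mask of kept features
print(selector.ranking_)   # 1 = kept; larger numbers were eliminated earlier
```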


In [None]:
churn_rfe = pd.DataFrame({"rfe_support" :rfe.support_,
                       "columns" : x_train.columns.tolist(), #same feature order RFE was fitted on
                       "ranking" : rfe.ranking_,
                      })
rfe_cols = churn_rfe[churn_rfe["rfe_support"] == True]["columns"].tolist()

In [None]:
rfe_cols

### New train and test data based on RFE

In [None]:
rfe_xtrain = x_train[rfe_cols]
rfe_ytrain = y_train
rfe_xtest = x_test[rfe_cols]
rfe_ytest = y_test

In [None]:
lr.fit(rfe_xtrain, rfe_ytrain)

In [None]:
predictions_rfe_lr = lr.predict(rfe_xtest)
probs_rfe = rfe.predict_proba(x_test)

In [None]:
print(rfe_xtest.shape)
print(rfe_ytest.shape)

In [None]:
predictions_rfe_lr

In [None]:
acc_lr_rfe = lr.score(rfe_xtest, rfe_ytest)
print('The accuracy of this model is:', round(acc_lr_rfe*100, 1), '%')

In [None]:
c_rfe = model_metrics(rfe, x_test, y_test, predictions_rfe_lr)

In [None]:
plot_confusion(c_rfe, 'Recursive Feature Elimination')

In [None]:
skplt.metrics.plot_roc(y_test, probs_rfe, figsize = (8,8), title = 'ROC Curve: RFE Model')

# Conclusion

We will use the first model, the logistic regression trained on all features, because it gives the highest accuracy (79.9%).