# Customer Retention Metrics

Customer retention measurement is the process of quantifying and documenting how well a company keeps its customers. Organizations track it to see how well they serve customer needs and whether they continue to earn their customers' business over time.

In this section, we examine and compare several customer retention metrics that are relevant to retaining clients from past campaigns.


In [66]:
from datetime import datetime, timedelta
from sklearn.cluster import KMeans
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import warnings

%matplotlib inline
warnings.filterwarnings("ignore")

In [67]:
# Load the segmented customer data
df = pd.read_csv('segmentation_data.csv')

In [68]:
df.head()

,id,customer_age,job_type,marital,education,default,balance,housing_loan,personal_loan,communication_type,...,num_contacts_in_campaign,days_since_prev_campaign_contact,num_contacts_prev_campaign,prev_campaign_outcome,term_deposit_subscribed,spending_score,customer_age_cluster,balance_cluster,spending_score_cluster,OverallScore
0,id_43823,28.0,management,single,tertiary,no,285.0,yes,no,email,...,4.0,0.0,0,unknown,0.0,40,1,0,0,1
1,id_32289,34.0,blue-collar,married,secondary,no,934.0,no,yes,cellular,...,2.0,132.0,1,other,0.0,83,1,0,1,2
2,id_10523,46.0,technician,married,secondary,no,656.0,no,no,cellular,...,4.0,0.0,0,unknown,0.0,12,0,0,2,2
3,id_43951,34.0,services,single,secondary,no,2.0,yes,no,email,...,3.0,0.0,0,unknown,0.0,81,1,0,1,2
4,id_40992,41.0,blue-collar,married,primary,no,1352.0,yes,no,cellular,...,2.0,0.0,0,unknown,0.0,80,1,0,1,2


In [69]:
print('shape of df {}'.format(df.shape))

shape of df (45211, 23)


In [70]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 45211 entries, 0 to 45210
Data columns (total 23 columns):
 #   Column                            Non-Null Count  Dtype  
---  ------                            --------------  -----  
 0   id                                45211 non-null  object 
 1   customer_age                      45211 non-null  float64
 2   job_type                          45211 non-null  object 
 3   marital                           45211 non-null  object 
 4   education                         45211 non-null  object 
 5   default                           45211 non-null  object 
 6   balance                           45211 non-null  float64
 7   housing_loan                      45211 non-null  object 
 8   personal_loan                     45211 non-null  object 
 9   communication_type                45211 non-null  object 
 10  day_of_month                      45211 non-null  int64  
 11  month                             45211 non-null  object 
 12  last

In [72]:
# sum of null values

df.isnull().sum()

id                                  0
customer_age                        0
job_type                            0
marital                             0
education                           0
default                             0
balance                             0
housing_loan                        0
personal_loan                       0
communication_type                  0
day_of_month                        0
month                               0
last_contact_duration               0
num_contacts_in_campaign            0
days_since_prev_campaign_contact    0
num_contacts_prev_campaign          0
prev_campaign_outcome               0
term_deposit_subscribed             0
spending_score                      0
customer_age_cluster                0
balance_cluster                     0
spending_score_cluster              0
OverallScore                        0
dtype: int64

In [73]:
# summary statistics

df.describe()

,customer_age,balance,day_of_month,last_contact_duration,num_contacts_in_campaign,days_since_prev_campaign_contact,num_contacts_prev_campaign,term_deposit_subscribed,spending_score,customer_age_cluster,balance_cluster,spending_score_cluster,OverallScore
count,45211.0,45211.0,45211.0,45211.0,45211.0,45211.0,45211.0,45211.0,45211.0,45211.0,45211.0,45211.0,45211.0
mean,39.647342,1342.655836,15.806419,255.2132,2.753334,41.015195,0.580323,0.07507,53.030789,0.642078,0.389949,0.99509,2.027117
std,12.034303,2998.286959,8.322476,257.692091,3.090163,99.792615,2.303441,0.263508,26.810608,0.518793,0.99068,0.816455,1.360198
min,0.0,-8020.0,1.0,0.0,0.0,0.0,0.0,0.0,7.0,0.0,0.0,0.0,0.0
25%,32.0,60.0,8.0,100.0,1.0,0.0,0.0,0.0,30.0,0.0,0.0,0.0,1.0
50%,38.0,434.0,16.0,178.0,2.0,0.0,0.0,0.0,53.0,1.0,0.0,1.0,2.0
75%,48.0,1404.5,21.0,316.0,3.0,0.0,0.0,0.0,76.0,1.0,0.0,2.0,3.0
max,97.0,102128.0,31.0,4900.0,63.0,871.0,275.0,1.0,99.0,2.0,3.0,2.0,7.0


### Basic Terminology:

Customer Retention Rate = Total Customers from Campaign Subscribed / Total Customers Sent Communication

Customer Churn Rate = 1 - Customer Retention Rate
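
The two definitions above can be sketched as a small helper. The function name `retention_and_churn` and its argument names are illustrative assumptions, not part of the original notebook, and the example call uses made-up counts.

```python
from typing import Tuple


def retention_and_churn(subscribed: int, contacted: int) -> Tuple[float, float]:
    """Return (retention_rate, churn_rate) for one campaign."""
    if contacted <= 0:
        raise ValueError("contacted must be positive")
    retention = subscribed / contacted
    # Churn is defined in this section as the complement of retention.
    return retention, 1.0 - retention


# Hypothetical campaign: 50 of 200 contacted customers subscribed.
retention, churn = retention_and_churn(subscribed=50, contacted=200)
print(f"retention={retention:.2%}, churn={churn:.2%}")
```

Returning both values together keeps the two rates consistent, since churn is always derived from the same retention figure.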

In [74]:
previous_campaign = df['prev_campaign_outcome'].value_counts()
previous_campaign

unknown    36959
failure     4901
other       1840
success     1511
Name: prev_campaign_outcome, dtype: int64

In [75]:
total_count = df['prev_campaign_outcome'].count()
total_count

45211

In [76]:
current_campaign = df['term_deposit_subscribed'].value_counts()
current_campaign

0.0    41817
1.0     3394
Name: term_deposit_subscribed, dtype: int64

In [77]:
# Previous-campaign successes as a share of all customers (including 'unknown' outcomes)
Retention_prev_camp = previous_campaign['success'] / total_count
Retention_prev_camp

0.0334210700935613

In [78]:
# Current-campaign subscribers (label 1.0) as a share of all customers
Retention_cur_camp = current_campaign.loc[1.0] / total_count
Retention_cur_camp

0.07507022627236734
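
The previous-campaign rate divides by all 45,211 customers, although 36,959 of them have the outcome `unknown`. If `unknown` means the customer was not contacted in the previous campaign (an assumption about the data that the notebook does not confirm), a rate restricted to known outcomes may be easier to compare with the current campaign. The sketch below rebuilds the counts from the `value_counts()` outputs above and also computes the churn rate defined earlier; the names `known_outcomes`, `retention_prev_known`, and `churn_cur` are hypothetical.

```python
import pandas as pd

# Counts copied from the value_counts() outputs above.
previous_campaign = pd.Series(
    {"unknown": 36959, "failure": 4901, "other": 1840, "success": 1511}
)
current_campaign = pd.Series({0.0: 41817, 1.0: 3394})
total_count = int(previous_campaign.sum())

# Churn rate as defined in this section: 1 - retention rate.
retention_cur = current_campaign.loc[1.0] / total_count
churn_cur = 1 - retention_cur

# ASSUMPTION: 'unknown' marks customers with no recorded previous contact,
# so they are excluded from the previous-campaign denominator.
known_outcomes = previous_campaign.drop("unknown")
retention_prev_known = known_outcomes.loc["success"] / known_outcomes.sum()

print(f"current churn rate: {churn_cur:.4f}")
print(f"previous retention (known outcomes only): {retention_prev_known:.4f}")
```

Whether the restricted denominator is the right one depends on how `unknown` was recorded, so both versions are worth reporting side by side.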

### Loyal Customers

Knowing how many loyal customers you have is crucial because they are the most valuable members of your customer base. They not only drive the most sales but are also the most likely to spread positive word of mouth about your company. Identifying these customers lets you collect testimonials and build customer advocacy.

Loyal Customer Rate = Number of Repeat Customers / Total Customers
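
One way to apply this definition to campaign data is to treat a customer as a repeat customer when both the previous and the current campaign succeeded. That reading is an assumption made here for illustration, and the small DataFrame below is made-up data that only borrows the notebook's column names.

```python
import pandas as pd

# Made-up example rows; only the column names come from the notebook.
toy = pd.DataFrame({
    "prev_campaign_outcome": ["success", "failure", "unknown", "success", "other"],
    "term_deposit_subscribed": [1.0, 0.0, 1.0, 0.0, 0.0],
})

# ASSUMPTION: repeat customer = succeeded previously AND subscribed now.
is_repeat = (toy["prev_campaign_outcome"] == "success") & (
    toy["term_deposit_subscribed"] == 1
)

# Loyal Customer Rate = Number of Repeat Customers / Total Customers
loyal_customer_rate = is_repeat.sum() / len(toy)
print(loyal_customer_rate)
```

A boolean mask keeps the repeat-customer rule in one place, so a different rule (for example, counting `other` as a prior purchase) only changes that single expression.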

In [79]:
# Relabel the numeric target so it uses the same labels as prev_campaign_outcome
df['term_deposit_subscribed'] = df['term_deposit_subscribed'].replace({0: 'failure', 1: 'success'})
df.groupby('term_deposit_subscribed').describe()

,customer_age,customer_age,customer_age,customer_age,customer_age,customer_age,customer_age,customer_age,balance,balance,...,spending_score_cluster,spending_score_cluster,OverallScore,OverallScore,OverallScore,OverallScore,OverallScore,OverallScore,OverallScore,OverallScore
term_deposit_subscribed,count,mean,std,min,25%,50%,75%,max,count,mean,...,75%,max,count,mean,std,min,25%,50%,75%,max
failure,41817.0,39.562044,11.788649,0.0,32.0,38.0,48.0,95.0,41817.0,1304.543104,...,2.0,2.0,41817.0,2.011957,1.352569,0.0,1.0,2.0,3.0,7.0
success,3394.0,40.698291,14.690009,0.0,31.0,38.0,50.0,97.0,3394.0,1812.237478,...,2.0,2.0,3394.0,2.213907,1.438052,0.0,1.0,2.0,3.0,7.0


In [80]:
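# Count customers whose outcome label is identical in both campaigns
# (failure/failure or success/success); 'unknown' and 'other' never match.
prev = df['prev_campaign_outcome']
cur = df['term_deposit_subscribed']
c = (prev == cur).sum()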

In [83]:
new_customers = (previous_campaign['success'] + current_campaign[0] - c)/total_count
new_customers

0.8429143350069673

In [84]:
# Complement of the new_customers share computed in the previous cell
loyal_customers = 1 - new_customers
loyal_customers

0.15708566499303267
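
To see which outcome pairs feed the count `c`, the two outcome columns can be cross-tabulated. The following is a minimal sketch using `pd.crosstab` on made-up rows that only borrow the notebook's column names and labels; the names `pairs`, `matches`, and `same_label` are hypothetical.

```python
import pandas as pd

# Made-up example rows using the notebook's column names and labels.
toy = pd.DataFrame({
    "prev_campaign_outcome": ["success", "failure", "unknown", "success", "failure", "other"],
    "term_deposit_subscribed": ["success", "failure", "success", "failure", "failure", "failure"],
})

# Rows: previous outcome; columns: current outcome.
pairs = pd.crosstab(toy["prev_campaign_outcome"], toy["term_deposit_subscribed"])
print(pairs)

# Matching labels are the (failure, failure) and (success, success) cells.
matches = pairs.loc["failure", "failure"] + pairs.loc["success", "success"]

# The same count obtained by comparing the two columns element-wise.
same_label = (toy["prev_campaign_outcome"] == toy["term_deposit_subscribed"]).sum()
print(matches, same_label)
```

Reading the table cell by cell makes it easier to check which customers a given ratio actually counts, for example the (success, success) cell for repeat customers under the definition above.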