<h1 style="text-align:center">Telecom Customer Churn</h1>

## Part 2: Analysis of Customer Lifetime Value
Lifetime Value (LTV) is the total worth of a customer to a business over the whole period of their relationship with the company.

In [1]:
# importing required libraries
import sqlite3
import pandas as pd

In [2]:
# creating a sql connection
conn = sqlite3.connect("customer_churn.db")

In [3]:
# creating a dataframe from the customer_churn database
df = pd.read_sql("select * from churn_all;", conn)

In [4]:
# displaying the top 5 rows of the dataframe
df.head()

Unnamed: 0,CustomerID,Gender,SeniorCitizen,Partner,Dependents,State,Latitude,Longitude,ZipCode,PhoneService,...,TechSupport,StreamingTV,StreamingMovies,Tenure,Contract,PaymentMethod,PaperlessBilling,MonthlyCharges,TotalCharges,Churn
0,3668-QPYBK,Male,No,No,No,California,33.964131,-118.272783,90003,Yes,...,No,No,No,2,Month-to-month,Mailed0check,Yes,53.85,108.15,Yes
1,9237-HQITU,Female,No,No,Yes,California,34.059281,-118.30742,90005,Yes,...,No,No,No,2,Month-to-month,Electronic0check,Yes,70.7,151.65,Yes
2,9305-CDSKC,Female,No,No,Yes,California,34.048013,-118.293953,90006,Yes,...,No,Yes,Yes,8,Month-to-month,Electronic0check,Yes,99.65,820.5,Yes
3,7892-POOKP,Female,No,Yes,Yes,California,34.062125,-118.315709,90010,Yes,...,Yes,Yes,Yes,28,Month-to-month,Electronic0check,Yes,104.8,3046.05,Yes
4,0280-XJGEX,Male,No,No,Yes,California,34.039224,-118.266293,90015,Yes,...,No,Yes,Yes,49,Month-to-month,Bank0transfer0(automatic),Yes,103.7,5036.3,Yes


In [5]:
# displaying all the columns
print(df.columns)

Index(['CustomerID', 'Gender', 'SeniorCitizen', 'Partner', 'Dependents',
       'State', 'Latitude', 'Longitude', 'ZipCode', 'PhoneService',
       'MultipleLines', 'InternetService', 'OnlineSecurity', 'OnlineBackup',
       'DeviceProtection', 'TechSupport', 'StreamingTV', 'StreamingMovies',
       'Tenure', 'Contract', 'PaymentMethod', 'PaperlessBilling',
       'MonthlyCharges', 'TotalCharges', 'Churn'],
      dtype='object')
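
The analysis that follows reads each customer's LTV from `TotalCharges`. As a rough sanity check, and assuming charges are billed monthly at a roughly constant rate, `MonthlyCharges * Tenure` should land close to `TotalCharges`. The sketch below uses three rows copied from the `df.head()` preview; the columns `ApproxLTV` and `RelGap` are hypothetical names introduced only for illustration.

```python
import pandas as pd

# Three rows copied from the df.head() preview in this notebook.
sample = pd.DataFrame({
    "CustomerID": ["3668-QPYBK", "9237-HQITU", "9305-CDSKC"],
    "Tenure": [2, 2, 8],
    "MonthlyCharges": [53.85, 70.70, 99.65],
    "TotalCharges": [108.15, 151.65, 820.50],
})

# Hypothetical column: LTV approximated as current monthly rate x months of tenure.
sample["ApproxLTV"] = sample["MonthlyCharges"] * sample["Tenure"]

# Relative gap between the approximation and the billed total.
sample["RelGap"] = (sample["ApproxLTV"] - sample["TotalCharges"]).abs() / sample["TotalCharges"]

print(sample[["CustomerID", "TotalCharges", "ApproxLTV", "RelGap"]])
```

The gap is not zero because the monthly rate can change over a customer's tenure, which is one reason to prefer the billed `TotalCharges` as the LTV figure.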


---
### 1. Understanding the Customer Lifetime Value

#### Question 1: What was the average LTV of the customers who unsubscribed from the service? And how long do customers usually stay with the service?

In [6]:
# extracting those customers who churned (churn = yes)
churn_df = pd.read_sql("select * from churn_all where Churn = 'Yes'", conn)
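
The filter value in the query above is written directly into the SQL string. A possible variant, sketched here against a made-up in-memory table rather than the real `customer_churn.db`, passes the value through the `params` argument of `pd.read_sql` using sqlite3's `?` placeholder. The table contents below are illustrative assumptions.

```python
import sqlite3
import pandas as pd

# In-memory database with a tiny, made-up churn_all table (illustrative only).
demo_conn = sqlite3.connect(":memory:")
pd.DataFrame({
    "CustomerID": ["A-1", "A-2", "A-3"],
    "TotalCharges": [108.15, 1495.10, 820.50],
    "Churn": ["Yes", "No", "Yes"],
}).to_sql("churn_all", demo_conn, index=False)

# The "?" placeholder is filled from params, so the value is not spliced into the SQL text.
churned = pd.read_sql(
    "select * from churn_all where Churn = ?", demo_conn, params=("Yes",)
)
demo_conn.close()

print(churned)
```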

In [7]:
# displaying the top 5 rows
churn_df.head()

Unnamed: 0,CustomerID,Gender,SeniorCitizen,Partner,Dependents,State,Latitude,Longitude,ZipCode,PhoneService,...,TechSupport,StreamingTV,StreamingMovies,Tenure,Contract,PaymentMethod,PaperlessBilling,MonthlyCharges,TotalCharges,Churn
0,3668-QPYBK,Male,No,No,No,California,33.964131,-118.272783,90003,Yes,...,No,No,No,2,Month-to-month,Mailed0check,Yes,53.85,108.15,Yes
1,9237-HQITU,Female,No,No,Yes,California,34.059281,-118.30742,90005,Yes,...,No,No,No,2,Month-to-month,Electronic0check,Yes,70.7,151.65,Yes
2,9305-CDSKC,Female,No,No,Yes,California,34.048013,-118.293953,90006,Yes,...,No,Yes,Yes,8,Month-to-month,Electronic0check,Yes,99.65,820.5,Yes
3,7892-POOKP,Female,No,Yes,Yes,California,34.062125,-118.315709,90010,Yes,...,Yes,Yes,Yes,28,Month-to-month,Electronic0check,Yes,104.8,3046.05,Yes
4,0280-XJGEX,Male,No,No,Yes,California,34.039224,-118.266293,90015,Yes,...,No,Yes,Yes,49,Month-to-month,Bank0transfer0(automatic),Yes,103.7,5036.3,Yes


In [8]:
# examining the distribution of TotalCharges
churn_df["TotalCharges"].describe()

count    1869.000000
mean     1531.796094
std      1890.822994
min        18.850000
25%       134.500000
50%       703.550000
75%      2331.300000
max      8684.800000
Name: TotalCharges, dtype: float64

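The mean (~$1,532) is more than twice the median (~$704), and the maximum (~$8,685) is far above the 75th percentile (~$2,331), so the distribution is right-skewed: roughly the top 20% of churned customers have very high TotalCharges. We can therefore split the dataset at the 80th percentile and study each part separately.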

In [9]:
# finding the 80th percentile of the data in TotalCharges
churn_df.TotalCharges.quantile(0.8)

2840.4100000000003

Of the customers who left the company, 80% had a Lifetime Value of about $2,840 or less.

In [10]:
# selecting all customers in churn_all (churned or not) with TotalCharges less than or equal to $2840.41
pd.read_sql("select * from churn_all where TotalCharges <= 2840.41", conn)

Unnamed: 0,CustomerID,Gender,SeniorCitizen,Partner,Dependents,State,Latitude,Longitude,ZipCode,PhoneService,...,TechSupport,StreamingTV,StreamingMovies,Tenure,Contract,PaymentMethod,PaperlessBilling,MonthlyCharges,TotalCharges,Churn
0,3668-QPYBK,Male,No,No,No,California,33.964131,-118.272783,90003,Yes,...,No,No,No,2,Month-to-month,Mailed0check,Yes,53.85,108.15,Yes
1,9237-HQITU,Female,No,No,Yes,California,34.059281,-118.307420,90005,Yes,...,No,No,No,2,Month-to-month,Electronic0check,Yes,70.70,151.65,Yes
2,9305-CDSKC,Female,No,No,Yes,California,34.048013,-118.293953,90006,Yes,...,No,Yes,Yes,8,Month-to-month,Electronic0check,Yes,99.65,820.50,Yes
3,4190-MFLUW,Female,No,Yes,No,California,34.066367,-118.309868,90020,Yes,...,Yes,No,No,10,Month-to-month,Credit0card0(automatic),No,55.20,528.35,Yes
4,8779-QRDMV,Male,Yes,No,No,California,34.023810,-118.156582,90022,No,...,No,No,Yes,1,Month-to-month,Electronic0check,Yes,39.65,39.65,Yes
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4745,8456-QDAVC,Male,No,No,No,California,32.852947,-114.850784,92283,Yes,...,No,Yes,No,19,Month-to-month,Bank0transfer0(automatic),Yes,78.70,1495.10,No
4746,7750-EYXWZ,Female,No,No,No,California,34.159534,-116.425984,92284,No,...,Yes,Yes,Yes,12,One0year,Electronic0check,No,60.65,743.30,No
4747,2569-WGERO,Female,No,No,No,California,34.341737,-116.539416,92285,Yes,...,No internet service,No internet service,No internet service,72,Two0year,Bank0transfer0(automatic),Yes,21.15,1419.40,No
4748,6840-RESVB,Male,No,Yes,Yes,California,34.667815,-117.536183,92301,Yes,...,Yes,Yes,Yes,24,One0year,Mailed0check,Yes,84.80,1990.50,No


In [11]:
# applying the same filter to churned customers only
churn_df.query("TotalCharges <= 2840.41")

Unnamed: 0,CustomerID,Gender,SeniorCitizen,Partner,Dependents,State,Latitude,Longitude,ZipCode,PhoneService,...,TechSupport,StreamingTV,StreamingMovies,Tenure,Contract,PaymentMethod,PaperlessBilling,MonthlyCharges,TotalCharges,Churn
0,3668-QPYBK,Male,No,No,No,California,33.964131,-118.272783,90003,Yes,...,No,No,No,2,Month-to-month,Mailed0check,Yes,53.85,108.15,Yes
1,9237-HQITU,Female,No,No,Yes,California,34.059281,-118.307420,90005,Yes,...,No,No,No,2,Month-to-month,Electronic0check,Yes,70.70,151.65,Yes
2,9305-CDSKC,Female,No,No,Yes,California,34.048013,-118.293953,90006,Yes,...,No,Yes,Yes,8,Month-to-month,Electronic0check,Yes,99.65,820.50,Yes
5,4190-MFLUW,Female,No,Yes,No,California,34.066367,-118.309868,90020,Yes,...,Yes,No,No,10,Month-to-month,Credit0card0(automatic),No,55.20,528.35,Yes
6,8779-QRDMV,Male,Yes,No,No,California,34.023810,-118.156582,90022,No,...,No,No,Yes,1,Month-to-month,Electronic0check,Yes,39.65,39.65,Yes
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1863,1122-JWTJW,Male,No,Yes,Yes,California,32.698964,-115.886656,92259,Yes,...,No,No,No,1,Month-to-month,Mailed0check,Yes,70.65,70.65,Yes
1864,1699-HPSBG,Male,No,No,No,California,33.745746,-116.514215,92264,Yes,...,Yes,Yes,No,12,One0year,Electronic0check,Yes,59.80,727.80,Yes
1865,8775-CEBBJ,Female,No,No,No,California,32.790282,-115.689559,92273,Yes,...,No,No,No,9,Month-to-month,Bank0transfer0(automatic),Yes,44.20,403.35,Yes
1866,6894-LFHLY,Male,Yes,No,No,California,34.264124,-114.717964,92280,Yes,...,No,No,No,1,Month-to-month,Electronic0check,Yes,75.75,75.75,Yes


In [12]:
# dividing the data by the 80th percentile of the data from the TotalCharges variable
total_charges_under80 = churn_df.query("TotalCharges <= 2840.41")
total_charges_above80 = churn_df.query("TotalCharges > 2840.41")
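
The cut-off in the cell above is typed in by hand from the printed `quantile(0.8)` result. A minimal sketch of an equivalent split that reuses the computed threshold is shown below; it runs on a toy frame with made-up values, not on `churn_df`, and the names `toy`, `under80`, and `above80` are illustrative.

```python
import pandas as pd

# Toy stand-in for churn_df; the values are illustrative, not taken from the database.
toy = pd.DataFrame({
    "TotalCharges": [20.0, 85.0, 370.0, 700.0, 1130.0,
                     2330.0, 2840.0, 3520.0, 4570.0, 8680.0],
})

# Compute the 80th percentile once and reuse it, instead of hard-coding the number.
threshold = toy["TotalCharges"].quantile(0.8)

# Boolean masks split the frame into two non-overlapping parts.
under80 = toy[toy["TotalCharges"] <= threshold]
above80 = toy[toy["TotalCharges"] > threshold]

print(threshold, len(under80), len(above80))
```

Keeping the threshold in a variable means the split stays consistent if the underlying table changes.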

In [13]:
# displaying the distribution of TotalCharges paid by people under 80th percentile
total_charges_under80.TotalCharges.describe()

count    1495.000000
mean      711.265819
std       766.848197
min        18.850000
25%        85.025000
50%       371.650000
75%      1128.225000
max      2839.650000
Name: TotalCharges, dtype: float64

In [14]:
# displaying the distribution of TotalCharges paid by people above 80th percentile
total_charges_above80.TotalCharges.describe()

count     374.000000
mean     4811.723262
std      1436.724288
min      2841.550000
25%      3522.462500
50%      4571.100000
75%      5891.212500
max      8684.800000
Name: TotalCharges, dtype: float64
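
Since count × mean recovers each segment's sum, the two `describe()` outputs above also show how the total billed amount from churned customers splits between the segments. The arithmetic below uses only the printed counts and means; the variable names are illustrative.

```python
# Counts and means copied from the describe() outputs in this notebook.
count_under, mean_under = 1495, 711.265819
count_above, mean_above = 374, 4811.723262

# Sum of TotalCharges per segment = count x mean.
sum_under = count_under * mean_under
sum_above = count_above * mean_above
total = sum_under + sum_above

# Share of all churned-customer charges that came from the top 20% segment.
top_share = sum_above / total

# Cross-check: the combined mean should match the overall mean of ~1531.80.
overall_mean = total / (count_under + count_above)

print(round(top_share, 3), round(overall_mean, 2))
```

On these figures the top 20% of churned customers account for roughly 63% of the charges billed to all churned customers.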

In [15]:
# displaying the distribution of Tenure of people under 80th percentile
total_charges_under80.Tenure.describe()

count    1495.000000
mean        9.935117
std        10.742349
min         1.000000
25%         1.000000
50%         6.000000
75%        15.000000
max        61.000000
Name: Tenure, dtype: float64

In [16]:
# displaying the distribution of Tenure of people above 80th percentile
total_charges_above80.Tenure.describe()

count    374.000000
mean      50.133690
std       12.334841
min       27.000000
25%       40.000000
50%       49.500000
75%       60.000000
max       72.000000
Name: Tenure, dtype: float64

Answer:<br>
<i>The bottom 80% of the customers who unsubscribed have an average LTV of ~711 dollars and an average tenure of ~10 months. The top 20% of the customers who unsubscribed have an average LTV of ~4812 dollars and an average tenure of ~50 months.</i>

<i>The customers in the top 20% stayed roughly five times longer (~50 vs ~10 months), which accounts for most of the nearly sevenfold gap in average LTV.</i>
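
The segment comparison can also be produced as a single table with `groupby`. The sketch below is one way to do it, using seven churned rows copied from the previews in this notebook and the 2840.41 cut-off; the `Segment` column and its labels are hypothetical names, and the resulting means describe only this small sample, not the full dataset.

```python
import pandas as pd

# (Tenure, TotalCharges) pairs copied from the churned-customer previews in this notebook.
sample = pd.DataFrame({
    "Tenure": [2, 2, 8, 28, 49, 10, 1],
    "TotalCharges": [108.15, 151.65, 820.50, 3046.05, 5036.30, 528.35, 39.65],
})

# Label each row by which side of the 80th-percentile cut-off it falls on.
above = sample["TotalCharges"].gt(2840.41)
sample["Segment"] = above.map({False: "bottom 80%", True: "top 20%"})

# Mean LTV and mean tenure per segment, side by side.
summary = sample.groupby("Segment")[["TotalCharges", "Tenure"]].mean()

print(summary)
```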

---
#### Question 2: What kinds of services did the customers subscribe to while they were customers?

#### Service: PhoneService