# Understanding Factors Contributing to Customer Churn at Telco

## Problem

We want to understand which factors contribute to customer churn at Telco and what can be done to reduce it. Telco is a fictional telecommunications company created by IBM.

## Data Collection

<b><u> Some of the Basic Information </u></b><br>
Two datasets were pulled from the IBM community datasets on 8/17/2021: a main dataset, and a survey dataset that records the reason each customer decided to leave. The files used in the analysis will also be available in the GitHub repository.

<b>GitHub repository: </b> https://github.com/blakingtheice/Telco-Customer-Churn

<b> Links to Datasets </b>

Main dataset: https://community.ibm.com/accelerators/catalog/content/Customer-churn
<br>Survey Data: https://community.ibm.com/accelerators/catalog/content/Telco-customer-churn-status-and-reason-for-leaving


<b> Reading in the data </b>

In [12]:
import os

import pandas as pd

## Importing the data in

## Build the paths with os.path.join so they work on any operating system,
## not only on Windows
data_dir = os.path.join(os.getcwd(), "Data")
main_data = pd.read_excel(os.path.join(data_dir, "CustomerChurn.xlsx"))
survey_data = pd.read_excel(os.path.join(data_dir, "Telco_customer_churn_status.xlsx"))

<b><u> Dataset Descriptions </u></b><br>
<b> Main Dataset </b>

The main dataset describes the services each customer uses, whether the customer has churned, and demographic details such as whether the customer has dependents.

The first few rows below show how the data is stored and give a rough sense of the column types:

In [10]:
print(main_data.head())

   LoyaltyID Customer ID Senior Citizen Partner Dependents  Tenure  \
0     318537  7590-VHVEG             No     Yes         No       1   
1     152148  5575-GNVDE             No      No         No      34   
2     326527  3668-QPYBK             No      No         No       2   
3     845894  7795-CFOCW             No      No         No      45   
4     503388  9237-HQITU             No      No         No       2   

  Phone Service    Multiple Lines Internet Service Online Security  ...  \
0            No  No phone service              DSL              No  ...   
1           Yes                No              DSL             Yes  ...   
2           Yes                No              DSL             Yes  ...   
3            No  No phone service              DSL             Yes  ...   
4           Yes                No      Fiber optic              No  ...   

  Device Protection Tech Support Streaming TV Streaming Movies  \
0                No           No           No               No
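
Beyond the first few rows, the column types can also be checked directly. The following is a minimal sketch rather than part of the original analysis. It uses a small hand-made frame (`toy_main`, a hypothetical stand-in holding a few of the columns previewed above) so that it runs without the Excel files; in the notebook, `main_data` would take its place.

```python
import pandas as pd

# Hypothetical stand-in for main_data: three rows copied from the preview above,
# limited to a few columns so the example runs on its own.
toy_main = pd.DataFrame(
    {
        "Customer ID": ["7590-VHVEG", "5575-GNVDE", "3668-QPYBK"],
        "Senior Citizen": ["No", "No", "No"],
        "Tenure": [1, 34, 2],
        "Internet Service": ["DSL", "DSL", "DSL"],
    }
)

# dtypes reports the storage type pandas inferred for each column.
print(toy_main.dtypes)

# Split the columns into numeric and non-numeric (text-like) groups.
numeric_cols = toy_main.select_dtypes(include="number").columns.tolist()
text_cols = toy_main.select_dtypes(exclude="number").columns.tolist()
print("Numeric columns:", numeric_cols)
print("Text columns:", text_cols)

# For text columns, the number of distinct values hints at which are categorical.
print(toy_main[text_cols].nunique())
```

Columns that are expected to be numeric but show up in the text group would be worth a closer look before any modelling.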

<b> Survey Dataset </b>

The survey dataset has information about why a customer churned, their satisfaction scores, and customer lifetime value (CLTV).

The first few rows of the dataset are shown below:

In [11]:
print(survey_data.head())

  Customer ID  Count Quarter  Satisfaction Score Customer Status Churn Label  \
0  8779-QRDMV      1      Q3                   3         Churned         Yes   
1  7495-OOKFY      1      Q3                   3         Churned         Yes   
2  1658-BYGOY      1      Q3                   2         Churned         Yes   
3  4598-XLKNJ      1      Q3                   2         Churned         Yes   
4  4846-WHAFZ      1      Q3                   2         Churned         Yes   

   Churn Value  Churn Score  CLTV   Churn Category  \
0            1           91  5433       Competitor   
1            1           69  5302       Competitor   
2            1           81  3179       Competitor   
3            1           88  5337  Dissatisfaction   
4            1           67  2793            Price   

                   Churn Reason  
0  Competitor offered more data  
1  Competitor made better offer  
2  Competitor made better offer  
3     Limited range of services  
4            Extra data 

<b> Merging the Data Together </b>

In [19]:
## Left join so that every customer in the main dataset is kept
main_merge = main_data.merge(survey_data, how='left', on='Customer ID')

## Counting the missing values in each column

print(main_merge.isnull().sum())

LoyaltyID                0
Customer ID              0
Senior Citizen           0
Partner                  0
Dependents               0
Tenure                   0
Phone Service            0
Multiple Lines           0
Internet Service         0
Online Security          0
Online Backup            0
Device Protection        0
Tech Support             0
Streaming TV             0
Streaming Movies         0
Contract                 0
Paperless Billing        0
Payment Method           0
Monthly Charges          0
Total Charges            0
Churn                    0
Count                    0
Quarter                  0
Satisfaction Score       0
Customer Status          0
Churn Label              0
Churn Value              0
Churn Score              0
CLTV                     0
Churn Category        5174
Churn Reason          5174
dtype: int64
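
The two columns with missing values, `Churn Category` and `Churn Reason`, are the ones that describe why a customer left, so the blanks plausibly belong to customers who have not churned. A minimal sketch of how that could be checked, and of how the merge itself could be guarded, is shown here. It assumes `Customer ID` is unique in both tables, and it uses small hand-made frames (`toy_main` and `toy_survey`, hypothetical stand-ins with illustrative values) so that it runs without the Excel files.

```python
import pandas as pd

# Hypothetical stand-ins for main_data and survey_data (illustrative values only).
toy_main = pd.DataFrame(
    {"Customer ID": ["A-1", "B-2", "C-3", "D-4"], "Tenure": [1, 34, 2, 45]}
)
toy_survey = pd.DataFrame(
    {
        "Customer ID": ["A-1", "B-2", "C-3", "D-4"],
        "Churn Label": ["Yes", "No", "Yes", "No"],
        "Churn Reason": [
            "Competitor made better offer",
            None,
            "Limited range of services",
            None,
        ],
    }
)

# validate raises MergeError if Customer ID is duplicated on either side;
# indicator adds a _merge column recording where each row was found.
toy_merge = toy_main.merge(
    toy_survey,
    how="left",
    on="Customer ID",
    validate="one_to_one",
    indicator=True,
)

# Customers in the main table with no matching survey row.
unmatched = (toy_merge["_merge"] == "left_only").sum()
print("Customers without survey data:", unmatched)

# Cross-tabulate churn status against whether a churn reason is missing.
missing_reason = toy_merge["Churn Reason"].isnull()
print(pd.crosstab(toy_merge["Churn Label"], missing_reason))

# True when every missing reason belongs to a customer who has not churned.
blanks_are_non_churners = (toy_merge.loc[missing_reason, "Churn Label"] == "No").all()
print("All blanks are non-churners:", blanks_are_non_churners)
```

If the same cross-tabulation on `main_merge` places all 5174 blanks in the `No` row, the missing values are expected rather than a data-quality problem.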


## Exploring the Data

Exploratory analysis to be added here.

### Taking in the Survey Results:  What can we find?

We want to understand why customers have been churning. The survey data records a churn category and a churn reason, which makes it a natural place to start.

In [None]:
## Code goes here for:
## 1) Chart showing proportion of customers that have churned
## 2) Chart showing distribution of different reasons for churn
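
One possible way to build the two charts listed in the cell above is sketched here with pandas and matplotlib. It is an assumption about how the analysis might proceed rather than the final code: `toy_merge` is a hypothetical hand-made frame with illustrative values, standing in for `main_merge`, and the column names `Churn Label` and `Churn Category` are taken from the survey dataset preview.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for main_merge (illustrative values, not real results).
toy_merge = pd.DataFrame(
    {
        "Churn Label": ["Yes", "No", "No", "Yes", "No", "Yes", "No", "No"],
        "Churn Category": [
            "Competitor", None, None, "Dissatisfaction",
            None, "Competitor", None, None,
        ],
    }
)

# 1) Proportion of customers that have churned.
churn_share = toy_merge["Churn Label"].value_counts(normalize=True)

# 2) Distribution of churn categories among customers that churned.
#    value_counts skips missing values, so non-churners drop out on their own.
category_counts = toy_merge["Churn Category"].value_counts()

fig, (ax_share, ax_reason) = plt.subplots(1, 2, figsize=(10, 4))

churn_share.plot.bar(ax=ax_share, rot=0)
ax_share.set_title("Share of customers by churn status")
ax_share.set_ylabel("Proportion of customers")

# Sort ascending so the most common category ends up at the top of the chart.
category_counts.sort_values().plot.barh(ax=ax_reason)
ax_reason.set_title("Churn category among churned customers")
ax_reason.set_xlabel("Number of customers")

fig.tight_layout()
```

The same pattern would apply to `Churn Reason` if the finer-grained reasons are of interest, although with many distinct reasons a chart limited to the most frequent ones tends to be easier to read.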

Commentary on the graphs above will go here.

### Churn vs Other Factors

Comparison of churn against other customer attributes to be added here.

## Next Steps

### What are some of the quick wins that can be leveraged?

### Is there additional project work that needs to be completed?