# Telecom customer churn analysis

## Introduction

Customer churn, also known as customer attrition or customer defection, refers to a company losing customers. Telecommunications companies often track the attrition rate as a key business metric because acquiring a new customer typically costs far more than retaining an existing one.

In this case study, you will analyze a customer churn dataset for a telecommunications company that provides phone and internet services in California.

### Questions to ask

- How many customers joined the company during the last quarter? How many customers churned?
- What are the profiles of customers who churned, joined, and stayed? Do they differ?
- What seem to be the key drivers of customer churn?
- Is the company losing high-value customers? If so, how can it retain them?
- Can you build a prediction model that our client can use to 

[Customer Attrition - Wikipedia](https://en.wikipedia.org/wiki/Customer_attrition)

In [2]:
import pandas as pd
import numpy as np

In [3]:
import glob

glob.glob('*.csv')

['telecom_zipcode_population.csv', 'telecom_customer_churn.csv']

In [5]:
pd.set_option('display.max_columns', 100)

## About the dataset

The dataset contains churn data for a telecommunications company that provides phone and internet services to 7,043 customers in California. It includes details about customer demographics, location, services, and current status.

Source: [Maven Analytics Data Playground](https://www.mavenanalytics.io/data-playground)

### Data dictionary

The table below describes each column in the `'telecom_customer_churn.csv'` file.

| Field | Description |
|---|---|
| Customer ID | A unique ID that identifies each customer |
| Gender | The customer's gender: Male, Female |
| Age | The customer's current age, in years, at the time the fiscal quarter ended (Q2 2022) |
| Married | Indicates if the customer is married: Yes, No |
| Number of Dependents | Indicates the number of dependents that live with the customer (dependents could be children, parents, grandparents, etc.) |
| City | The city of the customer's primary residence in California |
| Zip Code | The zip code of the customer's primary residence |
| Latitude | The latitude of the customer's primary residence |
| Longitude | The longitude of the customer's primary residence |
| Number of Referrals | Indicates the number of times the customer has referred a friend or family member to this company to date |
| Tenure in Months | Indicates the total number of months that the customer has been with the company by the end of the quarter specified above |
| Offer | Identifies the last marketing offer that the customer accepted: None, Offer A, Offer B, Offer C, Offer D, Offer E |
| Phone Service | Indicates if the customer subscribes to home phone service with the company: Yes, No |
| Avg Monthly Long Distance Charges | Indicates the customer's average long distance charges, calculated to the end of the quarter specified above (if the customer is not subscribed to home phone service, this will be 0) |
| Multiple Lines | Indicates if the customer subscribes to multiple telephone lines with the company: Yes, No (if the customer is not subscribed to home phone service, this will be No) |
| Internet Service | Indicates if the customer subscribes to Internet service with the company: Yes, No |
| Internet Type | Indicates the customer's type of internet connection: DSL, Fiber Optic, Cable (if the customer is not subscribed to internet service, this will be None) |
| Avg Monthly GB Download | Indicates the customer's average download volume in gigabytes, calculated to the end of the quarter specified above (if the customer is not subscribed to internet service, this will be 0) |
| Online Security | Indicates if the customer subscribes to an additional online security service provided by the company: Yes, No (if the customer is not subscribed to internet service, this will be No) |
| Online Backup | Indicates if the customer subscribes to an additional online backup service provided by the company: Yes, No (if the customer is not subscribed to internet service, this will be No) |
| Device Protection Plan | Indicates if the customer subscribes to an additional device protection plan for their Internet equipment provided by the company: Yes, No (if the customer is not subscribed to internet service, this will be No) |
| Premium Tech Support | Indicates if the customer subscribes to an additional technical support plan from the company with reduced wait times: Yes, No (if the customer is not subscribed to internet service, this will be No) |
| Streaming TV | Indicates if the customer uses their Internet service to stream television programming from a third-party provider at no additional fee: Yes, No (if the customer is not subscribed to internet service, this will be No) |
| Streaming Movies | Indicates if the customer uses their Internet service to stream movies from a third-party provider at no additional fee: Yes, No (if the customer is not subscribed to internet service, this will be No) |
| Streaming Music | Indicates if the customer uses their Internet service to stream music from a third-party provider at no additional fee: Yes, No (if the customer is not subscribed to internet service, this will be No) |
| Unlimited Data | Indicates if the customer has paid an additional monthly fee to have unlimited data downloads/uploads: Yes, No (if the customer is not subscribed to internet service, this will be No) |
| Contract | Indicates the customer's current contract type: Month-to-Month, One Year, Two Year |
| Paperless Billing | Indicates if the customer has chosen paperless billing: Yes, No |
| Payment Method | Indicates how the customer pays their bill: Bank Withdrawal, Credit Card, Mailed Check |
| Monthly Charge | Indicates the customer's current total monthly charge for all their services from the company |
| Total Charges | Indicates the customer's total charges, calculated to the end of the quarter specified above |
| Total Refunds | Indicates the customer's total refunds, calculated to the end of the quarter specified above |
| Total Extra Data Charges | Indicates the customer's total charges for extra data downloads above those specified in their plan, by the end of the quarter specified above |
| Total Long Distance Charges | Indicates the customer's total charges for long distance above those specified in their plan, by the end of the quarter specified above |
| Total Revenue | Indicates the company's total revenue from this customer, calculated to the end of the quarter specified above (Total Charges - Total Refunds + Total Extra Data Charges + Total Long Distance Charges) |
| Customer Status | Indicates the status of the customer at the end of the quarter: Churned, Stayed, or Joined |
| Churn Category | A high-level category for the customer's reason for churning, which is asked when they leave the company: Attitude, Competitor, Dissatisfaction, Other, Price (directly related to Churn Reason) |
| Churn Reason | A customer's specific reason for leaving the company, which is asked when they leave the company (directly related to Churn Category) |
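The data dictionary defines `Total Revenue` as a combination of four other columns, so the definition can be checked against the data. The sketch below is a minimal illustration, not part of the original analysis: it hard-codes the values of the first two customers in `telecom_customer_churn.csv` as they appear in this notebook, and the names `sample`, `recomputed`, and `Revenue Matches` are made up for the example.

```python
import pandas as pd

# Values copied by hand from the first two rows of the file
# (customers 0002-ORFBO and 0003-MKNFE); not loaded from disk.
sample = pd.DataFrame({
    "Customer ID": ["0002-ORFBO", "0003-MKNFE"],
    "Total Charges": [593.30, 542.40],
    "Total Refunds": [0.00, 38.33],
    "Total Extra Data Charges": [0, 10],
    "Total Long Distance Charges": [381.51, 96.21],
    "Total Revenue": [974.81, 610.28],
})

# Recompute revenue from its components, following the data dictionary.
recomputed = (
    sample["Total Charges"]
    - sample["Total Refunds"]
    + sample["Total Extra Data Charges"]
    + sample["Total Long Distance Charges"]
)

# Compare with a small tolerance to absorb floating-point rounding.
sample["Revenue Matches"] = (recomputed - sample["Total Revenue"]).abs() < 0.01
print(sample[["Customer ID", "Total Revenue", "Revenue Matches"]])
```

Both sample rows agree with the formula. The same comparison could be applied to the full DataFrame once it is loaded, to see whether any rows deviate from the documented definition.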

In [6]:
df_churn = pd.read_csv('telecom_customer_churn.csv')
df_churn.head()

Unnamed: 0,Customer ID,Gender,Age,Married,Number of Dependents,City,Zip Code,Latitude,Longitude,Number of Referrals,Tenure in Months,Offer,Phone Service,Avg Monthly Long Distance Charges,Multiple Lines,Internet Service,Internet Type,Avg Monthly GB Download,Online Security,Online Backup,Device Protection Plan,Premium Tech Support,Streaming TV,Streaming Movies,Streaming Music,Unlimited Data,Contract,Paperless Billing,Payment Method,Monthly Charge,Total Charges,Total Refunds,Total Extra Data Charges,Total Long Distance Charges,Total Revenue,Customer Status,Churn Category,Churn Reason
0,0002-ORFBO,Female,37,Yes,0,Frazier Park,93225,34.827662,-118.999073,2,9,,Yes,42.39,No,Yes,Cable,16.0,No,Yes,No,Yes,Yes,No,No,Yes,One Year,Yes,Credit Card,65.6,593.3,0.0,0,381.51,974.81,Stayed,,
1,0003-MKNFE,Male,46,No,0,Glendale,91206,34.162515,-118.203869,0,9,,Yes,10.69,Yes,Yes,Cable,10.0,No,No,No,No,No,Yes,Yes,No,Month-to-Month,No,Credit Card,-4.0,542.4,38.33,10,96.21,610.28,Stayed,,
2,0004-TLHLJ,Male,50,No,0,Costa Mesa,92627,33.645672,-117.922613,0,4,Offer E,Yes,33.65,No,Yes,Fiber Optic,30.0,No,No,Yes,No,No,No,No,Yes,Month-to-Month,Yes,Bank Withdrawal,73.9,280.85,0.0,0,134.6,415.45,Churned,Competitor,Competitor had better devices
3,0011-IGKFF,Male,78,Yes,0,Martinez,94553,38.014457,-122.115432,1,13,Offer D,Yes,27.82,No,Yes,Fiber Optic,4.0,No,Yes,Yes,No,Yes,Yes,No,Yes,Month-to-Month,Yes,Bank Withdrawal,98.0,1237.85,0.0,0,361.66,1599.51,Churned,Dissatisfaction,Product dissatisfaction
4,0013-EXCHZ,Female,75,Yes,0,Camarillo,93010,34.227846,-119.079903,3,3,,Yes,7.38,No,Yes,Fiber Optic,11.0,No,No,No,Yes,Yes,No,No,Yes,Month-to-Month,Yes,Credit Card,83.9,267.4,0.0,0,22.14,289.54,Churned,Dissatisfaction,Network reliability
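Two things stand out in this preview: customer `0003-MKNFE` has a `Monthly Charge` of -4.0, and the churn fields are empty for customers who stayed. The sketch below shows one way such checks could be written. It is an illustration only: it rebuilds a few columns of the five preview rows by hand, and the names `preview`, `negative_charge`, and `not_churned` are made up for the example. Whether a negative monthly charge is a data error or a legitimate credit is not stated in the source, so it is only flagged here, not corrected.

```python
import numpy as np
import pandas as pd

# A few columns of the five preview rows, copied by hand.
preview = pd.DataFrame({
    "Customer ID": ["0002-ORFBO", "0003-MKNFE", "0004-TLHLJ",
                    "0011-IGKFF", "0013-EXCHZ"],
    "Monthly Charge": [65.6, -4.0, 73.9, 98.0, 83.9],
    "Customer Status": ["Stayed", "Stayed", "Churned", "Churned", "Churned"],
    "Churn Category": [np.nan, np.nan, "Competitor",
                       "Dissatisfaction", "Dissatisfaction"],
})

# Flag rows whose monthly charge is negative.
negative_charge = preview[preview["Monthly Charge"] < 0]
print(negative_charge[["Customer ID", "Monthly Charge"]])

# Check that customers who did not churn have no churn category recorded.
not_churned = preview["Customer Status"] != "Churned"
print(preview.loc[not_churned, "Churn Category"].isna().all())
```

On the full dataset, the same filters could be run on `df_churn` to count how many rows are affected before deciding how to treat them.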


In [7]:
df_churn['Customer Status'].value_counts()

Stayed     4720
Churned    1869
Joined      454
Name: Customer Status, dtype: int64
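Raw counts are easier to compare as shares of the customer base. The sketch below is a minimal illustration that rebuilds the counts shown above by hand; the names `status_counts`, `status_share`, and `churn_rate` are made up for the example. On the loaded DataFrame, `df_churn['Customer Status'].value_counts(normalize=True)` should give the same proportions directly.

```python
import pandas as pd

# Counts copied from the value_counts() output above.
status_counts = pd.Series(
    {"Stayed": 4720, "Churned": 1869, "Joined": 454},
    name="Customer Status",
)

# Share of each status as a percentage of all customers.
status_share = (status_counts / status_counts.sum() * 100).round(1)
print(status_share)

# Churn rate: churned customers divided by all customers in the dataset.
churn_rate = status_counts["Churned"] / status_counts.sum()
print(f"Churn rate: {churn_rate:.1%}")
```

With these counts, roughly 26.5% of the 7,043 customers churned. Note that this treats every customer in the file as the denominator; other definitions of churn rate (for example, excluding customers who joined during the quarter) would give a different figure.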

In [8]:
df_churn['Churn Reason'].value_counts()

Competitor had better devices                313
Competitor made better offer                 311
Attitude of support person                   220
Don't know                                   130
Competitor offered more data                 117
Competitor offered higher download speeds    100
Attitude of service provider                  94
Price too high                                78
Product dissatisfaction                       77
Network reliability                           72
Long distance charges                         64
Service dissatisfaction                       63
Moved                                         46
Extra data charges                            39
Limited range of services                     37
Poor expertise of online support              31
Lack of affordable download/upload speed      30
Lack of self-service on Website               29
Poor expertise of phone support               12
Deceased                                       6
Name: Churn Reason, 
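Several of the most common reasons mention a competitor, so it may help to group them. The sketch below is an illustration only: it rebuilds the counts shown above by hand and sums the reasons whose label starts with "Competitor". The names `reason_counts`, `is_competitor`, and `competitor_share` are made up for the example.

```python
import pandas as pd

# Counts copied from the value_counts() output above.
reason_counts = pd.Series({
    "Competitor had better devices": 313,
    "Competitor made better offer": 311,
    "Attitude of support person": 220,
    "Don't know": 130,
    "Competitor offered more data": 117,
    "Competitor offered higher download speeds": 100,
    "Attitude of service provider": 94,
    "Price too high": 78,
    "Product dissatisfaction": 77,
    "Network reliability": 72,
    "Long distance charges": 64,
    "Service dissatisfaction": 63,
    "Moved": 46,
    "Extra data charges": 39,
    "Limited range of services": 37,
    "Poor expertise of online support": 31,
    "Lack of affordable download/upload speed": 30,
    "Lack of self-service on Website": 29,
    "Poor expertise of phone support": 12,
    "Deceased": 6,
}, name="Churn Reason")

# Select reasons whose label starts with "Competitor".
is_competitor = reason_counts.index.str.startswith("Competitor")
competitor_share = reason_counts[is_competitor].sum() / reason_counts.sum()

print(f"Total reasons recorded: {reason_counts.sum()}")
print(f"Competitor-related share: {competitor_share:.1%}")
```

The reasons sum to 1,869, which matches the number of churned customers, and competitor-related reasons account for about 45% of them. The `Churn Category` column described in the data dictionary offers a ready-made grouping that could be used instead of matching on label text.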

In [9]:
df_churn.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7043 entries, 0 to 7042
Data columns (total 38 columns):
 #   Column                             Non-Null Count  Dtype  
---  ------                             --------------  -----  
 0   Customer ID                        7043 non-null   object 
 1   Gender                             7043 non-null   object 
 2   Age                                7043 non-null   int64  
 3   Married                            7043 non-null   object 
 4   Number of Dependents               7043 non-null   int64  
 5   City                               7043 non-null   object 
 6   Zip Code                           7043 non-null   int64  
 7   Latitude                           7043 non-null   float64
 8   Longitude                          7043 non-null   float64
 9   Number of Referrals                7043 non-null   int64  
 10  Tenure in Months                   7043 non-null   int64  
 11  Offer                              7043 non-null   object 

## Customer profiles

### Churn customers