<h1 style='background-color: #171738; padding-left: 40px; padding-top: 10px; padding-bottom: 5px; padding-right: 10px; color: #DFF3E4; font-size: 18px; box-sizing: border-box;'>
Telco Churn<br><br>
Description<br>
In the rapidly evolving telecommunications industry, understanding and mitigating customer churn is a critical business concern. This data science project analyzes churn patterns in historical customer data and builds predictive models to identify customers at risk of churning. The resulting insights are intended to help Telco implement targeted retention strategies.<br><br>
Goals
<ul>
<li>Data Collection and Preprocessing: Gather and clean Telco customer data to create a comprehensive dataset suitable for analysis.</li>
<li>Exploratory Data Analysis: Perform exploratory analysis to identify trends, patterns, and potential correlations related to customer churn.</li>
<li>Feature Importance Determination: Employ machine learning techniques to assess the importance of various features in predicting churn, aiding in identifying critical factors.</li>
<li>Model Building and Evaluation: Develop predictive models for customer churn, compare their performance, and select the most effective one for accurate churn prediction.</li>
</ul>
</h1>


<h1 style='background-color: #171738; padding-left: 40px; padding-top: 10px; padding-bottom: 10px; padding-right: 10px; color: #DFF3E4; font-size: 18px; box-sizing: border-box;'>
Imports</h1>

In [2]:
import pandas as pd
import numpy as np

import matplotlib.pyplot as plt
import seaborn as sns
import os

from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import MinMaxScaler
from sklearn.linear_model import LogisticRegression

from sklearn.metrics import classification_report, confusion_matrix
from prepare import telco_pipeline

<h1 style='background-color: #171738; padding-left: 40px; padding-top: 10px; padding-bottom: 10px; padding-right: 10px; color: #DFF3E4; font-size: 18px; box-sizing: border-box;'>Acquire
<br>
<ul>
<li>Data was acquired from Codeup servers.</li>
<li>It contained 7,043 rows and 20 columns before cleaning.</li>
<li>Each row represents a customer of Telco.</li>
<li>Each column represents a feature of the customer's account.</li>
</ul>
</h1>

<h1 style='background-color: #171738; padding-left: 40px; padding-top: 10px; padding-bottom: 10px; padding-right: 10px; color: #DFF3E4; font-size: 18px; box-sizing: border-box;'>Prepare<br>
<br>

* Removed unnecessary columns associated with joining table IDs
* Checked for nulls in the data
    * Replaced blank total charges values for new customers with 0
    * Replaced internet service type nulls with "No internet service"
* Checked that column data types were appropriate
    * Changed data type of total charges from object to float
* Encoded categorical variables to binary "dummy" variables
    * Removed encoded columns that only duplicated information held in other columns
* Renamed columns to promote readability
* Split data into train, validate and test, stratifying on 'churn'
* Outliers have not been removed for this iteration of the project

</h1>
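
The preparation steps listed above can be sketched roughly as follows. This is a minimal illustration on a made-up 40-row frame, not the actual contents of `telco_pipeline` (which lives in `prepare.py` and is not shown here). The column names, the helper names `prepare_sketch` and `split_sketch`, the 70/15/15 proportions, and the random seed are all assumptions chosen for the example.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical miniature of the raw table (40 rows); real column names may differ.
raw = pd.DataFrame({
    "customer_id": [f"c{i}" for i in range(40)],
    "tenure": [0, 5, 12, 24, 36, 48, 60, 72, 1, 2] * 4,
    "total_charges": [" ", "350.0", "666.0", "1920.0", "3596.4",
                      "2160.0", "1800.0", "7920.0", "25.0", "130.0"] * 4,
    "internet_service_type": ["DSL", "Fiber optic", None, "DSL", "Fiber optic"] * 8,
    "churn": ["No", "Yes", "No", "No", "Yes"] * 8,
})


def prepare_sketch(df):
    """Clean and encode a raw frame, following the bullet list above."""
    df = df.drop(columns=["customer_id"])  # ID column is not a useful feature
    # Blank total charges belong to brand-new customers: treat them as 0,
    # then convert the column from object to float.
    df["total_charges"] = (
        df["total_charges"].str.strip().replace("", "0").astype(float)
    )
    df["internet_service_type"] = df["internet_service_type"].fillna(
        "No internet service"
    )
    df["churn"] = df["churn"] == "Yes"  # boolean target
    # One-hot encode the remaining categorical column.
    return pd.get_dummies(df, columns=["internet_service_type"])


def split_sketch(df, target="churn", seed=42):
    """Assumed 70/15/15 train/validate/test split, stratified on the target."""
    train, rest = train_test_split(
        df, test_size=0.30, stratify=df[target], random_state=seed
    )
    val, test = train_test_split(
        rest, test_size=0.50, stratify=rest[target], random_state=seed
    )
    return train, val, test


clean = prepare_sketch(raw)
train_demo, val_demo, test_demo = split_sketch(clean)
print(train_demo.shape, val_demo.shape, test_demo.shape)
```

Stratifying both splits on the target keeps the churn rate roughly equal across the three subsets, which matters when the classes are imbalanced.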

<h1 style='background-color: #171738; padding-left: 40px; padding-top: 10px; padding-bottom: 10px; padding-right: 10px; color: #DFF3E4; font-size: 18px; box-sizing: border-box;'>Dictionary<br>
<br>

| Feature| Description |
|:-------|:------------|
| Senior Citizen| Whether the customer is a senior citizen (0 for no, 1 for yes)|
| Tenure| Number of months the customer has been with Telco|
| Monthly Charges| Monthly charges incurred by the customer|
| Total Charges| Total charges incurred by the customer over the entire period|
| Male| Gender of the customer (True for male, False for female)|
| Partner| Whether the customer has a partner (spouse) (True for yes, False for no)|
| Dependents| Whether the customer has dependents (True for yes, False for no)|
| Phone Service| Whether the customer has phone service (True for yes, False for no)|
| Multiple Lines| Whether the customer has multiple phone lines (True for yes, False for no)|
| Online Security| Whether the customer has online security service (True for yes, False for no)|
| Online Backup| Whether the customer has online backup service (True for yes, False for no)|
| Device Protection| Whether the customer has device protection service (True for yes, False for no)|
| Tech Support| Whether the customer has technical support service (True for yes, False for no)|
| Streaming TV| Whether the customer has streaming TV service (True for yes, False for no)|
| Streaming Movies| Whether the customer has streaming movie service (True for yes, False for no)|
| Paperless Billing| Whether the customer has opted for paperless billing (True for yes, False for no)|
| Churn| Whether the customer has churned (True for churned, False for not churned)|
| Contract Month| Whether the customer is on a month-to-month contract (True for yes, False for no)|
| Contract One Year| Whether the customer is on a one-year contract (True for yes, False for no)|
| Contract Two Year| Whether the customer is on a two-year contract (True for yes, False for no)|
| Internet DSL| Whether the customer uses DSL internet service (True for yes, False for no)|
| Internet Fiber Optic| Whether the customer uses fiber optic internet service (True for yes, False for no)|
| Payment Bank Transfer| Whether the customer pays through bank transfer (True for yes, False for no)|
| Payment Credit Card| Whether the customer pays through credit card (True for yes, False for no)|
| Payment Electronic Check| Whether the customer pays through electronic check (True for yes, False for no)|
| Payment Mailed Check| Whether the customer pays through mailed check (True for yes, False for no)|

</h1>

<h1 style='background-color: #171738; padding-left: 40px; padding-top: 10px; padding-bottom: 10px; padding-right: 10px; color: #DFF3E4; font-size: 18px; box-sizing: border-box;'>
A brief look at the data</h1>

In [4]:
# acquiring, preparing, and adding features to data
# splitting data into train, validate, and test
train, val, test = telco_pipeline()
train.head()

| | senior_citizen | tenure | monthly_charges | total_charges | male | partner | dependents | phone_service | multiple_lines | online_security | ... | churn | contract_month | contract_one_year | contract_two_year | internet_dsl | internet_fiber_optic | payment_bank_transfer | payment_credit_card | payment_electronic_check | payment_mailed_check |
|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|
| 5609 | 0 | 14 | 76.45 | 1117.55 | True | False | False | True | False | False | ... | False | True | False | False | False | True | False | False | True | False |
| 2209 | 0 | 5 | 70.0 | 347.4 | True | False | False | True | False | False | ... | True | False | True | False | True | False | False | False | False | True |
| 6919 | 0 | 35 | 75.2 | 2576.2 | True | True | False | True | True | False | ... | True | True | False | False | False | True | False | False | True | False |
| 2284 | 0 | 58 | 86.1 | 4890.5 | True | True | False | True | True | True | ... | False | False | False | True | True | False | False | False | True | False |
| 845 | 0 | 2 | 49.6 | 114.7 | False | False | False | True | False | False | ... | True | True | False | False | True | False | False | False | False | True |


<h1 style='background-color: #171738; padding-left: 40px; padding-top: 10px; padding-bottom: 10px; padding-right: 10px; color: #DFF3E4; font-size: 18px; box-sizing: border-box;'>A summary of the data</h1>

In [5]:
train.describe()

| | senior_citizen | tenure | monthly_charges | total_charges |
|:---|:---|:---|:---|:---|
| count | 4930.0 | 4930.0 | 4930.0 | 4930.0 |
| mean | 0.160852 | 32.526369 | 65.110751 | 2306.877262 |
| std | 0.367432 | 24.595024 | 30.136981 | 2285.182364 |
| min | 0.0 | 0.0 | 18.4 | 0.0 |
| 25% | 0.0 | 9.0 | 36.3 | 401.15 |
| 50% | 0.0 | 29.0 | 70.575 | 1412.275 |
| 75% | 0.0 | 56.0 | 90.05 | 3855.025 |
| max | 1.0 | 72.0 | 118.75 | 8684.8 |


In [10]:
train.shape, val.shape, test.shape

((4930, 26), (1056, 26), (1057, 26))
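
As a quick sanity check on the shapes above, the row counts can be turned into proportions. The snippet below uses only the three numbers reported by the cell and does not depend on the data being loaded; the variable names are chosen for illustration.

```python
# Row counts taken from the train/val/test shapes reported above.
n_train, n_val, n_test = 4930, 1056, 1057
total = n_train + n_val + n_test

for name, n in [("train", n_train), ("validate", n_val), ("test", n_test)]:
    print(f"{name:>8}: {n:>5} rows ({n / total:.1%})")
print(f"   total: {total:>5} rows")
```

The three subsets add up to the 7,043 rows reported in the Acquire section and work out to roughly 70%, 15%, and 15% of the data.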

<h1 style='background-color: #171738; padding-left: 40px; padding-top: 10px; padding-bottom: 10px; padding-right: 10px; color: #DFF3E4; font-size: 18px; box-sizing: border-box;'>Explore</h1>

<h1 style='background-color: #171738; padding-left: 40px; padding-top: 10px; padding-bottom: 10px; padding-right: 10px; color: #DFF3E4; font-size: 18px; box-sizing: border-box;'>

<h1 style='background-color: #171738; padding-left: 40px; padding-top: 10px; padding-bottom: 10px; padding-right: 10px; color: #DFF3E4; font-size: 18px; box-sizing: border-box;'>