# 1 - Introduction

## 1.1 - Table of Contents
Problem Definition [[ReadMe.md](ReadMe.md)] \
Data Preparation & Data-Driven Insights [[Exploratory Data Analysis](Jupyter%20Notebooks/1%20-%20Exploratory%20Data%20Analysis%20(EDA).ipynb), [Machine Learning](Jupyter%20Notebooks/2%20-%20Machine%20Learning.ipynb), [Additional Machine Learning](Jupyter%20Notebooks/3%20-%20Additional%20Machine%20Learning%20(Logistic%20Regression,%20K%20Nearest%20Neighbours,%20Random%20Forest).ipynb)] \
Exploratory Data Analysis [[Exploratory Data Analysis](Jupyter%20Notebooks/1%20-%20Exploratory%20Data%20Analysis%20(EDA).ipynb)] \
Machine Learning (Binary Tree Classification) [[Machine Learning](Jupyter%20Notebooks/2%20-%20Machine%20Learning.ipynb)] \
Additional Machine Learning (Logistic Regression, K-Nearest Neighbours, Random Forest) [[Additional Machine Learning](Jupyter%20Notebooks/3%20-%20Additional%20Machine%20Learning%20(Logistic%20Regression,%20K%20Nearest%20Neighbours,%20Random%20Forest).ipynb)]


## 1.2 - Problem Definition
In the fast-changing telco industry, customer retention is a growing issue as competing telcos offer more attractive deals to lure potential customers. Hence, we wish to understand why customers switch telco companies and to predict the likelihood of existing customers changing their telco provider, so that telcos can implement changes to retain customers before it is too late.

In today's fast-changing telecommunications industry, the battle for customer loyalty and retention has become increasingly fierce. Telcos continually innovate and offer attractive deals to entice potential customers, which raises growing concern about customer churn. Studies have shown that acquiring a new customer can cost five to twenty-five times more than retaining an existing one (Singh & Khan, 2018). Furthermore, increasing customer retention by 5% can increase profits by 25% to 95% (Gallo, 2014). Hence, customer retention rates greatly impact a telco's business. As competitive deals tempt customers to switch providers, it has become imperative to understand the reasons behind this trend. Our aim is therefore to uncover what drives customers to switch telco companies and to develop predictive models that estimate the likelihood of existing customers changing their provider. With these insights, telcos can proactively implement retention strategies before customers consider switching.

### 1.2.1 - References
- Gallo, A. (2014, October 29). The Value of Keeping the Right Customers. Harvard Business Review. https://hbr.org/2014/10/the-value-of-keeping-the-right-customers
- Singh, R., & Khan, I. A. (2018). An approach to increase customer retention and loyalty in B2C world. International Journal of Scientific and Research Publications, 2(11) (ISSN: 2250-3153). http://www.ijsrp.org/research-paper-1112.php?rp=P11433

## 1.3 - Understanding the dataset
The dataset we will be using is from an IBM Sample Dataset for Telco Customer Churn, available on Kaggle (https://www.kaggle.com/blastchar/telco-customer-churn).

```python
import pandas as pd

# Load the raw dataset and list its column names
df_initial = pd.read_csv('Telco_Customer_Churn.csv')
df_initial.columns
```

```
Index(['customerID', 'gender', 'SeniorCitizen', 'Partner', 'Dependents',
       'tenure', 'PhoneService', 'MultipleLines', 'InternetService',
       'OnlineSecurity', 'OnlineBackup', 'DeviceProtection', 'TechSupport',
       'StreamingTV', 'StreamingMovies', 'Contract', 'PaperlessBilling',
       'PaymentMethod', 'MonthlyCharges', 'TotalCharges', 'Churn'],
      dtype='object')
```

The dataset features can be broadly categorized into three sections: User Information, Services Information, and Contract and Payment Information. The remaining column, `Churn`, is the prediction target. Below is a summary of the features available in the `Telco_Customer_Churn.csv` dataset.


### 1.3.1 - User Information
- `customerID` (Identifier): Unique alphanumeric ID that identifies each customer; it is not a numerical feature.
- `gender` (Categorical): The customer's gender (male or female).
- `SeniorCitizen` (Categorical): Whether the customer is a senior citizen or not (1 for Yes, 0 for No).
- `Partner` (Categorical): Whether the customer has a partner or not (Yes or No).
- `Dependents` (Categorical): Whether the customer is living with dependents (children, parents, grandparents) or not (Yes or No).
- `tenure` (Numerical): Number of months the customer has stayed with the telco company.
- `PhoneService` (Categorical): Whether the customer has a phone service subscription (Yes or No).


### 1.3.2 - Services Information
- `MultipleLines` (Categorical): Whether the customer has multiple lines or not (Yes, No, or No phone service).
- `InternetService` (Categorical): The customer's internet service type (DSL, Fiber optic, or No).
- `OnlineSecurity` (Categorical): Whether the customer has online security service or not (Yes, No, or No internet service).
- `OnlineBackup` (Categorical): Whether the customer has online backup service or not (Yes, No, or No internet service).
- `DeviceProtection` (Categorical): Whether the customer has device protection service or not (Yes, No, or No internet service).
- `TechSupport` (Categorical): Whether the customer has tech support service or not (Yes, No, or No internet service).
- `StreamingTV` (Categorical): Whether the customer has streaming TV service or not (Yes, No, or No internet service).
- `StreamingMovies` (Categorical): Whether the customer has streaming movies service or not (Yes, No, or No internet service).


### 1.3.3 - Contract and Payment Information
- `Contract` (Categorical): The contract term of the customer (Month-to-month, One year, Two year).
- `PaperlessBilling` (Categorical): Whether the customer has paperless billing or not (Yes or No).
- `PaymentMethod` (Categorical): The customer's payment method (Electronic check, Mailed check, Bank transfer (automatic), Credit card (automatic)).
- `MonthlyCharges` (Numerical): The amount charged to the customer each month.
- `TotalCharges` (Numerical): The total amount charged to the customer.
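
A column that is numerical in meaning, such as `TotalCharges`, can still load as text if the CSV contains non-numeric entries (for example, blank strings). The sketch below shows one way to check for and coerce such a column with `pd.to_numeric`. It uses a small hand-made frame named `sample` with made-up values, not rows from the dataset, so the blank entry is an assumption for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for a few rows of the dataset; all values are made up.
sample = pd.DataFrame({
    "tenure": [0, 12, 24],
    "MonthlyCharges": [52.55, 70.10, 99.65],
    "TotalCharges": [" ", "841.2", "2391.6"],  # blank string simulates a missing value
})

# Convert the text column to numbers; entries that cannot be parsed become NaN.
sample["TotalCharges"] = pd.to_numeric(sample["TotalCharges"], errors="coerce")

# Count how many entries could not be converted.
n_missing = int(sample["TotalCharges"].isna().sum())

print(sample.dtypes)
print("Missing TotalCharges values:", n_missing)
```

Whether to drop or fill any resulting missing values is a modelling decision and is left open here.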

### 1.3.4 - Prediction Target
- `Churn` (Categorical): Whether the customer churned or not (Yes or No).
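
The grouping above can also be written down in code, which makes it easy to select one group of columns at a time. This is a minimal sketch: the names `FEATURE_GROUPS`, `all_columns`, and `churn_flag` are hypothetical, and the Yes/No to 1/0 mapping for `Churn` is one common encoding choice rather than something prescribed by the dataset.

```python
import pandas as pd

# Column groups as listed in sections 1.3.1 to 1.3.4 (dictionary name is hypothetical).
FEATURE_GROUPS = {
    "user": ["customerID", "gender", "SeniorCitizen", "Partner",
             "Dependents", "tenure", "PhoneService"],
    "services": ["MultipleLines", "InternetService", "OnlineSecurity",
                 "OnlineBackup", "DeviceProtection", "TechSupport",
                 "StreamingTV", "StreamingMovies"],
    "contract_payment": ["Contract", "PaperlessBilling", "PaymentMethod",
                         "MonthlyCharges", "TotalCharges"],
    "target": ["Churn"],
}

# Flatten the groups to confirm every column is listed exactly once.
all_columns = [col for cols in FEATURE_GROUPS.values() for col in cols]

# Example of encoding the target as 1 (churned) and 0 (stayed); values are made up.
churn_example = pd.Series(["Yes", "No", "No"], name="Churn")
churn_flag = churn_example.map({"Yes": 1, "No": 0})

print(len(all_columns), "columns in total")
print(churn_flag.tolist())
```

With the dataset loaded as `df_initial`, a group can then be selected with, for example, `df_initial[FEATURE_GROUPS["services"]]`.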