## Project: Telco Customer Churn Prediction

In the telecommunications industry, customers can choose among multiple service providers and actively switch from one operator to another. In this highly competitive market, the industry experiences an average annual churn rate of 15-25%.

- The Business Pain Point:
Acquiring a new customer is estimated to be 5 to 25 times more expensive than retaining an existing one. Therefore, for a Telco company, Customer Retention is the most critical strategy to maximize profit.

- The Goal:
The management wants to reduce customer churn by identifying customers who are likely to leave before they actually leave. If we can predict who is at risk, the marketing team can offer them special discounts or better plans to keep them.

2. Problem Statement

- Objective: Develop a Machine Learning solution to predict whether a customer will Churn (leave the company) or Stay based on their account information, demographic details, and service usage.

3. The Dataset

We will use the Telco Customer Churn dataset.

Source: [Kaggle - Telco Customer Churn](https://www.kaggle.com/datasets/blastchar/telco-customer-churn)



**Importing the Dependencies**

**Task: Reading and Exploring the Data**

**TASK: Use the .info() method to quickly confirm the datatypes and non-null counts in your dataframe.**

**TASK: Get a quick statistical summary of the numeric columns with .describe(). You should notice that many columns are categorical, meaning you will eventually need to convert them to dummy variables.**

## General Feature Exploration

**TASK: Confirm that there are no NaN cells by displaying NaN values per feature column.**
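One way to run the NaN check is to count missing cells per column with pandas. This is a minimal sketch on a toy dataframe with made-up values (including one deliberately missing cell), not the real dataset; the name `nan_counts` is only illustrative.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the churn dataframe (hypothetical values, not real data).
df = pd.DataFrame({
    "tenure": [1, 34, 2],
    "MonthlyCharges": [29.85, 56.95, np.nan],  # one missing cell on purpose
    "Churn": ["No", "No", "Yes"],
})

# Number of missing cells per feature column; all zeros means no NaN values.
nan_counts = df.isna().sum()
print(nan_counts)
```

On the real dataframe, a result of all zeros would confirm that no column contains NaN cells.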

**TASK: Display the balance of the class labels (Churn) with a Count Plot.**

**TASK: Explore the distribution of TotalCharges between Churn categories with a Box Plot or Violin Plot.**

**TASK: Create a boxplot showing the distribution of TotalCharges per Contract type, also add in a hue coloring based on the Churn class.**

**TASK: Create a bar plot showing the correlation of the following features to the class label. Keep in mind, for the categorical features, you will need to convert them into dummy variables first, as you can only calculate correlation for numeric features.**

    ['gender', 'SeniorCitizen', 'Partner', 'Dependents','PhoneService', 'MultipleLines', 
     'OnlineSecurity', 'OnlineBackup', 'DeviceProtection', 'TechSupport', 'InternetService',
       'StreamingTV', 'StreamingMovies', 'Contract', 'PaperlessBilling', 'PaymentMethod']

***Note: we specifically listed only the features above. You should not check the correlation for every feature, as some features, such as customerID, have too many unique values for such an analysis.***
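The dummy-variable step could be sketched as follows. The toy dataframe uses made-up values and only two of the listed features; the `Churn_Yes` and `Churn_No` column names follow from how `pd.get_dummies` names its output, and `churn_corr` is an illustrative variable name.

```python
import pandas as pd

# Toy stand-in for the churn dataframe (hypothetical values, not real data).
df = pd.DataFrame({
    "Contract": ["Month-to-month", "Month-to-month", "One year",
                 "Two year", "Month-to-month", "Two year"],
    "PaperlessBilling": ["Yes", "No", "No", "No", "Yes", "Yes"],
    "Churn": ["Yes", "Yes", "No", "No", "No", "No"],
})

# One-hot encode the categorical features together with the class label.
dummies = pd.get_dummies(df[["Contract", "PaperlessBilling", "Churn"]], dtype=int)

# Correlation of each dummy column with Churn_Yes, excluding the label's own columns.
churn_corr = (
    dummies.corr()["Churn_Yes"]
    .drop(["Churn_Yes", "Churn_No"])
    .sort_values()
)
print(churn_corr)
```

The sorted series can then be drawn as a bar plot, for example with `churn_corr.plot(kind="bar")`.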

# Churn Analysis

**This section focuses on segmenting customers based on their tenure, creating "cohorts", allowing us to examine differences between customer cohort segments.**

**TASK: What are the 3 contract types available?**

**TASK: Create a histogram displaying the distribution of the 'tenure' column, which is the number of months a customer was or has been with the company.**

**TASK: Now use the seaborn documentation as a guide to create histograms separated by two additional features, Churn and Contract.**

**TASK: Display a scatter plot of Total Charges versus Monthly Charges, and color hue by Churn.**

### Creating Cohorts based on Tenure

**Let's begin by treating each unique tenure length (1 month, 2 months, 3 months ... N months) as its own cohort.**

**TASK: Treating each unique tenure group as a cohort, calculate the Churn rate (percentage that had Yes Churn) per cohort. For example, the cohort that has had a tenure of 1 month should have a Churn rate of 61.99%. You should have cohorts 1-72 months, with a general trend that the longer the tenure of the cohort, the lower the churn rate. This makes sense, as you are less likely to stop service the longer you've had it.**
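A possible way to compute the per-cohort churn rate is a group-by on `tenure`. This sketch uses a tiny made-up dataframe, so its percentages will not match the 61.99% figure from the real data; it assumes `Churn` holds 'Yes'/'No' strings.

```python
import pandas as pd

# Toy stand-in for the churn dataframe (hypothetical values, not real data).
df = pd.DataFrame({
    "tenure": [1, 1, 1, 2, 2, 72, 72],
    "Churn": ["Yes", "Yes", "No", "Yes", "No", "No", "No"],
})

# Share of "Yes" labels within each tenure value, expressed as a percentage.
churn_rate = df["Churn"].eq("Yes").groupby(df["tenure"]).mean() * 100
print(churn_rate.round(2))
```

Because `churn_rate` is indexed by tenure, calling `churn_rate.plot()` is one simple way to view churn rate against months of tenure.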

**TASK: Now that you have Churn Rate per tenure group 1-72 months, create a plot showing churn rate per months of tenure.**

## Feature Engineering

### Broader Cohort Groups
**TASK: Based on the tenure column values, create a new column called Tenure Cohort that creates 4 separate categories:**
   * '0-12 Months'
   * '12-24 Months'
   * '24-48 Months'
   * 'Over 48 Months'
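The cohort column could be built with `pd.cut`, as in the sketch below. The task does not say which cohort the boundary values 12, 24, and 48 belong to, so this example assumes each bin includes its upper edge (12 falls in '0-12 Months'); adjust the bins if your intended boundaries differ.

```python
import pandas as pd

# Toy tenure values chosen to sit on and around the bin edges (not real data).
df = pd.DataFrame({"tenure": [1, 12, 13, 24, 25, 48, 49, 72]})

# Assumed edges: each bin includes its upper bound, so 12 -> '0-12 Months'.
bins = [0, 12, 24, 48, float("inf")]
labels = ["0-12 Months", "12-24 Months", "24-48 Months", "Over 48 Months"]

# include_lowest=True keeps a tenure of 0 inside the first cohort.
df["Tenure Cohort"] = pd.cut(
    df["tenure"], bins=bins, labels=labels, include_lowest=True
)
print(df)
```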

**Task**
- Create 'Family' column
- Logic: Partner (1/0) + Dependents (1/0)
- Result: 0 = Single, 1 = Has Partner OR Child, 2 = Has Both
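The Family logic can be sketched as the sum of two 0/1 indicators. This example assumes `Partner` and `Dependents` still hold 'Yes'/'No' strings at this point; if they are already encoded as 1/0, the two columns can simply be added.

```python
import pandas as pd

# Toy stand-in for the churn dataframe (hypothetical values, not real data).
df = pd.DataFrame({
    "Partner": ["No", "Yes", "No", "Yes"],
    "Dependents": ["No", "No", "Yes", "Yes"],
})

# Convert each Yes/No column to 1/0 and add them: 0, 1, or 2.
df["Family"] = (
    df["Partner"].eq("Yes").astype(int)
    + df["Dependents"].eq("Yes").astype(int)
)
print(df)
```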

**Task**
- Create 'ServiceCount' column (How many products or services do they buy?)
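The task leaves open which columns count as a service, so the sketch below makes an assumption: it counts every 'Yes' among the phone and add-on columns and adds one if the customer has any internet service. The list `service_cols` and that counting rule are illustrative choices, not a definition from the notebook.

```python
import pandas as pd

# Toy stand-in for the churn dataframe (hypothetical values, not real data).
df = pd.DataFrame({
    "PhoneService": ["Yes", "Yes", "No"],
    "MultipleLines": ["No", "Yes", "No phone service"],
    "InternetService": ["DSL", "No", "Fiber optic"],
    "OnlineSecurity": ["Yes", "No internet service", "Yes"],
    "OnlineBackup": ["No", "No internet service", "Yes"],
    "DeviceProtection": ["No", "No internet service", "Yes"],
    "TechSupport": ["No", "No internet service", "Yes"],
    "StreamingTV": ["No", "No internet service", "Yes"],
    "StreamingMovies": ["No", "No internet service", "Yes"],
})

# Assumed set of Yes/No service columns to count.
service_cols = [
    "PhoneService", "MultipleLines", "OnlineSecurity", "OnlineBackup",
    "DeviceProtection", "TechSupport", "StreamingTV", "StreamingMovies",
]

# Count the "Yes" entries per row, plus one if any internet service is present.
df["ServiceCount"] = (
    df[service_cols].eq("Yes").sum(axis=1)
    + df["InternetService"].ne("No").astype(int)
)
print(df["ServiceCount"])
```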

**TASK: Create a scatterplot of Total Charges versus Monthly Charges, colored by Tenure Cohort defined in the previous task.**

**TASK: Create a count plot showing the churn count per cohort.**

**Task: Label the target Column**
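Labeling the target could be as simple as mapping the two class names to integers. This sketch assumes `Churn` holds 'Yes'/'No' strings and that 1 should mean the customer churned.

```python
import pandas as pd

# Toy stand-in for the target column (hypothetical values, not real data).
df = pd.DataFrame({"Churn": ["No", "Yes", "No", "Yes"]})

# Encode the positive class (customer left) as 1 and the negative class as 0.
df["Churn"] = df["Churn"].map({"Yes": 1, "No": 0})
print(df["Churn"].value_counts())
```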

# Predictive Modeling

**Let's explore different classification-based methods**


**TASK: Import the libraries required for making pipelines, scaling, encoding, splitting the data and applying grid search**

**Task: Train Test Split**
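A stratified split keeps the churn ratio the same in the train and test sets, which matters when the classes are imbalanced. The sketch below uses synthetic data; the 20% test size and `random_state=42` are arbitrary assumptions rather than values prescribed by the task.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 20 customers, 25% of them labeled as churned.
df = pd.DataFrame({
    "tenure": range(1, 21),
    "MonthlyCharges": [20.0 + i for i in range(20)],
    "Churn": [1, 0, 0, 0] * 5,
})

X = df.drop(columns="Churn")  # features
y = df["Churn"]               # target label

# stratify=y preserves the class ratio in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
print(len(X_train), len(X_test))
```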

**Task: Preprocess the columns using ColumnTransformer**
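One common arrangement is to scale the numeric columns and one-hot encode the categorical ones inside a single `ColumnTransformer`. The column lists and the choice of `StandardScaler` and `OneHotEncoder` below are assumptions for illustration, applied to a toy feature frame.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy feature frame (hypothetical values, not real data).
X = pd.DataFrame({
    "tenure": [1, 34, 2, 45],
    "MonthlyCharges": [29.85, 56.95, 53.85, 42.30],
    "Contract": ["Month-to-month", "One year", "Month-to-month", "Two year"],
})

numeric_cols = ["tenure", "MonthlyCharges"]
categorical_cols = ["Contract"]

# Scale numeric features and one-hot encode categorical features.
preprocessor = ColumnTransformer(
    transformers=[
        ("num", StandardScaler(), numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ]
)

X_processed = preprocessor.fit_transform(X)
print(X_processed.shape)
```

Setting `handle_unknown="ignore"` lets the encoder tolerate a category at prediction time that was absent during fitting.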

**Task: Import all the required models and define the model parameters needed to pass to the Grid Search**

**Train the models using Pipelines and GridSearch and store the scores for comparison**

**Sort the scores by Recall, plot a bar plot for model comparison, and find the best model**
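The training and comparison steps could be sketched as a loop over candidate models, each wrapped in a pipeline and tuned with `GridSearchCV` using recall as the scoring metric. Everything here is illustrative: the data is synthetic with a made-up label rule, and the two models and their small parameter grids are assumptions, not the set the notebook requires.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data; the label rule is hypothetical and only ensures
# that both classes are present.
n = 60
df = pd.DataFrame({
    "tenure": [(i * 7) % 72 + 1 for i in range(n)],
    "Contract": ["Month-to-month" if i % 3 else "Two year" for i in range(n)],
})
df["Churn"] = (
    (df["tenure"] < 24) & (df["Contract"] == "Month-to-month")
).astype(int)

X, y = df.drop(columns="Churn"), df["Churn"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

preprocessor = ColumnTransformer([
    ("num", StandardScaler(), ["tenure"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["Contract"]),
])

# Assumed candidate models with small parameter grids.
candidates = {
    "LogisticRegression": (
        LogisticRegression(max_iter=1000), {"model__C": [0.1, 1.0]}
    ),
    "DecisionTree": (
        DecisionTreeClassifier(random_state=0), {"model__max_depth": [2, 4]}
    ),
}

rows = []
for name, (estimator, grid) in candidates.items():
    pipe = Pipeline([("prep", preprocessor), ("model", estimator)])
    # Tune each pipeline with 3-fold cross-validation, optimizing recall.
    search = GridSearchCV(pipe, grid, scoring="recall", cv=3)
    search.fit(X_train, y_train)
    rows.append({
        "model": name,
        "best_params": search.best_params_,
        "cv_recall": search.best_score_,
        "test_recall": recall_score(y_test, search.predict(X_test)),
    })

# Sort the stored scores by recall on the held-out test set, best first.
scores = (
    pd.DataFrame(rows)
    .sort_values("test_recall", ascending=False)
    .reset_index(drop=True)
)
print(scores[["model", "cv_recall", "test_recall"]])
```

The `scores` table can be passed to a bar plot, for example `sns.barplot(data=scores, x="model", y="test_recall")` with seaborn imported as `sns`. Recall is a reasonable ranking metric here because missing a customer who is about to churn is typically costlier than flagging one who would have stayed.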

## All the Best for your future Endeavours!!