# Churn Prediction Project

<p align="left">
  <img src="https://uruit.com/blog/wp-content/uploads/2020/11/Churn1-1024x724.jpg" width="600">
</p>

**Churn** occurs when customers stop using a company's services. Churn prediction means identifying the customers who are most likely to terminate their contracts in the near future. A company that can do this can offer those customers discounts or special deals in order to retain them.

Machine learning is a natural fit for this problem: we train a model on historical data about customers who have already left and use it to identify current customers who are likely to leave. This is a **binary classification** task. The target variable we want to predict is categorical and has only two possible outcomes: **will leave** or **will not leave**.
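As a minimal sketch of what a binary target looks like in practice, the snippet below encodes the two outcomes as 0 and 1. The five values are the `Churn` entries of the first five rows in the dataset preview further down; the names `churn`, `y`, and `churn_rate` are illustrative only.

```python
import pandas as pd

# Toy sample: the Churn values of the first five rows in the dataset preview.
churn = pd.Series(["No", "No", "Yes", "No", "Yes"], name="Churn")

# Encode the two outcomes as integers: 1 = will leave, 0 = will not leave.
y = (churn == "Yes").astype(int)

# The mean of a 0/1 target is the share of churned customers in the sample.
churn_rate = y.mean()

print(y.tolist())
print(f"Churn rate in the sample: {churn_rate:.2f}")
```

Five rows say nothing about the full dataset, but the same two lines apply unchanged to the whole `Churn` column.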


## Project Context and Goals

A telecommunications company is losing some of its customers, who churn and switch to competitors.  
Our aim is to develop a system to identify such users and offer them incentives that will encourage them to stay.  
  
We want to target these customers with our marketing messages and provide discounts. We would also like to understand why the model believes that certain customers are about to leave, and for that we need to be able to interpret its predictions.

We have collected a dataset that contains certain information about our customers: which services they used, how much they paid, and how long they stayed with us. We also know which customers churned, that is, terminated their contracts and stopped using our services. We will use this information as the target variable in a machine learning model and predict it using all the other available information.
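The split between target and predictors described above could be sketched as follows. The three rows reuse values from the dataset preview but keep only a few columns for brevity. Dropping `customerID` is an assumption made here (an identifier is unlikely to help prediction), not a step the text prescribes.

```python
import pandas as pd

# Small sample with a subset of the dataset's columns (values from the preview).
df_sample = pd.DataFrame({
    "customerID": ["7590-VHVEG", "5575-GNVDE", "3668-QPYBK"],
    "tenure": [1, 34, 2],
    "Contract": ["Month-to-month", "One year", "Month-to-month"],
    "MonthlyCharges": [29.85, 56.95, 53.85],
    "Churn": ["No", "No", "Yes"],
})

# Target: the column the model should predict.
y = df_sample["Churn"]

# Features: all remaining columns except the identifier (assumed uninformative).
X = df_sample.drop(columns=["customerID", "Churn"])

print(list(X.columns))
print(X.shape, y.shape)
```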

## Dataset


According to the description, the dataset contains the following information:

- **Customer services** — telephone service; multiple lines; Internet; technical support; and additional services such as online security, backup, device protection, and streaming TV;

- **Account information** — how long the customer has been with the company, contract type, and payment method;

- **Charges** — how much the customer paid for the last month and in total;

- **Demographic information** — gender, age, whether the customer has dependents or a partner;

- **Churn** — yes/no, whether the customer left the company during the last month.
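One way to keep these groups handy during analysis is a plain dictionary from group name to column names. The grouping below is an assumption based on the list above and the column names in the file. In particular, placing `PaperlessBilling` under account information is a judgment call, and `COLUMN_GROUPS` is a hypothetical name.

```python
# Assumed mapping of the dataset's columns to the groups listed above.
COLUMN_GROUPS = {
    "services": [
        "PhoneService", "MultipleLines", "InternetService",
        "OnlineSecurity", "OnlineBackup", "DeviceProtection",
        "TechSupport", "StreamingTV", "StreamingMovies",
    ],
    "account": ["tenure", "Contract", "PaperlessBilling", "PaymentMethod"],
    "charges": ["MonthlyCharges", "TotalCharges"],
    "demographics": ["gender", "SeniorCitizen", "Partner", "Dependents"],
    "target": ["Churn"],
}

# Flatten the mapping to check that no column is listed twice.
all_grouped = [col for cols in COLUMN_GROUPS.values() for col in cols]
assert len(all_grouped) == len(set(all_grouped))

# Together with the customerID identifier this covers every column in the file.
print(len(all_grouped) + 1, "columns including customerID")
```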


## Packages

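```python
# Numerical and tabular data handling
import numpy as np
import pandas as pd

# Plotting
import seaborn as sns
from matplotlib import pyplot as plt

# Show every column; width=None lets pandas auto-detect the display width
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)
```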

```python
# Load the dataset and preview the first rows
df = pd.read_csv("WA_Fn-UseC_-Telco-Customer-Churn.csv")
display(df.head())  # display() is provided by Jupyter/IPython
print(f"{len(df)} rows and {df.shape[1]} columns")
```

|   | customerID | gender | SeniorCitizen | Partner | Dependents | tenure | PhoneService | MultipleLines | InternetService | OnlineSecurity | OnlineBackup | DeviceProtection | TechSupport | StreamingTV | StreamingMovies | Contract | PaperlessBilling | PaymentMethod | MonthlyCharges | TotalCharges | Churn |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 7590-VHVEG | Female | 0 | Yes | No | 1 | No | No phone service | DSL | No | Yes | No | No | No | No | Month-to-month | Yes | Electronic check | 29.85 | 29.85 | No |
| 1 | 5575-GNVDE | Male | 0 | No | No | 34 | Yes | No | DSL | Yes | No | Yes | No | No | No | One year | No | Mailed check | 56.95 | 1889.5 | No |
| 2 | 3668-QPYBK | Male | 0 | No | No | 2 | Yes | No | DSL | Yes | Yes | No | No | No | No | Month-to-month | Yes | Mailed check | 53.85 | 108.15 | Yes |
| 3 | 7795-CFOCW | Male | 0 | No | No | 45 | No | No phone service | DSL | Yes | No | Yes | Yes | No | No | One year | No | Bank transfer (automatic) | 42.3 | 1840.75 | No |
| 4 | 9237-HQITU | Female | 0 | No | No | 2 | Yes | No | Fiber optic | No | No | No | No | No | No | Month-to-month | Yes | Electronic check | 70.7 | 151.65 | Yes |


7043 rows and 21 columns
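After loading, a few quick checks can confirm that the file was read as expected. The sketch below inlines the first three rows of the preview so that it runs without the CSV file. The specific checks (shape, unique identifiers, missing values) are suggestions, not steps taken from the notebook.

```python
import io

import pandas as pd

# First three rows of the preview, inlined so the sketch needs no file on disk.
csv_text = """customerID,gender,SeniorCitizen,Partner,Dependents,tenure,PhoneService,MultipleLines,InternetService,OnlineSecurity,OnlineBackup,DeviceProtection,TechSupport,StreamingTV,StreamingMovies,Contract,PaperlessBilling,PaymentMethod,MonthlyCharges,TotalCharges,Churn
7590-VHVEG,Female,0,Yes,No,1,No,No phone service,DSL,No,Yes,No,No,No,No,Month-to-month,Yes,Electronic check,29.85,29.85,No
5575-GNVDE,Male,0,No,No,34,Yes,No,DSL,Yes,No,Yes,No,No,No,One year,No,Mailed check,56.95,1889.5,No
3668-QPYBK,Male,0,No,No,2,Yes,No,DSL,Yes,Yes,No,No,No,No,Month-to-month,Yes,Mailed check,53.85,108.15,Yes
"""
sample = pd.read_csv(io.StringIO(csv_text))

# Each customer should appear only once.
ids_unique = sample["customerID"].is_unique

# Total number of missing cells across all columns.
n_missing = int(sample.isna().sum().sum())

print(sample.shape)
print("unique IDs:", ids_unique, "| missing cells:", n_missing)
```

On the real data, replace `sample` with `df`. A duplicated `customerID` or an unexpected column count would be worth investigating before modelling.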


#### Column definitions

- **customerID** — customer identifier;

- **gender** — male/female;

- **SeniorCitizen** — whether the customer is a senior citizen (0/1);

- **Partner** — whether the customer lives with a partner (yes/no);

- **Dependents** — whether the customer has dependents (yes/no);

- **tenure** — number of months since the contract started;

- **PhoneService** — whether the customer has phone service (yes/no);

- **MultipleLines** — whether the customer has multiple phone lines (yes/no/no phone service);

- **InternetService** — type of internet service (no/DSL/fiber optic);

- **OnlineSecurity** — whether online security is enabled (yes/no/no internet);

- **OnlineBackup** — whether online backup service is enabled (yes/no/no internet);

- **DeviceProtection** — whether device protection service is enabled (yes/no/no internet);

- **TechSupport** — whether the customer has technical support (yes/no/no internet);

- **StreamingTV** — whether streaming TV service is enabled (yes/no/no internet);

- **StreamingMovies** — whether streaming movie service is enabled (yes/no/no internet);

- **Contract** — type of contract (month-to-month/one year/two year);

- **PaperlessBilling** — whether paperless billing is enabled (yes/no);

- **PaymentMethod** — payment method (electronic check, mailed check, bank transfer, credit card);

- **MonthlyCharges** — monthly amount charged (numeric);

- **TotalCharges** — total amount charged (numeric);

- **Churn** — whether the customer terminated the contract (yes/no).
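The list describes `TotalCharges` as numeric. If the CSV contains blank or otherwise non-numeric entries in that column (an assumption here, since the preview above does not show any), pandas would load it as strings. A hedged sketch of how to check for and convert such a column, using made-up values:

```python
import pandas as pd

# Hypothetical values: two valid amounts and one blank string standing in
# for a missing total. These are not taken from the dataset.
total_charges = pd.Series(["29.85", "1889.5", " "], name="TotalCharges")

# errors="coerce" turns unparseable entries into NaN instead of raising.
total_numeric = pd.to_numeric(total_charges, errors="coerce")

print(total_numeric.dtype)
print("entries that failed to parse:", int(total_numeric.isna().sum()))
```

Running `df.dtypes` on the loaded data shows whether this conversion is needed at all. If every numeric column already has a numeric dtype, the step can be skipped.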
