# Customer Churn Prediction (CCP) Project - Telecom industry

## Introduction:
- The number of customers using telecommunication services, and the number of companies serving them, has grown enormously, which has raised the level of competition between companies.
- Each company tries to survive this competition through several strategies. The main ones are:
1) upsell existing customers,
2) increase how long customers are retained,
3) acquire new customers.
- Companies focus on retaining customers because existing customers are a source of profit, and keeping a customer is cheaper than acquiring a new one.
- Each company tries to keep its customers by making them more loyal.
- Customers are great ambassadors in the market: the company can rely on them to advertise its product or service.

In [None]:
import pandas as pd # data analysis and manipulation, CSV file I/O (e.g. pd.read_csv)
from matplotlib import pyplot as plt # static, animated, and interactive visualizations
import numpy as np # numerical arrays and linear algebra
%matplotlib inline
# Render plots inline in the notebook

In [None]:
df = pd.read_csv("../input/telco-customer-churn/WA_Fn-UseC_-Telco-Customer-Churn.csv")
df.sample(5)

In [None]:
df.drop('customerID' , axis = 'columns' , inplace = True)
df.dtypes

There is a problem with the TotalCharges column: its dtype is object (string), but it should be float64.
We must convert it from string to float.

In [None]:
df.TotalCharges.values

The values are stored as strings, which is wrong for a numeric column. Compare them with MonthlyCharges, which is already numeric:

In [None]:
df.MonthlyCharges.values

In [None]:
pd.to_numeric(df.TotalCharges)

This raises a ValueError because some of the values contain only a blank space (" ").
First, we must deal with these blank values.
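As a minimal sketch of why the conversion fails, the snippet below uses a small toy Series (an assumption standing in for `df.TotalCharges`, not the real data) that contains one blank entry. It shows the `ValueError` and one way to count the blank entries.

```python
import pandas as pd

# Toy stand-in for df.TotalCharges (hypothetical values): numeric strings plus one blank
toy = pd.Series(["29.85", " ", "108.15"], dtype=object)

try:
    pd.to_numeric(toy)  # default errors='raise'
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)

# Entries that are empty once surrounding whitespace is stripped
blank_mask = toy.str.strip() == ""
print("Blank entries:", blank_mask.sum())
```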

In [None]:
pd.to_numeric(df.TotalCharges, errors='coerce') 

The <b>pd.to_numeric</b> function accepts an <b>errors</b> parameter: <b>errors{‘ignore’, ‘raise’, ‘coerce’}, default ‘raise’</b>.

If <b>‘raise’</b>, then invalid parsing will raise an exception.

If <b>‘coerce’</b>, then invalid parsing will be set as NaN (Not a Number). For example:
<br>
'apple' → NaN
<br>
'1.3' → 1.3
<br>
If <b>‘ignore’</b>, then invalid parsing will return the input unchanged (this option is deprecated as of pandas 2.2). For example:
<br>
0    apple
<br>
1      1.0
<br>
2        2
<br>
3       -3
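To make the `'coerce'` behaviour concrete, here is a small sketch on a toy Series (the values are made up for illustration and are not taken from the dataset). Unparseable entries, including a blank string, become NaN, and the result has a float dtype.

```python
import pandas as pd

# Toy input: two parseable strings, one word, one blank (hypothetical values)
toy = pd.Series(["apple", "1.3", " ", "2"], dtype=object)

coerced = pd.to_numeric(toy, errors="coerce")
print(coerced)

# isna() marks exactly the entries that could not be parsed
print(coerced.isna())
```

This is the same pattern the next cells use: `isnull()` on the coerced column gives a boolean mask that selects the problem rows.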

In [None]:
pd.to_numeric(df.TotalCharges, errors='coerce').isnull()

In [None]:
df[pd.to_numeric(df.TotalCharges, errors='coerce').isnull()]

Scroll right in the output above to see that TotalCharges is blank for these rows.

In [None]:
df[pd.to_numeric(df.TotalCharges, errors='coerce').isnull()].shape # number of rows that will be removed

In [None]:
df.shape

The dataset has 7043 rows, so removing these 11 rows has a negligible effect on it.
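A quick arithmetic check of that claim, using only the two counts quoted above (7043 rows in total, 11 rows with a blank TotalCharges):

```python
n_total = 7043  # rows in the dataset, from df.shape
n_blank = 11    # rows with a blank TotalCharges value

share = n_blank / n_total
remaining = n_total - n_blank

print(f"Share of rows removed: {share:.2%}")
print(f"Rows remaining: {remaining}")
```

The removed rows make up well under 1% of the data.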

In [None]:
df.iloc[753]['TotalCharges'] # spot check: this row has a blank TotalCharges value

In [None]:
df1 = df[df.TotalCharges != ' '].copy() # copy so later column assignments do not trigger SettingWithCopyWarning
df1.shape # new DataFrame without the 11 rows that have a blank TotalCharges value

In [None]:
df1.dtypes

In [None]:
df1.TotalCharges = pd.to_numeric(df1.TotalCharges)

In [None]:
df1.dtypes

TotalCharges is now float64.
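An alternative that could replace the filter-then-convert steps is to coerce and drop in one chain. This is a sketch on a toy DataFrame (hypothetical values; only the column names come from the dataset), not the approach the notebook itself takes.

```python
import pandas as pd

# Toy frame with one blank TotalCharges entry (hypothetical values)
toy_df = pd.DataFrame({
    "tenure": [1, 0, 34],
    "TotalCharges": pd.Series(["29.85", " ", "1889.5"], dtype=object),
})

clean = (
    toy_df
    # Blank or otherwise unparseable values become NaN
    .assign(TotalCharges=pd.to_numeric(toy_df["TotalCharges"], errors="coerce"))
    # Drop only the rows where TotalCharges could not be parsed
    .dropna(subset=["TotalCharges"])
)

print(clean.dtypes)
print(clean.shape)
```

Because `assign` returns a new DataFrame, this variant avoids assigning into a filtered slice.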

## Let's do some visualizations to see our data clearly

<b>For a company, losing a loyal customer is a big loss, because keeping a customer costs much less than bringing in a new one with an expensive or attractive offer.
- It is important to know how many loyal customers leave the company, and why, because this is one of the strongest indicators of business loss.
</b>

In [None]:
df1[df1.Churn == 'No'] # customers who have not left the business yet

<b>
How long a customer stays with the company indicates how loyal they are.
We can assume that a customer who stays for more than 20 months is a loyal customer.
    </b>
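The 20-month assumption can be sketched as a boolean flag. The toy data and the `is_loyal` column name below are hypothetical; only `tenure`, `Churn`, and the threshold come from the text above.

```python
import pandas as pd

LOYALTY_THRESHOLD_MONTHS = 20  # assumption stated above: more than 20 months means loyal

# Toy data (hypothetical values) using the dataset's column names
toy = pd.DataFrame({
    "tenure": [2, 25, 40, 10, 30],
    "Churn": ["Yes", "No", "No", "No", "Yes"],
})

# is_loyal is an illustrative column, not part of the original dataset
toy["is_loyal"] = toy["tenure"] > LOYALTY_THRESHOLD_MONTHS

# Share of loyal customers who churned
loyal_churn_rate = (toy.loc[toy["is_loyal"], "Churn"] == "Yes").mean()

print(toy)
print("Churn rate among loyal customers:", loyal_churn_rate)
```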

In [None]:
tenure_churn_no = df1[df1.Churn =='No'].tenure
tenure_churn_no # tenure (months with the company) of customers who have not left

In [None]:
tenure_churn_yes = df1[df1.Churn =='Yes'].tenure
tenure_churn_yes # tenure (months with the company) of customers who left within the last month

In [None]:
tenure_churn_no = df1[df1.Churn =='No'].tenure
tenure_churn_yes = df1[df1.Churn =='Yes'].tenure

plt.xlabel("Tenure (Months)")
plt.ylabel("Number of Customers")
plt.title("Customer Churn Prediction Visualization")
plt.hist([tenure_churn_yes, tenure_churn_no], color = ['red', 'green'] , label= ['Churn=Yes', 'Churn=No'])
plt.legend()