# CUSTOMER CHURN ANALYSIS PROJECT

This project analyzes customer churn behavior to identify the key factors that influence customer attrition and to provide actionable insights for retention strategies.


## Data Loading

In [37]:
import pandas as pd

# Load the Telco customer churn dataset and preview the first five rows
df = pd.read_csv("WA_Fn-UseC_-Telco-Customer-Churn.csv")
df.head()

,customerID,gender,SeniorCitizen,Partner,Dependents,tenure,PhoneService,MultipleLines,InternetService,OnlineSecurity,...,DeviceProtection,TechSupport,StreamingTV,StreamingMovies,Contract,PaperlessBilling,PaymentMethod,MonthlyCharges,TotalCharges,Churn
0,7590-VHVEG,Female,0,Yes,No,1,No,No phone service,DSL,No,...,No,No,No,No,Month-to-month,Yes,Electronic check,29.85,29.85,No
1,5575-GNVDE,Male,0,No,No,34,Yes,No,DSL,Yes,...,Yes,No,No,No,One year,No,Mailed check,56.95,1889.5,No
2,3668-QPYBK,Male,0,No,No,2,Yes,No,DSL,Yes,...,No,No,No,No,Month-to-month,Yes,Mailed check,53.85,108.15,Yes
3,7795-CFOCW,Male,0,No,No,45,No,No phone service,DSL,Yes,...,Yes,Yes,No,No,One year,No,Bank transfer (automatic),42.3,1840.75,No
4,9237-HQITU,Female,0,No,No,2,Yes,No,Fiber optic,No,...,No,No,No,No,Month-to-month,Yes,Electronic check,70.7,151.65,Yes


## Data Overview

In [38]:
df.shape

(7043, 21)

In [40]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7043 entries, 0 to 7042
Data columns (total 21 columns):
 #   Column            Non-Null Count  Dtype  
---  ------            --------------  -----  
 0   customerID        7043 non-null   object 
 1   gender            7043 non-null   object 
 2   SeniorCitizen     7043 non-null   int64  
 3   Partner           7043 non-null   object 
 4   Dependents        7043 non-null   object 
 5   tenure            7043 non-null   int64  
 6   PhoneService      7043 non-null   object 
 7   MultipleLines     7043 non-null   object 
 8   InternetService   7043 non-null   object 
 9   OnlineSecurity    7043 non-null   object 
 10  OnlineBackup      7043 non-null   object 
 11  DeviceProtection  7043 non-null   object 
 12  TechSupport       7043 non-null   object 
 13  StreamingTV       7043 non-null   object 
 14  StreamingMovies   7043 non-null   object 
 15  Contract          7043 non-null   object 
 16  PaperlessBilling  7043 non-null   object 


In [41]:
df.describe()

,SeniorCitizen,tenure,MonthlyCharges
count,7043.0,7043.0,7043.0
mean,0.162147,32.371149,64.761692
std,0.368612,24.559481,30.090047
min,0.0,0.0,18.25
25%,0.0,9.0,35.5
50%,0.0,29.0,70.35
75%,0.0,55.0,89.85
max,1.0,72.0,118.75


## Data Cleaning

In [42]:
# Convert TotalCharges to numeric (some values may be blank)
df["TotalCharges"] = pd.to_numeric(df["TotalCharges"], errors="coerce")

# Remove rows with missing values
df = df.dropna()

# Check if missing values remain
df.isnull().sum()

customerID          0
gender              0
SeniorCitizen       0
Partner             0
Dependents          0
tenure              0
PhoneService        0
MultipleLines       0
InternetService     0
OnlineSecurity      0
OnlineBackup        0
DeviceProtection    0
TechSupport         0
StreamingTV         0
StreamingMovies     0
Contract            0
PaperlessBilling    0
PaymentMethod       0
MonthlyCharges      0
TotalCharges        0
Churn               0
dtype: int64
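
The cleaning step above can hide how many rows it removes. The counts reported elsewhere in this notebook (7043 rows at load, 5163 + 1869 = 7032 rows in the churn distribution) imply that 11 rows were dropped. The sketch below shows one way to make that visible, using a small toy frame with hypothetical values rather than the real dataset. The names `toy`, `n_missing`, `cleaned` and `rows_dropped` are illustrative assumptions, and `dropna(subset=...)` is a suggested variation, not what the cell above does.

```python
import pandas as pd

# Toy frame (hypothetical values) mimicking blank TotalCharges entries.
toy = pd.DataFrame({
    "customerID": ["A", "B", "C", "D"],
    "tenure": [0, 12, 0, 3],
    "TotalCharges": [" ", "240.5", " ", "61.2"],
})

# Blank strings cannot be parsed, so errors="coerce" turns them into NaN.
toy["TotalCharges"] = pd.to_numeric(toy["TotalCharges"], errors="coerce")

rows_before = len(toy)
n_missing = int(toy["TotalCharges"].isna().sum())

# Restrict the drop to the column being cleaned so that missing values
# in unrelated columns do not silently remove additional rows.
cleaned = toy.dropna(subset=["TotalCharges"])
rows_dropped = rows_before - len(cleaned)

print(f"{n_missing} unparseable values, {rows_dropped} rows dropped")
```

Logging the number of dropped rows next to the cleaning step makes it easier to confirm that the loss is small relative to the dataset.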

## Exploratory Data Analysis (EDA)

### Churn Distribution

In [43]:
df["Churn"].value_counts()

Churn
No     5163
Yes    1869
Name: count, dtype: int64

In [44]:
df["Churn"].value_counts(normalize=True)

Churn
No     0.734215
Yes    0.265785
Name: proportion, dtype: float64

##### Churn Rate Analysis
Approximately 26.6% of customers have churned.

This represents a moderate churn rate, indicating that more than one-quarter of the customer base has left the company. Further analysis is required to identify the main drivers of customer attrition.


### Churn by Contract Type

In [45]:
pd.crosstab(df["Contract"], df["Churn"], normalize="index")

Contract,Churn=No,Churn=Yes
Month-to-month,0.572903,0.427097
One year,0.887228,0.112772
Two year,0.971513,0.028487


##### Churn by Contract Type
Customers with month-to-month contracts show a much higher churn rate (approximately 42.7%) than those with longer-term contracts.

Customers on one-year contracts have a churn rate of about 11.3%, while customers on two-year contracts have a very low churn rate of about 2.8%.

This suggests that longer contract commitments are strongly associated with lower customer attrition.
Encouraging customers to switch from monthly plans to long-term contracts could be an effective retention strategy.
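
To put the gap between contract types on a single scale, the churn rates from the crosstab output above can be expressed as ratios against a reference group. The sketch below is a minimal illustration of that arithmetic. The choice of two-year contracts as the baseline and the names `churn_rate`, `baseline` and `relative_risk` are assumptions made for this example.

```python
# Churn rates copied from the crosstab output above.
churn_rate = {
    "Month-to-month": 0.427097,
    "One year": 0.112772,
    "Two year": 0.028487,
}

baseline = "Two year"  # reference group, chosen for illustration

# Relative risk: how many times higher a group's churn rate is than the baseline's.
relative_risk = {
    contract: rate / churn_rate[baseline]
    for contract, rate in churn_rate.items()
}

for contract, rr in relative_risk.items():
    print(f"{contract}: {rr:.1f}x the churn rate of {baseline} customers")
```

On these figures, month-to-month customers churn at roughly 15 times the rate of two-year customers. This is a descriptive ratio only: it does not show that contract length causes the difference.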

### Churn by Customer Tenure

In [46]:
df.groupby("Churn")["tenure"].mean()

Churn
No     37.650010
Yes    17.979133
Name: tenure, dtype: float64

##### Churn by Customer Tenure
Customers who churned have an average tenure of approximately 18 months, while retained customers have an average tenure of about 37.7 months.

This indicates that customers are more likely to leave during earlier stages of their relationship with the company. Customer loyalty appears to increase over time.

Focusing on improving early customer experience and engagement may help reduce churn significantly.
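
A difference in mean tenure does not show where in the customer lifecycle churn is concentrated. One possible follow-up is to group tenure into bands and compute the churn rate within each band. The sketch below uses hypothetical toy data with the same column names as `df`. The cut points in `bins` and the names `tenure_band` and `churn_by_band` are illustrative assumptions, not values taken from this analysis.

```python
import pandas as pd

# Toy data (hypothetical) with the same column names as the notebook's df.
toy = pd.DataFrame({
    "tenure": [1, 3, 8, 14, 20, 30, 40, 55, 60, 72],
    "Churn": ["Yes", "Yes", "No", "Yes", "No", "No", "No", "Yes", "No", "No"],
})

# Band edges in months; these cut points are illustrative only.
bins = [0, 12, 24, 48, 72]
labels = ["0-12", "13-24", "25-48", "49-72"]
toy["tenure_band"] = pd.cut(
    toy["tenure"], bins=bins, labels=labels, include_lowest=True
)

# Share of churned customers within each tenure band.
churn_by_band = (
    toy["Churn"].eq("Yes")
    .groupby(toy["tenure_band"], observed=True)
    .mean()
)
print(churn_by_band)
```

Applied to the real data, a table like this would show whether churn falls steadily with tenure or is concentrated in the first months.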

### Churn by Monthly Charges

In [47]:
df.groupby("Churn")["MonthlyCharges"].mean()

Churn
No     61.307408
Yes    74.441332
Name: MonthlyCharges, dtype: float64

##### Churn by Monthly Charges
Customers who churned pay higher monthly charges on average (approximately 74.4 USD) than retained customers (approximately 61.3 USD).

This suggests that higher pricing may contribute to customer attrition.
Customers with more expensive plans appear more likely to leave, possibly due to perceived lack of value or affordability concerns.

### At-Risk Customer Profile

In [48]:
pd.crosstab(
    [df["Contract"], df["InternetService"], df["PaymentMethod"]],
    df["Churn"],
    normalize="index"
).sort_values(by="Yes", ascending=False).head(10) 

Contract,InternetService,PaymentMethod,Churn=No,Churn=Yes
Month-to-month,Fiber optic,Electronic check,0.396327,0.603673
Month-to-month,Fiber optic,Mailed check,0.492537,0.507463
Month-to-month,Fiber optic,Bank transfer (automatic),0.544343,0.455657
Month-to-month,Fiber optic,Credit card (automatic),0.583618,0.416382
Month-to-month,DSL,Electronic check,0.594937,0.405063
Month-to-month,DSL,Mailed check,0.692098,0.307902
Month-to-month,DSL,Credit card (automatic),0.72973,0.27027
One year,Fiber optic,Electronic check,0.739796,0.260204
Month-to-month,No,Mailed check,0.793846,0.206154
Month-to-month,No,Bank transfer (automatic),0.8,0.2


##### At-Risk Customer Profile
Customers with month-to-month contracts, fiber optic internet service, and the electronic check payment method have the highest churn rate (approximately 60%).

This indicates that customers without long-term commitment and using non-automatic payment methods are more likely to leave.
Targeted retention strategies for this segment could significantly reduce overall churn.
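
A row-normalized crosstab reports rates but not how many customers each segment contains, so a very small segment can appear at the top of the ranking. The sketch below shows one way to report segment size alongside churn rate and to filter out segments that are too small to trust. It uses hypothetical toy data. The threshold `min_customers` and the names `segments` and `at_risk` are assumptions made for this example.

```python
import pandas as pd

# Toy data (hypothetical) using the notebook's column names.
toy = pd.DataFrame({
    "Contract": ["Month-to-month"] * 6 + ["Two year"] * 2,
    "InternetService": ["Fiber optic"] * 5 + ["DSL"] * 3,
    "PaymentMethod": ["Electronic check"] * 5 + ["Mailed check"] * 3,
    "Churn": ["Yes", "Yes", "Yes", "No", "No", "Yes", "No", "No"],
})

segment_cols = ["Contract", "InternetService", "PaymentMethod"]
min_customers = 2  # illustrative threshold; tune to the dataset size

# One row per segment, with its size and its share of churned customers.
segments = (
    toy.assign(churned=toy["Churn"].eq("Yes"))
    .groupby(segment_cols)
    .agg(customers=("churned", "size"), churn_rate=("churned", "mean"))
)

# Keep only segments large enough for the rate to be meaningful, then rank.
at_risk = (
    segments[segments["customers"] >= min_customers]
    .sort_values("churn_rate", ascending=False)
)
print(at_risk)
```

In the toy data, a one-customer segment with a 100% churn rate is excluded, which leaves the five-customer segment at the top of the ranking.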

### Key Insights
- Approximately 26.6% of customers have churned.
- Month-to-month contracts exhibit significantly higher churn rates compared to long-term contracts.
- Customers with higher monthly charges are more likely to churn.
- Short-tenure customers show greater churn risk than long-term customers.
- Customers using electronic check payment methods demonstrate higher churn rates.
- The highest-risk segment consists of customers with month-to-month contracts, fiber optic internet service, and electronic check payment method.

### Conclusion
The analysis suggests that customer churn is most strongly associated with contract type, pricing and customer tenure. Month-to-month customers, those with higher monthly charges and newer customers are considerably more likely to leave.

Encouraging long-term contracts, offering competitive pricing and improving early customer engagement may help reduce churn and increase customer retention.