# 📊 Customer Churn Prediction: Exploratory Data Analysis (EDA)

This notebook explores the Telco Customer Churn dataset to understand customer behavior and identify key factors contributing to churn.

The insights from this analysis will guide feature engineering and model selection in later stages of the project.

In [2]:
# Import Libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

plt.style.use("default")

ModuleNotFoundError: No module named 'pandas'

In [None]:
# Load Dataset

df = pd.read_csv("../data/raw/WA_Fn-UseC_-Telco-Customer-Churn.csv")
df.head()

In [None]:
# Dataset Overview

# Only the last expression in a cell is displayed, so print the shape explicitly.
print(df.shape)
df.info()

In [None]:
# Target Variable Distribution

# Print the class percentages; a bare expression in the middle of a cell is not displayed.
print(df['Churn'].value_counts(normalize=True) * 100)

sns.countplot(x='Churn', data=df)
plt.title("Churn Distribution")
plt.show()

In [None]:
# Missing Values Check

df.isnull().sum()
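
`isnull()` only counts values that pandas already parsed as missing. In the commonly distributed version of this CSV, `TotalCharges` is typically read as text and a few rows hold blank strings, which the check above would not flag. The sketch below shows one way such hidden blanks could be surfaced. It assumes that column layout and uses a small made-up frame (`toy`, hypothetical values) in place of `df`.

```python
import pandas as pd

# Toy stand-in for df; the values are invented for illustration only.
toy = pd.DataFrame({
    "tenure": [1, 0, 24],
    "TotalCharges": ["29.85", " ", "1889.50"],
})

# isnull() reports nothing here because " " is a valid (non-null) string.
missing_before = int(toy["TotalCharges"].isnull().sum())

# Coerce to numeric: anything unparseable, including blanks, becomes NaN.
toy["TotalCharges"] = pd.to_numeric(toy["TotalCharges"], errors="coerce")
missing_after = int(toy["TotalCharges"].isnull().sum())

print(missing_before, missing_after)
```

If the same conversion on the real `df` produces new NaN values, they would need to be handled during the cleaning stage.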

In [None]:
# Summary Statistics

df.describe()

In [None]:
# Churn vs Numerical Features

numerical_cols = ['tenure', 'MonthlyCharges']

for col in numerical_cols:
    sns.boxplot(x='Churn', y=col, data=df)
    plt.title(f"{col} vs Churn")
    plt.show()
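
The boxplots show the shape of each distribution. A grouped summary puts numbers next to them. The following is a minimal sketch on a made-up frame (`toy`, hypothetical values), not output from the actual dataset.

```python
import pandas as pd

# Toy stand-in for df; the values are invented for illustration only.
toy = pd.DataFrame({
    "tenure": [30, 50, 40, 2, 6],
    "Churn": ["No", "No", "No", "Yes", "Yes"],
})

# Median of the numeric feature within each churn group.
median_tenure = toy.groupby("Churn")["tenure"].median()
print(median_tenure)
```

On the real data, the same call with `numerical_cols` in place of `"tenure"` would return one row per churn class and one column per feature.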

In [None]:
# Churn vs Categorical Features

categorical_cols = ['Contract', 'PaymentMethod', 'InternetService']

for col in categorical_cols:
    sns.countplot(x=col, hue='Churn', data=df)
    plt.xticks(rotation=45)
    plt.title(f"{col} vs Churn")
    plt.show()
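
Count plots show absolute counts, so a large category can look churn-heavy simply because it has more customers. A row-normalised cross-tabulation gives the churn rate within each category instead. The sketch below uses a made-up frame (`toy`, hypothetical values), so its numbers say nothing about the real dataset.

```python
import pandas as pd

# Toy stand-in for df; the values are invented for illustration only.
toy = pd.DataFrame({
    "Contract": ["Month-to-month", "Month-to-month", "One year", "Two year"],
    "Churn": ["Yes", "No", "No", "No"],
})

# normalize="index" makes each row sum to 1, so after scaling by 100
# the "Yes" column is the churn rate (in percent) within that category.
churn_rate = pd.crosstab(toy["Contract"], toy["Churn"], normalize="index") * 100
print(churn_rate.round(1))
```

Looping this over `categorical_cols` on the real `df` would put rates next to each of the plots above.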

In [None]:
# Correlation Analysis

df_encoded = df.copy()
df_encoded['Churn'] = df_encoded['Churn'].map({'Yes': 1, 'No': 0})

corr = df_encoded[['tenure', 'MonthlyCharges', 'Churn']].corr()

sns.heatmap(corr, annot=True, cmap="coolwarm")
plt.title("Correlation Heatmap")
plt.show()

## 🔍 Key Insights from EDA

- Customers with shorter tenure are more likely to churn.
- Month-to-month contracts show higher churn rates.
- Higher monthly charges correlate with increased churn.
- Certain payment methods are associated with higher churn risk.

These insights will guide feature engineering and model selection.

## 📌 Summary & Next Steps

The exploratory analysis revealed several strong indicators of churn,
particularly related to contract type, tenure, and monthly charges.

In the next phase, the dataset will be cleaned, encoded, and transformed
into a model-ready format, followed by feature engineering.
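
As a rough illustration of what the encoding step might look like, the sketch below maps the target to 0/1 and one-hot encodes a categorical column with `pd.get_dummies`. It is an assumption about the approach, not the project's actual pipeline, and it runs on a made-up frame (`toy`, hypothetical values).

```python
import pandas as pd

# Toy stand-in for df; the values are invented for illustration only.
toy = pd.DataFrame({
    "tenure": [1, 34, 2],
    "Contract": ["Month-to-month", "One year", "Month-to-month"],
    "Churn": ["No", "No", "Yes"],
})

# Binary target: Yes -> 1, No -> 0 (same mapping as the correlation cell).
y = toy["Churn"].map({"Yes": 1, "No": 0})

# One-hot encode the categorical predictor; drop_first removes one
# redundant dummy column per encoded feature.
X = pd.get_dummies(toy.drop(columns="Churn"), columns=["Contract"], drop_first=True)

print(X.columns.tolist())
print(y.tolist())
```

Whether to use `drop_first` depends on the model chosen later. Linear models usually benefit from it, while tree-based models generally do not need it.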