# End-to-End Customer Churn Project (Telco Dataset)

In this notebook we develop an end-to-end customer churn prediction project using
the IBM Telco Customer Churn dataset.

We will follow a typical data science workflow:

1. **Business understanding** – What is churn and why do we care?
2. **Data understanding** – Explore the dataset and basic patterns.
3. **Data cleaning & preprocessing** – Handle data types, missing values, and encodings.
4. **Exploratory data analysis (EDA)** – Understand how features relate to churn.
5. **Feature engineering & modelling** – Build models to predict churn.
6. **Evaluation & model comparison** – Compare models with appropriate metrics.
7. **Interpretation & business insights** – Translate model results into actions.
8. **(Optional) Model export** – Save the best model for later use.

The target variable is **Churn**, which tells us whether a customer left the
company (Yes) or stayed (No). Our goal is to build a model that predicts
which customers are at high risk of churn, and to understand the key drivers
behind churn.

---


## 2. Imports and Configuration

In this section we import all the libraries that will be used throughout the
project and set some basic configuration options.

We will use:

- `pandas` and `numpy` for data manipulation.
- `matplotlib` and `seaborn` for visualization.
- `scikit-learn` for preprocessing, modelling, and evaluation.
- `typing` for explicit type hints.

We also set a random seed to make results reproducible.


In [None]:
from __future__ import annotations

from pathlib import Path
from typing import Dict, List, Tuple

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.compose import ColumnTransformer
from sklearn.metrics import (
    accuracy_score,
    classification_report,
    confusion_matrix,
    roc_auc_score,
    RocCurveDisplay,
)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

# Set plotting style and global options
sns.set_theme(style="whitegrid")  # set_theme is the current name; sns.set is an older alias
plt.rcParams["figure.figsize"] = (8, 5)

RANDOM_STATE: int = 42
np.random.seed(RANDOM_STATE)

DATA_PATH: Path = Path("data") / "WA_Fn-UseC_-Telco-Customer-Churn.csv"

# Quick sanity check: fail early if the file is not present.
if not DATA_PATH.exists():
    raise FileNotFoundError(
        f"Data file not found at {DATA_PATH.resolve()}. "
        "Please download the Telco churn CSV from Kaggle and place it under the 'data/' directory."
    )


## 3. Business Understanding

### What is churn?

**Customer churn** (or attrition) happens when a customer stops using a service.
In telecom, churn often means cancelling a subscription, which directly impacts
recurring revenue.

### Why predict churn?

- Acquiring new customers is usually more expensive than retaining existing ones.
- If we can **identify high-risk customers early**, we can:
  - Offer discounts or special plans.
  - Improve service quality for specific segments.
  - Prioritize customer support resources.

### Our goal

Given customer information (contract type, tenure, payment method, monthly
charges, etc.), we want to:

1. Predict whether a customer is likely to churn in the near future.
2. Understand which factors are associated with higher churn risk.

This allows the business to design targeted **retention campaigns** instead of
generic, expensive actions.

---

**Section summary**

We framed churn as a business problem: predicting and understanding churn helps
the company protect recurring revenue by targeting retention efforts effectively.


## 4. Data Loading and Initial Inspection

In this section we:

1. Load the Telco churn dataset from CSV.
2. Inspect the first rows to understand the structure.
3. Check column types, missing values, and basic statistics.

We are mainly interested in:

- The **target variable**: `Churn`.
- The **features**: customer demographics, services, contract type, tenure,
  monthly and total charges, etc.


In [None]:
def load_telco_data(path: Path) -> pd.DataFrame:
    """Load the Telco customer churn dataset from a CSV file.

    Args:
        path: Path to the CSV file.

    Returns:
        pandas DataFrame containing the Telco churn data.

    Raises:
        FileNotFoundError: If the file does not exist.
        ValueError: If the resulting DataFrame is empty.
    """
    if not path.exists():
        raise FileNotFoundError(f"File not found: {path!s}")

    df: pd.DataFrame = pd.read_csv(path)

    if df.empty:
        raise ValueError(f"Loaded DataFrame is empty: {path!s}")

    return df


telco_df: pd.DataFrame = load_telco_data(DATA_PATH)

# Basic inspection
display(telco_df.head())
telco_df.info()  # info() prints its report directly and returns None
display(telco_df.describe(include="all").T)


### Section summary

We successfully loaded the Telco churn dataset and inspected its structure.
We now know:

- The dataset has one row per customer.
- The target `Churn` indicates whether the customer left (Yes) or stayed (No).
- Many features are categorical (e.g. `gender`, `Contract`, `PaymentMethod`).
- A few are numeric (e.g. `tenure`, `MonthlyCharges`). `TotalCharges` should be
  numeric too, but it is read as text at this stage; we fix that in the next step.

Next we will clean and preprocess the data to make it suitable for modelling.


## 5. Data Cleaning and Basic Preprocessing

The main goals in this step:

1. **Handle data types** – some numeric columns may be stored as strings.
2. **Identify and handle missing values**.
3. **Remove obvious data issues**, like duplicate customers if any.

For the IBM Telco dataset, a known issue is that `TotalCharges` contains blank
strings for a handful of customers with zero tenure, so pandas reads the whole
column as text. We will:

- Convert `TotalCharges` to numeric.
- Coerce errors to `NaN`, then decide how to handle them.
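
To see what coercion does before applying it to the real column, here is a minimal sketch on a toy frame. The values are made up for illustration and are not taken from the dataset. Filling with `0.0` is shown only as one possible alternative to dropping rows, under the assumption that a customer with zero tenure has not been billed yet.

```python
import pandas as pd

# Toy data (hypothetical values): one blank TotalCharges for a tenure-0 customer.
toy = pd.DataFrame(
    {
        "tenure": [0, 12, 24],
        "TotalCharges": [" ", "358.20", "1200.50"],
    }
)

# Blank or otherwise non-numeric strings become NaN instead of raising an error.
toy["TotalCharges"] = pd.to_numeric(toy["TotalCharges"], errors="coerce")

# Inspect the affected rows before deciding how to handle them.
missing_rows = toy[toy["TotalCharges"].isna()]
print(missing_rows)

# Alternative to dropping (an assumption, not the notebook's choice):
# treat unbilled customers as having paid nothing so far.
toy_filled = toy.fillna({"TotalCharges": 0.0})
print(toy_filled["TotalCharges"].tolist())
```

Dropping the rows, as the cleaning function below does, is simpler; filling keeps the newest customers in the training data. Either choice is defensible as long as it is applied consistently.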


In [None]:
def clean_telco_data(raw_df: pd.DataFrame) -> pd.DataFrame:
    """Clean the Telco dataset: fix data types, handle missing values, and drop duplicates.

    Args:
        raw_df: Original Telco churn DataFrame.

    Returns:
        Cleaned DataFrame.

    Raises:
        ValueError: If critical columns are missing.
    """
    df = raw_df.copy()

    required_columns: List[str] = ["customerID", "Churn", "TotalCharges"]
    missing_required: List[str] = [c for c in required_columns if c not in df.columns]
    if missing_required:
        raise ValueError(f"Missing required columns: {missing_required}")

    # Convert TotalCharges to numeric (spaces or non-numeric values -> NaN)
    df["TotalCharges"] = pd.to_numeric(df["TotalCharges"], errors="coerce")

    # Count missing values per column
    missing_counts: pd.Series = df.isna().sum()
    print("Missing values per column:")
    display(missing_counts[missing_counts > 0])

    # In this dataset, missing TotalCharges correspond to very new customers.
    # We will drop rows with missing TotalCharges to simplify the modelling.
    before_rows: int = df.shape[0]
    df = df.dropna(subset=["TotalCharges"])
    after_rows: int = df.shape[0]
    print(f"Dropped {before_rows - after_rows} rows with missing TotalCharges.")

    # Drop duplicate customer IDs if any (should be very rare)
    before_rows = df.shape[0]
    df = df.drop_duplicates(subset=["customerID"])
    after_rows = df.shape[0]
    print(f"Dropped {before_rows - after_rows} duplicate customerID rows.")

    # Optional: reset index
    df = df.reset_index(drop=True)

    return df


telco_df_clean: pd.DataFrame = clean_telco_data(telco_df)
display(telco_df_clean.head())


### Section summary

We cleaned the dataset by:

- Converting `TotalCharges` to numeric and dropping rows with missing values.
- Removing any duplicate `customerID` entries.
- Resetting the index.

The data is now in a more consistent state. Next we will perform exploratory
data analysis to understand the churn patterns.


## 6. Exploratory Data Analysis (EDA)

### 6.1 Churn distribution (class balance)

We start by understanding the distribution of the target variable `Churn`:

- How many customers churned vs. stayed?
- Is the dataset imbalanced?

This will guide our choice of metrics and model evaluation strategy.


In [None]:
def plot_churn_distribution(df: pd.DataFrame, target_col: str = "Churn") -> None:
    """Plot the distribution of the churn target variable.

    Args:
        df: Clean Telco DataFrame.
        target_col: Name of the churn column.

    Raises:
        KeyError: If the target column is not in the DataFrame.
    """
    if target_col not in df.columns:
        raise KeyError(f"Column {target_col!r} not found in DataFrame.")

    churn_counts: pd.Series = df[target_col].value_counts()
    churn_rate: float = (churn_counts.get("Yes", 0) / churn_counts.sum()) * 100.0

    print(f"Churn rate: {churn_rate:.2f}%")
    ax = sns.countplot(data=df, x=target_col)
    ax.set_title("Churn distribution")
    ax.bar_label(ax.containers[0])
    plt.show()


plot_churn_distribution(telco_df_clean)


### Section summary

We observed the number of churned vs. non-churned customers and computed the
overall churn rate.

For this dataset the churn rate is roughly 26–27% (about one customer in four), which means:
- The classes are **somewhat imbalanced**, but not extremely.
- We should pay attention to metrics like **recall** for the churned class,
  not only overall accuracy.
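
The point about accuracy can be made concrete with a small sketch. The labels below are hypothetical (25 churners out of 100), chosen only to mimic a churn rate of about one in four. A model that never predicts churn still looks decent on accuracy while finding no churners at all.

```python
import numpy as np
from sklearn.metrics import accuracy_score, recall_score

# Hypothetical labels: 25 churners (1) out of 100 customers.
y_true = np.array([1] * 25 + [0] * 75)

# A "model" that always predicts the majority class (no churn).
y_pred = np.zeros_like(y_true)

accuracy = accuracy_score(y_true, y_pred)
churn_recall = recall_score(y_true, y_pred, pos_label=1)

print(f"Accuracy: {accuracy:.2f}")          # 0.75
print(f"Churn recall: {churn_recall:.2f}")  # 0.00
```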


### 6.2 Numerical features vs churn

We now explore how numeric features such as:

- `tenure` (months as a customer),
- `MonthlyCharges`,
- `TotalCharges`

relate to churn. We look at distributions split by churn status and compare
typical values between churned and non-churned customers.


In [None]:
numeric_cols: List[str] = ["tenure", "MonthlyCharges", "TotalCharges"]

# Check that these columns exist
for col in numeric_cols:
    if col not in telco_df_clean.columns:
        raise KeyError(f"Expected numeric column {col!r} not found in DataFrame.")

fig, axes = plt.subplots(1, len(numeric_cols), figsize=(18, 4))

for ax, col in zip(axes, numeric_cols):
    sns.kdeplot(
        data=telco_df_clean,
        x=col,
        hue="Churn",
        common_norm=False,
        fill=True,
        alpha=0.5,
        ax=ax,
    )
    ax.set_title(f"{col} distribution by churn")


plt.tight_layout()
plt.show()


### Section summary

From the numeric feature distributions we usually see patterns like:

- Churned customers often have **shorter tenure** (they leave earlier).
- Churned customers may have **higher monthly charges**.
- Total charges can be lower for churned customers (they leave early, so they
  have not paid for many months).

These patterns support the business intuition: new or high-cost customers may
be more fragile and at higher risk.
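
The density plots show shapes; to compare typical values directly, a grouped summary is a useful complement. The sketch below uses a toy frame with made-up values standing in for `telco_df_clean`; only the column names are taken from the notebook.

```python
import pandas as pd

# Toy data (hypothetical values) standing in for telco_df_clean.
toy = pd.DataFrame(
    {
        "Churn": ["Yes", "Yes", "No", "No", "No"],
        "tenure": [2, 6, 30, 48, 60],
        "MonthlyCharges": [90.0, 80.0, 50.0, 60.0, 70.0],
    }
)

# Median per churn group; the median is less sensitive to skew than the mean.
medians = toy.groupby("Churn")[["tenure", "MonthlyCharges"]].median()
print(medians)
```

On the real data the same call would be `telco_df_clean.groupby("Churn")[numeric_cols].median()`.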


### 6.3 Categorical features vs churn

We now look at selected categorical features and their relationship with churn:

- `Contract` – Month-to-month vs. one or two year contracts.
- `InternetService` – DSL, Fiber optic, or no internet service.
- `PaymentMethod` – e.g. electronic check, credit card, bank transfer.

We compute churn rates per category to see where churn is higher.


In [None]:
def compute_churn_rate_by_category(
    df: pd.DataFrame, category_col: str, target_col: str = "Churn"
) -> pd.DataFrame:
    """Compute churn rate for each category in a given column.

    Args:
        df: Clean Telco DataFrame.
        category_col: Name of the categorical column.
        target_col: Churn column, expected values 'Yes'/'No'.

    Returns:
        DataFrame with counts and churn rate per category.

    Raises:
        KeyError: If requested columns are not present.
    """
    for col in (category_col, target_col):
        if col not in df.columns:
            raise KeyError(f"Column {col!r} not found in DataFrame.")

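    grouped = (
        df.groupby(category_col)[target_col]
        .value_counts()
        .unstack(fill_value=0)
        # Guarantee both columns exist even if one value never occurs in the data.
        .reindex(columns=["No", "Yes"], fill_value=0)
        .rename(columns={"Yes": "ChurnYes", "No": "ChurnNo"})
    )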
    grouped["Total"] = grouped["ChurnYes"] + grouped["ChurnNo"]
    grouped["ChurnRate"] = grouped["ChurnYes"] / grouped["Total"]
    return grouped.sort_values("ChurnRate", ascending=False)


categorical_to_inspect: List[str] = ["Contract", "InternetService", "PaymentMethod"]

for col in categorical_to_inspect:
    print(f"\n=== Churn rate by {col} ===")
    summary_df = compute_churn_rate_by_category(telco_df_clean, col)
    display(summary_df)

    ax = sns.barplot(
        data=summary_df.reset_index(),
        x=col,
        y="ChurnRate",
        order=summary_df.index,
    )
    ax.set_title(f"Churn rate by {col}")
    ax.set_ylabel("Churn rate")
    ax.set_xlabel(col)
    ax.set_ylim(0, 1)
    ax.bar_label(ax.containers[0], fmt="%.2f")
    plt.xticks(rotation=30)
    plt.tight_layout()
    plt.show()


### Section summary

We inspected churn rates for key categorical variables.

Typical findings:

- **Month-to-month contracts** often have much higher churn than longer-term
  contracts.
- Certain internet types (e.g. fiber) may be associated with higher churn.
- Some payment methods (e.g. electronic check) can correlate with higher churn.

These insights hint at potential **intervention points**, such as:
- Encouraging customers to move from month-to-month to longer contracts.
- Investigating service quality for high-churn segments.
- Adjusting payment experience for risky payment methods.
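
As a quick cross-check of per-category churn rates, `pd.crosstab` with row normalisation gives the same rates in one call. This is a sketch on a toy frame with made-up rows, not a replacement for the helper above.

```python
import pandas as pd

# Toy data (hypothetical rows): two contract types, four customers each.
toy = pd.DataFrame(
    {
        "Contract": ["Month-to-month"] * 4 + ["Two year"] * 4,
        "Churn": ["Yes", "Yes", "No", "No", "No", "No", "No", "Yes"],
    }
)

# Row-normalised crosstab: each row sums to 1, so the "Yes" column is the churn rate.
rates = pd.crosstab(toy["Contract"], toy["Churn"], normalize="index")
print(rates["Yes"].sort_values(ascending=False))
```

These rates describe association only. A segment with high churn may differ from others in several ways at once, so any intervention is best validated with an experiment.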


## 7. Train–Test Split and Feature Preprocessing

Before building models, we need to:

1. Separate **features (X)** and **target (y)**.
2. Split the data into **training** and **test** sets.
3. Define preprocessing:
   - One-hot encode categorical variables.
   - Scale numerical variables (for models like logistic regression).

We will:

- Drop `customerID` (identifier, not useful for prediction).
- Use `ColumnTransformer` + `Pipeline` from scikit-learn to keep preprocessing
  and modelling together in a clean way.


In [None]:
# Separate target and features
TARGET_COL: str = "Churn"

if TARGET_COL not in telco_df_clean.columns:
    raise KeyError(f"Target column {TARGET_COL!r} not found in DataFrame.")

X: pd.DataFrame = telco_df_clean.drop(columns=[TARGET_COL, "customerID"])
y: pd.Series = telco_df_clean[TARGET_COL].map({"No": 0, "Yes": 1})  # binary 0/1

# Identify categorical and numerical columns
categorical_cols: List[str] = [
    col for col in X.columns if X[col].dtype == "O"
]
numeric_cols: List[str] = [
    col for col in X.columns if col not in categorical_cols
]

print("Categorical columns:", categorical_cols)
print("Numeric columns:", numeric_cols)

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X,
    y,
    test_size=0.2,
    stratify=y,
    random_state=RANDOM_STATE,
)

print("Train size:", X_train.shape, "Test size:", X_test.shape)

# Define preprocessing transformer
numeric_transformer = Pipeline(
    steps=[
        ("scaler", StandardScaler()),
    ]
)

categorical_transformer = Pipeline(
    steps=[
        ("encoder", OneHotEncoder(handle_unknown="ignore")),
    ]
)

preprocessor = ColumnTransformer(
    transformers=[
        ("num", numeric_transformer, numeric_cols),
        ("cat", categorical_transformer, categorical_cols),
    ]
)


### Section summary

We:

- Prepared features and target, mapping `Churn` to 0/1.
- Performed a stratified train–test split (preserving the churn ratio).
- Defined a preprocessing pipeline that:
  - Scales numerical features.
  - One-hot encodes categorical features.

Next we will build baseline and more advanced models using this preprocessing.
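
To illustrate what the preprocessing does at prediction time, here is a minimal sketch on toy data. The frames and the name `toy_preprocessor` are hypothetical. It shows two things: scaling statistics and category lists are learned from the training rows only, and `handle_unknown="ignore"` encodes a category that was never seen in training as all zeros instead of raising an error.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy data (hypothetical values). The test row has a contract type unseen in training.
train = pd.DataFrame(
    {"tenure": [1, 10, 19], "Contract": ["Monthly", "Yearly", "Monthly"]}
)
test = pd.DataFrame({"tenure": [10], "Contract": ["Two year"]})

toy_preprocessor = ColumnTransformer(
    transformers=[
        ("num", StandardScaler(), ["tenure"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["Contract"]),
    ]
)

# Mean, standard deviation, and the category list come from the training rows only.
toy_preprocessor.fit(train)

encoded = toy_preprocessor.transform(test)
# The result may be sparse or dense depending on its density; densify for printing.
encoded = encoded.toarray() if hasattr(encoded, "toarray") else encoded
print(encoded)  # scaled tenure, then one column per training category
```

The test tenure equals the training mean, so it scales to zero, and the unseen contract type yields zeros in both one-hot columns.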


## 8. Modelling

We will compare three models:

1. **Baseline** – `DummyClassifier` that predicts the majority class.
2. **Logistic Regression** – simple, interpretable linear model.
3. **Random Forest** – non-linear, tree-based ensemble.

### 8.1 Evaluation helper

First we define a helper function that:

- Fits a model pipeline.
- Computes accuracy on the train and test sets, plus ROC-AUC on the test set.
- Prints a classification report and plots the confusion matrix and ROC curve.


In [None]:
from sklearn.base import BaseEstimator


def evaluate_classifier(
    name: str,
    model: BaseEstimator,
    X_train: pd.DataFrame,
    X_test: pd.DataFrame,
    y_train: pd.Series,
    y_test: pd.Series,
) -> Dict[str, object]:
    """Fit a classifier and evaluate it on train and test data.

    Args:
        name: Model name (for printing).
        model: Unfitted sklearn estimator or pipeline.
        X_train: Training features.
        X_test: Test features.
        y_train: Training labels (0/1).
        y_test: Test labels (0/1).

    Returns:
        Dictionary with the model name, train/test accuracy, and test ROC-AUC.

    Raises:
        ValueError: If y_train or y_test contain values outside {0, 1}.
    """
    # Type / value checks
    unique_y_train = set(y_train.unique())
    unique_y_test = set(y_test.unique())
    allowed_values = {0, 1}
    if not unique_y_train.issubset(allowed_values) or not unique_y_test.issubset(
        allowed_values
    ):
        raise ValueError("y_train and y_test should contain only 0/1 values.")

    print(f"\n===== {name} =====")
    model.fit(X_train, y_train)

    y_pred_train = model.predict(X_train)
    y_pred_test = model.predict(X_test)

    # Some models may not support predict_proba; handle gracefully
    if hasattr(model, "predict_proba"):
        y_proba_test = model.predict_proba(X_test)[:, 1]
        roc_auc = roc_auc_score(y_test, y_proba_test)
    else:
        y_proba_test = None
        roc_auc = np.nan

    acc_train = accuracy_score(y_train, y_pred_train)
    acc_test = accuracy_score(y_test, y_pred_test)

    print(f"Train accuracy: {acc_train:.3f}")
    print(f"Test accuracy:  {acc_test:.3f}")
    if not np.isnan(roc_auc):
        print(f"Test ROC-AUC:  {roc_auc:.3f}")

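    print("\nClassification report (test):")
    # zero_division=0 avoids warnings when a class is never predicted
    # (as happens with the dummy baseline).
    print(
        classification_report(
            y_test,
            y_pred_test,
            target_names=["No churn", "Churn"],
            zero_division=0,
        )
    )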

    # Confusion matrix
    cm = confusion_matrix(y_test, y_pred_test)
    sns.heatmap(
        cm,
        annot=True,
        fmt="d",
        cmap="Blues",
        xticklabels=["Pred No", "Pred Yes"],
        yticklabels=["True No", "True Yes"],
    )
    plt.title(f"Confusion matrix - {name}")
    plt.ylabel("True label")
    plt.xlabel("Predicted label")
    plt.show()

    # ROC curve if probabilities available
    if y_proba_test is not None:
        RocCurveDisplay.from_predictions(y_test, y_proba_test)
        plt.title(f"ROC curve - {name}")
        plt.show()

    return {
        "model": name,
        "train_accuracy": acc_train,
        "test_accuracy": acc_test,
        "roc_auc": float(roc_auc),  # NaN when predict_proba is unavailable
    }


### 8.2 Baseline model: Dummy classifier

A baseline helps us assess whether our complex models are actually useful.

We use a `DummyClassifier` that always predicts the most frequent class
(usually "No churn"). Any reasonable model should clearly outperform this
baseline, especially in recall for the churn class.


In [None]:
baseline_clf = Pipeline(
    steps=[
        ("preprocess", preprocessor),
        ("clf", DummyClassifier(strategy="most_frequent", random_state=RANDOM_STATE)),
    ]
)

baseline_metrics = evaluate_classifier(
    name="Baseline (Most Frequent)",
    model=baseline_clf,
    X_train=X_train,
    X_test=X_test,
    y_train=y_train,
    y_test=y_test,
)
baseline_metrics


### 8.3 Logistic Regression

Logistic regression is:

- A linear model for binary classification.
- Interpretable via its coefficients.
- A good first non-trivial model.

We train it on the preprocessed features and evaluate it on the test set.


In [None]:
log_reg_clf = Pipeline(
    steps=[
        ("preprocess", preprocessor),
        (
            "clf",
            # n_jobs is omitted: it has no effect for a binary problem with the default solver.
            LogisticRegression(
                max_iter=1000,
                random_state=RANDOM_STATE,
            ),
        ),
    ]
)

log_reg_metrics = evaluate_classifier(
    name="Logistic Regression",
    model=log_reg_clf,
    X_train=X_train,
    X_test=X_test,
    y_train=y_train,
    y_test=y_test,
)
log_reg_metrics


### 8.4 Random Forest

Random Forest is an ensemble of decision trees:

- Captures non-linear relationships and interactions.
- Provides feature importance measures.
- Often a strong baseline for tabular data.

We train a reasonably sized forest (not heavily tuned) and compare its
performance against logistic regression and the dummy baseline.


In [None]:
rf_clf = Pipeline(
    steps=[
        ("preprocess", preprocessor),
        (
            "clf",
            RandomForestClassifier(
                n_estimators=200,
                max_depth=None,
                min_samples_split=4,
                min_samples_leaf=2,
                random_state=RANDOM_STATE,
                n_jobs=-1,
            ),
        ),
    ]
)

rf_metrics = evaluate_classifier(
    name="Random Forest",
    model=rf_clf,
    X_train=X_train,
    X_test=X_test,
    y_train=y_train,
    y_test=y_test,
)
rf_metrics


### 8.5 Model comparison

We now put the main metrics side-by-side for:

- Baseline dummy model
- Logistic Regression
- Random Forest


In [None]:
results_df = pd.DataFrame([baseline_metrics, log_reg_metrics, rf_metrics])
display(results_df)


### Section summary

We trained and evaluated three models:

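- **Dummy baseline** – accuracy equal to the majority-class share, zero recall for the
  churn class, and a ROC-AUC of 0.5, as expected for a constant prediction.
- **Logistic Regression** – a clear improvement over the baseline in accuracy, churn
  recall, and ROC-AUC, with interpretable coefficients.
- **Random Forest** – can capture non-linear effects and interactions at the cost of
  interpretability; whether it beats logistic regression on ROC-AUC or churn recall
  should be read from the results above rather than assumed.

The random seed is fixed, so the exact numbers should differ only slightly across
library versions. A good model should substantially outperform the dummy classifier,
especially on the churn (positive) class.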
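
A single train–test split gives one estimate per model. When two models score close to each other, a cross-validated score is a steadier basis for comparison. The sketch below runs on synthetic data from `make_classification` with roughly 27% positives; the data and the names `X_toy`, `y_toy`, and `toy_pipeline` are assumptions for illustration, not part of the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the churn data (about 27% positives, an assumption).
X_toy, y_toy = make_classification(
    n_samples=500,
    n_features=8,
    n_informative=4,
    weights=[0.73, 0.27],
    random_state=42,
)

toy_pipeline = Pipeline(
    steps=[("scaler", StandardScaler()), ("clf", LogisticRegression(max_iter=1000))]
)

# Stratified folds keep the class ratio similar in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(toy_pipeline, X_toy, y_toy, cv=cv, scoring="roc_auc")

print(f"ROC-AUC per fold: {scores.round(3)}")
print(f"Mean: {scores.mean():.3f}, std: {scores.std():.3f}")
```

In this notebook the same call could take `log_reg_clf` or `rf_clf` together with `X_train` and `y_train`, since each pipeline refits its preprocessing inside every fold.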


## 9. Feature Importance and Interpretation

Understanding *why* the model predicts churn is as important as the predictive
performance. Here we:

1. Extract feature importances from the Random Forest.
2. Map them back to the original feature names (after one-hot encoding).
3. Interpret the most influential drivers of churn.

This helps translate model results into actionable business insights.


In [None]:
def get_feature_names_from_preprocessor(
    preprocessor: ColumnTransformer,
) -> List[str]:
    """Extract the final feature names after the ColumnTransformer.

    Args:
        preprocessor: Fitted ColumnTransformer.

    Returns:
        List of feature names corresponding to the transformed columns.
    """
    feature_names: List[str] = []

    # Numeric features: the scaler keeps one output column per input column.
    num_features: List[str] = preprocessor.named_transformers_["num"].named_steps[
        "scaler"
    ].get_feature_names_out(numeric_cols).tolist()
    feature_names.extend(num_features)

    # Categorical features (OneHotEncoder produces multiple columns)
    ohe: OneHotEncoder = preprocessor.named_transformers_["cat"].named_steps["encoder"]
    cat_features: List[str] = ohe.get_feature_names_out(categorical_cols).tolist()
    feature_names.extend(cat_features)

    return feature_names


# evaluate_classifier already fitted rf_clf; refitting here keeps this cell runnable on its own.
rf_clf.fit(X_train, y_train)

# Extract the underlying RF model and preprocessor
fitted_preprocessor: ColumnTransformer = rf_clf.named_steps["preprocess"]  # type: ignore[assignment]
fitted_rf: RandomForestClassifier = rf_clf.named_steps["clf"]  # type: ignore[assignment]

rf_feature_importances: np.ndarray = fitted_rf.feature_importances_
rf_feature_names: List[str] = get_feature_names_from_preprocessor(fitted_preprocessor)

# Build a DataFrame of importances
importance_df = (
    pd.DataFrame(
        {
            "feature": rf_feature_names,
            "importance": rf_feature_importances,
        }
    )
    .sort_values("importance", ascending=False)
    .head(20)
)

display(importance_df)

plt.figure(figsize=(8, 6))
sns.barplot(data=importance_df, x="importance", y="feature")
plt.title("Top 20 feature importances (Random Forest)")
plt.tight_layout()
plt.show()


### Section summary

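The feature importance plot highlights which variables the Random Forest relies
on most. These are impurity-based importances, which tend to favour continuous
and high-cardinality features, so read the ranking as a guide rather than a
precise measurement. Typical important drivers include: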

- Contract type (e.g. month-to-month vs. long-term).
- Tenure.
- Monthly and total charges.
- Internet-related services and additional options.

These results align with our EDA and business intuition. They can guide:

- Targeted offers (e.g. discounts for high-risk segments).
- Product changes (e.g. improving problematic services).
- Contract design (e.g. incentives to move away from high-risk contract types).
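
One way to cross-check the ranking is permutation importance, which measures how much a held-out score drops when a single column is shuffled. This is a different technique from the `feature_importances_` attribute used above. The sketch below uses synthetic data with one informative column and one pure-noise column; all names in it are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data: "signal" drives the label, "noise" is unrelated to it.
rng = np.random.default_rng(42)
n_rows = 600
signal = rng.normal(size=n_rows)
noise = rng.normal(size=n_rows)
X_toy = np.column_stack([signal, noise])
y_toy = (signal + 0.3 * rng.normal(size=n_rows) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.3, stratify=y_toy, random_state=42
)

forest = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_tr, y_tr)

# Shuffle each column on held-out data and record the drop in ROC-AUC.
result = permutation_importance(
    forest, X_te, y_te, scoring="roc_auc", n_repeats=10, random_state=42
)
for name, mean_drop in zip(["signal", "noise"], result.importances_mean):
    print(f"{name}: {mean_drop:.3f}")
```

Applied to this notebook, one option would be to pass the whole `rf_clf` pipeline with `X_test` and `y_test`. The shuffling then happens on the original columns, so the result is one score per input feature rather than one per one-hot column.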


## 10. Business Insights and Next Steps

### Key findings

From our analysis and models, we learned that:

1. **Churn rate** is non-trivial (roughly a quarter of customers), so
   retention strategies are important.
2. **Short-tenure, month-to-month customers** are at much higher risk.
3. Customers with **higher monthly charges** also show increased churn, which may
   reflect price sensitivity or perceived value issues.
4. Certain **services and payment methods** correlate with higher churn.

### Possible actions

Based on these findings, the company could consider:

- **Onboarding and early-life programs** for new customers to increase engagement
  and satisfaction.
- **Targeted offers** (discounts, service bundles) for high-risk segments indicated
  by the model.
- **Reviewing pricing and plan structure** for customers with very high monthly
  charges.
- **Monitoring service quality** for configurations with high churn rates.

### Model improvements and future work

To make this churn model production-ready, we could:

- Tune hyperparameters using cross-validation (GridSearchCV/RandomizedSearchCV).
- Try additional algorithms (Gradient Boosting, XGBoost, etc.).
- Use **calibrated probabilities** and choose thresholds based on business costs
  (e.g. cost of contacting a customer vs. cost of losing them).
- Implement regular retraining and monitoring (data drift, performance decay).
- Combine with time-based or survival models to predict **time-to-churn**.
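
The idea of choosing a threshold from business costs can be sketched as follows. Everything numeric here is an assumption for illustration: the two cost constants, the toy labels, and the toy probabilities. The sketch also assumes, optimistically, that every contacted churner is retained.

```python
import numpy as np

# Hypothetical costs (assumptions, not derived from the dataset).
COST_CONTACT = 10.0  # cost of one retention offer
COST_LOST = 100.0    # cost of a churner who was not contacted


def total_cost(y_true: np.ndarray, y_proba: np.ndarray, threshold: float) -> float:
    """Cost of contacting every customer whose churn probability >= threshold."""
    contacted = y_proba >= threshold
    missed_churners = (~contacted) & (y_true == 1)
    return float(COST_CONTACT * contacted.sum() + COST_LOST * missed_churners.sum())


# Toy labels and predicted churn probabilities (hypothetical values).
y_true = np.array([0, 0, 0, 0, 1, 0, 1, 1])
y_proba = np.array([0.05, 0.12, 0.22, 0.35, 0.45, 0.55, 0.75, 0.95])

thresholds = np.linspace(0.1, 0.9, 9)
costs = [total_cost(y_true, y_proba, t) for t in thresholds]
best_threshold = float(thresholds[int(np.argmin(costs))])

print(f"Best threshold: {best_threshold:.1f}, cost: {min(costs):.0f}")
```

Because a missed churner costs ten times as much as an offer in this toy setup, the cheapest threshold sits below the default of 0.5. In practice the threshold should be chosen on a validation set rather than the test set, so the final test estimate stays unbiased.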

---

**Overall summary**

We built a complete churn prediction workflow:

- Framed the business problem.
- Explored and cleaned the Telco churn dataset.
- Performed EDA to understand patterns of churn.
- Built and compared several models.
- Interpreted model outputs to derive business-oriented insights.

This notebook can serve as a solid template for churn analysis in other
industries (banking, subscriptions, SaaS, etc.) by swapping in the relevant
dataset and adjusting the features accordingly.
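
The optional model-export step from the workflow outline is not implemented in the cells above. A minimal sketch using `joblib` could look like the following. It trains a tiny synthetic pipeline as a stand-in for the fitted churn pipeline; the file name and all variable names are hypothetical.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Tiny synthetic stand-in for the fitted churn pipeline.
rng = np.random.default_rng(42)
X_toy = rng.normal(size=(100, 3))
y_toy = (X_toy[:, 0] > 0).astype(int)

toy_pipeline = Pipeline(
    steps=[("scaler", StandardScaler()), ("clf", LogisticRegression())]
).fit(X_toy, y_toy)

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "churn_pipeline.joblib"  # hypothetical file name
    joblib.dump(toy_pipeline, model_path)  # saves preprocessing and model together
    restored = joblib.load(model_path)

# The restored pipeline should reproduce the original predictions.
same_predictions = bool(
    (restored.predict(X_toy) == toy_pipeline.predict(X_toy)).all()
)
print(f"Predictions match after reload: {same_predictions}")
```

Saving the whole pipeline keeps preprocessing and model in one artifact. Two caveats apply: load such files only from trusted sources, and reload them with the same scikit-learn version that produced them.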
