# Bank Customer Churn: Modeling

## Purpose of this notebook
This notebook focuses on training and evaluating machine learning models
for customer churn prediction.

The main goals are:
- Load preprocessed datasets and artifacts
- Train a strong baseline model
- Evaluate model performance using appropriate metrics
- Establish a reference point for more advanced models

All modeling steps rely on artifacts created in `02_preprocessing.ipynb`.
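
As a rough illustration of the workflow outlined above, the sketch below fits a logistic regression baseline and scores it with ROC AUC and average precision. It uses synthetic data from `make_classification` as a stand-in for the real artifacts; the sample size, class weights, and `max_iter` value are assumptions chosen for illustration, not settings taken from this project.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import train_test_split

RANDOM_STATE = 42

# Synthetic, imbalanced stand-in for the churn data (assumption: ~20% positives).
X_demo, y_demo = make_classification(
    n_samples=2000,
    n_features=10,
    n_informative=5,
    weights=[0.8, 0.2],
    random_state=RANDOM_STATE,
)

# Stratify so both splits keep the same class ratio.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, stratify=y_demo, random_state=RANDOM_STATE
)

baseline = LogisticRegression(max_iter=1000, random_state=RANDOM_STATE)
baseline.fit(X_tr, y_tr)

# Score with the predicted probability of the positive class.
proba = baseline.predict_proba(X_te)[:, 1]
demo_roc_auc = roc_auc_score(y_te, proba)
demo_avg_precision = average_precision_score(y_te, proba)

print(f"ROC AUC: {demo_roc_auc:.3f}")
print(f"Average precision: {demo_avg_precision:.3f}")
```

Both metrics are computed from probabilities rather than hard labels, so they do not depend on a particular decision threshold.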


In [4]:
# Core libraries
import numpy as np
import pandas as pd

# System utilities
from pathlib import Path

# Modeling
from sklearn.linear_model import LogisticRegression

# Evaluation
from sklearn.metrics import (
    roc_auc_score,
    average_precision_score,
    classification_report,
    confusion_matrix,
    RocCurveDisplay,
    PrecisionRecallDisplay,
)

# Reproducibility
RANDOM_STATE = 42


## Load Preprocessing Artifacts

In this notebook we load preprocessing outputs generated in
`02_preprocessing.ipynb`.

Artifacts are expected under `/content/artifacts` in the current Colab
runtime, so `02_preprocessing.ipynb` must be run in the same runtime first.
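
Before loading, it can help to see exactly which files are missing rather than only whether the directory exists. The following is a minimal sketch of such a check; the helper name `find_missing_artifacts` is hypothetical, and the file list assumes the four Parquet files read in the loading cell are the only ones required.

```python
from pathlib import Path

# Assumption: these four file names mirror the ones read in the loading cell.
EXPECTED_ARTIFACTS = (
    "X_train.parquet",
    "X_test.parquet",
    "y_train.parquet",
    "y_test.parquet",
)


def find_missing_artifacts(artifacts_dir, expected=EXPECTED_ARTIFACTS):
    """Return the expected file names that are absent from artifacts_dir."""
    artifacts_dir = Path(artifacts_dir)
    return [name for name in expected if not (artifacts_dir / name).is_file()]


missing = find_missing_artifacts("/content/artifacts")
if missing:
    print("Missing artifacts:", ", ".join(missing))
else:
    print("All expected artifacts are present.")
```

A nonexistent directory is reported the same way as an empty one: every expected file is listed as missing.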


In [5]:
from pathlib import Path
import pandas as pd

ARTIFACTS_DIR = Path("/content/artifacts")

assert ARTIFACTS_DIR.exists(), (
    "Artifacts directory not found. "
    "Run 02_preprocessing.ipynb in the same Colab runtime first."
)

print("Using artifacts directory:", ARTIFACTS_DIR.resolve())

# Feature matrices produced by the preprocessing notebook
X_train = pd.read_parquet(ARTIFACTS_DIR / "X_train.parquet")
X_test  = pd.read_parquet(ARTIFACTS_DIR / "X_test.parquet")

# Targets: select the `churn` column so each target is a Series
y_train = pd.read_parquet(ARTIFACTS_DIR / "y_train.parquet")["churn"]
y_test  = pd.read_parquet(ARTIFACTS_DIR / "y_test.parquet")["churn"]

# Quick shape check
X_train.shape, X_test.shape



AssertionError: Artifacts directory not found. Run 02_preprocessing.ipynb in the same Colab runtime first.

In [None]:
X_train.head(), y_train.value_counts(normalize=True)
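
The class proportions shown by `value_counts(normalize=True)` give a useful reference for reading average precision later: a model with no ranking ability scores roughly the positive-class share. The sketch below illustrates this on toy labels; `y_toy` and the constant scores are made up for illustration and are not derived from the churn data.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import average_precision_score

# Toy labels (assumption): 2 churners out of 10 customers.
y_toy = pd.Series([0] * 8 + [1] * 2, name="churn")

prevalence = y_toy.mean()

# A model that gives every customer the same score has no ranking ability.
constant_scores = np.full(len(y_toy), 0.5)
no_skill_ap = average_precision_score(y_toy, constant_scores)

print(f"Positive-class prevalence: {prevalence:.2f}")
print(f"Average precision of constant scores: {no_skill_ap:.2f}")
```

Any average precision reported for a real model is best read against this prevalence floor rather than against 0.5.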
