# Customer Churn — Final Model & Evaluation

## Objective
Train, evaluate, and interpret the final selected model.

This notebook is part of an end-to-end customer churn classification project.
All preprocessing, modeling, and evaluation steps are designed to be:
- Leakage-safe
- Reproducible
- Interview-defensible


## Customer Churn Prediction — Final Model & Delivery

This notebook finalizes the churn prediction project by introducing a
Decision Tree model, comparing all trained models, and translating results
into actionable business insights.


### Why Decision Trees?

Decision Trees are non-linear models that split data based on feature values.
They are intuitive, interpretable, and can capture complex relationships
that linear models may miss.
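To make "split data based on feature values" concrete, here is a minimal sketch of how a single split can be scored with Gini impurity, which is the default criterion of scikit-learn's `DecisionTreeClassifier`. The `tenure` and `churned` arrays are hypothetical toy data, not values from the churn dataset, and the helpers `gini` and `split_impurity` are illustrative names only.

```python
import numpy as np

def gini(labels):
    """Gini impurity of a 1-D array of class labels."""
    if len(labels) == 0:
        return 0.0
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def split_impurity(feature, labels, threshold):
    """Weighted Gini impurity after splitting on feature <= threshold."""
    left = labels[feature <= threshold]
    right = labels[feature > threshold]
    return (len(left) * gini(left) + len(right) * gini(right)) / len(labels)

# Hypothetical toy data: tenure in months (sorted) and a 0/1 churn label.
tenure = np.array([1, 2, 3, 5, 24, 36, 48, 60])
churned = np.array([1, 1, 1, 0, 0, 0, 0, 0])

parent_impurity = gini(churned)

# Candidate thresholds: midpoints between consecutive tenure values.
candidates = (tenure[:-1] + tenure[1:]) / 2
best_threshold = min(candidates, key=lambda t: split_impurity(tenure, churned, t))
best_impurity = split_impurity(tenure, churned, best_threshold)

print(f"parent impurity: {parent_impurity:.5f}")    # 0.46875
print(f"best threshold:  {best_threshold}")         # 4.0
print(f"impurity after:  {best_impurity:.5f}")      # 0.00000
```

A tree repeats this search over every feature at every node and keeps the split with the lowest weighted impurity, which is how it captures threshold effects that a linear model would miss.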


## Importing Libraries

In [3]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

## Reading Data and Separating Features X from Target Variable y

In [26]:
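# Load the preprocessed churn dataset
churn = pd.read_csv('../data/churn_preprocessed.csv')
# Features: every column except the target (fail loudly if 'Churn' is missing)
X = churn.drop(columns='Churn')
# Target: the churn label
y = churn['Churn']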

## Train/Test Splitting

In [27]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, shuffle=True, stratify=y)
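A quick way to see what `stratify=y` buys is to compare the positive-class rate before and after the split. The sketch below uses synthetic stand-in data (`X_demo`, `y_demo`, and the 27% rate are assumptions for illustration) rather than the churn file.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 1000 rows with roughly 27% positives (hypothetical data).
rng = np.random.default_rng(42)
y_demo = pd.Series((rng.random(1000) < 0.27).astype(int), name="Churn")
X_demo = pd.DataFrame({"feature": rng.normal(size=1000)})

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, shuffle=True, stratify=y_demo
)

# With stratification, the churn rate is nearly identical in both partitions.
print(f"overall: {y_demo.mean():.3f}")
print(f"train:   {y_tr.mean():.3f}")
print(f"test:    {y_te.mean():.3f}")
```

In the notebook itself, the same check would be `y_train.mean()` against `y_test.mean()`.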

## Decision Tree Model

In [32]:
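# Cap the depth at 5 to limit overfitting.
# random_state is not set here; passing one (e.g. random_state=42) makes
# tie-breaking between equally good splits repeatable across runs.
dt_model = DecisionTreeClassifier(max_depth=5)
dt_model.fit(X_train, y_train)
# Predict churn labels for the held-out test set
y_pred = dt_model.predict(X_test)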

## Decision Tree Model Evaluation

In [33]:
dt_model_accuracy = accuracy_score(y_test, y_pred)
dt_model_confusion_matrix = confusion_matrix(y_test, y_pred)
dt_model_classification_report = classification_report(y_test, y_pred)


print(f"Decision Tree Model Accuracy: {dt_model_accuracy}")
print(f"Decision Tree Model Confusion Matrix: \n{dt_model_confusion_matrix}")
print(f"Decision Tree Model Classification Report: \n{dt_model_classification_report}")

Decision Tree Model Accuracy: 0.794180269694819
Decision Tree Model Confusion Matrix: 
[[917 118]
 [172 202]]
Decision Tree Model Classification Report: 
              precision    recall  f1-score   support

           0       0.84      0.89      0.86      1035
           1       0.63      0.54      0.58       374

    accuracy                           0.79      1409
   macro avg       0.74      0.71      0.72      1409
weighted avg       0.79      0.79      0.79      1409
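
The churn-class figures in the report can be re-derived by hand from the confusion matrix above, which is a useful sanity check when defending the results. This sketch only hard-codes the four reported counts; scikit-learn lays the matrix out with actual classes as rows and predicted classes as columns.

```python
import numpy as np

# Confusion matrix reported above: rows = actual class, columns = predicted class.
cm = np.array([[917, 118],
               [172, 202]])
tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()     # share of all predictions that are correct
precision = tp / (tp + fp)          # share of predicted churners who really churned
recall = tp / (tp + fn)             # share of actual churners the model caught
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
# accuracy=0.79 precision=0.63 recall=0.54 f1=0.58
```

Put differently, the tree misses 172 of the 374 churners in the test set and raises 118 false alarms among the 1,035 non-churners.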



### Feature Importance

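Decision Trees provide feature importance scores that indicate which
variables most strongly influence predictions. In scikit-learn these are
exposed as the fitted model's `feature_importances_` attribute: each score is
the total impurity reduction contributed by splits on that feature,
normalized so that all scores sum to 1.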
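
As a minimal sketch of how these scores can be read out, the example below fits a tree on synthetic data and ranks the features. The dataset and the column names `feature_0` to `feature_4` are hypothetical stand-ins, not the churn features.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the churn features; column names are hypothetical.
X_arr, y_demo = make_classification(
    n_samples=500, n_features=5, n_informative=3, n_redundant=0, random_state=42
)
X_demo = pd.DataFrame(X_arr, columns=[f"feature_{i}" for i in range(5)])

tree = DecisionTreeClassifier(max_depth=5, random_state=42).fit(X_demo, y_demo)

# Pair each score with its column name and rank from most to least important.
importances = (
    pd.Series(tree.feature_importances_, index=X_demo.columns)
    .sort_values(ascending=False)
)
print(importances)
```

In this notebook the equivalent call would be `pd.Series(dt_model.feature_importances_, index=X_train.columns)`. Impurity-based scores describe what the fitted tree used, so they are best read as a ranking rather than as causal effects.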


### Model Comparison Summary

- Logistic Regression: stable, interpretable baseline
- kNN: sensitive to scaling and data distribution
- Decision Tree: captures non-linear patterns but risks overfitting
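
One way to put the three models on equal footing is to score them with the same cross-validation folds and the same metric. The sketch below is an assumption about how such a comparison could look, run on synthetic data; the hyperparameters (`n_neighbors=5`, `max_depth=5`, `max_iter=1000`) are illustrative and are not taken from the earlier notebooks.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the churn data (hypothetical).
X_demo, y_demo = make_classification(
    n_samples=600, n_features=8, weights=[0.73, 0.27], random_state=42
)

# Scaling lives inside the pipeline so it is re-fit on each training fold,
# which keeps the comparison leakage-safe. Trees do not need scaling.
models = {
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "kNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "Decision Tree": DecisionTreeClassifier(max_depth=5, random_state=42),
}

results = {}
for name, model in models.items():
    # Recall on the positive (churn) class, averaged over 5 stratified folds.
    scores = cross_val_score(model, X_demo, y_demo, cv=5, scoring="recall")
    results[name] = scores.mean()
    print(f"{name}: mean recall = {results[name]:.3f}")
```

Swapping `scoring="recall"` for `"precision"` or `"f1"` gives the other side of the trade-off under the same folds.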


### Final Model Selection

The final model was selected based on its ability to balance churn detection
(recall) and false positives, while remaining interpretable and stable
on unseen data.
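
The balance between recall and false positives can be made explicit by varying the decision threshold applied to predicted churn probabilities. This is a sketch on synthetic data, with the thresholds 0.3, 0.5, and 0.7 chosen arbitrarily for illustration; it does not describe how the notebook's final model was actually chosen.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the churn data (hypothetical).
X_demo, y_demo = make_classification(
    n_samples=1000, n_features=8, weights=[0.73, 0.27], random_state=42
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

tree = DecisionTreeClassifier(max_depth=5, random_state=42).fit(X_tr, y_tr)
churn_proba = tree.predict_proba(X_te)[:, 1]  # probability of the churn class

recalls = []
for threshold in (0.3, 0.5, 0.7):
    # Flag a customer as a churner when the probability reaches the threshold.
    y_hat = (churn_proba >= threshold).astype(int)
    rec = recall_score(y_te, y_hat)
    prec = precision_score(y_te, y_hat, zero_division=0)
    false_positives = int(((y_hat == 1) & (y_te == 0)).sum())
    recalls.append(rec)
    print(f"threshold={threshold}: recall={rec:.2f} precision={prec:.2f} FP={false_positives}")
```

Lowering the threshold can only keep or raise recall, at the cost of more false positives. In practice the threshold should be chosen on a validation split, not on the final test set.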


### Business Insights

- Customers with shorter tenure are more likely to churn.
- Contract type and monthly charges strongly influence churn risk.
- Proactive retention efforts should target high-risk customer segments.


### Limitations

- The dataset is static and does not capture behavioral changes over time.
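- Class imbalance affects recall for churners: they make up 374 of the 1,409 test customers (about 27%), and the Decision Tree catches only 54% of them.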
- Further feature engineering could improve performance.
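
For the class-imbalance limitation, one commonly tried mitigation is to reweight classes during training. The sketch below compares an unweighted tree with `class_weight="balanced"` on synthetic data. This option was not used in the notebook, and the sketch makes no claim about how much it would help on the churn dataset.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the churn data (about 27% positives).
X_demo, y_demo = make_classification(
    n_samples=1000, n_features=8, weights=[0.73, 0.27], random_state=42
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

churn_recall = {}
for label, class_weight in [("unweighted", None), ("balanced", "balanced")]:
    # "balanced" weights each class inversely to its frequency in y_tr.
    tree = DecisionTreeClassifier(
        max_depth=5, class_weight=class_weight, random_state=42
    ).fit(X_tr, y_tr)
    churn_recall[label] = recall_score(y_te, tree.predict(X_te))
    print(f"{label}: churn recall = {churn_recall[label]:.3f}")
```

Any gain in recall from reweighting usually comes with more false positives, so it should be judged against the same recall and precision trade-off used for model selection.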


### Project Summary

This project demonstrated an end-to-end machine learning workflow for
customer churn prediction, from preprocessing to model selection and
business interpretation.
