# Customer Churn Prediction (Logistic Regression)

This notebook trains a churn classifier with **logistic regression** on the synthetic dataset at `../Datasets/customer_churn.csv`.

## Steps
1. Load data & quick EDA
2. Train/test split
3. Logistic Regression
4. Evaluation (classification report, ROC-AUC, confusion matrix)
5. Simple business insights


In [0]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, roc_curve, auc, confusion_matrix
from IPython.display import display  # explicit import so display() also works outside a notebook

df = pd.read_csv('../Datasets/customer_churn.csv')
df.head()

In [0]:
# Quick EDA: summary statistics, class balance, and feature distributions
display(df.describe())

df['churned'].value_counts().plot(kind='bar')
plt.title('Class Balance (Churned)')
plt.show()

df[['tenure_months', 'monthly_charges', 'support_tickets']].hist(figsize=(8, 3))
plt.tight_layout()
plt.show()

In [0]:
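# Train/test split, model fit, and classification report
X = df[['tenure_months', 'monthly_charges', 'support_tickets', 'is_promo_user']]
y = df['churned']

# Hold out 20% for testing; stratify=y keeps the churn rate the same in both splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# max_iter is raised above the default because the features are not scaled
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Hard class predictions at the default 0.5 probability threshold
pred = model.predict(X_test)
print(classification_report(y_test, pred))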

In [0]:
# ROC Curve
probs = model.predict_proba(X_test)[:, 1]  # predicted probability of the positive (churn) class
fpr, tpr, thresholds = roc_curve(y_test, probs)
roc_auc = auc(fpr, tpr)

plt.figure()
plt.plot(fpr, tpr, label=f'ROC AUC = {roc_auc:.3f}')
plt.plot([0, 1], [0, 1], '--')  # chance-level diagonal
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve')
plt.legend()
plt.show()

In [0]:
# Confusion Matrix (rows = actual class, columns = predicted class)
cm = confusion_matrix(y_test, pred)
print('Confusion Matrix:\n', cm)

plt.figure()
plt.imshow(cm, interpolation='nearest')
plt.title('Confusion Matrix')
plt.colorbar()
tick_marks = [0, 1]
plt.xticks(tick_marks, ['No Churn', 'Churn'])
plt.yticks(tick_marks, ['No Churn', 'Churn'])
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.tight_layout()
plt.show()
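
The four cells of a binary confusion matrix can be unpacked into named counts, which makes recall and precision for the churn class easy to read off. The sketch below is illustrative only: `y_true` and `y_hat` are small hypothetical label arrays, not output from this notebook's model.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical labels for illustration only (1 = churned, 0 = did not churn).
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 0, 1, 0])
y_hat = np.array([0, 0, 1, 0, 1, 0, 1, 0, 1, 0])

# Rows are actual classes and columns are predicted classes, so for binary
# labels ravel() yields the counts in the order TN, FP, FN, TP.
tn, fp, fn, tp = confusion_matrix(y_true, y_hat).ravel()

# Recall: share of actual churners the model caught.
recall = tp / (tp + fn)
# Precision: share of predicted churners who actually churned.
precision = tp / (tp + fp)

print(f'TN={tn} FP={fp} FN={fn} TP={tp}')
print(f'recall={recall:.2f} precision={precision:.2f}')
```

In this notebook the same unpacking would presumably apply to `confusion_matrix(y_test, pred)`. For a retention campaign, recall on the churn class is often the number to watch, since a missed churner (a false negative) usually costs more than an unnecessary offer.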

## Business Takeaways
- **More support tickets** and **higher monthly charges** are associated with a higher predicted churn probability.
- **Longer tenure** is associated with lower churn risk.
- Consider retention offers for high-risk segments (many support tickets, high monthly charges, short tenure).
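
Associations like these can be checked by converting the fitted coefficients into odds ratios. The following is a minimal sketch, not this notebook's actual result: it generates its own synthetic data with an assumed churn-generating process (the coefficients in `logit` are made up for illustration), and `X_demo`, `y_demo`, and `demo_model` are hypothetical names. Only the column names are taken from the notebook.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000

# Synthetic stand-in for the dataset (assumed distributions, illustration only).
X_demo = pd.DataFrame({
    'tenure_months': rng.integers(1, 72, n),
    'monthly_charges': rng.uniform(20, 120, n),
    'support_tickets': rng.poisson(1.5, n),
    'is_promo_user': rng.integers(0, 2, n),
})

# Assumed generating process: tickets and charges raise churn odds,
# tenure and promo usage lower them.
logit = (-1.0
         - 0.05 * X_demo['tenure_months']
         + 0.02 * X_demo['monthly_charges']
         + 0.5 * X_demo['support_tickets']
         - 0.3 * X_demo['is_promo_user'])
y_demo = rng.binomial(1, 1 / (1 + np.exp(-logit)))

demo_model = LogisticRegression(max_iter=1000).fit(X_demo, y_demo)

# exp(coefficient) is the multiplicative change in churn odds per one-unit
# increase in the feature, holding the other features fixed.
odds_ratios = pd.Series(
    np.exp(demo_model.coef_[0]), index=X_demo.columns
).sort_values(ascending=False)
print(odds_ratios)
```

An odds ratio above 1 means the feature is associated with higher churn odds, and below 1 with lower odds. Because the features are on different scales (months, currency, counts), the ratios are per-unit effects and are not directly comparable in size; standardizing the features first would be one way to compare them.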