## 📌 Task 4: Churn Prediction Model

The objective of this task is to build and evaluate machine learning classification models
to predict customer churn in a telecommunications company.

Multiple classification algorithms are implemented and compared using appropriate
evaluation metrics. The best-performing model is selected for further analysis,
keeping in mind the imbalanced nature of churn data.


## 📦 Load Required Libraries

In this step, we import the necessary Python libraries for:
- Data handling
- Model training
- Model evaluation


In [13]:
import pandas as pd
import numpy as np

from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import (
    accuracy_score,
    precision_score,
    recall_score,
    f1_score,
    roc_auc_score
)

from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

import warnings
warnings.filterwarnings("ignore")


## 📂 Load the Cleaned Dataset

The cleaned dataset prepared in **Task-1 (Data Preparation)** is used
for building the churn prediction models.


In [14]:
data = pd.read_csv(
    "../Task-1_Data_Preparation/dataset/Telco_Customer_Churn_Dataset_cleaned.csv"
)

data.head()


| | customerID | SeniorCitizen | tenure | MonthlyCharges | TotalCharges | gender_Male | Partner_Yes | Dependents_Yes | PhoneService_Yes | MultipleLines_No phone service | ... | StreamingTV_Yes | StreamingMovies_No internet service | StreamingMovies_Yes | Contract_One year | Contract_Two year | PaperlessBilling_Yes | PaymentMethod_Credit card (automatic) | PaymentMethod_Electronic check | PaymentMethod_Mailed check | Churn_Yes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 7590-VHVEG | 0 | 1 | 29.85 | 29.85 | False | True | False | False | True | ... | False | False | False | False | False | True | False | True | False | False |
| 1 | 5575-GNVDE | 0 | 34 | 56.95 | 1889.5 | True | False | False | True | False | ... | False | False | False | True | False | False | False | False | True | False |
| 2 | 3668-QPYBK | 0 | 2 | 53.85 | 108.15 | True | False | False | True | False | ... | False | False | False | False | False | True | False | False | True | True |
| 3 | 7795-CFOCW | 0 | 45 | 42.3 | 1840.75 | True | False | False | False | True | ... | False | False | False | True | False | False | False | False | False | False |
| 4 | 9237-HQITU | 0 | 2 | 70.7 | 151.65 | False | False | False | True | False | ... | False | False | False | False | False | True | False | True | False | True |


## 🧩 Define Features and Target Variable

- **Target Variable:** `Churn_Yes`
- **Features:** All remaining columns except `customerID`


In [15]:
X = data.drop(["Churn_Yes", "customerID"], axis=1)
y = data["Churn_Yes"]


## 🔀 Train–Test Split

The dataset is split into training and testing sets.
Stratified sampling is used to preserve the original churn distribution.


In [16]:
X_train, X_test, y_train, y_test = train_test_split(
    X,
    y,
    test_size=0.2,
    random_state=42,
    stratify=y
)
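The effect of `stratify=y` can be checked by comparing the positive-class rate in each split. The sketch below uses synthetic labels with an arbitrary positive rate rather than the Telco data; the names `X_demo` and `y_demo` are illustrative assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced labels (arbitrary positive rate); NOT the Telco data.
rng = np.random.default_rng(42)
y_demo = (rng.random(1000) < 0.27).astype(int)
X_demo = rng.normal(size=(1000, 3))

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

# With stratify=y_demo, the positive rate should be nearly identical in both splits.
print(f"overall positive rate: {y_demo.mean():.3f}")
print(f"train positive rate:   {y_tr.mean():.3f}")
print(f"test positive rate:    {y_te.mean():.3f}")
```

On the real data, the same check would be `y_train.mean()` against `y_test.mean()`.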


## ⚖️ Feature Scaling

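Feature scaling is applied **only for Logistic Regression**,
as tree-based models do not require scaled features.
The scaler is fitted on the training set only and then applied to the test set,
so no test-set statistics leak into training.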


In [17]:
scaler = StandardScaler()

X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
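An alternative way to keep the scaler tied to the training data is to bundle it with the classifier in a scikit-learn `Pipeline`. This is a minimal sketch on synthetic data, not the approach used in this notebook; `X_demo`, `y_demo`, and `scaled_lr` are illustrative names.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the churn features; NOT the Telco data.
X_demo, y_demo = make_classification(
    n_samples=500, n_features=6, weights=[0.75, 0.25], random_state=42
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

# The pipeline fits the scaler on the training rows only, then reuses
# those statistics whenever predict() is called on new data.
scaled_lr = make_pipeline(
    StandardScaler(),
    LogisticRegression(max_iter=1000, class_weight="balanced"),
)
scaled_lr.fit(X_tr, y_tr)

fitted_scaler = scaled_lr.named_steps["standardscaler"]
print(np.allclose(fitted_scaler.mean_, X_tr.mean(axis=0)))
print(scaled_lr.predict(X_te)[:5])
```

A pipeline like this can also be passed directly to `GridSearchCV`, in which case the scaler is refitted inside each cross-validation fold.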


## 📊 Model Evaluation Strategy

Models are evaluated using the following metrics:
- Accuracy
- Precision
- Recall
- F1-Score
- ROC-AUC

Since churn data is imbalanced, **Recall and F1-Score** are prioritized.
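To make the metrics concrete, the following sketch derives precision, recall, and F1 from confusion-matrix counts on a tiny hand-made example. The labels are invented for illustration and have no connection to the dataset.

```python
from sklearn.metrics import confusion_matrix, f1_score

# Ten hypothetical customers: 1 = churned, 0 = stayed.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

# For binary labels, ravel() returns counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

precision = tp / (tp + fp)   # 3 / 5 = 0.60: share of predicted churners who churned
recall = tp / (tp + fn)      # 3 / 4 = 0.75: share of actual churners that were caught
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
accuracy = (tp + tn) / len(y_true)                  # 7 / 10 = 0.70

print(precision, recall, round(f1, 3), accuracy)
```

Here one churner is missed (a false negative), which lowers recall, while two loyal customers are flagged (false positives), which lowers precision.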


### Logistic Regression Model

Logistic Regression is used as a baseline model due to its simplicity
and interpretability.

Class imbalance is handled using `class_weight="balanced"`.


In [18]:
log_reg = LogisticRegression(
    max_iter=1000,
    class_weight="balanced"
)

log_reg.fit(X_train_scaled, y_train)

y_pred_lr = log_reg.predict(X_test_scaled)
y_prob_lr = log_reg.predict_proba(X_test_scaled)[:, 1]

lr_metrics = {
    "Model": "Logistic Regression",
    "Accuracy": accuracy_score(y_test, y_pred_lr),
    "Precision": precision_score(y_test, y_pred_lr),
    "Recall": recall_score(y_test, y_pred_lr),
    "F1-Score": f1_score(y_test, y_pred_lr),
    "ROC-AUC": roc_auc_score(y_test, y_prob_lr)
}

lr_metrics


{'Model': 'Logistic Regression',
 'Accuracy': 0.7395315826827538,
 'Precision': 0.5060240963855421,
 'Recall': 0.786096256684492,
 'F1-Score': 0.6157068062827226,
 'ROC-AUC': 0.8412978893797307}
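For reference, the `class_weight="balanced"` setting used above weights each class by `n_samples / (n_classes * count_of_class)`. The sketch below reproduces that calculation on a made-up 75/25 label split, which is an assumed ratio and not the measured churn rate.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Hypothetical labels: 75 non-churners and 25 churners.
y_demo = np.array([0] * 75 + [1] * 25)
classes = np.unique(y_demo)

# Weights as scikit-learn computes them for class_weight="balanced".
weights = compute_class_weight(class_weight="balanced", classes=classes, y=y_demo)

# The same formula written out by hand.
manual = len(y_demo) / (len(classes) * np.bincount(y_demo))

print(dict(zip(classes.tolist(), weights.round(3).tolist())))
print(np.allclose(weights, manual))
```

The minority class receives the larger weight (2.0 versus about 0.667 here), so misclassifying a churner costs more during training.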

### Decision Tree Classifier

Decision Tree is a rule-based model capable of capturing
non-linear relationships in the data.

Class imbalance is addressed using balanced class weights.


In [19]:
dt = DecisionTreeClassifier(
    random_state=42,
    class_weight="balanced"
)

dt.fit(X_train, y_train)

y_pred_dt = dt.predict(X_test)
y_prob_dt = dt.predict_proba(X_test)[:, 1]

dt_metrics = {
    "Model": "Decision Tree",
    "Accuracy": accuracy_score(y_test, y_pred_dt),
    "Precision": precision_score(y_test, y_pred_dt),
    "Recall": recall_score(y_test, y_pred_dt),
    "F1-Score": f1_score(y_test, y_pred_dt),
    "ROC-AUC": roc_auc_score(y_test, y_prob_dt)
}

dt_metrics


{'Model': 'Decision Tree',
 'Accuracy': 0.7310149041873669,
 'Precision': 0.49318801089918257,
 'Recall': 0.4839572192513369,
 'F1-Score': 0.4885290148448043,
 'ROC-AUC': 0.6525911286780852}
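One plausible contributor to the low ROC-AUC, not verified in this notebook, is that a fully grown tree tends to output only hard 0/1 probabilities, which leaves little to rank by. The sketch below illustrates the effect on synthetic data by comparing an unconstrained tree with one limited by `max_depth=4`; the depth value is an arbitrary assumption.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced data; NOT the Telco data.
X_demo, y_demo = make_classification(
    n_samples=1000, n_features=10, weights=[0.75, 0.25], random_state=42
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

deep_tree = DecisionTreeClassifier(
    random_state=42, class_weight="balanced"
).fit(X_tr, y_tr)
shallow_tree = DecisionTreeClassifier(
    max_depth=4, random_state=42, class_weight="balanced"
).fit(X_tr, y_tr)

deep_probs = deep_tree.predict_proba(X_te)[:, 1]
shallow_probs = shallow_tree.predict_proba(X_te)[:, 1]

# A fully grown tree ends in pure leaves, so its scores collapse to 0 or 1.
print("distinct scores (deep):   ", len(np.unique(deep_probs)))
print("distinct scores (shallow):", len(np.unique(shallow_probs)))
print("ROC-AUC (deep):   ", round(roc_auc_score(y_te, deep_probs), 3))
print("ROC-AUC (shallow):", round(roc_auc_score(y_te, shallow_probs), 3))
```

Whether limiting depth would help on the churn data would need to be checked separately, for example by adding `max_depth` to a grid search.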

### Random Forest Classifier

Random Forest is an ensemble learning method that combines multiple
decision trees to improve performance and reduce overfitting.

Hyperparameter tuning is performed using **GridSearchCV**
with **F1-score** as the optimization metric.


In [20]:
rf = RandomForestClassifier(
    random_state=42,
    class_weight="balanced"
)

param_grid = {
    "n_estimators": [100, 200],
    "max_depth": [None, 10, 20],
    "min_samples_split": [2, 5],
    "min_samples_leaf": [1, 2]
}

grid_rf = GridSearchCV(
    rf,
    param_grid,
    cv=3,
    scoring="f1",
    n_jobs=-1
)

grid_rf.fit(X_train, y_train)

best_rf = grid_rf.best_estimator_
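After a grid search, it is often useful to inspect which combination won and how close the alternatives were. The sketch below shows this on synthetic data with a deliberately small grid; on the fitted object from this notebook, the equivalent attributes would be `grid_rf.best_params_`, `grid_rf.best_score_`, and `grid_rf.cv_results_`.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic, imbalanced data; NOT the Telco data.
X_demo, y_demo = make_classification(
    n_samples=300, n_features=8, weights=[0.75, 0.25], random_state=42
)

# A reduced grid so the sketch runs quickly.
search = GridSearchCV(
    RandomForestClassifier(n_estimators=20, random_state=42, class_weight="balanced"),
    {"max_depth": [None, 5], "min_samples_leaf": [1, 2]},
    cv=3,
    scoring="f1",
    n_jobs=1,
)
search.fit(X_demo, y_demo)

print(search.best_params_)           # winning hyperparameter combination
print(round(search.best_score_, 3))  # its mean cross-validated F1

# One row per combination, ranked by mean F1 across the 3 folds.
cv_table = (
    pd.DataFrame(search.cv_results_)[
        ["params", "mean_test_score", "std_test_score", "rank_test_score"]
    ]
    .sort_values("rank_test_score")
    .reset_index(drop=True)
)
print(cv_table)
```

Note that `best_score_` is a cross-validation estimate on the training data, so it will generally differ from the test-set F1 reported in the evaluation step.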


## 🔍 Random Forest Evaluation

The best Random Forest model is evaluated on the test dataset.


In [21]:
y_pred_rf = best_rf.predict(X_test)
y_prob_rf = best_rf.predict_proba(X_test)[:, 1]

rf_metrics = {
    "Model": "Random Forest",
    "Accuracy": accuracy_score(y_test, y_pred_rf),
    "Precision": precision_score(y_test, y_pred_rf),
    "Recall": recall_score(y_test, y_pred_rf),
    "F1-Score": f1_score(y_test, y_pred_rf),
    "ROC-AUC": roc_auc_score(y_test, y_prob_rf)
}

rf_metrics


{'Model': 'Random Forest',
 'Accuracy': 0.772888573456352,
 'Precision': 0.552734375,
 'Recall': 0.7566844919786097,
 'F1-Score': 0.6388261851015802,
 'ROC-AUC': 0.8426489963574362}

## 📈 Model Performance Comparison

All model performances are compared to select the best model.

Metric values are rounded for better readability.


In [22]:
results = pd.DataFrame([lr_metrics, dt_metrics, rf_metrics])
results = results.round(3)
results


| | Model | Accuracy | Precision | Recall | F1-Score | ROC-AUC |
|---|---|---|---|---|---|---|
| 0 | Logistic Regression | 0.740 | 0.506 | 0.786 | 0.616 | 0.841 |
| 1 | Decision Tree | 0.731 | 0.493 | 0.484 | 0.489 | 0.653 |
| 2 | Random Forest | 0.773 | 0.553 | 0.757 | 0.639 | 0.843 |


⚠️ Note: Accuracy alone is not sufficient for churn prediction due to class imbalance.
F1-Score and Recall are prioritized to reduce false negatives (missed churners).
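The comparison can also be summarized programmatically. The sketch below re-enters the rounded test-set metrics reported above into a small DataFrame (named `results_demo` here to avoid clashing with `results`) and finds the leading model per metric.

```python
import pandas as pd

# Rounded test-set metrics copied from the comparison output above.
results_demo = pd.DataFrame(
    {
        "Model": ["Logistic Regression", "Decision Tree", "Random Forest"],
        "Accuracy": [0.740, 0.731, 0.773],
        "Precision": [0.506, 0.493, 0.553],
        "Recall": [0.786, 0.484, 0.757],
        "F1-Score": [0.616, 0.489, 0.639],
        "ROC-AUC": [0.841, 0.653, 0.843],
    }
).set_index("Model")

# Rank models by the primary metric, then list the winner for every metric.
ranked = results_demo.sort_values("F1-Score", ascending=False)
best_per_metric = results_demo.idxmax()

print(ranked)
print(best_per_metric)
```

Random Forest leads on every metric except recall, where Logistic Regression is slightly ahead.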


## ✅ Final Model Selection

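Among the evaluated models, **Random Forest** achieved the best overall balance of metrics
and is selected as the final model.

**Reasons:**
- Highest F1-Score (0.639, versus 0.616 for Logistic Regression)
- Highest ROC-AUC (0.843), although Logistic Regression is nearly identical (0.841)
- Highest accuracy (0.773) and precision (0.553)
- Handles non-linear relationships well
- Trained with `class_weight="balanced"` to account for class imbalance

Logistic Regression has slightly higher recall (0.786 versus 0.757), so it remains a
reasonable alternative if catching as many churners as possible matters more than precision.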


## 🔍 Feature Importance Analysis

Important features influencing customer churn are extracted
from the Random Forest model.


In [23]:
feature_importance = pd.Series(
    best_rf.feature_importances_,
    index=X.columns
).sort_values(ascending=False)

feature_importance.head(10)


tenure                                  0.172815
TotalCharges                            0.140697
MonthlyCharges                          0.105536
Contract_Two year                       0.100271
InternetService_Fiber optic             0.067723
PaymentMethod_Electronic check          0.050826
Contract_One year                       0.039511
OnlineSecurity_Yes                      0.037791
TechSupport_Yes                         0.028310
DeviceProtection_No internet service    0.019274
dtype: float64

Feature importance scores indicate relative influence, not causation.
They should be interpreted alongside EDA and business context.
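The `feature_importances_` values above are impurity-based, which is known to favor continuous or high-cardinality features. Permutation importance on held-out data is a common cross-check. The sketch below shows the idea on synthetic data; the column names `feature_0` to `feature_4` are placeholders, and applying this to `best_rf` with `X_test` and `y_test` is a suggestion rather than part of this notebook.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data with placeholder column names; NOT the Telco data.
X_arr, y_demo = make_classification(
    n_samples=400, n_features=5, n_informative=3, n_redundant=0, random_state=42
)
X_demo = pd.DataFrame(X_arr, columns=[f"feature_{i}" for i in range(5)])
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=42, stratify=y_demo
)

forest = RandomForestClassifier(n_estimators=50, random_state=42).fit(X_tr, y_tr)

# Shuffle one column at a time on held-out data and measure the drop in F1.
perm = permutation_importance(
    forest, X_te, y_te, scoring="f1", n_repeats=5, random_state=42
)
perm_importance = pd.Series(
    perm.importances_mean, index=X_demo.columns
).sort_values(ascending=False)

print(perm_importance)
```

If both methods agree on the top features, that agreement strengthens the interpretation. Large disagreements are worth investigating before drawing business conclusions.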


## 💾 Saving the Final Model

The best-performing **Random Forest model**, selected using GridSearchCV,
is saved to disk for reuse in subsequent tasks.

Saving the trained model ensures:

- Reproducibility of results  
- No need to retrain the model in later tasks  
- Consistent evaluation across Task-4 and Task-5  

The saved model will be loaded directly in **Task-5: Model Evaluation and Interpretation**.


In [24]:
import joblib

# Save best Random Forest model
joblib.dump(best_rf, "best_random_forest_model.pkl")


['best_random_forest_model.pkl']

### 📁 Saved Model Artifact

The trained model has been saved as:

- **File name:** `best_random_forest_model.pkl`
- **Location:** `Task-4_Churn_Prediction_Model/`

This file will be reused in Task-5 to evaluate the model on unseen test data
without retraining, ensuring a clean and production-style workflow.
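Reloading works as a round trip through `joblib.dump` and `joblib.load`. The sketch below demonstrates this with a small stand-in model and a temporary file, so it does not touch `best_random_forest_model.pkl`; the file name `demo_model.pkl` is hypothetical.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Small stand-in model trained on synthetic data; NOT the tuned churn model.
X_demo, y_demo = make_classification(n_samples=200, n_features=5, random_state=42)
model = RandomForestClassifier(n_estimators=10, random_state=42).fit(X_demo, y_demo)

# Save to a temporary directory, then load the model back from disk.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "demo_model.pkl")  # hypothetical file name
    joblib.dump(model, model_path)
    reloaded = joblib.load(model_path)

# The reloaded model should reproduce the original predictions exactly.
same_predictions = np.array_equal(model.predict(X_demo), reloaded.predict(X_demo))
print(same_predictions)
```

When loading the real artifact, the input must have the same feature columns in the same order as `X_train`. Loading with the same scikit-learn version that saved the file is also advisable, since pickled estimators are not guaranteed to be compatible across versions.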


## 🧠 Conclusion

This task successfully implemented and evaluated multiple machine learning models
for customer churn prediction.

The Random Forest model achieved the highest F1-Score (0.639) and ROC-AUC (0.843)
on the test set and was selected as the final model. Its feature importances offer
a degree of interpretability, which makes it a reasonable basis for customer
retention strategies.

Further business insights are covered in **Task-5**.
