# Hyperparameter Tuning – Random Forest Optimization

In this section, the best baseline model (Random Forest) is optimized using GridSearchCV.

Objective:
- Improve ROC-AUC performance
- Reduce variance
- Improve generalization

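Tuning uses 5-fold cross-validation on the training set.
ROC-AUC is the optimization metric because it measures how well the model ranks patients with heart disease above those without, independent of any single decision threshold. In a medical setting the threshold is usually chosen afterwards to balance sensitivity against specificity.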
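
As a rough illustration of what ROC-AUC measures, the sketch below computes it as the fraction of (positive, negative) pairs in which the positive sample receives the higher score. The helper `pairwise_auc` and the toy labels and scores are assumptions made for this example and are not part of the notebook's data.

```python
from itertools import product

def pairwise_auc(y_true, y_score):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a correct pair.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))

# Toy labels and predicted probabilities (illustrative values only)
y_true = [0, 0, 1, 1, 0, 1]
y_score = [0.10, 0.40, 0.35, 0.80, 0.20, 0.90]

auc = pairwise_auc(y_true, y_score)
print(f"Pairwise ROC-AUC: {auc:.4f}")

# Any order-preserving rescaling of the scores leaves the value unchanged
auc_rescaled = pairwise_auc(y_true, [s ** 2 for s in y_score])
print(f"After squaring the scores: {auc_rescaled:.4f}")
```

Because only the ordering of the scores matters, the metric does not depend on where the classification threshold is placed.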

In [1]:
import pandas as pd
import numpy as np
import joblib

from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.metrics import accuracy_score, roc_auc_score, classification_report
from sklearn.ensemble import RandomForestClassifier

In [2]:
# Load dataset
df = pd.read_csv('../data/heart_cleaned.csv')

X = df.drop('HeartDisease', axis=1)
y = df['HeartDisease']

# Stratified split
X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.2,
    random_state=42,
    stratify=y
)

# Load scaler
scaler = joblib.load('../models/scaler.pkl')
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
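
To illustrate what `stratify=y` does in the split above, the following minimal sketch uses synthetic labels with a 55/45 class ratio, which is close to the ratio in this notebook's test set (102 positive vs. 82 negative cases). The names `X_demo` and `y_demo` are illustrative assumptions, not part of the heart disease data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic data (assumption for illustration): 55 positive and 45 negative labels
y_demo = np.array([1] * 55 + [0] * 45)
X_demo = np.arange(100).reshape(-1, 1)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo,
    test_size=0.2,
    random_state=42,
    stratify=y_demo,  # keep the class ratio identical in both subsets
)

print("Positive share, full data:", y_demo.mean())
print("Positive share, train:    ", y_tr.mean())
print("Positive share, test:     ", y_te.mean())
```

Without stratification, a small test set can end up with a noticeably different class balance, which makes metrics harder to compare across splits.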

## Baseline Random Forest Performance (Before Tuning)

We first evaluate the previously trained baseline model on the test set so that the tuned model can be compared against it.

In [3]:
# Load baseline model
rf_baseline = joblib.load('../models/random_forest.pkl')

# Hard class predictions (for accuracy) and positive-class probabilities (for ROC-AUC)
y_pred_base = rf_baseline.predict(X_test)
y_prob_base = rf_baseline.predict_proba(X_test)[:, 1]

print("Baseline Accuracy:", accuracy_score(y_test, y_pred_base))
print("Baseline ROC-AUC:", roc_auc_score(y_test, y_prob_base))

Baseline Accuracy: 0.8695652173913043
Baseline ROC-AUC: 0.924079387852702


## Hyperparameter Search Space

The following parameters are tuned:

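- n_estimators: Number of trees in the forest
- max_depth: Maximum depth of each tree (None lets a tree grow until its leaves are pure or too small to split)
- min_samples_split: Minimum number of samples a node must contain before it can be split
- min_samples_leaf: Minimum number of samples required in each leaf

These parameters directly affect the bias-variance tradeoff: shallower trees and larger minimum sample counts give simpler trees (more bias, less variance), while adding trees mainly reduces variance.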

In [4]:
param_grid = {
    'n_estimators': [100, 200, 300],
    'max_depth': [None, 5, 10, 20],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4]
}
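
As a quick sanity check on the cost of this search, the sketch below enumerates the grid with the standard library and counts the resulting model fits. It repeats the grid defined above so that it runs on its own; the variable names `candidates`, `n_folds`, and `total_fits` are illustrative.

```python
from itertools import product

# Same search space as the grid defined above
param_grid = {
    'n_estimators': [100, 200, 300],
    'max_depth': [None, 5, 10, 20],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4],
}

# Every combination of one value per parameter is one candidate model
candidates = [dict(zip(param_grid, values)) for values in product(*param_grid.values())]

n_folds = 5  # matches cv=5 used in this notebook
n_candidates = len(candidates)
total_fits = n_candidates * n_folds

print(f"{n_candidates} candidates x {n_folds} folds = {total_fits} fits")
```

The cost grows multiplicatively with each added parameter value, so widening any one list by a single entry adds dozens of fits.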

## Running GridSearchCV

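GridSearchCV fits every combination in the grid (3 × 4 × 3 × 3 = 108 candidates) with 5-fold cross-validation
and ranks the candidates by their mean ROC-AUC across the folds.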

In [5]:
rf = RandomForestClassifier(random_state=42)

grid_search = GridSearchCV(
    estimator=rf,
    param_grid=param_grid,
    cv=5,
    scoring='roc_auc',
    n_jobs=-1,
    verbose=1
)

grid_search.fit(X_train, y_train)

print("Best Parameters:", grid_search.best_params_)
print("Best CV ROC-AUC:", grid_search.best_score_)

Fitting 5 folds for each of 108 candidates, totalling 540 fits
Best Parameters: {'max_depth': 5, 'min_samples_leaf': 1, 'min_samples_split': 2, 'n_estimators': 300}
Best CV ROC-AUC: 0.9284798930869392
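
Beyond the single best score, the fitted search object also stores per-candidate results in `cv_results_`, which helps show how close the runners-up are. The sketch below is a self-contained illustration on synthetic data with a deliberately tiny grid. `X_demo`, `y_demo`, `small_grid`, and `search` are assumptions for this example; in the notebook, the same lookup would be applied to `grid_search`.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data (assumption); the notebook would use X_train / y_train
X_demo, y_demo = make_classification(n_samples=300, n_features=10, random_state=42)

# Deliberately tiny grid so the sketch runs in a few seconds
small_grid = {'n_estimators': [25, 50], 'max_depth': [3, None]}

search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid=small_grid,
    cv=5,
    scoring='roc_auc',
)
search.fit(X_demo, y_demo)

# One row per candidate: mean and spread of the fold scores, plus the rank
results = (
    pd.DataFrame(search.cv_results_)
    [['params', 'mean_test_score', 'std_test_score', 'rank_test_score']]
    .sort_values('rank_test_score')
    .reset_index(drop=True)
)
print(results)
```

If the top few candidates differ by less than their `std_test_score`, the choice among them is largely arbitrary, and the simpler configuration is often a reasonable pick.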


In [6]:
rf_tuned = grid_search.best_estimator_

# best_estimator_ has already been refit on the full training set
y_pred_tuned = rf_tuned.predict(X_test)
y_prob_tuned = rf_tuned.predict_proba(X_test)[:, 1]

print("\n=== Tuned Random Forest Performance ===")
print("Accuracy:", accuracy_score(y_test, y_pred_tuned))
print("ROC-AUC:", roc_auc_score(y_test, y_prob_tuned))
print(classification_report(y_test, y_pred_tuned))


=== Tuned Random Forest Performance ===
Accuracy: 0.8858695652173914
ROC-AUC: 0.925872788139646
              precision    recall  f1-score   support

           0       0.88      0.87      0.87        82
           1       0.89      0.90      0.90       102

    accuracy                           0.89       184
   macro avg       0.88      0.88      0.88       184
weighted avg       0.89      0.89      0.89       184



In [7]:
comparison = pd.DataFrame({
    "Model": ["Baseline RF", "Tuned RF"],
    "Accuracy": [
        accuracy_score(y_test, y_pred_base),
        accuracy_score(y_test, y_pred_tuned)
    ],
    "ROC-AUC": [
        roc_auc_score(y_test, y_prob_base),
        roc_auc_score(y_test, y_prob_tuned)
    ]
})

comparison

         Model  Accuracy   ROC-AUC
0  Baseline RF  0.869565  0.924079
1     Tuned RF  0.885870  0.925873


## Overfitting Assessment

To evaluate potential overfitting, we compare:

- Cross-validation ROC-AUC
- Test ROC-AUC

If the cross-validation score is substantially higher than the test score,
the selected hyperparameters may be overfitted to the training folds, and the cross-validation estimate is optimistic.

In [8]:
print("Best CV ROC-AUC:", grid_search.best_score_)
print("Test ROC-AUC (Tuned):", roc_auc_score(y_test, y_prob_tuned))

difference = grid_search.best_score_ - roc_auc_score(y_test, y_prob_tuned)

print("Difference (CV - Test):", difference)

Best CV ROC-AUC: 0.9284798930869392
Test ROC-AUC (Tuned): 0.925872788139646
Difference (CV - Test): 0.002607104947293215
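
A complementary check, not performed in this notebook, is to compare training-fold scores with validation-fold scores: a large gap points to the model memorizing its training data. The sketch below shows one way to obtain both with `cross_validate` on synthetic data. `X_demo`, `y_demo`, and the forest settings are assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate

# Synthetic stand-in data (assumption); the notebook would use X_train / y_train
X_demo, y_demo = make_classification(n_samples=300, n_features=10, random_state=42)

cv_out = cross_validate(
    RandomForestClassifier(n_estimators=50, random_state=42),
    X_demo, y_demo,
    cv=5,
    scoring='roc_auc',
    return_train_score=True,  # also score each model on the folds it was trained on
)

train_auc = cv_out['train_score'].mean()
val_auc = cv_out['test_score'].mean()

print(f"Mean train ROC-AUC:       {train_auc:.4f}")
print(f"Mean validation ROC-AUC:  {val_auc:.4f}")
print(f"Gap (train - validation): {train_auc - val_auc:.4f}")
```

Fully grown forests typically score near 1.0 on their own training folds, so the size of the gap, rather than the training score alone, is the quantity to watch.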


## Tuning Results Interpretation

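- The tuned model's test ROC-AUC rises only slightly, from 0.9241 (baseline) to 0.9259.
- Test accuracy rises from 0.8696 to 0.8859, which corresponds to 3 more correct predictions out of 184 test samples.
- The small gain suggests that the baseline Random Forest's hyperparameters were already close to the best values in this grid.
- The selected configuration limits tree depth to 5 and uses 300 trees, so the main effect of tuning is a more constrained, more stable model with marginally better generalization.
- The difference between cross-validation and test ROC-AUC is small (about 0.003),
  indicating minimal overfitting and good generalization.

Given the medical context, even modest improvements in ROC-AUC contribute to better diagnostic reliability, although with only 184 test samples, differences of this size should be confirmed on additional data before being relied on.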
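
One way to judge whether a ROC-AUC gain of this size is distinguishable from sampling noise is a percentile bootstrap over the test set. The following is a minimal sketch under stated assumptions: `bootstrap_auc_diff` is a hypothetical helper, and `y_demo`, `prob_a`, and `prob_b` are synthetic stand-ins for `y_test`, `y_prob_base`, and `y_prob_tuned`. It is not a result from this notebook.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def bootstrap_auc_diff(y_true, prob_a, prob_b, n_boot=500, seed=0):
    """Percentile bootstrap interval for AUC(prob_b) - AUC(prob_a) on one test set."""
    rng = np.random.default_rng(seed)
    y_true, prob_a, prob_b = map(np.asarray, (y_true, prob_a, prob_b))
    n = len(y_true)
    diffs = []
    while len(diffs) < n_boot:
        idx = rng.integers(0, n, size=n)      # resample test rows with replacement
        if np.unique(y_true[idx]).size < 2:   # AUC is undefined with a single class
            continue
        diffs.append(
            roc_auc_score(y_true[idx], prob_b[idx])
            - roc_auc_score(y_true[idx], prob_a[idx])
        )
    return np.percentile(diffs, [2.5, 97.5])

# Synthetic stand-ins with the same test-set size as this notebook (184 rows)
rng = np.random.default_rng(42)
y_demo = rng.integers(0, 2, size=184)
prob_a = np.clip(0.5 * y_demo + rng.normal(0.25, 0.2, size=184), 0, 1)
prob_b = np.clip(prob_a + rng.normal(0, 0.03, size=184), 0, 1)

low, high = bootstrap_auc_diff(y_demo, prob_a, prob_b)
print(f"95% bootstrap interval for the ROC-AUC difference: [{low:.4f}, {high:.4f}]")
```

If the interval comfortably contains zero, the two models cannot be separated on this test set alone, and the tuned model is better justified by its simpler structure than by the score difference.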