## CVD Prediction - Cardiovascular Disease Dataset (Source: https://www.kaggle.com/datasets/sulianova/cardiovascular-disease-dataset/data)
Model Training and Evaluation

In [25]:
#load preprocessed data 
import pandas as pd
train_df = pd.read_csv("./data_subsets/train_25M_75F.csv")

X_test = pd.read_csv("./data_splits/X_test.csv")
y_test = pd.read_csv("./data_splits/y_test.csv")

#check out the data
train_df.head()

Unnamed: 0,source_id,gender,age,bmiclass,MAP,cholesterol,gluc,smoke,alco,active,cardio
0,36155,0,6,2,4,1,1,0,1,1,1
1,34616,0,5,2,5,1,1,0,0,1,1
2,29127,0,2,3,2,1,1,0,0,0,1
3,4008,0,6,1,3,1,1,0,0,0,1
4,33452,0,6,2,3,3,3,0,0,1,1


In [26]:
#drop source_id
train_df = train_df.drop(columns=["source_id"])

In [27]:
TARGET = "cardio"
SENSITIVE = "gender"   # 1 = Male, 0 = Female


# Identify feature types
binary_cols = ["gender", "smoke", "alco", "active"]
categorical_cols = ["age", "bmiclass", "MAP", "cholesterol", "gluc"]

X_train = train_df.drop(columns=[TARGET])
y_train = train_df[TARGET]

In [28]:
# One-hot encode the categorical columns; pass the binary columns through unchanged

import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# 1) fit encoder on TRAIN categoricals only
ohe = OneHotEncoder(handle_unknown="ignore", drop="if_binary", sparse_output=False)
ohe.fit(X_train[categorical_cols])

# 2) transform TRAIN and TEST
X_train_cat = pd.DataFrame(
    ohe.transform(X_train[categorical_cols]),
    columns=ohe.get_feature_names_out(categorical_cols),
    index=X_train.index
)
X_test_cat = pd.DataFrame(
    ohe.transform(X_test[categorical_cols]),
    columns=ohe.get_feature_names_out(categorical_cols),
    index=X_test.index
)

# 3) concatenate: encoded categoricals + binary columns
X_train_ready = pd.concat([X_train_cat, X_train[binary_cols]], axis=1)
X_test_ready  = pd.concat([X_test_cat,  X_test[binary_cols]],  axis=1)

print("Final feature shapes:", X_train_ready.shape, X_test_ready.shape)

Final feature shapes: (30000, 28) (11256, 28)


### Traditional ML Models - Baseline: K-Nearest Neighbors (KNN) & Decision Tree (DT)

In [29]:
#import required libraries
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score,
    classification_report, confusion_matrix
)

# Helper: print accuracy, precision, recall, F1, the classification report, and the confusion matrix
def evaluate_model(y_true, y_pred, model_name):
    print(f"=== {model_name} Evaluation ===")
    print("Accuracy :", accuracy_score(y_true, y_pred))
    print("Precision:", precision_score(y_true, y_pred, average='binary'))
    print("Recall   :", recall_score(y_true, y_pred, average='binary'))
    print("F1 Score :", f1_score(y_true, y_pred, average='binary'))
    print("\nClassification Report:\n", classification_report(y_true, y_pred))
    print("Confusion Matrix:\n", confusion_matrix(y_true, y_pred))
    print("\n" + "="*40 + "\n")

In [30]:
# KNN
from sklearn.neighbors import KNeighborsClassifier
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train_ready, y_train)

y_pred_knn = knn.predict(X_test_ready)
y_prob_knn = knn.predict_proba(X_test_ready)[:, 1] 

evaluate_model(y_test, y_pred_knn, "KNN")


# Decision Tree
from sklearn.tree import DecisionTreeClassifier
dt = DecisionTreeClassifier(random_state=42)
dt.fit(X_train_ready, y_train)

y_pred_dt = dt.predict(X_test_ready)
y_prob_dt = dt.predict_proba(X_test_ready)[:, 1]  
evaluate_model(y_test, y_pred_dt, "Decision Tree")

=== KNN Evaluation ===
Accuracy : 0.666133617626155
Precision: 0.6652914798206278
Recall   : 0.6622031780039279
F1 Score : 0.6637437365783823

Classification Report:
               precision    recall  f1-score   support

           0       0.67      0.67      0.67      5655
           1       0.67      0.66      0.66      5601

    accuracy                           0.67     11256
   macro avg       0.67      0.67      0.67     11256
weighted avg       0.67      0.67      0.67     11256

Confusion Matrix:
 [[3789 1866]
 [1892 3709]]


=== Decision Tree Evaluation ===
Accuracy : 0.6918976545842217
Precision: 0.7151502925156344
Recall   : 0.6329226923763613
F1 Score : 0.6715286986171624

Classification Report:
               precision    recall  f1-score   support

           0       0.67      0.75      0.71      5655
           1       0.72      0.63      0.67      5601

    accuracy                           0.69     11256
   macro avg       0.69      0.69      0.69     11256
weighted

# Model Interpretations: KNN and DT

## 1. KNN
- **Accuracy:** 0.666  
- **Precision:** 0.665  
- **Recall:** 0.662  
- **F1 Score:** 0.664   

**Confusion Matrix**  
|               | Predicted: 0 | Predicted: 1 |
|--------------:|-------------:|-------------:|
| **Actual: 0** | 3789         | 1866         |
| **Actual: 1** | 1892         | 3709         |

- **False negatives:** 1892 (positives missed)  
- **False positives:** 1866 (negatives flagged)  

**Interpretation:**  
The KNN model shows **balanced precision and recall (~0.66)**, leading to moderate overall performance.  
It misses **1892 positives** (about 34% of actual cases) and misclassifies a similar number of negatives (1866).  
With roughly one in three cases wrong in each class, the default `n_neighbors=5` leaves clear room for tuning.

---

## 2. Decision Tree
- **Accuracy:** 0.692  
- **Precision:** 0.715  
- **Recall:** 0.633  
- **F1 Score:** 0.672  

**Confusion Matrix**  
|               | Predicted: 0 | Predicted: 1 |
|--------------:|-------------:|-------------:|
| **Actual: 0** | 4243         | 1412         |
| **Actual: 1** | 2056         | 3545         |

- **False negatives:** 2056 (positives missed)  
- **False positives:** 1412 (negatives flagged)  

**Interpretation:**  
The Decision Tree achieves **higher precision (0.715)** and **accuracy (0.692)** than KNN, but lower recall (0.633 vs. 0.662).  
It identifies negatives more effectively (class 0 recall: 0.75), but misses more positives (2056 vs. 1892).  
This makes it a **conservative model**: fewer false alarms, but at the cost of overlooking more positive cases.
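
The headline metrics in both sections follow directly from the confusion-matrix counts. As a quick sanity check, the sketch below recomputes them in plain Python. The helper name `metrics_from_confusion` is illustrative and not part of the notebook.

```python
def metrics_from_confusion(tn, fp, fn, tp):
    """Recompute binary-classification metrics from confusion-matrix counts."""
    total = tn + fp + fn + tp
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }

# Counts copied from the two confusion matrices above (rows = actual, cols = predicted)
knn_metrics = metrics_from_confusion(tn=3789, fp=1866, fn=1892, tp=3709)
dt_metrics = metrics_from_confusion(tn=4243, fp=1412, fn=2056, tp=3545)

for name, m in [("KNN", knn_metrics), ("Decision Tree", dt_metrics)]:
    print(name, {k: round(v, 3) for k, v in m.items()})
```

The rounded values should agree with the bullet lists above, which is a cheap way to catch a transposed confusion matrix.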

---

### KNN Improvement
The code improves the KNN model by performing a **grid search** over three hyperparameters (`n_neighbors`, `weights`, and `metric`) and selecting the configuration with the best cross-validated F1. The tuned model is then evaluated on the test set at the default decision threshold. **Decision threshold tuning** to boost recall, which is critical in medical prediction tasks, is a possible follow-up that this cell does not implement. 

In [31]:
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import recall_score, precision_score, f1_score
import numpy as np

# 1) Hyperparameter tuning for KNN 
param_grid = {
    "n_neighbors": list(range(1, 31)),
    "weights": ["uniform", "distance"],
    "metric": ["euclidean", "manhattan", "minkowski"],  # minkowski with p=2 is euclidean
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

grid = GridSearchCV(
    KNeighborsClassifier(),
    param_grid=param_grid,
    cv=cv,
    scoring="f1",        
    n_jobs=-1,
    verbose=0,
    refit=True
)

# Fit 
grid.fit(X_train_ready, y_train)

print("Best KNN params:", grid.best_params_)
print("Best CV F1:", grid.best_score_)

best_knn = grid.best_estimator_

# 2) Evaluate best KNN on TEST 
y_pred_knn_best = best_knn.predict(X_test_ready)
y_prob_knn_best = best_knn.predict_proba(X_test_ready)[:, 1]   

evaluate_model(y_test, y_pred_knn_best, "KNN (best params)")

Best KNN params: {'metric': 'euclidean', 'n_neighbors': 29, 'weights': 'uniform'}
Best CV F1: 0.6942235070474742
=== KNN (best params) Evaluation ===
Accuracy : 0.7063788201847904
Precision: 0.7087272727272728
Recall   : 0.6959471522942332
F1 Score : 0.7022790739573012

Classification Report:
               precision    recall  f1-score   support

           0       0.70      0.72      0.71      5655
           1       0.71      0.70      0.70      5601

    accuracy                           0.71     11256
   macro avg       0.71      0.71      0.71     11256
weighted avg       0.71      0.71      0.71     11256

Confusion Matrix:
 [[4053 1602]
 [1703 3898]]
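
The grid search above is exhaustive, so its cost is the number of parameter combinations times the number of folds. The sketch below counts them with `itertools.product`. It also shows how many combinations are duplicates, given that `"minkowski"` with the default `p=2` is the same distance as `"euclidean"` (as the comment in the cell notes). The variable names are illustrative.

```python
from itertools import product

param_grid = {
    "n_neighbors": list(range(1, 31)),
    "weights": ["uniform", "distance"],
    "metric": ["euclidean", "manhattan", "minkowski"],
}
n_splits = 5  # folds used by the StratifiedKFold above

# Every combination the exhaustive grid search evaluates
candidates = list(product(*param_grid.values()))
n_fits = len(candidates) * n_splits

# Treat "minkowski" (default p=2) as an alias of "euclidean" and drop duplicates
canonical = {
    (k, w, "euclidean" if m == "minkowski" else m) for k, w, m in candidates
}
n_fits_deduped = len(canonical) * n_splits

print(f"{len(candidates)} candidates -> {n_fits} fits")
print(f"{len(canonical)} distinct candidates -> {n_fits_deduped} fits")
```

Dropping `"minkowski"` from the grid would leave the selected model unchanged while cutting the search time by about a third.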




# Tuned KNN - Interpretation:

## 1. KNN
- **Accuracy:** 0.706  
- **Precision:** 0.709  
- **Recall:** 0.696  
- **F1 Score:** 0.702  

**Confusion Matrix**  
|               | Predicted: 0 | Predicted: 1 |
|--------------:|-------------:|-------------:|
| **Actual: 0** | 4053         | 1602         |
| **Actual: 1** | 1703         | 3898         |

- **False negatives:** 1703 (positives missed)  
- **False positives:** 1602 (negatives flagged)  

**Interpretation:**  
The tuned KNN model demonstrates **balanced performance**, with precision and recall both around **0.70**.  
It misses 1703 positive cases and incorrectly flags 1602 negatives, so errors are fairly evenly distributed.  
Test F1 (0.702) is close to the cross-validated F1 (0.694), which suggests the model is not overfitting. In a healthcare context, however, missing ~30% of positive (disease) cases could be critical.  
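
One way to trade precision for recall without retraining is to lower the decision threshold applied to the predicted probabilities. The sketch below is a minimal illustration on made-up labels and scores; `sweep_thresholds` is a hypothetical helper, not something the notebook defines.

```python
def sweep_thresholds(y_true, y_prob, thresholds):
    """Return (threshold, precision, recall) for each candidate threshold."""
    rows = []
    for t in thresholds:
        y_pred = [1 if p >= t else 0 for p in y_prob]
        tp = sum(1 for yt, yp in zip(y_true, y_pred) if yt == 1 and yp == 1)
        fp = sum(1 for yt, yp in zip(y_true, y_pred) if yt == 0 and yp == 1)
        fn = sum(1 for yt, yp in zip(y_true, y_pred) if yt == 1 and yp == 0)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        rows.append((t, precision, recall))
    return rows

# Toy labels and scores, made up for illustration only (not notebook outputs)
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_prob = [0.9, 0.7, 0.45, 0.35, 0.6, 0.4, 0.2, 0.1]

results = sweep_thresholds(y_true, y_prob, thresholds=[0.5, 0.4, 0.3])
for t, precision, recall in results:
    print(f"threshold={t:.2f}  precision={precision:.3f}  recall={recall:.3f}")
```

In the notebook, the scores would come from `y_prob_knn_best`. The threshold itself should be chosen on validation folds rather than on the test set, so that the reported test metrics stay unbiased.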

---

### Further KNN Improvement - Implementing PCA 

In [32]:
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score,
    classification_report, confusion_matrix
)
import numpy as np

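# 1) PCA + KNN pipeline
# Note: these KNN settings (k=15, manhattan, distance-weighted) differ from the grid-search
# optimum above (k=29, euclidean, uniform), so this is not a like-for-like comparison.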
pca_knn = Pipeline([
    ('pca', PCA(n_components=0.95, random_state=42)),  # keep 95% variance
    ('knn', KNeighborsClassifier(
        n_neighbors=15, metric='manhattan', weights='distance'
    ))
])

pca_knn.fit(X_train_ready, y_train)

# Inspect PCA details
n_comp = pca_knn.named_steps['pca'].n_components_
expl_var = pca_knn.named_steps['pca'].explained_variance_ratio_.sum()
print(f"PCA components: {n_comp} | Explained variance retained: {expl_var:.3f}")

#2) Evaluate 
y_pred_pca_knn = pca_knn.predict(X_test_ready)
probs_pca_knn = pca_knn.predict_proba(X_test_ready)[:, 1]

evaluate_model(y_test, y_pred_pca_knn, "PCA+KNN")

PCA components: 16 | Explained variance retained: 0.952
=== PCA+KNN Evaluation ===
Accuracy : 0.6926972281449894
Precision: 0.7134316460741331
Recall   : 0.6391715765041956
F1 Score : 0.6742631132875035

Classification Report:
               precision    recall  f1-score   support

           0       0.68      0.75      0.71      5655
           1       0.71      0.64      0.67      5601

    accuracy                           0.69     11256
   macro avg       0.69      0.69      0.69     11256
weighted avg       0.69      0.69      0.69     11256

Confusion Matrix:
 [[4217 1438]
 [2021 3580]]




# Model Interpretations

## 1. Baseline KNN
- **Accuracy:** 0.666  
- **Precision:** 0.665  
- **Recall:** 0.662  
- **F1 Score:** 0.664  

**Confusion Matrix**  
|               | Predicted: 0 | Predicted: 1 |
|--------------:|-------------:|-------------:|
| **Actual: 0** | 3789         | 1866         |
| **Actual: 1** | 1892         | 3709         |

- **False negatives:** 1892 (positives missed)  
- **False positives:** 1866 (negatives flagged)  

**Interpretation:**  
The baseline KNN model achieves **balanced but modest performance (~0.66 across all metrics)**.  
It misses ~1900 positive cases and misclassifies a similar number of negatives.  
The model captures general patterns but lacks refinement, making it a weak starting point.  

---

## 2. Optimized KNN (Best Params)
- **Accuracy:** 0.706  
- **Precision:** 0.709  
- **Recall:** 0.696  
- **F1 Score:** 0.702  

**Confusion Matrix**  
|               | Predicted: 0 | Predicted: 1 |
|--------------:|-------------:|-------------:|
| **Actual: 0** | 4053         | 1602         |
| **Actual: 1** | 1703         | 3898         |

- **False negatives:** 1703 (positives missed)  
- **False positives:** 1602 (negatives flagged)  

**Interpretation:**  
The tuned KNN model shows **clear improvement** over baseline, with F1 rising from **0.66 → 0.70**.  
Both error types decrease: missed positives fall from 1892 to 1703 and false alarms from 1866 to 1602.  
This indicates that **hyperparameter tuning (n_neighbors=29, Euclidean distance, uniform weights)** improves every metric by roughly three to four percentage points.  

---

## 3. PCA + KNN (16 components, 95.2% variance)
- **Accuracy:** 0.693  
- **Precision:** 0.713  
- **Recall:** 0.639  
- **F1 Score:** 0.674  

**Confusion Matrix**  
|               | Predicted: 0 | Predicted: 1 |
|--------------:|-------------:|-------------:|
| **Actual: 0** | 4217         | 1438         |
| **Actual: 1** | 2021         | 3580         |

- **False negatives:** 2021 (positives missed)  
- **False positives:** 1438 (negatives flagged)  

**Interpretation:**  
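Applying PCA before KNN reduces dimensionality (28 features → 16 components) but leads to **imbalanced performance** relative to the tuned KNN:  
- Precision improves slightly (0.713 vs. 0.709), meaning fewer false positives (1438 vs. 1602).  
- Recall drops (0.639 vs. 0.696), meaning **more positives are missed** (2021 vs. 1703).  

Predictions for “no disease” become cleaner, but the model sacrifices sensitivity to true positives, which is risky in medical contexts. Because this pipeline also uses different KNN settings (`n_neighbors=15`, Manhattan distance, distance weighting) than the tuned model, the drop cannot be attributed to PCA information loss alone.  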

---

# Summary:
- **Baseline KNN:** Balanced but weak (~0.66 F1).  
- **Optimized KNN:** Best overall performance (~0.70 F1, balanced precision/recall).  
- **PCA + KNN:** Precision gains but recall loss → not ideal for medical use (misses too many cases).  

**Best choice:** Optimized KNN with tuned parameters.  
PCA + KNN is **not recommended** here due to reduced recall, which increases the number of missed positive (disease) patients.  
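
The PCA pipeline above and the tuned KNN use different neighbor settings, so their gap mixes two effects. A like-for-like check would hold the KNN hyperparameters fixed and toggle only the PCA step. The sketch below shows one way to set that up. It runs on synthetic data from `make_classification` as a stand-in, so the numbers it prints say nothing about the CVD dataset.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline

# Synthetic stand-in data (assumption): swap in X_train_ready / y_train in the notebook
X, y = make_classification(n_samples=400, n_features=12, n_informative=6, random_state=0)

# Same KNN settings in both arms, so any gap is attributable to the PCA step
knn_params = {"n_neighbors": 29, "metric": "euclidean", "weights": "uniform"}
models = {
    "knn": KNeighborsClassifier(**knn_params),
    "pca_knn": Pipeline([
        ("pca", PCA(n_components=0.95, svd_solver="full")),  # keep 95% variance
        ("knn", KNeighborsClassifier(**knn_params)),
    ]),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
recall_by_model = {
    name: cross_val_score(model, X, y, cv=cv, scoring="recall").mean()
    for name, model in models.items()
}
print(recall_by_model)
```

Placing PCA inside the `Pipeline` means it is refit on each training fold, which keeps the cross-validated recall free of leakage from the held-out fold.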

---

In [45]:
#saving best performing KNN Model for fairness evaluation
import joblib, pandas as pd, numpy as np

# Ensure y_test is a Series (not a DataFrame)
if isinstance(y_test, pd.DataFrame):
    y_test = y_test.squeeze("columns")

# Save model
model_filename = "knn_tuned_model.pkl"
joblib.dump(best_knn, model_filename)

# Ensure 1D arrays
y_true = y_test.to_numpy() if hasattr(y_test, "to_numpy") else np.asarray(y_test)
y_pred_knn = y_pred_knn_best
y_prob_knn = y_prob_knn_best

# Optional gender column if present
if isinstance(X_test, pd.DataFrame) and "gender" in X_test.columns:
    gender_vals = X_test["gender"].to_numpy()
else:
    gender_vals = np.full(shape=len(y_true), fill_value=np.nan)

# Build and save results
results = pd.DataFrame({
    "gender": gender_vals,
    "y_true": y_true,
    "y_prob": y_prob_knn,
    "y_pred": y_pred_knn
})

preds_filename = "CVDKaggleData_75F25M__tunedKNN_predictions.csv"
results.to_csv(preds_filename, index=False)

print(f"Saved tuned KNN model → {model_filename}")
print(f"Saved predictions → {preds_filename}")

Saved tuned KNN model → knn_tuned_model.pkl
Saved predictions → CVDKaggleData_75F25M__tunedKNN_predictions.csv
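
The saved file holds `gender`, `y_true`, `y_prob`, and `y_pred`, which is enough to compare error rates between groups. The sketch below computes recall per gender on a few made-up rows as a hedged illustration of how the later fairness evaluation might start. `recall_by_group` is a hypothetical helper, and the actual fairness analysis may use different metrics.

```python
from collections import defaultdict

def recall_by_group(rows, group_key="gender"):
    """Compute recall (TP / (TP + FN)) separately for each group value."""
    tp = defaultdict(int)
    fn = defaultdict(int)
    for row in rows:
        if row["y_true"] == 1:
            if row["y_pred"] == 1:
                tp[row[group_key]] += 1
            else:
                fn[row[group_key]] += 1
    return {g: tp[g] / (tp[g] + fn[g]) for g in set(tp) | set(fn)}

# Toy rows, made up for illustration; real rows would come from the saved CSV
# (gender coding as in the notebook: 1 = Male, 0 = Female)
rows = [
    {"gender": 0, "y_true": 1, "y_pred": 1},
    {"gender": 0, "y_true": 1, "y_pred": 0},
    {"gender": 0, "y_true": 0, "y_pred": 0},
    {"gender": 1, "y_true": 1, "y_pred": 1},
    {"gender": 1, "y_true": 1, "y_pred": 1},
    {"gender": 1, "y_true": 1, "y_pred": 0},
    {"gender": 1, "y_true": 0, "y_pred": 1},
]

group_recall = recall_by_group(rows)
recall_gap = abs(group_recall[1] - group_recall[0])
print(group_recall, f"gap={recall_gap:.3f}")
```

A large recall gap between groups would mean the model misses disease cases more often for one gender than for the other.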


### Improvement - Decision Tree (DT)

In [34]:
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# 1) Base model
dt = DecisionTreeClassifier(random_state=42)

# 2) Hyperparameter grid 
param_grid = {
    "criterion": ["gini", "entropy"],
    "max_depth": [3, 5, 7, 9, None],
    "min_samples_split": [2, 5, 10, 20],
    "min_samples_leaf": [1, 2, 4, 6, 10],
}

# 3) Cross-validation setup
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

# 4) Grid search 
grid_dt = GridSearchCV(
    estimator=dt,
    param_grid=param_grid,
    cv=cv,
    scoring="recall",
    n_jobs=-1,
    verbose=0
)

grid_dt.fit(X_train_ready, y_train)

print("Best Decision Tree params:", grid_dt.best_params_)
print("Best CV Recall:", grid_dt.best_score_)  # scoring="recall", so this is recall, not F1

# 5) Train & evaluate best DT
best_dt = grid_dt.best_estimator_
y_pred_dt_best = best_dt.predict(X_test_ready)
y_prob_dt_best = best_dt.predict_proba(X_test_ready)[:, 1]  

evaluate_model(y_test, y_pred_dt_best, "Tuned Decision Tree")

Best Decision Tree params: {'criterion': 'gini', 'max_depth': 7, 'min_samples_leaf': 10, 'min_samples_split': 2}
Best CV Recall: 0.6651999999999999
=== Tuned Decision Tree Evaluation ===
Accuracy : 0.7112651030561479
Precision: 0.709573899090747
Recall   : 0.7105873951080164
F1 Score : 0.7100802854594113

Classification Report:
               precision    recall  f1-score   support

           0       0.71      0.71      0.71      5655
           1       0.71      0.71      0.71      5601

    accuracy                           0.71     11256
   macro avg       0.71      0.71      0.71     11256
weighted avg       0.71      0.71      0.71     11256

Confusion Matrix:
 [[4026 1629]
 [1621 3980]]




# Tuned Decision Tree Interpretation

## Tuned Decision Tree
- **Accuracy:** 0.711  
- **Precision:** 0.710  
- **Recall:** 0.711  
- **F1 Score:** 0.710  

**Confusion Matrix**  
|               | Predicted: 0 | Predicted: 1 |
|--------------:|-------------:|-------------:|
| **Actual: 0** | 4026         | 1629         |
| **Actual: 1** | 1621         | 3980         |

- **False negatives:** 1621 (positives missed)  
- **False positives:** 1629 (negatives flagged)  

**Interpretation:**  
The tuned Decision Tree achieves **balanced and stable performance (~0.71 across all metrics)**, slightly outperforming the optimized KNN model.  
- Errors are nearly symmetric, with false negatives (1621) and false positives (1629) at similar levels.  
- Both classes (disease vs. no disease) are predicted about equally well, with no skew toward either class.  
- The chosen hyperparameters (`max_depth=7`, `min_samples_leaf=10`) constrain tree complexity, which likely limits overfitting relative to the unconstrained baseline tree.  

Compared to the tuned KNN, the Decision Tree offers **slightly higher recall** (0.711 vs. 0.696) and **fewer missed positives** (1621 vs. 1703), making it the more sensitive of the two.  

---

In [35]:
# Alternative DT tuning: simpler trees + class balancing + cost-complexity pruning
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.metrics import recall_score, precision_score, f1_score
import numpy as np

# Stage A: bias toward simpler trees with class_weight="balanced"
base_dt = DecisionTreeClassifier(random_state=42, class_weight="balanced")

param_grid_simple = {
    "criterion": ["gini", "entropy"],
    "max_depth": [3, 4, 5, 6, 7],
    "min_samples_split": [5, 10, 20],
    "min_samples_leaf": [2, 4, 6],
    "min_impurity_decrease": [0.0, 1e-4, 1e-3],  # tiny regularization
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

grid_simple = GridSearchCV(
    estimator=base_dt,
    param_grid=param_grid_simple,
    cv=cv,
    scoring="recall",        # recall-focused search
    n_jobs=-1,
    verbose=0,
    refit=True
)
grid_simple.fit(X_train_ready, y_train)

print("Stage A — Best simple DT params:", grid_simple.best_params_)
print("Stage A — Best CV Recall:", grid_simple.best_score_)
simple_dt = grid_simple.best_estimator_

# Stage B: cost-complexity pruning on the best simple DT
path = simple_dt.cost_complexity_pruning_path(X_train_ready, y_train)
ccp_alphas = path.ccp_alphas

unique_alphas = np.unique(np.round(ccp_alphas, 6))
candidate_alphas = np.linspace(unique_alphas.min(), unique_alphas.max(), num=min(20, len(unique_alphas)))
candidate_alphas = np.unique(np.concatenate([candidate_alphas, [0.0]]))  # include no-pruning baseline

cv_scores = []
for alpha in candidate_alphas:
    dt_alpha = DecisionTreeClassifier(
        random_state=42,
        class_weight="balanced",
        criterion=simple_dt.criterion,
        max_depth=simple_dt.max_depth,
        min_samples_split=simple_dt.min_samples_split,
        min_samples_leaf=simple_dt.min_samples_leaf,
        min_impurity_decrease=simple_dt.min_impurity_decrease,
        ccp_alpha=alpha
    )
    # recall-focused CV
    recall_cv = cross_val_score(dt_alpha, X_train_ready, y_train, cv=cv, scoring="recall", n_jobs=-1).mean()
    cv_scores.append((alpha, recall_cv))

best_alpha, best_cv_recall = sorted(cv_scores, key=lambda x: x[1], reverse=True)[0]
print(f"Stage B — Best ccp_alpha: {best_alpha:.6f} | CV Recall: {best_cv_recall:.4f}")

# Final model fit with the chosen ccp_alpha
# (named pruned_dt so it does not overwrite best_dt, the grid-search estimator that is saved later)
pruned_dt = DecisionTreeClassifier(
    random_state=42,
    class_weight="balanced",
    criterion=simple_dt.criterion,
    max_depth=simple_dt.max_depth,
    min_samples_split=simple_dt.min_samples_split,
    min_samples_leaf=simple_dt.min_samples_leaf,
    min_impurity_decrease=simple_dt.min_impurity_decrease,
    ccp_alpha=best_alpha
).fit(X_train_ready, y_train)

# Evaluation
y_pred_dt = pruned_dt.predict(X_test_ready)
y_prob_dt = pruned_dt.predict_proba(X_test_ready)[:, 1]

evaluate_model(y_test, y_pred_dt, "Alternative Tuned & Pruned DT")

Stage A — Best simple DT params: {'criterion': 'gini', 'max_depth': 4, 'min_impurity_decrease': 0.0, 'min_samples_leaf': 4, 'min_samples_split': 5}
Stage A — Best CV Recall: 0.6784666666666668
Stage B — Best ccp_alpha: 0.000000 | CV Recall: 0.6785
=== Alternative Tuned & Pruned DT Evaluation ===
Accuracy : 0.7133084577114428
Precision: 0.7194085027726432
Recall   : 0.6948759150151759
F1 Score : 0.7069294342021615

Classification Report:
               precision    recall  f1-score   support

           0       0.71      0.73      0.72      5655
           1       0.72      0.69      0.71      5601

    accuracy                           0.71     11256
   macro avg       0.71      0.71      0.71     11256
weighted avg       0.71      0.71      0.71     11256

Confusion Matrix:
 [[4137 1518]
 [1709 3892]]




# Alternative Decision Tree Interpretation

## Tuned & Pruned Decision Tree
- **Accuracy:** 0.713  
- **Precision:** 0.719  
- **Recall:** 0.695  
- **F1 Score:** 0.707  

**Confusion Matrix**  
|               | Predicted: 0 | Predicted: 1 |
|--------------:|-------------:|-------------:|
| **Actual: 0** | 4137         | 1518         |
| **Actual: 1** | 1709         | 3892         |

- **False negatives:** 1709 (positives missed)  
- **False positives:** 1518 (negatives flagged)  

**Interpretation:**  
The alternative Decision Tree achieves **solid overall performance (~0.71 across metrics)**, very close to the tuned DT but with subtle differences:  
- **Higher precision (0.719 vs. 0.710)** → fewer false positives (1518 vs. 1629).  
- **Slightly lower recall (0.695 vs. 0.711)** → more missed positives (1709 vs. 1621).  
- Errors remain fairly balanced, but the model is more conservative in labeling positives.  

**Strengths:** Slightly higher accuracy and precision from a much shallower tree (`max_depth=4`). Note that Stage B selected `ccp_alpha = 0.0`, so no cost-complexity pruning was actually applied; the differences from the tuned DT come from the restricted grid and `class_weight="balanced"`.  
**Limitations:** Recall dropped a bit, which means more sick patients are missed compared to the tuned DT.  

**Conclusion:** This tree is **more precise but slightly less sensitive**. In medical contexts, this trade-off may not be ideal since missing cases (FN) is more costly than flagging extra false positives.  
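
For reference, minimal cost-complexity pruning (the criterion behind `ccp_alpha`) scores a subtree $T$ by its total leaf impurity $R(T)$ plus a penalty proportional to its number of leaves $|\widetilde{T}|$. The Stage B log above reports a best `ccp_alpha` of 0.000000. With that value the penalty term vanishes, so the final model is the Stage A tree with no nodes removed.

```latex
R_\alpha(T) = R(T) + \alpha\,|\widetilde{T}|,
\qquad
\alpha = 0 \;\Rightarrow\; R_0(T) = R(T)
```

Larger values of $\alpha$ favor smaller subtrees. A value of zero means the cross-validated recall did not improve under any of the candidate pruning strengths.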

---

# Decision Tree Model Comparisons

## 1. Baseline Decision Tree
- **Accuracy:** 0.692  
- **Precision:** 0.715  
- **Recall:** 0.633  
- **F1 Score:** 0.672  

**Confusion Matrix**  
|               | Predicted: 0 | Predicted: 1 |
|--------------:|-------------:|-------------:|
| **Actual: 0** | 4243         | 1412         |
| **Actual: 1** | 2056         | 3545         |

- **False negatives:** 2056 (positives missed)  
- **False positives:** 1412 (negatives flagged)  

**Interpretation:**  
The baseline tree shows **high precision but low recall**, meaning it is conservative in predicting positives.  
It misses over **2000 true cases**, which is problematic in healthcare contexts.  

---

## 2. Tuned Decision Tree
- **Accuracy:** 0.711  
- **Precision:** 0.710  
- **Recall:** 0.711  
- **F1 Score:** 0.710  

**Confusion Matrix**  
|               | Predicted: 0 | Predicted: 1 |
|--------------:|-------------:|-------------:|
| **Actual: 0** | 4026         | 1629         |
| **Actual: 1** | 1621         | 3980         |

- **False negatives:** 1621  
- **False positives:** 1629  

**Interpretation:**  
Tuning improves **balance across precision and recall**, raising recall substantially compared to baseline.  
The model now misses fewer positives while keeping errors evenly distributed.  
This is a **clear improvement** over the baseline.  

---

## 3. Alternative Tuned & Pruned Decision Tree
- **Accuracy:** 0.713  
- **Precision:** 0.719  
- **Recall:** 0.695  
- **F1 Score:** 0.707  

**Confusion Matrix**  
|               | Predicted: 0 | Predicted: 1 |
|--------------:|-------------:|-------------:|
| **Actual: 0** | 4137         | 1518         |
| **Actual: 1** | 1709         | 3892         |

- **False negatives:** 1709  
- **False positives:** 1518  

**Interpretation:**  
The alternative tree yields **slightly better accuracy and precision** but a **small drop in recall** compared to the tuned version.  
It is more conservative, catching fewer positives but reducing false alarms.  
Because Stage B selected `ccp_alpha = 0`, no pruning was applied; the model is simpler (`max_depth=4` vs. 7) but less sensitive.  

---

# Overall Comparison

| Model                          | Accuracy | Precision | Recall | F1 Score | Key Behavior |
|--------------------------------|----------|-----------|--------|----------|--------------|
| **Baseline DT**                | 0.692    | 0.715     | 0.633  | 0.672    | High precision, misses many positives |
| **Tuned DT**                   | 0.711    | 0.710     | 0.711  | 0.710    | Balanced, fewer missed positives |
| **Alternative Pruned DT**      | 0.713    | 0.719     | 0.695  | 0.707    | More precise, slightly less sensitive |

---

## Summary:
- **Best overall:** Tuned DT → best balance between accuracy, recall, and precision.  
- **More conservative than the tuned DT:** Pruned DT → fewer false alarms (1518 vs. 1629), but misses more positives (1709 vs. 1621).   
- **Worst option:** Baseline DT → too many positives missed.  

---

In [36]:
import joblib, pandas as pd, numpy as np

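# Save tuned Decision Tree model.
# Take the estimator from grid_dt directly so the saved model is the one that
# produced y_pred_dt_best / y_prob_dt_best below.
model_filename = "tuned_dt_model.pkl"
joblib.dump(grid_dt.best_estimator_, model_filename)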

# Ensure 1D arrays
y_true = y_test.to_numpy() if hasattr(y_test, "to_numpy") else np.asarray(y_test)
# Use tuned predictions/probabilities from the best estimator
y_pred = y_pred_dt_best
y_prob = y_prob_dt_best

# Optional gender column if present
if isinstance(X_test, pd.DataFrame) and "gender" in X_test.columns:
    gender_vals = X_test["gender"].to_numpy()
else:
    gender_vals = np.full(shape=len(y_true), fill_value=np.nan)

# Build and save results
results = pd.DataFrame({
    "gender": gender_vals,
    "y_true": y_true,
    "y_pred_dt": y_pred,
    "y_prob": y_prob
})

preds_filename = "CVDKaggleData_75F25M_DT_tuned_predictions.csv"
results.to_csv(preds_filename, index=False)

print(f"Saved tuned DT model → {model_filename}")
print(f"Saved predictions → {preds_filename}")

Saved tuned DT model → tuned_dt_model.pkl
Saved predictions → CVDKaggleData_75F25M_DT_tuned_predictions.csv


### Ensemble Model - Random Forest (RF)

In [37]:
from sklearn.ensemble import RandomForestClassifier

# Initialize Random Forest
rf = RandomForestClassifier(random_state=42)

# Train the model
rf.fit(X_train_ready, y_train)

# Predict on test set
y_pred_rf = rf.predict(X_test_ready)
evaluate_model(y_test, y_pred_rf, "Random Forest")

=== Random Forest Evaluation ===
Accuracy : 0.6974946695095949
Precision: 0.7109104878985786
Recall   : 0.6607748616318515
F1 Score : 0.6849264365688905

Classification Report:
               precision    recall  f1-score   support

           0       0.69      0.73      0.71      5655
           1       0.71      0.66      0.68      5601

    accuracy                           0.70     11256
   macro avg       0.70      0.70      0.70     11256
weighted avg       0.70      0.70      0.70     11256

Confusion Matrix:
 [[4150 1505]
 [1900 3701]]




# Random Forest Interpretation

## Random Forest
- **Accuracy:** 0.697  
- **Precision:** 0.711  
- **Recall:** 0.661  
- **F1 Score:** 0.685  

**Confusion Matrix**  
|               | Predicted: 0 | Predicted: 1 |
|--------------:|-------------:|-------------:|
| **Actual: 0** | 4150         | 1505         |
| **Actual: 1** | 1900         | 3701         |

- **False negatives:** 1900 (positives missed)  
- **False positives:** 1505 (negatives flagged)  

**Interpretation:**  
The Random Forest model achieves **~70% overall accuracy** with good precision (~0.71), meaning that most predicted positives are correct.  
However, recall (~0.66) is lower, showing that the model **misses about one-third of actual positive cases** (~1900 false negatives).  
The confusion matrix highlights this imbalance: while predictions for healthy cases (class 0) are fairly strong, the model struggles more with consistently identifying diseased patients (class 1).  

**Strengths:** Precision (~0.71) matches the tuned KNN and Decision Tree without any hyperparameter tuning.  
**Limitations:** Lower recall reduces sensitivity, which is critical in medical contexts since many positive cases remain undetected.  

---


### Improvement Random Forest (RF)

In [38]:
from sklearn.experimental import enable_halving_search_cv  # must be first
from sklearn.model_selection import HalvingGridSearchCV, StratifiedKFold
from sklearn.ensemble import RandomForestClassifier

Xtr = getattr(X_train_ready, "values", X_train_ready).astype("float32")
Xte = getattr(X_test_ready, "values", X_test_ready).astype("float32")

rf = RandomForestClassifier(random_state=42, n_jobs=1, bootstrap=True)

# Do NOT include 'n_estimators' in param_grid; Halving will vary it as the resource.
param_grid = {
    "max_depth": [None, 12],
    "min_samples_split": [2, 5],
    "min_samples_leaf": [1, 2],
    "max_features": ["sqrt"],
    "class_weight": ["balanced"],
}

cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=42)

search = HalvingGridSearchCV(
    estimator=rf,
    param_grid=param_grid,
    resource="n_estimators",   # use trees as the resource
    min_resources=100,         # start small
    max_resources=400,         # end at your intended max
    factor=3,
    cv=cv,
    scoring="recall",
    n_jobs=-1,
    verbose=1,
    refit=True,
    return_train_score=False,
)

search.fit(Xtr, y_train)
best_rf = search.best_estimator_
y_pred_rf = best_rf.predict(Xte)
y_prob_rf = best_rf.predict_proba(Xte)[:, 1]
evaluate_model(y_test, y_pred_rf, "Random Forest (best, halving over n_estimators)")



n_iterations: 2
n_required_iterations: 2
n_possible_iterations: 2
min_resources_: 100
max_resources_: 400
aggressive_elimination: False
factor: 3
----------
iter: 0
n_candidates: 8
n_resources: 100
Fitting 3 folds for each of 8 candidates, totalling 24 fits
----------
iter: 1
n_candidates: 3
n_resources: 300
Fitting 3 folds for each of 3 candidates, totalling 9 fits
=== Random Forest (best, halving over n_estimators) Evaluation ===
Accuracy : 0.7075337597725657
Precision: 0.7214655668521005
Recall   : 0.6714872344224245
F1 Score : 0.6955798039578325

Classification Report:
               precision    recall  f1-score   support

           0       0.70      0.74      0.72      5655
           1       0.72      0.67      0.70      5601

    accuracy                           0.71     11256
   macro avg       0.71      0.71      0.71     11256
weighted avg       0.71      0.71      0.71     11256

Confusion Matrix:
 [[4203 1452]
 [1840 3761]]




# Random Forest Comparisons

## 1. Baseline Random Forest
- **Accuracy:** 0.697  
- **Precision:** 0.711  
- **Recall:** 0.661  
- **F1 Score:** 0.685  

**Confusion Matrix**  
|               | Predicted: 0 | Predicted: 1 |
|--------------:|-------------:|-------------:|
| **Actual: 0** | 4150         | 1505         |
| **Actual: 1** | 1900         | 3701         |

- **False negatives:** 1900 (positives missed)  
- **False positives:** 1505 (negatives flagged)  

**Interpretation:**  
The baseline Random Forest shows **good precision (~0.71)** but **weaker recall (~0.66)**, meaning it misses ~1/3 of true positive cases.  
It balances class performance reasonably but under-detects diseased patients.  

---

## 2. Optimized Random Forest (Halving Search)
- **Accuracy:** 0.708  
- **Precision:** 0.721  
- **Recall:** 0.671  
- **F1 Score:** 0.696  

**Confusion Matrix**  
|               | Predicted: 0 | Predicted: 1 |
|--------------:|-------------:|-------------:|
| **Actual: 0** | 4203         | 1452         |
| **Actual: 1** | 1840         | 3761         |

- **False negatives:** 1840 (positives missed)  
- **False positives:** 1452 (negatives flagged)  

**Interpretation:**  
The tuned Random Forest improves on the baseline by about one percentage point in accuracy, precision, recall, and F1.  
It catches more actual positives (1,840 false negatives vs. 1,900) while also reducing false positives (1,452 vs. 1,505).  
The gain is small and measured on a single test split, so it suggests, rather than proves, that the tuned model is **more reliable and generalizable**.  

---

# Overall Comparison:

| Model                      | Accuracy | Precision | Recall | F1 Score | FN (missed) | FP (flagged) |
|-----------------------------|----------|-----------|--------|----------|-------------|--------------|
| **Baseline RF**            | 0.697    | 0.711     | 0.661  | 0.685    | 1900        | 1505         |
| **Optimized RF (Halving)** | 0.708    | 0.721     | 0.671  | 0.696    | 1840        | 1452         |

**Summary:**  
- The **optimized RF** improves across all metrics by roughly one percentage point each: accuracy 0.697 → 0.708, precision 0.711 → 0.721, recall 0.661 → 0.671, and F1 0.685 → 0.696.  
- It reduces both **false negatives** and **false positives**, making it better at detecting diseased patients while keeping false alarms in check.  
- **Best choice:** Optimized Random Forest.  
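
The figures in the comparison table can be re-derived from the two confusion matrices. The sketch below does that arithmetic and prints the change in percentage points; the helper name `metrics_from_confusion` is introduced here for illustration and is not part of the notebook.

```python
def metrics_from_confusion(tn, fp, fn, tp):
    """Derive accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Counts copied from the confusion matrices reported above
baseline = metrics_from_confusion(tn=4150, fp=1505, fn=1900, tp=3701)
tuned = metrics_from_confusion(tn=4203, fp=1452, fn=1840, tp=3761)

# Difference expressed in percentage points (pp), not relative percent
deltas_pp = {name: 100 * (tuned[name] - baseline[name]) for name in baseline}

for name, delta in deltas_pp.items():
    print(f"{name:<9} {baseline[name]:.3f} -> {tuned[name]:.3f} ({delta:+.2f} pp)")
```

Each delta comes out close to one percentage point, which is why the gain is best read as modest.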

---


In [39]:
# Save Tuned Random Forest Results

# Save tuned Random Forest model
model_filename = "tuned_rf_model.pkl"
joblib.dump(best_rf, model_filename)

# Ensure 1D arrays for y_true and y_pred
y_true = np.asarray(y_test).ravel()  # flattens a Series or single-column DataFrame to 1D
y_pred = y_pred_rf  # from best_rf.predict(Xte)
y_prob = y_prob_rf  # positive-class probability from best_rf.predict_proba(Xte)[:, 1]

# Optional gender column if present in test set
if isinstance(X_test, pd.DataFrame) and "gender" in X_test.columns:
    gender_vals = X_test["gender"].to_numpy()
else:
    gender_vals = np.full(shape=len(y_true), fill_value=np.nan)

# Build and save results DataFrame
results = pd.DataFrame({
    "gender": gender_vals,
    "y_true": y_true,
    "y_pred_rf": y_pred,
    "y_prob": y_prob_rf
})

preds_filename = "CVDKaggleData_75F25M_RF_tuned_predictions.csv"
results.to_csv(preds_filename, index=False)

print(f"Saved tuned RF model → {model_filename}")
print(f"Saved predictions → {preds_filename}")

Saved tuned RF model → tuned_rf_model.pkl
Saved predictions → CVDKaggleData_75F25M_RF_tuned_predictions.csv


### Deep Learning - Multi-layer Perceptron

In [40]:
#import required library 
from sklearn.neural_network import MLPClassifier

In [41]:
# Initialize MLP model
mlp = MLPClassifier(
    hidden_layer_sizes=(100,),   # one hidden layer with 100 neurons
    activation='relu',           # or 'tanh'
    solver='adam',               # optimizer
    max_iter=1000,                # increase if convergence warning appears
    random_state=42
)

# Train the model
mlp.fit(X_train_ready, y_train)

# Predict
y_pred_mlp = mlp.predict(X_test_ready)

evaluate_model(y_test, y_pred_mlp, "Multilayer Perceptron (MLP)")

=== Multilayer Perceptron (MLP) Evaluation ===
Accuracy : 0.7058457711442786
Precision: 0.7175151975683891
Recall   : 0.6743438671665773
F1 Score : 0.6952600092038657

Classification Report:
               precision    recall  f1-score   support

           0       0.70      0.74      0.72      5655
           1       0.72      0.67      0.70      5601

    accuracy                           0.71     11256
   macro avg       0.71      0.71      0.71     11256
weighted avg       0.71      0.71      0.71     11256

Confusion Matrix:
 [[4168 1487]
 [1824 3777]]




# Multilayer Perceptron (MLP) Interpretation

## MLP
- **Accuracy:** 0.706  
- **Precision:** 0.718  
- **Recall:** 0.674  
- **F1 Score:** 0.695  

**Confusion Matrix**  
|               | Predicted: 0 | Predicted: 1 |
|--------------:|-------------:|-------------:|
| **Actual: 0** | 4168         | 1487         |
| **Actual: 1** | 1824         | 3777         |

- **False negatives:** 1824 (positives missed)  
- **False positives:** 1487 (negatives flagged)  

**Interpretation:**  
The MLP achieves **~71% accuracy**, with **solid precision (~0.72)** and **moderate recall (~0.67)**.  
This means most of the model's positive predictions are correct, but it still misses about **1 in 3 actual cases** (1,824 false negatives out of 5,601 positives).  

The confusion matrix shows balanced performance:  
- Slightly stronger at identifying healthy individuals (class 0).  
- Still reasonably effective at detecting diseased patients (class 1), though sensitivity could be improved.  

**Strengths:** Good precision and balanced metrics across both classes.  
**Limitations:** Recall is lower, meaning a significant number of true positive cases are missed. In a healthcare setting, this could be critical.  
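
One common way to trade precision for recall without retraining is to lower the decision threshold applied to the predicted probability. For binary problems, `predict` effectively cuts the positive-class probability at 0.5. The sketch below shows the effect on a handful of made-up labels and scores (`y_true_demo` and `y_prob_demo` are hypothetical, not notebook outputs); with the notebook's model the scores would come from `mlp.predict_proba(X_test_ready)[:, 1]`.

```python
def predict_at_threshold(probs, threshold):
    """Label a sample positive when its probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in probs]


def precision_recall(y_true, y_pred):
    """Compute precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels and scores, for illustration only
y_true_demo = [1, 1, 1, 1, 0, 0, 0, 0]
y_prob_demo = [0.90, 0.60, 0.45, 0.30, 0.55, 0.40, 0.20, 0.10]

for thr in (0.5, 0.35):
    preds = predict_at_threshold(y_prob_demo, thr)
    precision, recall = precision_recall(y_true_demo, preds)
    print(f"threshold={thr:.2f}  precision={precision:.2f}  recall={recall:.2f}")
```

In this toy data, lowering the threshold raises recall and lowers precision. In practice the threshold should be chosen on a validation split rather than on the test set, so the reported test metrics stay unbiased.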

---

### Improvements - MLP

In [42]:
#Adam + Early Stopping 
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import classification_report, confusion_matrix

adammlp = MLPClassifier(
    hidden_layer_sizes=(64, 32),   # slightly smaller/deeper can help
    activation='relu',
    solver='adam',
    learning_rate_init=1e-3,       # scikit-learn default; lower it if training is unstable
    alpha=1e-3,                    # L2 regularization to reduce overfitting
    batch_size=32,
    max_iter=1000,                 # increased max_iter
    early_stopping=True,           # use a validation split internally
    validation_fraction=0.15,
    n_iter_no_change=25,          
    tol=1e-4,
    random_state=42
)

adammlp.fit(X_train_ready, y_train)  
y_pred_mlp = adammlp.predict(X_test_ready)                     
y_prob_mlp = adammlp.predict_proba(X_test_ready)[:, 1]         

evaluate_model(y_test, y_pred_mlp, "(Adam + EarlyStopping)")

=== (Adam + EarlyStopping) Evaluation ===
Accuracy : 0.7107320540156361
Precision: 0.713998904909655
Recall   : 0.698446705945367
F1 Score : 0.7061371841155235

Classification Report:
               precision    recall  f1-score   support

           0       0.71      0.72      0.72      5655
           1       0.71      0.70      0.71      5601

    accuracy                           0.71     11256
   macro avg       0.71      0.71      0.71     11256
weighted avg       0.71      0.71      0.71     11256

Confusion Matrix:
 [[4088 1567]
 [1689 3912]]




# Multilayer Perceptron (Adam + EarlyStopping) Interpretation

## MLP (Adam + EarlyStopping)
- **Accuracy:** 0.711  
- **Precision:** 0.714  
- **Recall:** 0.698  
- **F1 Score:** 0.706  

**Confusion Matrix**  
|               | Predicted: 0 | Predicted: 1 |
|--------------:|-------------:|-------------:|
| **Actual: 0** | 4088         | 1567         |
| **Actual: 1** | 1689         | 3912         |

- **False negatives:** 1689 (positives missed)  
- **False positives:** 1567 (negatives flagged)  

**Interpretation:**  
This MLP variant achieves **~71% accuracy**, with **well-balanced precision (0.714) and recall (0.698)**.  
Compared to the plain MLP, recall rises from 0.674 to 0.698, so it identifies more true positives, at the cost of somewhat more false positives (1,567 vs. 1,487).  

The confusion matrix shows balanced error distribution:  
- False negatives (missed positives) and false positives are nearly equal.  
- Both classes (healthy and diseased) are treated fairly evenly, with no strong bias.  

**Strengths:** Stable training via EarlyStopping, balanced metrics across precision and recall, fewer missed positives than the plain MLP.  
**Limitations:** While improved, a recall of ~0.70 still means about 30% of diseased patients (1,689 of 5,601) are not detected, which may be critical in medical settings.  
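
With `early_stopping=True`, scikit-learn holds out `validation_fraction` of the training data and stops once the validation score has failed to improve by at least `tol` for `n_iter_no_change` consecutive epochs. The sketch below is a simplified version of that patience rule, not scikit-learn's exact bookkeeping; `early_stop_epoch` and the scores in `demo_scores` are hypothetical.

```python
def early_stop_epoch(val_scores, patience, tol=1e-4):
    """Return the 0-based epoch at which training would stop, or None.

    Training stops after `patience` consecutive epochs whose validation
    score does not beat the best score seen so far by more than `tol`.
    """
    best = float("-inf")
    stale = 0
    for epoch, score in enumerate(val_scores):
        if score > best + tol:
            best = score      # real improvement: remember it and reset the counter
            stale = 0
        else:
            stale += 1        # no meaningful improvement this epoch
            if stale >= patience:
                return epoch
    return None


# Hypothetical per-epoch validation accuracies
demo_scores = [0.60, 0.66, 0.70, 0.70, 0.69, 0.70, 0.71, 0.70, 0.70, 0.70]

print(early_stop_epoch(demo_scores, patience=3))  # stops during the first plateau
print(early_stop_epoch(demo_scores, patience=4))  # survives it and runs to the end
```

A larger patience, such as the `n_iter_no_change=25` used above, tolerates longer plateaus before giving up, at the price of more epochs.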

---

### Further Improvement MLP 

In [43]:
# OPTION A — Fastest win: early stopping + single-metric scoring + lighter CV
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import RandomizedSearchCV, StratifiedKFold
from sklearn.metrics import fbeta_score, make_scorer
import numpy as np

# Keep arrays lean
Xtr = getattr(X_train_ready, "values", X_train_ready).astype("float32")
Xte = getattr(X_test_ready, "values", X_test_ready).astype("float32")

base_mlp = MLPClassifier(
    solver="adam",
    early_stopping=True,          # <- stop per-config when val score plateaus
    validation_fraction=0.1,
    n_iter_no_change=10,
    max_iter=200,                 # <- much lower; early_stopping will bail sooner
    tol=1e-4,
    random_state=42
)

# Trim the space; focus on what usually matters
param_dist = {
    "hidden_layer_sizes": [(64,), (128,), (64, 32)],
    "activation": ["relu"],       # 'relu' typically dominates for tabular MLPs
    "alpha": [1e-5, 1e-4, 3e-4, 1e-3],
    "learning_rate_init": [1e-3, 5e-4, 3e-4],
    "batch_size": [32, 64, 128],  # larger batch -> faster per-epoch
}

# Fewer folds cut search time roughly in proportion; CV estimates get somewhat noisier
cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=42)

# Use a single scorer that matches your objective (recall-weighted)
fbeta2 = make_scorer(fbeta_score, beta=2)

rs = RandomizedSearchCV(
    estimator=base_mlp,
    param_distributions=param_dist,
    n_iter=20,                    # fewer trials; early stop handles over-training
    scoring=fbeta2,               # <- single metric speeds everything up
    refit=True,                   # refit on full train with best params
    cv=cv,
    n_jobs=-1,
    verbose=1,
    random_state=42
)

rs.fit(Xtr, y_train)
best_mlp = rs.best_estimator_

print("Best MLP params:", rs.best_params_)
print(f"Best CV F-beta (β=2): {rs.best_score_:.4f}")

y_pred = best_mlp.predict(Xte)
y_prob = best_mlp.predict_proba(Xte)[:, 1]
evaluate_model(y_test, y_pred, model_name="Best MLP (Adam + ES, fast)")

Fitting 3 folds for each of 20 candidates, totalling 60 fits
Best MLP params: {'learning_rate_init': 0.0005, 'hidden_layer_sizes': (128,), 'batch_size': 128, 'alpha': 0.0003, 'activation': 'relu'}
Best CV F-beta (β=2): 0.6823
=== Best MLP (Adam + ES, fast) Evaluation ===
Accuracy : 0.7158848614072495
Precision: 0.7413135167704358
Recall   : 0.6589894661667559
F1 Score : 0.6977315689981096

Classification Report:
               precision    recall  f1-score   support

           0       0.70      0.77      0.73      5655
           1       0.74      0.66      0.70      5601

    accuracy                           0.72     11256
   macro avg       0.72      0.72      0.71     11256
weighted avg       0.72      0.72      0.71     11256

Confusion Matrix:
 [[4367 1288]
 [1910 3691]]




# Multilayer Perceptron (MLP) Comparisons

## 1. Baseline MLP
- **Accuracy:** 0.706  
- **Precision:** 0.718  
- **Recall:** 0.674  
- **F1 Score:** 0.695  

**Confusion Matrix**  
|               | Predicted: 0 | Predicted: 1 |
|--------------:|-------------:|-------------:|
| **Actual: 0** | 4168         | 1487         |
| **Actual: 1** | 1824         | 3777         |

- **False negatives:** 1824  
- **False positives:** 1487  

**Interpretation:**  
The baseline MLP achieves balanced performance with **strong precision (~0.72)** but **lower recall (~0.67)**, missing ~1/3 of positive cases.  

---

## 2. MLP (Adam + EarlyStopping)
- **Accuracy:** 0.711  
- **Precision:** 0.714  
- **Recall:** 0.698  
- **F1 Score:** 0.706  

**Confusion Matrix**  
|               | Predicted: 0 | Predicted: 1 |
|--------------:|-------------:|-------------:|
| **Actual: 0** | 4088         | 1567         |
| **Actual: 1** | 1689         | 3912         |

- **False negatives:** 1689  
- **False positives:** 1567  

**Interpretation:**  
This variant benefits from **slightly higher recall (~0.70)**, reducing missed positives compared to baseline.  
Errors are more balanced between classes, with stable precision and recall.  
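EarlyStopping likely helped limit overfitting, but this variant also changes the architecture, `alpha`, and batch size, so the gain cannot be attributed to EarlyStopping alone.  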

---

## 3. Best MLP (Adam + ES, tuned params)
- **Accuracy:** 0.716  
- **Precision:** 0.741  
- **Recall:** 0.659  
- **F1 Score:** 0.698  

**Confusion Matrix**  
|               | Predicted: 0 | Predicted: 1 |
|--------------:|-------------:|-------------:|
| **Actual: 0** | 4367         | 1288         |
| **Actual: 1** | 1910         | 3691         |

- **False negatives:** 1910  
- **False positives:** 1288  

**Interpretation:**  
The tuned MLP achieves the **highest accuracy (71.6%) and precision (~0.74)**, meaning fewer false alarms.  
However, recall drops back down to ~0.66, so it misses more positives (1,910 cases) than the EarlyStopping version (1,689).  
Although the search was scored on F-beta with β=2, which weights recall more heavily, the selected model ends up favoring **precision and overall accuracy** at the cost of sensitivity on the test set.  

---

# Overall Comparison

| Model                     | Accuracy | Precision | Recall | F1 Score | FN (missed) | FP (flagged) | Key Trait |
|----------------------------|----------|-----------|--------|----------|-------------|--------------|-----------|
| **Baseline MLP**          | 0.706    | 0.718     | 0.674  | 0.695    | 1824        | 1487         | Balanced, but moderate recall |
| **Adam + EarlyStopping**  | 0.711    | 0.714     | 0.698  | 0.706    | 1689        | 1567         | Best recall, fewer missed positives |
| **Best Tuned MLP**        | 0.716    | 0.741     | 0.659  | 0.698    | 1910        | 1288         | Highest accuracy & precision, lower recall |

---

## Summary:
- **Baseline MLP:** Balanced but weaker recall.  
- **Adam + EarlyStopping:** **Best recall**, fewer missed cases — more suitable for healthcare contexts where sensitivity matters.  
- **Best Tuned MLP:** **Highest accuracy & precision**, but lower recall, meaning more diseased patients are missed.  

The **Adam + EarlyStopping version** offers the most clinically relevant balance (catching more positives), while the tuned model is stronger if minimizing false positives is the priority.  
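
Since the randomized search was scored on F-beta with β=2, it is worth checking the same metric on the test set for all three variants. In terms of counts, F-beta is (1+β²)·TP / ((1+β²)·TP + β²·FN + FP). The sketch below applies that formula to the confusion matrices reported above; the helper name `fbeta_from_counts` is introduced here for illustration.

```python
def fbeta_from_counts(tp, fp, fn, beta):
    """F-beta score from confusion-matrix counts; beta > 1 favors recall."""
    b2 = beta ** 2
    return (1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp)


# Counts copied from the three MLP confusion matrices above
mlp_counts = {
    "Baseline MLP": {"tp": 3777, "fp": 1487, "fn": 1824},
    "Adam + EarlyStopping": {"tp": 3912, "fp": 1567, "fn": 1689},
    "Best Tuned MLP": {"tp": 3691, "fp": 1288, "fn": 1910},
}

f2_scores = {
    name: fbeta_from_counts(beta=2, **counts) for name, counts in mlp_counts.items()
}

for name, score in f2_scores.items():
    print(f"{name:<22} F2 = {score:.4f}")

best_by_f2 = max(f2_scores, key=f2_scores.get)
print("Highest test-set F2:", best_by_f2)
```

On these counts the Adam + EarlyStopping variant has the highest test-set F2, which is consistent with preferring it when sensitivity is the priority.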


In [44]:
# Save MLP (Adam + EarlyStopping) results
import joblib, pandas as pd, numpy as np

# Save the Adam + EarlyStopping model (adammlp), not the RandomizedSearchCV winner (best_mlp)
model_filename = "mlp_adamtuned.pkl"
joblib.dump(adammlp, model_filename)

# Ensure 1D arrays for y_true and y_pred
y_true = np.asarray(y_test).ravel()  # flattens a Series or single-column DataFrame to 1D
y_pred = y_pred_mlp
y_prob = y_prob_mlp

# Optional gender column if present in test set
if isinstance(X_test, pd.DataFrame) and "gender" in X_test.columns:
    gender_vals = X_test["gender"].to_numpy()
else:
    gender_vals = np.full(shape=len(y_true), fill_value=np.nan)

# Build and save results DataFrame
results = pd.DataFrame({
    "gender": gender_vals,
    "y_true": y_true,
    "y_pred": y_pred,
    "y_prob" : y_prob
})

preds_filename = "CVDKaggleData_75F25M_MLP_adamtuned_predictions.csv"
results.to_csv(preds_filename, index=False)

print(f"Saved Adam tuned MLP model → {model_filename}")
print(f"Saved predictions → {preds_filename}")

Saved Adam tuned MLP model → mlp_adamtuned.pkl
Saved predictions → CVDKaggleData_75F25M_MLP_adamtuned_predictions.csv
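
The saved predictions file keeps the `gender` column next to `y_true` and `y_pred`, which makes a per-group check possible later. The sketch below is one way to compute recall per gender from such a file. It is an assumption about how the file might be used, not part of the notebook: the rows in `demo_csv` are hypothetical, and the helper `recall_by_group` is introduced here for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows with the same columns as the saved MLP predictions file
demo_csv = """gender,y_true,y_pred,y_prob
0,1,1,0.81
0,1,0,0.42
0,0,0,0.20
1,1,1,0.77
1,1,1,0.66
1,0,1,0.58
"""


def recall_by_group(rows, group_col="gender", true_col="y_true", pred_col="y_pred"):
    """Recall for the positive class, computed separately for each group value."""
    positives = defaultdict(int)
    true_positives = defaultdict(int)
    for row in rows:
        # int(float(...)) accepts both "1" and "1.0" as written by pandas
        if int(float(row[true_col])) == 1:
            positives[row[group_col]] += 1
            if int(float(row[pred_col])) == 1:
                true_positives[row[group_col]] += 1
    return {group: true_positives[group] / positives[group] for group in positives}


rows = list(csv.DictReader(io.StringIO(demo_csv)))
recalls = recall_by_group(rows)
for group, value in sorted(recalls.items()):
    print(f"gender={group}  recall={value:.2f}")
```

To run it on the real output, replace the `io.StringIO(demo_csv)` source with an open handle to the saved CSV. Note that the Random Forest file names its prediction column `y_pred_rf`, so `pred_col` would need to be set accordingly.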
