# **Traditional ML Models**

## **Executive Summary**
- We compare Logistic Regression, KNN, Decision Tree, Random Forest, and SVM on unscaled, standard-scaled, and robust-scaled data.
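- Class imbalance (about 15% positives) is handled with class weights (`class_weight='balanced'`), SMOTE, or both.
- Hyperparameters are tuned by cross-validated ROC-AUC. Selection rule for the final model: highest test F1, with ties broken by precision, recall, then accuracy. Test ROC-AUC is also reported.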
- Outputs: tables, per-run and aggregate figures, and a single persisted best model.
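
The selection rule can be sketched as a sort on a tuple key. The snippet below is a minimal illustration in plain Python, not the notebook's own code (which sorts a DataFrame); `rank_runs` is a hypothetical helper, and the three rows reuse metric values reported in the results table of this notebook.

```python
# Hypothetical helper illustrating the F1-first selection rule.
def rank_runs(runs):
    """Sort runs by test F1, then precision, recall, accuracy (all descending)."""
    return sorted(
        runs,
        key=lambda r: (
            r["test_f1"],
            r["test_precision"],
            r["test_recall"],
            r["test_accuracy"],
        ),
        reverse=True,
    )


runs = [
    {"run": "unscaled__svm__smote", "test_f1": 0.350877,
     "test_precision": 0.244648, "test_recall": 0.620155, "test_accuracy": 0.650943},
    {"run": "standard_scaled__svm__class_weight", "test_f1": 0.354839,
     "test_precision": 0.252459, "test_recall": 0.596899, "test_accuracy": 0.669811},
    {"run": "unscaled__svm__class_weight", "test_f1": 0.354023,
     "test_precision": 0.251634, "test_recall": 0.596899, "test_accuracy": 0.668632},
]

ranked = rank_runs(runs)
best = ranked[0]
print(best["run"])
```

Because Python compares tuples element by element, later metrics only matter when all earlier ones are exactly equal.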

## **Setup**

In [4]:
import os
import sys
import json
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import joblib

from sklearn.base import clone
from sklearn.preprocessing import StandardScaler

PROJECT_ROOT = os.path.abspath(os.path.join(os.getcwd(), '..'))
if PROJECT_ROOT not in sys.path:
    sys.path.insert(0, PROJECT_ROOT)

from src.models.traditional_ml import (
    HAS_IMBLEARN,
    load_dataset,
    prepare_features,
    make_preprocessor,
    get_estimators,
    build_pipeline,
    get_search_spaces,
    get_cv,
    tune_model,
    predict_proba_positive,
    evaluate_predictions,
    plot_confusion_matrix,
    plot_roc_curve,
    plot_precision_recall_curve,
)


### **Paths and seed**

In [5]:
DATA_DIR = '../data/processed'
FIG_DIR = '../figures/02_ML_Models'
MODELS_DIR = '../models'
RANDOM_STATE = 42

os.makedirs(FIG_DIR, exist_ok=True)
os.makedirs(MODELS_DIR, exist_ok=True)


## **Feature and target definition**

In [6]:
TARGET = 'TenYearCHD'
FEATURES = [
    'bmi',
    'hypertension',
    'pulse_pressure',
    'cigarettes_per_day',
    'total_cholesterol',
    'glucose',
    'heart_rate',
    'age_group_code',
]

FEATURES


['bmi',
 'hypertension',
 'pulse_pressure',
 'cigarettes_per_day',
 'total_cholesterol',
 'glucose',
 'heart_rate',
 'age_group_code']

## **Data loading and validation**
Shapes, missing values, and class balance per dataset version.

In [7]:
dataset_versions = [
    ('unscaled', 'train_unscaled.csv', 'test_unscaled.csv', True),
    ('standard_scaled', 'train_standard_scaled.csv', 'test_standard_scaled.csv', False),
    ('robust_scaled', 'train_robust_scaled.csv', 'test_robust_scaled.csv', False),
]

datasets = {}
shape_rows = []
missing_rows = []
balance_rows = []

for name, train_file, test_file, needs_scaling in dataset_versions:
    train_df = load_dataset(os.path.join(DATA_DIR, train_file))
    test_df = load_dataset(os.path.join(DATA_DIR, test_file))

    X_train, y_train = prepare_features(train_df, FEATURES, TARGET)
    X_test, y_test = prepare_features(test_df, FEATURES, TARGET)

    datasets[name] = {
        'X_train': X_train,
        'y_train': y_train,
        'X_test': X_test,
        'y_test': y_test,
        'needs_scaling': needs_scaling,
    }

    shape_rows.append({
        'dataset_version': name,
        'train_rows': len(X_train),
        'test_rows': len(X_test),
        'n_features': X_train.shape[1],
    })

    missing_rows.append({
        'dataset_version': name,
        **{col: X_train[col].isna().sum() + X_test[col].isna().sum() for col in FEATURES},
    })

    balance_rows.append({
        'dataset_version': name,
        'train_positive_rate': y_train.mean(),
        'test_positive_rate': y_test.mean(),
    })

shape_table = pd.DataFrame(shape_rows)
missing_table = pd.DataFrame(missing_rows)
balance_table = pd.DataFrame(balance_rows)

shape_table


|  | dataset_version | train_rows | test_rows | n_features |
|---|---|---|---|---|
| 0 | unscaled | 3390 | 848 | 8 |
| 1 | standard_scaled | 3390 | 848 | 8 |
| 2 | robust_scaled | 3390 | 848 | 8 |


### **Missing values across selected features**

In [8]:
missing_table

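|  | dataset_version | bmi | hypertension | pulse_pressure | cigarettes_per_day | total_cholesterol | glucose | heart_rate | age_group_code |
|---|---|---|---|---|---|---|---|---|---|
| 0 | unscaled | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | standard_scaled | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | robust_scaled | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |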


### **Class balance (positive rate)**

In [9]:
balance_table

|  | dataset_version | train_positive_rate | test_positive_rate |
|---|---|---|---|
| 0 | unscaled | 0.151917 | 0.152123 |
| 1 | standard_scaled | 0.151917 | 0.152123 |
| 2 | robust_scaled | 0.151917 | 0.152123 |
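
With roughly 15% positives, scikit-learn's `class_weight='balanced'` option weights each class by `n_samples / (n_classes * class_count)`. The sketch below applies that formula in plain Python. The count of 515 positives among 3390 training rows is an assumption inferred from the reported positive rate (515 / 3390 ≈ 0.151917), not read from the data files, and `balanced_class_weights` is a hypothetical helper.

```python
def balanced_class_weights(class_counts):
    """Return n_samples / (n_classes * count) for every class label."""
    n_samples = sum(class_counts.values())
    n_classes = len(class_counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in class_counts.items()
    }


# Assumed counts, inferred from the 0.151917 training positive rate.
train_counts = {0: 3390 - 515, 1: 515}

weights = balanced_class_weights(train_counts)
ratio = weights[1] / weights[0]  # how much more one positive counts than one negative
print({label: round(w, 3) for label, w in weights.items()}, round(ratio, 2))
```

Under these assumed counts, each positive example carries about 5.6 times the weight of a negative one, which is the inverse of the class ratio.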


## **Modeling plan**
- Logistic Regression: linear decision boundary; benefits from scaling; supports `class_weight`.
- KNN: distance-based; highly sensitive to feature scale; no `class_weight` parameter.
- Decision Tree: non-linear and scale-invariant; supports `class_weight`; prone to overfitting.
- Random Forest: ensemble of trees; robust to scaling; supports `class_weight`; tuned with a small randomized search.
- SVM (RBF/linear): margin-based; needs scaling; supports `class_weight`; fitted with `probability=True` so ROC and PR curves can be drawn.
- Imbalance: SMOTE is applied within cross-validation only, so validation folds are never resampled; it is used only if `imblearn` is installed.
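
To illustrate why oversampling belongs inside cross-validation, the sketch below resamples the minority class of a training fold only and leaves the validation fold untouched. It is a simplified stand-in for SMOTE written in plain Python with made-up 2-D points; `smote_like` is a hypothetical helper, and the notebook itself relies on `build_pipeline` and `imblearn`, whose internals are not shown here.

```python
import random


def smote_like(minority, n_new, k=2, seed=42):
    """Create n_new synthetic points by interpolating between a minority
    point and one of its k nearest minority neighbours (simplified SMOTE)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Other minority points, ordered by squared distance to `base`.
        others = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )
        neighbour = rng.choice(others[:k])
        u = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Made-up training fold: 6 negatives, 3 positives.
train_neg = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2), (0.4, 0.1), (0.2, 0.4)]
train_pos = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3)]
# Validation fold: scored as-is, never resampled.
valid_fold = [((0.1, 0.2), 0), ((1.1, 1.1), 1)]

new_pos = smote_like(train_pos, n_new=len(train_neg) - len(train_pos))
balanced_pos = train_pos + new_pos
print(len(train_neg), len(balanced_pos), len(valid_fold))
```

If the synthetic points were generated before splitting, interpolated copies of validation rows could end up in the training folds and inflate the cross-validated score.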

## **Model and search setup**

In [10]:
base_estimators = get_estimators(RANDOM_STATE)
search_spaces = get_search_spaces(RANDOM_STATE)

imbalance_options = [
    ('none', {'use_smote': False, 'use_class_weight': False}),
    ('class_weight', {'use_smote': False, 'use_class_weight': True}),
]

if HAS_IMBLEARN:
    imbalance_options += [
        ('smote', {'use_smote': True, 'use_class_weight': False}),
        ('smote+class_weight', {'use_smote': True, 'use_class_weight': True}),
    ]

print(f"SMOTE available: {HAS_IMBLEARN}")
print('Models:', list(base_estimators.keys()))
print('Imbalance strategies:', [name for name, _ in imbalance_options])


SMOTE available: True
Models: ['logistic_regression', 'knn', 'decision_tree', 'random_forest', 'svm']
Imbalance strategies: ['none', 'class_weight', 'smote', 'smote+class_weight']
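
Given the models and strategies printed above, the experiment grid can be enumerated ahead of time. The sketch below is illustrative only: it assumes that KNN is the sole estimator without a `class_weight` parameter, which matches the modeling plan, and the variable names are hypothetical rather than taken from the notebook.

```python
from itertools import product

dataset_names = ["unscaled", "standard_scaled", "robust_scaled"]
model_names = ["logistic_regression", "knn", "decision_tree", "random_forest", "svm"]
strategy_names = ["none", "class_weight", "smote", "smote+class_weight"]

# Assumption: only KNN lacks a class_weight parameter.
no_class_weight = {"knn"}

runs = [
    (dataset, model, strategy)
    for dataset, model, strategy in product(dataset_names, model_names, strategy_names)
    if not ("class_weight" in strategy and model in no_class_weight)
]
print(len(runs))  # 3 * (5 * 4 - 2) = 54
```

The count of 54 is consistent with the row labels 0 through 53 that appear in the results table.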


## **Experiment helper**

In [11]:
cv = get_cv(random_state=RANDOM_STATE)
scale_sensitive = {'logistic_regression', 'knn', 'svm'}


def choose_scaler(dataset_version: str, model_name: str):
    if dataset_version == 'unscaled' and model_name in scale_sensitive:
        return StandardScaler()
    return None


def run_one_experiment(dataset_version, data_dict, model_name, imbalance_name, imbalance_cfg):
    estimator = clone(base_estimators[model_name])
    if imbalance_cfg['use_class_weight']:
        if 'class_weight' not in estimator.get_params():
            return None, None, None, None
        estimator.set_params(class_weight='balanced')

    use_smote = imbalance_cfg['use_smote'] and HAS_IMBLEARN
    preprocessor = make_preprocessor(
        FEATURES,
        scaler=choose_scaler(dataset_version, model_name),
    )

    pipeline = build_pipeline(
        preprocessor=preprocessor,
        estimator=estimator,
        use_smote=use_smote,
        random_state=RANDOM_STATE,
        allow_smote=HAS_IMBLEARN,
    )

    search_cfg = search_spaces.get(model_name, {'search_type': 'grid', 'params': {}})
    search = tune_model(
        estimator=pipeline,
        param_space=search_cfg.get('params', {}),
        X_train=data_dict['X_train'],
        y_train=data_dict['y_train'],
        cv=cv,
        scoring='roc_auc',
        search_type=search_cfg.get('search_type', 'grid'),
        n_iter=search_cfg.get('n_iter'),
        random_state=RANDOM_STATE,
    )

    best_estimator = search.best_estimator_
    y_pred = best_estimator.predict(data_dict['X_test'])
    y_proba = predict_proba_positive(best_estimator, data_dict['X_test'])
    metrics = evaluate_predictions(data_dict['y_test'], y_pred, y_proba)

    row = {
        'dataset_version': dataset_version,
        'model_name': model_name,
        'imbalance_method': imbalance_name,
        'best_params': json.dumps(search.best_params_, sort_keys=True),
        'cv_best_roc_auc': search.best_score_,
        'test_accuracy': metrics['accuracy'],
        'test_precision': metrics['precision'],
        'test_recall': metrics['recall'],
        'test_f1': metrics['f1'],
        'test_roc_auc': metrics['roc_auc'],
    }
    return row, best_estimator, y_pred, y_proba


## **Run experiments**
Iterate over dataset versions, models, and imbalance strategies, collecting metrics and fitted estimators in memory. Nothing is written to disk at this stage; only the best model is persisted at the end of the notebook.

In [12]:
comparison_rows = []
run_artifacts = {}

for dataset_version, data in datasets.items():
    print(f"=== Dataset: {dataset_version} ===")
    for model_name in base_estimators.keys():
        for imbalance_name, imbalance_cfg in imbalance_options:
            row, estimator, y_pred, y_proba = run_one_experiment(dataset_version, data, model_name, imbalance_name, imbalance_cfg)
            if row is None:
                print(f"Skipped {dataset_version} | {model_name} | {imbalance_name} (unsupported combo)")
                continue
            comparison_rows.append(row)
            prefix = f"{dataset_version}__{model_name}__{imbalance_name}"
            run_artifacts[prefix] = {
                'estimator': estimator,
                'y_pred': y_pred,
                'y_proba': y_proba,
                'y_true': data['y_test'],
            }
            print(
                f"{prefix} | CV roc_auc={row['cv_best_roc_auc']:.3f} | "
                f"Test f1={row['test_f1']:.3f} | Test precision={row['test_precision']:.3f} | "
                f"Test recall={row['test_recall']:.3f}"
            )


=== Dataset: unscaled ===
unscaled__logistic_regression__none | CV roc_auc=0.722 | Test f1=0.071 | Test precision=0.455 | Test recall=0.039
unscaled__logistic_regression__class_weight | CV roc_auc=0.721 | Test f1=0.342 | Test precision=0.241 | Test recall=0.589
unscaled__logistic_regression__smote | CV roc_auc=0.721 | Test f1=0.337 | Test precision=0.235 | Test recall=0.597
unscaled__logistic_regression__smote+class_weight | CV roc_auc=0.721 | Test f1=0.337 | Test precision=0.235 | Test recall=0.597
unscaled__knn__none | CV roc_auc=0.638 | Test f1=0.127 | Test precision=0.357 | Test recall=0.078
Skipped unscaled | knn | class_weight (unsupported combo)
unscaled__knn__smote | CV roc_auc=0.618 | Test f1=0.297 | Test precision=0.206 | Test recall=0.535
Skipped unscaled | knn | smote+class_weight (unsupported combo)
unscaled__decision_tree__none | CV roc_auc=0.611 | Test f1=0.189 | Test precision=0.333 | Test recall=0.132
unscaled__decision_tree__class_weight | CV roc_auc=0.593 | Test f1=0

## **Results table (F1-first sorting)**

In [13]:
comparison_df = pd.DataFrame(comparison_rows)
sort_cols = ['test_f1', 'test_precision', 'test_recall', 'test_accuracy']
comparison_df = comparison_df.sort_values(sort_cols, ascending=False)

comparison_csv_path = os.path.join(DATA_DIR, 'model_comparison.csv')
comparison_df.to_csv(comparison_csv_path, index=False)

print(f"Saved comparison table: {comparison_csv_path}")
comparison_df.head(10)


Saved comparison table: ../data/processed\model_comparison.csv


|  | dataset_version | model_name | imbalance_method | best_params | cv_best_roc_auc | test_accuracy | test_precision | test_recall | test_f1 | test_roc_auc |
|---|---|---|---|---|---|---|---|---|---|---|
| 33 | standard_scaled | svm | class_weight | `{"model__C": 5, "model__gamma": "scale", "mode...` | 0.721043 | 0.669811 | 0.252459 | 0.596899 | 0.354839 | 0.689712 |
| 15 | unscaled | svm | class_weight | `{"model__C": 5, "model__gamma": "scale", "mode...` | 0.721074 | 0.668632 | 0.251634 | 0.596899 | 0.354023 | 0.689712 |
| 51 | robust_scaled | svm | class_weight | `{"model__C": 5, "model__gamma": "scale", "mode...` | 0.720986 | 0.668632 | 0.251634 | 0.596899 | 0.354023 | 0.689588 |
| 16 | unscaled | svm | smote | `{"model__C": 5, "model__gamma": "scale", "mode...` | 0.721145 | 0.650943 | 0.244648 | 0.620155 | 0.350877 | 0.687615 |
| 17 | unscaled | svm | smote+class_weight | `{"model__C": 5, "model__gamma": "scale", "mode...` | 0.721145 | 0.650943 | 0.244648 | 0.620155 | 0.350877 | 0.687615 |
| 34 | standard_scaled | svm | smote | `{"model__C": 5, "model__gamma": "scale", "mode...` | 0.721135 | 0.650943 | 0.244648 | 0.620155 | 0.350877 | 0.687637 |
| 35 | standard_scaled | svm | smote+class_weight | `{"model__C": 5, "model__gamma": "scale", "mode...` | 0.721135 | 0.650943 | 0.244648 | 0.620155 | 0.350877 | 0.687637 |
| 52 | robust_scaled | svm | smote | `{"model__C": 5, "model__gamma": "scale", "mode...` | 0.720929 | 0.650943 | 0.244648 | 0.620155 | 0.350877 | 0.688887 |
| 53 | robust_scaled | svm | smote+class_weight | `{"model__C": 5, "model__gamma": "scale", "mode...` | 0.720929 | 0.650943 | 0.244648 | 0.620155 | 0.350877 | 0.688887 |
| 37 | robust_scaled | logistic_regression | class_weight | `{"model__C": 1, "model__penalty": "l2"}` | 0.721381 | 0.65684 | 0.242038 | 0.589147 | 0.343115 | 0.686785 |
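
The top row can be sanity-checked by hand. Assuming the test set holds 129 positives (848 × 0.152123 ≈ 129), the reported recall, precision, and accuracy imply whole-number confusion counts, and F1 follows from $F_1 = 2\,TP / (2\,TP + FP + FN)$. The counts below are inferred from the rounded metrics, not read from the saved confusion-matrix figure.

```python
n_test, n_pos = 848, 129  # 129 is inferred from the 0.152123 test positive rate
recall, precision, accuracy = 0.596899, 0.252459, 0.669811

tp = round(recall * n_pos)           # true positives
fp = round(tp / precision) - tp      # false positives
fn = n_pos - tp                      # false negatives
tn = round(accuracy * n_test) - tp   # true negatives

f1 = 2 * tp / (2 * tp + fp + fn)
print(tp, fp, fn, tn, round(f1, 6))
```

The four counts sum to 848 and reproduce the reported F1 of 0.354839, so the metrics in that row are mutually consistent.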


## **Results table (ROC-first sorting)**

In [22]:
comparison_df.sort_values("test_roc_auc", ascending=False).head(10)


|  | dataset_version | model_name | imbalance_method | best_params | cv_best_roc_auc | test_accuracy | test_precision | test_recall | test_f1 | test_roc_auc |
|---|---|---|---|---|---|---|---|---|---|---|
| 33 | standard_scaled | svm | class_weight | `{"model__C": 5, "model__gamma": "scale", "mode...` | 0.721043 | 0.669811 | 0.252459 | 0.596899 | 0.354839 | 0.689712 |
| 15 | unscaled | svm | class_weight | `{"model__C": 5, "model__gamma": "scale", "mode...` | 0.721074 | 0.668632 | 0.251634 | 0.596899 | 0.354023 | 0.689712 |
| 51 | robust_scaled | svm | class_weight | `{"model__C": 5, "model__gamma": "scale", "mode...` | 0.720986 | 0.668632 | 0.251634 | 0.596899 | 0.354023 | 0.689588 |
| 52 | robust_scaled | svm | smote | `{"model__C": 5, "model__gamma": "scale", "mode...` | 0.720929 | 0.650943 | 0.244648 | 0.620155 | 0.350877 | 0.688887 |
| 53 | robust_scaled | svm | smote+class_weight | `{"model__C": 5, "model__gamma": "scale", "mode...` | 0.720929 | 0.650943 | 0.244648 | 0.620155 | 0.350877 | 0.688887 |
| 34 | standard_scaled | svm | smote | `{"model__C": 5, "model__gamma": "scale", "mode...` | 0.721135 | 0.650943 | 0.244648 | 0.620155 | 0.350877 | 0.687637 |
| 35 | standard_scaled | svm | smote+class_weight | `{"model__C": 5, "model__gamma": "scale", "mode...` | 0.721135 | 0.650943 | 0.244648 | 0.620155 | 0.350877 | 0.687637 |
| 16 | unscaled | svm | smote | `{"model__C": 5, "model__gamma": "scale", "mode...` | 0.721145 | 0.650943 | 0.244648 | 0.620155 | 0.350877 | 0.687615 |
| 17 | unscaled | svm | smote+class_weight | `{"model__C": 5, "model__gamma": "scale", "mode...` | 0.721145 | 0.650943 | 0.244648 | 0.620155 | 0.350877 | 0.687615 |
| 36 | robust_scaled | logistic_regression | none | `{"model__C": 10, "model__penalty": "l1"}` | 0.722009 | 0.846698 | 0.454545 | 0.03876 | 0.071429 | 0.687141 |
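
ROC-AUC depends only on how the scores rank positives against negatives, not on where the decision threshold sits, which is why a run with recall near 0.04 can still appear in this table. A minimal sketch with made-up labels and scores, using the pairwise (Mann-Whitney) definition; `roc_auc_pairwise` is a hypothetical helper:

```python
def roc_auc_pairwise(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up data: every score is below 0.5, yet positives mostly rank higher.
y_true = [0, 0, 0, 0, 1, 1]
scores = [0.05, 0.10, 0.30, 0.20, 0.25, 0.40]

auc = roc_auc_pairwise(y_true, scores)
predicted_positive = [s >= 0.5 for s in scores]
print(auc, sum(predicted_positive))  # 0.875 0
```

Here no example is predicted positive at a 0.5 cut-off, so recall and F1 are zero, while seven of the eight positive-negative pairs are ordered correctly.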


### **Grouped summaries**

In [14]:
by_model = comparison_df.groupby('model_name')[['test_accuracy','test_precision','test_recall','test_f1','test_roc_auc']]
model_summary = by_model.agg(['mean','std'])

by_dataset = comparison_df.groupby('dataset_version')[['test_accuracy','test_precision','test_recall','test_f1','test_roc_auc']]
dataset_summary = by_dataset.agg(['mean','std'])

pivot_f1 = comparison_df.pivot_table(
    index='model_name', columns='dataset_version', values='test_f1', aggfunc='max'
)

print('Model-level summary (mean/std):')
display(model_summary)
print()
print('Dataset-level summary (mean/std):')
display(dataset_summary)
print()
print('Best test_f1 per model x dataset:')
display(pivot_f1)


Model-level summary (mean/std):


| model_name | test_accuracy mean | test_accuracy std | test_precision mean | test_precision std | test_recall mean | test_recall std | test_f1 mean | test_f1 std | test_roc_auc mean | test_roc_auc std |
|---|---|---|---|---|---|---|---|---|---|---|
| decision_tree | 0.726022 | 0.070508 | 0.248237 | 0.054311 | 0.331395 | 0.149445 | 0.256024 | 0.048225 | 0.604569 | 0.020597 |
| knn | 0.724843 | 0.125432 | 0.285344 | 0.092737 | 0.303618 | 0.245011 | 0.211664 | 0.088015 | 0.61498 | 0.013705 |
| logistic_regression | 0.697425 | 0.090156 | 0.291898 | 0.098113 | 0.456718 | 0.252092 | 0.272723 | 0.121408 | 0.686107 | 0.001008 |
| random_forest | 0.787932 | 0.044263 | 0.346352 | 0.13631 | 0.237726 | 0.141144 | 0.226891 | 0.104309 | 0.660673 | 0.003007 |
| svm | 0.704697 | 0.086685 | 0.268635 | 0.129164 | 0.468346 | 0.261139 | 0.280329 | 0.131916 | 0.66363 | 0.054167 |



Dataset-level summary (mean/std):


| dataset_version | test_accuracy mean | test_accuracy std | test_precision mean | test_precision std | test_recall mean | test_recall std | test_f1 mean | test_f1 std | test_roc_auc mean | test_roc_auc std |
|---|---|---|---|---|---|---|---|---|---|---|
| robust_scaled | 0.729167 | 0.08685 | 0.271752 | 0.118839 | 0.363049 | 0.229163 | 0.251424 | 0.112971 | 0.650416 | 0.041312 |
| standard_scaled | 0.719209 | 0.087121 | 0.293946 | 0.108177 | 0.386305 | 0.225509 | 0.259754 | 0.102795 | 0.645638 | 0.044833 |
| unscaled | 0.73729 | 0.086665 | 0.299498 | 0.104881 | 0.347976 | 0.23048 | 0.250021 | 0.100617 | 0.652258 | 0.03891 |



Best test_f1 per model x dataset:


| model_name | robust_scaled | standard_scaled | unscaled |
|---|---|---|---|
| decision_tree | 0.316547 | 0.315036 | 0.316547 |
| knn | 0.280255 | 0.297414 | 0.297414 |
| logistic_regression | 0.343115 | 0.341573 | 0.341573 |
| random_forest | 0.3125 | 0.312139 | 0.269388 |
| svm | 0.354023 | 0.354839 | 0.354023 |


### **Top 10 runs by F1 (key metrics only)**

In [15]:
comparison_df[['dataset_version','model_name','imbalance_method','test_precision','test_recall','test_f1','test_accuracy','test_roc_auc']].head(10)


|  | dataset_version | model_name | imbalance_method | test_precision | test_recall | test_f1 | test_accuracy | test_roc_auc |
|---|---|---|---|---|---|---|---|---|
| 33 | standard_scaled | svm | class_weight | 0.252459 | 0.596899 | 0.354839 | 0.669811 | 0.689712 |
| 15 | unscaled | svm | class_weight | 0.251634 | 0.596899 | 0.354023 | 0.668632 | 0.689712 |
| 51 | robust_scaled | svm | class_weight | 0.251634 | 0.596899 | 0.354023 | 0.668632 | 0.689588 |
| 16 | unscaled | svm | smote | 0.244648 | 0.620155 | 0.350877 | 0.650943 | 0.687615 |
| 17 | unscaled | svm | smote+class_weight | 0.244648 | 0.620155 | 0.350877 | 0.650943 | 0.687615 |
| 34 | standard_scaled | svm | smote | 0.244648 | 0.620155 | 0.350877 | 0.650943 | 0.687637 |
| 35 | standard_scaled | svm | smote+class_weight | 0.244648 | 0.620155 | 0.350877 | 0.650943 | 0.687637 |
| 52 | robust_scaled | svm | smote | 0.244648 | 0.620155 | 0.350877 | 0.650943 | 0.688887 |
| 53 | robust_scaled | svm | smote+class_weight | 0.244648 | 0.620155 | 0.350877 | 0.650943 | 0.688887 |
| 37 | robust_scaled | logistic_regression | class_weight | 0.242038 | 0.589147 | 0.343115 | 0.65684 | 0.686785 |


## **Per-run figures (confusion, ROC, PR)**

In [16]:
for prefix, artifacts in run_artifacts.items():
    y_true = artifacts['y_true']
    y_pred = artifacts['y_pred']
    y_proba = artifacts['y_proba']

    fig_cm, _ = plot_confusion_matrix(y_true, y_pred, title=f"Confusion | {prefix}")
    fig_cm.savefig(os.path.join(FIG_DIR, f"{prefix}__confusion.png"), dpi=150, bbox_inches='tight')
    plt.close(fig_cm)

    if y_proba is not None:
        roc_plot = plot_roc_curve(y_true, y_proba, title=f"ROC | {prefix}", label=prefix)
        if roc_plot is not None:
            fig_roc, _ = roc_plot
            fig_roc.savefig(os.path.join(FIG_DIR, f"{prefix}__roc.png"), dpi=150, bbox_inches='tight')
            plt.close(fig_roc)

        pr_plot = plot_precision_recall_curve(y_true, y_proba, title=f"PR | {prefix}", label=prefix)
        if pr_plot is not None:
            fig_pr, _ = pr_plot
            fig_pr.savefig(os.path.join(FIG_DIR, f"{prefix}__pr.png"), dpi=150, bbox_inches='tight')
            plt.close(fig_pr)


## **Aggregate figures**

In [17]:
plt.figure(figsize=(10, max(6, 0.3 * len(comparison_df))))
plt.barh(range(len(comparison_df)), comparison_df['test_f1'], color='#4C72B0')
plt.yticks(range(len(comparison_df)), [
    f"{row.dataset_version} | {row.model_name} | {row.imbalance_method}"
    for row in comparison_df.itertuples()
])
plt.gca().invert_yaxis()
plt.xlabel('Test F1')
plt.title('Test F1 across all runs')
plt.tight_layout()
plt.savefig(os.path.join(FIG_DIR, 'summary__f1_barplot.png'), dpi=150)
plt.close()

# Grouped bars of precision and recall for the ten best runs by F1.
top_n = comparison_df.head(10).reset_index(drop=True)
x = np.arange(len(top_n))
width = 0.35
plt.figure(figsize=(12, 6))
plt.bar(x - width/2, top_n['test_precision'], width, label='Precision')
plt.bar(x + width/2, top_n['test_recall'], width, label='Recall')
top_labels = [f"{row.dataset_version}|{row.model_name}|{row.imbalance_method}" for row in top_n.itertuples()]
plt.xticks(x, top_labels, rotation=45, ha='right')
plt.ylabel('Score')
plt.title('Precision and Recall for Top 10 runs')
plt.legend()
plt.tight_layout()
plt.savefig(os.path.join(FIG_DIR, 'summary__precision_recall_top10.png'), dpi=150)
plt.close()

plt.figure(figsize=(8, 6))
plt.scatter(comparison_df['test_recall'], comparison_df['test_precision'], c=pd.Categorical(comparison_df['model_name']).codes)
for i, row in enumerate(comparison_df.itertuples()):
    if i < 8:
        plt.annotate(row.model_name, (row.test_recall, row.test_precision), fontsize=8)
plt.xlabel('Recall (test)')
plt.ylabel('Precision (test)')
plt.title('Precision vs Recall by model')
plt.tight_layout()
plt.savefig(os.path.join(FIG_DIR, 'summary__precision_recall_scatter.png'), dpi=150)
plt.close()


## **Calibration-style view for the best model**
Overlay score histograms for positives vs negatives.

In [18]:
# comparison_df is sorted F1-first, so the first row is the selected run.
best_row = comparison_df.iloc[0]
best_prefix = f"{best_row.dataset_version}__{best_row.model_name}__{best_row.imbalance_method}"
best_artifacts = run_artifacts[best_prefix]
if best_artifacts['y_proba'] is not None:
    y_true = best_artifacts['y_true']
    y_proba = best_artifacts['y_proba']
    plt.figure(figsize=(8,5))
    plt.hist(y_proba[y_true==1], bins=20, alpha=0.6, label='Positive', density=True)
    plt.hist(y_proba[y_true==0], bins=20, alpha=0.6, label='Negative', density=True)
    plt.xlabel('Predicted score')
    plt.ylabel('Density')
    plt.title(f'Score distributions | {best_prefix}')
    plt.legend()
    plt.tight_layout()
    plt.savefig(os.path.join(FIG_DIR, 'summary__score_hist_best.png'), dpi=150)
    plt.close()


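## **Discussion**
- What worked: SVM with `class_weight` on the standard-scaled data gave the highest test F1 (0.355; precision 0.252, recall 0.597). The nine top-ranked runs are all SVM variants, followed by logistic regression with `class_weight` (F1 0.343).
- What struggled: runs without imbalance handling barely detect positives. On the unscaled data, logistic regression reaches recall 0.039 (F1 0.071) and KNN recall 0.078 (F1 0.127). Decision tree and KNN also rank worst on average (mean test ROC-AUC 0.605 and 0.615, against 0.686 for logistic regression).
- Imbalance impact: for SVM, SMOTE trades precision for recall relative to `class_weight` (recall 0.620 vs 0.597, precision 0.245 vs 0.252) and ends with a slightly lower F1 (0.351 vs 0.355). In the rows shown, adding `class_weight` on top of SMOTE changes none of the reported metrics.
- Scaling impact: the best F1 per model is nearly identical across dataset versions for SVM (0.354 to 0.355) and logistic regression (0.342 to 0.343). This is expected, because scale-sensitive models receive a `StandardScaler` inside the pipeline when run on the unscaled data. The largest gap is for random forest (0.269 unscaled vs about 0.312 scaled).
- Threshold caution: the default 0.5 threshold may not maximize F1; future work could tune thresholds via PR curves.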
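
A threshold sweep of the kind suggested in the last bullet could look like the sketch below. It uses made-up labels and scores, and `f1_at` and `best_f1_threshold` are hypothetical helpers. In practice the sweep should run on validation-fold predictions rather than on the test set, so that the chosen threshold does not leak test information.

```python
def f1_at(y_true, scores, threshold):
    """F1 when every score >= threshold is predicted positive."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_f1_threshold(y_true, scores):
    """Try each distinct score as a cut-off and keep the one with the highest F1."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: f1_at(y_true, scores, t))


# Made-up validation labels and predicted scores.
y_true = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.60]

t = best_f1_threshold(y_true, scores)
print(t, round(f1_at(y_true, scores, t), 3), f1_at(y_true, scores, 0.5))
```

On this toy data the best cut-off is 0.35, where F1 is 6/7, compared with 0.5 at the default threshold.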

## **Persist only the best model**

In [19]:
best_row = comparison_df.iloc[0]
best_prefix = f"{best_row.dataset_version}__{best_row.model_name}__{best_row.imbalance_method}"
best_model = run_artifacts[best_prefix]['estimator']
best_path = os.path.join(MODELS_DIR, f"best_model__{best_prefix}.pkl")
joblib.dump(best_model, best_path)

print('Best model (F1-first rule):')
print(best_row[['dataset_version','model_name','imbalance_method','test_f1','test_precision','test_recall','test_accuracy','test_roc_auc']])
print(f"Saved best model to: {best_path}")


Best model (F1-first rule):
dataset_version     standard_scaled
model_name                      svm
imbalance_method       class_weight
test_f1                    0.354839
test_precision             0.252459
test_recall                0.596899
test_accuracy              0.669811
test_roc_auc               0.689712
Name: 33, dtype: object
Saved best model to: ../models\best_model__standard_scaled__svm__class_weight.pkl
