# Phase 7: Advanced Models

This notebook trains advanced models with hyperparameter tuning:
- **LightGBM**: Gradient boosting that handles NaN natively (uses `minimal` datasets)
- **MLP (sklearn)**: Neural network requiring complete data (uses `full` datasets)

## MLflow Tracking

We use MLflow to track experiments:
- **Parameters**: Hyperparameters for each model
- **Metrics**: F1, accuracy, AUC (classification); RMSE, MAE, R² (regression)
- **Artifacts**: Trained models, confusion matrices, ROC curves
- **Tags**: Model type, feature set, task type

To view the MLflow UI after running this notebook:
```bash
cd /path/to/project
mlflow ui --backend-store-uri mlruns
```
Then open http://localhost:5000 in your browser.

## Optuna Hyperparameter Optimization

We use Optuna's TPE (Tree-structured Parzen Estimator) sampler for Bayesian optimization:
- Typically finds good configurations in fewer trials than grid or random search
- Learns from previous trials to focus on promising regions
- Handles conditional hyperparameters (e.g., layer sizes in MLP)

In [None]:
# Standard imports
import json
import sys
import warnings
from datetime import datetime
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# ML imports
import joblib
import lightgbm as lgb
import mlflow
import mlflow.sklearn
import optuna
from sklearn.metrics import (
    accuracy_score, f1_score, roc_auc_score,
    mean_squared_error, mean_absolute_error, r2_score,
    confusion_matrix, classification_report
)
from sklearn.model_selection import StratifiedKFold, KFold, cross_val_score
from sklearn.neural_network import MLPClassifier, MLPRegressor
from sklearn.preprocessing import StandardScaler

# Suppress warnings for cleaner output
warnings.filterwarnings('ignore', category=FutureWarning)
warnings.filterwarnings('ignore', category=UserWarning)
optuna.logging.set_verbosity(optuna.logging.WARNING)

# Add project root to path
project_root = Path.cwd().parent
sys.path.insert(0, str(project_root))

# Project imports
from src.models.train import (
    split_data, get_class_weights, setup_mlflow, log_experiment,
    tune_model, save_model, train_with_cv
)
from src.models.evaluate import (
    compute_classification_metrics, compute_regression_metrics,
    plot_confusion_matrix, plot_roc_curves, plot_residuals,
    compare_models_table, plot_model_comparison, plot_feature_set_comparison,
    generate_classification_report_figures
)

print(f"Project root: {project_root}")
print(f"MLflow version: {mlflow.__version__}")
print(f"LightGBM version: {lgb.__version__}")
print(f"Optuna version: {optuna.__version__}")

In [None]:
# Configuration
RANDOM_STATE = 42
CV_FOLDS = 3  # 3-fold CV for faster iteration

# Trial counts - adjust these for quick test vs full run
# Quick test: 10/5 trials (~15 min total)
# Full run: 100/50 trials (~10 hours total)
N_TRIALS_LIGHTGBM = 10  # Quick test (set to 100 for full run)
N_TRIALS_MLP = 5  # Quick test (set to 50 for full run)

# Paths
DATA_DIR = project_root / 'data' / 'processed'
MODELS_DIR = project_root / 'models'
FIGURES_DIR = project_root / 'reports' / 'figures'
MLRUNS_DIR = project_root / 'mlruns'

# Create directories
MODELS_DIR.mkdir(exist_ok=True)
FIGURES_DIR.mkdir(parents=True, exist_ok=True)

print(f"Data directory: {DATA_DIR}")
print(f"Models directory: {MODELS_DIR}")
print(f"Figures directory: {FIGURES_DIR}")
print(f"\nTrial configuration:")
print(f"  LightGBM: {N_TRIALS_LIGHTGBM} trials")
print(f"  MLP: {N_TRIALS_MLP} trials")

## 1. Load Data

We have 4 feature datasets:
- `X_with_labs_minimal`: Has NaN, for LightGBM (109 features)
- `X_with_labs_full`: No NaN, for MLP (96 features)
- `X_without_labs_minimal`: Has NaN, for LightGBM (92 features)
- `X_without_labs_full`: No NaN, for MLP (82 features)

And 2 target variables:
- `y_classification`: 3-class diabetes status (0, 1, 2)
- `y_regression`: HbA1c level (continuous)

In [None]:
# Load all datasets
print("Loading datasets...")

# Feature matrices
X_with_labs_minimal = pd.read_parquet(DATA_DIR / 'X_with_labs_minimal.parquet')
X_with_labs_full = pd.read_parquet(DATA_DIR / 'X_with_labs_full.parquet')
X_without_labs_minimal = pd.read_parquet(DATA_DIR / 'X_without_labs_minimal.parquet')
X_without_labs_full = pd.read_parquet(DATA_DIR / 'X_without_labs_full.parquet')

# Target variables
# Classification target: DIABETES_STATUS (0, 1, 2) from any y file
y_classification = pd.read_parquet(DATA_DIR / 'y_with_labs_minimal.parquet')['DIABETES_STATUS']

# Regression target: LBXGH (HbA1c) from study_population
# (not in processed y files because it's a target, not a feature)
INTERIM_DIR = project_root / 'data' / 'interim'
study_pop = pd.read_parquet(INTERIM_DIR / 'study_population.parquet')
y_regression = study_pop.loc[y_classification.index, 'LBXGH']

print("\nDataset shapes:")
print(f"  X_with_labs_minimal:    {X_with_labs_minimal.shape} (has NaN: {X_with_labs_minimal.isna().any().any()})")
print(f"  X_with_labs_full:       {X_with_labs_full.shape} (has NaN: {X_with_labs_full.isna().any().any()})")
print(f"  X_without_labs_minimal: {X_without_labs_minimal.shape} (has NaN: {X_without_labs_minimal.isna().any().any()})")
print(f"  X_without_labs_full:    {X_without_labs_full.shape} (has NaN: {X_without_labs_full.isna().any().any()})")
print(f"\n  y_classification:       {y_classification.shape}")
print(f"  y_regression:           {y_regression.shape} (non-null: {y_regression.notna().sum()})")

In [None]:
# Verify target distributions
print("Target variable distributions:")
print("\nClassification (DIABETES_STATUS):")
status_counts = y_classification.value_counts().sort_index()
status_labels = {0: 'No Diabetes', 1: 'Prediabetes', 2: 'Diabetes'}
for idx, count in status_counts.items():
    print(f"  {status_labels[idx]}: {count:,} ({count/len(y_classification)*100:.1f}%)")

print(f"\nRegression (HbA1c):")
print(f"  Range: {y_regression.min():.1f}% - {y_regression.max():.1f}%")
print(f"  Mean:  {y_regression.mean():.2f}%")
print(f"  Std:   {y_regression.std():.2f}%")

## 2. Data Splitting

Split data into train (70%), validation (15%), and test (15%) sets.
- Stratified by target class for classification
- Same random seed for reproducibility
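
`split_data` lives in `src.models.train` and its implementation is not shown in this notebook. One common way to get a stratified 70/15/15 split is two successive calls to `train_test_split`; the sketch below assumes that approach, and `three_way_split` and the synthetic data are hypothetical names used only for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split


def three_way_split(X, y, test_size=0.15, val_size=0.15, random_state=42):
    """Stratified train/val/test split done in two stages."""
    # Stage 1: hold out the test set.
    X_rest, X_test, y_rest, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=random_state
    )
    # Stage 2: carve the validation set out of the remainder.
    # val_size is a fraction of the full data, so rescale it to the remainder.
    rel_val = val_size / (1.0 - test_size)
    X_train, X_val, y_train, y_val = train_test_split(
        X_rest, y_rest, test_size=rel_val, stratify=y_rest, random_state=random_state
    )
    return X_train, X_val, X_test, y_train, y_val, y_test


# Synthetic, imbalanced 3-class data for demonstration only.
rng = np.random.default_rng(0)
X_demo = pd.DataFrame(rng.normal(size=(1000, 4)), columns=list("abcd"))
y_demo = pd.Series(rng.choice([0, 1, 2], size=1000, p=[0.6, 0.3, 0.1]))

parts = three_way_split(X_demo, y_demo)
print([len(p) for p in parts[:3]])
```

The returned frames keep their original index labels, which is what makes it possible to reuse one split across several feature matrices.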

In [None]:
# Split classification data (using with_labs_minimal as reference for indices)
X_train_idx, X_val_idx, X_test_idx, y_train_cls, y_val_cls, y_test_cls = split_data(
    X_with_labs_minimal, y_classification,
    test_size=0.15, val_size=0.15, random_state=RANDOM_STATE, stratify=True
)

# Get indices for splitting other datasets consistently
train_idx = X_train_idx.index
val_idx = X_val_idx.index
test_idx = X_test_idx.index

print(f"Data split sizes:")
print(f"  Train: {len(train_idx):,} ({len(train_idx)/len(y_classification)*100:.1f}%)")
print(f"  Val:   {len(val_idx):,} ({len(val_idx)/len(y_classification)*100:.1f}%)")
print(f"  Test:  {len(test_idx):,} ({len(test_idx)/len(y_classification)*100:.1f}%)")

In [None]:
# Create all data splits using consistent indices
def create_splits(X, y, train_idx, val_idx, test_idx):
    """Create train/val/test splits using pre-defined indices."""
    # Handle case where y might have different index (regression has fewer samples)
    common_train = train_idx.intersection(y.index)
    common_val = val_idx.intersection(y.index)
    common_test = test_idx.intersection(y.index)
    
    X_train = X.loc[common_train]
    X_val = X.loc[common_val]
    X_test = X.loc[common_test]
    y_train = y.loc[common_train]
    y_val = y.loc[common_val]
    y_test = y.loc[common_test]
    
    return X_train, X_val, X_test, y_train, y_val, y_test

# Classification splits
splits_cls = {
    'with_labs_minimal': create_splits(X_with_labs_minimal, y_classification, train_idx, val_idx, test_idx),
    'with_labs_full': create_splits(X_with_labs_full, y_classification, train_idx, val_idx, test_idx),
    'without_labs_minimal': create_splits(X_without_labs_minimal, y_classification, train_idx, val_idx, test_idx),
    'without_labs_full': create_splits(X_without_labs_full, y_classification, train_idx, val_idx, test_idx),
}

# Regression splits (fewer samples - only those with HbA1c values)
splits_reg = {
    'with_labs_minimal': create_splits(X_with_labs_minimal, y_regression.dropna(), train_idx, val_idx, test_idx),
    'with_labs_full': create_splits(X_with_labs_full, y_regression.dropna(), train_idx, val_idx, test_idx),
    'without_labs_minimal': create_splits(X_without_labs_minimal, y_regression.dropna(), train_idx, val_idx, test_idx),
    'without_labs_full': create_splits(X_without_labs_full, y_regression.dropna(), train_idx, val_idx, test_idx),
}

print("Classification split sizes:")
for name, (X_tr, X_v, X_te, y_tr, y_v, y_te) in splits_cls.items():
    print(f"  {name}: train={len(X_tr)}, val={len(X_v)}, test={len(X_te)}")

print("\nRegression split sizes (fewer samples due to missing HbA1c):")
for name, (X_tr, X_v, X_te, y_tr, y_v, y_te) in splits_reg.items():
    print(f"  {name}: train={len(X_tr)}, val={len(X_v)}, test={len(X_te)}")

In [None]:
# Calculate class weights for handling imbalance
class_weights = get_class_weights(y_train_cls)
print("Class weights (for balanced training):")
for cls, weight in class_weights.items():
    print(f"  {status_labels[cls]}: {weight:.3f}")

## 3. Setup MLflow

MLflow is an open-source platform for managing the ML lifecycle. We use it to:

1. **Track experiments**: Each model training run is logged with:
   - Parameters (hyperparameters)
   - Metrics (F1, AUC, RMSE, etc.)
   - Artifacts (saved models, plots)
   - Tags (model type, feature set)

2. **Compare models**: The MLflow UI lets us compare runs side-by-side

3. **Reproduce results**: Every run is logged with exact parameters

In [None]:
# Setup MLflow tracking (as_uri() yields a file:// URI, which also works on Windows paths)
mlflow.set_tracking_uri(MLRUNS_DIR.as_uri())
experiment_name = 'diabetes-prediction-phase7'
mlflow.set_experiment(experiment_name)

print(f"MLflow tracking URI: {MLRUNS_DIR}")
print(f"Experiment name: {experiment_name}")
print(f"\nTo view results, run:")
print(f"  cd {project_root}")
print(f"  mlflow ui --backend-store-uri mlruns")

## 4. LightGBM Training

LightGBM is a gradient boosting framework that:
- **Handles NaN natively**: Uses `minimal` imputation datasets
- **Fast training**: Uses histogram-based algorithm
- **Good for tabular data**: Often a top performer on structured data

### Hyperparameters tuned with Optuna:
- `n_estimators`: Number of boosting rounds (100-500)
- `max_depth`: Maximum tree depth (3-10)
- `learning_rate`: Step size shrinkage (0.01-0.3)
- `num_leaves`: Maximum leaves per tree (15-127)
- `min_child_samples`: Minimum samples in leaf (5-100)
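- `reg_alpha`, `reg_lambda`: L1/L2 regularization (1e-8 to 10, log scale)
- `subsample`, `colsample_bytree`: Row/column sampling fractions (0.5-1.0); LightGBM applies `subsample` only when `subsample_freq` > 0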
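
The `max_depth` and `num_leaves` ranges interact: a binary tree of depth `d` has at most `2**d` leaves, so for shallow trees a large `num_leaves` cannot be reached. The helper below is a hypothetical illustration of that bound, not part of LightGBM's API.

```python
def effective_max_leaves(max_depth: int, num_leaves: int) -> int:
    """Upper bound on leaves per tree when both limits apply."""
    return min(num_leaves, 2 ** max_depth)


# With num_leaves at the top of its range, shallow depths are the binding limit.
for depth in (3, 5, 7, 10):
    print(depth, effective_max_leaves(depth, num_leaves=127))
```

In other words, part of the search space is redundant: at `max_depth=3` every `num_leaves` value from 15 to 127 yields trees with at most 8 leaves.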

In [None]:
def train_lightgbm_with_optuna(
    X_train, y_train, X_val, y_val,
    task='classification',
    n_trials=100,
    class_weights=None,
    cv_folds=3,
):
    """
    Train LightGBM with Optuna hyperparameter optimization.
    
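    Uses `cv_folds`-fold CV on the training set during tuning, then retrains
    the best configuration on the full training set. `X_val` and `y_val` are
    not used inside this function; the caller evaluates on the validation set.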
    """
    # Scoring metric
    scoring = 'f1_macro' if task == 'classification' else 'neg_root_mean_squared_error'
    
    # CV splitter
    if task == 'classification':
        cv = StratifiedKFold(n_splits=cv_folds, shuffle=True, random_state=RANDOM_STATE)
    else:
        cv = KFold(n_splits=cv_folds, shuffle=True, random_state=RANDOM_STATE)
    
    # Convert to numpy for sklearn compatibility
    X_train_np = X_train.values
    y_train_np = y_train.values
    
    def objective(trial):
        params = {
            'n_estimators': trial.suggest_int('n_estimators', 100, 500),
            'max_depth': trial.suggest_int('max_depth', 3, 10),
            'learning_rate': trial.suggest_float('learning_rate', 0.01, 0.3, log=True),
            'num_leaves': trial.suggest_int('num_leaves', 15, 127),
            'min_child_samples': trial.suggest_int('min_child_samples', 5, 100),
            'reg_alpha': trial.suggest_float('reg_alpha', 1e-8, 10, log=True),
            'reg_lambda': trial.suggest_float('reg_lambda', 1e-8, 10, log=True),
            'subsample': trial.suggest_float('subsample', 0.5, 1.0),
            'subsample_freq': 1,  # bagging is off unless this is > 0, so `subsample` would be ignored
            'colsample_bytree': trial.suggest_float('colsample_bytree', 0.5, 1.0),
            'random_state': RANDOM_STATE,
            'verbose': -1,
            'n_jobs': -1,
        }
        
        if task == 'classification':
            params['class_weight'] = class_weights
            model = lgb.LGBMClassifier(**params)
        else:
            model = lgb.LGBMRegressor(**params)
        
        try:
            scores = cross_val_score(model, X_train_np, y_train_np, cv=cv, scoring=scoring, n_jobs=-1)
            return scores.mean()
        except Exception as e:
            # Report the failure instead of hiding it, then return a worst-case score
            print(f"Trial {trial.number} failed: {e}")
            return -1e10 if 'neg' in scoring else 0.0
    
    # Run Optuna optimization
    study = optuna.create_study(
        direction='maximize',
        sampler=optuna.samplers.TPESampler(seed=RANDOM_STATE),
    )
    
    print(f"Starting Optuna optimization ({n_trials} trials)...")
    study.optimize(objective, n_trials=n_trials, show_progress_bar=True)
    
    # Get best parameters (re-add the fixed settings that Optuna did not sample)
    best_params = study.best_params
    best_params['subsample_freq'] = 1  # must match the value used during tuning
    best_params['random_state'] = RANDOM_STATE
    best_params['verbose'] = -1
    best_params['n_jobs'] = -1
    
    print(f"\nBest trial score: {study.best_value:.4f}")
    print(f"Best parameters:")
    for k, v in best_params.items():
        if isinstance(v, float):
            print(f"  {k}: {v:.6f}")
        else:
            print(f"  {k}: {v}")
    
    # Train final model on full training data
    if task == 'classification':
        best_params['class_weight'] = class_weights
        best_model = lgb.LGBMClassifier(**best_params)
    else:
        best_model = lgb.LGBMRegressor(**best_params)
    
    best_model.fit(X_train_np, y_train_np)
    
    return best_model, best_params, study

### 4.1 LightGBM Classification

In [None]:
# Store results for comparison
results_classification = {}
models_classification = {}

In [None]:
%%time
# LightGBM Classification - With Labs
print("="*60)
print("LightGBM Classification - WITH LABS")
print("="*60)

X_train, X_val, X_test, y_train, y_val, y_test = splits_cls['with_labs_minimal']

lgb_clf_with_labs, lgb_clf_params_with_labs, study_lgb_clf_with = train_lightgbm_with_optuna(
    X_train, y_train, X_val, y_val,
    task='classification',
    n_trials=N_TRIALS_LIGHTGBM,
    class_weights=class_weights,
    cv_folds=CV_FOLDS,
)

# Evaluate on validation set
y_pred = lgb_clf_with_labs.predict(X_val.values)
y_prob = lgb_clf_with_labs.predict_proba(X_val.values)
metrics = compute_classification_metrics(y_val.values, y_pred, y_prob)

print(f"\nValidation Results:")
print(f"  Accuracy: {metrics['accuracy']:.4f}")
print(f"  F1 Macro: {metrics['f1_macro']:.4f}")
print(f"  ROC AUC:  {metrics['roc_auc_ovr']:.4f}")

# Log to MLflow
with mlflow.start_run(run_name='LightGBM_cls_with_labs'):
    mlflow.set_tag('model_type', 'LightGBM')
    mlflow.set_tag('task', 'classification')
    mlflow.set_tag('feature_set', 'with_labs')
    mlflow.set_tag('imputation', 'minimal')
    
    # Log parameters
    for k, v in lgb_clf_params_with_labs.items():
        if k not in ['class_weight', 'verbose', 'n_jobs']:
            mlflow.log_param(k, v)
    
    # Log metrics
    mlflow.log_metric('val_accuracy', metrics['accuracy'])
    mlflow.log_metric('val_f1_macro', metrics['f1_macro'])
    mlflow.log_metric('val_roc_auc_ovr', metrics['roc_auc_ovr'])
    mlflow.log_metric('best_cv_score', study_lgb_clf_with.best_value)
    mlflow.log_metric('n_trials', N_TRIALS_LIGHTGBM)
    
    # Log model
    mlflow.sklearn.log_model(lgb_clf_with_labs, 'model')

# Store results
results_classification['LightGBM (with labs)'] = metrics
models_classification['LightGBM (with labs)'] = lgb_clf_with_labs

In [None]:
%%time
# LightGBM Classification - Without Labs
print("="*60)
print("LightGBM Classification - WITHOUT LABS")
print("="*60)

X_train, X_val, X_test, y_train, y_val, y_test = splits_cls['without_labs_minimal']

lgb_clf_without_labs, lgb_clf_params_without_labs, study_lgb_clf_without = train_lightgbm_with_optuna(
    X_train, y_train, X_val, y_val,
    task='classification',
    n_trials=N_TRIALS_LIGHTGBM,
    class_weights=class_weights,
    cv_folds=CV_FOLDS,
)

# Evaluate on validation set
y_pred = lgb_clf_without_labs.predict(X_val.values)
y_prob = lgb_clf_without_labs.predict_proba(X_val.values)
metrics = compute_classification_metrics(y_val.values, y_pred, y_prob)

print(f"\nValidation Results:")
print(f"  Accuracy: {metrics['accuracy']:.4f}")
print(f"  F1 Macro: {metrics['f1_macro']:.4f}")
print(f"  ROC AUC:  {metrics['roc_auc_ovr']:.4f}")

# Log to MLflow
with mlflow.start_run(run_name='LightGBM_cls_without_labs'):
    mlflow.set_tag('model_type', 'LightGBM')
    mlflow.set_tag('task', 'classification')
    mlflow.set_tag('feature_set', 'without_labs')
    mlflow.set_tag('imputation', 'minimal')
    
    for k, v in lgb_clf_params_without_labs.items():
        if k not in ['class_weight', 'verbose', 'n_jobs']:
            mlflow.log_param(k, v)
    
    mlflow.log_metric('val_accuracy', metrics['accuracy'])
    mlflow.log_metric('val_f1_macro', metrics['f1_macro'])
    mlflow.log_metric('val_roc_auc_ovr', metrics['roc_auc_ovr'])
    mlflow.log_metric('best_cv_score', study_lgb_clf_without.best_value)
    mlflow.log_metric('n_trials', N_TRIALS_LIGHTGBM)
    
    mlflow.sklearn.log_model(lgb_clf_without_labs, 'model')

results_classification['LightGBM (without labs)'] = metrics
models_classification['LightGBM (without labs)'] = lgb_clf_without_labs

### 4.2 LightGBM Regression

In [None]:
# Store regression results
results_regression = {}
models_regression = {}

In [None]:
%%time
# LightGBM Regression - With Labs
print("="*60)
print("LightGBM Regression - WITH LABS")
print("="*60)

X_train, X_val, X_test, y_train, y_val, y_test = splits_reg['with_labs_minimal']

lgb_reg_with_labs, lgb_reg_params_with_labs, study_lgb_reg_with = train_lightgbm_with_optuna(
    X_train, y_train, X_val, y_val,
    task='regression',
    n_trials=N_TRIALS_LIGHTGBM,
    cv_folds=CV_FOLDS,
)

# Evaluate on validation set
y_pred = lgb_reg_with_labs.predict(X_val.values)
metrics = compute_regression_metrics(y_val.values, y_pred)

print(f"\nValidation Results:")
print(f"  RMSE: {metrics['rmse']:.4f}")
print(f"  MAE:  {metrics['mae']:.4f}")
print(f"  R²:   {metrics['r2']:.4f}")

# Log to MLflow
with mlflow.start_run(run_name='LightGBM_reg_with_labs'):
    mlflow.set_tag('model_type', 'LightGBM')
    mlflow.set_tag('task', 'regression')
    mlflow.set_tag('feature_set', 'with_labs')
    mlflow.set_tag('imputation', 'minimal')
    
    for k, v in lgb_reg_params_with_labs.items():
        if k not in ['verbose', 'n_jobs']:
            mlflow.log_param(k, v)
    
    mlflow.log_metric('val_rmse', metrics['rmse'])
    mlflow.log_metric('val_mae', metrics['mae'])
    mlflow.log_metric('val_r2', metrics['r2'])
    mlflow.log_metric('best_cv_score', study_lgb_reg_with.best_value)
    mlflow.log_metric('n_trials', N_TRIALS_LIGHTGBM)
    
    mlflow.sklearn.log_model(lgb_reg_with_labs, 'model')

results_regression['LightGBM (with labs)'] = metrics
models_regression['LightGBM (with labs)'] = lgb_reg_with_labs

In [None]:
%%time
# LightGBM Regression - Without Labs
print("="*60)
print("LightGBM Regression - WITHOUT LABS")
print("="*60)

X_train, X_val, X_test, y_train, y_val, y_test = splits_reg['without_labs_minimal']

lgb_reg_without_labs, lgb_reg_params_without_labs, study_lgb_reg_without = train_lightgbm_with_optuna(
    X_train, y_train, X_val, y_val,
    task='regression',
    n_trials=N_TRIALS_LIGHTGBM,
    cv_folds=CV_FOLDS,
)

# Evaluate on validation set
y_pred = lgb_reg_without_labs.predict(X_val.values)
metrics = compute_regression_metrics(y_val.values, y_pred)

print(f"\nValidation Results:")
print(f"  RMSE: {metrics['rmse']:.4f}")
print(f"  MAE:  {metrics['mae']:.4f}")
print(f"  R²:   {metrics['r2']:.4f}")

# Log to MLflow
with mlflow.start_run(run_name='LightGBM_reg_without_labs'):
    mlflow.set_tag('model_type', 'LightGBM')
    mlflow.set_tag('task', 'regression')
    mlflow.set_tag('feature_set', 'without_labs')
    mlflow.set_tag('imputation', 'minimal')
    
    for k, v in lgb_reg_params_without_labs.items():
        if k not in ['verbose', 'n_jobs']:
            mlflow.log_param(k, v)
    
    mlflow.log_metric('val_rmse', metrics['rmse'])
    mlflow.log_metric('val_mae', metrics['mae'])
    mlflow.log_metric('val_r2', metrics['r2'])
    mlflow.log_metric('best_cv_score', study_lgb_reg_without.best_value)
    mlflow.log_metric('n_trials', N_TRIALS_LIGHTGBM)
    
    mlflow.sklearn.log_model(lgb_reg_without_labs, 'model')

results_regression['LightGBM (without labs)'] = metrics
models_regression['LightGBM (without labs)'] = lgb_reg_without_labs

## 5. MLP Training

Multi-Layer Perceptron (sklearn implementation):
- **Requires complete data**: Uses `full` imputation datasets (no NaN)
- **Requires scaling**: Features must be standardized
- **Flexible architecture**: Number of layers and neurons tuned by Optuna

### Hyperparameters tuned:
- `n_layers`: Number of hidden layers (1-3)
- `n_units_l{i}`: Neurons per layer (32-256)
- `activation`: relu or tanh
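- `alpha`: L2 regularization strength (1e-5 to 1e-1, log scale)
- `learning_rate`: constant or adaptive schedule (sklearn applies it only when `solver='sgd'`; with the default `adam` solver used here it has no effect)
- `learning_rate_init`: Initial learning rate (1e-4 to 1e-2, log scale)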
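
Because the MLP needs scaled inputs, the scaler and the model have to travel together. One option, sketched here as an alternative and not what the training function below does, is to wrap both in a sklearn `Pipeline`. The scaler is then refit inside each CV fold, and the fitted pipeline accepts unscaled data at prediction time. The data and layer size are synthetic placeholders.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X_demo = rng.normal(loc=50.0, scale=10.0, size=(300, 6))  # synthetic, unscaled features
y_demo = (X_demo[:, 0] + X_demo[:, 1] > 100.0).astype(int)

# Scaling and the network are one estimator, so CV refits the scaler per fold.
mlp_pipeline = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=200, random_state=42),
)

cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=42)
scores = cross_val_score(mlp_pipeline, X_demo, y_demo, cv=cv, scoring="f1_macro")

# After fitting, the pipeline takes raw features and scales them internally.
mlp_pipeline.fit(X_demo, y_demo)
preds = mlp_pipeline.predict(X_demo)
print(scores.round(3), preds[:5])
```

A pipeline like this can also be saved or logged as a single object, which avoids having to store the scaler separately from the model.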

In [None]:
def train_mlp_with_optuna(
    X_train, y_train, X_val, y_val,
    task='classification',
    n_trials=50,
    cv_folds=3,
):
    """
    Train MLP with Optuna hyperparameter optimization.
    
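    Note: MLP requires standardized features, so we fit a scaler on the training
    data and return it; apply the same scaler to any data passed to the model.
    The scaler is fit once before cross-validation, so each fold's held-out part
    has already influenced the scaling statistics (a mild optimistic bias).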
    """
    # Standardize features (critical for neural networks)
    scaler = StandardScaler()
    X_train_scaled = scaler.fit_transform(X_train)
    X_val_scaled = scaler.transform(X_val)
    
    # Scoring metric
    scoring = 'f1_macro' if task == 'classification' else 'neg_root_mean_squared_error'
    
    # CV splitter
    if task == 'classification':
        cv = StratifiedKFold(n_splits=cv_folds, shuffle=True, random_state=RANDOM_STATE)
    else:
        cv = KFold(n_splits=cv_folds, shuffle=True, random_state=RANDOM_STATE)
    
    y_train_np = y_train.values if hasattr(y_train, 'values') else y_train
    
    def objective(trial):
        # Layer architecture
        n_layers = trial.suggest_int('n_layers', 1, 3)
        layers = []
        for i in range(n_layers):
            layers.append(trial.suggest_int(f'n_units_l{i}', 32, 256))
        
        params = {
            'hidden_layer_sizes': tuple(layers),
            'activation': trial.suggest_categorical('activation', ['relu', 'tanh']),
            'alpha': trial.suggest_float('alpha', 1e-5, 1e-1, log=True),
            'learning_rate': trial.suggest_categorical('learning_rate', ['constant', 'adaptive']),
            'learning_rate_init': trial.suggest_float('learning_rate_init', 1e-4, 1e-2, log=True),
            'max_iter': 500,
            'early_stopping': True,
            'validation_fraction': 0.1,
            'n_iter_no_change': 20,
            'random_state': RANDOM_STATE,
        }
        
        if task == 'classification':
            model = MLPClassifier(**params)
        else:
            model = MLPRegressor(**params)
        
        try:
            scores = cross_val_score(model, X_train_scaled, y_train_np, cv=cv, scoring=scoring, n_jobs=-1)
            return scores.mean()
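        except Exception as e:
            # Report the failure instead of hiding it, then return a worst-case score
            print(f"Trial {trial.number} failed: {e}")
            return -1e10 if 'neg' in scoring else 0.0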
    
    # Run Optuna optimization
    study = optuna.create_study(
        direction='maximize',
        sampler=optuna.samplers.TPESampler(seed=RANDOM_STATE),
    )
    
    print(f"Starting Optuna optimization ({n_trials} trials)...")
    study.optimize(objective, n_trials=n_trials, show_progress_bar=True)
    
    # Get best parameters
    best_params = study.best_params
    
    # Reconstruct layer sizes
    n_layers = best_params['n_layers']
    layers = [best_params[f'n_units_l{i}'] for i in range(n_layers)]
    
    model_params = {
        'hidden_layer_sizes': tuple(layers),
        'activation': best_params['activation'],
        'alpha': best_params['alpha'],
        'learning_rate': best_params['learning_rate'],
        'learning_rate_init': best_params['learning_rate_init'],
        'max_iter': 500,
        'early_stopping': True,
        'validation_fraction': 0.1,
        'n_iter_no_change': 20,
        'random_state': RANDOM_STATE,
    }
    
    print(f"\nBest trial score: {study.best_value:.4f}")
    print(f"Best parameters:")
    print(f"  hidden_layer_sizes: {layers}")
    print(f"  activation: {model_params['activation']}")
    print(f"  alpha: {model_params['alpha']:.6f}")
    print(f"  learning_rate: {model_params['learning_rate']}")
    print(f"  learning_rate_init: {model_params['learning_rate_init']:.6f}")
    
    # Train final model
    if task == 'classification':
        best_model = MLPClassifier(**model_params)
    else:
        best_model = MLPRegressor(**model_params)
    
    best_model.fit(X_train_scaled, y_train_np)
    
    return best_model, model_params, study, scaler

### 5.1 MLP Classification

In [None]:
%%time
# MLP Classification - With Labs
print("="*60)
print("MLP Classification - WITH LABS")
print("="*60)

X_train, X_val, X_test, y_train, y_val, y_test = splits_cls['with_labs_full']

mlp_clf_with_labs, mlp_clf_params_with_labs, study_mlp_clf_with, scaler_clf_with = train_mlp_with_optuna(
    X_train, y_train, X_val, y_val,
    task='classification',
    n_trials=N_TRIALS_MLP,
    cv_folds=CV_FOLDS,
)

# Evaluate on validation set (must scale!)
X_val_scaled = scaler_clf_with.transform(X_val)
y_pred = mlp_clf_with_labs.predict(X_val_scaled)
y_prob = mlp_clf_with_labs.predict_proba(X_val_scaled)
metrics = compute_classification_metrics(y_val.values, y_pred, y_prob)

print(f"\nValidation Results:")
print(f"  Accuracy: {metrics['accuracy']:.4f}")
print(f"  F1 Macro: {metrics['f1_macro']:.4f}")
print(f"  ROC AUC:  {metrics['roc_auc_ovr']:.4f}")

# Log to MLflow
with mlflow.start_run(run_name='MLP_cls_with_labs'):
    mlflow.set_tag('model_type', 'MLP')
    mlflow.set_tag('task', 'classification')
    mlflow.set_tag('feature_set', 'with_labs')
    mlflow.set_tag('imputation', 'full')
    
    mlflow.log_param('hidden_layer_sizes', str(mlp_clf_params_with_labs['hidden_layer_sizes']))
    mlflow.log_param('activation', mlp_clf_params_with_labs['activation'])
    mlflow.log_param('alpha', mlp_clf_params_with_labs['alpha'])
    mlflow.log_param('learning_rate', mlp_clf_params_with_labs['learning_rate'])
    mlflow.log_param('learning_rate_init', mlp_clf_params_with_labs['learning_rate_init'])
    
    mlflow.log_metric('val_accuracy', metrics['accuracy'])
    mlflow.log_metric('val_f1_macro', metrics['f1_macro'])
    mlflow.log_metric('val_roc_auc_ovr', metrics['roc_auc_ovr'])
    mlflow.log_metric('best_cv_score', study_mlp_clf_with.best_value)
    mlflow.log_metric('n_trials', N_TRIALS_MLP)
    
    mlflow.sklearn.log_model(mlp_clf_with_labs, 'model')

results_classification['MLP (with labs)'] = metrics
models_classification['MLP (with labs)'] = (mlp_clf_with_labs, scaler_clf_with)

In [None]:
%%time
# MLP Classification - Without Labs
print("="*60)
print("MLP Classification - WITHOUT LABS")
print("="*60)

X_train, X_val, X_test, y_train, y_val, y_test = splits_cls['without_labs_full']

mlp_clf_without_labs, mlp_clf_params_without_labs, study_mlp_clf_without, scaler_clf_without = train_mlp_with_optuna(
    X_train, y_train, X_val, y_val,
    task='classification',
    n_trials=N_TRIALS_MLP,
    cv_folds=CV_FOLDS,
)

# Evaluate on validation set
X_val_scaled = scaler_clf_without.transform(X_val)
y_pred = mlp_clf_without_labs.predict(X_val_scaled)
y_prob = mlp_clf_without_labs.predict_proba(X_val_scaled)
metrics = compute_classification_metrics(y_val.values, y_pred, y_prob)

print(f"\nValidation Results:")
print(f"  Accuracy: {metrics['accuracy']:.4f}")
print(f"  F1 Macro: {metrics['f1_macro']:.4f}")
print(f"  ROC AUC:  {metrics['roc_auc_ovr']:.4f}")

# Log to MLflow
with mlflow.start_run(run_name='MLP_cls_without_labs'):
    mlflow.set_tag('model_type', 'MLP')
    mlflow.set_tag('task', 'classification')
    mlflow.set_tag('feature_set', 'without_labs')
    mlflow.set_tag('imputation', 'full')
    
    mlflow.log_param('hidden_layer_sizes', str(mlp_clf_params_without_labs['hidden_layer_sizes']))
    mlflow.log_param('activation', mlp_clf_params_without_labs['activation'])
    mlflow.log_param('alpha', mlp_clf_params_without_labs['alpha'])
    mlflow.log_param('learning_rate', mlp_clf_params_without_labs['learning_rate'])
    mlflow.log_param('learning_rate_init', mlp_clf_params_without_labs['learning_rate_init'])
    
    mlflow.log_metric('val_accuracy', metrics['accuracy'])
    mlflow.log_metric('val_f1_macro', metrics['f1_macro'])
    mlflow.log_metric('val_roc_auc_ovr', metrics['roc_auc_ovr'])
    mlflow.log_metric('best_cv_score', study_mlp_clf_without.best_value)
    mlflow.log_metric('n_trials', N_TRIALS_MLP)
    
    mlflow.sklearn.log_model(mlp_clf_without_labs, 'model')

results_classification['MLP (without labs)'] = metrics
models_classification['MLP (without labs)'] = (mlp_clf_without_labs, scaler_clf_without)

### 5.2 MLP Regression

In [None]:
%%time
# MLP Regression - With Labs
print("="*60)
print("MLP Regression - WITH LABS")
print("="*60)

X_train, X_val, X_test, y_train, y_val, y_test = splits_reg['with_labs_full']

mlp_reg_with_labs, mlp_reg_params_with_labs, study_mlp_reg_with, scaler_reg_with = train_mlp_with_optuna(
    X_train, y_train, X_val, y_val,
    task='regression',
    n_trials=N_TRIALS_MLP,
    cv_folds=CV_FOLDS,
)

# Evaluate on validation set
X_val_scaled = scaler_reg_with.transform(X_val)
y_pred = mlp_reg_with_labs.predict(X_val_scaled)
metrics = compute_regression_metrics(y_val.values, y_pred)

print(f"\nValidation Results:")
print(f"  RMSE: {metrics['rmse']:.4f}")
print(f"  MAE:  {metrics['mae']:.4f}")
print(f"  R²:   {metrics['r2']:.4f}")

# Log to MLflow
with mlflow.start_run(run_name='MLP_reg_with_labs'):
    mlflow.set_tag('model_type', 'MLP')
    mlflow.set_tag('task', 'regression')
    mlflow.set_tag('feature_set', 'with_labs')
    mlflow.set_tag('imputation', 'full')
    
    mlflow.log_param('hidden_layer_sizes', str(mlp_reg_params_with_labs['hidden_layer_sizes']))
    mlflow.log_param('activation', mlp_reg_params_with_labs['activation'])
    mlflow.log_param('alpha', mlp_reg_params_with_labs['alpha'])
    mlflow.log_param('learning_rate', mlp_reg_params_with_labs['learning_rate'])
    mlflow.log_param('learning_rate_init', mlp_reg_params_with_labs['learning_rate_init'])
    
    mlflow.log_metric('val_rmse', metrics['rmse'])
    mlflow.log_metric('val_mae', metrics['mae'])
    mlflow.log_metric('val_r2', metrics['r2'])
    mlflow.log_metric('best_cv_score', study_mlp_reg_with.best_value)
    mlflow.log_metric('n_trials', N_TRIALS_MLP)
    
    mlflow.sklearn.log_model(mlp_reg_with_labs, 'model')

results_regression['MLP (with labs)'] = metrics
models_regression['MLP (with labs)'] = (mlp_reg_with_labs, scaler_reg_with)

In [None]:
%%time
# MLP Regression - Without Labs
print("="*60)
print("MLP Regression - WITHOUT LABS")
print("="*60)

X_train, X_val, X_test, y_train, y_val, y_test = splits_reg['without_labs_full']

mlp_reg_without_labs, mlp_reg_params_without_labs, study_mlp_reg_without, scaler_reg_without = train_mlp_with_optuna(
    X_train, y_train, X_val, y_val,
    task='regression',
    n_trials=N_TRIALS_MLP,
    cv_folds=CV_FOLDS,
)

# Evaluate on validation set
X_val_scaled = scaler_reg_without.transform(X_val)
y_pred = mlp_reg_without_labs.predict(X_val_scaled)
metrics = compute_regression_metrics(y_val.values, y_pred)

print(f"\nValidation Results:")
print(f"  RMSE: {metrics['rmse']:.4f}")
print(f"  MAE:  {metrics['mae']:.4f}")
print(f"  R²:   {metrics['r2']:.4f}")

# Log to MLflow
with mlflow.start_run(run_name='MLP_reg_without_labs'):
    mlflow.set_tag('model_type', 'MLP')
    mlflow.set_tag('task', 'regression')
    mlflow.set_tag('feature_set', 'without_labs')
    mlflow.set_tag('imputation', 'full')
    
    mlflow.log_param('hidden_layer_sizes', str(mlp_reg_params_without_labs['hidden_layer_sizes']))
    mlflow.log_param('activation', mlp_reg_params_without_labs['activation'])
    mlflow.log_param('alpha', mlp_reg_params_without_labs['alpha'])
    mlflow.log_param('learning_rate', mlp_reg_params_without_labs['learning_rate'])
    mlflow.log_param('learning_rate_init', mlp_reg_params_without_labs['learning_rate_init'])
    
    mlflow.log_metric('val_rmse', metrics['rmse'])
    mlflow.log_metric('val_mae', metrics['mae'])
    mlflow.log_metric('val_r2', metrics['r2'])
    mlflow.log_metric('best_cv_score', study_mlp_reg_without.best_value)
    mlflow.log_metric('n_trials', N_TRIALS_MLP)
    
    mlflow.sklearn.log_model(mlp_reg_without_labs, 'model')

results_regression['MLP (without labs)'] = metrics
models_regression['MLP (without labs)'] = (mlp_reg_without_labs, scaler_reg_without)

## 6. Validation Set Comparison

Compare all models on the validation set before the final test evaluation.
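
The `compare_models_table` helper used in the cells below is defined elsewhere in the project. As a rough sketch of the kind of table such a helper might produce, assuming each entry in the results dictionary maps metric names to scalar values, a pandas version could look like the following. The function name `comparison_table_sketch` and the example numbers are hypothetical; this is not the project's implementation, and the numbers are not its results.

```python
import pandas as pd


def comparison_table_sketch(results, metrics, sort_by, ascending=False):
    """Build a model-by-metric table from a {model_name: {metric: value}} dict.

    Hypothetical stand-in for illustration only.
    """
    # Keep only the requested metrics for each model
    rows = {
        name: {m: model_metrics.get(m) for m in metrics}
        for name, model_metrics in results.items()
    }
    table = pd.DataFrame.from_dict(rows, orient="index")[metrics]
    return table.sort_values(sort_by, ascending=ascending)


# Made-up numbers, purely to show the shape of the output
example_results = {
    "Model A": {"f1_macro": 0.61, "accuracy": 0.70},
    "Model B": {"f1_macro": 0.65, "accuracy": 0.68},
}
example_table = comparison_table_sketch(
    example_results, metrics=["accuracy", "f1_macro"], sort_by="f1_macro"
)
print(example_table.to_string())
```

Sorting descending suits scores such as F1, while error metrics such as RMSE are sorted ascending, which is why the cells below pass `ascending` explicitly.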

In [None]:
# Classification comparison
print("="*60)
print("Classification Model Comparison (Validation Set)")
print("="*60)

cls_comparison = compare_models_table(
    results_classification,
    metrics=['accuracy', 'f1_macro', 'roc_auc_ovr', 'precision_macro', 'recall_macro'],
    sort_by='f1_macro',
    ascending=False
)
print(cls_comparison.to_string())

In [None]:
# Regression comparison
print("="*60)
print("Regression Model Comparison (Validation Set)")
print("="*60)

reg_comparison = compare_models_table(
    results_regression,
    metrics=['rmse', 'mae', 'r2'],
    sort_by='rmse',
    ascending=True
)
print(reg_comparison.to_string())

In [None]:
# Visualize classification comparison
fig = plot_model_comparison(
    results_classification,
    metrics=['accuracy', 'f1_macro', 'roc_auc_ovr'],
    title='Classification Model Comparison (Validation Set)'
)
fig.savefig(FIGURES_DIR / 'phase7_classification_comparison_val.png', dpi=300, bbox_inches='tight')
plt.show()

In [None]:
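# With vs Without Labs comparison
# Re-key the "without labs" results under the matching "with labs" names
# so both dictionaries share the same keys and can be paired per model.
results_with = {k: v for k, v in results_classification.items() if 'with labs' in k}
results_without = {k.replace('without', 'with'): v for k, v in results_classification.items() if 'without' in k}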

fig = plot_feature_set_comparison(
    results_with,
    results_without,
    metric='f1_macro',
    title='F1 Macro: With Labs vs Without Labs'
)
fig.savefig(FIGURES_DIR / 'phase7_with_without_labs_comparison.png', dpi=300, bbox_inches='tight')
plt.show()

## 7. Final Test Evaluation

Evaluate all trained models on the held-out test set, which was not used during training or hyperparameter tuning.
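
In the cells below, the MLP models are scored on features transformed by the scaler fitted during training, while the LightGBM models receive the raw features. A minimal sketch of that pattern follows, using synthetic data and a `Ridge` regressor as a stand-in for the tuned models. The helper name `predict_with_optional_scaler` is hypothetical and not part of the notebook.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def predict_with_optional_scaler(model, X, scaler=None):
    """Apply the fitted scaler (if one is given) before predicting."""
    X_in = scaler.transform(X) if scaler is not None else X
    return model.predict(X_in)


# Synthetic data standing in for the project's splits
X, y = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The scaler is fitted on the training data only, then reused on the test data
scaler = StandardScaler().fit(X_train)
model = Ridge().fit(scaler.transform(X_train), y_train)

y_pred = predict_with_optional_scaler(model, X_test, scaler)
rmse = float(np.sqrt(mean_squared_error(y_test, y_pred)))
print(f"Test RMSE on synthetic data: {rmse:.4f}")
```

Keeping the model and its scaler together, as the `(model, scaler)` tuples in `models_regression` do, makes it harder to score a model on unscaled inputs by accident.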

In [None]:
# Evaluate all classification models on test set
print("="*60)
print("Classification Test Set Evaluation")
print("="*60)

test_results_cls = {}

# LightGBM with labs
X_train, X_val, X_test, y_train, y_val, y_test = splits_cls['with_labs_minimal']
y_pred = lgb_clf_with_labs.predict(X_test.values)
y_prob = lgb_clf_with_labs.predict_proba(X_test.values)
test_results_cls['LightGBM (with labs)'] = compute_classification_metrics(y_test.values, y_pred, y_prob)

# LightGBM without labs
X_train, X_val, X_test, y_train, y_val, y_test = splits_cls['without_labs_minimal']
y_pred = lgb_clf_without_labs.predict(X_test.values)
y_prob = lgb_clf_without_labs.predict_proba(X_test.values)
test_results_cls['LightGBM (without labs)'] = compute_classification_metrics(y_test.values, y_pred, y_prob)

# MLP with labs
X_train, X_val, X_test, y_train, y_val, y_test = splits_cls['with_labs_full']
X_test_scaled = scaler_clf_with.transform(X_test)
y_pred = mlp_clf_with_labs.predict(X_test_scaled)
y_prob = mlp_clf_with_labs.predict_proba(X_test_scaled)
test_results_cls['MLP (with labs)'] = compute_classification_metrics(y_test.values, y_pred, y_prob)

# MLP without labs
X_train, X_val, X_test, y_train, y_val, y_test = splits_cls['without_labs_full']
X_test_scaled = scaler_clf_without.transform(X_test)
y_pred = mlp_clf_without_labs.predict(X_test_scaled)
y_prob = mlp_clf_without_labs.predict_proba(X_test_scaled)
test_results_cls['MLP (without labs)'] = compute_classification_metrics(y_test.values, y_pred, y_prob)

# Display comparison
test_cls_comparison = compare_models_table(
    test_results_cls,
    metrics=['accuracy', 'f1_macro', 'roc_auc_ovr'],
    sort_by='f1_macro',
    ascending=False
)
print(test_cls_comparison.to_string())

In [None]:
# Evaluate all regression models on test set
print("="*60)
print("Regression Test Set Evaluation")
print("="*60)

test_results_reg = {}

# LightGBM with labs
X_train, X_val, X_test, y_train, y_val, y_test = splits_reg['with_labs_minimal']
y_pred = lgb_reg_with_labs.predict(X_test.values)
test_results_reg['LightGBM (with labs)'] = compute_regression_metrics(y_test.values, y_pred)

# LightGBM without labs
X_train, X_val, X_test, y_train, y_val, y_test = splits_reg['without_labs_minimal']
y_pred = lgb_reg_without_labs.predict(X_test.values)
test_results_reg['LightGBM (without labs)'] = compute_regression_metrics(y_test.values, y_pred)

# MLP with labs
X_train, X_val, X_test, y_train, y_val, y_test = splits_reg['with_labs_full']
X_test_scaled = scaler_reg_with.transform(X_test)
y_pred = mlp_reg_with_labs.predict(X_test_scaled)
test_results_reg['MLP (with labs)'] = compute_regression_metrics(y_test.values, y_pred)

# MLP without labs
X_train, X_val, X_test, y_train, y_val, y_test = splits_reg['without_labs_full']
X_test_scaled = scaler_reg_without.transform(X_test)
y_pred = mlp_reg_without_labs.predict(X_test_scaled)
test_results_reg['MLP (without labs)'] = compute_regression_metrics(y_test.values, y_pred)

# Display comparison
test_reg_comparison = compare_models_table(
    test_results_reg,
    metrics=['rmse', 'mae', 'r2'],
    sort_by='rmse',
    ascending=True
)
print(test_reg_comparison.to_string())

In [None]:
# Visualize test set comparison
fig = plot_model_comparison(
    test_results_cls,
    metrics=['accuracy', 'f1_macro', 'roc_auc_ovr'],
    title='Classification Model Comparison (Test Set)'
)
fig.savefig(FIGURES_DIR / 'phase7_classification_comparison_test.png', dpi=300, bbox_inches='tight')
plt.show()

## 8. Best Model Analysis

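Generate a detailed evaluation (confusion matrices, ROC curves, and a classification report) for the classification model with the highest macro F1 on the test set.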
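
The cells below pick the best model by name and then use an if/elif chain to recover the matching model, scaler, and data split. One alternative, shown here only as a sketch, is a lookup table keyed by the same names used in `test_results_cls`. The `registry` layout is an assumption, the strings stand in for the fitted objects, and the scores are made up.

```python
# Hypothetical registry: model name -> (model, scaler or None, split key).
# Strings replace the fitted objects so the sketch runs on its own.
registry = {
    "LightGBM (with labs)": ("lgb_clf_with_labs", None, "with_labs_minimal"),
    "LightGBM (without labs)": ("lgb_clf_without_labs", None, "without_labs_minimal"),
    "MLP (with labs)": ("mlp_clf_with_labs", "scaler_clf_with", "with_labs_full"),
    "MLP (without labs)": ("mlp_clf_without_labs", "scaler_clf_without", "without_labs_full"),
}

# Made-up macro F1 scores, only to drive the selection
example_f1 = {
    "LightGBM (with labs)": 0.60,
    "LightGBM (without labs)": 0.55,
    "MLP (with labs)": 0.58,
    "MLP (without labs)": 0.52,
}

# Same selection rule as the notebook: highest macro F1 wins
best_name = max(example_f1, key=example_f1.get)
model_ref, scaler_ref, split_key = registry[best_name]
print(best_name, model_ref, scaler_ref, split_key)
```

A table like this keeps the name-to-object mapping in one place, so adding a model does not require another branch.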

In [None]:
# Find best classification model
best_cls_model_name = max(test_results_cls, key=lambda x: test_results_cls[x]['f1_macro'])
print(f"Best Classification Model: {best_cls_model_name}")
print(f"  F1 Macro: {test_results_cls[best_cls_model_name]['f1_macro']:.4f}")
print(f"  ROC AUC:  {test_results_cls[best_cls_model_name]['roc_auc_ovr']:.4f}")

# Find best regression model
best_reg_model_name = min(test_results_reg, key=lambda x: test_results_reg[x]['rmse'])
print(f"\nBest Regression Model: {best_reg_model_name}")
print(f"  RMSE: {test_results_reg[best_reg_model_name]['rmse']:.4f}")
print(f"  R²:   {test_results_reg[best_reg_model_name]['r2']:.4f}")

In [None]:
# Generate confusion matrix for best classification model
if 'LightGBM' in best_cls_model_name and 'with labs' in best_cls_model_name:
    X_train, X_val, X_test, y_train, y_val, y_test = splits_cls['with_labs_minimal']
    y_pred = lgb_clf_with_labs.predict(X_test.values)
    y_prob = lgb_clf_with_labs.predict_proba(X_test.values)
elif 'LightGBM' in best_cls_model_name:
    X_train, X_val, X_test, y_train, y_val, y_test = splits_cls['without_labs_minimal']
    y_pred = lgb_clf_without_labs.predict(X_test.values)
    y_prob = lgb_clf_without_labs.predict_proba(X_test.values)
elif 'with labs' in best_cls_model_name:
    X_train, X_val, X_test, y_train, y_val, y_test = splits_cls['with_labs_full']
    X_test_scaled = scaler_clf_with.transform(X_test)
    y_pred = mlp_clf_with_labs.predict(X_test_scaled)
    y_prob = mlp_clf_with_labs.predict_proba(X_test_scaled)
else:
    X_train, X_val, X_test, y_train, y_val, y_test = splits_cls['without_labs_full']
    X_test_scaled = scaler_clf_without.transform(X_test)
    y_pred = mlp_clf_without_labs.predict(X_test_scaled)
    y_prob = mlp_clf_without_labs.predict_proba(X_test_scaled)

# Plot confusion matrices
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

plot_confusion_matrix(y_test.values, y_pred, ax=axes[0], 
                     title=f'{best_cls_model_name} - Confusion Matrix')
plot_confusion_matrix(y_test.values, y_pred, ax=axes[1], normalize=True,
                     title=f'{best_cls_model_name} - Normalized')

plt.tight_layout()
fig.savefig(FIGURES_DIR / 'phase7_best_model_confusion_matrix.png', dpi=300, bbox_inches='tight')
plt.show()

In [None]:
# Plot ROC curves for best model
fig, ax = plt.subplots(figsize=(8, 6))
plot_roc_curves(y_test.values, y_prob, ax=ax, 
               title=f'{best_cls_model_name} - ROC Curves')
fig.savefig(FIGURES_DIR / 'phase7_best_model_roc_curves.png', dpi=300, bbox_inches='tight')
plt.show()

In [None]:
# Classification report for best model
print(f"\n{best_cls_model_name} - Classification Report (Test Set)")
print("="*60)
print(classification_report(
    y_test.values, y_pred,
    target_names=['No Diabetes', 'Prediabetes', 'Diabetes']
))

## 9. Save Models
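
The cells below store each MLP together with its scaler as a single `(model, scaler)` tuple, so both must be unpacked when the file is loaded again. A minimal round-trip sketch, assuming a temporary directory and a small `LogisticRegression` stand-in rather than the notebook's tuned models:

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Small synthetic dataset; the real notebook uses its own splits
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = (X[:, 0] > 0).astype(int)

scaler = StandardScaler().fit(X)
model = LogisticRegression().fit(scaler.transform(X), y)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "stand_in_model.joblib"
    joblib.dump((model, scaler), path)              # save model and scaler together
    loaded_model, loaded_scaler = joblib.load(path)  # unpack in the same order

same = np.array_equal(
    model.predict(scaler.transform(X)),
    loaded_model.predict(loaded_scaler.transform(X)),
)
print(f"Predictions identical after reload: {same}")
```

The LightGBM models are saved without a scaler, so those files load back as a single estimator rather than a tuple.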

In [None]:
# Save all trained models
advanced_models_dir = MODELS_DIR / 'advanced'
advanced_models_dir.mkdir(parents=True, exist_ok=True)  # also create MODELS_DIR if it is missing

# Classification models
cls_dir = advanced_models_dir / 'classification'
cls_dir.mkdir(exist_ok=True)

joblib.dump(lgb_clf_with_labs, cls_dir / 'lgb_with_labs.joblib')
joblib.dump(lgb_clf_without_labs, cls_dir / 'lgb_without_labs.joblib')
joblib.dump((mlp_clf_with_labs, scaler_clf_with), cls_dir / 'mlp_with_labs.joblib')
joblib.dump((mlp_clf_without_labs, scaler_clf_without), cls_dir / 'mlp_without_labs.joblib')

# Regression models
reg_dir = advanced_models_dir / 'regression'
reg_dir.mkdir(exist_ok=True)

joblib.dump(lgb_reg_with_labs, reg_dir / 'lgb_with_labs.joblib')
joblib.dump(lgb_reg_without_labs, reg_dir / 'lgb_without_labs.joblib')
joblib.dump((mlp_reg_with_labs, scaler_reg_with), reg_dir / 'mlp_with_labs.joblib')
joblib.dump((mlp_reg_without_labs, scaler_reg_without), reg_dir / 'mlp_without_labs.joblib')

print(f"Models saved to {advanced_models_dir}")

In [None]:
# Save results summary
results_summary = {
    'classification': {
        'validation': {k: {m: float(v) if isinstance(v, (np.floating, float)) else v 
                          for m, v in metrics.items() if m != 'confusion_matrix'}
                      for k, metrics in results_classification.items()},
        'test': {k: {m: float(v) if isinstance(v, (np.floating, float)) else v 
                    for m, v in metrics.items() if m != 'confusion_matrix'}
                for k, metrics in test_results_cls.items()},
    },
    'regression': {
        'validation': {k: {m: float(v) for m, v in metrics.items()}
                      for k, metrics in results_regression.items()},
        'test': {k: {m: float(v) for m, v in metrics.items()}
                for k, metrics in test_results_reg.items()},
    },
    'best_models': {
        'classification': best_cls_model_name,
        'regression': best_reg_model_name,
    },
    'hyperparameters': {
        'lightgbm_cls_with_labs': {k: float(v) if isinstance(v, (np.floating, float)) else v 
                                   for k, v in lgb_clf_params_with_labs.items() 
                                   if k not in ['class_weight', 'verbose', 'n_jobs', 'random_state']},
        'lightgbm_cls_without_labs': {k: float(v) if isinstance(v, (np.floating, float)) else v 
                                      for k, v in lgb_clf_params_without_labs.items() 
                                      if k not in ['class_weight', 'verbose', 'n_jobs', 'random_state']},
        'mlp_cls_with_labs': {k: str(v) if k == 'hidden_layer_sizes' else v 
                              for k, v in mlp_clf_params_with_labs.items() 
                              if k not in ['max_iter', 'early_stopping', 'validation_fraction', 'n_iter_no_change', 'random_state']},
    },
    'config': {
        'n_trials_lightgbm': N_TRIALS_LIGHTGBM,
        'n_trials_mlp': N_TRIALS_MLP,
        'cv_folds': CV_FOLDS,
        'random_state': RANDOM_STATE,
    }
}

with open(advanced_models_dir / 'results_summary.json', 'w') as f:
    json.dump(results_summary, f, indent=2)

print(f"Results saved to {advanced_models_dir / 'results_summary.json'}")

## 10. Summary

### Key Findings

In [None]:
print("="*60)
print("Phase 7: Advanced Models - Summary")
print("="*60)

print("\n### Classification Results (Test Set)")
print(test_cls_comparison.to_string())

print("\n### Regression Results (Test Set)")
print(test_reg_comparison.to_string())

print("\n### Best Models")
print(f"Classification: {best_cls_model_name}")
print(f"  F1 Macro: {test_results_cls[best_cls_model_name]['f1_macro']:.4f}")
print(f"  ROC AUC:  {test_results_cls[best_cls_model_name]['roc_auc_ovr']:.4f}")
print(f"\nRegression: {best_reg_model_name}")
print(f"  RMSE: {test_results_reg[best_reg_model_name]['rmse']:.4f}")
print(f"  R²:   {test_results_reg[best_reg_model_name]['r2']:.4f}")

print("\n### Artifacts Generated")
print(f"- Models saved to: {advanced_models_dir}")
print(f"- Figures saved to: {FIGURES_DIR}")
print(f"- MLflow runs logged to: {MLRUNS_DIR}")
print(f"\nTo view MLflow UI: mlflow ui --backend-store-uri {MLRUNS_DIR}")

## Next Steps

- **Phase 7.1 (Optional)**: Deep Learning with TensorFlow/PyTorch
- **Phase 8**: Model Evaluation & Comparison (detailed analysis)
- **Phase 9**: Model Interpretation & Insights (SHAP, feature importance)