### Notebook 04: Model Tuning and Hybrid Models

This notebook tunes the baseline Random Forest and XGBoost models and explores hybrid modeling approaches (soft voting and stacking, both of which also use Logistic Regression) to improve prediction accuracy. Optuna is used for hyperparameter tuning.

In [26]:
import os
import numpy as np
import pandas as pd 
import optuna
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.ensemble import RandomForestClassifier, VotingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from xgboost import XGBClassifier
import matplotlib.pyplot as plt
import seaborn as sns
import joblib

import warnings
warnings.filterwarnings('ignore')

In [5]:
# Load preprocessed data
preprocessed_file = r'C:\Users\anitr\AAI590_Capstone\AAI590_Capstone_AH\Data\ea_modelset\eamodelset\dataset\preprocessed_models.csv'
models_df = pd.read_csv(preprocessed_file)
models_df.head()

Unnamed: 0,id,name,duplicateCount,elementCount,relationshipCount,viewCount,num_formats,hasWarning,hasDuplicate,rel_elem_ratio,view_elem_ratio,source_GitHub,source_Other,source_Unknown,language_bs,language_ca,language_cs,language_da,language_de,language_en,language_es,language_fi,language_fr,language_hr,language_id,language_it,language_ko,language_ms,language_nb,language_nl,language_nn,language_pl,language_pt,language_ru,language_sl,language_sv,language_tl,language_vi,language_yo,language_zh,arb_outcome
0,id-48fb3807bfa249a9bae607b6a92cc390,LAE,0,142,296,24,4,0,0,2.084507,0.169014,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,4cc127d7-6937-42e8-99fb-19f0f6f4991a,Baseline Media Production,0,22,28,1,4,0,0,1.272727,0.045455,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1
2,_7RWQ8CqVEey-A40W5C_9dw,buhService,0,55,41,3,4,0,0,0.745455,0.054545,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0
3,3846c562-eab4-4e07-aa95-87703e0e0e69,Data model test,0,15,11,1,2,0,0,0.733333,0.066667,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1
4,_ay028PGjEeqygJczXaaxEQ,payments-arch,0,18,20,1,4,0,0,1.111111,0.055556,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1


In [6]:
# Define arb_outcome as target variable (y) and features (X)
X = models_df.drop(columns=['name','id','arb_outcome'])
y = models_df['arb_outcome']

# Check shape of X and y
print(f"Models DataFrame shape: {models_df.shape}")
print(f"Features shape: {X.shape}")
print(f"Target shape: {y.shape}")

Models DataFrame shape: (978, 41)
Features shape: (978, 38)
Target shape: (978,)
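
The `head()` output above lists an `Unnamed: 0` column, and 41 columns minus the three dropped ones leaves 38 features, so that saved row index is still part of `X`. Whether it influences the scores below is not established here, but a row index can leak information if the file happens to be ordered by outcome. A minimal sketch of detecting and dropping such a column, using a tiny hypothetical CSV in place of `preprocessed_models.csv`:

```python
import io
import pandas as pd

# Tiny stand-in for preprocessed_models.csv (hypothetical rows, same column pattern).
csv_text = """,id,name,elementCount,arb_outcome
0,a1,Model A,142,0
1,b2,Model B,22,1
2,c3,Model C,55,2
"""

demo_df = pd.read_csv(io.StringIO(csv_text))

# A CSV written together with its index has an empty header cell,
# which pandas names "Unnamed: 0" when reading the file back.
stray_cols = [c for c in demo_df.columns if c.startswith("Unnamed")]

# Drop the stray index column along with the identifiers and the target.
X_demo = demo_df.drop(columns=stray_cols + ["name", "id", "arb_outcome"])
y_demo = demo_df["arb_outcome"]
print(stray_cols, list(X_demo.columns))
```

Reading the file with `index_col=0`, or writing it with `index=False` in the preprocessing notebook, would avoid the column in the first place.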


In [7]:
# Train/test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)
print(f"Training set shape: {X_train.shape}, {y_train.shape}")
print(f"Test set shape: {X_test.shape}, {y_test.shape}")

Training set shape: (782, 38), (782,)
Test set shape: (196, 38), (196,)


### XGBOOST - HYPERPARAMETER TUNING WITH OPTUNA

In [11]:
# Use optuna to optimize hyperparameters for XGBoost
def xgb_objective(trial):
    param = {
        'n_estimators': trial.suggest_int('n_estimators', 200, 500),
        'learning_rate': trial.suggest_float('learning_rate', 0.01, 0.3, log=True),
        'max_depth': trial.suggest_int('max_depth', 3, 12),
        'subsample': trial.suggest_float('subsample', 0.6, 1.0),
        'colsample_bytree': trial.suggest_float('colsample_bytree', 0.6, 1.0),
        'gamma': trial.suggest_float('gamma', 0.0, 5.0),
        'min_child_weight': trial.suggest_int('min_child_weight', 1, 10),
        'objective': 'multi:softprob',
        'num_class': 3,
        'random_state': 42,
        'n_jobs': -1
    }
    
    xgb = XGBClassifier(**param)
    score = cross_val_score(xgb, X_train, y_train, cv=3, scoring='f1_macro', n_jobs=-1).mean()
    return score

study = optuna.create_study(direction='maximize')
study.optimize(xgb_objective, n_trials=25, show_progress_bar=True)
print("Best hyperparameters: ", study.best_params)
print("Best F1 Macro Score: ", study.best_value)



[I 2025-12-07 13:51:51,612] A new study created in memory with name: no-name-acd4a802-87ef-4951-829d-53443ea8a024
Best trial: 0. Best value: 0.981955:   4%|▍         | 1/25 [00:02<00:54,  2.27s/it]

[I 2025-12-07 13:51:53,879] Trial 0 finished with value: 0.9819554682423414 and parameters: {'n_estimators': 225, 'learning_rate': 0.10127098673703337, 'max_depth': 12, 'subsample': 0.945601409106067, 'colsample_bytree': 0.6965270370415978, 'gamma': 3.2108108771096373, 'min_child_weight': 3}. Best is trial 0 with value: 0.9819554682423414.


Best trial: 0. Best value: 0.981955:   8%|▊         | 2/25 [00:03<00:44,  1.93s/it]

[I 2025-12-07 13:51:55,578] Trial 1 finished with value: 0.9819554682423414 and parameters: {'n_estimators': 285, 'learning_rate': 0.21790987712319423, 'max_depth': 8, 'subsample': 0.9379372675096971, 'colsample_bytree': 0.8472522617357415, 'gamma': 2.2334149710812525, 'min_child_weight': 4}. Best is trial 0 with value: 0.9819554682423414.


Best trial: 0. Best value: 0.981955:  12%|█▏        | 3/25 [00:05<00:40,  1.84s/it]

[I 2025-12-07 13:51:57,318] Trial 2 finished with value: 0.9807381413367016 and parameters: {'n_estimators': 371, 'learning_rate': 0.039280511496430366, 'max_depth': 4, 'subsample': 0.7340724347635605, 'colsample_bytree': 0.7613090801040259, 'gamma': 3.214681237941246, 'min_child_weight': 2}. Best is trial 0 with value: 0.9819554682423414.


Best trial: 3. Best value: 0.983403:  16%|█▌        | 4/25 [00:07<00:37,  1.81s/it]

[I 2025-12-07 13:51:59,068] Trial 3 finished with value: 0.983402884317829 and parameters: {'n_estimators': 446, 'learning_rate': 0.16117142847326213, 'max_depth': 12, 'subsample': 0.9736545849103231, 'colsample_bytree': 0.7867759783129827, 'gamma': 4.508548952978752, 'min_child_weight': 2}. Best is trial 3 with value: 0.983402884317829.


Best trial: 4. Best value: 0.98651:  20%|██        | 5/25 [00:09<00:35,  1.80s/it] 

[I 2025-12-07 13:52:00,841] Trial 4 finished with value: 0.9865104372579466 and parameters: {'n_estimators': 397, 'learning_rate': 0.017486265463762897, 'max_depth': 11, 'subsample': 0.7894606592238029, 'colsample_bytree': 0.76007169833078, 'gamma': 0.42768935321655543, 'min_child_weight': 2}. Best is trial 4 with value: 0.9865104372579466.


Best trial: 4. Best value: 0.98651:  28%|██▊       | 7/25 [00:10<00:21,  1.21s/it]

[I 2025-12-07 13:52:02,423] Trial 5 finished with value: 0.9819554682423414 and parameters: {'n_estimators': 373, 'learning_rate': 0.03352081992157196, 'max_depth': 4, 'subsample': 0.8125453177197373, 'colsample_bytree': 0.6237924293976073, 'gamma': 2.469005074505817, 'min_child_weight': 4}. Best is trial 4 with value: 0.9865104372579466.
[I 2025-12-07 13:52:02,574] Trial 6 finished with value: 0.9807381413367016 and parameters: {'n_estimators': 380, 'learning_rate': 0.02048384915867571, 'max_depth': 10, 'subsample': 0.7631610260529992, 'colsample_bytree': 0.8635005755026626, 'gamma': 4.075573419798709, 'min_child_weight': 6}. Best is trial 4 with value: 0.9865104372579466.


Best trial: 4. Best value: 0.98651:  36%|███▌      | 9/25 [00:11<00:10,  1.55it/s]

[I 2025-12-07 13:52:02,724] Trial 7 finished with value: 0.9819554682423414 and parameters: {'n_estimators': 425, 'learning_rate': 0.058970245535082934, 'max_depth': 6, 'subsample': 0.9824376116951496, 'colsample_bytree': 0.6683450004989513, 'gamma': 1.0549991284577243, 'min_child_weight': 6}. Best is trial 4 with value: 0.9865104372579466.
[I 2025-12-07 13:52:02,866] Trial 8 finished with value: 0.9457508345546621 and parameters: {'n_estimators': 298, 'learning_rate': 0.013061889539566001, 'max_depth': 6, 'subsample': 0.7138312155885433, 'colsample_bytree': 0.9792276322125337, 'gamma': 4.433402432710433, 'min_child_weight': 8}. Best is trial 4 with value: 0.9865104372579466.


Best trial: 4. Best value: 0.98651:  40%|████      | 10/25 [00:11<00:07,  2.09it/s]

[I 2025-12-07 13:52:02,976] Trial 9 finished with value: 0.9819554682423414 and parameters: {'n_estimators': 278, 'learning_rate': 0.055990288019580135, 'max_depth': 12, 'subsample': 0.8840154608609796, 'colsample_bytree': 0.9997615762488948, 'gamma': 4.76798357502854, 'min_child_weight': 3}. Best is trial 4 with value: 0.9865104372579466.


Best trial: 11. Best value: 0.989158:  48%|████▊     | 12/25 [00:11<00:04,  3.02it/s]

[I 2025-12-07 13:52:03,206] Trial 10 finished with value: 0.9118516479183268 and parameters: {'n_estimators': 469, 'learning_rate': 0.01156846324385182, 'max_depth': 9, 'subsample': 0.607921462724954, 'colsample_bytree': 0.9018201914680373, 'gamma': 0.43025053752562226, 'min_child_weight': 10}. Best is trial 4 with value: 0.9865104372579466.
[I 2025-12-07 13:52:03,375] Trial 11 finished with value: 0.989158277185438 and parameters: {'n_estimators': 492, 'learning_rate': 0.2516756545286703, 'max_depth': 11, 'subsample': 0.8477864785313033, 'colsample_bytree': 0.7655106019974527, 'gamma': 1.276914562872955, 'min_child_weight': 1}. Best is trial 11 with value: 0.989158277185438.


Best trial: 11. Best value: 0.989158:  56%|█████▌    | 14/25 [00:12<00:02,  4.04it/s]

[I 2025-12-07 13:52:03,542] Trial 12 finished with value: 0.989158277185438 and parameters: {'n_estimators': 496, 'learning_rate': 0.10455010293133245, 'max_depth': 10, 'subsample': 0.8507545454363709, 'colsample_bytree': 0.7355371950772454, 'gamma': 1.3215315564367387, 'min_child_weight': 1}. Best is trial 11 with value: 0.989158277185438.
[I 2025-12-07 13:52:03,709] Trial 13 finished with value: 0.989158277185438 and parameters: {'n_estimators': 500, 'learning_rate': 0.28084677233621547, 'max_depth': 10, 'subsample': 0.8603413591071123, 'colsample_bytree': 0.7173302255101637, 'gamma': 1.5365119358855117, 'min_child_weight': 1}. Best is trial 11 with value: 0.989158277185438.


Best trial: 11. Best value: 0.989158:  60%|██████    | 15/25 [00:12<00:02,  4.44it/s]

[I 2025-12-07 13:52:03,885] Trial 14 finished with value: 0.989158277185438 and parameters: {'n_estimators': 497, 'learning_rate': 0.11983599854823543, 'max_depth': 9, 'subsample': 0.857300109761848, 'colsample_bytree': 0.826345139282052, 'gamma': 1.4862425387746545, 'min_child_weight': 1}. Best is trial 11 with value: 0.989158277185438.


Best trial: 11. Best value: 0.989158:  68%|██████▊   | 17/25 [00:12<00:01,  5.43it/s]

[I 2025-12-07 13:52:04,072] Trial 15 finished with value: 0.9707991489976324 and parameters: {'n_estimators': 432, 'learning_rate': 0.09431381356552218, 'max_depth': 7, 'subsample': 0.6773432497247874, 'colsample_bytree': 0.6033874299989179, 'gamma': 0.009737929987523986, 'min_child_weight': 4}. Best is trial 11 with value: 0.989158277185438.
[I 2025-12-07 13:52:04,186] Trial 16 finished with value: 0.9777869388435164 and parameters: {'n_estimators': 324, 'learning_rate': 0.17476057876038778, 'max_depth': 10, 'subsample': 0.9032022975065774, 'colsample_bytree': 0.7374295506104322, 'gamma': 1.8007441036060574, 'min_child_weight': 8}. Best is trial 11 with value: 0.989158277185438.


Best trial: 11. Best value: 0.989158:  76%|███████▌  | 19/25 [00:12<00:00,  6.04it/s]

[I 2025-12-07 13:52:04,353] Trial 17 finished with value: 0.989158277185438 and parameters: {'n_estimators': 467, 'learning_rate': 0.07441744161500245, 'max_depth': 11, 'subsample': 0.8259129115324532, 'colsample_bytree': 0.6756064323262831, 'gamma': 0.9368858770654205, 'min_child_weight': 1}. Best is trial 11 with value: 0.989158277185438.
[I 2025-12-07 13:52:04,488] Trial 18 finished with value: 0.9819554682423414 and parameters: {'n_estimators': 413, 'learning_rate': 0.28020389891051783, 'max_depth': 8, 'subsample': 0.9031815673654919, 'colsample_bytree': 0.8005252627121945, 'gamma': 3.0066669833256374, 'min_child_weight': 5}. Best is trial 11 with value: 0.989158277185438.


Best trial: 11. Best value: 0.989158:  84%|████████▍ | 21/25 [00:13<00:00,  6.74it/s]

[I 2025-12-07 13:52:04,645] Trial 19 finished with value: 0.9833862309464174 and parameters: {'n_estimators': 466, 'learning_rate': 0.14017890573961136, 'max_depth': 11, 'subsample': 0.8361829959634607, 'colsample_bytree': 0.6316418248059289, 'gamma': 1.9690633585746635, 'min_child_weight': 3}. Best is trial 11 with value: 0.989158277185438.
[I 2025-12-07 13:52:04,760] Trial 20 finished with value: 0.9719437643767158 and parameters: {'n_estimators': 339, 'learning_rate': 0.20635592111862402, 'max_depth': 9, 'subsample': 0.7787321154715885, 'colsample_bytree': 0.8882443783062456, 'gamma': 1.019350105766834, 'min_child_weight': 7}. Best is trial 11 with value: 0.989158277185438.


Best trial: 11. Best value: 0.989158:  92%|█████████▏| 23/25 [00:13<00:00,  6.30it/s]

[I 2025-12-07 13:52:04,929] Trial 21 finished with value: 0.989158277185438 and parameters: {'n_estimators': 500, 'learning_rate': 0.2714450558896942, 'max_depth': 10, 'subsample': 0.8611719550978623, 'colsample_bytree': 0.7318947256396325, 'gamma': 0.9400172879496328, 'min_child_weight': 1}. Best is trial 11 with value: 0.989158277185438.
[I 2025-12-07 13:52:05,095] Trial 22 finished with value: 0.989158277185438 and parameters: {'n_estimators': 500, 'learning_rate': 0.28107936341763085, 'max_depth': 10, 'subsample': 0.8689609803690862, 'colsample_bytree': 0.7248089781768803, 'gamma': 1.4774384882764053, 'min_child_weight': 1}. Best is trial 11 with value: 0.989158277185438.


Best trial: 11. Best value: 0.989158: 100%|██████████| 25/25 [00:13<00:00,  1.81it/s]

[I 2025-12-07 13:52:05,252] Trial 23 finished with value: 0.9877591954470054 and parameters: {'n_estimators': 454, 'learning_rate': 0.2002842010673283, 'max_depth': 11, 'subsample': 0.9278722478845978, 'colsample_bytree': 0.6965277360632465, 'gamma': 1.5331551680748203, 'min_child_weight': 2}. Best is trial 11 with value: 0.989158277185438.
[I 2025-12-07 13:52:05,419] Trial 24 finished with value: 0.9863131381222979 and parameters: {'n_estimators': 485, 'learning_rate': 0.12958566035006258, 'max_depth': 9, 'subsample': 0.8222609133340955, 'colsample_bytree': 0.7914706632469731, 'gamma': 2.6885968499815083, 'min_child_weight': 1}. Best is trial 11 with value: 0.989158277185438.
Best hyperparameters:  {'n_estimators': 492, 'learning_rate': 0.2516756545286703, 'max_depth': 11, 'subsample': 0.8477864785313033, 'colsample_bytree': 0.7655106019974527, 'gamma': 1.276914562872955, 'min_child_weight': 1}
Best F1 Macro Score:  0.989158277185438





In [12]:
# Train optimized XGBoost model
xgboost_tuned = XGBClassifier(
    random_state=42, 
    n_estimators=study.best_params['n_estimators'], 
    learning_rate=study.best_params['learning_rate'], 
    max_depth=study.best_params['max_depth'], 
    subsample=study.best_params['subsample'], 
    colsample_bytree=study.best_params['colsample_bytree'], 
    gamma=study.best_params['gamma'],
    min_child_weight=study.best_params['min_child_weight'],
    objective="multi:softprob", 
    num_class=3
)
xgboost_tuned.fit(X_train, y_train)
y_pred_tuned = xgboost_tuned.predict(X_test)

# Evaluate tuned XGBoost model
accuracy_tuned = accuracy_score(y_test, y_pred_tuned)
print(f"Tuned XGBoost Model Accuracy: {accuracy_tuned:.4f}")
print("Classification Report:\n", classification_report(y_test, y_pred_tuned))

Tuned XGBoost Model Accuracy: 0.9949
Classification Report:
               precision    recall  f1-score   support

           0       0.98      1.00      0.99        50
           1       1.00      0.99      1.00       105
           2       1.00      1.00      1.00        41

    accuracy                           0.99       196
   macro avg       0.99      1.00      1.00       196
weighted avg       0.99      0.99      0.99       196
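
The report above is consistent with a single error: one class-1 sample predicted as class 0. The notebook does not print the confusion matrix, so the matrix below is a reconstruction assumed from the printed precision, recall, and support values, not an output of the model. The sketch recomputes accuracy, the per-class scores, and the macro F1 (the metric Optuna optimized) from that assumed matrix.

```python
# Confusion matrix assumed from the report above (rows = true class, columns = predicted).
# One class-1 sample predicted as class 0 reproduces the printed figures.
cm = [
    [50, 0, 0],
    [1, 104, 0],
    [0, 0, 41],
]

n_classes = len(cm)
total = sum(sum(row) for row in cm)
accuracy = sum(cm[i][i] for i in range(n_classes)) / total

f1_scores = []
for k in range(n_classes):
    tp = cm[k][k]
    predicted_k = sum(cm[i][k] for i in range(n_classes))  # column sum
    actual_k = sum(cm[k])                                   # row sum
    precision = tp / predicted_k
    recall = tp / actual_k
    f1 = 2 * precision * recall / (precision + recall)
    f1_scores.append(f1)
    print(f"class {k}: precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")

# Macro F1 is the unweighted mean of the per-class F1 scores,
# so each class counts equally regardless of its support.
macro_f1 = sum(f1_scores) / n_classes
print(f"accuracy={accuracy:.4f} macro_f1={macro_f1:.4f}")
```

Calling `confusion_matrix(y_test, y_pred_tuned)`, which is already imported at the top of the notebook, would confirm or correct the assumed matrix.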



### RANDOM FOREST - HYPERPARAMETER TUNING WITH OPTUNA

In [28]:
# Use optuna to optimize hyperparameters for Random Forest
def rf_objective(trial):
    param = {
        'n_estimators': trial.suggest_int('n_estimators', 200, 600),
        'max_depth': trial.suggest_int('max_depth', 5, 40),
        'min_samples_split': trial.suggest_int('min_samples_split', 2, 20),
        'min_samples_leaf': trial.suggest_int('min_samples_leaf', 1, 10),
        'max_features': trial.suggest_categorical('max_features', ['sqrt', 'log2', None]),
        'bootstrap': trial.suggest_categorical('bootstrap', [True, False]),
        'class_weight': trial.suggest_categorical('class_weight', [None, 'balanced', 'balanced_subsample']),
        'random_state': 42,
        'n_jobs': -1
    }
    
    rf = RandomForestClassifier(**param)
    score = cross_val_score(rf, X_train, y_train, cv=3, scoring='f1_macro', n_jobs=-1).mean()
    return score

study_rf = optuna.create_study(direction='maximize')
study_rf.optimize(rf_objective, n_trials=25, show_progress_bar=True)
print("Best hyperparameters for Random Forest: ", study_rf.best_params)
print("Best F1 Macro Score for Random Forest: ", study_rf.best_value)


[I 2025-12-09 01:02:22,500] A new study created in memory with name: no-name-6622ec08-e92a-487d-a943-95820816e44d


Best trial: 0. Best value: 0.927799:   4%|▍         | 1/25 [00:03<01:28,  3.69s/it]

[I 2025-12-09 01:02:26,209] Trial 0 finished with value: 0.9277994701601587 and parameters: {'n_estimators': 418, 'max_depth': 23, 'min_samples_split': 14, 'min_samples_leaf': 10, 'max_features': 'sqrt', 'bootstrap': True, 'class_weight': 'balanced'}. Best is trial 0 with value: 0.9277994701601587.


Best trial: 1. Best value: 0.954474:   8%|▊         | 2/25 [00:05<01:01,  2.69s/it]

[I 2025-12-09 01:02:28,200] Trial 1 finished with value: 0.9544739068815103 and parameters: {'n_estimators': 332, 'max_depth': 22, 'min_samples_split': 15, 'min_samples_leaf': 7, 'max_features': 'sqrt', 'bootstrap': False, 'class_weight': None}. Best is trial 1 with value: 0.9544739068815103.


Best trial: 2. Best value: 0.988955:  12%|█▏        | 3/25 [00:08<00:56,  2.57s/it]

[I 2025-12-09 01:02:30,618] Trial 2 finished with value: 0.9889546306063924 and parameters: {'n_estimators': 407, 'max_depth': 33, 'min_samples_split': 11, 'min_samples_leaf': 4, 'max_features': None, 'bootstrap': True, 'class_weight': 'balanced_subsample'}. Best is trial 2 with value: 0.9889546306063924.


Best trial: 2. Best value: 0.988955:  16%|█▌        | 4/25 [00:10<00:49,  2.37s/it]

[I 2025-12-09 01:02:32,688] Trial 3 finished with value: 0.9835959647655149 and parameters: {'n_estimators': 324, 'max_depth': 21, 'min_samples_split': 13, 'min_samples_leaf': 1, 'max_features': 'log2', 'bootstrap': True, 'class_weight': 'balanced'}. Best is trial 2 with value: 0.9889546306063924.


Best trial: 2. Best value: 0.988955:  20%|██        | 5/25 [00:12<00:44,  2.22s/it]

[I 2025-12-09 01:02:34,639] Trial 4 finished with value: 0.943202946718133 and parameters: {'n_estimators': 239, 'max_depth': 36, 'min_samples_split': 18, 'min_samples_leaf': 5, 'max_features': 'sqrt', 'bootstrap': True, 'class_weight': None}. Best is trial 2 with value: 0.9889546306063924.


Best trial: 5. Best value: 0.991437:  24%|██▍       | 6/25 [00:14<00:42,  2.22s/it]

[I 2025-12-09 01:02:36,822] Trial 5 finished with value: 0.9914374928014849 and parameters: {'n_estimators': 425, 'max_depth': 32, 'min_samples_split': 4, 'min_samples_leaf': 1, 'max_features': None, 'bootstrap': True, 'class_weight': 'balanced'}. Best is trial 5 with value: 0.9914374928014849.


Best trial: 5. Best value: 0.991437:  28%|██▊       | 7/25 [00:14<00:29,  1.66s/it]

[I 2025-12-09 01:02:37,357] Trial 6 finished with value: 0.9726047938343018 and parameters: {'n_estimators': 464, 'max_depth': 37, 'min_samples_split': 15, 'min_samples_leaf': 4, 'max_features': 'log2', 'bootstrap': False, 'class_weight': 'balanced_subsample'}. Best is trial 5 with value: 0.9914374928014849.


Best trial: 5. Best value: 0.991437:  32%|███▏      | 8/25 [00:15<00:20,  1.23s/it]

[I 2025-12-09 01:02:37,674] Trial 7 finished with value: 0.954864852521985 and parameters: {'n_estimators': 239, 'max_depth': 34, 'min_samples_split': 11, 'min_samples_leaf': 4, 'max_features': 'sqrt', 'bootstrap': True, 'class_weight': None}. Best is trial 5 with value: 0.9914374928014849.


Best trial: 5. Best value: 0.991437:  36%|███▌      | 9/25 [00:15<00:16,  1.05s/it]

[I 2025-12-09 01:02:38,291] Trial 8 finished with value: 0.9644445405359311 and parameters: {'n_estimators': 308, 'max_depth': 20, 'min_samples_split': 6, 'min_samples_leaf': 9, 'max_features': None, 'bootstrap': True, 'class_weight': 'balanced_subsample'}. Best is trial 5 with value: 0.9914374928014849.


Best trial: 5. Best value: 0.991437:  40%|████      | 10/25 [00:16<00:12,  1.20it/s]

[I 2025-12-09 01:02:38,670] Trial 9 finished with value: 0.9591941812571996 and parameters: {'n_estimators': 310, 'max_depth': 7, 'min_samples_split': 9, 'min_samples_leaf': 5, 'max_features': 'sqrt', 'bootstrap': False, 'class_weight': None}. Best is trial 5 with value: 0.9914374928014849.


Best trial: 10. Best value: 0.992667:  44%|████▍     | 11/25 [00:16<00:11,  1.24it/s]

[I 2025-12-09 01:02:39,424] Trial 10 finished with value: 0.9926668537044878 and parameters: {'n_estimators': 575, 'max_depth': 28, 'min_samples_split': 3, 'min_samples_leaf': 1, 'max_features': None, 'bootstrap': False, 'class_weight': 'balanced'}. Best is trial 10 with value: 0.9926668537044878.


Best trial: 10. Best value: 0.992667:  48%|████▊     | 12/25 [00:17<00:10,  1.24it/s]

[I 2025-12-09 01:02:40,225] Trial 11 finished with value: 0.9926668537044878 and parameters: {'n_estimators': 584, 'max_depth': 27, 'min_samples_split': 3, 'min_samples_leaf': 1, 'max_features': None, 'bootstrap': False, 'class_weight': 'balanced'}. Best is trial 10 with value: 0.9926668537044878.


Best trial: 10. Best value: 0.992667:  52%|█████▏    | 13/25 [00:18<00:09,  1.23it/s]

[I 2025-12-09 01:02:41,060] Trial 12 finished with value: 0.99035905837628 and parameters: {'n_estimators': 595, 'max_depth': 28, 'min_samples_split': 2, 'min_samples_leaf': 2, 'max_features': None, 'bootstrap': False, 'class_weight': 'balanced'}. Best is trial 10 with value: 0.9926668537044878.


Best trial: 10. Best value: 0.992667:  56%|█████▌    | 14/25 [00:19<00:08,  1.23it/s]

[I 2025-12-09 01:02:41,871] Trial 13 finished with value: 0.99035905837628 and parameters: {'n_estimators': 599, 'max_depth': 13, 'min_samples_split': 7, 'min_samples_leaf': 2, 'max_features': None, 'bootstrap': False, 'class_weight': 'balanced'}. Best is trial 10 with value: 0.9926668537044878.


Best trial: 10. Best value: 0.992667:  60%|██████    | 15/25 [00:20<00:07,  1.29it/s]

[I 2025-12-09 01:02:42,565] Trial 14 finished with value: 0.991583073247828 and parameters: {'n_estimators': 517, 'max_depth': 29, 'min_samples_split': 2, 'min_samples_leaf': 2, 'max_features': None, 'bootstrap': False, 'class_weight': 'balanced'}. Best is trial 10 with value: 0.9926668537044878.


Best trial: 10. Best value: 0.992667:  64%|██████▍   | 16/25 [00:20<00:06,  1.37it/s]

[I 2025-12-09 01:02:43,188] Trial 15 finished with value: 0.9861710529297048 and parameters: {'n_estimators': 528, 'max_depth': 26, 'min_samples_split': 5, 'min_samples_leaf': 3, 'max_features': None, 'bootstrap': False, 'class_weight': 'balanced'}. Best is trial 10 with value: 0.9926668537044878.


Best trial: 10. Best value: 0.992667:  68%|██████▊   | 17/25 [00:21<00:05,  1.41it/s]

[I 2025-12-09 01:02:43,844] Trial 16 finished with value: 0.9612623951402955 and parameters: {'n_estimators': 540, 'max_depth': 40, 'min_samples_split': 8, 'min_samples_leaf': 8, 'max_features': None, 'bootstrap': False, 'class_weight': 'balanced'}. Best is trial 10 with value: 0.9926668537044878.


Best trial: 10. Best value: 0.992667:  72%|███████▏  | 18/25 [00:21<00:04,  1.51it/s]

[I 2025-12-09 01:02:44,400] Trial 17 finished with value: 0.9860439726888518 and parameters: {'n_estimators': 488, 'max_depth': 17, 'min_samples_split': 4, 'min_samples_leaf': 1, 'max_features': 'log2', 'bootstrap': False, 'class_weight': 'balanced'}. Best is trial 10 with value: 0.9926668537044878.


Best trial: 10. Best value: 0.992667:  76%|███████▌  | 19/25 [00:22<00:03,  1.52it/s]

[I 2025-12-09 01:02:45,041] Trial 18 finished with value: 0.9847839365904066 and parameters: {'n_estimators': 553, 'max_depth': 16, 'min_samples_split': 20, 'min_samples_leaf': 6, 'max_features': None, 'bootstrap': False, 'class_weight': 'balanced'}. Best is trial 10 with value: 0.9926668537044878.


Best trial: 10. Best value: 0.992667:  80%|████████  | 20/25 [00:23<00:03,  1.60it/s]

[I 2025-12-09 01:02:45,589] Trial 19 finished with value: 0.9861710529297048 and parameters: {'n_estimators': 469, 'max_depth': 25, 'min_samples_split': 2, 'min_samples_leaf': 3, 'max_features': None, 'bootstrap': False, 'class_weight': 'balanced_subsample'}. Best is trial 10 with value: 0.9926668537044878.


Best trial: 10. Best value: 0.992667:  84%|████████▍ | 21/25 [00:23<00:02,  1.63it/s]

[I 2025-12-09 01:02:46,181] Trial 20 finished with value: 0.9786990472735696 and parameters: {'n_estimators': 577, 'max_depth': 30, 'min_samples_split': 9, 'min_samples_leaf': 3, 'max_features': 'log2', 'bootstrap': False, 'class_weight': 'balanced'}. Best is trial 10 with value: 0.9926668537044878.


Best trial: 10. Best value: 0.992667:  88%|████████▊ | 22/25 [00:24<00:01,  1.64it/s]

[I 2025-12-09 01:02:46,781] Trial 21 finished with value: 0.991583073247828 and parameters: {'n_estimators': 501, 'max_depth': 29, 'min_samples_split': 2, 'min_samples_leaf': 2, 'max_features': None, 'bootstrap': False, 'class_weight': 'balanced'}. Best is trial 10 with value: 0.9926668537044878.


Best trial: 10. Best value: 0.992667:  92%|█████████▏| 23/25 [00:24<00:01,  1.61it/s]

[I 2025-12-09 01:02:47,435] Trial 22 finished with value: 0.9926668537044878 and parameters: {'n_estimators': 556, 'max_depth': 26, 'min_samples_split': 4, 'min_samples_leaf': 1, 'max_features': None, 'bootstrap': False, 'class_weight': 'balanced'}. Best is trial 10 with value: 0.9926668537044878.


Best trial: 10. Best value: 0.992667:  96%|█████████▌| 24/25 [00:25<00:00,  1.55it/s]

[I 2025-12-09 01:02:48,131] Trial 23 finished with value: 0.9926668537044878 and parameters: {'n_estimators': 561, 'max_depth': 26, 'min_samples_split': 4, 'min_samples_leaf': 1, 'max_features': None, 'bootstrap': False, 'class_weight': 'balanced'}. Best is trial 10 with value: 0.9926668537044878.


Best trial: 10. Best value: 0.992667: 100%|██████████| 25/25 [00:26<00:00,  1.05s/it]

[I 2025-12-09 01:02:48,810] Trial 24 finished with value: 0.9926668537044878 and parameters: {'n_estimators': 561, 'max_depth': 18, 'min_samples_split': 6, 'min_samples_leaf': 1, 'max_features': None, 'bootstrap': False, 'class_weight': 'balanced'}. Best is trial 10 with value: 0.9926668537044878.
Best hyperparameters for Random Forest:  {'n_estimators': 575, 'max_depth': 28, 'min_samples_split': 3, 'min_samples_leaf': 1, 'max_features': None, 'bootstrap': False, 'class_weight': 'balanced'}
Best F1 Macro Score for Random Forest:  0.9926668537044878





In [29]:
# Train optimized Random Forest model
rf_tuned = RandomForestClassifier(
    random_state=42, 
    n_estimators=study_rf.best_params['n_estimators'],
    max_depth=study_rf.best_params['max_depth'],
    min_samples_split=study_rf.best_params['min_samples_split'],
    min_samples_leaf=study_rf.best_params['min_samples_leaf'],
    max_features=study_rf.best_params['max_features'],
    bootstrap=study_rf.best_params['bootstrap'],
    class_weight=study_rf.best_params['class_weight']
)
rf_tuned.fit(X_train, y_train)
y_pred_rf_tuned = rf_tuned.predict(X_test)

# Evaluate tuned Random Forest model
accuracy_rf_tuned = accuracy_score(y_test, y_pred_rf_tuned)
print(f"Tuned Random Forest Model Accuracy: {accuracy_rf_tuned:.4f}")
print("Classification Report:\n", classification_report(y_test, y_pred_rf_tuned))

Tuned Random Forest Model Accuracy: 1.0000
Classification Report:
               precision    recall  f1-score   support

           0       1.00      1.00      1.00        50
           1       1.00      1.00      1.00       105
           2       1.00      1.00      1.00        41

    accuracy                           1.00       196
   macro avg       1.00      1.00      1.00       196
weighted avg       1.00      1.00      1.00       196



#### After hyperparameter tuning, the XGBoost model reached an accuracy of 0.995 on the test set (195 of 196 samples correct) and the Random Forest model reached 1.0. XGBoost kept a high F1 score across all classes, which indicates that it predicts ARB outcomes effectively, but tuning did not improve on its baseline result. The tuned Random Forest classified every test sample correctly.

#### Next, ensemble models were created using both soft voting and stacking classifiers to combine the strengths of individual models.

### HYBRID MODEL - SOFT VOTING CLASSIFIER

In [18]:
# Create a soft voting classifier as hybrid model
hybrid_model_softvoting = VotingClassifier(
    estimators=[
        ('xgboost', xgboost_tuned),
        ('random_forest', rf_tuned),
        ('logistic_regression', LogisticRegression(max_iter=2000))
    ],
    voting='soft'
)

# Train ensemble
hybrid_model_softvoting.fit(X_train, y_train)

# Evaluate hybrid model
y_pred_hybrid = hybrid_model_softvoting.predict(X_test)
accuracy_hybrid = accuracy_score(y_test, y_pred_hybrid)
print(f"Hybrid Model Soft Voting Accuracy: {accuracy_hybrid:.4f}")
print("Classification Report:\n", classification_report(y_test, y_pred_hybrid))
 
    

Hybrid Model Soft Voting Accuracy: 0.9949
Classification Report:
               precision    recall  f1-score   support

           0       0.98      1.00      0.99        50
           1       1.00      0.99      1.00       105
           2       1.00      1.00      1.00        41

    accuracy                           0.99       196
   macro avg       0.99      1.00      1.00       196
weighted avg       0.99      0.99      0.99       196
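
With `voting='soft'`, the ensemble averages the class probabilities of its members and predicts the class with the highest average, so one very confident model can outweigh two mildly confident ones. The sketch below illustrates this with made-up probabilities for a single sample; the numbers are hypothetical and are not taken from the fitted models.

```python
# Hypothetical class probabilities from three models for one sample (classes 0, 1, 2).
probas = {
    "xgboost": [0.90, 0.10, 0.00],
    "random_forest": [0.40, 0.60, 0.00],
    "logistic_regression": [0.45, 0.55, 0.00],
}

n_classes = 3

# Soft voting: average the predicted probabilities per class (equal weights),
# then pick the class with the largest average.
avg = [sum(p[k] for p in probas.values()) / len(probas) for k in range(n_classes)]
soft_vote = max(range(n_classes), key=lambda k: avg[k])

# Hard voting for comparison: each model votes for its own most likely class.
votes = [max(range(n_classes), key=lambda k: p[k]) for p in probas.values()]
hard_vote = max(set(votes), key=votes.count)

print(avg, soft_vote, hard_vote)
```

In this made-up case the two schemes disagree: two of three models prefer class 1, but the averaged probabilities favor class 0.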



### HYBRID MODEL - STACKING CLASSIFIER

In [21]:
# Create a stacking classifier as hybrid model
hybrid_model_stacking = StackingClassifier(
    estimators=[
        ('xgboost', xgboost_tuned),
        ('random_forest', rf_tuned),
    ],
    final_estimator=LogisticRegression(),
    stack_method='predict_proba',
    passthrough=False,
    n_jobs=-1
)

# Train ensemble
hybrid_model_stacking.fit(X_train, y_train)

# Evaluate hybrid model
y_pred_hybrid_stack = hybrid_model_stacking.predict(X_test)
accuracy_hybrid_stack = accuracy_score(y_test, y_pred_hybrid_stack)
print(f"Hybrid Model Stacking Accuracy: {accuracy_hybrid_stack:.4f}")
print("Classification Report:\n", classification_report(y_test, y_pred_hybrid_stack))


Hybrid Model Stacking Accuracy: 1.0000
Classification Report:
               precision    recall  f1-score   support

           0       1.00      1.00      1.00        50
           1       1.00      1.00      1.00       105
           2       1.00      1.00      1.00        41

    accuracy                           1.00       196
   macro avg       1.00      1.00      1.00       196
weighted avg       1.00      1.00      1.00       196
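
With `stack_method='predict_proba'` and `passthrough=False`, the final Logistic Regression never sees the original 38 features. It is trained only on the base models' class probabilities, which scikit-learn obtains through internal cross-validation (5-fold by default). The sketch below shows the shape of that input for two base models and three classes; the probability values are hypothetical.

```python
# Hypothetical out-of-fold class probabilities for two samples (classes 0, 1, 2).
xgb_proba = [[0.90, 0.08, 0.02], [0.10, 0.85, 0.05]]
rf_proba = [[0.80, 0.15, 0.05], [0.05, 0.90, 0.05]]

# The meta-features are the base models' probabilities concatenated column-wise:
# 2 base models x 3 classes = 6 columns per sample.
meta_features = [x + r for x, r in zip(xgb_proba, rf_proba)]

n_samples = len(meta_features)
n_columns = len(meta_features[0])
print(n_samples, n_columns)
```

Both `StackingClassifier` and `VotingClassifier` fit clones of the estimators they are given, so `xgboost_tuned` and `rf_tuned` are refit inside each ensemble rather than reused as already trained.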



#### The stacking classifier achieved an accuracy of 1.00 on the test set, matching the tuned Random Forest model. The soft voting classifier achieved an accuracy of 0.995, matching the tuned XGBoost model; it did not outperform the Random Forest model or the stacking classifier, both of which achieved perfect accuracy.

In [22]:
# Create comparison table of model accuracies
performance_data = {
    'Model': ['XGBoost Tuned', 'Random Forest Tuned', 'Hybrid Soft Voting', 'Hybrid Stacking'],
    'Accuracy': [accuracy_tuned, accuracy_rf_tuned, accuracy_hybrid, accuracy_hybrid_stack]
}

performance_df = pd.DataFrame(performance_data)
print(performance_df)


                 Model  Accuracy
0        XGBoost Tuned  0.994898
1  Random Forest Tuned  1.000000
2   Hybrid Soft Voting  0.994898
3      Hybrid Stacking  1.000000
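
On a 196-sample test set, 0.994898 corresponds to 195 correct predictions and 1.000000 to 196, so the models in the table differ by a single sample. As a rough check on how much weight that gap can bear, the sketch below computes approximate 95% Wilson score intervals for both accuracies. The interval is an addition for illustration, not part of the original analysis, and it assumes the test samples are independent.

```python
import math

def wilson_interval(correct, n, z=1.96):
    """Approximate 95% Wilson score interval for the proportion correct / n."""
    p = correct / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

n_test = 196  # test-set size from the train/test split above
for name, correct in [("XGBoost / soft voting", 195), ("Random Forest / stacking", 196)]:
    low, high = wilson_interval(correct, n_test)
    print(f"{name}: {correct}/{n_test} correct, interval ({low:.3f}, {high:.3f})")
```

The two intervals overlap, which suggests this test set alone cannot separate the four models with confidence.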


#### Although the hybrid stacking and Random Forest models both achieved perfect accuracy on the 196-sample test set, the stacking classifier combines multiple models and may therefore be the more robust approach. Random Forest on its own, however, produces more stable predictions.

In [27]:
# Save tuned and hybrid models
model_dir = r'C:\Users\anitr\AAI590_Capstone\AAI590_Capstone_AH\Models\tuned_and_hybrid_models'
os.makedirs(model_dir, exist_ok=True)
joblib.dump(xgboost_tuned, os.path.join(model_dir, 'xgboost_tuned.pkl'))
joblib.dump(rf_tuned, os.path.join(model_dir, 'rf_tuned.pkl'))
joblib.dump(hybrid_model_softvoting, os.path.join(model_dir, 'hybrid_softvoting.pkl'))
joblib.dump(hybrid_model_stacking, os.path.join(model_dir, 'hybrid_stacking.pkl'))

print("Tuned and hybrid models saved successfully.")

Tuned and hybrid models saved successfully.
