# LightGBM Model Development

LightGBM is a gradient boosting framework that uses tree-based learning algorithms. It is designed to be distributed and efficient, and its documentation lists the following advantages:

-   Faster training speed and higher efficiency.
-   Lower memory usage.
-   Better accuracy.
-   Support of parallel, distributed, and GPU learning.
-   Capable of handling large-scale data.

In this notebook, we will:

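1.  **Load the Datasets**: Load the processed training and validation datasets and combine them for cross-validation.
2.  **Hyperparameter Tuning & Model Training**: Use Optuna to find the best hyperparameters for a LightGBM model for each target, and then train and save the model.
3.  **List the Best Models with MAPE Scores**: Summarize the final model and the best cross-validated MAPE for each target.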

### 1. Load the Datasets

In [1]:
import pandas as pd
from pathlib import Path

# Directory where the processed data is stored
data_path = Path("../processed_data")

# Load the training and validation datasets
X_train, X_val, y_train, y_val = (
    pd.read_csv(data_path / "X_train.csv"),
    pd.read_csv(data_path / "X_val.csv"),
    pd.read_csv(data_path / "y_train.csv"),
    pd.read_csv(data_path / "y_val.csv")
)

# Combine train and validation sets for robust K-Fold tuning
features = pd.concat([X_train, X_val], ignore_index=True)
targets = pd.concat([y_train, y_val], ignore_index=True)

# Display the shapes of the datasets
print(f"Features shape: {features.shape}")
print(f"Targets shape: {targets.shape}")

Features shape: (2000, 157)
Targets shape: (2000, 10)


### 2. Hyperparameter Tuning & Model Training

**K-Fold Cross-Validation**: A technique for estimating how well a model generalizes. The data is divided into K subsets (folds). The model is trained K times, each time holding out a different fold for validation and training on the remaining K-1 folds. Averaging the K validation scores gives an estimate that depends less on any single train/validation split than a one-off holdout score does. This notebook uses K = 5 with shuffling.
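
As a minimal sketch of the splitting described above, the following uses scikit-learn's `KFold` on a toy array of ten samples. The array `X_toy` and the counters are illustrative names, not part of this notebook's data. It checks that every sample lands in a validation fold exactly once.

```python
import numpy as np
from sklearn.model_selection import KFold

# Toy data: 10 samples, 1 feature (illustrative only)
X_toy = np.arange(10).reshape(-1, 1)

kf = KFold(n_splits=5, shuffle=True, random_state=42)

val_counts = np.zeros(len(X_toy), dtype=int)  # times each sample is used for validation
fold_sizes = []
for fold, (train_idx, val_idx) in enumerate(kf.split(X_toy)):
    # Within a fold, training and validation indices never overlap
    assert set(train_idx).isdisjoint(val_idx)
    val_counts[val_idx] += 1
    fold_sizes.append((len(train_idx), len(val_idx)))
    print(f"Fold {fold}: train={train_idx.tolist()}, val={val_idx.tolist()}")

print("Validation uses per sample:", val_counts.tolist())
```

With 10 samples and 5 folds, each fold trains on 8 samples and validates on 2.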

**Optuna**: An open-source hyperparameter optimization framework that automates the search for good hyperparameters. Its default sampler, the Tree-structured Parzen Estimator (TPE), uses the results of earlier trials to propose new values, balancing exploration of untried regions with exploitation of promising ones. Within a fixed trial budget (100 trials per target in this notebook), it returns the best configuration found, which is not guaranteed to be the global optimum.

**Mean Absolute Percentage Error (MAPE)**: The average of the absolute errors expressed relative to the actual values, that is, the mean of |actual - predicted| / |actual|. MAPE is scale-independent and easy to interpret, which makes it a common choice for evaluating regression models. It becomes very large or undefined when actual values are at or near zero, so it is best suited to targets that stay away from zero. A lower MAPE indicates higher accuracy. Note that scikit-learn's `mean_absolute_percentage_error` returns a fraction rather than a percentage, so a value of 0.66 means 66%.
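
The definition can be checked by hand with a few numbers. The helper `mape` below is a hypothetical NumPy re-implementation written for illustration. scikit-learn's version additionally guards the denominator with a small epsilon, which this sketch omits.

```python
import numpy as np

def mape(y_true, y_pred) -> float:
    """Mean of |y_true - y_pred| / |y_true|, returned as a fraction."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred) / np.abs(y_true)))

# Relative errors of 10%, 10%, and 0% average to about 6.7%
well_scaled = mape([100.0, 200.0, 50.0], [110.0, 180.0, 50.0])

# An absolute error of 0.49 on a target of 0.01 is a 4900% relative error
# and dominates the average
near_zero = mape([0.01, 200.0, 50.0], [0.5, 180.0, 50.0])

print(f"well-scaled targets: {well_scaled:.4f}")
print(f"with a near-zero target: {near_zero:.2f}")
```

A single near-zero target can push the score far above 1, so a MAPE well above 1.0 does not necessarily mean that most predictions are poor.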

In [2]:
import optuna
import warnings
import numpy as np
import lightgbm as lgb
from sklearn.model_selection import KFold
from sklearn.metrics import mean_absolute_percentage_error

# Suppress warnings for cleaner output
warnings.filterwarnings("ignore")

# Define the objective function for Optuna
def objective(trial: optuna.Trial, X: pd.DataFrame, y: pd.Series):
    """
    Objective function for Optuna to minimize.
    This function trains a LightGBM model with a set of hyperparameters
    suggested by Optuna and returns the cross-validated MAPE.

    Parameters:
      trial (optuna.Trial): An Optuna trial object that suggests hyperparameters.
      X (pd.DataFrame): Feature matrix for training.
      y (pd.Series): Target variable for training.

    Returns:
      float: The cross-validated MAPE, i.e. the mean of the per-fold MAPE scores on the held-out folds.
    """
    # Define the hyperparameter search space for LightGBM
    param = {
        'objective': 'regression_l1',                                         # Specify the objective function as L1 regression (Mean Absolute Error)
        'metric': 'mape',                                                     # Use Mean Absolute Percentage Error as the evaluation metric
        'n_estimators': trial.suggest_int('n_estimators', 100, 1000),         # Number of boosting rounds (trees) - suggested by Optuna
        'learning_rate': trial.suggest_float('learning_rate', 0.01, 0.3),      # Step size shrinkage to prevent overfitting - suggested by Optuna
        'num_leaves': trial.suggest_int('num_leaves', 20, 300),               # Number of leaves in one tree - suggested by Optuna
        'max_depth': trial.suggest_int('max_depth', 3, 12),                   # Limit the depth of tree to prevent overfitting - suggested by Optuna
        'min_child_samples': trial.suggest_int('min_child_samples', 5, 100),  # Minimum number of data needed in a leaf - suggested by Optuna
        'subsample': trial.suggest_float('subsample', 0.6, 1.0),               # Fraction of rows sampled for bagging; LightGBM applies it only when subsample_freq > 0 (not set here, so it has no effect)
        'colsample_bytree': trial.suggest_float('colsample_bytree', 0.6, 1.0), # Fraction of features to be used for each tree - suggested by Optuna
        'random_state': 42,                                                   # Random seed for reproducibility
        'n_jobs': -1,                                                         # Use all available cores for training
        'verbose':-1                                                          # Suppress verbose output from LightGBM
    }

    # --- Perform K-Fold cross-validation ---
    # This splits the data into K folds, trains the model on K-1 folds, and validates it on the remaining fold, repeating this process K times.
    # The mean absolute percentage error (MAPE) is calculated for each fold, and the average MAPE across all folds is returned as the objective value.
    # This helps in assessing the model's performance more robustly by reducing variance.
    kf = KFold(n_splits=5, shuffle=True, random_state=42)
    mape_scores = []
    for train_index, val_index in kf.split(X):
        X_train, X_val = X.iloc[train_index], X.iloc[val_index]
        y_train, y_val = y.iloc[train_index], y.iloc[val_index]

        model = lgb.LGBMRegressor(**param)
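        # Early stopping monitors the same fold that is scored below, so the reported MAPE can be optimistic
        model.fit(X_train, y_train, eval_set=[(X_val, y_val)], callbacks=[lgb.early_stopping(10, verbose=False)])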
        preds = model.predict(X_val)
        mape_scores.append(mean_absolute_percentage_error(y_val, preds))

    return np.mean(mape_scores)

In [None]:
import joblib

# Define the directory for saving models and Optuna studies
model_dir = Path("../models/lightgbm")
model_dir.mkdir(parents=True, exist_ok=True)

optuna_dir = Path("../optuna_db")
optuna_dir.mkdir(parents=True, exist_ok=True)
storage_name = f"sqlite:///{optuna_dir}/lightgbm_studies.db"

# Dictionary to store the best models
best_models = {}

# Iterate over each target property to tune and train a model
for target in targets.columns:
    print(f"\n--- TUNING AND TRAINING FOR {target} ---\n")
    y = targets[target]


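    # Create an Optuna study to find the best hyperparameters.
    # Re-running with the same study_name and storage raises DuplicatedStudyError unless load_if_exists=True is passed.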
    study = optuna.create_study(direction='minimize',
                                study_name='lightgbm-tuning-' + target,
                                storage=storage_name)
    study.optimize(lambda trial: objective(trial, features, y), n_trials=100)

    # Get the best hyperparameters
    best_params = study.best_params
    print(f"\nBEST MAPE FOR {target}: {study.best_value}")
    print(f"BEST HYPERPARAMETERS FOR {target}: {best_params}")

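    # Train the final model with the best hyperparameters on the combined training and validation data.
    # Note: study.best_params holds only the tuned values, so the fixed search settings
    # (objective='regression_l1', metric='mape', verbose=-1) are not carried over and this model uses LightGBM's default objective.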
    final_model = lgb.LGBMRegressor(**best_params, random_state=42, n_jobs=-1)
    final_model.fit(features, y)

    # Save the trained model to a file
    joblib.dump(final_model, f'{model_dir}/{target}_model.joblib')
    print(f"Saved best model for {target}")

    # Store the final model together with the best cross-validated MAPE found during tuning
    best_models[target] = (final_model, study.best_value)


--- TUNING AND TRAINING FOR BlendProperty1 ---



[I 2025-07-18 08:03:20,913] A new study created in RDB with name: lightgbm-tuning-BlendProperty1
[I 2025-07-18 08:03:32,910] Trial 0 finished with value: 0.6775153832726067 and parameters: {'n_estimators': 623, 'learning_rate': 0.02550234740429879, 'num_leaves': 153, 'max_depth': 8, 'min_child_samples': 62, 'subsample': 0.987840076924557, 'colsample_bytree': 0.9037467807059227}. Best is trial 0 with value: 0.6775153832726067.
[I 2025-07-18 08:03:40,202] Trial 1 finished with value: 3.081374529376782 and parameters: {'n_estimators': 453, 'learning_rate': 0.10631937672293254, 'num_leaves': 62, 'max_depth': 11, 'min_child_samples': 38, 'subsample': 0.97000069837888, 'colsample_bytree': 0.7394274698874774}. Best is trial 0 with value: 0.6775153832726067.
[I 2025-07-18 08:03:47,033] Trial 2 finished with value: 0.6824005309100417 and parameters: {'n_estimators': 698, 'learning_rate': 0.024529746336448423, 'num_leaves': 110, 'max_depth': 6, 'min_child_samples': 94, 'subsample': 0.61913292354


BEST MAPE FOR BlendProperty1: 0.664480119895121
BEST HYPERPARAMETERS FOR BlendProperty1: {'n_estimators': 777, 'learning_rate': 0.1476886594008643, 'num_leaves': 92, 'max_depth': 8, 'min_child_samples': 88, 'subsample': 0.8156123302508664, 'colsample_bytree': 0.9540732789631515}


[I 2025-07-18 08:15:41,499] A new study created in RDB with name: lightgbm-tuning-BlendProperty2


Saved best model for BlendProperty1

--- TUNING AND TRAINING FOR BlendProperty2 ---



[I 2025-07-18 08:15:46,943] Trial 0 finished with value: 0.7790544240928472 and parameters: {'n_estimators': 496, 'learning_rate': 0.021174372223544416, 'num_leaves': 201, 'max_depth': 9, 'min_child_samples': 99, 'subsample': 0.8633953518903468, 'colsample_bytree': 0.8854164751502047}. Best is trial 0 with value: 0.7790544240928472.
[I 2025-07-18 08:15:50,260] Trial 1 finished with value: 0.7002564843102212 and parameters: {'n_estimators': 470, 'learning_rate': 0.09398685441847866, 'num_leaves': 99, 'max_depth': 5, 'min_child_samples': 77, 'subsample': 0.7160273776846018, 'colsample_bytree': 0.8184718459357181}. Best is trial 1 with value: 0.7002564843102212.
[I 2025-07-18 08:15:51,918] Trial 2 finished with value: 0.9116071774318557 and parameters: {'n_estimators': 281, 'learning_rate': 0.1590666010295683, 'num_leaves': 183, 'max_depth': 3, 'min_child_samples': 45, 'subsample': 0.9018057216117575, 'colsample_bytree': 0.9235831914679075}. Best is trial 1 with value: 0.7002564843102212.


BEST MAPE FOR BlendProperty2: 0.5900549855027382
BEST HYPERPARAMETERS FOR BlendProperty2: {'n_estimators': 422, 'learning_rate': 0.06403131398252673, 'num_leaves': 251, 'max_depth': 7, 'min_child_samples': 40, 'subsample': 0.6029801288457345, 'colsample_bytree': 0.7980582063868619}


[I 2025-07-18 08:27:57,348] A new study created in RDB with name: lightgbm-tuning-BlendProperty3


Saved best model for BlendProperty2

--- TUNING AND TRAINING FOR BlendProperty3 ---



[I 2025-07-18 08:28:03,167] Trial 0 finished with value: 1.0770766515221362 and parameters: {'n_estimators': 624, 'learning_rate': 0.07037186030088431, 'num_leaves': 164, 'max_depth': 5, 'min_child_samples': 35, 'subsample': 0.7451111922626533, 'colsample_bytree': 0.9985754288430407}. Best is trial 0 with value: 1.0770766515221362.
[I 2025-07-18 08:28:14,548] Trial 1 finished with value: 1.1486634002124112 and parameters: {'n_estimators': 231, 'learning_rate': 0.019003281192781725, 'num_leaves': 224, 'max_depth': 10, 'min_child_samples': 23, 'subsample': 0.6477014138487872, 'colsample_bytree': 0.9846192468757828}. Best is trial 0 with value: 1.0770766515221362.
[I 2025-07-18 08:28:18,202] Trial 2 finished with value: 1.3811462690679603 and parameters: {'n_estimators': 240, 'learning_rate': 0.1858205489984776, 'num_leaves': 205, 'max_depth': 6, 'min_child_samples': 27, 'subsample': 0.9256807424155481, 'colsample_bytree': 0.6107201500033383}. Best is trial 0 with value: 1.077076651522136


BEST MAPE FOR BlendProperty3: 0.9522893576718117
BEST HYPERPARAMETERS FOR BlendProperty3: {'n_estimators': 593, 'learning_rate': 0.05780234131412661, 'num_leaves': 248, 'max_depth': 8, 'min_child_samples': 47, 'subsample': 0.853102339382838, 'colsample_bytree': 0.7779508951042512}


[I 2025-07-18 08:39:58,946] A new study created in RDB with name: lightgbm-tuning-BlendProperty4


Saved best model for BlendProperty3

--- TUNING AND TRAINING FOR BlendProperty4 ---



[I 2025-07-18 08:40:01,264] Trial 0 finished with value: 0.8754344023758043 and parameters: {'n_estimators': 804, 'learning_rate': 0.1266368711220266, 'num_leaves': 236, 'max_depth': 7, 'min_child_samples': 96, 'subsample': 0.9045910422049876, 'colsample_bytree': 0.7409411626064326}. Best is trial 0 with value: 0.8754344023758043.
[I 2025-07-18 08:40:09,159] Trial 1 finished with value: 0.674603513138819 and parameters: {'n_estimators': 449, 'learning_rate': 0.05108256837843003, 'num_leaves': 107, 'max_depth': 11, 'min_child_samples': 71, 'subsample': 0.7105919440574111, 'colsample_bytree': 0.7063198688428093}. Best is trial 1 with value: 0.674603513138819.
[I 2025-07-18 08:40:11,268] Trial 2 finished with value: 1.4038478315904044 and parameters: {'n_estimators': 554, 'learning_rate': 0.2974027176544398, 'num_leaves': 201, 'max_depth': 12, 'min_child_samples': 75, 'subsample': 0.9624204514558262, 'colsample_bytree': 0.6319768320792217}. Best is trial 1 with value: 0.674603513138819.


BEST MAPE FOR BlendProperty4: 0.5983217415884959
BEST HYPERPARAMETERS FOR BlendProperty4: {'n_estimators': 811, 'learning_rate': 0.015333192596856584, 'num_leaves': 105, 'max_depth': 8, 'min_child_samples': 83, 'subsample': 0.7824945556992893, 'colsample_bytree': 0.7279981192117051}


[I 2025-07-18 08:51:46,376] A new study created in RDB with name: lightgbm-tuning-BlendProperty5


Saved best model for BlendProperty4

--- TUNING AND TRAINING FOR BlendProperty5 ---



[I 2025-07-18 08:51:50,288] Trial 0 finished with value: 0.34478032900206806 and parameters: {'n_estimators': 154, 'learning_rate': 0.029709452125239233, 'num_leaves': 31, 'max_depth': 8, 'min_child_samples': 13, 'subsample': 0.7625320439853341, 'colsample_bytree': 0.7112310054447588}. Best is trial 0 with value: 0.34478032900206806.
[I 2025-07-18 08:51:52,949] Trial 1 finished with value: 0.27851108964119664 and parameters: {'n_estimators': 148, 'learning_rate': 0.14886200686277382, 'num_leaves': 251, 'max_depth': 10, 'min_child_samples': 30, 'subsample': 0.8840246041536775, 'colsample_bytree': 0.9570897609844352}. Best is trial 1 with value: 0.27851108964119664.
[I 2025-07-18 08:51:54,188] Trial 2 finished with value: 0.4157182935208734 and parameters: {'n_estimators': 359, 'learning_rate': 0.12351916242296068, 'num_leaves': 285, 'max_depth': 9, 'min_child_samples': 82, 'subsample': 0.8075204580512774, 'colsample_bytree': 0.8339248076341543}. Best is trial 1 with value: 0.27851108964


BEST MAPE FOR BlendProperty5: 0.1365157726963243
BEST HYPERPARAMETERS FOR BlendProperty5: {'n_estimators': 856, 'learning_rate': 0.020176686978292724, 'num_leaves': 114, 'max_depth': 6, 'min_child_samples': 9, 'subsample': 0.6452235372463412, 'colsample_bytree': 0.98434695824643}


[I 2025-07-18 09:03:41,608] A new study created in RDB with name: lightgbm-tuning-BlendProperty6


Saved best model for BlendProperty5

--- TUNING AND TRAINING FOR BlendProperty6 ---



[I 2025-07-18 09:03:44,654] Trial 0 finished with value: 0.9671588509039749 and parameters: {'n_estimators': 730, 'learning_rate': 0.16518951129422277, 'num_leaves': 135, 'max_depth': 5, 'min_child_samples': 9, 'subsample': 0.9418413836085842, 'colsample_bytree': 0.6247779209126676}. Best is trial 0 with value: 0.9671588509039749.
[I 2025-07-18 09:03:47,749] Trial 1 finished with value: 0.9556779792552661 and parameters: {'n_estimators': 665, 'learning_rate': 0.18543780300886856, 'num_leaves': 238, 'max_depth': 10, 'min_child_samples': 37, 'subsample': 0.8006006209851375, 'colsample_bytree': 0.791949261670256}. Best is trial 1 with value: 0.9556779792552661.
[I 2025-07-18 09:03:49,094] Trial 2 finished with value: 0.9097752269102939 and parameters: {'n_estimators': 185, 'learning_rate': 0.21047595533095254, 'num_leaves': 211, 'max_depth': 5, 'min_child_samples': 97, 'subsample': 0.6235626147461978, 'colsample_bytree': 0.8206506402675499}. Best is trial 2 with value: 0.9097752269102939.


BEST MAPE FOR BlendProperty6: 0.49472325129549743
BEST HYPERPARAMETERS FOR BlendProperty6: {'n_estimators': 536, 'learning_rate': 0.02751790371293122, 'num_leaves': 166, 'max_depth': 12, 'min_child_samples': 28, 'subsample': 0.8656252803882962, 'colsample_bytree': 0.6323888114760962}


[I 2025-07-18 09:21:53,750] A new study created in RDB with name: lightgbm-tuning-BlendProperty7


Saved best model for BlendProperty6

--- TUNING AND TRAINING FOR BlendProperty7 ---



[I 2025-07-18 09:22:02,734] Trial 0 finished with value: 0.9965320789183648 and parameters: {'n_estimators': 981, 'learning_rate': 0.04718979140857309, 'num_leaves': 131, 'max_depth': 8, 'min_child_samples': 63, 'subsample': 0.9647826873893971, 'colsample_bytree': 0.7427601283289986}. Best is trial 0 with value: 0.9965320789183648.
[I 2025-07-18 09:22:05,175] Trial 1 finished with value: 1.4984129798534673 and parameters: {'n_estimators': 257, 'learning_rate': 0.23338088026814932, 'num_leaves': 245, 'max_depth': 4, 'min_child_samples': 29, 'subsample': 0.8409132734669182, 'colsample_bytree': 0.8639156887199686}. Best is trial 0 with value: 0.9965320789183648.
[I 2025-07-18 09:22:14,730] Trial 2 finished with value: 1.3505988869965586 and parameters: {'n_estimators': 445, 'learning_rate': 0.10134733353166368, 'num_leaves': 149, 'max_depth': 10, 'min_child_samples': 17, 'subsample': 0.7379937612814085, 'colsample_bytree': 0.6945278050327175}. Best is trial 0 with value: 0.996532078918364


BEST MAPE FOR BlendProperty7: 0.900055207376339
BEST HYPERPARAMETERS FOR BlendProperty7: {'n_estimators': 370, 'learning_rate': 0.09127338726168102, 'num_leaves': 294, 'max_depth': 10, 'min_child_samples': 31, 'subsample': 0.8343089460743401, 'colsample_bytree': 0.7160976574352571}


[I 2025-07-18 09:33:12,540] A new study created in RDB with name: lightgbm-tuning-BlendProperty8


Saved best model for BlendProperty7

--- TUNING AND TRAINING FOR BlendProperty8 ---



[I 2025-07-18 09:33:45,798] Trial 0 finished with value: 1.0083324004593828 and parameters: {'n_estimators': 628, 'learning_rate': 0.02615310243838706, 'num_leaves': 277, 'max_depth': 9, 'min_child_samples': 13, 'subsample': 0.854546175673847, 'colsample_bytree': 0.7565012870182118}. Best is trial 0 with value: 1.0083324004593828.
[I 2025-07-18 09:33:50,847] Trial 1 finished with value: 1.096020847382366 and parameters: {'n_estimators': 804, 'learning_rate': 0.07560700693254174, 'num_leaves': 207, 'max_depth': 3, 'min_child_samples': 59, 'subsample': 0.8502686947485638, 'colsample_bytree': 0.9820335305518176}. Best is trial 0 with value: 1.0083324004593828.
[I 2025-07-18 09:33:53,124] Trial 2 finished with value: 1.1369061784815178 and parameters: {'n_estimators': 557, 'learning_rate': 0.20946213762336352, 'num_leaves': 220, 'max_depth': 4, 'min_child_samples': 20, 'subsample': 0.971601195745204, 'colsample_bytree': 0.8677446434209442}. Best is trial 0 with value: 1.0083324004593828.


BEST MAPE FOR BlendProperty8: 0.7446173258348122
BEST HYPERPARAMETERS FOR BlendProperty8: {'n_estimators': 289, 'learning_rate': 0.026614042856997554, 'num_leaves': 295, 'max_depth': 9, 'min_child_samples': 22, 'subsample': 0.9584896842282827, 'colsample_bytree': 0.8026588642390511}


[I 2025-07-18 09:48:58,489] A new study created in RDB with name: lightgbm-tuning-BlendProperty9


Saved best model for BlendProperty8

--- TUNING AND TRAINING FOR BlendProperty9 ---



[I 2025-07-18 09:49:15,454] Trial 0 finished with value: 1.589081776710255 and parameters: {'n_estimators': 503, 'learning_rate': 0.06628005025831195, 'num_leaves': 198, 'max_depth': 10, 'min_child_samples': 17, 'subsample': 0.8190118325768657, 'colsample_bytree': 0.6302664605314011}. Best is trial 0 with value: 1.589081776710255.
[I 2025-07-18 09:49:18,425] Trial 1 finished with value: 1.4969204768929267 and parameters: {'n_estimators': 570, 'learning_rate': 0.18970694409380465, 'num_leaves': 30, 'max_depth': 9, 'min_child_samples': 67, 'subsample': 0.7472047652587326, 'colsample_bytree': 0.8335367832511407}. Best is trial 1 with value: 1.4969204768929267.
[I 2025-07-18 09:49:23,330] Trial 2 finished with value: 1.3803250465631849 and parameters: {'n_estimators': 539, 'learning_rate': 0.039659420003095296, 'num_leaves': 268, 'max_depth': 9, 'min_child_samples': 96, 'subsample': 0.952070973793067, 'colsample_bytree': 0.8285018858697657}. Best is trial 2 with value: 1.3803250465631849.



BEST MAPE FOR BlendProperty9: 0.9503320893022901
BEST HYPERPARAMETERS FOR BlendProperty9: {'n_estimators': 657, 'learning_rate': 0.0712565290381967, 'num_leaves': 227, 'max_depth': 9, 'min_child_samples': 48, 'subsample': 0.9890307391233176, 'colsample_bytree': 0.7169081452240404}


[I 2025-07-18 10:00:07,990] A new study created in RDB with name: lightgbm-tuning-BlendProperty10


Saved best model for BlendProperty9

--- TUNING AND TRAINING FOR BlendProperty10 ---



[I 2025-07-18 10:00:10,791] Trial 0 finished with value: 1.0028614756502903 and parameters: {'n_estimators': 462, 'learning_rate': 0.299164155951027, 'num_leaves': 227, 'max_depth': 5, 'min_child_samples': 16, 'subsample': 0.6920642156334593, 'colsample_bytree': 0.9544821360376718}. Best is trial 0 with value: 1.0028614756502903.
[I 2025-07-18 10:00:12,696] Trial 1 finished with value: 0.8159412874727549 and parameters: {'n_estimators': 562, 'learning_rate': 0.17864584037296521, 'num_leaves': 109, 'max_depth': 4, 'min_child_samples': 68, 'subsample': 0.7498598658113048, 'colsample_bytree': 0.9394813912442636}. Best is trial 1 with value: 0.8159412874727549.
[I 2025-07-18 10:00:15,038] Trial 2 finished with value: 0.733703513549365 and parameters: {'n_estimators': 462, 'learning_rate': 0.16572042786063326, 'num_leaves': 255, 'max_depth': 7, 'min_child_samples': 82, 'subsample': 0.8519773436214688, 'colsample_bytree': 0.8971797503736529}. Best is trial 2 with value: 0.733703513549365.


BEST MAPE FOR BlendProperty10: 0.46134439195188753
BEST HYPERPARAMETERS FOR BlendProperty10: {'n_estimators': 806, 'learning_rate': 0.06879826060990206, 'num_leaves': 263, 'max_depth': 10, 'min_child_samples': 60, 'subsample': 0.711751289956135, 'colsample_bytree': 0.6671329498719161}
Saved best model for BlendProperty10
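
The saved `.joblib` files can be loaded back later for inference. The sketch below shows the round trip in a temporary directory with a stand-in scikit-learn estimator. The demo data, `demo_model`, and the temporary path are assumptions made so the example runs on its own. Only the `<target>_model.joblib` naming pattern comes from this notebook.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.linear_model import LinearRegression

# Stand-in estimator and data (hypothetical; the notebook saves LGBMRegressor objects)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 3))
y_demo = X_demo @ np.array([1.0, -2.0, 0.5])
demo_model = LinearRegression().fit(X_demo, y_demo)

with tempfile.TemporaryDirectory() as tmp:
    # Same naming pattern as the notebook: <target>_model.joblib
    model_path = Path(tmp) / "BlendProperty1_model.joblib"
    joblib.dump(demo_model, model_path)
    reloaded = joblib.load(model_path)

# The reloaded estimator reproduces the original predictions
same_predictions = bool(
    np.allclose(demo_model.predict(X_demo), reloaded.predict(X_demo))
)
print(same_predictions)
```

The tuning history can be reopened in a similar way with `optuna.load_study(study_name=..., storage=...)`, using the study name and SQLite URL defined earlier in this notebook.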


### 3. List the Best Models with MAPE Scores

In [None]:
print("--- BEST MODELS AND MAPE SCORES ---\n")
for target, (model, mape) in best_models.items():
    print(f"{target}: MAPE = {mape:.4f}, Model = {model}")
print("--- ALL MODELS SAVED TO DISK ---\n")
print(f"Models are saved in: {model_dir}")
print(f"Optuna studies are saved in: {optuna_dir}")


--- BEST MODELS AND MAPE SCORES ---
BlendProperty1: MAPE = 0.6645, Model = LGBMRegressor(colsample_bytree=0.9540732789631515,
              learning_rate=0.1476886594008643, max_depth=8,
              min_child_samples=88, n_estimators=777, n_jobs=-1, num_leaves=92,
              random_state=42, subsample=0.8156123302508664)
BlendProperty2: MAPE = 0.5901, Model = LGBMRegressor(colsample_bytree=0.7980582063868619,
              learning_rate=0.06403131398252673, max_depth=7,
              min_child_samples=40, n_estimators=422, n_jobs=-1, num_leaves=251,
              random_state=42, subsample=0.6029801288457345)
BlendProperty3: MAPE = 0.9523, Model = LGBMRegressor(colsample_bytree=0.7779508951042512,
              learning_rate=0.05780234131412661, max_depth=8,
              min_child_samples=47, n_estimators=593, n_jobs=-1, num_leaves=248,
              random_state=42, subsample=0.853102339382838)
BlendProperty4: MAPE = 0.5983, Model = LGBMRegressor(colsample_bytree=0.727998119211