# Step 5a: Tuning
This notebook tunes four LightGBM forecasting models to predict next-week service request counts: **mean**, **median**, **10th percentile**, and **90th percentile**.  
Tuning is performed by the `tune(...)` function exposed by the `forecast` module in `src`.

---

Running this notebook is **not recommended** unless absolutely necessary: the full tuning process took **~10 minutes** on an **AWS EC2 `c5d.18xlarge`** instance.  
This machine has the following specs:
- **72 vCPUs** (36 physical cores)  
- **144 GiB memory**  
- **High-performance NVMe SSD (local storage)**  
- **Optimized for compute-intensive workloads**

---

## Why Optuna

Optuna is used for **efficient hyperparameter tuning**.  
Its Tree-structured Parzen Estimator (TPE) sampler uses the results of completed trials to concentrate the search on promising hyperparameter regions instead of evaluating an exhaustive grid.  
With a limited budget (30 trials per model here), this is typically both **faster** and **more effective** than grid search.

By wrapping the entire scikit-learn pipeline in an objective function, Optuna can tune the LightGBM parameters to minimize the loss that matches each model type: Poisson deviance for the mean model and pinball loss for the quantile models.
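
For reference, both metrics can be written in a few lines. The sketch below is plain Python and is not the project's implementation: the function names and sample values are illustrative, and the Poisson deviance follows the usual definition, with the `y * log(y / mu)` term taken as zero when `y` is zero.

```python
import math


def mean_pinball_loss(y_true, y_pred, alpha):
    """Average pinball (quantile) loss for a quantile level alpha in (0, 1)."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction is weighted by alpha, over-prediction by (1 - alpha).
        total += alpha * diff if diff >= 0 else (alpha - 1) * diff
    return total / len(y_true)


def mean_poisson_deviance(y_true, y_pred):
    """Average Poisson deviance; predictions must be strictly positive."""
    total = 0.0
    for y, mu in zip(y_true, y_pred):
        # The y * log(y / mu) term is defined as 0 when y == 0.
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += 2.0 * (log_term - y + mu)
    return total / len(y_true)


# Made-up weekly counts and predictions, for illustration only.
counts = [0, 3, 5, 12]
predicted = [1.0, 3.0, 4.0, 10.0]

loss_10 = mean_pinball_loss(counts, predicted, alpha=0.1)
loss_90 = mean_pinball_loss(counts, predicted, alpha=0.9)
deviance = mean_poisson_deviance(counts, predicted)
```

Because these predictions mostly fall below the observed counts, the loss at `alpha=0.9` is larger than at `alpha=0.1`. That asymmetry is what pushes a quantile model toward the requested percentile.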

---

## Why These Initial Settings

### Training and Test Split
The time-based split at `test_cutoff="2024-01-01"` ensures that **no information from the future leaks into the training data**.  
This mirrors how the model is trained in production, where predictions are generated using only information available up to the forecast date.
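
The idea can be sketched in plain Python as follows. This is a stand-in for whatever filtering `tune(...)` applies to the DataFrame, not a copy of it: `time_based_split` is a hypothetical helper, the values are made up, and sending rows dated exactly on the cutoff to the test set is an assumption.

```python
from datetime import date, timedelta

TEST_CUTOFF = date(2024, 1, 1)


def time_based_split(rows, cutoff):
    """Split (week_start, value) rows into train (< cutoff) and test (>= cutoff)."""
    train = [row for row in rows if row[0] < cutoff]
    test = [row for row in rows if row[0] >= cutoff]
    return train, test


# Six weekly observations straddling the cutoff (values are made up).
first_week = date(2023, 12, 12)
rows = [(first_week + timedelta(weeks=i), 10 + i) for i in range(6)]

train, test = time_based_split(rows, TEST_CUTOFF)
last_train_week = max(week for week, _ in train)
first_test_week = min(week for week, _ in test)
print(last_train_week, first_test_week)
```

With weekly data, every training row precedes every test row, which matches the train and test date ranges printed in the tuning output later in this notebook.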

### Modeling Choices
- **Mean Model**  
  Uses the `poisson` objective, which is appropriate for nonnegative count data.
- **Quantile Models**  
  Use the `quantile` objective with `alpha ∈ {0.1, 0.5, 0.9}` to estimate the 10th percentile, median, and 90th percentile of the count distribution.
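
The `tune(...)` calls in this notebook pass `alpha` only for the quantile models. A minimal sketch of how that argument could map to LightGBM objective settings is shown below. `objective_params` is a hypothetical helper, not part of `src.forecast`; `objective` and `alpha` are LightGBM parameter names.

```python
def objective_params(alpha=None):
    """Return LightGBM objective settings: Poisson for the mean, quantile otherwise."""
    if alpha is None:
        return {"objective": "poisson"}
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return {"objective": "quantile", "alpha": alpha}


# One entry per model, keyed like the final parameter dictionary in this notebook.
settings = {
    name: objective_params(alpha)
    for name, alpha in [("mean", None), ("10", 0.1), ("50", 0.5), ("90", 0.9)]
}
```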

### Hyperparameter Search Space
| Parameter | Range | Purpose |
|------------|--------|----------|
| `learning_rate` | [0.01, 0.2] | Controls learning speed and generalization |
| `num_leaves` | [50, 80] | Balances model complexity |
| `max_depth` | [6, 8] | Limits tree depth to avoid overfitting |
| `min_child_samples` | [20, 50] | Regularizes splits with low sample support |
| `subsample`, `colsample_bytree` | [0.6, 0.9] | Adds randomness for robustness |
| `n_estimators` | 500 | Provides sufficient boosting rounds for learning stability |
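| `lambda_l1` | [0, 2] | L1 penalty on leaf weights; shrinks small leaf values toward zero |
| `lambda_l2` | [0, 3] | L2 penalty on leaf weights; discourages large leaf values and stabilizes the model |

The trial logs later in this notebook also sample `subsample_freq`, `min_split_gain`, and `n_estimators` (361 and 426 in the first two trials) and show `num_leaves` values above 80 (109 and 102), so the search space defined in `forecast.tune` takes precedence wherever it differs from this table.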
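
As an illustration, the ranges listed in the table could be expressed as a sampling function written against Optuna's trial interface (`suggest_float` and `suggest_int`). Everything here is a sketch under assumptions: `suggest_lgbm_params` is a hypothetical name, the ranges are copied from the table rather than from `forecast.tune`, and `UniformTrial` is a small stand-in that draws uniformly so the example runs without Optuna installed. It does not implement TPE.

```python
import random


def suggest_lgbm_params(trial):
    """Sample one LightGBM configuration from the ranges listed in the table."""
    return {
        "learning_rate": trial.suggest_float("learning_rate", 0.01, 0.2),
        "num_leaves": trial.suggest_int("num_leaves", 50, 80),
        "max_depth": trial.suggest_int("max_depth", 6, 8),
        "min_child_samples": trial.suggest_int("min_child_samples", 20, 50),
        "subsample": trial.suggest_float("subsample", 0.6, 0.9),
        "colsample_bytree": trial.suggest_float("colsample_bytree", 0.6, 0.9),
        "lambda_l1": trial.suggest_float("lambda_l1", 0.0, 2.0),
        "lambda_l2": trial.suggest_float("lambda_l2", 0.0, 3.0),
        "n_estimators": 500,  # fixed value, as listed in the table
    }


class UniformTrial:
    """Stand-in for an Optuna trial: uniform draws that ignore past results."""

    def __init__(self, seed):
        self._rng = random.Random(seed)

    def suggest_float(self, name, low, high):
        return self._rng.uniform(low, high)

    def suggest_int(self, name, low, high):
        return self._rng.randint(low, high)  # both bounds inclusive


# The same seed gives the same configuration.
params_a = suggest_lgbm_params(UniformTrial(seed=42))
params_b = suggest_lgbm_params(UniformTrial(seed=42))
```

With Optuna installed, the same function could be called inside an objective passed to `study.optimize(...)`. Optuna then supplies the `trial` argument, and a seeded `optuna.samplers.TPESampler(seed=42)` would take the place of the seeded stand-in.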

### Reproducibility
A fixed `random_state=42` keeps results consistent across runs; the trial logs below show the same first-trial parameters in all four studies.


## Load Packages

In [2]:
import json
import os
import sys
from importlib import reload

import pandas as pd
import seaborn as sns

# Add the parent directory to the import path so that `src` can be imported from here.
PACKAGE_PATH = os.path.abspath(os.path.join(os.getcwd(), '..'))
sys.path.insert(0, PACKAGE_PATH)

from src import config
from src import forecast

pd.set_option('display.max_columns', 50)
sns.set_style('whitegrid')

## Load Data

In [3]:
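# Build the path with os.path.join so a trailing separator in the config value is harmless.
forecast_panel = pd.read_parquet(
    os.path.join(config.PRESENTATION_DATA_PATH, 'model_fitting_data.parquet')
)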

## Inputs

In [4]:
numerical_columns = config.NUMERICAL_COLUMNS
categorical_columns = config.CATEGORICAL_COLUMNS
horizon = 1  # forecast horizon (next-week counts)
df_input = forecast_panel.copy()  # work on a copy so the loaded panel stays unchanged
input_columns = df_input.columns

## Hyperparameter Tuning

In [5]:
dict_params_mean = forecast.tune(
    df_input,
    horizon,
    input_columns,
    numerical_columns,
    categorical_columns,
)

Tuning forecast model for poisson (mean) with 30 trials...
X shape pre-filtering: (547535, 28)
X shape post-filtering: (547535, 28)


[I 2025-10-06 18:41:19,614] A new study created in memory with name: no-name-11d5951b-ad23-4800-af8e-bae8d12e9142


Train dates [2009-12-29 00:00:00 to 2023-12-26 00:00:00], Test dates [2024-01-02 00:00:00 to 2025-07-29 00:00:00]
X training shape: (487161, 28)
X test shape: (60374, 28)


  0%|          | 0/30 [00:00<?, ?it/s]

[I 2025-10-06 18:41:25,578] Trial 0 finished with value: 1.2471203760104566 and parameters: {'learning_rate': 0.03912720785647401, 'num_leaves': 109, 'max_depth': 8, 'min_child_samples': 46, 'subsample': 0.7390046601106091, 'subsample_freq': 2, 'colsample_bytree': 0.7145209030420498, 'lambda_l1': 1.7323522915498704, 'lambda_l2': 1.8033450352296265, 'min_split_gain': 0.35403628889802274, 'n_estimators': 361}. Best is trial 0 with value: 1.2471203760104566.
[I 2025-10-06 18:41:29,864] Trial 1 finished with value: 1.2497997221780406 and parameters: {'learning_rate': 0.11370159575730848, 'num_leaves': 102, 'max_depth': 6, 'min_child_samples': 30, 'subsample': 0.7458511274633584, 'subsample_freq': 3, 'colsample_bytree': 0.8311891079080594, 'lambda_l1': 0.8638900372842315, 'lambda_l2': 0.8736874205941257, 'min_split_gain': 0.30592644736118974, 'n_estimators': 426}. Best is trial 0 with value: 1.2471203760104566.
[I 2025-10-06 18:41:34,900] Trial 2 finished with value: 1.2467945276716148 and 

In [6]:
dict_params_90 = forecast.tune(
    df_input,
    horizon,
    input_columns,
    numerical_columns,
    categorical_columns,
    alpha=0.9,
)

Tuning forecast model for quantile (α=0.9) with 30 trials...
X shape pre-filtering: (547535, 28)
X shape post-filtering: (547535, 28)


[I 2025-10-06 18:44:16,922] A new study created in memory with name: no-name-f9db4b34-b6f4-4153-b7a8-3226b5f2be64


Train dates [2009-12-29 00:00:00 to 2023-12-26 00:00:00], Test dates [2024-01-02 00:00:00 to 2025-07-29 00:00:00]
X training shape: (487161, 28)
X test shape: (60374, 28)


  0%|          | 0/30 [00:00<?, ?it/s]

[I 2025-10-06 18:44:21,730] Trial 0 finished with value: 0.24817531329339756 and parameters: {'learning_rate': 0.03912720785647401, 'num_leaves': 109, 'max_depth': 8, 'min_child_samples': 46, 'subsample': 0.7390046601106091, 'subsample_freq': 2, 'colsample_bytree': 0.7145209030420498, 'lambda_l1': 1.7323522915498704, 'lambda_l2': 1.8033450352296265, 'min_split_gain': 0.35403628889802274, 'n_estimators': 361}. Best is trial 0 with value: 0.24817531329339756.
[I 2025-10-06 18:44:24,703] Trial 1 finished with value: 0.24893241127164126 and parameters: {'learning_rate': 0.11370159575730848, 'num_leaves': 102, 'max_depth': 6, 'min_child_samples': 30, 'subsample': 0.7458511274633584, 'subsample_freq': 3, 'colsample_bytree': 0.8311891079080594, 'lambda_l1': 0.8638900372842315, 'lambda_l2': 0.8736874205941257, 'min_split_gain': 0.30592644736118974, 'n_estimators': 426}. Best is trial 0 with value: 0.24817531329339756.
[I 2025-10-06 18:44:29,842] Trial 2 finished with value: 0.24784672713515998

In [7]:
dict_params_50 = forecast.tune(
    df_input,
    horizon,
    input_columns,
    numerical_columns,
    categorical_columns,
    alpha=0.5,
)

Tuning forecast model for quantile (α=0.5) with 30 trials...
X shape pre-filtering: (547535, 28)
X shape post-filtering: (547535, 28)


[I 2025-10-06 18:47:21,056] A new study created in memory with name: no-name-6fe5158a-1a96-4d14-b98d-d597d122778e


Train dates [2009-12-29 00:00:00 to 2023-12-26 00:00:00], Test dates [2024-01-02 00:00:00 to 2025-07-29 00:00:00]
X training shape: (487161, 28)
X test shape: (60374, 28)


  0%|          | 0/30 [00:00<?, ?it/s]

[I 2025-10-06 18:47:27,431] Trial 0 finished with value: 0.36734587060492907 and parameters: {'learning_rate': 0.03912720785647401, 'num_leaves': 109, 'max_depth': 8, 'min_child_samples': 46, 'subsample': 0.7390046601106091, 'subsample_freq': 2, 'colsample_bytree': 0.7145209030420498, 'lambda_l1': 1.7323522915498704, 'lambda_l2': 1.8033450352296265, 'min_split_gain': 0.35403628889802274, 'n_estimators': 361}. Best is trial 0 with value: 0.36734587060492907.
[I 2025-10-06 18:47:32,747] Trial 1 finished with value: 0.3670315417631783 and parameters: {'learning_rate': 0.11370159575730848, 'num_leaves': 102, 'max_depth': 6, 'min_child_samples': 30, 'subsample': 0.7458511274633584, 'subsample_freq': 3, 'colsample_bytree': 0.8311891079080594, 'lambda_l1': 0.8638900372842315, 'lambda_l2': 0.8736874205941257, 'min_split_gain': 0.30592644736118974, 'n_estimators': 426}. Best is trial 1 with value: 0.3670315417631783.
[I 2025-10-06 18:47:38,254] Trial 2 finished with value: 0.367467175951685 and

In [8]:
dict_params_10 = forecast.tune(
    df_input,
    horizon,
    input_columns,
    numerical_columns,
    categorical_columns,
    alpha=0.1,
)

Tuning forecast model for quantile (α=0.1) with 30 trials...
X shape pre-filtering: (547535, 28)
X shape post-filtering: (547535, 28)


[I 2025-10-06 18:51:40,785] A new study created in memory with name: no-name-e8be7bf8-e0b5-4649-85a5-e9a4d7c4c29d


Train dates [2009-12-29 00:00:00 to 2023-12-26 00:00:00], Test dates [2024-01-02 00:00:00 to 2025-07-29 00:00:00]
X training shape: (487161, 28)
X test shape: (60374, 28)


  0%|          | 0/30 [00:00<?, ?it/s]

[I 2025-10-06 18:51:47,110] Trial 0 finished with value: 0.09830400185571136 and parameters: {'learning_rate': 0.03912720785647401, 'num_leaves': 109, 'max_depth': 8, 'min_child_samples': 46, 'subsample': 0.7390046601106091, 'subsample_freq': 2, 'colsample_bytree': 0.7145209030420498, 'lambda_l1': 1.7323522915498704, 'lambda_l2': 1.8033450352296265, 'min_split_gain': 0.35403628889802274, 'n_estimators': 361}. Best is trial 0 with value: 0.09830400185571136.
[I 2025-10-06 18:51:52,168] Trial 1 finished with value: 0.09828296033940767 and parameters: {'learning_rate': 0.11370159575730848, 'num_leaves': 102, 'max_depth': 6, 'min_child_samples': 30, 'subsample': 0.7458511274633584, 'subsample_freq': 3, 'colsample_bytree': 0.8311891079080594, 'lambda_l1': 0.8638900372842315, 'lambda_l2': 0.8736874205941257, 'min_split_gain': 0.30592644736118974, 'n_estimators': 426}. Best is trial 1 with value: 0.09828296033940767.
[I 2025-10-06 18:51:57,500] Trial 2 finished with value: 0.09863702711350193

### Convert to a Single Dict

In [9]:
# Collect the best parameters of each study under one key per model.
dict_params = {
    'mean': dict_params_mean['best_params'],
    '90': dict_params_90['best_params'],
    '50': dict_params_50['best_params'],
    '10': dict_params_10['best_params'],
}



### Save to Resources Folder

In [10]:
with open("../src/resources/model_optimal_params.json", "w") as f:
    json.dump(dict_params, f, indent=4)
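
A later step can read the file back with `json.load` and pick the parameters for one model by key. The sketch below shows the round trip with made-up values and a temporary directory in place of `../src/resources`. How the downstream code actually consumes the file is not shown in this notebook, so treat this only as an illustration.

```python
import json
import os
import tempfile

# Made-up parameter values, keyed the same way as dict_params above.
example_params = {
    "mean": {"learning_rate": 0.05, "num_leaves": 64},
    "90": {"learning_rate": 0.1, "num_leaves": 70},
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model_optimal_params.json")

    # Write the parameters exactly as the cell above does.
    with open(path, "w") as f:
        json.dump(example_params, f, indent=4)

    # Read them back while the temporary file still exists.
    with open(path) as f:
        loaded = json.load(f)

# Select the parameters for the 90th-percentile model.
params_90 = loaded["90"]
```

The model keys are stored as strings (`"90"`, not `90`), so lookups after loading should use string keys as well.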