# Step 5a: Tuning
This notebook tunes four LightGBM forecasting models to predict next-week service request counts: **mean**, **median**, **10th percentile**, and **90th percentile**.  
Tuning is performed with the `tune(...)` function from the `forecast` module in `src`, which is imported below.

---

Running this notebook is **not recommended** unless absolutely necessary — the full tuning process took **~10 minutes** on an **AWS EC2 `c5d.18xlarge`** instance.  
This machine has the following specs:
- **72 vCPUs** (36 physical cores)  
- **144 GiB memory**  
- **High-performance NVMe SSD (local storage)**  
- **Optimized for compute-intensive workloads**

Even on such high-end hardware, Optuna’s multi-trial optimization (4 models × 30 trials each) requires substantial CPU and memory resources.  
If you only need to rerun a single quantile model or make incremental adjustments, consider reducing `n_trials` or reusing stored best parameters.
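
As a minimal sketch of reusing stored parameters, the helper below reads the JSON file that this notebook writes in its final cell and returns the entry for one model. The function name `load_best_params` is hypothetical; the path and the keys (`"mean"`, `"10"`, `"50"`, `"90"`) are the ones used when saving.

```python
import json
from pathlib import Path

# Path taken from the save step at the end of this notebook.
PARAMS_PATH = Path("../src/resources/model_optimal_params.json")


def load_best_params(model_key, path=PARAMS_PATH):
    """Return the stored best parameters for one model, or None if unavailable.

    model_key is one of "mean", "10", "50", "90" (the keys written by this notebook).
    """
    path = Path(path)
    if not path.exists():
        return None  # nothing stored yet, so tuning is still required
    with path.open() as f:
        stored = json.load(f)
    return stored.get(model_key)
```

If the helper returns `None`, the corresponding `forecast.tune(...)` call still has to be run; otherwise the returned dictionary can be used directly.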

---

## Why Optuna

Optuna is used for **efficient hyperparameter tuning**.  
Its Tree-Structured Parzen Estimator (TPE) sampler uses the results of completed trials to propose new candidates in promising hyperparameter regions, instead of performing an exhaustive grid search.  
This makes the search more sample-efficient, which matters most when the number of trials is limited.

Because the entire scikit-learn pipeline is wrapped in an objective function, Optuna can tune the LightGBM parameters to minimize the loss metric relevant to each model type: Poisson deviance for the mean model and pinball loss for the quantile models.
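
For reference, the two loss metrics can be written out in a few lines of plain Python. This is an illustrative sketch using the standard textbook definitions; the exact scoring code inside `forecast.tune` is not shown in this notebook, so the function names below are assumptions. scikit-learn provides equivalent metrics as `mean_poisson_deviance` and `mean_pinball_loss`.

```python
import math


def mean_poisson_deviance(y_true, y_pred):
    """Mean Poisson deviance; every prediction must be strictly positive."""
    total = 0.0
    for y, mu in zip(y_true, y_pred):
        # The y * log(y / mu) term is defined as 0 when y == 0.
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += 2.0 * (log_term - y + mu)
    return total / len(y_true)


def pinball_loss(y_true, y_pred, alpha):
    """Mean pinball (quantile) loss for a quantile level alpha in (0, 1)."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction is weighted by alpha, over-prediction by (1 - alpha).
        total += alpha * diff if diff >= 0 else (alpha - 1.0) * diff
    return total / len(y_true)
```

With `alpha = 0.9`, predicting too low costs nine times as much as predicting too high by the same amount, which is what pushes the fitted value toward the 90th percentile.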

---

## Why These Initial Settings

### Training and Test Split
The time-based split at `test_cutoff="2024-01-01"` ensures that **no information from the future leaks into the training data**.  
This mirrors how the model is trained in production, where predictions are generated using only the information available up to the forecast date.
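
A time-based split of this kind can be sketched with pandas as follows. The helper name `time_based_split`, the column name `week`, and the choice to put rows dated exactly on the cutoff into the test set are all assumptions made for illustration; the actual split is implemented inside `forecast.tune`.

```python
import pandas as pd


def time_based_split(df, date_col, test_cutoff):
    """Split rows into train (before the cutoff) and test (on or after it)."""
    cutoff = pd.Timestamp(test_cutoff)
    dates = pd.to_datetime(df[date_col])
    train = df.loc[dates < cutoff]
    test = df.loc[dates >= cutoff]
    return train, test


# Toy weekly panel; the column name "week" is hypothetical.
toy = pd.DataFrame({
    "week": pd.date_range("2023-12-12", periods=5, freq="7D"),
    "requests": [12, 9, 15, 11, 14],
})
train, test = time_based_split(toy, "week", "2024-01-01")
```

On the toy data the last training week is 2023-12-26 and the first test week is 2024-01-02, so no test-period row can influence the fit.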

### Modeling Choices
- **Mean Model**  
  Uses the `poisson` objective, which is appropriate for nonnegative count data.
- **Quantile Models**  
  Use the `quantile` objective with `alpha ∈ {0.1, 0.5, 0.9}` to estimate the 10th percentile, the median, and the 90th percentile of the distribution.
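
The mapping from model type to LightGBM objective can be sketched as a small helper. The function `objective_params` is hypothetical; it follows the convention of the calls later in this notebook, where omitting `alpha` selects the mean model. The resulting dictionaries use LightGBM's `objective` and `alpha` parameter names and could be passed as keyword arguments to `lightgbm.LGBMRegressor`.

```python
def objective_params(alpha=None):
    """Return LightGBM objective settings for the mean or a quantile model."""
    if alpha is None:
        # Mean model: Poisson regression for nonnegative counts.
        return {"objective": "poisson"}
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Quantile model: alpha is the target quantile level.
    return {"objective": "quantile", "alpha": alpha}


# One configuration per model tuned in this notebook.
model_configs = {
    "mean": objective_params(),
    "10": objective_params(0.1),
    "50": objective_params(0.5),
    "90": objective_params(0.9),
}
```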

### Hyperparameter Search Space
| Parameter | Range or Value | Purpose |
|------------|--------|----------|
| `learning_rate` | [0.01, 0.2] | Controls learning speed and generalization |
| `num_leaves` | [50, 80] | Balances model complexity |
| `max_depth` | [6, 8] | Limits tree depth to avoid overfitting |
| `min_child_samples` | [20, 50] | Regularizes splits with low sample support |
| `subsample`, `colsample_bytree` | [0.6, 0.9] | Adds randomness for robustness |
| `n_estimators` | 500 (fixed, not tuned) | Provides sufficient boosting rounds for learning stability |

### Reproducibility
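A fixed `random_state=42` makes the search repeatable: every study samples the same sequence of hyperparameters, as the identical trial parameters across the four studies in the logs below show.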


## Load Packages

In [None]:
import json
import os
import sys
from importlib import reload

import pandas as pd
import seaborn as sns

# Make the repository root importable so that `src` resolves from the notebook folder.
PACKAGE_PATH = os.path.abspath(os.path.join(os.getcwd(), '..'))
sys.path.insert(0, PACKAGE_PATH)

from src import config
from src import forecast

pd.set_option('display.max_columns', 50)
sns.set_style('whitegrid')



## Load Data

In [3]:
forecast_panel = pd.read_parquet(os.path.join(config.PRESENTATION_DATA_PATH, 'model_fitting_data.parquet'))

## Inputs

In [None]:
numerical_columns = config.NUMERICAL_COLUMNS
categorical_columns = config.CATEGORICAL_COLUMNS
horizon = 1  # forecast horizon in weeks (next-week prediction)
df_input = forecast_panel.copy()
input_columns = df_input.columns

## Hyperparameter Tuning

In [5]:
# No alpha is passed, so this call tunes the Poisson (mean) model.
dict_params_mean = forecast.tune(
    df_input,
    horizon,
    input_columns,
    numerical_columns,
    categorical_columns,
)

Tuning forecast model for poisson (mean) with 30 trials...
X shape pre-filtering: (547535, 28)
X shape post-filtering: (547535, 28)


[I 2025-10-06 05:31:58,262] A new study created in memory with name: no-name-7f0c2d12-cd6a-44f3-8473-1c99a57854ff


Train dates [2009-12-29 00:00:00 to 2023-12-26 00:00:00], Test dates [2024-01-02 00:00:00 to 2025-07-29 00:00:00]
X training shape: (487161, 28)
X test shape: (60374, 28)



[I 2025-10-06 05:32:04,732] Trial 0 finished with value: 1.2468016839770635 and parameters: {'learning_rate': 0.030710573677773714, 'num_leaves': 79, 'max_depth': 8, 'min_child_samples': 38, 'subsample': 0.6468055921327309, 'colsample_bytree': 0.6467983561008608}. Best is trial 0 with value: 1.2468016839770635.
[I 2025-10-06 05:32:11,030] Trial 1 finished with value: 1.2537382443683658 and parameters: {'learning_rate': 0.011900590783184251, 'num_leaves': 76, 'max_depth': 7, 'min_child_samples': 41, 'subsample': 0.6061753482887408, 'colsample_bytree': 0.8909729556485984}. Best is trial 0 with value: 1.2468016839770635.
[I 2025-10-06 05:32:14,576] Trial 2 finished with value: 1.2497434530210318 and parameters: {'learning_rate': 0.12106896936002161, 'num_leaves': 56, 'max_depth': 6, 'min_child_samples': 25, 'subsample': 0.6912726728878613, 'colsample_bytree': 0.7574269294896714}. Best is trial 0 with value: 1.2468016839770635.
[I 2025-10-06 05:32:19,198] Trial 3 finished with value: 1.246

In [6]:
dict_params_90 = forecast.tune(
    df_input,
    horizon,
    input_columns,
    numerical_columns,
    categorical_columns,
    alpha=0.9,
)

Tuning forecast model for quantile (α=0.9) with 30 trials...
X shape pre-filtering: (547535, 28)
X shape post-filtering: (547535, 28)


[I 2025-10-06 05:34:31,344] A new study created in memory with name: no-name-12fa0ba4-b3e1-4bc9-9e08-7fcd4dcc20c1


Train dates [2009-12-29 00:00:00 to 2023-12-26 00:00:00], Test dates [2024-01-02 00:00:00 to 2025-07-29 00:00:00]
X training shape: (487161, 28)
X test shape: (60374, 28)



[I 2025-10-06 05:34:38,402] Trial 0 finished with value: 0.2475194483070389 and parameters: {'learning_rate': 0.030710573677773714, 'num_leaves': 79, 'max_depth': 8, 'min_child_samples': 38, 'subsample': 0.6468055921327309, 'colsample_bytree': 0.6467983561008608}. Best is trial 0 with value: 0.2475194483070389.
[I 2025-10-06 05:34:44,705] Trial 1 finished with value: 0.24827559035425892 and parameters: {'learning_rate': 0.011900590783184251, 'num_leaves': 76, 'max_depth': 7, 'min_child_samples': 41, 'subsample': 0.6061753482887408, 'colsample_bytree': 0.8909729556485984}. Best is trial 0 with value: 0.2475194483070389.
[I 2025-10-06 05:34:50,586] Trial 2 finished with value: 0.24980348336955763 and parameters: {'learning_rate': 0.12106896936002161, 'num_leaves': 56, 'max_depth': 6, 'min_child_samples': 25, 'subsample': 0.6912726728878613, 'colsample_bytree': 0.7574269294896714}. Best is trial 0 with value: 0.2475194483070389.
[I 2025-10-06 05:34:57,085] Trial 3 finished with value: 0.2

In [7]:
dict_params_50 = forecast.tune(
    df_input,
    horizon,
    input_columns,
    numerical_columns,
    categorical_columns,
    alpha=0.5,
)

Tuning forecast model for quantile (α=0.5) with 30 trials...
X shape pre-filtering: (547535, 28)
X shape post-filtering: (547535, 28)


[I 2025-10-06 05:37:54,777] A new study created in memory with name: no-name-a59d3efd-544a-4907-ab99-b7e4f31f52ce


Train dates [2009-12-29 00:00:00 to 2023-12-26 00:00:00], Test dates [2024-01-02 00:00:00 to 2025-07-29 00:00:00]
X training shape: (487161, 28)
X test shape: (60374, 28)



[I 2025-10-06 05:38:02,310] Trial 0 finished with value: 0.36794942913557704 and parameters: {'learning_rate': 0.030710573677773714, 'num_leaves': 79, 'max_depth': 8, 'min_child_samples': 38, 'subsample': 0.6468055921327309, 'colsample_bytree': 0.6467983561008608}. Best is trial 0 with value: 0.36794942913557704.
[I 2025-10-06 05:38:09,044] Trial 1 finished with value: 0.3672689725352087 and parameters: {'learning_rate': 0.011900590783184251, 'num_leaves': 76, 'max_depth': 7, 'min_child_samples': 41, 'subsample': 0.6061753482887408, 'colsample_bytree': 0.8909729556485984}. Best is trial 1 with value: 0.3672689725352087.
[I 2025-10-06 05:38:14,998] Trial 2 finished with value: 0.3676738553075738 and parameters: {'learning_rate': 0.12106896936002161, 'num_leaves': 56, 'max_depth': 6, 'min_child_samples': 25, 'subsample': 0.6912726728878613, 'colsample_bytree': 0.7574269294896714}. Best is trial 1 with value: 0.3672689725352087.
[I 2025-10-06 05:38:21,816] Trial 3 finished with value: 0.3

In [8]:
dict_params_10 = forecast.tune(
    df_input,
    horizon,
    input_columns,
    numerical_columns,
    categorical_columns,
    alpha=0.1,
)

Tuning forecast model for quantile (α=0.1) with 30 trials...


[I 2025-10-06 05:41:22,919] A new study created in memory with name: no-name-f43ed4b1-0608-4ff0-97c4-baa806ba5a8c


X shape pre-filtering: (547535, 28)
X shape post-filtering: (547535, 28)
Train dates [2009-12-29 00:00:00 to 2023-12-26 00:00:00], Test dates [2024-01-02 00:00:00 to 2025-07-29 00:00:00]
X training shape: (487161, 28)
X test shape: (60374, 28)



[I 2025-10-06 05:41:30,406] Trial 0 finished with value: 0.0985482102239971 and parameters: {'learning_rate': 0.030710573677773714, 'num_leaves': 79, 'max_depth': 8, 'min_child_samples': 38, 'subsample': 0.6468055921327309, 'colsample_bytree': 0.6467983561008608}. Best is trial 0 with value: 0.0985482102239971.
[I 2025-10-06 05:41:37,344] Trial 1 finished with value: 0.0985594341319639 and parameters: {'learning_rate': 0.011900590783184251, 'num_leaves': 76, 'max_depth': 7, 'min_child_samples': 41, 'subsample': 0.6061753482887408, 'colsample_bytree': 0.8909729556485984}. Best is trial 0 with value: 0.0985482102239971.
[I 2025-10-06 05:41:42,755] Trial 2 finished with value: 0.09845968225223375 and parameters: {'learning_rate': 0.12106896936002161, 'num_leaves': 56, 'max_depth': 6, 'min_child_samples': 25, 'subsample': 0.6912726728878613, 'colsample_bytree': 0.7574269294896714}. Best is trial 2 with value: 0.09845968225223375.
[I 2025-10-06 05:41:49,088] Trial 3 finished with value: 0.0

### Convert to a Single Dict

In [9]:
# Collect the best parameters of each model under a short key.
dict_params = {
    'mean': dict_params_mean['best_params'],
    '90': dict_params_90['best_params'],
    '50': dict_params_50['best_params'],
    '10': dict_params_10['best_params'],
}



### Save to Resources Folder

In [10]:
with open("../src/resources/model_optimal_params.json", "w") as f:
    json.dump(dict_params, f, indent=4)