# Stacking ensemble machine learning models to improve forecasting

In machine learning, stacking is an ensemble technique that combines multiple models to reduce their biases and improve predictive performance. More specifically, the predictions of each model (base models) are stacked and used as input to a final model (meta model) to compute the prediction.

Stacking is effective because it leverages the strengths of different algorithms and attempts to mitigate their individual weaknesses. By combining several models, it can capture complex patterns in the data and improve prediction accuracy.

However, stacking can be computationally expensive and requires careful tuning to avoid overfitting. To this end, it is highly recommended to train the final estimator through cross-validation. In addition, obtaining diverse and well-performing base models is critical to the success of the stacking technique.
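The mechanism described above can be sketched by hand. The following is a minimal illustration, not the implementation used by any particular library: it assumes synthetic data from `make_regression`, and it uses a `DecisionTreeRegressor` as a stand-in second base model. Out-of-fold predictions from each base model become the input columns of the meta model, which is why cross-validation helps limit overfitting of the final estimator.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.tree import DecisionTreeRegressor

# Synthetic data (assumption: used only to keep the sketch self-contained)
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Two illustrative base models with different inductive biases
base_models = {
    "ridge": Ridge(alpha=1.0),
    "tree": DecisionTreeRegressor(max_depth=3, random_state=0),
}
cv = KFold(n_splits=5, shuffle=False)

# Out-of-fold predictions: each row is predicted by a model that never saw it
oof_predictions = np.column_stack(
    [cross_val_predict(model, X, y, cv=cv) for model in base_models.values()]
)

# The meta model learns how to combine the base predictions
meta_model = Ridge().fit(oof_predictions, y)

# For new data, base models are refit on all the data, then stacked
fitted_models = [model.fit(X, y) for model in base_models.values()]
X_new = X[:3]
stacked_features = np.column_stack([m.predict(X_new) for m in fitted_models])
y_pred = meta_model.predict(stacked_features)
print(y_pred.shape)
```

Each base model contributes exactly one column to the meta model's input, so the meta model here is fitted on a matrix with two features.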

<div class="admonition note" name="html-admonition" style="background: rgba(0,184,212,.1); padding-top: 0px; padding-bottom: 6px; border-radius: 8px; border-left: 8px solid #00b8d4; padding-left: 10px;">

<p class="title">
    <i style="font-size: 18px; color:#00b8d4;"></i>
    <b style="color: #00b8d4;">&#9998; Note</b>
</p>

See <a href="https://skforecast.org/latest/faq/cyclical-features-time-series.html" target="_blank">Stacking (ensemble) machine learning models to improve forecasting</a> for a more detailed example of stacking models.

</div>

## Libraries and Data

In [5]:
# Data processing
# ==============================================================================
import numpy as np
import pandas as pd

# Plots
# ==============================================================================
import matplotlib.pyplot as plt
import plotly.express as px
import plotly.io as pio
pio.templates.default = "seaborn"
plt.style.use('seaborn-v0_8-darkgrid')

# Modelling and Forecasting
# ==============================================================================
from lightgbm import LGBMRegressor
from sklearn.linear_model import Ridge
from sklearn.ensemble import StackingRegressor
from sklearn.model_selection import KFold

from skforecast.ForecasterAutoreg import ForecasterAutoreg
from skforecast.model_selection import grid_search_forecaster
from skforecast.model_selection import backtesting_forecaster
from skforecast.datasets import fetch_dataset


In [6]:
# Data
# ==============================================================================
data = fetch_dataset(name = 'fuel_consumption')
data = data.loc[:"2019-01-01", ['Gasolinas']]
data = data.rename(columns = {'Gasolinas':'consumption'})
data.index.name = 'date'
data['consumption'] = data['consumption']/100000
data.head(3)

fuel_consumption
----------------
Monthly fuel consumption in Spain from 1969-01-01 to 2022-08-01.
Obtained from Corporación de Reservas Estratégicas de Productos Petrolíferos and
Corporación de Derecho Público tutelada por el Ministerio para la Transición
Ecológica y el Reto Demográfico. https://www.cores.es/es/estadisticas
Shape of the dataset: (644, 5)


date,consumption
1969-01-01,1.668752
1969-02-01,1.554668
1969-03-01,1.849837


Besides the past values of the series (lags), a variable indicating the month of the year is included as an exogenous feature so that the model can capture the seasonality of the series.

In [7]:
# Calendar features
# ==============================================================================
data['month_of_year'] = data.index.month
data.head(3)

date,consumption,month_of_year
1969-01-01,1.668752,1
1969-02-01,1.554668,2
1969-03-01,1.849837,3


To facilitate the training of the models, the search for optimal hyperparameters, and the evaluation of their predictive accuracy, the data are divided into three separate sets: training, validation, and test.

In [8]:
# Split train-validation-test
# ==============================================================================
end_train = '2007-12-01 23:59:00'
end_validation = '2012-12-01 23:59:00'
data_train = data.loc[: end_train, :]
data_val   = data.loc[end_train:end_validation, :]
data_test  = data.loc[end_validation:, :]

print(f"Dates train      : {data_train.index.min()} --- {data_train.index.max()}  (n={len(data_train)})")
print(f"Dates validation : {data_val.index.min()} --- {data_val.index.max()}  (n={len(data_val)})")
print(f"Dates test       : {data_test.index.min()} --- {data_test.index.max()}  (n={len(data_test)})")

Dates train      : 1969-01-01 00:00:00 --- 2007-12-01 00:00:00  (n=468)
Dates validation : 2008-01-01 00:00:00 --- 2012-12-01 00:00:00  (n=60)
Dates test       : 2013-01-01 00:00:00 --- 2019-01-01 00:00:00  (n=73)


## StackingRegressor

With scikit-learn it is very easy to combine multiple regressors thanks to its [StackingRegressor](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.StackingRegressor.html#sklearn.ensemble.StackingRegressor) class.

The `estimators` parameter corresponds to the list of the estimators (base learners) which are stacked together in parallel on the input data. It should be given as a list of names and estimators. The `final_estimator` (meta model) will use the predictions of the estimators as input.

In [9]:
# Create stacking regressor
# ==============================================================================
params_ridge = {'alpha': 0.001}
params_lgbm = {'learning_rate': 0.1, 'max_depth': 5, 'n_estimators': 500}

estimators = [
    ('ridge', Ridge(**params_ridge)),
    ('lgbm', LGBMRegressor(random_state=42, **params_lgbm)),
]
stacking_regressor = StackingRegressor(
                        estimators = estimators,
                        final_estimator = Ridge(),
                        cv = KFold(n_splits=5, shuffle=False)
                     )
stacking_regressor
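To check that the final estimator receives one input column per base learner, a fitted `StackingRegressor` can be inspected. This is a small sketch under stated assumptions: the data are synthetic, a `DecisionTreeRegressor` replaces LightGBM to keep the example dependency-light, and the `demo_*` names are hypothetical.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import StackingRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold
from sklearn.tree import DecisionTreeRegressor

# Synthetic data (assumption: stands in for the lagged time series matrix)
X_demo, y_demo = make_regression(
    n_samples=300, n_features=4, noise=5.0, random_state=1
)

demo_stack = StackingRegressor(
    estimators=[
        ("ridge", Ridge(alpha=1.0)),
        ("tree", DecisionTreeRegressor(max_depth=4, random_state=1)),
    ],
    final_estimator=Ridge(),
    cv=KFold(n_splits=5, shuffle=False),
)
demo_stack.fit(X_demo, y_demo)

# transform() returns the base learners' predictions: one column per estimator
base_predictions = demo_stack.transform(X_demo)

# The fitted meta model has one coefficient per base learner
meta_weights = demo_stack.final_estimator_.coef_

print(base_predictions.shape)
print(dict(zip(demo_stack.named_estimators_.keys(), meta_weights)))
```

The coefficients of the meta model give a rough indication of how much weight each base learner receives in the final prediction.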

In [10]:
# Create forecaster
# ==============================================================================
forecaster = ForecasterAutoreg(
                 regressor = stacking_regressor,
                 lags = 12 # Last 12 months used as predictors
             )

In [11]:
# Backtesting on test data
# ==============================================================================
metric, predictions = backtesting_forecaster(
                            forecaster         = forecaster,
                            y                  = data['consumption'],
                            exog               = data['month_of_year'],
                            initial_train_size = len(data.loc[:end_validation]),
                            fixed_train_size   = False,
                            steps              = 12, # Forecast horizon
                            refit              = False,
                            metric             = 'mean_squared_error',
                            n_jobs             = 'auto',
                            verbose            = False
                      )        

print(f"Backtest error: {metric:.2f}")

Backtest error: 0.05
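The way this backtest partitions the test period can be reproduced with simple arithmetic. The sketch below is not skforecast's internal code. It assumes that, with `refit = False`, the model is trained once on the first `initial_train_size` observations and then predicts consecutive blocks of `steps` observations. The sizes come from the split printed earlier (468 + 60 + 73 = 601 observations).

```python
import math

n_total = 601             # 468 train + 60 validation + 73 test observations
initial_train_size = 528  # len(data.loc[:end_validation]) = 468 + 60
steps = 12                # forecast horizon of each fold

n_test = n_total - initial_train_size
n_folds = math.ceil(n_test / steps)

# Positional (start, end) index of each block of predictions, end exclusive
folds = []
for i in range(n_folds):
    start = initial_train_size + i * steps
    end = min(start + steps, n_total)
    folds.append((start, end))

for k, (start, end) in enumerate(folds):
    print(f"fold {k}: predicts positions [{start}, {end})  n={end - start}")
```

Under these assumptions there are seven folds: six complete blocks of 12 months and a final block with the single remaining observation.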


## Hyperparameter search for StackingRegressor

When using `StackingRegressor`, the hyperparameters of each base regressor must be prefixed with the name of that regressor followed by two underscores. For example, the hyperparameter `alpha` of the Ridge regressor must be specified as `ridge__alpha`. Hyperparameters of the final estimator must be specified with the `final_estimator__` prefix.
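The prefix convention can be verified directly with scikit-learn's `get_params` and `set_params` methods. This is a minimal sketch: a `DecisionTreeRegressor` named `tree` is used instead of LightGBM (an assumption, to avoid the extra dependency), and the `demo_stack` name is hypothetical.

```python
from sklearn.ensemble import StackingRegressor
from sklearn.linear_model import Ridge
from sklearn.tree import DecisionTreeRegressor

demo_stack = StackingRegressor(
    estimators=[
        ("ridge", Ridge(alpha=0.001)),
        ("tree", DecisionTreeRegressor(max_depth=5, random_state=42)),
    ],
    final_estimator=Ridge(),
)

# Nested parameter names follow the pattern <estimator name>__<parameter>
param_names = demo_stack.get_params().keys()
print("ridge__alpha" in param_names)
print("tree__max_depth" in param_names)
print("final_estimator__alpha" in param_names)

# set_params accepts the same prefixed names
demo_stack.set_params(ridge__alpha=10.0, final_estimator__alpha=0.5)
print(demo_stack.get_params()["ridge__alpha"])
print(demo_stack.final_estimator.alpha)
```

Listing `get_params().keys()` is a quick way to find the exact name to use in a parameter grid before launching a search.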

In [12]:
# Grid search of hyperparameters and lags
# ==============================================================================
param_grid = {
    'ridge__alpha': [0.1, 1, 10],
    'lgbm__n_estimators': [100, 500],
    'lgbm__max_depth': [3, 5, 10],
    'lgbm__learning_rate': [0.01, 0.1]
}

# Lags used as predictors
lags_grid = [24]

results_grid = grid_search_forecaster(
                   forecaster         = forecaster,
                   y                  = data['consumption'],
                   exog               = data['month_of_year'],
                   param_grid         = param_grid,
                   lags_grid          = lags_grid,
                   steps              = 36,
                   refit              = False,
                   metric             = 'mean_squared_error',
                   initial_train_size = len(data.loc[:end_train]),
                   fixed_train_size   = False,
                   return_best        = True,
                   n_jobs             = 'auto',
                   verbose            = False
               )

results_grid.head()

Number of models compared: 36.
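The number of candidate models follows from the size of the grid. As an illustration (plain Python, not the search routine itself), the combinations can be enumerated with `itertools.product`. The grid values are copied from the cell above, and each combination is evaluated once per entry in `lags_grid`.

```python
from itertools import product

param_grid = {
    'ridge__alpha': [0.1, 1, 10],
    'lgbm__n_estimators': [100, 500],
    'lgbm__max_depth': [3, 5, 10],
    'lgbm__learning_rate': [0.01, 0.1]
}
lags_grid = [24]

# Cartesian product of all hyperparameter values
names = list(param_grid)
combinations = [
    dict(zip(names, values)) for values in product(*param_grid.values())
]

# One model per (lags, hyperparameter combination) pair
n_models = len(combinations) * len(lags_grid)
print(n_models)  # 3 * 2 * 3 * 2 * 1 = 36
print(combinations[0])
```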


[LightGBM] [Info] Number of data points in the train set: 444, number of used features: 25
[LightGBM] [Info] Start training from score 5.765825
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000172 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 2869
[LightGBM] [Info] Number of data points in the train set: 355, number of used features: 25
[LightGBM] [Info] Start training from score 6.303141
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000296 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 2869
[LightGBM] [Info] Number of data points in the train set: 355, number of used features: 25
[LightGBM] [Info] Start training 



[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000408 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 3589
[LightGBM] [Info] Number of data points in the train set: 444, number of used features: 25
[LightGBM] [Info] Start training from score 5.765825
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000348 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 2869
[LightGBM] [Info] Number of data points in the train set: 355, number of used features: 25
[LightGBM] [Info] Start training from score 6.303141
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000306 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 2869
[LightGBM] [Info] Number of data points in the train set: 355, number of used features: 25
[LightGBM] [Info] Start training 



[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000277 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 3589
[LightGBM] [Info] Number of data points in the train set: 444, number of used features: 25
[LightGBM] [Info] Start training from score 5.765825
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000453 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 2869
[LightGBM] [Info] Number of data points in the train set: 355, number of used features: 25
[LightGBM] [Info] Start training from score 6.303141
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000428 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 2869
[LightGBM] [Info] Number of data points in the train set: 355, number of used features: 25
[LightGBM] [Info] Start training 



[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000281 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 3589
[LightGBM] [Info] Number of data points in the train set: 444, number of used features: 25
[LightGBM] [Info] Start training from score 5.765825
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000244 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 2869
[LightGBM] [Info] Number of data points in the train set: 355, number of used features: 25
[LightGBM] [Info] Start training from score 6.303141
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000281 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 2869
[LightGBM] [Info] Number of data points in the train set: 355, number of used features: 25
[LightGBM] [Info] Start training 



[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000227 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 3589
[LightGBM] [Info] Number of data points in the train set: 444, number of used features: 25
[LightGBM] [Info] Start training from score 5.765825
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000280 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 2869
[LightGBM] [Info] Number of data points in the train set: 355, number of used features: 25
[LightGBM] [Info] Start training from score 6.303141
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000253 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 2869
[LightGBM] [Info] Number of data points in the train set: 355, number of used features: 25
[LightGBM] [Info] Start training 



[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000631 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 3589
[LightGBM] [Info] Number of data points in the train set: 444, number of used features: 25
[LightGBM] [Info] Start training from score 5.765825
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000240 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 2869
[LightGBM] [Info] Number of data points in the train set: 355, number of used features: 25
[LightGBM] [Info] Start training from score 6.303141
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000183 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 2869
[LightGBM] [Info] Number of data points in the train set: 355, number of used features: 25
[LightGBM] [Info] Start training 



[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000220 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 3589
[LightGBM] [Info] Number of data points in the train set: 444, number of used features: 25
[LightGBM] [Info] Start training from score 5.765825
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000193 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 2869
[LightGBM] [Info] Number of data points in the train set: 355, number of used features: 25
[LightGBM] [Info] Start training from score 6.303141
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000191 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 2869
[LightGBM] [Info] Number of data points in the train set: 355, number of used features: 25
[LightGBM] [Info] Start training 



[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000194 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 3589
[LightGBM] [Info] Number of data points in the train set: 444, number of used features: 25
[LightGBM] [Info] Start training from score 5.765825
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000179 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 2869
[LightGBM] [Info] Number of data points in the train set: 355, number of used features: 25
[LightGBM] [Info] Start training from score 6.303141
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000172 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 2869
[LightGBM] [Info] Number of data points in the train set: 355, number of used features: 25
[LightGBM] [Info] Start training 



[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000199 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 3589
[LightGBM] [Info] Number of data points in the train set: 444, number of used features: 25
[LightGBM] [Info] Start training from score 5.765825
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000199 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 2869
[LightGBM] [Info] Number of data points in the train set: 355, number of used features: 25
[LightGBM] [Info] Start training from score 6.303141
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000210 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 2869
[LightGBM] [Info] Number of data points in the train set: 355, number of used features: 25
[LightGBM] [Info] Start training 



[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000212 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 3589
[LightGBM] [Info] Number of data points in the train set: 444, number of used features: 25
[LightGBM] [Info] Start training from score 5.765825
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000202 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 2869
[LightGBM] [Info] Number of data points in the train set: 355, number of used features: 25
[LightGBM] [Info] Start training from score 6.303141
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000207 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 2869
[LightGBM] [Info] Number of data points in the train set: 355, number of used features: 25
[LightGBM] [Info] Start training 



[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000224 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 3589
[LightGBM] [Info] Number of data points in the train set: 444, number of used features: 25
[LightGBM] [Info] Start training from score 5.765825
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000178 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 2869
[LightGBM] [Info] Number of data points in the train set: 355, number of used features: 25
[LightGBM] [Info] Start training from score 6.303141
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000193 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 2869
[LightGBM] [Info] Number of data points in the train set: 355, number of used features: 25
[LightGBM] [Info] Start training 



[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000179 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 3589
[LightGBM] [Info] Number of data points in the train set: 444, number of used features: 25
[LightGBM] [Info] Start training from score 5.765825
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000177 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 2869
[LightGBM] [Info] Number of data points in the train set: 355, number of used features: 25
[LightGBM] [Info] Start training from score 6.303141
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000196 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 2869
[LightGBM] [Info] Number of data points in the train set: 355, number of used features: 25
[LightGBM] [Info] Start training 



[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000245 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 3589
[LightGBM] [Info] Number of data points in the train set: 444, number of used features: 25
[LightGBM] [Info] Start training from score 5.765825
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000170 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 2869
[LightGBM] [Info] Number of data points in the train set: 355, number of used features: 25
[LightGBM] [Info] Start training from score 6.303141
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000236 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 2869
[LightGBM] [Info] Number of data points in the train set: 355, number of used features: 25
[LightGBM] [Info] Start training 

lags grid: 100%|██████████| 1/1 [01:09<00:00, 69.92s/it]

[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000326 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 4645
[LightGBM] [Info] Number of data points in the train set: 577, number of used features: 25
[LightGBM] [Info] Start training from score 5.428649
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000225 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 3709
[LightGBM] [Info] Number of data points in the train set: 461, number of used features: 25
[LightGBM] [Info] Start training from score 5.806790
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000214 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 3709
[LightGBM] [Info] Number of data points in the train set: 461, number of used features: 25
[LightGBM] [Info] Start training 




[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000254 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 3733
[LightGBM] [Info] Number of data points in the train set: 462, number of used features: 25
[LightGBM] [Info] Start training from score 4.939236
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000721 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 3733
[LightGBM] [Info] Number of data points in the train set: 462, number of used features: 25
[LightGBM] [Info] Start training from score 5.228287
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000225 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 3733
[LightGBM] [Info] Number of data points in the train set: 462, number of used features: 25
[LightGBM] [Info] Start training 

|    | lags | params | mean_squared_error | lgbm__learning_rate | lgbm__max_depth | lgbm__n_estimators | ridge__alpha |
|----|------|--------|--------------------|---------------------|-----------------|--------------------|--------------|
| 0  | [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14... | {'lgbm__learning_rate': 0.01, 'lgbm__max_depth... | 0.508066 | 0.01 | 3.0  | 100.0 | 0.1 |
| 1  | [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14... | {'lgbm__learning_rate': 0.01, 'lgbm__max_depth... | 0.51929  | 0.01 | 3.0  | 100.0 | 1.0 |
| 6  | [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14... | {'lgbm__learning_rate': 0.01, 'lgbm__max_depth... | 0.531137 | 0.01 | 5.0  | 100.0 | 0.1 |
| 12 | [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14... | {'lgbm__learning_rate': 0.01, 'lgbm__max_depth... | 0.532003 | 0.01 | 10.0 | 100.0 | 0.1 |
| 33 | [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14... | {'lgbm__learning_rate': 0.1, 'lgbm__max_depth'... | 0.535757 | 0.1  | 10.0 | 500.0 | 0.1 |


Once the best hyperparameters have been determined for each regressor in the ensemble, the test error is computed through backtesting.

In [13]:
# Backtesting on test data
# ==============================================================================
metric, predictions = backtesting_forecaster(
                            forecaster         = forecaster,
                            y                  = data['consumption'],
                            exog               = data['month_of_year'],
                            initial_train_size = len(data.loc[:end_validation]),
                            fixed_train_size   = False,
                            steps              = 12, # Forecast horizon
                            refit              = False,
                            metric             = 'mean_squared_error',
                            n_jobs             = 'auto',
                            verbose            = False
                      )        

print(f"Backtest error: {metric:.2f}")

Backtest error: 0.01
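
To make the `steps` and `refit = False` arguments in the call above more concrete, here is a minimal sketch of how a test period can be cut into consecutive forecast windows when the model is trained once and never refitted. The helper name `backtest_windows` and the numbers used are hypothetical and chosen only for illustration; this is not skforecast's internal implementation.

```python
def backtest_windows(n_obs, initial_train_size, steps):
    """Return (start, end) position pairs, end exclusive, of consecutive
    forecast windows covering everything after the initial training set."""
    if not 0 < initial_train_size < n_obs:
        raise ValueError("initial_train_size must be between 1 and n_obs - 1")
    if steps < 1:
        raise ValueError("steps must be a positive integer")

    windows = []
    start = initial_train_size
    while start < n_obs:
        # The last window may be shorter than `steps` if the series runs out.
        end = min(start + steps, n_obs)
        windows.append((start, end))
        start = end
    return windows


# Hypothetical series: 100 observations, the first 70 used for training,
# forecast horizon of 12 steps per window.
windows = backtest_windows(n_obs=100, initial_train_size=70, steps=12)
for fold, (start, end) in enumerate(windows):
    print(f"fold {fold}: predict positions {start}..{end - 1}")
```

Without refitting, the same fitted model produces the predictions for every window, and the metric is then computed over the predictions of all windows together.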





## Feature importance in StackingRegressor

When a `StackingRegressor` is used as the regressor of a forecaster, the forecaster's `get_feature_importances` method does not work. This is because `StackingRegressor` objects have neither a `feature_importances_` nor a `coef_` attribute. Instead, each of the regressors that make up the stack must be inspected individually.

In [20]:
# Feature importances for each regressor in the stacking
# ==============================================================================
if forecaster.regressor.__class__.__name__ == 'StackingRegressor':
    importancia_pred = []
    for regressor in forecaster.regressor.estimators_:
        try:
            importancia = pd.DataFrame(
                data = {
                    'feature': forecaster.regressor.feature_names_in_,
                    f'importance_{type(regressor).__name__}': regressor.coef_,
                    f'importance_abs_{type(regressor).__name__}': np.abs(regressor.coef_)
                }
            ).set_index('feature')
        except AttributeError:
            importancia = pd.DataFrame(
                data = {
                    'feature': forecaster.regressor.feature_names_in_,
                    f'importance_{type(regressor).__name__}': regressor.feature_importances_,
                    f'importance_abs_{type(regressor).__name__}': np.abs(regressor.feature_importances_)
                }
            ).set_index('feature')
        importancia_pred.append(importancia)
    
    importancia_pred = pd.concat(importancia_pred, axis=1)
    
else:
    importancia_pred = forecaster.get_feature_importances()
    importancia_pred['importance_abs'] = importancia_pred['importance'].abs()
    importancia_pred = importancia_pred.sort_values(by='importance_abs', ascending=False)

importancia_pred.head(5)

| feature | importance_Ridge | importance_abs_Ridge | importance_LGBMRegressor | importance_abs_LGBMRegressor |
|---------|------------------|----------------------|--------------------------|------------------------------|
| lag_1   | 0.020984         | 0.020984             | 59                       | 59                           |
| lag_2   | 0.216998         | 0.216998             | 1                        | 1                            |
| lag_3   | 0.188519         | 0.188519             | 0                        | 0                            |
| lag_4   | 0.200916         | 0.200916             | 0                        | 0                            |
| lag_5   | 0.106734         | 0.106734             | 0                        | 0                            |
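
The attribute dispatch used in the cell above (try `coef_`, fall back to `feature_importances_`) can also be written with explicit checks, so that an estimator exposing neither attribute raises a clear error instead of being handled silently. The following is a minimal sketch under stated assumptions: the helper `get_importance` is hypothetical, and the estimators are stand-in objects built with `types.SimpleNamespace` so that the snippet runs without fitting any model. The stand-in values are the `lag_1` and `lag_2` entries of the table above.

```python
from types import SimpleNamespace


def get_importance(estimator):
    """Return (attribute_name, values) for a fitted estimator.

    Linear models expose `coef_`; tree-based models expose
    `feature_importances_`. Anything else raises AttributeError.
    """
    if hasattr(estimator, "coef_"):
        return "coef_", list(estimator.coef_)
    if hasattr(estimator, "feature_importances_"):
        return "feature_importances_", list(estimator.feature_importances_)
    raise AttributeError(
        f"{type(estimator).__name__} exposes neither `coef_` "
        "nor `feature_importances_`."
    )


# Stand-ins for fitted estimators (assumption: not real scikit-learn objects).
linear_like = SimpleNamespace(coef_=[0.020984, 0.216998])
tree_like = SimpleNamespace(feature_importances_=[59, 1])
stacking_like = SimpleNamespace(estimators_=[linear_like, tree_like])

# The stacking object itself has neither attribute, so it must be unpacked.
try:
    get_importance(stacking_like)
    stacking_has_importance = True
except AttributeError:
    stacking_has_importance = False

results = [get_importance(est) for est in stacking_like.estimators_]
for name, values in results:
    print(name, values)
```

Note that the two kinds of values are not on the same scale: `coef_` holds signed linear coefficients, whereas `feature_importances_` holds non-negative importance scores, so the columns should be compared within each regressor rather than across regressors.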
