# Benchmarking skforecast Recursive Forecasters

This notebook benchmarks the execution speed of `skforecast` across its versions and keeps a record of the results.

**Notes**

+ In version `0.15.0`, binning of residuals was introduced in multi-series forecasters. This explains the increase in the time taken to fit the model.
+ Since version `0.17.0`, the `RecursiveMultiSeriesForecaster` only accepts as input a long-format DataFrame with a MultiIndex or a dictionary of series. Wide-format DataFrames, where each column is a different time series, are no longer supported. If the input is a pandas DataFrame with a MultiIndex, it is internally converted to a dictionary of series, which notably increases computation time.
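
The input formats mentioned in the last note can be illustrated with plain pandas. The following is a minimal sketch, not skforecast code: the toy data, the column name `value`, and the MultiIndex level names and order (`series_id`, `datetime`) are assumptions made for illustration, so check the skforecast documentation for the exact layout the forecaster expects.

```python
import numpy as np
import pandas as pd

# Hypothetical wide-format data: one column per series, shared DatetimeIndex.
index = pd.date_range("2024-01-01", periods=5, freq="D")
wide = pd.DataFrame(
    {
        "series_a": np.arange(5, dtype=float),
        "series_b": np.arange(5, 10, dtype=float),
    },
    index=index,
)

# Option 1: dictionary of series, one pandas Series per time series.
series_dict = {name: wide[name].copy() for name in wide.columns}

# Option 2: long format with a MultiIndex.
# Level names and order are assumptions for this sketch.
long = (
    wide.rename_axis("datetime")
    .reset_index()
    .melt(id_vars="datetime", var_name="series_id", value_name="value")
    .set_index(["series_id", "datetime"])
    .sort_index()
)

print(long.head())
```

Given the internal conversion described above, passing a dictionary of series directly may avoid the extra conversion step, although this sketch does not measure that.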

In [1]:
%load_ext autoreload
%autoreload 2
import sys
from pathlib import Path
path = str(Path.cwd().parent)
print(path)
sys.path.insert(1, path)

c:\Users\jaesc2\GitHub\skforecast


In [None]:
# Libraries
# ==============================================================================
import numpy as np
import pandas as pd
import sklearn
import joblib
import skforecast
import platform
import psutil
import plotly.express as px
import plotly.graph_objects as go
from benchmarks.utils import plot_benchmark_results

In [3]:
print(f"Python version: {platform.python_version()}")
print(f"skforecast version: {skforecast.__version__}")
print(f"scikit-learn version: {sklearn.__version__}")
print(f"pandas version: {pd.__version__}")
print(f"numpy version: {np.__version__}")
print(f"Computer network name: {platform.node()}")
print(f"Processor type: {platform.processor()}")
print(f"Platform type: {platform.platform()}")
print(f"Operating system: {platform.system()}")
print(f"Operating system release: {platform.release()}")
print(f"Operating system version: {platform.version()}")
print(f"Number of physical cores: {psutil.cpu_count(logical=False)}")
print(f"Number of logical cores: {psutil.cpu_count(logical=True)}")

Python version: 3.12.11
skforecast version: 0.17.0
scikit-learn version: 1.6.1
pandas version: 2.3.1
numpy version: 2.1.3
Computer network name: ITES015-NB0029
Processor type: Intel64 Family 6 Model 141 Stepping 1, GenuineIntel
Platform type: Windows-11-10.0.26100-SP0
Operating system: Windows
Operating system release: 11
Operating system version: 10.0.26100
Number of physical cores: 8
Number of logical cores: 16


In [4]:
import warnings
warnings.filterwarnings(
    "ignore",
    category=FutureWarning,
    message="'force_all_finite' was renamed to 'ensure_all_finite'"
)

# ForecasterRecursive

In [None]:
display_df = False
selected_date = None
# 'Linux-6.11.0-24-generic-x86_64-with-glibc2.39'
# 'Windows-11-10.0.26100-SP0'
selected_platform = None
python_version = None

results_benchmark_all = joblib.load("./benchmark.joblib")
results_benchmark = results_benchmark_all.query("forecaster_name in ['ForecasterRecursive', 'ForecasterAutoreg']")
results_benchmark = results_benchmark.query("regressor_name == 'DummyRegressor'")
results_benchmark.head(2)

| | forecaster_name | regressor_name | function_name | function_hash | method_name | run_time_avg | run_time_median | run_time_p95 | run_time_std | n_repeats | ... | python_version | skforecast_version | numpy_version | pandas_version | sklearn_version | lightgbm_version | platform | processor | cpu_count | memory_gb |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ForecasterRecursive | DummyRegressor | ForecasterRecursive__create_train_X_y | 59b823f1ff395872fac4f7578bd859fa | _create_train_X_y | 0.004261 | 0.004189 | 0.004373 | 0.000321 | 30 | ... | 3.12.11 | 0.17.0 | 2.1.3 | 2.3.2 | 1.6.1 | 4.6.0 | Linux-6.11.0-1018-azure-x86_64-with-glibc2.39 | x86_64 | 4 | 16.77 |
| 1 | ForecasterRecursive | DummyRegressor | ForecasterRecursive_fit | 9d73eaf5faa980194d715362601eed68 | fit | 0.005308 | 0.005252 | 0.005579 | 0.000161 | 10 | ... | 3.12.11 | 0.17.0 | 2.1.3 | 2.3.2 | 1.6.1 | 4.6.0 | Linux-6.11.0-1018-azure-x86_64-with-glibc2.39 | x86_64 | 4 | 16.77 |
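
To make the timing columns in the table above easier to interpret, here is a minimal sketch of how statistics such as `run_time_avg`, `run_time_median`, `run_time_p95`, `run_time_std`, and `n_repeats` could be computed. The helper `time_function` is hypothetical and is not the harness used to produce `benchmark.joblib`; the use of the sample standard deviation (`ddof=1`) is also an assumption.

```python
import time

import numpy as np


def time_function(func, n_repeats=10):
    """Hypothetical helper: run `func` n_repeats times and summarize timings."""
    times = np.empty(n_repeats)
    for i in range(n_repeats):
        start = time.perf_counter()
        func()
        times[i] = time.perf_counter() - start
    return {
        "run_time_avg": float(times.mean()),
        "run_time_median": float(np.median(times)),
        "run_time_p95": float(np.percentile(times, 95)),
        # Sample standard deviation; the real benchmark may use a different ddof.
        "run_time_std": float(times.std(ddof=1)),
        "n_repeats": n_repeats,
    }


# Usage with a cheap stand-in workload instead of a forecaster method.
stats = time_function(lambda: sum(range(10_000)), n_repeats=30)
print(stats)
```

Reporting the median and the 95th percentile alongside the mean helps separate typical run time from occasional slow runs caused by background load.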


In [52]:
plot_benchmark_results_v2(
    results_benchmark_all,
    forecaster_names=['ForecasterRecursive', 'ForecasterAutoreg'],
    regressors='DummyRegressor',
    add_mean=True,
    add_median=True
)