# Darts benchmark

This notebook showcases tools developed for benchmarking the forecasting models in Darts, which may be useful to other developers.
It covers:

**Auto-ML with Optuna and Ray Tune**: hyperparameter tuning with a predefined, broad parameter space for every supported model<br />
**Illustrate**: a function to visualise the behaviour of different Darts models on a dataset<br />
**Experiment**: a function to cross-compare a list of models on a list of datasets

In [None]:
#@title Install dependencies
!pip install -q git+https://github.com/Loudegaste/darts-benchmark

from darts_benchmark.benchmark_tools import Dataset, experiment, silence_prompt, illustrate
from darts_benchmark.model_evaluation import evaluate_model
from darts_benchmark.model_evaluation import set_randommness
from darts.models import NaiveSeasonal, NHiTSModel, Prophet, NLinearModel
from darts.utils import missing_values
from darts_benchmark.optuna_search import optuna_search
from darts.datasets import AirPassengersDataset, WeatherDataset
from darts.metrics import mae

set_randommness(42)

### Dataset loading
To keep data loading uniform, each dataset must be wrapped in a ```benchmark_tools.Dataset``` named tuple with the following fields:

**name**: Dataset name for display<br />
**series**: A Darts ```TimeSeries``` object<br />
**future_covariates**: (Optional) future covariates used to support the forecast<br />
**past_covariates**: (Optional) past covariates used to support the forecast<br />
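
As a rough illustration of the field list above, a named tuple with this shape could look like the sketch below. This is not the package's actual definition: the type annotations and the ```None``` defaults are assumptions, chosen only because the two covariate fields are described as optional.

```python
from typing import Any, NamedTuple, Optional


class Dataset(NamedTuple):
    """Hypothetical stand-in for benchmark_tools.Dataset (fields from the list above)."""

    name: str  # dataset name used for display
    series: Any  # a Darts TimeSeries in the real package
    future_covariates: Optional[Any] = None  # assumed default
    past_covariates: Optional[Any] = None  # assumed default


# A plain list stands in for a TimeSeries so the sketch runs without Darts.
example = Dataset(name="toy", series=[1.0, 2.0, 3.0])
print(example.name, example.past_covariates)
```

Because the optional fields default to ```None``` in this sketch, a dataset without covariates only needs a name and a series.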


In [None]:
#@title Preparing two datasets: Air passengers and Weather

air_pass_series = missing_values.fill_missing_values(AirPassengersDataset().load())
weather_ds = missing_values.fill_missing_values(WeatherDataset().load().resample("1h"))[
    -1500:
]
weather_past_cov = weather_ds[
    ["p (mbar)", "wv (m/s)", "wd (deg)", "rain (mm)", "raining (s)", "SWDR (W/m²)"]
]
weather_series = weather_ds["T (degC)"]

dataset_air_passengers = Dataset(name="Air passengers", series=air_pass_series)
dataset_weather = Dataset(
    name="Weather", series=weather_series, past_covariates=weather_past_cov
)

###  Hyperparameter search using Optuna and Ray Tune

The function ```optuna_search.optuna_search``` runs a hyperparameter search for the given model on the given dataset. It looks up the hyperparameter space for that model in ```optuna_search/param_space.py```.

In param_space.py, each supported model has:
* **fixed_params_"modelname" function:** it defines the default parameters for the model
* **optuna_params_"modelname" function:** it defines the parameter space for Optuna to search.

To add support for another model, add the two corresponding functions for that model and register them in the dictionary lookup at the end of the param_space.py file.
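
A pair of such functions might look roughly like the sketch below. Everything here is illustrative: the model name ```ExampleModel```, the parameter names and ranges, the function signatures, and the shape of the ```PARAM_SPACES``` lookup are assumptions, not the contents of the real param_space.py. The only external API relied on is Optuna's ```trial.suggest_int``` and ```trial.suggest_float```.

```python
from typing import Any, Callable, Dict


def fixed_params_ExampleModel(**kwargs: Any) -> Dict[str, Any]:
    """Parameters that are held constant during the search (hypothetical values)."""
    return {"n_epochs": 20, "batch_size": 32}


def optuna_params_ExampleModel(trial: Any, **kwargs: Any) -> Dict[str, Any]:
    """Search space: each entry asks the Optuna trial for one sampled value."""
    return {
        "input_chunk_length": trial.suggest_int("input_chunk_length", 1, 48),
        "lr": trial.suggest_float("lr", 1e-5, 1e-1, log=True),
    }


# Hypothetical lookup keyed by model name; the real file may be organised differently.
PARAM_SPACES: Dict[str, Dict[str, Callable[..., Dict[str, Any]]]] = {
    "ExampleModel": {
        "fixed": fixed_params_ExampleModel,
        "optuna": optuna_params_ExampleModel,
    },
}
```

Inside an Optuna objective, the two dictionaries would typically be merged, for example with ```{**fixed, **sampled}```, before the model is built.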

In [None]:
#@title Hyperparameter search using optuna and raytune

silence_prompt()
model_class = NLinearModel
time_budget = 60
forecast_horizon = 5
split = 0.75

best_params_60sec = optuna_search(
    model_class=model_class, 
    dataset=dataset_air_passengers,
    time_budget=time_budget,
    forecast_horizon=forecast_horizon,
    scale_data=True,
)

# display the output after hyperparameter search
error_60sec, forecasts_60sec = evaluate_model( # type: ignore
    model_class, dataset_air_passengers.series,
    model_params=best_params_60sec,
    forecast_horizon=forecast_horizon,
    split=split,
    get_output_sample=True,
)

error_default, forecasts_default = evaluate_model( # type: ignore
    model_class, dataset_air_passengers.series,
    model_params=None,
    forecast_horizon=forecast_horizon,
    split=split,
    get_output_sample=True,
)

dataset_air_passengers.series.plot(label="actual")
forecasts_default.plot(label=f"prediction_default\nerror={error_default:.2f}")
forecasts_60sec.plot(label=f"prediction_60sec\nerror={error_60sec:.2f}")

###  Model comparison

To compare the performance of different models on a dataset visually, the ```benchmark_tools.illustrate``` function fits and plots multiple models in a standardized way. With ```grid_search=True``` and an integer ```time_budget```, the ```illustrate``` function first performs an Optuna hyperparameter search for each model.

In [None]:
#@title Model comparison
illustrate(models=[NaiveSeasonal, NHiTSModel, Prophet, NLinearModel], 
           dataset=dataset_air_passengers, 
           forecast_horizon=5, grid_search=False, 
           silent_search=True)

###  Experiment

To run a full-scale comparison of multiple models on multiple datasets, the ```benchmark_tools.experiment``` function fits and evaluates a list of models on a list of datasets. With ```grid_search=True``` and an integer ```time_budget```, the ```experiment``` function first performs an Optuna hyperparameter search for each model/dataset combination.
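
The call in the next cell passes ```forecast_horizon=0.01```, meaning 1% of the series length. How the library converts that fraction into a number of steps is not shown in this notebook; the sketch below is one plausible reading. The helper name ```resolve_forecast_horizon```, the rounding rule and the one-step minimum are all assumptions.

```python
def resolve_forecast_horizon(forecast_horizon, series_length):
    """Return the horizon as a whole number of steps (hypothetical helper)."""
    if isinstance(forecast_horizon, float):
        if not 0 < forecast_horizon < 1:
            raise ValueError("a fractional horizon must lie strictly between 0 and 1")
        # Interpret the float as a share of the series, rounded to whole steps,
        # and never return fewer than one step.
        return max(1, round(forecast_horizon * series_length))
    # Integers are taken as an explicit number of steps.
    return int(forecast_horizon)


print(resolve_forecast_horizon(0.01, 1500))  # fractional horizon on a 1500-point series
print(resolve_forecast_horizon(5, 144))  # explicit horizon, returned unchanged
```

Under this reading, a short series and a long one in the same experiment get horizons proportional to their lengths instead of one fixed step count.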

In [None]:
results = experiment(
    list_datasets=[dataset_air_passengers, dataset_weather],
    models=[NaiveSeasonal, NHiTSModel, Prophet],
    grid_search=False,
    forecast_horizon=0.01, # the forecast horizon will be set to 1% of the time series length
    repeat=3,
    silent_search=True,
)

print("\n\n", results)