make baseline global #2016
Conversation
Thanks for this @SimTheGreat, it looks like a very good start to make the baseline models global 🚀

I have some notes about the general idea behind some of the model methods:

- Baseline models don't support past or future covariates. The Darts encoders generate covariates that can be used by the models, so for the baseline models we can simply return `(None, None, ...)` in `_model_encoder_settings`.
- The `extreme_lags` property gives the expected input boundaries for the target series and covariates. Since baseline models don't support covariates, we can simply use the same logic from `LocalForecastingModel.extreme_lags` for all models.
From here on I have a couple of suggestions on how to further enhance this:

- We could write a new parent `BaselineModel` class that all baseline models inherit from. This class takes care of:
  - fit logic that is shared across models
  - predict logic that is shared across models
  - encoder handling
  - extreme lags handling

I have written an example below of such a parent `BaselineModel` class and applied it to the `NaiveMean` model.
Could you try to adapt the other models (except `NaiveEnsemble`) in a similar way?
```python
from abc import ABC, abstractmethod
from typing import List, Optional, Sequence, Tuple, Union

import numpy as np

from darts.logging import get_logger, raise_log
from darts.models.forecasting.forecasting_model import GlobalForecastingModel
from darts.timeseries import TimeSeries
from darts.utils.utils import seq2series, series2seq

logger = get_logger(__name__)


class BaselineModel(GlobalForecastingModel, ABC):
    def __init__(self):
        super().__init__(add_encoders=None)

    def fit(self, series: Union[TimeSeries, Sequence[TimeSeries]]) -> "BaselineModel":
        """Fit/train the model on a (or potentially multiple) series.

        This method is only implemented for naive baseline models to provide a unified fit/predict API with
        other forecasting models.

        The models are not really trained on the input, but they store the training `series` in case only a
        single `TimeSeries` was passed. This allows to call `predict()` without having to pass the single
        `series`. All baseline models compute the forecasts for each series directly when calling `predict()`.

        Parameters
        ----------
        series
            One or several target time series. The model will be trained to forecast these time series.
            The series may or may not be multivariate, but if multiple series are provided
            they must have the same number of components.

        Returns
        -------
        self
            Fitted model.
        """
        series = seq2series(series)
        super().fit(series=series)
        self._fit_model(series=series)
        return self

    @abstractmethod
    def _fit_model(self, series: Union[TimeSeries, Sequence[TimeSeries]]) -> None:
        """Must implement the fit logic and checks for each sub model."""
        pass

    def predict(
        self,
        n: int,
        series: Optional[Union[TimeSeries, Sequence[TimeSeries]]] = None,
        num_samples: int = 1,
    ) -> Union[TimeSeries, Sequence[TimeSeries]]:
        """Forecasts values for `n` time steps after the end of the series.

        If :func:`fit()` has been called with only one ``TimeSeries`` as argument, then the `series`
        argument of this function is optional, and it will simply produce the next `n` time steps forecast.

        If :func:`fit()` has been called with `series` specified as a ``Sequence[TimeSeries]`` (i.e., the
        model has been trained on multiple time series), the `series` argument must be specified.

        When the `series` argument is specified, this function will compute the next `n` time steps
        forecasts for the single series (or for each series in the sequence) given by `series`.

        Parameters
        ----------
        n
            Forecast horizon - the number of time steps after the end of the series for which to produce
            predictions.
        series
            The series whose future values will be predicted.
        num_samples
            Number of times a prediction is sampled from a probabilistic model. Must be `1` for these
            deterministic baseline models.

        Returns
        -------
        Union[TimeSeries, Sequence[TimeSeries]]
            If `series` is not specified, this function returns a single time series containing the `n`
            next points after the end of the training series.
            If `series` is given and is a simple ``TimeSeries``, this function returns the `n` next points
            after the end of `series`.
            If `series` is given and is a sequence of several time series, this function returns
            a sequence where each element contains the corresponding `n` points forecasts.
        """
        if series is None:
            # then there must be a single TS, and that was saved in super().fit as self.training_series
            if self.training_series is None:
                raise_log(
                    ValueError(
                        "Input `series` must be provided. This is the result either from fitting on "
                        "multiple series, or from not having fit the model yet."
                    ),
                    logger,
                )
            series = self.training_series

        called_with_single_series = isinstance(series, TimeSeries)
        series = series2seq(series)
        super().predict(n=n, series=series, num_samples=num_samples)
        predictions = self._predict(n=n, series=series, num_samples=num_samples)
        return predictions[0] if called_with_single_series else predictions

    @abstractmethod
    def _predict(
        self,
        n: int,
        series: Optional[Sequence[TimeSeries]] = None,
        num_samples: int = 1,
    ) -> Sequence[TimeSeries]:
        """Must implement the prediction logic for each sub model."""
        pass

    @property
    def extreme_lags(
        self,
    ) -> Tuple[
        Optional[int],
        Optional[int],
        Optional[int],
        Optional[int],
        Optional[int],
        Optional[int],
    ]:
        return -self.min_train_series_length, -1, None, None, None, None

    @property
    def _model_encoder_settings(
        self,
    ) -> Tuple[
        Optional[int],
        Optional[int],
        bool,
        bool,
        Optional[List[int]],
        Optional[List[int]],
    ]:
        """Baseline models do not support covariates and therefore also no encoders."""
        return None, None, False, False, None, None

    @property
    def supports_multivariate(self) -> bool:
        return True


class NaiveMean(BaselineModel):
    def __init__(self):
        """Naive Mean Model

        This model has no parameter, and always predicts the
        mean value of the training series.

        Examples
        --------
        >>> from darts.datasets import AirPassengersDataset
        >>> from darts.models import NaiveMean
        >>> series = AirPassengersDataset().load()
        >>> model = NaiveMean()
        >>> model.fit(series)
        >>> pred = model.predict(6)
        >>> pred.values()
        array([[280.29861111],
               [280.29861111],
               [280.29861111],
               [280.29861111],
               [280.29861111],
               [280.29861111]])
        """
        super().__init__()

    def _fit_model(self, series: Union[TimeSeries, Sequence[TimeSeries]]) -> None:
        super()._fit_model(series)

    def _predict(
        self,
        n: int,
        series: Optional[Sequence[TimeSeries]] = None,
        num_samples: int = 1,
    ) -> Sequence[TimeSeries]:
        predictions = []
        for series_ in series:
            mean_val = np.mean(series_.values(copy=False), axis=0)
            predictions.append(
                self._build_forecast_series(
                    points_preds=np.tile(mean_val, (n, 1)),
                    input_series=series_,
                )
            )
        return predictions
```
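The core of `NaiveMean`'s prediction is just tiling the per-component mean; a minimal numpy sketch of that math, independent of darts (the function name `naive_mean_forecast` is made up for illustration):

```python
import numpy as np

def naive_mean_forecast(values: np.ndarray, n: int) -> np.ndarray:
    """Repeat the per-component mean of a (time, components) array n times."""
    mean_val = values.mean(axis=0)    # shape: (components,)
    return np.tile(mean_val, (n, 1))  # shape: (n, components)

# bivariate toy series: two components over four time steps
series = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
forecast = naive_mean_forecast(series, n=3)
# every forecast row equals the column means [2.5, 25.0]
```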
Very nice, thanks a lot @SimTheGreat, this looks great!
Only added some minor suggestions.

Before merging, we should:

- add the `_fit_wrapper` and `_predict_wrapper` methods without past/future covariates support to the `BaselineModel` class to fix the current errors
- add this PR/improvement to the `CHANGELOG.md` file
- `README.md`:
  - change the `LocalForecastingModel` link to `GlobalForecastingModel` for the naive baselines in the forecasting model table
  - add multivariate and multiple target series support for baseline models
- `darts/docs/userguide/covariates.md`:
  - update the text for global forecasting models with baseline models
  - move the baseline models in the model table to the global forecasting models
- add some tests for the naive models now being global. I think these tests, adapted for the naive models, should be fine:
  - `darts.tests.models.forecasting.test_global_forecasting_model.test_single_ts()`
  - `darts.tests.models.forecasting.test_global_forecasting_model.test_multi_ts()`
  - `darts.tests.models.forecasting.test_global_forecasting_model.test_prediction_with_different_n()`

Let me know if you need help with anything
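The single- vs multi-series contract those tests should pin down can be sketched without darts at all. `TinyNaiveMean` below is a hypothetical stand-in, not the darts class, and plain numpy arrays stand in for `TimeSeries`:

```python
import numpy as np

class TinyNaiveMean:
    """Stand-in for a global naive model; mimics the fit/predict API shape only."""

    def fit(self, series):
        # store whatever was passed (single array or list of arrays)
        self.training_series = series
        return self

    def predict(self, n, series=None):
        if series is None:
            series = self.training_series
        single = not isinstance(series, list)
        seqs = [series] if single else series
        preds = [np.tile(s.mean(axis=0), (n, 1)) for s in seqs]
        # single input -> single output, sequence input -> sequence output
        return preds[0] if single else preds

def test_single_ts():
    pred = TinyNaiveMean().fit(np.ones((5, 1))).predict(n=3)
    assert pred.shape == (3, 1)

def test_multi_ts():
    preds = TinyNaiveMean().fit([np.ones((5, 1)), np.full((4, 1), 2.0)]).predict(n=3)
    assert len(preds) == 2 and np.allclose(preds[1], 2.0)
```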
```diff
- return self._build_forecast_series(forecast)
+ predictions = []
+ for series_ in series:
+     last_k_vals = series_.values(copy=False)[-self.K :, :]
```
We should check that each series has at least `K` points, as done previously in `fit()`.
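A minimal numpy sketch of such a length check before slicing the last `K` values (the function name and error message are made up; only the `[-k:, :]` slice follows the quoted code):

```python
import numpy as np

def last_k_values(values: np.ndarray, k: int) -> np.ndarray:
    """Return the last k rows of a (time, components) array, validating the length first."""
    if values.shape[0] < k:
        raise ValueError(f"series needs at least K={k} points, got {values.shape[0]}")
    return values[-k:, :]

series = np.arange(10.0).reshape(5, 2)  # 5 time steps, 2 components
tail = last_k_values(series, k=3)       # last 3 time steps
```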
```python
series = series2seq(series)
super().predict(n=n, series=series, num_samples=num_samples)
predictions = self._predict(n=n, series=series, num_samples=num_samples)
```
Nice :) It looks like all models now use the same for loop to go over all input series.
We can put the for loop here into the base class:

```diff
- predictions = self._predict(n=n, series=series, num_samples=num_samples)
+ predictions = []
+ for series_ in series:
+     predictions.append(
+         self._build_forecast_series(
+             points_preds=self._predict(
+                 n=n, series=series_, num_samples=num_samples
+             ),
+             input_series=series_,
+         )
+     )
```

And then `_predict()` from all models can be adapted to act on a single `TimeSeries` and return a single `np.ndarray` (see comment below).
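Reduced to a plain-numpy sketch, the proposed split looks like this (class and method names here are stand-ins, not darts code): the parent owns the per-series loop, and each subclass only produces one forecast array per series.

```python
from abc import ABC, abstractmethod
import numpy as np

class BaselineSketch(ABC):
    """Parent class owns the loop over input series (illustrative stand-in)."""

    def predict_all(self, n, series_list):
        # one (n, components) forecast array per input series, from the subclass hook
        return [self._predict(n, s) for s in series_list]

    @abstractmethod
    def _predict(self, n: int, series: np.ndarray) -> np.ndarray:
        """Forecast a single series as an (n, components) array."""

class MeanSketch(BaselineSketch):
    def _predict(self, n, series):
        return np.tile(series.mean(axis=0), (n, 1))

preds = MeanSketch().predict_all(2, [np.ones((3, 1)), np.zeros((4, 1))])
```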
```python
@abstractmethod
def _predict(
    self, n: int, series: Sequence[TimeSeries] = None, num_samples: int = 1
) -> Sequence[TimeSeries]:
```
```diff
 @abstractmethod
 def _predict(
-    self, n: int, series: Sequence[TimeSeries] = None, num_samples: int = 1
-) -> Sequence[TimeSeries]:
+    self, n: int, series: TimeSeries = None, num_samples: int = 1
+) -> np.ndarray:
```
""" | ||
series = seq2series(series) | ||
super().fit(series=series) | ||
self._fit_model(series=series) |
It looks like none of the models actually use the `_fit_model` method, so we can remove the method and save some lines :)

```diff
- self._fit_model(series=series)
```
```python
@abstractmethod
def _fit_model(self, series: Union[TimeSeries, Sequence[TimeSeries]]) -> None:
    """Must implement the fit logic and checks for each sub model."""
    pass
```
```diff
- @abstractmethod
- def _fit_model(self, series: Union[TimeSeries, Sequence[TimeSeries]]) -> None:
-     """Must implement the fit logic and checks for each sub model."""
-     pass
```
```python
@abstractmethod
def _fit_model(self, series: Union[TimeSeries, Sequence[TimeSeries]]) -> None:
    """Must implement the fit logic and checks for each sub model."""
    pass
```
I think you don't have to write the `pass` (I'm sure for 3.9+ but check for 3.8 :-) )
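Indeed: a docstring is itself a valid function body on every Python 3 version (3.8 included), so the `pass` is redundant. A quick self-contained check:

```python
from abc import ABC, abstractmethod

class Model(ABC):
    @abstractmethod
    def _fit_model(self, series):
        """Must implement the fit logic for each sub model."""
        # no `pass` needed: the docstring alone is a valid body

class Concrete(Model):
    def _fit_model(self, series):
        return series

fitted = Concrete()._fit_model([1, 2])
```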
```python
@abstractmethod
def _predict(
    self, n: int, series: Sequence[TimeSeries] = None, num_samples: int = 1
```
Put a comma at the end; in this way the method looks like the overridden ones.
```python
for series_ in series:
    first, last = (
```
`first, last = series_.first_values(), series_.last_values()`
```python
chunk_length = self.input_chunk_length
for i in range(chunk_length, chunk_length + n):
    prediction = rolling_sum / chunk_length
```
This is a constant, so you can put it outside of the for loop. Or is it an error?

Closing this one, since #2261 was merged.
Fixes #2002

Summary

Makes all baseline models global.