Fix/historical forecasts torch models #2329

Merged · 8 commits · Apr 16, 2024
8 changes: 6 additions & 2 deletions CHANGELOG.md
@@ -90,14 +90,18 @@ but cannot always guarantee backwards compatibility. Changes that may **break co
- Improvements to `RegressionModel`: [#2320](https://github.com/unit8co/darts/pull/2320) by [Felix Divo](https://github.com/felixdivo).
- Added a progress bar when performing optimized historical forecasts (`retrain=False` and no autoregression) to display the series-level progress.
- Improvements to `DataTransformer`: [#2267](https://github.com/unit8co/darts/pull/2267) by [Alicja Krzeminska-Sciga](https://github.com/alicjakrzeminska).
- `InvertibleDataTransformer` now supports parallelized inverse transformation for `series` being a list of lists of `TimeSeries` (`Sequence[Sequence[TimeSeries]]`). This `series` type represents for example the output from `historical_forecasts()` when using multiple series.
- Improvements to `RNNModel`: [#2329](https://github.com/unit8co/darts/pull/2329) by [Dennis Bader](https://github.com/dennisbader).
- 🔴 Enforce `training_length>=input_chunk_length`, since otherwise, during training, the model is never run for as many iterations as it will be during prediction (see the sketch below this list).
- Historical forecasts now correctly infer all possible prediction start points for untrained and pre-trained `RNNModel`.
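A minimal sketch of the newly enforced constraint, based on the validation added in `rnn_model.py` further down in this diff (illustrative values only):

```python
from darts.models import RNNModel

# too short: raises ValueError("`training_length` (1) must be `>=input_chunk_length` (2).")
try:
    RNNModel(input_chunk_length=2, training_length=1)
except ValueError as exc:
    print(exc)

# an equal (or larger) training_length is accepted
model = RNNModel(input_chunk_length=2, training_length=2)
```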

**Fixed**
- Fixed a bug in `quantile_loss`, where the loss was computed on all samples rather than only on the predicted quantiles. [#2284](https://github.com/unit8co/darts/pull/2284) by [Dennis Bader](https://github.com/dennisbader).
- Fixed type hint warning "Unexpected argument" when calling `historical_forecasts()` caused by the `_with_sanity_checks` decorator. The type hinting is now properly configured to expect any input arguments and return the output type of the method for which the sanity checks are performed. [#2286](https://github.com/unit8co/darts/pull/2286) by [Dennis Bader](https://github.com/dennisbader).
- Fixed the order of the features when using component-wise lags so that they are grouped by values, then by components (before, they were grouped by components, then by values). [#2272](https://github.com/unit8co/darts/pull/2272) by [Antoine Madrona](https://github.com/madtoinou).
- Fixed a segmentation fault that some users were facing when importing a `LightGBMModel`. [#2304](https://github.com/unit8co/darts/pull/2304) by [Dennis Bader](https://github.com/dennisbader).
- Fixed a bug when using a dropout with a `TorchForecasting` and pytorch lightning versions >= 2.2.0, where the dropout was not properly activated during training. [#2312](https://github.com/unit8co/darts/pull/2312) by [Dennis Bader](https://github.com/dennisbader).
- Fixed a bug when using a dropout with a `TorchForecastingModel` and pytorch lightning versions >= 2.2.0, where the dropout was not properly activated during training. [#2312](https://github.com/unit8co/darts/pull/2312) by [Dennis Bader](https://github.com/dennisbader).
- Fixed a bug when performing historical forecasts with an untrained `TorchForecastingModel` and using covariates, where the historical forecastable time index generation did not take the covariates into account. [#2329](https://github.com/unit8co/darts/pull/2329) by [Dennis Bader](https://github.com/dennisbader).

**Dependencies**

3 changes: 2 additions & 1 deletion darts/models/forecasting/ensemble_model.py
@@ -402,6 +402,7 @@ def extreme_lags(
Optional[int],
Optional[int],
int,
Optional[int],
]:
def find_max_lag_or_none(lag_id, aggregator) -> Optional[int]:
max_lag = None
@@ -413,7 +414,7 @@ def find_max_lag_or_none(lag_id, aggregator) -> Optional[int]:
max_lag = aggregator(max_lag, curr_lag)
return max_lag

lag_aggregators = (min, max, min, max, min, max, max)
lag_aggregators = (min, max, min, max, min, max, max, max)
return tuple(
find_max_lag_or_none(i, agg) for i, agg in enumerate(lag_aggregators)
)
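A self-contained sketch of the aggregation above (standalone Python, not the darts API; the 8-tuples are hypothetical `extreme_lags` values). Each tuple position has its own aggregator, and models whose entry is `None` (an unused lag type) are skipped:

```python
from typing import List, Optional, Tuple

def combine_extreme_lags(
    lags_per_model: List[Tuple[Optional[int], ...]],
) -> Tuple[Optional[int], ...]:
    # one aggregator per position: minima for the "min" lags, maxima for the
    # "max" lags, the output shift, and the new train-time target lag
    aggregators = (min, max, min, max, min, max, max, max)
    combined = []
    for i, agg in enumerate(aggregators):
        values = [lags[i] for lags in lags_per_model if lags[i] is not None]
        combined.append(agg(values) if values else None)
    return tuple(combined)

# e.g. a target-only model combined with one that also uses past covariates
m1 = (-3, 1, None, None, None, None, 0, None)
m2 = (-5, 6, -4, -1, None, None, 0, None)
print(combine_extreme_lags([m1, m2]))  # (-5, 6, -4, -1, None, None, 0, None)
```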
30 changes: 18 additions & 12 deletions darts/models/forecasting/forecasting_model.py
@@ -446,12 +446,13 @@ def extreme_lags(
Optional[int],
Optional[int],
int,
Optional[int],
]:
"""
A 7-tuple containing in order:
An 8-tuple containing in order:
(min target lag, max target lag, min past covariate lag, max past covariate lag, min future covariate
lag, max future covariate lag, output shift). If 0 is the index of the first prediction, then all lags are
relative to this index.
lag, max future covariate lag, output shift, max target lag train (only for RNNModel)). If 0 is the index of the
first prediction, then all lags are relative to this index.

See examples below.

@@ -474,27 +475,27 @@
>>> model = LinearRegressionModel(lags=3, output_chunk_length=2)
>>> model.fit(train_series)
>>> model.extreme_lags
(-3, 1, None, None, None, None, 0)
(-3, 1, None, None, None, None, 0, None)
>>> model = LinearRegressionModel(lags=3, output_chunk_length=2, output_chunk_shift=2)
>>> model.fit(train_series)
>>> model.extreme_lags
(-3, 1, None, None, None, None, 2)
(-3, 1, None, None, None, None, 2, None)
>>> model = LinearRegressionModel(lags=[-3, -5], lags_past_covariates = 4, output_chunk_length=7)
>>> model.fit(train_series, past_covariates=past_covariates)
>>> model.extreme_lags
(-5, 6, -4, -1, None, None, 0)
(-5, 6, -4, -1, None, None, 0, None)
>>> model = LinearRegressionModel(lags=[3, 5], lags_future_covariates = [4, 6], output_chunk_length=7)
>>> model.fit(train_series, future_covariates=future_covariates)
>>> model.extreme_lags
(-5, 6, None, None, 4, 6, 0)
(-5, 6, None, None, 4, 6, 0, None)
>>> model = NBEATSModel(input_chunk_length=10, output_chunk_length=7)
>>> model.fit(train_series)
>>> model.extreme_lags
(-10, 6, None, None, None, None, 0)
(-10, 6, None, None, None, None, 0, None)
>>> model = NBEATSModel(input_chunk_length=10, output_chunk_length=7, lags_future_covariates=[4, 6])
>>> model.fit(train_series, future_covariates)
>>> model.extreme_lags
(-10, 6, None, None, 4, 6, 0)
(-10, 6, None, None, 4, 6, 0, None)
"""

@property
@@ -510,10 +511,13 @@ def _training_sample_time_index_length(self) -> int:
min_future_cov_lag,
max_future_cov_lag,
output_chunk_shift,
max_target_lag_train,
) = self.extreme_lags

# some models can have different output chunks for training and prediction (e.g. `RNNModel`)
output_lag = max_target_lag_train or max_target_lag
return max(
max_target_lag + 1,
output_lag + 1,
max_future_cov_lag + 1 if max_future_cov_lag else 0,
) - min(
min_target_lag if min_target_lag else 0,
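A quick sanity check of the `output_lag` fallback above (hypothetical numbers; this assumes the truncated `min(...)` also spans the past and future covariate lags, as in the full source):

```python
# extreme_lags of an RNNModel-like model with input_chunk_length=10 and
# training_length=20 (see rnn_model.py further down in this diff)
min_t, max_t, min_pc, max_pc, min_fc, max_fc, shift, max_t_train = (
    -10, 0, None, None, -10, 0, 0, 10,
)
output_lag = max_t_train or max_t  # 10: the training output window is longer
length = max(output_lag + 1, max_fc + 1 if max_fc else 0) - min(
    min_t if min_t else 0,
    min_pc if min_pc else 0,
    min_fc if min_fc else 0,
)
assert length == 21  # == training_length + 1 == RNNModel.min_train_series_length
```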
@@ -2452,12 +2456,13 @@ def extreme_lags(
Optional[int],
Optional[int],
int,
Optional[int],
]:
# TODO: LocalForecastingModels do not yet handle extreme lags properly, especially
# TransferableFutureCovariatesLocalForecastingModel, where fit and predict modes differ.
# In general, local models train on the entire series (input=output), unlike global
# models, which use an input to predict an output.
return -self.min_train_series_length, -1, None, None, None, None, 0
return -self.min_train_series_length, -1, None, None, None, None, 0, None

@property
def supports_transferrable_series_prediction(self) -> bool:
@@ -2927,12 +2932,13 @@ def extreme_lags(
Optional[int],
Optional[int],
int,
Optional[int],
]:
# TODO: LocalForecastingModels do not yet handle extreme lags properly, especially
# TransferableFutureCovariatesLocalForecastingModel, where fit and predict modes differ.
# In general, local models train on the entire series (input=output), unlike global
# models, which use an input to predict an output.
return -self.min_train_series_length, -1, None, None, 0, 0, 0
return -self.min_train_series_length, -1, None, None, 0, 0, 0, None


class TransferableFutureCovariatesLocalForecastingModel(
3 changes: 0 additions & 3 deletions darts/models/forecasting/global_baseline_models.py
@@ -229,9 +229,6 @@ def _verify_predict_sample(self, predict_sample: Tuple):
# have to match the training sample
pass

def min_train_series_length(self) -> int:
return self.input_chunk_length

def supports_likelihood_parameter_prediction(self) -> bool:
return False

5 changes: 3 additions & 2 deletions darts/models/forecasting/regression_ensemble_model.py
@@ -316,9 +316,9 @@ def fit(
# shift by the forecasting models' largest input length
all_shifts = []
# when it's not clearly defined, extreme_lags returns
# min_train_serie_length for the LocalForecastingModels
# `min_train_series_length` for the LocalForecastingModels
for model in self.forecasting_models:
min_target_lag, _, _, _, _, _, _ = model.extreme_lags
min_target_lag, _, _, _, _, _, _, _ = model.extreme_lags
if min_target_lag is not None:
all_shifts.append(-min_target_lag)
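A minimal sketch (standalone Python, hypothetical tuples) of the shift computation above: each model's most negative target lag flips into a positive shift, and the comment's "largest input length" means the maximum of those shifts:

```python
extreme_lags_per_model = [
    (-3, 1, None, None, None, None, 0, None),  # e.g. a lags=3 regression model
    (-10, 0, None, None, -10, 0, 0, 10),       # e.g. an RNNModel
]
all_shifts = []
for lags in extreme_lags_per_model:
    min_target_lag, _, _, _, _, _, _, _ = lags
    if min_target_lag is not None:
        all_shifts.append(-min_target_lag)
print(max(all_shifts))  # 10: shift by the forecasting models' largest input length
```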

@@ -459,6 +459,7 @@ def extreme_lags(
Optional[int],
Optional[int],
int,
Optional[int],
]:
extreme_lags_ = super().extreme_lags
# shift min_target_lag in the past to account for the regression model training set
2 changes: 2 additions & 0 deletions darts/models/forecasting/regression_model.py
@@ -449,6 +449,7 @@ def extreme_lags(
Optional[int],
Optional[int],
int,
Optional[int],
]:
min_target_lag = self.lags["target"][0] if "target" in self.lags else None
max_target_lag = self.output_chunk_length - 1 + self.output_chunk_shift
@@ -464,6 +465,7 @@
min_future_cov_lag,
max_future_cov_lag,
self.output_chunk_shift,
None,
)

@property
37 changes: 34 additions & 3 deletions darts/models/forecasting/rnn_model.py
@@ -321,9 +321,9 @@ def __init__(
Fraction of neurons affected by Dropout.
training_length
The length of both input (target and covariates) and output (target) time series used during
training. Generally speaking, `training_length` should have a higher value than `input_chunk_length`
because otherwise during training the RNN is never run for as many iterations as it will during
inference. For more information on this parameter, please see `darts.utils.data.ShiftedDataset`
training. Must be at least `input_chunk_length`, because otherwise during training
the RNN is never run for as many iterations as it will during inference. For more information on
this parameter, please see `darts.utils.data.ShiftedDataset`.
**kwargs
Optional arguments to initialize the pytorch_lightning.Module, pytorch_lightning.Trainer, and
Darts' :class:`TorchForecastingModel`.
@@ -485,6 +485,13 @@ def encode_year(idx):
`RNN example notebook <https://unit8co.github.io/darts/examples/04-RNN-examples.html>`_ presents techniques
that can be used to improve the forecasts quality compared to this simple usage example.
"""
if training_length < input_chunk_length:
raise_log(
ValueError(
f"`training_length` ({training_length}) must be `>=input_chunk_length` ({input_chunk_length})."
),
logger=logger,
)
# create copy of model parameters
model_kwargs = {key: val for key, val in self.model_params.items()}

@@ -585,3 +592,27 @@ def supports_multivariate(self) -> bool:
@property
def min_train_series_length(self) -> int:
return self.training_length + 1

@property
def extreme_lags(
self,
) -> Tuple[
Optional[int],
Optional[int],
Optional[int],
Optional[int],
Optional[int],
Optional[int],
int,
Optional[int],
]:
return (
-self.input_chunk_length,
self.output_chunk_length - 1,
None,
None,
-self.input_chunk_length,
self.output_chunk_length - 1,
self.output_chunk_shift,
self.training_length - self.input_chunk_length,
)
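For illustration (assuming `RNNModel`'s fixed internal `output_chunk_length` of 1 and no output shift), a model with `input_chunk_length=10` and `training_length=20` would report:

```python
from darts.models import RNNModel

model = RNNModel(input_chunk_length=10, training_length=20)
print(model.extreme_lags)
# (-10, 0, None, None, -10, 0, 0, 10)
# the new last entry (training_length - input_chunk_length = 10) is the extra
# target history needed during training compared to prediction
```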
54 changes: 24 additions & 30 deletions darts/models/forecasting/torch_forecasting_model.py
@@ -2494,15 +2494,17 @@ def extreme_lags(
Optional[int],
Optional[int],
int,
Optional[int],
]:
return (
-self.input_chunk_length,
self.output_chunk_length - 1 + self.output_chunk_shift,
-self.input_chunk_length if self.uses_past_covariates else None,
-1 if self.uses_past_covariates else None,
-self.input_chunk_length,
-1,
None,
None,
self.output_chunk_shift,
None,
)


@@ -2583,19 +2585,17 @@ def extreme_lags(
Optional[int],
Optional[int],
int,
Optional[int],
]:
return (
-self.input_chunk_length,
self.output_chunk_length - 1 + self.output_chunk_shift,
None,
None,
self.output_chunk_shift if self.uses_future_covariates else None,
(
self.output_chunk_length - 1 + self.output_chunk_shift
if self.uses_future_covariates
else None
),
self.output_chunk_shift,
self.output_chunk_length - 1 + self.output_chunk_shift,
self.output_chunk_shift,
None,
)


@@ -2677,19 +2677,17 @@ def extreme_lags(
Optional[int],
Optional[int],
int,
Optional[int],
]:
return (
-self.input_chunk_length,
self.output_chunk_length - 1 + self.output_chunk_shift,
None,
None,
-self.input_chunk_length if self.uses_future_covariates else None,
(
self.output_chunk_length - 1 + self.output_chunk_shift
if self.uses_future_covariates
else None
),
-self.input_chunk_length,
self.output_chunk_length - 1 + self.output_chunk_shift,
self.output_chunk_shift,
None,
)


@@ -2771,19 +2769,17 @@ def extreme_lags(
Optional[int],
Optional[int],
int,
Optional[int],
]:
return (
-self.input_chunk_length,
self.output_chunk_length - 1 + self.output_chunk_shift,
-self.input_chunk_length if self.uses_past_covariates else None,
-1 if self.uses_past_covariates else None,
-self.input_chunk_length if self.uses_future_covariates else None,
(
self.output_chunk_length - 1 + self.output_chunk_shift
if self.uses_future_covariates
else None
),
-self.input_chunk_length,
-1,
-self.input_chunk_length,
self.output_chunk_length - 1 + self.output_chunk_shift,
self.output_chunk_shift,
None,
)

def predict(
@@ -2922,17 +2918,15 @@ def extreme_lags(
Optional[int],
Optional[int],
int,
Optional[int],
]:
return (
-self.input_chunk_length,
self.output_chunk_length - 1 + self.output_chunk_shift,
-self.input_chunk_length if self.uses_past_covariates else None,
-1 if self.uses_past_covariates else None,
self.output_chunk_shift if self.uses_future_covariates else None,
(
self.output_chunk_length - 1 + self.output_chunk_shift
if self.uses_future_covariates
else None
),
-self.input_chunk_length,
-1,
self.output_chunk_shift,
self.output_chunk_length - 1 + self.output_chunk_shift,
self.output_chunk_shift,
None,
)
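To see the effect of the `uses_past_covariates` guards across these mixins, a hedged example (assuming `NBEATSModel`, a past-covariates torch model, running on a darts build that includes this PR):

```python
from darts.models import NBEATSModel
from darts.utils.timeseries_generation import linear_timeseries

series = linear_timeseries(length=30)
past_cov = linear_timeseries(length=30)

# fit without covariates: the past-covariate lag entries stay None
model = NBEATSModel(input_chunk_length=10, output_chunk_length=7, n_epochs=1)
model.fit(series)
print(model.extreme_lags)  # (-10, 6, None, None, None, None, 0, None)

# fit with past covariates: their lags now appear in extreme_lags
model = NBEATSModel(input_chunk_length=10, output_chunk_length=7, n_epochs=1)
model.fit(series, past_covariates=past_cov)
print(model.extreme_lags)  # (-10, 6, -10, -1, None, None, 0, None)
```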
19 changes: 19 additions & 0 deletions darts/tests/models/forecasting/test_RNN.py
@@ -55,6 +55,25 @@ class TestRNNModel:
dropout=0,
)

def test_training_length_input(self):
# too small training length
with pytest.raises(ValueError) as msg:
RNNModel(input_chunk_length=2, training_length=1)
assert (
str(msg.value)
== "`training_length` (1) must be `>=input_chunk_length` (2)."
)

# training_length >= input_chunk_length works
model = RNNModel(
input_chunk_length=2,
training_length=2,
n_epochs=1,
random_state=42,
**tfm_kwargs,
)
model.fit(self.series[:3])

def test_creation(self):
# cannot choose any string
with pytest.raises(ValueError) as msg:
Expand Down