
[ENH] Support Global Time Series Forecasting #4651

benHeid opened this issue May 27, 2023 · 12 comments
Labels: API design (API design & software architecture), enhancement (Adding new functionality), module:forecasting (forecasting module, incl. probabilistic and hierarchical forecasting)

Comments

benHeid (Contributor) commented May 27, 2023

Is your feature request related to a problem? Please describe.
In deep learning-based time series forecasting, a model is often trained on multiple time series (e.g. the Hourly subset of the M4 competition or the UCI ElectricityLoadDiagramDataset) and then applied to several time series, so that only one model exists for all time series. Furthermore, the set of time series for which predictions are made does not have to be equal to the set of time series on which the model was trained.

Describe the solution you'd like
Based on this consideration, I would like to make forecasts for time series on which the model has not been fitted. Based on a proposal by @fkiraly and @ahmedgc in the pysf repo, a solution may look as follows:

model = ForecastModel()
model.fit(y=y_train, X=x_train)                             # fit on the training series
model.predict(y=y_test, X=x_test, fh=ForecastHorizon(24))   # forecast series not seen in fit

This would require adding an optional argument y to the predict method, where y comprises all required historical values of the time series that should be predicted.

Note that, depending on the use case, different time series may also receive different values of exogenous variables. Consider, e.g., a set of time series describing the electrical load of different buildings at different locations; for each location, a different temperature time series exists. A solution may then require a mapping from each target time series to its exogenous time series. Theoretically, this could be solved by adding a further dimension to the exogenous data, whereby the first dimension has to equal the number of considered time series, and the mapping could then be performed via the index.
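To illustrate the index-based mapping, a minimal sketch assuming pandas MultiIndex panel data with an (instance, time) index, as in sktime's pd-multiindex format; the building and column names are made up for this example:

import numpy as np
import pandas as pd

# hypothetical example: two buildings, 48 hourly observations each
idx = pd.MultiIndex.from_product(
    [["building_A", "building_B"], pd.date_range("2023-01-01", periods=48, freq="H")],
    names=["instance", "time"],
)

# target panel: one load series per building
y = pd.DataFrame({"load": np.random.rand(len(idx))}, index=idx)

# exogenous panel: one temperature series per building, sharing the same
# (instance, time) index, so the mapping "target series -> exogenous series"
# is carried by the instance level of the index
X = pd.DataFrame({"temperature": 20 + 5 * np.random.randn(len(idx))}, index=idx)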

Required Decisions

  • Do we want to extend the signature of predict?
  • Do we want to directly support mapping from target time series to exogenous time series?
  • Would it be beneficial to introduce a new class of forecasters to implement the stuff related to Global Time Series Forecasting?
  • Should we think about which already supported models/wrappers are generally capable of performing global time series forecasts?
    • I suppose that at least sklearn-based regressors could be used for global time series forecasting (see the sketch below)
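As a rough illustration of the last point (not proposed sktime API; make_windows is a made-up helper): a single sklearn regressor can be fitted on lag windows pooled from all training series and then applied to windows of a series it has never seen.

import numpy as np
from sklearn.ensemble import RandomForestRegressor

def make_windows(series_list, window=24):
    # pool (lag window, next value) pairs from all series into one table
    X_rows, y_rows = [], []
    for s in series_list:
        for i in range(len(s) - window):
            X_rows.append(s[i : i + window])
            y_rows.append(s[i + window])
    return np.array(X_rows), np.array(y_rows)

# one regressor, fitted on windows pooled across all training series
train_series = [np.random.rand(200) for _ in range(10)]
X_train, y_train = make_windows(train_series)
reg = RandomForestRegressor().fit(X_train, y_train)

# forecast the next value of a series that was not part of the training set,
# using only its own recent history as features
new_series = np.random.rand(100)
next_value = reg.predict(new_series[-24:].reshape(1, -1))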
benHeid added the API design, module:forecasting, and enhancement labels on May 27, 2023
fkiraly (Collaborator) commented May 29, 2023

Addendum: this might be the same request as the one made by @romanlutz some time earlier:
#4209

fkiraly (Collaborator) commented May 29, 2023

Yes, I think the design above still makes sense.
From a technical perspective, given the current sktime architecture, it should not be too difficult to introduce framework-wide:

  • the y arg can be added to predict, but it must be the last arg
  • probably we want a tag to signify that the forecaster can make, or can only make, global forecasts

The one thing that bothers me is that ordinary panel forecasting can be considered a special case: in that case, we fit to what is called y_test in the code snippet (rather than y_train).

Should this bother us? Ultimately, it would only result in an interface wrapper class for the naive case where y_train is ignored, rather than affecting ordinary use.
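For intuition, a minimal sketch of such a wrapper, under the assumption that the naive behaviour is "ignore the training panel and refit on the y passed to predict"; class and method names are hypothetical, not existing sktime API:

class NaiveGlobalWrapper:
    # sketch only: turns any forecaster into a "global" one by refitting on
    # the series passed to predict, ignoring the series seen in fit

    def __init__(self, forecaster):
        self.forecaster = forecaster

    def fit(self, y, X=None, fh=None):
        # the training panel is ignored in this naive wrapper
        self._fh = fh
        return self

    def predict(self, y, X=None, fh=None):
        fh = fh if fh is not None else self._fh
        # refit the wrapped forecaster on the series provided at predict time
        self.forecaster.fit(y=y, X=X, fh=fh)
        return self.forecaster.predict(fh=fh, X=X)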

fkiraly (Collaborator) commented Aug 2, 2023

Should we maybe circle back to finalizing this design, @benHeid? Perhaps in the Friday meeting?

benHeid (Contributor, Author) commented Aug 2, 2023

Yes, that is a good idea. I think we can split the pipeline meeting and talk about this topic in the first or second half of the meeting.

fkiraly (Collaborator) commented Aug 4, 2023

Questions for discussion:

  • if we extend predict, what should the default behaviour be for a non-global forecaster such as ARIMA when it sees new y in predict? Re-fit on the new y seen?
  • what do we assume about exogenous time series if we pass a new y in predict?
  • do we want to allow a partial overlap between y indices in fit and predict?

fkiraly (Collaborator) commented Aug 4, 2023

Regarding the first point, there is a problem: I cannot seem to satisfy both conditions at the same time.

Options for the general case, from discussion with @benHeid:

  • manage local/global by tags, potentially raise exception
  • by default, ignore series in fit if predict-y is provided
  • control local/global behaviour by new parameters

fkiraly (Collaborator) commented Aug 4, 2023

from meeting on August 4, decisions (please comment, @benHeid):

Do we want to extend the signature of predict?

yes, as described above.

Would it be beneficial to introduce a new class of forecasters to implement the stuff related to Global Time Series Forecasting?

same class with extended signature.

As @benHeid also suggested, we would use tags to differentiate, not a new class (see the sketch at the end of this comment).

Do we want to directly support mapping from time series to exogenous time series?

already supported by current interface, no action needed.

Should we think about which already supported models/wrappers are generally capable of performing global time series forecasts?

  • default could be "ignore train series", or raise an exception depending on the tag (tbd)
  • reduction forecasters theoretically support general global forecasting
    • might be good to do after the current rework, or even as part of it if easy (but probably not easy)
  • BH: ARIMA? in theory, the model could be fitted to have "same coefficients" for all instances, then it can be applied globally
    • -> GlobalARIMA
    • we should open an issue
  • all the neural networks from the DL forecasting workstream!
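A rough sketch of how the tag-based differentiation could look; the tag name "capability:global_forecasting", the method names, and the default behaviour are all assumptions for illustration, not settled API:

# sketch only: tag name, method names, and defaults are assumptions
class GlobalCapableForecasterSketch:
    _tags = {"capability:global_forecasting": True}

    def predict(self, fh=None, X=None, y=None):
        if y is None:
            # ordinary behaviour: forecast the series seen in fit
            return self._predict(fh=fh, X=X)
        if not self._tags.get("capability:global_forecasting", False):
            # non-global forecaster receiving new series: ignore the training
            # series and refit on y, or raise an exception - the default is still tbd
            raise NotImplementedError("passing y to predict is not supported")
        # global forecaster: forecast the series passed in predict,
        # using the model fitted on the training panel
        return self._predict_global(fh=fh, X=X, y=y)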

fkiraly (Collaborator) commented Aug 4, 2023

hackmd notebook for Aug 4 meeting here for reference.

Design notes for global forecasting API

FK questions

Questions for discussion:

  • if we extend predict, what should the default behaviour be for a non-global forecaster such as ARIMA when it sees new y in predict? Re-fit on the new y seen?
  • what do we assume about exogenous time series if we pass a new y in predict?
  • do we want to allow a partial overlap between y indices in fit and predict?

FK - modelling global math/interface

The definition of "global" forecasting is often fuzzy, as is how it maps onto fit/predict.

perspective: without fit/predict

In a single input-output perspective, we have:

A. "panel" forecasting:

given n instances of time series $x_1(t), ..., x_n(t)$, all observed at $t= t_1, ..., t_T$ (ordered ascendingly),

produce forecasts of $x_1(t_k + h), ..., x_n(t_k + h)$ at horizons $h= h_1, ..., h_H$

B. "global" forecasting with forecasted instances different from training instances

given n instances of time series $x_1(t), ..., x_n(t)$, all observed at $t = t_1, ..., t_T$ (ordered ascendingly),

produce forecasts of $x_b(t_k + h), ..., x_n(t_k + h)$ at horizons $h= h_1, ..., h_H$, where $b$ is the starting index of all the instances to forecast

notes from discussion with BH
  • this does not cover unequal indices, but that might be important
    • but this can easily be extended
    • for now, equal time stamps for all series
  • up to the choice of horizons and time stamps of observation, this is the same as BH's definition, minus the train/inference split
    • if considering unequal time stamps or horizons, both definitions can be made congruent

mapping on fit/predict

A -> in sktime, we've mapped this as follows:

fit gets information as arguments:

  • y = $x_1(t), ..., x_n(t)$ observed at $t = t_1, ..., t_T$ (ordered ascendingly),
  • fh = horizons $h= h_1, ..., h_H$

predict produces the forecasts of $x_1(t_k + h), ..., x_n(t_k + h)$

B -> we could map this two different ways!

B1 - as currently in issue #4651

fit gets information as arguments:

  • y = $x_1(t), ..., x_{b-1}(t)$ observed at $t = t_1, ..., t_T$ (ordered ascendingly),
  • fh = horizons $h= h_1, ..., h_H$
  • note that only the "train" $x$-es are passed! indices $1$ to $b-1$

predict gets additional information:

  • y = $x_b(t), ..., x_n(t)$ observed at $t = t_1, ..., t_T$
  • and produces the forecasts of $x_b(t_k + h), ..., x_n(t_k + h)$

B2 - pass all data to fit!

fit gets information as arguments:

  • y = $x_1(t), ..., x_{n}(t)$ observed at $t = t_1, ..., t_T$ (ordered ascendingly),
  • fh = horizons $h = h_1, ..., h_H$ and an indicator of which series are train/test, i.e., the index $b$ or a set-generalization
  • note that all $x$-es are passed!

predict produces the forecasts of $x_1(t_k + h), ..., x_n(t_k + h)$

What is the difference between B1 and B2?

  • B1 gets additional instances in predict
  • B2 sees everything already in fit
  • in B2, "which series" is fh-like information
  • the difference is similar to passing fh early or late
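To make the difference concrete, the two mappings as call patterns; all names, including the predict_series argument in B2, are placeholders rather than existing API:

# B1 - new instances arrive late, in predict (as in the snippet in this issue)
f = SomeGlobalForecaster()
f.fit(y=y_train_panel, X=X_train_panel, fh=[1, 2, 3])    # series x_1 .. x_{b-1}
y_pred = f.predict(y=y_new_panel, X=X_new_panel)          # series x_b .. x_n

# B2 - everything arrives early, in fit, plus an fh-like indicator of which
# series to forecast; "predict_series" is a hypothetical argument name
f = SomeGlobalForecaster()
f.fit(y=y_full_panel, X=X_full_panel, fh=[1, 2, 3], predict_series=test_instances)
y_pred = f.predict()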

notes from discussion with BH
  • BH: B2 does not appropriately support pre-trained neural networks!
    • or, more generally, serving without refitting, on varying inference data
  • FK: is identity on the intersection important? B1 cannot model this
    • BH - knows no common algorithm that makes use of identity of index (as opposed to identity of values)
    • BH - but can be extended later - e.g., a y_index arg of predict, or an init parameter for a specific algorithm

BH's understanding of global forecasting:

You have a training dataset containing n time series $x_{i_1}(t), ..., x_{i_n}(t)$ on which the forecaster is trained.

During inference, you have a set of $m$ time series $x_{j_1}(t), ..., x_{j_m}(t)$, where $m$ need not be equal to $n$, and where the sets $\{i_1, ..., i_n\}$ and $\{j_1, ..., j_m\}$ may or may not intersect. For this set of time series $x_{j_1}(t), ..., x_{j_m}(t)$ we then aim to provide forecasts.

Technical note

The forecasts for the time series $x_{j_1}(t), ..., x_{j_m}(t)$ need not all be produced at once; they can also be produced sequentially.

Probable Definition in Literature:

tentative decisions?

Do we want to extend the signature of predict?

FK: yes - after discussion, favours option B1 over B2

Would it be beneficial to introduce a new class of forecasters to implement the stuff related to Global Time Series Forecasting?

FK thinks: same class with extended signature.

As BH also suggested, we would use tags to differentiate, not a new class

Do we want to directly support mapping from time series to exogenous time series?

FK - can you explain?

BH - different vs same exogenous data per instance

FK: currently, exogenous data needs to have the same instance indexing. We cannot pass exogenous data that has no instance index if the y does (but this could be extended in the future)

-> current state is fine

Should we think about which already supported models/wrappers are generally capable of performing global time series forecasts?

  • default could be "ignore train series", or raise an exception depending on the tag (tbd)
  • reduction forecasters theoretically support general global forecasting
    • might be good to do after the current rework, or even as part of it if easy (but probably not easy)
  • BH: ARIMA? in theory, the model could be fitted to have "same coefficients" for all instances, then it can be applied globally
    • -> GlobalARIMA
    • we should open an issue
  • all the neural networks from the DL forecasting workstream!

fkiraly (Collaborator) commented Aug 4, 2023

FYI @olerch

fkiraly changed the title from "[ENH] Support Global Time Series Forecast" to "[ENH] Support Global Time Series Forecasting" on Aug 4, 2023
fkiraly moved this from ToDo to In Progress in "Workstream: Deep Learning based Forecasters" on Aug 6, 2023
benHeid (Contributor, Author) commented Aug 11, 2023

Actions:

  • Add examples on neural network-based forecasters @benHeid
    • Based on the neural networks that will be implemented in the deep learning forecaster workstream.
  • Extend Base Forecaster Classes: @fkiraly
    • based on the design approach in the hackmd
    • based on the ideas from the pysf repository
  • Search for existing issues related to the topic (keywords: global forecasting, supervised forecasting, ...) @benHeid

fkiraly (Collaborator) commented May 24, 2024

@Xinyu-Wu-0000, can you kindly comment so I can assign you this issue?

Xinyu-Wu-0000 (Contributor) commented:

Yeah of course.
