Multivariate time series forecasting question #652

CMobley7 · 2020-02-19T21:31:38Z

My apologies in advance for the ignorant questions. While I’m not necessarily new to deep learning, I’m fairly new to time series forecasting, especially when using deep learning techniques for it.

Since gluon-ts uses DL-based approaches, dealing with non-stationarity in the training data is not necessary, unlike with AR/MA and VAR based models, correct? This appears to be outlined here.

Also, I am working with a multivariate time series dataset in which the target/dependent variable is related to and/or dependent on other features/independent variables. So, while I’m only trying to predict one target variable, the relationship between this target variable and the other features is important. This leads to two questions.

First, since the relationship between the target variable and other features is important, are the most applicable models deepvar and gpvar or will other models in gluon-ts work and I’m just thinking too much in terms of classical time series forecasting?

Second, if I’m using deepvar or gpvar, I’m assuming that when making the dataset, the target should be a vector of vectors that includes my target variable and the other features, right? However, if I’m thinking too much in terms of classical time series forecasting, target should be a vector of the target variable, and I should store the other features as vectors of vectors in either dynamic_feat or cat, right?

Again, I’m sorry for my ignorance. Thanks in advance for any assistance you provide.


ehsanmok · 2020-02-20T00:48:55Z

DL-based methods can handle non-stationary, multivariate time series with missing values and categorical features. In the multivariate case, the target is at least 2-dimensional, where one dimension is the number of variates (the number of time series). Normally, when you use any of the provided *Estimators like DeepVAREstimator, the requirements will be checked and the built-in transformations will create the required features automatically.

Note that in the multivariate case, you can use MultivariateGrouper to group the target into 2 dimensions, like this:

from gluonts.dataset.artificial import constant_dataset
from gluonts.dataset.common import TrainDatasets
from gluonts.dataset.multivariate_grouper import MultivariateGrouper

def load_multivariate_constant_dataset():
    # constant_dataset() returns metadata plus univariate train and test datasets
    metadata, train_ds, test_ds = constant_dataset()
    # one grouper per split; each stacks the univariate series into a 2D target
    grouper_train = MultivariateGrouper(max_target_dim=10)
    grouper_test = MultivariateGrouper(max_target_dim=10)
    return TrainDatasets(
        metadata=metadata,
        train=grouper_train(train_ds),
        test=grouper_test(test_ds),
    )

dataset = load_multivariate_constant_dataset()

mbohlkeschneider · 2020-02-20T09:36:24Z

Hi @CMobley7,

as @ehsanmok already wrote, you can use the MultivariateGrouper to convert any univariate time series dataset into a multivariate time series dataset.
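To make the conversion mentioned above concrete, here is a minimal pure-Python sketch of the idea: several univariate entries are stacked into one entry whose target has one row per series. It is not the library's implementation. The function name `group_univariate` is hypothetical, and the sketch assumes all series share the same start and length, which the real MultivariateGrouper does not require.

```python
from typing import Dict, List


def group_univariate(entries: List[Dict]) -> Dict:
    """Stack univariate entries into a single multivariate entry.

    Simplifying assumption (not taken from the library): every entry has the
    same start and the same length, so no alignment or padding is needed.
    """
    if not entries:
        raise ValueError("need at least one univariate entry")
    starts = {e["start"] for e in entries}
    lengths = {len(e["target"]) for e in entries}
    if len(starts) != 1 or len(lengths) != 1:
        raise ValueError("this sketch only handles aligned, equal-length series")
    # One row per original series: the target becomes shape (dim, time).
    return {
        "start": entries[0]["start"],
        "target": [list(e["target"]) for e in entries],
    }


univariate = [
    {"start": "2020-01-01", "target": [1.0, 2.0, 3.0]},
    {"start": "2020-01-01", "target": [4.0, 5.0, 6.0]},
]
multivariate = group_univariate(univariate)
print(len(multivariate["target"]), len(multivariate["target"][0]))  # 2 3
```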

Which model is the right one for your task depends on your data. If you know the values of your related time series in the future (because they are time series indicators of holidays or known promotions), using these as dynamic features in univariate models (like DeepAR) does a fine job.

If this is not the case, I would recommend using gpvar, as this is the multivariate time series model for which we have the most empirical evidence so far that it works well (see this paper).

Hope that helps.
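As an illustration of the "known in the future" case, the sketch below builds a single univariate entry with a holiday indicator stored under `feat_dynamic_real`. The helper name `make_entry_with_known_covariates` and the data are hypothetical. The length check reflects an assumption that, when forecasting, each covariate has to cover the observed range plus the forecast horizon.

```python
from typing import Dict, Sequence


def make_entry_with_known_covariates(
    start: str,
    target: Sequence[float],
    covariates: Sequence[Sequence[float]],
    prediction_length: int = 0,
) -> Dict:
    """Build a univariate entry whose covariates are known ahead of time.

    Each covariate must have len(target) + prediction_length values
    (assumption: prediction_length is 0 for training entries).
    """
    expected = len(target) + prediction_length
    for i, series in enumerate(covariates):
        if len(series) != expected:
            raise ValueError(
                f"covariate {i} has {len(series)} values, expected {expected}"
            )
    return {
        "start": start,
        "target": list(target),                                # shape (time,)
        "feat_dynamic_real": [list(s) for s in covariates],    # shape (num_feat, time + horizon)
    }


sales = [10.0, 12.0, 30.0, 11.0]                 # observed target values
is_holiday = [0.0, 0.0, 1.0, 0.0, 0.0, 1.0]      # also known for the next 2 steps
entry = make_entry_with_known_covariates(
    "2020-01-01", sales, [is_holiday], prediction_length=2
)
print(len(entry["target"]), len(entry["feat_dynamic_real"][0]))  # 4 6
```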

CMobley7 · 2020-02-21T00:22:34Z

@ehsanmok and @mbohlkeschneider, thank you for your advice thus far, I really appreciate it. Unfortunately, I’m still slightly confused regarding which model I should choose and, consequently, how to create the training and test sets.

I planned to recreate the following notebook, but instead of using straight gluon or keras, I'd use gluon-ts. The author creates a model to forecast pollution given previous pollution, as well as other factors like rain, wind speed and temperature. So, which model do you think best fits this type of data, and what is the best way to take the dataframe in cell 8 and turn it into both a training and a test set for the chosen model? In addition, while dealing with non-stationarity may not be a problem with the DL-based approaches in gluon-ts, I’m assuming scaling the features still is. Are there methods inside gluon-ts to deal with this, or should I just use scikit-learn or a similar library to do this prior to creating the dataframe in cell 8? Thank you again in advance.

CMobley7 · 2020-02-25T00:07:11Z

@ehsanmok and @mbohlkeschneider, I've looked through gluon-ts's extended tutorial and understand how to make a traditional dataset, but I'm still not sure exactly how to create a dataset for gpvar. The MultivariateGrouper is only useful for converting univariate datasets to multivariate, right? After looking at gpvar, it seems like it won't use any feature besides the target. It looks like I need to group the target and all features into the target field. Or am I mistaken, and I should use the traditional dataset with the target in the target field and all features in their appropriate fields (feat_static_cat, feat_static_real, feat_dynamic_cat, feat_dynamic_real)?

mbohlkeschneider · 2020-02-25T17:40:40Z

Hi @CMobley7 ,

you are correct, this is what the MultivariateGrouper is doing. Essentially, multivariate time series should have target fields that look like this. Then, the data should be loadable with our standard loaders.

You are right that GPVar is not using additional features atm. Let me break down why:

feat_static_cat: In our paper, we addressed the use-case of having a single multivariate dataset with shape (time, dim). Thus, the concept of a feat_static_cat (which is a way to mark different time series) does not make sense because every time series is "the same".

feat_static_real: We have not looked into this in the paper, but this could be implemented.

feat_dynamic_cat: Currently, I think GluonTS provides the functionality to pass feat_dynamic_cat to models but no model is using this so far. Feel free to experiment and share your findings!

feat_dynamic_real: We have not looked into this in the paper. This could be quite challenging depending on what your data looks like. The two cases are:

Dynamic features are the same for all (marginal) time series: This is the case we have for our standard time features here. We don't really have the infrastructure for loading such data from files atm, I think. This case is straightforward.
Dynamic features are different for all (marginal) time series: This comes with a lot of practical issues: What values should the features have if the time series are not the same length (time series could be longer or shorter)? Also, every feature introduced this way will add target_dim inputs to the model, so my gut feeling is that this blows up fairly quickly and becomes hard to train.
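The "blows up fairly quickly" remark can be put into numbers with a small back-of-the-envelope sketch. The function name is hypothetical, and it assumes that a feature shared by all series counts as one extra input, while a per-series feature counts as target_dim extra inputs, as described above.

```python
def extra_model_inputs(
    target_dim: int,
    num_shared_features: int,
    num_per_series_features: int,
) -> int:
    """Count the additional inputs introduced by dynamic features.

    Assumption for illustration: a shared feature adds one input, a
    per-series feature adds one input for each of the target_dim series.
    """
    return num_shared_features + target_dim * num_per_series_features


# Four features on a dataset with 100 marginal time series:
print(extra_model_inputs(100, num_shared_features=4, num_per_series_features=0))  # 4
print(extra_model_inputs(100, num_shared_features=0, num_per_series_features=4))  # 400
```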

jaschau · 2020-02-26T09:19:27Z

I had the same issue: my dataset included feat_dynamic_real.
Although gpvar and deepvar ignore feat_dynamic_real in principle, my training runs initially still crashed. I figured out that the root cause was that the TrainingDataLoader would try to batch the feat_dynamic_real fields, which, however, were not cut to the appropriate length by InstanceSplitter in the default transformation.
I fixed this by replacing the code in https://github.com/awslabs/gluon-ts/blob/master/src/gluonts/model/gpvar/_estimator.py#L253,

VstackFeatures(
    output_field=FieldName.FEAT_TIME,
    input_fields=[FieldName.FEAT_TIME],
)

by

VstackFeatures(
    output_field=FieldName.FEAT_TIME,
    input_fields=[FieldName.FEAT_TIME, FieldName.FEAT_DYNAMIC_REAL],
)

This works because VstackFeatures will by default drop the input_fields from the dataset.
Maybe this is of help.
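To show why the change above removes the problematic field, here is a minimal pure-Python sketch of the stacking behaviour described: the rows of the input fields are concatenated into the output field, and the input fields are dropped afterwards. This is not the library's code. The function `vstack_fields` and the literal field names in the example are illustrative assumptions.

```python
from typing import Dict, List


def vstack_fields(
    entry: Dict,
    output_field: str,
    input_fields: List[str],
    drop_inputs: bool = True,
) -> Dict:
    """Concatenate the rows of several 2D feature fields into one field."""
    stacked = []
    for name in input_fields:
        stacked.extend(list(row) for row in entry[name])
    result = dict(entry)
    if drop_inputs:
        for name in input_fields:
            if name != output_field:
                # The dropped field never reaches the batching step.
                result.pop(name, None)
    result[output_field] = stacked
    return result


entry = {
    "target": [[1.0, 2.0, 3.0]],
    "time_feat": [[0.1, 0.2, 0.3]],
    "feat_dynamic_real": [[5.0, 6.0, 7.0], [8.0, 9.0, 10.0]],
}
out = vstack_fields(entry, "time_feat", ["time_feat", "feat_dynamic_real"])
print(len(out["time_feat"]), "feat_dynamic_real" in out)  # 3 False
```

With only the time features listed as inputs, the untrimmed dynamic features would stay in the entry as a separate field, which matches the crash described above.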

CMobley7 · 2020-02-28T19:30:33Z

Thanks @mbohlkeschneider and @jaschau. Unfortunately, feature engineering is taking longer than I anticipated, so it will probably be another week before I'm able to test gluon-ts with my dataset. I'll close this issue now since I believe all my questions have been answered, and I'll post back later with results or potentially additional questions. Thanks again.

vblagoje · 2020-05-11T13:27:53Z

@mbohlkeschneider can any other model be used for multivariate series prediction or just gpvar?

mbohlkeschneider · 2020-05-11T13:37:24Z

Technically, DeepAR and DeepVAR should work as well. However, GPVAR is the model I would recommend.

pratikgehlott · 2021-04-27T06:22:32Z

@mbohlkeschneider do you have an example notebook on how to do multivariate time series forecasting using gluon-ts?

mbohlkeschneider · 2021-04-27T06:24:38Z

@Pratik325, I don't have a notebook, but this test does show the setup. Let me know if you have questions.

pratikgehlott · 2021-04-27T07:08:56Z

@mbohlkeschneider I need to know how to use this on custom datasets. It would be helpful if you ran it on a simple dataset and shared the notebook, because none of the platforms have a good explanation of gluon-ts for multivariate forecasting. It would help many learners. Thank you.

mbohlkeschneider · 2021-04-27T07:10:57Z

Hi @Pratik325,

Basically, the data preparation is the same as for all other models. The only difference is that the target field becomes a 2D array. So instead of target=[1,2,3,4,5] you would have target=[[1,2,3,4,5],[6,7,8,9,10]]. Does this help?
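For a custom table with one row per time step and one column per variable, the 2D target described above can be obtained by transposing the rows. The following is a minimal pure-Python sketch. The helper name `rows_to_multivariate_entry` and the start timestamp are hypothetical, and how the resulting entry is handed to a dataset class is left out.

```python
from typing import Dict, List, Sequence


def rows_to_multivariate_entry(rows: Sequence[Sequence[float]], start: str) -> Dict:
    """Turn time-major rows into an entry whose target has shape (dim, time)."""
    if not rows:
        raise ValueError("need at least one time step")
    width = len(rows[0])
    if any(len(r) != width for r in rows):
        raise ValueError("every time step must have the same number of variables")
    # zip(*rows) transposes (time, dim) into (dim, time).
    target: List[List[float]] = [list(col) for col in zip(*rows)]
    return {"start": start, "target": target}


# Five time steps, two variables per step.
rows = [[1, 6], [2, 7], [3, 8], [4, 9], [5, 10]]
entry = rows_to_multivariate_entry(rows, "2020-01-01 00:00:00")
print(entry["target"])  # [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
```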

pratikgehlott · 2021-04-27T08:27:07Z

No sir, @mbohlkeschneider

pratikgehlott · 2021-04-27T16:55:41Z

Hi @mbohlkeschneider , can you please help me..?

jaschau · 2021-04-27T17:19:38Z

A complete example (#382) can be found here. I am not sure it's entirely up to date, but it does demonstrate the basic setup.

pratikgehlott · 2021-04-28T06:12:48Z

@jaschau it's outdated!!!

CMobley7 added the question Further information is requested label Feb 19, 2020

CMobley7 closed this as completed Feb 28, 2020

timoschowski mentioned this issue May 11, 2021

Multivariate time series forecasting #1393

Closed
