# multivariate modeling #49

Open
opened this issue Feb 28, 2017 · 12 comments

9 participants

### jmarca commented Feb 28, 2017

I see from issue #18 that there are no concrete plans yet for prophet to handle multivariate data. I am interested in tackling this feature; any pointers on where to start? My application area is vehicle traffic: for example, the numbers of cars and trucks in each lane moving past a point (yes, I also read that sub-daily data is not yet well supported). While I can predict traffic separately for each lane and vehicle type, it would make more sense to model all lanes and vehicle types at once, since they are related (lots of trucks in the right lanes means fewer cars in those lanes, more cars in the left lanes, etc.).
Contributor

### seanjtaylor commented Mar 1, 2017

Thanks for the interest in taking this on! Here are my current thoughts.

## Simultaneous models

y_1(t) = f_1(t) + e_1(t)
y_2(t) = f_2(t) + e_2(t)


The two ways y_1(t) and y_2(t) can be related are:

1. If f_1 and f_2 are related somehow. The way to do that would be to have all of their parameters have a prior distribution so they are shrunk toward the same values. E.g. they share related seasonality, trend, and changepoints.
2. If e_1(t) and e_2(t) are correlated. This is a bit harder to model and interpret.

We could do 1. pretty easily in Stan, but the benefits for sharing the parameters would be small unless you have many time series (I'd say at least 5-10).
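As a sketch of what option 1 could look like on paper: give each of the K series its own parameter vector (seasonality coefficients, trend parameters) and tie the vectors together with a shared prior. The Normal and Half-Normal forms and the symbols \theta_k, \mu, \tau, \sigma_\mu, \sigma_\tau are assumptions made for illustration; nothing like this exists in Prophet today.

```latex
\begin{aligned}
y_k(t) &= f_k(t;\,\theta_k) + e_k(t), \qquad k = 1, \dots, K \\
\theta_k &\sim \mathcal{N}(\mu,\ \tau^2 I) \\
\mu &\sim \mathcal{N}(0,\ \sigma_\mu^2 I), \qquad \tau \sim \text{Half-Normal}(\sigma_\tau)
\end{aligned}
```

A small \tau pulls every \theta_k toward the common \mu. With only two or three series there is very little information to estimate \mu and \tau from, which fits the point above that sharing only pays off with many series.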

## Hierarchy of time series

y_1(t) = f_1(t) + e_1(t)
y_2(t) = f_2(t) + g(y_1(t)) + e_2(t)


In this case there's a hierarchy of time series. So it's a bit simpler (assuming you know the order):

1. you fit the y_1 model by itself
2. you regress y_2(t) on y_1(t) to learn their relationship g.
3. instead of fitting a y_2 model directly, you forecast the residual: [y_2(t) - g(y_1(t))]
4. To forecast y_2, you forecast y_1, then use the model from (2) and the residual forecast from (3) to construct the y_2 forecast.

This strategy requires more bookkeeping and more model fits, whereas with the first strategy everything would be done in one fitting step.
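The bookkeeping in the four steps can be sketched in plain Python. This is only an illustration under loud assumptions: a straight-line least-squares fit stands in for each Prophet model, g is taken to be linear, and the helper names (`fit_line`, `predict_line`, `hierarchical_forecast`) are made up for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict_line(params, xs):
    intercept, slope = params
    return [intercept + slope * x for x in xs]


def hierarchical_forecast(t_hist, y1, y2, t_future):
    # Step 1: fit the y_1 model by itself (stand-in: linear trend in t).
    f1 = fit_line(t_hist, y1)
    # Step 2: regress y_2 on y_1 to learn g (assumed linear here).
    g = fit_line(y1, y2)
    # Step 3: model the residual y_2(t) - g(y_1(t)) as its own series.
    resid = [obs - fitted for obs, fitted in zip(y2, predict_line(g, y1))]
    f_resid = fit_line(t_hist, resid)
    # Step 4: forecast y_1, map it through g, and add the residual forecast.
    y1_future = predict_line(f1, t_future)
    g_future = predict_line(g, y1_future)
    resid_future = predict_line(f_resid, t_future)
    return [gv + rv for gv, rv in zip(g_future, resid_future)]


# Toy data: y_2 is an exact linear function of y_1.
t_hist = list(range(10))
y1 = [10 + 2 * t for t in t_hist]
y2 = [3 + 0.5 * v for v in y1]
y2_forecast = hierarchical_forecast(t_hist, y1, y2, [10, 11])
```

Three separate fits are needed for a single y_2 forecast, which is the extra bookkeeping mentioned above.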

Curious if @bletham has thoughts.

Contributor

### bletham commented Mar 1, 2017

I think that handling this with a hierarchical prior for each parameter could work really well in some situations but I see two potential complications. The first is that I do not expect MAP estimation to perform well with the hierarchical model, so we'd have to do MCMC which can be slow. The second, and more serious, is that putting a hierarchical prior on a parameter (e.g. a seasonality coefficient) implies that we think they should be similar between the two time series, i.e. they are positively correlated. We need a solution that will borrow strength across negatively correlated time series also. So maybe a joint prior for the seasonality coefficients that includes a covariance parameter? I think this would be especially challenging to extend to the changepoint deltas which have a Laplace prior.

If we fit a prophet model to y1, and then fit a second prophet model to the residuals y2 - g(y1), it isn't clear to me that we have actually borrowed strength anywhere or will do any better than just forecasting y2 separately, unless we put some additional regularization on the y2-g(y1) forecast.

I think something along this line could be successful if the fit were done jointly. The model might be:

y_1(t) = f_1(t) + \eps_1
y_2(t) = a f_1(t) + b + \lambda f_2(t) + \eps_2

where f_1 and f_2 are prophet models. Here y_1 basically gets the usual prophet model, and for y_2 there would be some scalar factor a and offset b, and then an additional prophet model f_2 but which has a factor \lambda that is regularized to 0. f_1, f_2, a, b, and the noise parameters are jointly estimated. b could potentially also be something like a linear function b(t) to allow for an unregularized constant trend difference.

We'd want to get some idea of what relationship is actually needed for a useful set of multivariate problems. It may also be difficult to have a default for \lambda that 'usually works.'
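To make the joint model concrete, here is a rough sketch of an objective that such a fit might minimize. The squared-error terms (standing in for a Gaussian likelihood), the L1 penalty on \lambda, and the name `joint_objective` are all assumptions for illustration. In a real joint fit f_1 and f_2 would be parameterized Prophet components estimated together with a, b and \lambda; here they are passed in as fixed values.

```python
def joint_objective(y1, y2, f1, f2, a, b, lam, penalty=1.0):
    """Squared-error fit of both series plus an L1 penalty pulling lam toward 0.

    y1, y2: observed values; f1, f2: component values on the same time grid.
    """
    loss = 0.0
    for obs1, obs2, c1, c2 in zip(y1, y2, f1, f2):
        # y_1(t) = f_1(t) + eps_1
        loss += (obs1 - c1) ** 2
        # y_2(t) = a f_1(t) + b + lam f_2(t) + eps_2
        loss += (obs2 - (a * c1 + b + lam * c2)) ** 2
    return loss + penalty * abs(lam)


# Toy case: y_2 is exactly a scaled and shifted copy of f_1.
f1 = [1.0, 2.0, 3.0]
f2 = [0.5, -0.5, 0.5]
y1 = list(f1)
y2 = [2.0 * v + 1.0 for v in f1]
loss_without_f2 = joint_objective(y1, y2, f1, f2, a=2.0, b=1.0, lam=0.0)
loss_with_f2 = joint_objective(y1, y2, f1, f2, a=2.0, b=1.0, lam=0.1)
```

When y_2 is already explained by a f_1(t) + b, any nonzero \lambda only adds cost, so the penalty keeps f_2 switched off unless the data call for it. Choosing `penalty` is the "default for \lambda" difficulty mentioned above.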
Author

### jmarca commented Mar 8, 2017

Perhaps I'm thinking about this incorrectly, but I see the outcome as a vector, not as a single value.

y(t) = f(t) + e(t)

where y(t) is a multi-dimensional vector of values (measurements) characterizing traffic levels across all freeway lanes at some point.

I think I should probably learn Stan first, and understand why it might/might not be restricted to one-dimensional observations before attempting to "tackle this feature" for prophet.

### mgshaw commented Sep 14, 2017

 Hi, are there any plans to progress development on prophet handling multivariate data? I've found the package to be extremely useful for univariate time series - thanks for your recent release! - and there does seem to be some demand from other users.

### hardagerianil123 commented Sep 18, 2017

Can anyone tell me how to implement hierarchical forecasting using Prophet in R? I am struggling to use a time series data frame with the prophet function, as it throws an error.

### jmwoloso commented Mar 28, 2018

 any update on this?

Contributor

### bletham commented Oct 16, 2018 • edited

A note about this: There isn't currently a way to do joint forecasts in Prophet. However, there is something you can do if you have time series that you believe are correlated, AND one of them is easier to forecast than the others.

A typical setting would be that you have a top-level thing you want to forecast, but then you also want to forecast sub-groups of it (e.g., demographic breakdowns). The top-level thing is often easier to forecast because it has the most data, and noise levels are often lower. If this is the case, then you can try the following procedure:

1. Forecast the top-level time series with Prophet.
2. Forecast each subgroup independently, but include the top-level forecast as an extra regressor.

This will encourage the subgroup forecast to track the top-level forecast unless there is clear evidence to the contrary. Something to try, YMMV.

The subgroup forecasts may also benefit from reducing the prior scales for both trend changepoints and seasonality, since especially the seasonality will likely be largely captured by the top-level extra regressor.

As a side note, I would not expect including forecast A as an extra regressor for forecast B to be beneficial unless A is somehow easier to forecast than B.
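A minimal sketch of this two-step procedure with the Python API follows. Assumptions: the package imports as `prophet` (older releases used `fbprophet`), both inputs are DataFrames with the usual `ds` and `y` columns covering the same daily dates, and the function name, the regressor column name `top_yhat`, and the reduced prior-scale values are made up for illustration. It uses the top-level `yhat` for both history and future, which is one reasonable choice rather than the only one.

```python
def forecast_subgroup_with_top_level(top_df, sub_df, periods,
                                     changepoint_prior_scale=0.01,
                                     seasonality_prior_scale=1.0):
    """Return the subgroup forecast, using the top-level yhat as an extra regressor.

    top_df, sub_df: DataFrames with Prophet's 'ds' and 'y' columns, assumed to
    cover the same daily dates. The prior-scale defaults here are illustrative
    and smaller than Prophet's own defaults (0.05 and 10.0).
    """
    import pandas as pd
    from prophet import Prophet  # older releases: from fbprophet import Prophet

    # Step 1: forecast the top-level series on its own.
    top_model = Prophet()
    top_model.fit(top_df)
    top_future = top_model.make_future_dataframe(periods=periods)
    top_fc = top_model.predict(top_future)[["ds", "yhat"]]
    top_fc = top_fc.rename(columns={"yhat": "top_yhat"})

    # Step 2: forecast the subgroup with reduced prior scales, attaching the
    # top-level forecast as a regressor for both history and future dates.
    sub_model = Prophet(
        changepoint_prior_scale=changepoint_prior_scale,
        seasonality_prior_scale=seasonality_prior_scale,
    )
    sub_model.add_regressor("top_yhat")
    history = sub_df.assign(ds=pd.to_datetime(sub_df["ds"]))
    sub_model.fit(history.merge(top_fc, on="ds", how="left"))
    sub_future = sub_model.make_future_dataframe(periods=periods)
    return sub_model.predict(sub_future.merge(top_fc, on="ds", how="left"))
```

If the two series do not share the same dates, the merge leaves missing values in `top_yhat` and Prophet will refuse to fit, so the dates need aligning first.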

### ciberger commented Oct 25, 2018

Hi @bletham @seanjtaylor, I'd appreciate your thoughts on the following sequential setup. Suppose we have m time series with high correlation between consecutive series, so that you can clearly establish the sequence t_1 -> t_2 -> ... -> t_m. Would it be preferable to regress on the previous forecast (i.e. t_m-1 as the regressor for t_m), or to fix the regressor to the very first series t_1? Note that t_1 is the top-level time series in the example above. Also, would you use Prophet's forecast as the regressor, or a merge of historical values plus forecasted values? Thanks
Contributor

### bletham commented Oct 30, 2018

These are both good questions.

For the first, I think there are two important things for the inclusion of the extra regressor to be valuable. The first is that it be correlated with the target time series. By this consideration, t_m-1 seems best. However, the second important consideration is that the extra regressor needs to somehow be easier to forecast than t_m. Conditioning the forecast of t_m on something that has just as high forecast uncertainty as t_m would have if I were to forecast it by itself is clearly not useful. As a thought experiment, you can imagine including t_m as an extra regressor for forecasting t_m! They are perfectly correlated, but obviously I will not get any value from doing this because of the uncertainty in the forecast of the regressor.

This is the biggest challenge with having meaningful extra regressors, but is also something that may be satisfied for these sort of hierarchical time series. It seems to often be the case that the top-level time series has more data, less noise, and is easier to forecast. In this case, there can be value to including it as an extra regressor (the actual value will depend on how correlated it is).

For your second question, I suspect it would be best to use the Prophet forecast yhat as the extra regressor for both historical values and future values so that we cut out the observation noise in both segments. But I'd be really curious if you try both to hear what you find.

As a note for the future, it will be very useful/important to incorporate the uncertainty in the regressor forecast into the prophet forecast. Conceptually, this is pretty straightforward: When sampling future uncertainty, for each sample use a different sampled draw from the regressor forecast (but ignoring the noise level, which means a change here: https://github.com/facebook/prophet/blob/master/python/fbprophet/forecaster.py#L1330 ).

In terms of implementation, this means we'll have to have a way to pass along this uncertainty, or more likely have a specific interface for including a prophet model as a regressor so we can sample directly from it. Related to #442.
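The idea of using a different regressor draw for each sample can be illustrated without Prophet. This is a toy sketch, not Prophet's sampling code: the target at one future time is reduced to `base + beta * regressor`, the regressor draws are assumed to come from the regressor model's forecast without observation noise, and all names and numbers are hypothetical.

```python
import random
import statistics


def sample_target(base, beta, regressor_draws, n_samples, rng, propagate=True):
    """Draw samples of base + beta * regressor at a single future time.

    regressor_draws: sampled values from the regressor's own forecast.
    propagate=False plugs the mean of the draws into every sample instead.
    """
    mean_regressor = statistics.mean(regressor_draws)
    samples = []
    for _ in range(n_samples):
        x = rng.choice(regressor_draws) if propagate else mean_regressor
        samples.append(base + beta * x)
    return samples


rng = random.Random(0)
# Hypothetical draws from the regressor forecast at one future date.
regressor_draws = [rng.gauss(100.0, 5.0) for _ in range(500)]
plug_in = sample_target(20.0, 0.3, regressor_draws, 1000, rng, propagate=False)
propagated = sample_target(20.0, 0.3, regressor_draws, 1000, rng, propagate=True)
spread_plug_in = statistics.pstdev(plug_in)
spread_propagated = statistics.pstdev(propagated)
```

Plugging in a point forecast gives the target no spread from the regressor at all, while per-sample draws carry the regressor's uncertainty through in proportion to `beta`. The target model's own trend and noise uncertainty are left out here to isolate that effect.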

### jheffez commented Nov 27, 2018 • edited

To summarize the above: at this point it's not possible/simple to use prophet to predict more than one series/feature, right? If so, any plans to add this capability? (Example of multiple series/features: instead of using just the temperature to forecast weather, it would be beneficial, and more accurate, to use other features such as humidity, pressure, etc.)
Contributor

### bletham commented Nov 29, 2018

@jheffez Correct. If humidity and pressure are easier to forecast than temperature, then you could include them (and their forecasts) as extra regressors in your forecast of temperature, but there is no way to jointly forecast all three at the same time. This is certainly on the to-do list, but it will require a pretty substantial amount of new modeling and testing, and I don't expect it to happen soon. We'll likely first focus on the hierarchical case that I describe in my Oct 16 comment above.

### cdeterman commented Dec 11, 2018

 Regarding the question of hierarchical problems I recently came across this page regarding using pystan for hierarchical/multilevel modeling. I know prophet utilizes stan so perhaps this can provide some ideas? Not entirely sure as Bayesian analysis is not my primary strength.
