Forecasting at scale (in parallel) #479

Closed
sauberf opened this issue Mar 26, 2018 · 8 comments

@sauberf

sauberf commented Mar 26, 2018

I'd like to use Prophet to forecast time series for many different entities over the same time frame (the entities are unrelated, so it's not a multivariate problem). For example, say I have 1,000 features that change over the same period of time, and I'd like to predict their values for the next 30 days. How can I train these models in parallel (or distribute the computation) using Prophet? Is there an example that covers something similar?

sauberf changed the title from "Forecasting at scale" to "Forecasting at scale (in parallel)" on Mar 26, 2018
@leafonsword

I found that the statement forecast = m.predict(future) is slow in Python when the dataframe contains many rows. Could a parallel option be added so that all CPU cores share the computation?

@bletham
Contributor

bletham commented Apr 11, 2018

predict can be slow if future has a lot of rows, because of the simulation for trend forecast uncertainty. You can speed things up by using fewer samples to estimate the trend uncertainty intervals: the argument uncertainty_samples to Prophet() defaults to 1000, and cutting that down will make it run faster at the cost of more noise in the upper and lower bounds.
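To illustrate the trade-off between sample count and noise in the bounds, here is a small standalone sketch. It does not use Prophet's code: it assumes a plain Gaussian random walk as a stand-in for simulated trend paths, and the names `upper_bound_estimate` and `bound_noise` are made up for this example. It estimates an upper bound from a given number of simulated paths, repeats that several times, and reports how much the estimate varies from run to run.

```python
import random
import statistics


def upper_bound_estimate(n_samples, rng, horizon=30, q=0.9):
    """Estimate the q-quantile of a random-walk endpoint from n_samples paths."""
    endpoints = []
    for _ in range(n_samples):
        level = 0.0
        for _ in range(horizon):
            level += rng.gauss(0.0, 1.0)  # one simulated step (stand-in for a trend change)
        endpoints.append(level)
    endpoints.sort()
    return endpoints[int(q * (n_samples - 1))]


def bound_noise(n_samples, repeats=30, seed=0):
    """Standard deviation of the estimated bound across repeated simulations."""
    rng = random.Random(seed)
    estimates = [upper_bound_estimate(n_samples, rng) for _ in range(repeats)]
    return statistics.stdev(estimates)


if __name__ == "__main__":
    for n in (50, 200, 1000):
        print(n, "samples -> run-to-run spread of upper bound:", round(bound_noise(n), 3))
```

The spread should shrink as the number of samples grows, which is the noise that a smaller uncertainty_samples gives up in exchange for speed.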

Otherwise, if the time series are all totally independent, you could always just run one process per processor, each on a separate subset of the time series.
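A minimal sketch of that approach, assuming the series are held in a dict that maps a name to a dataframe with Prophet's usual `ds` and `y` columns. The helper names `forecast_one` and `forecast_all`, the 30-day horizon (taken from the question above), and the value `uncertainty_samples=100` are choices made for this example, not anything Prophet provides or recommends.

```python
from concurrent.futures import ProcessPoolExecutor


def forecast_one(item):
    """Fit one Prophet model and forecast 30 days ahead for a single series."""
    name, df = item  # df is expected to have 'ds' and 'y' columns
    # Imported inside the worker so each process loads Prophet itself.
    try:
        from prophet import Prophet  # newer package name
    except ImportError:
        from fbprophet import Prophet  # package name at the time of this thread

    m = Prophet(uncertainty_samples=100)  # fewer than the default 1000, for speed
    m.fit(df)
    future = m.make_future_dataframe(periods=30)
    forecast = m.predict(future)
    return name, forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]]


def forecast_all(series_by_name, worker=forecast_one,
                 executor_cls=ProcessPoolExecutor, max_workers=None):
    """Map `worker` over independent series, one task per series."""
    with executor_cls(max_workers=max_workers) as pool:
        return dict(pool.map(worker, series_by_name.items()))
```

Calling `forecast_all(series_by_name)` returns a dict of forecasts keyed by series name. On platforms that start worker processes by spawning, the call should sit under an `if __name__ == "__main__":` guard.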

@leafonsword

leafonsword commented Apr 11, 2018

@bletham
predict() has three sub-functions:

predict_trend()
predict_seasonal_components()
predict_uncertainty()

If these three sub-functions could execute in parallel, Prophet would also run faster without decreasing uncertainty_samples.

@bletham
Contributor

bletham commented Apr 11, 2018

I think that predict_uncertainty is the only one that should take substantial resources, but anyway agree that it'd be good to speed these up.

We've been talking recently about the possibility of moving the prediction code into Stan, which would speed it up a bunch and also reduce some code duplication between R and Python versions. I think that'd be the nicest way to improve prediction speed.

@leafonsword

leafonsword commented Apr 12, 2018

@bletham
Moving the prediction code into Stan will certainly give a huge speedup!

I ran a predict() call that takes about 5 s, and the three sub-functions take about:

predict_trend() # less than 1 s
predict_seasonal_components() # about 2s
predict_uncertainty()   # about 3s

Since predict_seasonal_components() takes about 40% of the time, executing the sub-functions in parallel would give about a 40% speedup compared with executing them in order, whether or not the prediction code moves into Stan.

With Python 3's concurrent.futures module, writing the parallel code seems very easy, just a few lines?
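A rough sketch of what those few lines could look like with concurrent.futures. The helper `run_concurrently` and the stand-in steps built by `fake_step` are invented for this example; they are not Prophet functions, and the sleep durations only mimic the proportions of the timings above. One caveat I am assuming matters here: threads only overlap work that releases the GIL (sleeping, I/O, some NumPy routines), so pure-Python computation would need processes instead.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_concurrently(tasks, max_workers=None):
    """Run a dict of zero-argument callables concurrently; return {name: result}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(fn) for name, fn in tasks.items()}
        return {name: fut.result() for name, fut in futures.items()}


def fake_step(seconds, label):
    """Build a stand-in for one prediction step that just sleeps, then returns a label."""
    def step():
        time.sleep(seconds)
        return label
    return step


if __name__ == "__main__":
    tasks = {
        "trend": fake_step(0.1, "trend done"),
        "seasonal": fake_step(0.2, "seasonal done"),
        "uncertainty": fake_step(0.3, "uncertainty done"),
    }
    start = time.perf_counter()
    results = run_concurrently(tasks)
    print(results, "elapsed:", round(time.perf_counter() - start, 2))
```

With independent steps, the wall time is bounded below by the slowest step rather than the sum, which is where the estimated saving comes from.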

@bletham
Contributor

bletham commented Apr 13, 2018

That's pretty beneficial. I'd be a bit hesitant to lose Py2 compatibility though; at least around here, the transition is still a work in progress. I think I'd like to first evaluate the feasibility of doing the predictions in Stan.

@leafonsword

@bletham
Python 2 also has a concurrent.futures backport:
https://github.com/agronholm/pythonfutures

Anyway, predictions in Stan are the first priority.

@bletham
Contributor

bletham commented May 25, 2018

It looks like predictions in Stan are going to be possible once an upstream issue in RStan has been fixed (#501). I'm going to close this issue since we are definitely moving in that direction, and once that is done we can revisit parallelization.
