
Speeding up the prediction #2030

Open
orenmatar opened this issue Sep 18, 2021 · 15 comments

@orenmatar
Contributor

I was working on scaling up Prophet as a forecast engine for hundreds of thousands of items on a project, and realized that prediction takes the vast majority of the running time. After some research I figured out a way to speed it up significantly, and wrote the following post:

https://towardsdatascience.com/how-to-run-facebook-prophet-predict-x100-faster-cce0282ca77d#4b11-31bf17c6772

I think it would be great if something like this were implemented.
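For anyone who wants the gist before reading the post: most of the predict time goes into the per-sample Python loop that re-draws the trend for every uncertainty sample, and the speed-up comes from drawing all samples at once as matrices. Below is a minimal, simplified NumPy sketch of that idea (not the exact code from the post; the function name, the cp_prob argument, and the assumption of a uniform time step are my own simplifications):

    import numpy as np

    def sample_trend_deviations(n_steps, k, deltas, cp_prob, t_scale=1.0, seed=None):
        """Draw k piecewise-linear trend deviations over n_steps future points
        in one shot, instead of looping sample-by-sample.

        deltas  : fitted changepoint rate adjustments (used only for their scale)
        cp_prob : probability that a given future step contains a changepoint
                  (in Prophet this comes from the historical changepoint frequency)
        t_scale : spacing of the (assumed uniform) future time grid
        """
        rng = np.random.default_rng(seed)
        # Laplace scale of future rate changes, estimated from the fitted deltas
        laplace_scale = np.mean(np.abs(deltas)) + 1e-8
        # (k x n_steps) matrix of rate changes, zero wherever no changepoint occurs
        cp_mask = rng.uniform(size=(k, n_steps)) < cp_prob
        rate_changes = cp_mask * rng.laplace(0.0, laplace_scale, size=(k, n_steps))
        # First cumsum: slope deviation at each step; second cumsum integrates the
        # slope into a trend deviation relative to the point-estimate trend
        slope_dev = np.cumsum(rate_changes, axis=1)
        return np.cumsum(slope_dev, axis=1) * t_scale  # shape (k, n_steps)

The uncertainty interval then comes from percentiles over the k rows of point_forecast + trend_deviation + observation_noise, where the noise is another (k x n_steps) normal draw with scale sigma_obs.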

@winedarksea
Contributor

winedarksea commented Apr 4, 2022

@orenmatar and @tcuongd do you know if anyone has made an attempt on this yet? Otherwise I might take a try at integrating that code.

@orenmatar
Contributor Author

@winedarksea I don't know of any attempt, and would love to see one. If you have any questions about the code, I'd be happy to help.
BTW, I'm now working on optimizing the fit as well; it can be done without using pystan, which consumes a lot of time.

@winedarksea
Contributor

@orenmatar thanks, hoping to get this done in the next week or two (perhaps a foolish hope). First I have to figure out the new cmdstanpy backend on the develop branch.

@winedarksea
Contributor

Some questions for you, @orenmatar:

  1. Easy first: uncertainty_samples is the same as "k" in your code, right?
  2. Seasonal components have an uncertainty part in the original code when Prophet(mcmc_samples=300) is used.
    • It doesn't look like it adds much compute to include: just take a percentile of the components in predict_seasonal_components when uncertainty_samples is True, then add or multiply the terms along with the trend and random noise.
    • It really looks like it is only used in plot_components; do you know whether it actually feeds into yhat_upper/yhat_lower when MCMC sampling is used?
  3. Flat growth, which is now one of the 3 official growth options: how does this look as an addition to your code?
        elif prophet_obj.growth == "flat":
            sample_trends = np.zeros((k, len(forecast_df)))

I also played around with this variation, based on the flat_trend function, but it is wrong since it is not centered around 0:

        elif prophet_obj.growth == "flat":
            sample_trends = np.repeat(
                prophet_obj.params['m'] * np.ones_like(future_time_series),
                k, axis=0
            )

@orenmatar
Contributor Author

@winedarksea

  1. That's right - k is just the number of samples we take for the trend.
  2. As far as I could tell from the code it doesn't influence yhat_upper and yhat_lower, and having tested it on many items without that component, I believe I'm right. But it may be best to ask one of the original authors of the code.
  3. Yeah, that seems like the simplest way to do it. But if the growth is flat, then I believe the only source of uncertainty is the sigma of the normal distribution of errors, right? In that case there's no need to create a matrix of zeros; instead we can read the required percentile off a normal distribution. E.g. if sigma = 5, then for a 95% interval the upper bound is yhat + 5*1.96 and the lower is yhat - 5*1.96 (see the sketch below). This takes even less time and is more precise than creating a random matrix and computing percentiles on it, but since the time saved here is insignificant, the simplest code to write would be what you suggested.
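For reference, here is a minimal sketch of what I mean by reading the interval straight off a normal distribution for flat growth (the function name and arguments are just for illustration; sigma is the fitted observation-noise scale):

    from scipy.stats import norm

    def flat_growth_interval(yhat, sigma, interval_width=0.80):
        # With a flat trend there is no trend uncertainty, so the interval is
        # just a normal quantile around the point forecast: yhat +/- z * sigma.
        z = norm.ppf(1 - (1 - interval_width) / 2)  # ~1.28 for 80%, ~1.96 for 95%
        return yhat - z * sigma, yhat + z * sigma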

Good luck! Looking forward to seeing this implemented!

@winedarksea
Contributor

winedarksea commented Apr 22, 2022

I've been away on vacation in Italy, but I'm back!

I have a question which is really for the Prophet maintainers, if they will come to the party. I am FB internal, and from what I can tell Ben Letham has entirely moved on to other projects - for example, in Workspaces for Prophet he ignored call-outs from others when there was a bug in the internal build. So this is a request to @dmitryvinn and @tcuongd: whom should I coordinate this work with?

My basic question is how many of the internal functions supporting prediction need to be maintained. The simplest way to integrate these speed-ups would be to cut the number of functions, because most of them are nested in for loops that will no longer be needed.

OLD/CURRENT WAY

  • predict
    • predict_uncertainty
      • generates percentiles from samples from:
      • sample_posterior_predictive
        • generates make_all_seasonality_features
        • generates many samples from:
        • sample_model
          • adds a random component to:
          • sample_predictive_trend
            • generates a new trend from piecewise_linear (or logistic)

New Way

  • predict
    • predict_uncertainty (method changed, output same)

Would become obsolete: sample_posterior_predictive, sample_model, sample_predictive_trend

Alternatives:
We could keep them unchanged but remove them from use in predict_uncertainty, or attempt to rewrite them to call into the new method. sample_posterior_predictive would be fairly easy to update to the new method; the others would be harder.

A possible compromise would be to use the old way when mcmc_samples > 0, which users already expect to be slow, and the new way (with a trimmed-down set of supporting functions) for all other cases, as sketched below.
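To make that compromise concrete, here is roughly the branching I have in mind inside predict_uncertainty. _vectorized_sample_posterior_predictive is a hypothetical new helper, not an existing Prophet method, and the percentile step below is a paraphrase of what the current code does rather than a copy of it:

    import numpy as np
    import pandas as pd

    def predict_uncertainty(self, df):
        if self.mcmc_samples > 0:
            # Old path, unchanged: loop-based sampling that MCMC users already expect to be slow
            sim_values = self.sample_posterior_predictive(df)
        else:
            # New path: build all trend/noise samples as matrices in a single pass
            sim_values = self._vectorized_sample_posterior_predictive(df)
        lower_p = 100 * (1.0 - self.interval_width) / 2
        upper_p = 100 * (1.0 + self.interval_width) / 2
        return pd.DataFrame({
            'yhat_lower': np.nanpercentile(sim_values['yhat'], lower_p, axis=1),
            'yhat_upper': np.nanpercentile(sim_values['yhat'], upper_p, axis=1),
        })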

@orenmatar
Contributor Author

@winedarksea How is it going with the project?
BTW, I just published another post on optimizing Prophet, which you may find interesting...
https://medium.com/towards-data-science/need-for-speed-optimizing-facebook-prophet-fit-method-to-run-20x-faster-166cd258f456

@winedarksea
Contributor

@orenmatar I messaged Ben Letham internally and his current priority is updating the internal platform for Prophet at Meta, which I may be helping with. But he did look at your work and liked it, so I think we will get this in, although that's probably a matter of months.

Neural Prophet (which is from a different team) uses PyTorch, and I think having a PyTorch backend as an alternative would be great since many people already have some familiarity with it. Although 60 ms is hardly slow to begin with!

@orenmatar
Contributor Author

@winedarksea I think 60 ms is slow compared to most linear regressions, which is what Prophet ultimately is. Some datasets really do contain hundreds of thousands of items, if not millions, so these gains could be meaningful. But for sure it is more important to optimize the predict method. Thanks for the update!

@nicolaerosia

@orenmatar do you have a fork where you implemented the changes? Thanks!

@orenmatar
Contributor Author

@nicolaerosia I don't... I only implemented it as a side function to be called separately, and @winedarksea started working on a fork. It's not too difficult to replace the existing confidence interval computation with the new one. I'll be happy to help if there's a real chance it will be incorporated into the main package.

@nicolaerosia

@orenmatar I have hopes for inclusion; it seems like @tcuongd is actively maintaining!
@winedarksea could you please share your branch? I see something here, https://github.com/winedarksea/prophet/tree/vectorization, but it doesn't seem related.

@winedarksea
Contributor

@nicolaerosia sure thing.
Right now I am working with the idea that the "old way" (unmodified) will run whenever mcmc_samples > 0. The reason is that the 'new' function drops some of the inner function calls (buried in slow for loops) that are not private and may well be used by people, and we don't want to break them. See my post above.
Basically, I'm still undecided on the best way to change the code base, since we don't want to break anything people may be depending on.

@nicolaerosia

@winedarksea I think it's a good idea, do you have a branch I can take a look at? Thanks a lot!

@winedarksea
Contributor

Sure, I need to publish my changes; I'll try to do that tomorrow.
