Dispersion of result is too small #1679

Closed
leoknuth opened this issue Sep 18, 2020 · 1 comment

Comments

@leoknuth

The blue points are the predictions from Prophet. Are there any ideas for improving them?

result

The prediction doesn't fit the actual data well.
image

@bletham
Contributor

bletham commented Sep 21, 2020

The fit is pretty poor. One reason is that the time series looks like something that is always positive, while the model assumes Gaussian noise, which is why the lower uncertainty estimate falls below 0. There is a bunch of discussion of this issue in #1668, and you could try some of the strategies there. I specified a ProphetPos class there, and I'd be really interested to see what it does on this data. Is there any chance you can share a CSV of this time series?
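
As a rough illustration of the positivity problem, one generic workaround is to fit on the log scale and transform the forecast back, so that the point forecast and both uncertainty bounds stay above 0. This is only a sketch: it is not the ProphetPos class and not necessarily one of the strategies from #1668. It assumes `y` is strictly positive (zeros would need something like `log1p`/`expm1` instead), and the helper names are made up for this example.

```python
import numpy as np
import pandas as pd


def to_log_scale(df: pd.DataFrame) -> pd.DataFrame:
    """Copy a Prophet-style frame (columns ds, y) and replace y with log(y).

    Assumption: every y is strictly positive.
    """
    out = df.copy()
    out["y"] = np.log(out["y"])
    return out


def from_log_scale(forecast: pd.DataFrame) -> pd.DataFrame:
    """Map the forecast columns back to the original scale with exp()."""
    out = forecast.copy()
    for col in ("yhat", "yhat_lower", "yhat_upper"):
        out[col] = np.exp(out[col])  # exp() is always > 0
    return out


def fit_and_forecast_positive(df: pd.DataFrame, periods: int, freq: str) -> pd.DataFrame:
    """Hypothetical helper: fit Prophet on log(y), forecast, back-transform."""
    # Imported here so the helpers above work without Prophet installed.
    # Older releases use: from fbprophet import Prophet
    from prophet import Prophet

    m = Prophet()
    m.fit(to_log_scale(df))
    future = m.make_future_dataframe(periods=periods, freq=freq)
    return from_log_scale(m.predict(future))
```

Note that fitting on the log scale makes the seasonal effects multiplicative on the original scale, which may or may not suit this data.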

Besides that, there is a second reason the model fit is poor: these spikes seem to be totally missed. I can't quite tell what the time course of those spikes is. Is this sub-daily data, with each spike taking place within a single day? Or do they last for a week? In either case, they won't be captured by the seasonality estimates. I'm guessing each spike is a day, in which case the magnitude of the daily seasonality is fluctuating from day to day. Prophet assumes fixed daily seasonality, so what it fits is something like the average of what is seen in the history. Do you need to forecast at the hourly level? If not, aggregating the data into a daily total would probably make it much easier to forecast. Otherwise, you'll have to somehow specify to the model what is different about the days where there is a tall spike and the days where there isn't. As is, it has no way to know that.
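
Both options could be sketched roughly as follows, assuming the usual Prophet frame with `ds` and `y` columns. The first helper sums sub-daily rows into daily totals with pandas. The second shows one possible way to tell the model which days are different, using Prophet's `add_regressor` with a 0/1 flag. The column name `is_spike_day` and both function names are hypothetical, and the flag has to be known in advance for the future dates too.

```python
import pandas as pd


def aggregate_to_daily(df: pd.DataFrame) -> pd.DataFrame:
    """Sum sub-daily observations (columns ds, y) into one total per calendar day.

    Days inside the range that have no rows at all come out as 0.
    """
    return (
        df.set_index("ds")["y"]
        .resample("D")
        .sum()
        .reset_index()
    )


def fit_with_spike_flag(history: pd.DataFrame, future: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: fit Prophet with a 0/1 regressor marking spike days.

    history needs columns ds, y, is_spike_day; future needs ds, is_spike_day.
    """
    # Imported here so aggregate_to_daily works without Prophet installed.
    # Older releases use: from fbprophet import Prophet
    from prophet import Prophet

    m = Prophet()
    m.add_regressor("is_spike_day")  # hypothetical column name
    m.fit(history)
    return m.predict(future)
```

For example, `aggregate_to_daily(hourly_df)` returns a frame that can be passed straight to `Prophet().fit(...)`, with `make_future_dataframe(periods=..., freq="D")` for the forecast horizon.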

@bletham bletham closed this as completed Nov 6, 2020