Use Negative Binomial or Poisson to handle counts data? #337

Open
wlad-svennik opened this issue Oct 24, 2017 · 13 comments

Comments

@wlad-svennik
Contributor

wlad-svennik commented Oct 24, 2017

There is a simple regression algorithm for count data, called Poisson regression. It assumes that every regressor has a multiplicative effect on the expected count. It's similar to regressing on the log of the data, except it works even when the data contains zeros.

It's conceivable that you could replace the Poisson distribution with the more general Negative Binomial distribution. The NB distribution is a generalization of the Poisson that allows the variance to exceed the mean. In contrast, in a Poisson distribution the variance is always equal to the mean.

The main difficulty with just changing the Normal distribution to a Negative Binomial is that it's then necessary to add the constraint that $0 < \mu < \sigma^2$.

Does this seem like a good idea?
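
To make the constraint concrete, here is a minimal sketch of converting a mean and variance into negative binomial parameters. The function name `nb_params_from_moments` and the example numbers are illustrative assumptions, not part of Prophet; the (r, p) convention used is "number of failures before the r-th success".

```python
def nb_params_from_moments(mu, var):
    """Convert a mean and variance into negative binomial (r, p).

    Convention: the count is the number of failures before the r-th
    success, so mean = r * (1 - p) / p and var = mean / p.
    """
    if not 0 < mu < var:
        # The variance must exceed the mean (overdispersion).
        raise ValueError("need 0 < mu < var")
    p = mu / var
    r = mu * mu / (var - mu)
    return r, p


# Hypothetical moments: mean 4, variance 12.
r, p = nb_params_from_moments(mu=4.0, var=12.0)
mean_back = r * (1 - p) / p
var_back = mean_back / p
print(r, p, mean_back, var_back)
```

When the variance equals the mean the conversion breaks down (division by zero in `r`), which is the Poisson limit of the family.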

@bletham
Contributor

bletham commented Nov 2, 2017

Stan has negative binomial parameterizations that don't require constraints, such as neg_binomial_2. I think this is a very reasonable thing to do. Basically the model is

y(t) ~ Normal(g(t) + X*beta, sigma**2)

where g(t) is the rate model. The alternative would then be

y(t) ~ NegativeBinomial(g(t) + X*beta, phi)

My only concern would be that if the count values are relatively small then we would likely need a lot of data to reasonably estimate seasonality. I think MCMC would be important here. On the other hand if counts are large, then a normal distribution probably provides a reasonable enough approximation. Worth doing though!
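
As a rough illustration of why a mean/dispersion parameterization needs no constraint between mean and variance, the sketch below writes out the log-probability for a negative binomial with mean `mu` and dispersion `phi`, where the variance is `mu + mu**2 / phi`. This is the form Stan documents for neg_binomial_2; the helper names `nb2_logpmf` and `poisson_logpmf` and the example values are assumptions made for this example.

```python
import math


def nb2_logpmf(y, mu, phi):
    """Log-probability of count y with mean mu and dispersion phi.

    The variance is mu + mu**2 / phi, so any mu > 0 and phi > 0 is valid.
    """
    return (
        math.lgamma(y + phi) - math.lgamma(y + 1) - math.lgamma(phi)
        + phi * math.log(phi / (mu + phi))
        + y * math.log(mu / (mu + phi))
    )


def poisson_logpmf(y, mu):
    """Log-probability of count y under a Poisson with mean mu."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)


mu, phi = 3.0, 2.0  # hypothetical values
probs = [math.exp(nb2_logpmf(y, mu, phi)) for y in range(500)]
mean = sum(y * q for y, q in enumerate(probs))
var = sum((y - mean) ** 2 * q for y, q in enumerate(probs))
print(mean, var)  # numerically close to mu and mu + mu**2 / phi

# As phi grows, the extra variance vanishes and the model approaches Poisson.
print(nb2_logpmf(2, mu, 1e6), poisson_logpmf(2, mu))
```

Both parameters only need to be positive, so a sampler can work with them directly without enforcing that the variance exceed the mean.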

@kidddw

kidddw commented May 3, 2019

I recently modified the underlying Stan model to assume a negative binomial distribution like the one suggested above. My concern at the moment is that prophet appears to scale the target values before fitting. This cannot be done with such a distribution, which assumes integer target values.

Are there any thoughts on the sensitivity of the prophet model to the scaling of the data? For instance, are parameters initialized under the assumption that the target values will be between 0 and 1?
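
The scaling problem can be seen with a few numbers. The sketch below assumes, for illustration only, that scaling means dividing by the largest absolute value of the series; the variable names and data are made up.

```python
counts = [0, 3, 7, 12, 5]  # hypothetical count series

# Assumption for illustration: scale by the largest absolute value,
# which maps the series into [0, 1].
y_scale = max(abs(c) for c in counts)
scaled = [c / y_scale for c in counts]

# A count likelihood is only defined on non-negative integers.
is_count = [v >= 0 and float(v).is_integer() for v in scaled]
print(scaled)
print(is_count)
```

Only the values that happen to land on 0 or 1 remain valid counts, so a count likelihood would have to be evaluated on the unscaled data.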

@bletham
Contributor

bletham commented May 6, 2019

Good point about the scaling for a negative binomial likelihood.

The Stan model has a hard-coded prior of N(0, 1/2) on the noise term. This is rather weak for y \in [0, 1], but would probably need to be adjusted if the data were not scaled.

The priors on changepoint delta, seasonality beta, and holiday beta would also be scale-dependent. These can be set directly with the changepoint.prior.scale, seasonality.prior.scale, and holidays.prior.scale inputs, so there is no real difficulty there other than that the default values might be really bad for unscaled data.

I'm pretty sure there isn't anything else that is scale dependent or assumes y <= 1.

One other thing to note is that the Gaussian likelihood is encoded in both the Stan model and the R/Python code, so you'll also need to adjust the sample_model function (https://github.com/facebook/prophet/blob/master/R/R/prophet.R#L1559) to make predictions.

@oren0e

oren0e commented Jun 16, 2019

Regarding the Poisson likelihood: I thought it would be out in version 0.5. Do you know when you'll add this feature?

@bletham
Contributor

bletham commented Jun 17, 2019

We were working on getting it in place with #865 but that got stuck by some perf issues in rstan for which we're still trying to figure out the best workaround, so unfortunately not yet.

@sammo

sammo commented Sep 17, 2019

Hi @bletham. Thank you for this amazing package called fbprophet!! It's moving the forecasting world to a new level.
Question: are there any plans to get the negative binomial likelihood into 0.6?

@eromoe

eromoe commented Oct 22, 2019

@bletham

> got stuck by some perf issues in rstan

Could we assume it is fine in pystan? If so, could you please release a working Python version for now?

@bletham
Contributor

bletham commented Feb 21, 2020

I was targeting for this to happen after #501, which would have made it easier / more generic, but after the addition of the cmdstanpy backend I've decided to no longer pursue #501 so this should just be done directly.

This should be able to look a lot like how there is currently a switch between the linear and logistic trends; here we would have a switch between likelihoods. The likelihood is defined in Stan, right here:



So that would need a switch to alternatively use a NB/Poisson.

What is currently yhat in the rest of the codebase would now be the rate of that process. I feel like this is what we are really interested in anyway, so I don't think we'd need to make any changes to that. What would need to be changed is the uncertainty estimation. For that, the Gaussian likelihood shows up here:

noise = np.random.normal(0, sigma, df.shape[0]) * self.y_scale
return pd.DataFrame({
    'yhat': trend * (1 + Xb_m) + Xb_a + noise,
    'trend': trend
})

prophet/R/R/prophet.R

Lines 1579 to 1583 in 46e5611

  noise <- stats::rnorm(nrow(df), mean = 0, sd = sigma) * m$y.scale
  return(list("yhat" = trend * (1 + Xb_m) + Xb_a + noise,
              "trend" = trend))
}

And I believe that should be it. So code-wise, this should not require massive changes. The main question I have is around validation: checking on some realistic small-count datasets that this is doing something reasonable, making sure that the fitting doesn't fail, and deciding whether we need NB or Poisson is sufficient.
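
For the uncertainty estimation, the change amounts to drawing counts around the predicted rate instead of adding Gaussian noise to it. The following is a minimal standard-library sketch of that idea, not Prophet code: `sample_poisson`, `sample_nb`, the `rate` values, and `phi` are all hypothetical, and the negative binomial draw uses a gamma-Poisson mixture with mean `mu` and variance `mu + mu**2 / phi`.

```python
import math
import random


def sample_poisson(lam, rng):
    """Knuth's multiplication method. Suitable for small rates only,
    since exp(-lam) underflows for lam above roughly 700."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def sample_nb(mu, phi, rng):
    """Draw a count with mean mu and variance mu + mu**2 / phi
    as a gamma-Poisson mixture."""
    lam = rng.gammavariate(phi, mu / phi)  # shape phi, scale mu / phi
    return sample_poisson(lam, rng)


rng = random.Random(0)
rate = [2.0, 5.0, 9.0]  # hypothetical positive rates, one per forecast row
phi = 4.0               # hypothetical dispersion
yhat_draw = [sample_nb(mu, phi, rng) for mu in rate]
print(yhat_draw)
```

Repeating such draws many times per row and taking quantiles would give count-valued uncertainty intervals, in the same way the Gaussian noise draws are used now.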

@bletham
Contributor

bletham commented Jun 18, 2020

I just added a working negative binomial likelihood in #1544. It's a work in progress and will need some more validation, in particular we'll want to try it out on some real datasets. If you have real time series with count data that you can share, please post them on #1500. Thanks!

@oren0e

oren0e commented Jun 28, 2020

@bletham I would be interested to contribute somehow to the effort of bringing these count-data likelihoods into the library since this was super important for me at my previous job. What can I do?

@bletham
Contributor

bletham commented Jul 13, 2020

@oren0e sorry for the slow reply, and thanks for being willing to contribute!
I'd like to get the NB likelihood in the package in the next couple weeks, and then do a version release at the end of the month. I have an initial sketch version in #1500 (PR is #1544). My main concern now is whether or not this will actually work on real data (I'm worried about numerical issues from the logit link function), so if you still have any real count-data time series that we could test this on, that'd be most useful. Beyond that, my PR is pretty bare bones. A review of #1544 would be great, especially to be sure I didn't mess things up when switching from the stan NB parameterization to the scipy parameterization, and then we'll have to see if there are any edge cases that PR isn't handling correctly, write tests, translate it to R, write documentation.

@oren0e

oren0e commented Jul 13, 2020

@bletham sadly I don't have the data I was working on, since it was left on my previous company's servers.
I will do my best to review the PR. I hope it does not involve a lot of Stan syntax, since I'm not familiar with it. I will add comments to the PR if I find something useful.

@bletham
Contributor

bletham commented Mar 1, 2021

I implemented a NB likelihood in #1544. There were significant numerical issues around the hinge function that is required to convert the latent forecast into a positive process rate. Discussion of this is in #1500. As discussed there, I'm not very optimistic about the NB likelihood being broadly useful in Prophet due to these challenges. For the purposes of handling small-count data (especially when we're trying to get a forecast that stays positive), there are some much more robust approaches that are explored in #1668 that I think provide a better direction than a NB likelihood. So in light of the issues in my PR, this effort is deprioritized and probably won't ever make it into the package. Though interested individuals can of course patch in my PR and try it out!
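
One way trouble can arise when mapping a latent forecast to a positive rate is sketched below. This is an illustration under stated assumptions, not a description of what #1544 does: the thread does not say which function the PR uses, and `hard_hinge`, `softplus`, and the example values are hypothetical. A Poisson likelihood is used for brevity; the same issue applies to a negative binomial mean.

```python
import math


def poisson_logpmf(y, rate):
    """Poisson log-probability; -inf when rate is 0 and y > 0."""
    if rate == 0.0:
        return 0.0 if y == 0 else float("-inf")
    return y * math.log(rate) - rate - math.lgamma(y + 1)


def hard_hinge(x):
    """Clip negative latent values to a rate of exactly zero."""
    return max(x, 0.0)


def softplus(x):
    """Smooth, positive alternative: log(1 + exp(x)), computed stably."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


latent = -0.5  # hypothetical latent forecast that dips below zero
y = 2          # hypothetical observed count at that time step

print(poisson_logpmf(y, hard_hinge(latent)))  # a zero rate cannot explain y > 0
print(poisson_logpmf(y, softplus(latent)))    # finite log-probability
```

A hard clip gives a log-probability of negative infinity and a zero gradient wherever the latent forecast is negative, which is hard on an optimizer; a smooth map avoids the infinity but changes how the latent value relates to the rate.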

6 participants