
Add test with alpha as a RV for negative binomial models. #106

Open · wants to merge 2 commits into main

Conversation

@xjing76 (Contributor) commented Mar 8, 2022

This PR adds a test for negative binomial (NB) models with alpha as a random variable. However, the samples from the toy model setup do not converge close enough to the true alpha: the assertion

assert np.abs(trace.posterior.alpha.values[0].mean(0) - true_alpha) / true_alpha < 0.1

would not pass even with a very large number of samples.
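For reference, here is a minimal sketch of the kind of toy setup being described (a pymc3-style sketch; the names true_alpha and beta_true follow this thread, but the data sizes, priors, and hyperparameters are illustrative assumptions, not the PR's actual test):

```python
import numpy as np
import pymc3 as pm

rng = np.random.default_rng(1234)

# Simulated data (sizes and values are illustrative)
true_alpha = 100.0
beta_true = np.array([1.0, 0.5])
X = rng.normal(size=(100, 2))
mu_true = np.exp(X.dot(beta_true))
# numpy parameterizes NB as (n, p) with n = alpha, p = alpha / (alpha + mu)
y = rng.negative_binomial(true_alpha, true_alpha / (true_alpha + mu_true))

with pm.Model():
    beta = pm.Normal("beta", 0.0, 1.0, shape=2)
    alpha = pm.HalfNormal("alpha", sigma=100.0)  # alpha as a random variable
    mu = pm.math.exp(pm.math.dot(X, beta))
    pm.NegativeBinomial("y_obs", mu=mu, alpha=alpha, observed=y)
    trace = pm.sample(return_inferencedata=True)

# The assertion in question, which reportedly fails even with many samples
assert np.abs(trace.posterior.alpha.values[0].mean(0) - true_alpha) / true_alpha < 0.1
```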

@xjing76 changed the title from "Add tests with alpha as a RV for negative binomial models." to "Add test with alpha as a RV for negative binomial models." on Mar 8, 2022
@brandonwillard (Contributor) left a comment

First, for work like this, a report characterizing the data to be fit, the model, and the MCMC results (e.g. posterior trace plots, posterior predictive plots, relevant posterior summary statistics, convergence diagnostics, etc.) needs to be constructed. Without those elements, the only real conclusions that can be drawn are too high-level and/or "syntactic", and there isn't anything that stands out at those levels right now.

In such a report, it would help to confirm that each part is working as expected conditional on the other parts. For instance, does the HSStep sampler converge as expected when true_alpha is used in the model? How about the NUTS estimation of alpha when beta_true is used?
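For concreteness, the two conditional checks could look something like the following (a sketch reusing X, y, true_alpha, and beta_true from the snippet above; whether this matches the PR's test exactly is an assumption, and the repo-specific HSStep assignment is omitted in favor of plain pm.sample):

```python
# Check 1: fix alpha at its true value and estimate only beta
with pm.Model():
    beta = pm.Normal("beta", 0.0, 1.0, shape=2)
    mu = pm.math.exp(pm.math.dot(X, beta))
    pm.NegativeBinomial("y_obs", mu=mu, alpha=true_alpha, observed=y)
    trace_beta = pm.sample(return_inferencedata=True)

# Check 2: fix beta at its true value and estimate only alpha (NUTS by default)
with pm.Model():
    alpha = pm.HalfNormal("alpha", sigma=100.0)
    mu_fixed = np.exp(X.dot(beta_true))
    pm.NegativeBinomial("y_obs", mu=mu_fixed, alpha=alpha, observed=y)
    trace_alpha = pm.sample(return_inferencedata=True)
```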

Aside from that, perhaps alpha around 100 results in a large-ish overdispersion—under NegativeBinomial's parameterization—relative to a mean like exp(X.dot(beta)) for the given X and beta_true. If the "true" model generates mean values around 500 on average, then the negative-binomial variance is six times the corresponding Poisson model variance, I believe. This isn't necessarily bad, but it could be adding some understandable signal-to-noise trouble that prevents a quick recovery of the "true" parameters.
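A quick check of that arithmetic, assuming the usual mu/alpha parameterization in which the NB variance is mu + mu**2 / alpha:

```python
mu, alpha = 500.0, 100.0
nb_var = mu + mu**2 / alpha   # 500 + 2500 = 3000
poisson_var = mu              # 500
print(nb_var / poisson_var)   # 6.0, i.e. six times the Poisson variance
```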

@xjing76 (Contributor, Author) commented Mar 8, 2022

Here is a notebook report on the findings. After I reduce the alpha parameter to a smaller value, the true values are much easier to recover. However, when I set it to 100 there is still about a 20% overshoot (10% when the true beta is provided). I am not sure whether this is still a concern; I probably set alpha for the toy data in python-model-deployment unrealistically high (100) and should just reduce it.

@brandonwillard (Contributor) commented Mar 9, 2022

Here is a notebook report on the findings.

This is very helpful! It's missing a plot of the simulated data, though, which would help illustrate some of the challenges in this simulation. For instance, I'm pretty sure that the variance is very large.
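Something as simple as a histogram would do (a hypothetical regeneration of the toy data; the actual X, beta_true, and seed are whatever the notebook used):

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1234)
mu = np.exp(rng.normal(size=500))  # stand-in for exp(X.dot(beta_true))
y = rng.negative_binomial(100.0, 100.0 / (100.0 + mu))  # alpha = 100

plt.hist(y, bins=50)
plt.xlabel("simulated count")
plt.ylabel("frequency")
plt.title("Simulated negative binomial data")
plt.show()
```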

After I reduce the alpha parameter to a smaller value, the true values are much easier to recover. However, when I set it to 100 there is still about a 20% overshoot (10% when the true beta is provided). I am not sure whether this is still a concern; I probably set alpha for the toy data in python-model-deployment unrealistically high (100) and should just reduce it.

I think the results you're seeing are likely a result of a relatively small number of observations, a large variance, a small MCMC sample size, and/or the choice of alpha's prior.

It's difficult to see where the true values lie relative to the posterior distributions (see my comment), but none of them look too far off from the true values—at least not in a way that clearly implies systematic bias or the like. Aside from that, all of your alpha prior distributions were centered around the true value, so it's hard to gauge the quality of posterior estimation for that term.

Contrary to what you wrote in that notebook, it looks like the last model (i.e. with beta fixed at the true value) does worse than all the other ones that included alpha. Even though it's also centered at the true value, its posterior mean is noticeably lower. Plus, its trace plot doesn't look any better than the other ones (e.g. in terms of overt autocorrelation), but it's very difficult to tell given the small chain size.

Also, care needs to be taken when choosing priors for constrained parameters. In this case, alpha must be greater than zero, but the normal prior you've been using won't respect that on its own.
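For example (the hyperparameters here are purely illustrative), a gamma prior respects that constraint:

```python
import pymc3 as pm

with pm.Model():
    # Gamma has support on (0, inf), unlike a normal prior
    alpha = pm.Gamma("alpha", mu=50.0, sigma=50.0)
```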

To help move past some of the ambiguities in the current simulation, try

  • increasing the number of observations,
  • increasing the number of MCMC samples,
  • choosing a prior that matches alpha's constraints and isn't centered at the true value, but has enough density there to allow convergence.

You can also try decreasing the magnitude of the true beta (and alpha commensurately) in order to keep the simulated variance down.
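Concretely, those adjustments might look like this, reusing the setup from the first sketch above (all values are suggestions, not prescriptions):

```python
# More observations
X = rng.normal(size=(500, 2))
mu_true = np.exp(X.dot(beta_true))
y = rng.negative_binomial(true_alpha, true_alpha / (true_alpha + mu_true))

with pm.Model():
    beta = pm.Normal("beta", 0.0, 1.0, shape=2)
    # Positive-support prior, not centered at true_alpha = 100,
    # but with enough density there
    alpha = pm.Gamma("alpha", mu=50.0, sigma=50.0)
    mu = pm.math.exp(pm.math.dot(X, beta))
    pm.NegativeBinomial("y_obs", mu=mu, alpha=alpha, observed=y)
    # More MCMC samples
    trace = pm.sample(draws=1000, tune=1000, return_inferencedata=True)
```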

@xjing76 (Contributor, Author) commented Mar 9, 2022

Updated the notebook:

1. Increased the number of observations to 500.
2. Increased the number of MCMC samples to 1000.
3. Changed the alpha prior to a gamma distribution.
4. Added a model with a prior that is not centered around the true alpha.

I think the prior affects convergence quite a bit: based on the toy setup, the estimate usually ends up about 10% off from the true alpha.

@brandonwillard (Contributor) commented

Updated the notebook:

1. Increased the number of observations to 500.
2. Increased the number of MCMC samples to 1000.
3. Changed the alpha prior to a gamma distribution.
4. Added a model with a prior that is not centered around the true alpha.

I think the prior affects convergence quite a bit: based on the toy setup, the estimate usually ends up about 10% off from the true alpha.

Do you think one of these would work as a test?

@xjing76 force-pushed the NB_alpha_var branch 3 times, most recently from 438712d to f9a9afd on March 11, 2022 at 19:46
Labels: enhancement (New feature or request), samplers