ajr/negativebinomialvariants #1536
base: master
This introduces 3 new distribution types, corresponding to re-parameterizations of the `NegativeBinomial` distribution as

1. location-scale: [Negative binomial distribution (alternative parameterization), Stan](https://mc-stan.org/docs/2_29/functions-reference/nbalt.html)
2. loglocation-scale: [Negative binomial distribution (log alternative parameterization), Stan](https://mc-stan.org/docs/2_29/functions-reference/neg-binom-2-log.html)
3. shape-scale: [Negative binomial distribution, Bayesian Data Analysis (3rd edition), Appendix A](http://www.stat.columbia.edu/~gelman/book/BDA3.pdf)

These have a number of uses. In particular, 1. and 2. emphasize that the negative binomial distribution is a robust form of the Poisson, which has noted utility for negative binomial regression involving offsets (a model frequently employed in epidemiology and failure modeling). The shape-scale parameterization confers superior numerical stability, which can be useful when working with rare events (particularly when one wishes to draw samples).

For rare events, the alternative parameterizations are robust in the presence of values which might otherwise suffer from floating-point roundoff using the `NegativeBinomial`. This can be demonstrated by considering the conversion of a location-scale `NegativeBinomial2` to `NegativeBinomial`:

```
using Plots
gr(size=(600,400))

# Success probability p of NegativeBinomial(r, p) implied by location μ and scale ϕ
f(μ, ϕ) = ϕ / (μ + ϕ)

# log(p) over a grid of very small μ (rare events) and moderate ϕ
p1 = contour(collect(1e-16:5e-17:1e-15), collect(1.0:0.01:10.0), (x, y) -> log(f(x, y)), ylabel="ϕ", xlabel="μ", right_margin=40*Plots.px);
p2 = heatmap(collect(1e-16:5e-17:1e-15), collect(1.0:0.01:10.0), (x, y) -> log(f(x, y)), ylabel="ϕ", xlabel="μ", right_margin=40*Plots.px);
```

I write the above not to malign the choice of the `NegativeBinomial` in terms of (`r`, `p`), but to emphasize that there are real advantages to using the alternative parameterizations. Naturally, conversions are provided between the parameterizations, so that the experience is seamless.
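The roundoff concern can also be checked without plotting. The following is a minimal sketch in plain Julia, not code from this PR: `to_rp` is a hypothetical helper, and the mapping `r = ϕ`, `p = ϕ / (μ + ϕ)` is an assumption based on the Stan location-scale definition (mean `μ`, variance `μ + μ²/ϕ`).

```julia
# Hypothetical helper (not part of the PR): map location-scale (μ, ϕ)
# to the (r, p) parameters, assuming r = ϕ and p = ϕ / (μ + ϕ).
to_rp(μ, ϕ) = (r = ϕ, p = ϕ / (μ + ϕ))

μ, ϕ = 1e-17, 1.0          # a very rare event
r, p = to_rp(μ, ϕ)

# Mean implied by (r, p) for a failures-before-r-successes count: r * (1 - p) / p
mean_rp = r * (1 - p) / p

# log(p) computed from the rounded p versus directly from (μ, ϕ)
log_p_naive  = log(p)
log_p_stable = -log1p(μ / ϕ)

println(p == 1.0)          # μ + ϕ rounds to ϕ in Float64, so p is exactly 1
println(mean_rp)           # the information about μ is lost after conversion
println(log_p_naive, " vs ", log_p_stable)
```

Here `μ` is smaller than half the spacing between `1.0` and the next `Float64`, so the sum `μ + ϕ` cannot represent it, while a computation that works with `μ / ϕ` directly keeps the small value.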
Codecov Report
@@ Coverage Diff @@
## master #1536 +/- ##
==========================================
- Coverage 85.45% 85.37% -0.09%
==========================================
Files 128 131 +3
Lines 7819 8035 +216
==========================================
+ Hits 6682 6860 +178
- Misses 1137 1175 +38
Thanks for the PR!
I made some suggestions. My main suggestion would be to use more descriptive names for the different parameterizations, similar to e.g. `Normal`/`NormalCanon`.
Additionally, I think the tests should be extended such that most/all newly added lines are covered.
Co-authored-by: David Widmann <devmotion@users.noreply.github.com>
With respect to tests, a quick inspection of the Thoughts:
Yeah, unfortunately the test setup is a bit of a mess, mostly due to historical reasons. Since Regarding R, I guess one could copy the existing references for
(the previous commit was included only for archival purposes)
This follows the Poisson distribution's `@check_args` and, admittedly, did not arise until I was pressing some extreme cases of rare events and a sampler threw an exception because the location was 0. (The location would never in fact be exactly zero, but numerical roundoff might drive it there. I do not know the history of the Poisson distribution's `@check_args`, but I surmise something similar.)
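As a rough illustration of this kind of argument check, here is a self-contained sketch in plain Julia. The type `LocScaleNB` and its error messages are hypothetical and are not the definitions added in this PR; the sketch only shows a constructor that accepts a location of exactly zero while still rejecting negative values.

```julia
# Hypothetical location-scale type, for illustration only.
struct LocScaleNB{T<:Real}
    μ::T
    ϕ::T
    function LocScaleNB(μ::T, ϕ::T) where {T<:Real}
        # μ == 0 is allowed, so a location that rounds to zero does not throw.
        μ >= zero(μ) || throw(DomainError(μ, "μ must be non-negative"))
        ϕ > zero(ϕ) || throw(DomainError(ϕ, "ϕ must be positive"))
        return new{T}(μ, ϕ)
    end
end

d = LocScaleNB(0.0, 2.0)   # accepted: zero location

# A negative location is still rejected.
rejected = try
    LocScaleNB(-1.0, 2.0)
    false
catch err
    err isa DomainError
end
println(rejected)
```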
Just a friendly ping of this PR since I indeed would like to use these alternative parameterizations of the `NegativeBinomial`. Is there anything major holding a merge back?
Notes: `] test Distributions` revealed no issues.