ajr/negativebinomialvariants #1536

andrewjradcliffe · 2022-04-25T20:06:41Z

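This introduces 3 new distribution types, corresponding to
re-parameterizations of the NegativeBinomial distribution:

1. location-scale: [Negative binomial distribution (alternative parameterization), Stan](https://mc-stan.org/docs/2_29/functions-reference/nbalt.html)
2. loglocation-scale: [Negative binomial distribution (log alternative parameterization), Stan](https://mc-stan.org/docs/2_29/functions-reference/neg-binom-2-log.html)
3. shape-scale: [Negative binomial distribution, Bayesian Data Analysis (3rd edition), Appendix A](http://www.stat.columbia.edu/~gelman/book/BDA3.pdf)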

These have a number of uses. The location-scale and loglocation-scale
parameterizations emphasize that the negative binomial distribution
is a robust form of the Poisson, which is useful for negative binomial
regression involving offsets (a model frequently employed in
epidemiology and failure modeling). The shape-scale parameterization
confers superior numerical stability, which can be useful when working
with rare events (particularly when one wishes to draw samples).
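
For reference, the relationship between the location-scale form and the existing (r, p) form can be written out as below. This is a sketch based on the Stan parameterization named above, taking μ as the mean and ϕ as the dispersion; the symbols are assumptions for illustration, and the docstrings in the PR remain authoritative for the new types.

$$
\begin{aligned}
r &= \phi, & p &= \frac{\phi}{\mu + \phi}, \\
\mathbb{E}[X] &= \frac{r(1-p)}{p} = \mu, & \operatorname{Var}[X] &= \frac{r(1-p)}{p^2} = \mu + \frac{\mu^2}{\phi}.
\end{aligned}
$$

As ϕ grows, the variance approaches μ, which is the Poisson limit. In the log form the mean is μ = exp(η), so a regression offset enters additively, for example η = log(exposure) + xᵀβ.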

For rare events, the alternative parameterizations remain robust
for parameter values that would otherwise suffer floating-point
roundoff when expressed through the NegativeBinomial.
This can be demonstrated by considering the conversion of a
location-scale NegativeBinomial2 to NegativeBinomial, where p = ϕ / (μ + ϕ):

using Plots
gr(size=(600,400))
f(μ, ϕ) = ϕ / (μ + ϕ)
p1 = contour(1e-16:5e-17:1e-15, 1.0:0.01:10.0, (x, y) -> log(f(x, y)), ylabel="ϕ", xlabel="μ", right_margin=40*Plots.px);
p2 = heatmap(1e-16:5e-17:1e-15, 1.0:0.01:10.0, (x, y) -> log(f(x, y)), ylabel="ϕ", xlabel="μ", right_margin=40*Plots.px);

I write the above not to malign the choice of the NegativeBinomial
in terms of (r, p), but to emphasize that there are real
advantages to using the alternative parameterizations. Naturally,
conversions are provided between the parameterizations, so that
the experience is seamless.
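
The roundoff illustrated by the plots can also be reproduced at a single point. The snippet below is a minimal sketch in Base Julia; `p_from` is a hypothetical helper mirroring `f(μ, ϕ)` from the plotting code, and the stable expression is one possible way to avoid the problem, not necessarily what the PR implements.

```julia
# Hypothetical helper: success probability of the (r, p) form implied by a
# location-scale pair (μ, ϕ), identical to f(μ, ϕ) in the plotting code.
p_from(μ, ϕ) = ϕ / (μ + ϕ)

μ, ϕ = 1e-17, 1.0        # a rare event: μ is far below eps(1.0) ≈ 2.2e-16

p = p_from(μ, ϕ)
println(p == 1.0)        # true: μ + ϕ rounds to ϕ, so p collapses to exactly 1
println(log(p))          # 0.0: the information about μ is lost

# Working with (μ, ϕ) directly avoids forming μ + ϕ, because
# log(ϕ / (μ + ϕ)) == -log1p(μ / ϕ).
logp_stable = -log1p(μ / ϕ)
println(logp_stable)     # approximately -1e-17
```

Once `p` has rounded to 1, no later computation on the (r, p) form can recover μ, which is why keeping the original parameters matters here.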

Notes:

] test Distributions revealed no issues.
Documentation additions seem to work without issue.


codecov-commenter · 2022-04-25T21:09:17Z

Codecov Report

Merging #1536 (c98862a) into master (bb11df8) will decrease coverage by 0.08%.
The diff coverage is 77.90%.

@@            Coverage Diff             @@
##           master    #1536      +/-   ##
==========================================
- Coverage   85.45%   85.37%   -0.09%     
==========================================
  Files         128      131       +3     
  Lines        7819     8035     +216     
==========================================
+ Hits         6682     6860     +178     
- Misses       1137     1175      +38

| Impacted Files | Coverage Δ | |
|---|---|---|
| src/Distributions.jl | `100.00% <ø> (ø)` | |
| src/univariates.jl | `74.07% <ø> (ø)` | |
| ...nivariate/discrete/negativebinomialpoissongamma.jl | `72.00% <72.00%> (ø)` | |
| ...rc/univariate/discrete/negativebinomiallocation.jl | `75.00% <75.00%> (ø)` | |
| ...univariate/discrete/negativebinomialloglocation.jl | `76.00% <76.00%> (ø)` | |
| src/conversion.jl | `100.00% <100.00%> (ø)` | |
| src/univariate/discrete/bernoulli.jl | `89.39% <0.00%> (-1.92%)` | ⬇️ |
| src/univariate/continuous/uniform.jl | `92.20% <0.00%> (-0.74%)` | ⬇️ |
| src/univariate/discrete/binomial.jl | `94.28% <0.00%> (-0.28%)` | ⬇️ |
| src/univariate/discrete/geometric.jl | `89.47% <0.00%> (-0.24%)` | ⬇️ |
... and 8 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

devmotion

Thanks for the PR!

I made some suggestions. My main suggestion would be to use more descriptive names for the different parameterizations, similar to e.g. Normal/NormalCanon.

Additionally, I think the tests should be extended such that most/all newly added lines are covered.

docs/src/univariate.md

src/univariate/discrete/negativebinomial.jl

src/univariate/discrete/negativebinomial2.jl

src/univariate/discrete/negativebinomial2log.jl

src/univariate/discrete/negativebinomial2.jl

src/univariate/discrete/negativebinomial3.jl

andrewjradcliffe · 2022-04-25T23:16:20Z

> Additionally, I think the tests should be extended such that most/all newly added lines are covered.

With respect to tests, a quick inspection of the NegativeBinomial tests reveals that they mostly rely on R. The readme.md file in /test/ref/ seems to indicate that extending the R tests to these additional distributions should proceed analogously to NormalCanon?

Thoughts:

What sorts of additional tests should I add in Julia?
The R tests are nicely set up, but in this case they would merely verify the math specifying the conversions between parameterizations. Any other suggestions for tests?

devmotion · 2022-04-26T21:43:36Z

Yeah, unfortunately the test setup is a bit of a mess, mostly due to historical reasons.

Since NegativeBinomial is already checked against the values from R, I think a good start would be to check that the alternative parameterizations are consistent with NegativeBinomial in Julia. Currently, it seems that only the logpdf values are compared in this PR, but ideally we would check all implemented functions. Sampling should be checked as well, e.g., by computing summary statistics (not sure how it's done for NegativeBinomial).

Regarding R, I guess one could copy the existing references for NegativeBinomial and replace the Julia type and its parameters with the ones from the alternative parameterizations (i.e., update the Julia side without changing the R part).
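
The kind of consistency check suggested here could look roughly like the sketch below. It assumes Distributions.jl is installed and uses only the existing `NegativeBinomial(r, p)` type; `nb_from_mean_dispersion` and `ref_logpmf` are hypothetical names introduced for illustration and stand in for the PR's new types, whose constructors are not reproduced here.

```julia
using Distributions, Random, Statistics, Test

# Hypothetical helper (not the PR's API): map a location-scale pair (μ, ϕ)
# to the existing (r, p) parameterization via r = ϕ, p = ϕ / (μ + ϕ).
nb_from_mean_dispersion(μ, ϕ) = NegativeBinomial(ϕ, ϕ / (μ + ϕ))

# Reference log-pmf written directly in terms of (μ, ϕ). Restricting ϕ to an
# integer lets us use Base's `binomial` instead of a log-gamma function.
function ref_logpmf(μ, ϕ::Integer, k::Integer)
    return log(binomial(k + ϕ - 1, k)) + k * log(μ / (μ + ϕ)) + ϕ * log(ϕ / (μ + ϕ))
end

@testset "location-scale vs. (r, p)" begin
    μ, ϕ = 2.5, 3
    d = nb_from_mean_dispersion(μ, ϕ)

    # Moments implied by the location-scale form.
    @test mean(d) ≈ μ
    @test var(d) ≈ μ + μ^2 / ϕ

    # Pointwise agreement of the log-pmf over a range of counts.
    for k in 0:20
        @test logpdf(d, k) ≈ ref_logpmf(μ, ϕ, k)
    end

    # Sampling check via a summary statistic, with a fixed seed.
    x = rand(MersenneTwister(1234), d, 100_000)
    @test isapprox(mean(x), μ; atol=0.05)
end
```

The same loop could be extended to `cdf`, `quantile`, and the other implemented functions; the tolerance on the sample mean is a loose choice for this sample size, not a tuned value.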

src/univariate/discrete/negativebinomial2.jl

src/univariate/discrete/negativebinomial3.jl

(the previous commit was included only for archival purposes)

This follows the Poisson distribution's @check_args and, admittedly, did not arise until I was pressing some extreme cases of rare events and a sampler threw an exception due to the location being 0. (The location would never in fact be zero, but numerical roundoff might drive it there. I do not know the history of the Poisson distribution's @check_args, but I surmise something similar.)
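
To make the situation concrete, here is a minimal Base Julia sketch of an argument check that tolerates a zero location. `LocationScaleNB` is a hypothetical type used only for illustration; it is not the PR's type and does not use Distributions.jl's internal `@check_args` macro.

```julia
# Hypothetical type for illustration only; the PR's actual types differ.
struct LocationScaleNB{T<:Real}
    μ::T
    ϕ::T
    function LocationScaleNB{T}(μ::T, ϕ::T) where {T<:Real}
        # Accept μ == 0 so that a location driven to zero by roundoff or
        # underflow gives a degenerate distribution instead of an exception.
        μ >= zero(μ) || throw(ArgumentError("μ must be non-negative, got $μ"))
        ϕ > zero(ϕ) || throw(ArgumentError("ϕ must be positive, got $ϕ"))
        return new{T}(μ, ϕ)
    end
end
LocationScaleNB(μ::T, ϕ::T) where {T<:Real} = LocationScaleNB{T}(μ, ϕ)
LocationScaleNB(μ::Real, ϕ::Real) = LocationScaleNB(promote(μ, ϕ)...)

# A log-location of -800 underflows: exp(-800.0) is exactly 0.0 in Float64.
d0 = LocationScaleNB(exp(-800.0), 1.0)
println(d0.μ == 0.0)   # true
```

With a strict `μ > 0` check, the last construction would throw even though the intended location is positive, which matches the failure described above.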

DoktorMike · 2023-02-11T12:31:03Z

Just a friendly ping of this PR since I indeed would like to use these alternative parameterizations of the NegativeBinomial. Is there anything major holding a merge back?

andrewjradcliffe added 9 commits April 25, 2022 11:09

Modify placement of negative binomial conversions

d9cd214

Update export list to include new parameterizations

63747dc

Fix typo

71e6a6d

Update docstrings

4b573fd

Additional docstring adjustments

3701642

Make compatible with older Julia versions

0781eea

Actually fix destructuring errors

560a219

Change all to getproperty calls

5c20cd0

devmotion reviewed Apr 25, 2022


andrewjradcliffe and others added 19 commits April 25, 2022 14:55

Update src/univariate/discrete/negativebinomial.jl

d990253

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

Update src/univariate/discrete/negativebinomial2.jl

6a77786

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

Update src/univariate/discrete/negativebinomial2log.jl

a152734

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

Update src/univariate/discrete/negativebinomial2.jl

df5e05e

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

Update src/univariate/discrete/negativebinomial3.jl

8cb6b96

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

Update src/univariate/discrete/negativebinomial2log.jl

7f0bf1d

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

Update src/univariate/discrete/negativebinomial2.jl

8c7c867

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

Update src/univariate/discrete/negativebinomial2.jl

fc23887

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

Update src/univariate/discrete/negativebinomial3.jl

1807f0c

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

Update src/univariate/discrete/negativebinomial2log.jl

1d4e3b9

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

Update src/univariate/discrete/negativebinomial2log.jl

8c0329f

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

Update src/univariate/discrete/negativebinomial3.jl

e001ec7

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

Update src/univariate/discrete/negativebinomial3.jl

be4ff38

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

Update src/univariate/discrete/negativebinomial2log.jl

38ae782

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

Update src/univariate/discrete/negativebinomial2.jl

1002243

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

Update src/univariate/discrete/negativebinomial2.jl

ad9902c

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

Update src/univariate/discrete/negativebinomial3.jl

1e0075a

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

Update src/univariate/discrete/negativebinomial2log.jl

6bbd10a

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

Update src/univariate/discrete/negativebinomial2.jl

433e69e

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

st-- mentioned this pull request Apr 28, 2022

Update NegativBinomial parametrizations with more stable Distributions.jl backend JuliaGaussianProcesses/GPLikelihoods.jl#85

Open

st-- reviewed Apr 28, 2022

src/univariate/discrete/negativebinomial2.jl (outdated)

st-- reviewed Apr 28, 2022

src/univariate/discrete/negativebinomial2.jl (outdated)

st-- reviewed Apr 28, 2022

src/univariate/discrete/negativebinomial3.jl (outdated)

andrewjradcliffe added 19 commits May 5, 2022 18:46

Update NegativeBinomial2

8867d1e

Update NegativeBinomial2Log

28c797c

Update NegativeBinomial3

46b7b13

Update conversions

4bb68fc

Re-name files to match types

9076c59

Update names in univariates list

e7dfcef

Update extant tests

a542b30

Re-name test files

2ad9239

Update names in export list

105fd0b

Fix minor oversight in zero-arg constructor

8fc3164

Fix re-name in test

3881886

Fix conversions

d11de19

Clarify parametric conversions in docstrings

a415728

Fix cquantile

61aa914

Make note of alias in docstring

d6084d5

Add negative binomial tests, including expanded code

a008eb4

Clean, non-repetitive form of tests

84b0c2c

(the previous commit was included only for archival purposes)

Update docs

c7c9d00

Polish the docstrings

781b4be

andrewjradcliffe requested a review from devmotion May 6, 2022 20:16

andrewjradcliffe added 3 commits June 2, 2022 09:22

Minor touch-ups to docstrings

c8489f6

Correct checkargs on log-location parameter

55b53e4
