Add NB and ZINB likelihoods #1656

dirmeier · 2021-03-27T18:09:00Z

PR type: new feature

Summary

Ciao all,

In case you're interested, this PR adds negative binomial (NB) and zero-inflated negative binomial (ZINB) likelihoods. Both are used fairly often, for instance in genomics, to model overdispersion in count data. If you want to merge this, I'd add some tests and use cases and clean it up a bit; otherwise, feel free to close it again, of course.

Cheers,
Simon

Minimal working example

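import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

import gpflow
from gpflow.kernels import Exponential
from gpflow.likelihoods import ZeroInflatedNegativeBinomial  # added by this PR

tfd = tfp.distributions

n = 30
k = Exponential()

# Draw a latent function from the GP prior at n evenly spaced inputs.
x = np.linspace(-1, 1, n).reshape(-1, 1)
f = tfd.MultivariateNormalTriL(
    np.repeat(0.0, x.shape[0]),
    tf.linalg.cholesky(k(x, x))
).sample(1, seed=23).numpy().reshape((n, 1))

# Simulate overdispersed counts using the latent function as logits.
y = tfd.NegativeBinomial(logits=f, total_count=10.0).sample(1, seed=23).numpy()
y = y.reshape((n, 1))

# Fit a variational GP with the new zero-inflated likelihood.
lik = ZeroInflatedNegativeBinomial()
m = gpflow.models.VGP((x, y),
                      mean_function=gpflow.mean_functions.Constant(),
                      likelihood=lik, kernel=k)

opt = gpflow.optimizers.Scipy()
opt.minimize(m.training_loss, variables=m.trainable_variables)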
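
For readers unfamiliar with the zero-inflated variant, the following is a minimal sketch of a ZINB log-probability in plain Python. It is not the code in this PR: the function names, the mean/dispersion parametrisation (variance = mean + alpha * mean**2), and the zero-inflation weight `pi` are assumptions chosen for illustration.

```python
import math


def nb_log_pmf(y: int, mean: float, alpha: float) -> float:
    """Log-pmf of a negative binomial with variance mean + alpha * mean**2.

    Assumed parametrisation (illustrative only): r = 1 / alpha, p = r / (r + mean).
    Requires mean > 0 and alpha > 0.
    """
    r = 1.0 / alpha
    p = r / (r + mean)
    return (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(p) + y * math.log1p(-p)
    )


def zinb_log_pmf(y: int, mean: float, alpha: float, pi: float) -> float:
    """Log-pmf of a zero-inflated negative binomial.

    With probability pi the observation is a structural zero; otherwise it is
    drawn from the negative binomial above. Requires 0 <= pi < 1.
    """
    log_nb = nb_log_pmf(y, mean, alpha)
    if y == 0:
        # A zero can come from either mixture component.
        return math.log(pi + (1.0 - pi) * math.exp(log_nb))
    # A positive count can only come from the negative binomial component.
    return math.log1p(-pi) + log_nb
```

Setting `pi = 0` recovers the plain negative binomial, which is a convenient sanity check for any implementation.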

st-- · 2021-03-31T16:44:37Z

Hi, and welcome to the GPflow community! Thanks for your contribution. I actually needed a negative binomial likelihood in a previous project. This is on the edge between "should just be its own package on top of GPflow" (maintained by you however you see fit, for as long as you care to) and "should be in GPflow core" (maintained by the GPflow maintainers, in general for as long as the GPflow project exists). I'd be willing to incorporate this into GPflow itself, but it'd be great if you could address the following points:

Run make format to apply black's formatting to the code (the rest of the tests won't run otherwise)
Add the appropriate setup to tests/gpflow/likelihoods/test_likelihoods.py to ensure the likelihood is tested (this gives us confidence that the code is correct and helps ensure we won't break it in the future)
Ideally, add a notebook (or a section within an existing notebook) that shows users how to use the likelihood, maybe comparing it against Poisson?

Would you be up for that?
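
As a rough illustration of why a comparison against Poisson is informative, the sketch below scores a small set of overdispersed counts under both models using only the standard library. The data are made up for this example, the helper names are hypothetical, and the dispersion is a simple moment estimate rather than anything fitted by GPflow.

```python
import math


def poisson_log_lik(counts, mean):
    """Sum of Poisson log-pmfs at a shared mean."""
    return sum(y * math.log(mean) - mean - math.lgamma(y + 1) for y in counts)


def nb_log_lik(counts, mean, alpha):
    """Sum of negative binomial log-pmfs with variance mean + alpha * mean**2."""
    r = 1.0 / alpha
    p = r / (r + mean)
    return sum(
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(p) + y * math.log1p(-p)
        for y in counts
    )


def moment_estimates(counts):
    """Return the sample mean and a moment-matched dispersion alpha."""
    n = len(counts)
    mean = sum(counts) / n
    var = sum((y - mean) ** 2 for y in counts) / n
    # Solve var = mean + alpha * mean**2 for alpha; clip to stay positive.
    alpha = max((var - mean) / mean**2, 1e-8)
    return mean, alpha


# Made-up counts whose variance is far larger than their mean.
counts = [0, 0, 1, 2, 3, 5, 8, 13, 21, 0]
mean, alpha = moment_estimates(counts)
print("Poisson log-likelihood:", poisson_log_lik(counts, mean))
print("NB log-likelihood:     ", nb_log_lik(counts, mean, alpha))
```

When the sample variance clearly exceeds the mean, as it does here, the negative binomial assigns the data a higher log-likelihood, because a Poisson model forces the variance to equal the mean.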

st-- · 2021-03-31T16:47:24Z

One thought specific to the negative binomial likelihood is that there are different parametrisations (type I, type II, and I believe some more that are less commonly used). It might be good to make that explicit in the docstring, or even to provide both parametrisations.
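
To make the difference between parametrisations concrete, here is a small sketch that maps two common mean/dispersion forms onto the classical (r, p) negative binomial. The function names are hypothetical, and the forms are labelled by their variance function rather than as "type I" or "type II", because references assign those labels differently and the thread does not say which convention is meant.

```python
def quadratic_variance_params(mean, alpha):
    """Map (mean, alpha) to (r, p) so that Var[y] = mean + alpha * mean**2."""
    r = 1.0 / alpha
    p = r / (r + mean)
    return r, p


def linear_variance_params(mean, delta):
    """Map (mean, delta) to (r, p) so that Var[y] = mean * (1 + delta)."""
    r = mean / delta
    p = 1.0 / (1.0 + delta)
    return r, p


def nb_moments(r, p):
    """Mean and variance of a negative binomial with pmf
    Gamma(y + r) / (Gamma(r) * y!) * p**r * (1 - p)**y."""
    mean = r * (1.0 - p) / p
    var = mean / p
    return mean, var


# Same mean, different variance functions.
print(nb_moments(*quadratic_variance_params(4.0, 0.5)))
print(nb_moments(*linear_variance_params(4.0, 1.0)))
```

At any single mean the two forms can describe the same distribution (take delta = alpha * mean), but they differ in how the variance grows as the mean changes, which matters once the mean is driven by a latent GP.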

dirmeier · 2021-03-31T16:55:57Z

> Would you be up for that?

Yeah, definitely, no problem.

xliiauo · 2021-04-04T17:21:26Z

Hi @dirmeier, thanks for your contribution. I was looking to implement the negative binomial likelihood as well before I found your PR. I'm more than happy to help with any additional work needed to get this PR merged. Thank you.

codecov · 2021-04-05T10:28:22Z

Codecov Report

Merging #1656 (555628d) into develop (405eb97) will increase coverage by 0.03%.
The diff coverage is 100.00%.

@@             Coverage Diff             @@
##           develop    #1656      +/-   ##
===========================================
+ Coverage    96.08%   96.12%   +0.03%     
===========================================
  Files           86       86              
  Lines         3861     3893      +32     
===========================================
+ Hits          3710     3742      +32     
  Misses         151      151

Impacted Files	Coverage Δ
gpflow/likelihoods/__init__.py	`100.00% <100.00%> (ø)`
gpflow/likelihoods/scalar_discrete.py	`97.02% <100.00%> (+1.37%)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

dirmeier · 2021-04-05T20:13:13Z

Ciao @st-- ,
I've adapted the docstrings, added a notebook, and updated the test setup for both the negative binomial and its zero-inflated version. Since you didn't mention the ZINB, should I remove it from this PR?

Cheers,
Simon

dirmeier · 2021-05-01T10:18:55Z

Ciao @st-- , did you find the time to have a look at this PR?

mathDR · 2021-11-23T02:31:09Z

bumping this.

Add NB and ZINB likelihoods

704b978

Add unit tests for NB and ZINB

c41a422

dirmeier added 2 commits April 5, 2021 01:34

Add notebook on NB regression

077ee60

Make format

855eef7

Add entry in docs

555628d
