Add NB and ZINB likelihoods #1656

Open: dirmeier wants to merge 5 commits into develop

Conversation

@dirmeier dirmeier commented Mar 27, 2021

PR type: new feature

Summary

Ciao all,

in case you're interested, this PR adds negative binomial (NB) and zero-inflated negative binomial (ZINB) likelihoods. Both are used fairly often, for instance in genomics, to model overdispersion in count data. If you want to merge this, I'd add some tests and use cases and clean it up a bit. Otherwise, feel free to close it again, of course.

Cheers,
Simon

Minimal working example

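import gpflow
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp
from gpflow.kernels import Exponential
# Added by this PR; assumed to be exported from gpflow.likelihoods on the PR branch.
from gpflow.likelihoods import ZeroInflatedNegativeBinomial

tfd = tfp.distributions

n = 30
k = Exponential()

# Draw one latent function f from the GP prior on a regular grid.
x = np.linspace(-1, 1, n).reshape(-1, 1)
f = tfd.MultivariateNormalTriL(
    np.repeat(0.0, x.shape[0]),
    tf.linalg.cholesky(k(x, x))
).sample(1, seed=23).numpy().reshape((n, 1))

# Simulate counts from a negative binomial whose logits are the latent f.
y = tfd.NegativeBinomial(logits=f, total_count=10).sample(1, seed=23).numpy()
y = y.reshape((n, 1))

# Fit a variational GP with the new likelihood.
lik = ZeroInflatedNegativeBinomial()
m = gpflow.models.VGP((x, y),
                      mean_function=gpflow.mean_functions.Constant(),
                      likelihood=lik, kernel=k)

opt = gpflow.optimizers.Scipy()
opt.minimize(m.training_loss, variables=m.trainable_variables)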
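
For readers unfamiliar with the zero-inflated variant, here is a minimal standard-library sketch of the textbook ZINB log-probability. It is not the code in this PR: the function names `nb_logpmf` and `zinb_logpmf`, the parameter names `r`, `p`, `pi`, and the choice of parametrisation (count of failures before the `r`-th success, success probability `p`) are assumptions made for illustration only.

```python
import math


def nb_logpmf(y, r, p):
    """Log pmf of a negative binomial: y failures before the r-th success,
    with success probability p (illustrative parametrisation)."""
    return (
        math.lgamma(y + r) - math.lgamma(y + 1) - math.lgamma(r)
        + r * math.log(p) + y * math.log1p(-p)
    )


def zinb_logpmf(y, r, p, pi):
    """Log pmf of a zero-inflated negative binomial.

    With probability pi the observation is a structural zero; otherwise it
    is drawn from the negative binomial above.
    """
    nb = nb_logpmf(y, r, p)
    if y == 0:
        # A zero can come from either mixture component.
        return math.log(pi + (1.0 - pi) * math.exp(nb))
    return math.log1p(-pi) + nb


r, p, pi = 10.0, 0.5, 0.3
# The pmf should sum to one; the tail beyond 500 is negligible for these values.
total = sum(math.exp(zinb_logpmf(y, r, p, pi)) for y in range(500))
p_zero_nb = math.exp(nb_logpmf(0, r, p))
p_zero_zinb = math.exp(zinb_logpmf(0, r, p, pi))
print(total, p_zero_nb, p_zero_zinb)
```

The zero-inflation weight only adds mass at `y == 0`, which is why the zero case is handled separately from the rest of the support.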

@st--
Member

st-- commented Mar 31, 2021

Hi & welcome to the GPflow community! Thanks for your contribution. I actually needed a negative binomial likelihood in a previous project. This is on the edge between "should just be its own package on top of GPflow" (maintained by you however you see fit, for as long as you care about it) and "should be in GPflow-core" (maintained by the GPflow maintainers, in general for as long as the GPflow project keeps existing). I'd be willing to incorporate this into GPflow itself, but it'd be great if you could address the following points:

  • Run make format to apply black's formatting to the code (the rest of the tests won't run otherwise)
  • Add the appropriate setup to tests/gpflow/likelihoods/test_likelihoods.py to ensure the likelihood is tested (this gives us confidence that the code is correct, and helps ensure we won't break it in the future)
  • Ideally add a notebook (or a section within a pre-existing notebook) that shows users how to use the likelihood, maybe comparing against Poisson?

Would you be up for that?
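
As a rough idea of what a Poisson comparison could show, the following standard-library sketch simulates overdispersed counts from a gamma-Poisson mixture and compares the Poisson and negative binomial log-likelihoods. Everything here is an illustrative assumption rather than part of the PR or of GPflow: the helper names, the simulated data, and the method-of-moments dispersion estimate are made up for the example.

```python
import math
import random


def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for the small rates used here."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def poisson_loglik(ys, lam):
    return sum(y * math.log(lam) - lam - math.lgamma(y + 1) for y in ys)


def nb_loglik(ys, mu, alpha):
    """Negative binomial with mean mu and variance mu + alpha * mu**2."""
    r = 1.0 / alpha
    p = r / (r + mu)
    return sum(
        math.lgamma(y + r) - math.lgamma(y + 1) - math.lgamma(r)
        + r * math.log(p) + y * math.log1p(-p)
        for y in ys
    )


rng = random.Random(0)
true_mu, true_alpha = 5.0, 1.0
# A gamma-distributed Poisson rate yields negative binomial counts.
ys = [
    sample_poisson(rng.gammavariate(1.0 / true_alpha, true_mu * true_alpha), rng)
    for _ in range(500)
]

mean = sum(ys) / len(ys)
var = sum((y - mean) ** 2 for y in ys) / (len(ys) - 1)
# Method-of-moments dispersion estimate, clipped to stay positive.
alpha_hat = max((var - mean) / mean ** 2, 1e-6)

ll_pois = poisson_loglik(ys, mean)
ll_nb = nb_loglik(ys, mean, alpha_hat)
print(f"mean={mean:.2f} var={var:.2f} ll_pois={ll_pois:.1f} ll_nb={ll_nb:.1f}")
```

When the sample variance clearly exceeds the sample mean, the Poisson model has no parameter to absorb the extra spread, so the negative binomial fit attains the higher log-likelihood.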

@st--
Member

st-- commented Mar 31, 2021

One thought specific to the negative binomial likelihood is that there are different parametrisations (type I, type II, and I believe some more that are less commonly used). It may be good to make that explicit in the docstring, or even to provide both parametrisations.
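
To make the difference concrete, here is a small standard-library sketch, reading "type I" and "type II" as the NB1 and NB2 parametrisations common in the count-regression literature (that reading is an assumption). Under NB1 the variance is linear in the mean, Var[y] = mu (1 + alpha); under NB2 it is quadratic, Var[y] = mu + alpha mu^2. The function names and the `(r, p)` convention are illustrative and not taken from the PR.

```python
import math


def nb_logpmf(y, r, p):
    """Log pmf for y failures before the r-th success, success probability p."""
    return (
        math.lgamma(y + r) - math.lgamma(y + 1) - math.lgamma(r)
        + r * math.log(p) + y * math.log1p(-p)
    )


def nb1_to_rp(mu, alpha):
    """NB1: Var[y] = mu * (1 + alpha), linear in the mean."""
    return mu / alpha, 1.0 / (1.0 + alpha)


def nb2_to_rp(mu, alpha):
    """NB2: Var[y] = mu + alpha * mu**2, quadratic in the mean."""
    r = 1.0 / alpha
    return r, r / (r + mu)


def moments(r, p, upper=2000):
    """Mean and variance by direct summation over a truncated support."""
    probs = [math.exp(nb_logpmf(y, r, p)) for y in range(upper)]
    mean = sum(y * q for y, q in enumerate(probs))
    var = sum((y - mean) ** 2 * q for y, q in enumerate(probs))
    return mean, var


mu, alpha = 4.0, 0.5
mean1, var1 = moments(*nb1_to_rp(mu, alpha))
mean2, var2 = moments(*nb2_to_rp(mu, alpha))
print(mean1, var1)  # same mean, variance mu * (1 + alpha)
print(mean2, var2)  # same mean, variance mu + alpha * mu**2
```

Both parametrisations share the mean `mu` but disagree on how the variance grows with it, so a docstring that names the convention avoids silent mismatches between fitted dispersion values.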

@dirmeier
Author

> Would you be up for that?

yeah, definitely, no problem.

@xliiauo

xliiauo commented Apr 4, 2021

Hi @dirmeier, thanks for your contribution. I was looking to implement the negative binomial likelihood as well before I found your PR. I'm more than happy to help with any additional work needed to get it merged. Thank you.

@codecov

codecov bot commented Apr 5, 2021

Codecov Report

Merging #1656 (555628d) into develop (405eb97) will increase coverage by 0.03%.
The diff coverage is 100.00%.

@@             Coverage Diff             @@
##           develop    #1656      +/-   ##
===========================================
+ Coverage    96.08%   96.12%   +0.03%     
===========================================
  Files           86       86              
  Lines         3861     3893      +32     
===========================================
+ Hits          3710     3742      +32     
  Misses         151      151              

| Impacted Files | Coverage Δ |
|---|---|
| gpflow/likelihoods/__init__.py | 100.00% <100.00%> (ø) |
| gpflow/likelihoods/scalar_discrete.py | 97.02% <100.00%> (+1.37%) ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@dirmeier
Author

dirmeier commented Apr 5, 2021

Ciao @st-- ,
I've adapted the docstrings, added a notebook and updated the test setup for both the negative binomial and its zero-inflated version. Since you didn't mention the ZINB, should I remove it from this PR?

Cheers,
Simon

@dirmeier
Author

dirmeier commented May 1, 2021

Ciao @st-- , did you find the time to have a look at this PR?

@mathDR
Contributor

mathDR commented Nov 23, 2021

bumping this.

4 participants