Typo/mistake in glmmTMB documentation regarding parameterization of nbinom1 #951

Closed
sophiafrei opened this issue Oct 5, 2023 · 4 comments

Comments

@sophiafrei

sophiafrei commented Oct 5, 2023

Hello,
I am quite sure this is a mistake: for the nbinom1 parameterization, the documentation states V=mu*(1+phi).
I think this should be V=mu*(1+1/phi). As it stands, it is not consistent with the nbinom2 parameterization...
Kind regards,
Sophia Freimüller

@bbolker
Contributor

bbolker commented Oct 5, 2023

I'll take a look. The source code that computes the log-likelihood is here:

case truncated_nbinom1_family:
  // Was:
  //   s1 = mu(i);
  //   s2 = mu(i) * (Type(1)+phi(i));  // (1+phi) guarantees that var >= mu
  //   tmp_loglik = dnbinom2(yobs(i), s1, s2, true);
  s1 = log_inverse_linkfun(eta(i), link);          // log(mu)
  s2 = s1 + etad(i);                               // log(var - mu)
  tmp_loglik = dnbinom_robust(yobs(i), s1, s2, true);

This is a little hard to read because all the computation is done on the log scale. You also need to know that the dnbinom_robust function is parameterized by the log(mu) and log(var-mu) arguments.

Also, etad(i) is the linear predictor for the dispersion parameter for observation i. In a simple model with dispformula = ~1 this is just betad, and sigma.glmmTMB() generally returns exp(betad). (So etad(i) is log(phi), however phi is defined ...)

Doing the rest of the algebra suggests that the documentation is actually correct.

s2 = log(var - mu) = log(mu) + log(phi)
var - mu = mu*phi
var = mu*(phi + 1)
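
The algebra above can also be checked numerically. The following is a minimal Python sketch, not glmmTMB code: `nb_logpmf` and `nbinom1_moments` are hypothetical helper names, and the log-pmf is written from the standard negative binomial density rather than copied from `dnbinom_robust`. It builds the distribution from log(mu) and log(var - mu) = log(mu) + log(phi), as in the snippet, and then sums the pmf to recover the mean and variance.

```python
import math

def nb_logpmf(y, mu, var):
    """Log-pmf of a negative binomial with mean mu and variance var (var > mu)."""
    size = mu * mu / (var - mu)   # usual 'size' parameter (often called k or theta)
    prob = size / (size + mu)     # success probability in the (size, prob) form
    return (math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
            + size * math.log(prob) + y * math.log1p(-prob))

def nbinom1_moments(log_mu, log_phi, y_max=2000):
    """Mean and variance obtained by summing the pmf over y = 0 .. y_max - 1."""
    s1 = log_mu                   # log(mu), as in the C++ snippet
    s2 = s1 + log_phi             # log(var - mu) = log(mu) + log(phi)
    mu = math.exp(s1)
    var = mu + math.exp(s2)       # so var = mu + mu*phi = mu*(1 + phi)
    probs = [math.exp(nb_logpmf(y, mu, var)) for y in range(y_max)]
    mean = sum(y * p for y, p in enumerate(probs))
    variance = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, variance

mean, variance = nbinom1_moments(math.log(3.0), math.log(0.5))
print(mean, variance, 3.0 * (1 + 0.5))
```

With mu = 3 and phi = 0.5, the summed variance should agree with mu*(1+phi) up to floating-point error, and not with mu*(1+1/phi).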

It is unfortunate but true that different corners of the statistical world parameterize distributions differently. NB2 is most often parameterized as var = mu*(1+mu/theta) (theta and k are the most common symbols for the dispersion parameter), although DESeq (a very popular package for bioinformatics) uses var = mu*(1+alpha*mu), i.e. the reciprocal parameterization. NB1 is most commonly parameterized the way described here.
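
To make the three variance formulas in this comment concrete, here is a small Python sketch. The function names are made up for illustration and do not belong to glmmTMB or DESeq; the formulas are the ones quoted above.

```python
def var_nb2_theta(mu, theta):
    """NB2 with the common theta (or k) parameterization: var = mu*(1 + mu/theta)."""
    return mu * (1 + mu / theta)

def var_nb2_alpha(mu, alpha):
    """NB2 with the reciprocal (DESeq-style) parameterization: var = mu*(1 + alpha*mu)."""
    return mu * (1 + alpha * mu)

def var_nb1(mu, phi):
    """NB1 as described in this thread: var = mu*(1 + phi)."""
    return mu * (1 + phi)

mu, theta = 4.0, 2.0
# alpha = 1/theta gives the same NB2 variance under either convention.
print(var_nb2_theta(mu, theta), var_nb2_alpha(mu, 1 / theta))
# NB1 variance is linear in mu; NB2 variance is quadratic in mu.
print(var_nb1(mu, 0.5), var_nb1(2 * mu, 0.5))
```

The two NB2 forms differ only by alpha = 1/theta, whereas NB1 is a different mean-variance relationship, not a reparameterization of NB2.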

@bbolker
Contributor

bbolker commented Oct 6, 2023

Is there any clarification we can add to the documentation to avoid this confusion?

@sophiafrei
Author

At least for me, it would be less confusing if the two families used different parameter symbols. I am aware of the parameterizations with theta (or nu) and alpha (alpha = 1/theta) as in Hilbe (2011). You could use theta for nbinom2 and alpha for nbinom1. As it is right now, using phi for both implies an indirect relationship between phi_nbinom1 and phi_nbinom2, if I understand correctly.
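
One way to see what such an "indirect relationship" would look like is sketched below in Python. It assumes that glmmTMB's nbinom2 uses var = mu*(1 + mu/phi), i.e. the theta-style form with phi in place of theta; that is an assumption here and is not stated in this thread. `matching_nbinom1_phi` is a hypothetical helper.

```python
def matching_nbinom1_phi(mu, phi_nbinom2):
    """phi for nbinom1 that gives the same variance as nbinom2 at mean mu.

    Solves mu*(1 + phi1) = mu*(1 + mu/phi2) for phi1, under the assumed
    nbinom2 form var = mu*(1 + mu/phi2).
    """
    return mu / phi_nbinom2

# The matching value depends on mu, so no single constant converts
# one phi into the other across observations with different means.
for mu in (2.0, 8.0):
    print(mu, matching_nbinom1_phi(mu, 4.0))
```

Under that assumption the two phi values describe different mean-variance relationships and agree only at one particular mean.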

@bbolker
Contributor

bbolker commented Oct 23, 2023

I'm going to go ahead and close this. While it might be nice to sit down and review all the parameterizations and make them as consistent with each other as possible, we will always have inconsistency somewhere (either between the different parameterizations within glmmTMB, or between glmmTMB and common parameterizations in the literature). I've added a clarifying note to the documentation.
