
A closer look at dispformula=~0 - never needing the R-side in gaussian mixed models again? #653

Closed
SchmidtPaul opened this issue Dec 17, 2020 · 6 comments


@SchmidtPaul

SchmidtPaul commented Dec 17, 2020

In the plant sciences and related fields we often deal with linear mixed models with non-iid variance structures:

  • sometimes for random effects (on the G-side)
  • but more often for the residual (R-side).

Therefore, in the past I was forced to use nlme with its weights = argument and basically hoped that my model would not get too complex. When it did, I switched to the asreml package or even SAS's PROC MIXED with its REPEATED statement. Both are certainly great, but not open source. I was happy to learn about glmmTMB's ability to deal with variance structures while giving me the advantages of lme4, and I was really happy when I found this in the documentation:

In Gaussian mixed models, dispformula=~0 fixes the residual variance to be 0 (actually a small non-zero value: at present it is set to sqrt(.Machine$double.eps)), forcing variance into the random effects.

Fixing the residual variance to 0 and then mimicking it with an additional observation-level random effect should theoretically lead to identical results. In addition, I can then use all the available variance structures on this pseudo error term. So all the problems are solved! What, then, is this issue about? Basically this:

How well does dispformula=~0 really do the trick? What are the caveats and is there more info/citeable proof somewhere of how well it works?

Here is an example of how the pseudo-error-approach can lead to obtaining essentially identical variance component estimates and AICc values.

library(agridat)
library(AICcmodavg)
library(glmmTMB)
library(mixedup)
library(tidyverse)

dat <- agridat::mcconway.turnip %>% 
  as_tibble() %>% 
  mutate(unit = 1:n()) %>% 
  mutate_at(vars(density, unit), as.factor)

m1 <- glmmTMB(yield ~ gen*date*density + 
                (1 | block), 
              dispformula = ~ 1, # default i.e. iid error variance
              REML = TRUE,
              data = dat)

m2 <- glmmTMB(yield ~ gen*date*density + 
                (1 | block) +
                (1 | unit),      # pseudo error term
              dispformula = ~ 0, # fix error variance to be 0
              REML = TRUE,
              data = dat)
#> Warning in fitTMB(TMBStruc): Model convergence problem; false convergence (8).
#> See vignette('troubleshooting')

mixedup::extract_vc(m1)
#>      group    effect variance    sd sd_2.5 sd_97.5 var_prop
#> 1    block Intercept    2.812 1.677  0.635   4.431    0.227
#> 2 Residual              9.591 3.097     NA      NA    0.773

mixedup::extract_vc(m2)
#>      group    effect variance    sd sd_2.5 sd_97.5 var_prop
#> 1    block Intercept    2.810 1.676  0.634   4.430    0.227
#> 2     unit Intercept    9.592 3.097  2.519   3.808    0.773
#> 3 Residual              0.000 0.000     NA      NA    0.000

AICcmodavg::aictab(list(m1, m2), c("m1", "m2"))
#> 
#> Model selection based on AICc:
#> 
#>     K   AICc Delta_AICc AICcWt Cum.Wt      LL
#> m1 18 323.34          0    0.5    0.5 -136.07
#> m2 18 323.34          0    0.5    1.0 -136.07

Created on 2020-12-17 by the reprex package (v0.3.0.9001)

Although this works to my satisfaction, two caveats already become apparent:

  1. As is the case here, we sometimes get a false convergence warning for the pseudo-error-model but not for the standard model.
  2. The variance components will never be exactly identical, if only because the error variance is not actually fixed to 0 but to "a small non-zero value". You address this in a comment in glmmTMB.R which reads "FIXME: Depending on the final estimates, we should somehow check that this fixed dispersion is small enough."

It can be argued that these are minor, acceptable issues that are not very relevant in practice. But this is basically how far my train of thought has come, and I wanted to reach out to you because I am interested in your thoughts on this.

Thanks!

More stuff on this:
Shamelessly referring to my GitHub page with more examples of how different mixed models with variance structures translate between packages.
Also, this issue bbolker/broom.mixed#97 is related

@bbolker
Contributor

bbolker commented Dec 22, 2020

One piece of low-hanging fruit would be to allow the small non-zero value (currently log(sqrt(.Machine$double.eps))) to be user-settable via glmmTMBControl ...
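A sketch of what that could look like from the user side; the control-argument name zerodisp_val is purely illustrative here, not a documented API, and the value stays on the log scale to match the current internal default:

```r
## Hypothetical usage sketch -- the argument name `zerodisp_val` is an
## assumption for illustration, not an existing glmmTMBControl option.
m2b <- glmmTMB(yield ~ gen * date * density + (1 | block) + (1 | unit),
               dispformula = ~ 0,
               control = glmmTMBControl(zerodisp_val = log(1e-10)),
               REML = TRUE,
               data = dat)
```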

@SchmidtPaul
Author

SchmidtPaul commented Dec 28, 2020

Do you think that experimenting with even smaller non-zero values would actually resolve issues in some cases? I assume you chose sqrt(.Machine$double.eps) deliberately, with plain .Machine$double.eps leading to problems because it is too small (?). (For clarity: I know the code stores the log() of that small number, but only because exp() is taken again later, right?) Overall I have the feeling that although this may well be a low-hanging fruit, it is not necessarily worth the effort.
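For reference, the magnitudes involved are:

```r
.Machine$double.eps             # 2.220446e-16 (IEEE double machine epsilon)
sqrt(.Machine$double.eps)       # 1.490116e-08, the fixed residual SD
log(sqrt(.Machine$double.eps))  # -18.02183, the value stored on the log scale
```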

To me this issue was actually more about finding out whether you had any objections to or apprehensions about the approach described. It seems that you don't, so I am fine with closing this issue, and in the future I can refer skeptical colleagues to it 😀

P.S. I wrote a mini documentation on this now

@bbolker
Contributor

bbolker commented Jan 2, 2021

There is an example here (you'll need to remotes::install_github("glmmTMB/glmmTMB/glmmTMB@dispzero_flex") first ...)

I admit I don't understand the behaviour of this example: the random-effects variance decreases a lot when the fixed residual variance is decreased. I don't know whether this represents a bug (although the code to implement this feature is pretty simple, it's hard to see what could go wrong ...?) or an aspect of the stats I don't understand. It might be worth trying with a less mis-specified model (e.g. Reaction ~ Days + (Days|Subject)) and/or trying it in a model with an observation-level random effect, where var(obs_random) + fixed_sigma^2 should equal the estimated sigma^2 in the 'null' model (no obs-level random effect) ...
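The observation-level check suggested above could be sketched roughly like this (assuming the branch above is installed; with dispformula = ~ 0, sigma() returns the small fixed residual SD):

```r
library(glmmTMB)
data("sleepstudy", package = "lme4")
sleepstudy$obs <- factor(seq_len(nrow(sleepstudy)))

## 'null' model: ordinary residual variance
m0 <- glmmTMB(Reaction ~ Days + (Days | Subject), data = sleepstudy)

## observation-level random effect absorbs the residual variance
m_obs <- glmmTMB(Reaction ~ Days + (Days | Subject) + (1 | obs),
                 dispformula = ~ 0, data = sleepstudy)

## expectation: var(obs random effect) + fixed_sigma^2 should be close
## to the estimated sigma^2 of the 'null' model
vc_obs <- VarCorr(m_obs)$cond$obs[1, 1]
c(split = vc_obs + sigma(m_obs)^2, null = sigma(m0)^2)
```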

@bbolker
Contributor

bbolker commented Jan 7, 2021

This is now in #659 (and there are some better tests).

@bbolker
Contributor

bbolker commented Jan 9, 2021

Thoughts on this? Shall we go ahead and close it once #659 is accepted? I admit I don't have much to say about this, beyond the fact that adding this flexibility seems like something that has advantages for people who want to use it and no big downside (a tiny increase in code complexity). @SchmidtPaul do you want to play around with it a bit?

As far as resolving errors goes: in my experience the "false convergence" warning has always been fairly mysterious. From the documentation:

the gradient ∇f(x) may be computed incorrectly, the other stopping tolerances may be too tight, or either f or ∇f may be discontinuous near the current iterate x

In this case it's hard to believe the gradient is computed incorrectly (since it's computed by AD) or that the objective function or gradient are discontinuous, which leaves "the other stopping tolerances may be too tight". I don't even know what that means. My guess would be that any improvement on the fit/change in warning behaviour would be haphazard.

@SchmidtPaul
Author

Thank you for the effort and your comments on this! As stated above, it would be fine for me to close this issue now as I may or may not find time to play around with it in the future.
