Fix any variance component to any value #833
Comments
You can do a lot, maybe everything you want. Any parameter in the model can be held at a specified value by using the start and map arguments. In SAS terms, parms (5) (3) (2) (3) / hold=1,3; would be equivalent to
(I used the default starting point, 0, for the non-fixed values, and used
It gets tricky if you want to constrain correlation parameters in a model with a random-effects vector of length 3 or more: see the 'mappings' section of http://glmmtmb.github.io/glmmTMB/articles/covstruct.html and https://github.com/glmmTMB/glmmTMB/blob/master/misc/glmmTMB_corcalcs.ipynb
|
Thanks for the detailed answer. This sounded great so I tried to apply it.
mod1
As a first check I fitted a model with a single basic random effect and fixed its variance, and it worked just as intended:
library(agridat)
library(broom.mixed)
library(glmmTMB)
library(tidyverse)
dat <- as_tibble(agridat::john.alpha)
theta_start <- c(5, 3, 2, 3) %>% sqrt() %>% log()
# Fix variance of single random effect ------------------------------------
mod1 <- glmmTMB(
yield ~ rep + (1 | gen),
REML = TRUE,
start = list(theta = theta_start[1]),
map = list(theta = factor(c(NA))),
data = dat
)
tidy(mod1, effects = "ran_pars", scales = "vcov")
#> # A tibble: 2 x 5
#> effect component group term estimate
#> <chr> <chr> <chr> <chr> <dbl>
#> 1 ran_pars cond gen var__(Intercept) 5
#> 2 ran_pars cond Residual var__Observation 0.134
mod2
And finally, as you can see, the
mod2 <- glmmTMB(
yield ~ rep + (1 | gen),
REML = TRUE,
start = list(theta = theta_start[1:2]),
map = list(theta = factor(c(NA, NA))),
data = dat
)
#> Error: parameter vector length mismatch: in ‘start’, length(theta)==2, should be 1
Created on 2022-07-11 by the reprex package (v2.0.1)
|
Two corrections: (1) the error variance model is parameterized by
mod2 <- glmmTMB(
yield ~ rep + (1 | gen),
REML = TRUE,
start = list(theta = theta_start[1],
betad = log(3)),
map = list(theta = factor(c(NA)),
betad = factor(c(NA))),
data = dat
)
sigma(mod2)^2 ## 3
VarCorr(mod2)$cond$gen[1,1] ## 5 |
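For readers following the numbers, the log-scale parameterization implied by the code in this thread can be written out as below. This is only a sketch inferred from the examples above (theta_start is computed as log(sqrt(variance)), and betad = log(3) yields sigma(mod2)^2 equal to 3). The symbols are illustrative notation, not taken from the glmmTMB documentation, and the scale of betad may differ in other glmmTMB versions.

```latex
\begin{align*}
  \theta  &= \log \sigma_{\mathrm{gen}} = \tfrac{1}{2} \log \sigma^2_{\mathrm{gen}},
  & \beta_d &= \log \sigma^2_{\mathrm{res}} \\
  % values fixed in mod2: genotype variance 5, residual variance 3
  \theta  &= \log \sqrt{5} \approx 0.805,
  & \beta_d &= \log 3 \approx 1.099
\end{align*}
```

Back-transforming gives exp(2 * theta) = 5 for the gen variance and exp(betad) = 3 for the residual variance, matching the two values printed for mod2.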
Oh wow, this is fantastic. It might be quite the game changer for my field.
library(agridat)
library(glmmTMB)
dat <- agridat::john.alpha
mod3 <- glmmTMB(yield ~ rep + (1 | gen),
dispformula = ~ 0,
REML = TRUE,
data = dat)
mod4 <- glmmTMB(yield ~ rep + (1 | gen),
start = list(betad = log(sqrt(.Machine$double.eps))),
map = list(betad = factor(c(NA))),
REML = TRUE,
data = dat)
sigma(mod3)^2
#> [1] 1.490116e-08
sigma(mod4)^2
#> [1] 1.490116e-08
VarCorr(mod3)
#>
#> Conditional model:
#> Groups Name Std.Dev.
#> gen (Intercept) 0.4518
VarCorr(mod4)
#>
#> Conditional model:
#> Groups Name Std.Dev.
#> gen (Intercept) 0.45179614
#> Residual 0.00012207
Created on 2022-07-13 by the reprex package (v2.0.1)
|
Yes, they are the same. I think a way to confirm that they are identical would be with the variance covariance matrix output from
|
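The agreement between mod3 and mod4 can also be checked by hand. The arithmetic below assumes, as the printed output suggests, that betad is the log of the residual variance in the glmmTMB version used here, and that .Machine$double.eps is the usual double-precision value of about 2.22e-16.

```latex
\begin{align*}
  \beta_d &= \log \sqrt{\varepsilon}, \qquad \varepsilon \approx 2.220446 \times 10^{-16} \\
  \sigma^2_{\mathrm{res}} &= \exp(\beta_d) = \sqrt{\varepsilon} \approx 1.490116 \times 10^{-8} \\
  \sigma_{\mathrm{res}} &= \varepsilon^{1/4} \approx 1.2207 \times 10^{-4}
\end{align*}
```

These values match sigma(mod4)^2 and the Residual standard deviation reported by VarCorr(mod4); sigma(mod3)^2 has the same value.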
I apologize for having so many questions, but this is very fruitful. Thank you very much. I noticed that when fixing all variances in a model, having
Would you agree or am I missing other potential issues?
library(agridat)
library(broom.mixed)
library(glmmTMB)
dat <- agridat::john.alpha
mod6_REML <- glmmTMB(yield ~ rep + (1 | gen),
start = list(theta = log(sqrt(3)), betad = log(5)),
map = list(theta = factor(c(NA)), betad = factor(c(NA))),
REML = TRUE,
data = dat)
#> Warning in fitTMB(TMBStruc): Model convergence problem; extreme or very small
#> eigenvalues detected. See vignette('troubleshooting')
tidy(mod6_REML, effects = "ran_pars", scales = "vcov")
#> Error in solve.default(as.matrix(Qm)): 'a' is 0-dimensional
mod6_ML <- glmmTMB(yield ~ rep + (1 | gen),
start = list(theta = log(sqrt(3)), betad = log(5)),
map = list(theta = factor(c(NA)), betad = factor(c(NA))),
REML = FALSE,
data = dat)
tidy(mod6_ML, effects = "ran_pars", scales = "vcov")
#> # A tibble: 2 x 5
#> effect component group term estimate
#> <chr> <chr> <chr> <chr> <dbl>
#> 1 ran_pars cond gen var__(Intercept) 3
#> 2 ran_pars cond Residual var__Observation 5
Created on 2022-07-13 by the reprex package (v2.0.1)
|
Yes, I would agree. These problems are both edge cases - we didn't foresee anyone fitting a REML model with all random effects parameters fixed (for some reason this also leads to a loss of dimension names in one place, which isn't hard to work around). The new REML_zero branch (probably to be integrated into master shortly) fixes these problems and gives us the same answers for both approaches. |
Ok, thanks again. I'd say it's fine if you closed the issue now. |
Yes, glmmTMB does not have a separate structure for "R-side modeling". I think the description in #653 is still the best. I did a little bit of work on trying to set up "true" zero-residual models, but these require some fairly different internal structure as well as more thought than I've had a chance to devote to the problem ... |
Is it correct that it is currently not possible to fix (some of the) variance components to values of my choice?
How SAS does it via parms and NOPROFILE / NOITER
source: SAS documentation
This linear mixed model has two random effects. Due to the additional i.i.d. residual variance, three values need to be provided in the parms statement. As the documentation says, this model will not iterate at all, but stick to the provided values:
How SAS does it via parms and HOLD=
source: SAS documentation
An even more flexible option is to individually choose what subset of the variance components should be fixed to a given value during the model fitting.
What glmmTMB() can do - I think?
Correct me if I am wrong, but the only variance component in a linear (mixed) model fit via glmmTMB() that can be fixed to a value is that of the i.i.d. residual, and it can only be fixed to (almost) zero via dispformula = ~ 0 (see e.g. #653).
Relevance
In my field, two-stage analyses are a relatively big deal. They are important enough that a major reason for not switching from SAS to R more often is the very limitation I am addressing here. I wrote this out in more detail, with citable references, in #638. Note that, as far as I know, this limitation applies not just to {glmmTMB} but to any linear mixed model package in R, though I'd be glad to be corrected in this regard.