Marginal R2 is a misnomer in brms #526
Comments
If we can expect to obtain a better marginal R2, we could make that explicit in the documentation (rather than changing the name, which would probably be super breaking) and try to find a way to compute a better score, assuming this is a goal we can reasonably have. |
It's a possible misnomer for Bayesian models, not for frequentist. |
TBC, I have no expertise here. I just wanted to bring your attention to this because it seemed like a credible argument from authority. Feel free to close once you decide what should be done. |
I think you can use |
I'm not sure what you mean @strengejacke about the distinction between Bayesian and frequentist frameworks? |
For frequentist models, we follow the approach from Nakagawa et al., computing the variances for the different components (via Lines 107 to 110 in 9b95409). This is the distinction of "marginal" vs. "conditional" proposed by Nakagawa et al., which is in line with many other packages that compute R2 for mixed models. For Bayesian mixed models, we rely on … But, since we're interested in the variance, …

```r
library(lme4)
library(brms)
library(performance)

data(sleepstudy)
data(cbpp)
cbpp$out <- ifelse(cbpp$incidence / cbpp$size > 0.2, 1L, 0L)

m1 <- lmer(Reaction ~ Days + (1 + Days | Subject), data = sleepstudy)
m2 <- brm(Reaction ~ Days + (1 + Days | Subject), data = sleepstudy, refresh = 0)

gm1 <- glmer(
  out ~ period + (1 | herd),
  data = cbpp,
  family = binomial
)
gm2 <- brm(
  out ~ period + (1 | herd),
  data = cbpp,
  family = bernoulli,
  refresh = 0
)

r2(m1)
r2(m2)
r2(gm1)
r2(gm2)
```

Hence, in terms of "predictions", we don't have predictions marginalized over random effects, but since we're not interested in point estimates, but rather in the variances, we still might produce the "correct" marginal and conditional R2. Not sure about this, though, and that's why there might be a misnomer. |
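To make the variance-based logic concrete, here is a self-contained base-R sketch of how Nakagawa-style marginal and conditional R2 fall out of the variance components. The data and variances are invented for illustration; this is not performance's actual implementation:

```r
# Toy sketch of Nakagawa-style R2 from variance components.
# All numbers are made up for illustration.
set.seed(1)
n_subj <- 30
n_obs  <- 10
subj <- rep(seq_len(n_subj), each = n_obs)
x    <- rep(seq_len(n_obs), times = n_subj)

beta <- 2
u <- rnorm(n_subj, sd = 3)          # random intercepts
e <- rnorm(n_subj * n_obs, sd = 2)  # residuals

fixed_part <- beta * x              # fixed-effects part of the linear predictor
y <- fixed_part + u[subj] + e

var_fixed  <- var(fixed_part)
var_random <- var(u[subj])
var_resid  <- var(e)

# Marginal R2: fixed effects only; conditional R2: fixed + random effects.
r2_marginal    <- var_fixed / (var_fixed + var_random + var_resid)
r2_conditional <- (var_fixed + var_random) / (var_fixed + var_random + var_resid)
```

By construction, the conditional R2 can never fall below the marginal one here, because the random-effect variance term is non-negative.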
Okay yeah I agree. I think it may not really be a Bayes/freq thing; rather, Paul would probably just object to the Nakagawa use of "marginal" there. |
It already can:

```r
library(brms)
library(performance)

data(sleepstudy)
data(cbpp)
cbpp$out <- ifelse(cbpp$incidence / cbpp$size > 0.2, 1L, 0L)

m2 <- brm(Reaction ~ Days + (1 + Days | Subject),
          data = sleepstudy,
          refresh = 0, backend = "cmdstanr")

r2_bayes(m2)
#> # Bayesian R2 with Compatibility Interval
#>
#>   Conditional R2: 0.793 (95% CI [0.758, 0.824])
#>      Marginal R2: 0.287 (95% CI [0.160, 0.406])

r2_nakagawa(m2)
#> # R2 for Mixed Models
#>
#>   Conditional R2: 0.822
#>      Marginal R2: 0.240
```
|
Am I understanding correctly that there isn't any problem with the code per se, but the issue is with the labeling? So it's not marginal in the sense that the random effects aren't integrated out, but are "ignored" (as in https://twitter.com/tjmahr/status/1581563839459385344). Correct? How does this explain why the original tweet had the marginal larger than the conditional? |
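The "ignored vs. integrated out" distinction only bites when the link is nonlinear. A quick Monte Carlo sketch with made-up numbers (not tied to any model in this thread) shows that, on the response scale of a logistic model, setting the random intercept to zero and averaging over its distribution give different answers (Jensen's inequality):

```r
# With a logistic link, plugging in u = 0 ("ignoring" the random effect)
# differs from averaging over u ("integrating it out").
# Numbers are invented for illustration.
set.seed(42)
eta_fixed <- 0.5                 # linear predictor from fixed effects only
u <- rnorm(1e5, sd = 2)          # draws from the random-intercept distribution

p_ignored    <- plogis(eta_fixed)            # random effect set to 0
p_integrated <- mean(plogis(eta_fixed + u))  # random effect integrated out

# p_integrated is shrunk toward 0.5 relative to p_ignored, so the two
# notions of "marginal" genuinely differ on the response scale.
```

For a linear model with an identity link the two coincide, which is why the labeling question is sharpest for GLMMs and for Bayesian response-scale R2.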
Maybe related to |
According to the brms author, performance should change its label:
https://twitter.com/paulbuerkner/status/1608896620699226113?s=46&t=3uzWoIWG2nAVyzeyGrxTEg