Effect sizes in Bayesian regressions #8
Marsman's paper seems very interesting (unfortunately I didn't manage to get through the wall of equations 😞) If I understand correctly, the only existing alternative for Bayesian regressions (currently in the works in this package) is a smart standardization of the posterior that would approximate (partial) correlations or some standardized differences? |
@mattansb I've implemented an approximation of the test statistic for Bayesian models. Not sure if it makes a lot of sense from a theoretical perspective but it seems to work from an applied one 😅 https://easystats.github.io/effectsize/articles/bayesian_models.html The idea is to get t from the coef and the posterior SD, find the frequentist DoF (😱) and get the r posterior... Seems almost too easy |
I think it's time for a dev branch so we can discuss commits... |
👍 |
dev is up |
Okay, so I'll split my comment into de facto and de jure:
de facto
You are doing:
se <- sd(posterior)
t <- posterior / se
df <- from_freq_magic(...)
convert_t_to_r(t, df)
This will probably almost always give results very close to the true results... Except... How does this function deal with multiple dfs (I can't seem to re-build the package to check)? Like in mixed models? If it cannot return the approximately correct df, the calculation will not be a good approximation...
de jure
The idea of degrees of freedom in a Bayesian framework is almost heresy! That is - df represent the ~"number of bits of data free to change that can still result in this estimate". But in Bayes the estimates aren't based only on the data - but also on the prior (and thus also on previous data, and so forth). (Also there isn't 1 estimate, but a whole distribution.) So in the most technical sense, this is wrong 😅 But see de facto above 😉 |
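The de facto recipe can be sketched in base R as follows. This is only an illustration under stated assumptions: `posterior_to_r` is a hypothetical helper (not a function from effectsize or parameters), the draws are simulated rather than taken from a fitted model, and `df_error` has to be supplied by the caller, which is exactly the part the thread calls "frequentist magic".

```r
# Hypothetical helper (not part of any package): turn posterior draws of one
# coefficient into a "posterior" of r, given a borrowed frequentist df.
posterior_to_r <- function(draws, df_error) {
  se <- sd(draws)           # posterior SD stands in for the standard error
  t <- draws / se           # one t-like value per posterior draw
  t / sqrt(t^2 + df_error)  # standard t-to-r conversion, applied draw by draw
}

set.seed(1)
# Simulated draws for illustration only; a real use would extract them
# from the fitted Bayesian model.
draws <- rnorm(4000, mean = 0.5, sd = 0.2)
r_draws <- posterior_to_r(draws, df_error = 28)
summary(r_draws)
```

Everything that is specific to the model sits in `df_error`: if that value is off (for example in a mixed model with several candidate dfs), every draw of r is off with it.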
Here lies the problem. Currently, I added this function in parameters, and for mixed models the most straightforward and quick method is to 1) refit the model using lme4 (see here) and 2) use the Kenward-Roger approximation (same as for p values for lme4).
Haha I'd bet this hybrid approach would trigger all the bayestremists 😅 As a pragmatist I'd tend to say "as long as it serves better science" but it's true that it would be best to also be theoretically correct... But as we said, are there currently any better alternatives out there? Another thing I am wondering about is non-linear models, e.g., logistic regression. The freq models usually report a z value. But as we only have access to the sample SD of the posteriors (same as for linear models), does it make sense to still compute t and then convert t to r, or should we use z values? |
For these models there's no need for all this "black magic" because they have OR / log odds / rate etc. as the model parameters!
|
Forgot about that haha
Interesting... but not quite there yet 😞 |
library(parameters)
library(effectsize)
#>
#> Attaching package: 'effectsize'
#> The following objects are masked from 'package:parameters':
#>
#> cohens_f, epsilon_squared, eta_squared, omega_squared
model <- glm(vs ~ cyl + disp + drat, data = mtcars, family = "binomial")
parameters <- model_parameters(model)
#> Waiting for profiling to be done...
parameters$r <- convert_z_to_r(parameters$z, n = insight::n_obs(model))
parameters$r_from_odds <- convert_odds_to_r(parameters$Coefficient, log = TRUE)
parameters
#> Parameter | Coefficient | SE | 95% CI | z | df | p | r | r_from_odds
#> ---------------------------------------------------------------------------------------------
#> (Intercept) | 25.09 | 11.88 | [ 6.43, 56.94] | 2.11 | 28 | < .05 | 0.35 | 0.99
#> cyl | -1.99 | 1.05 | [-4.57, -0.13] | -1.89 | 28 | 0.06 | -0.32 | -0.48
#> disp | -0.01 | 0.02 | [-0.06, 0.02] | -0.47 | 28 | > .1 | -0.08 | 0.00
#> drat | -3.18 | 2.16 | [-8.70, 0.49] | -1.47 | 28 | > .1 | -0.25 | -0.66
Created on 2019-10-13 by the reprex package (v0.3.0)
The two "r" do not match tho... any idea why? |
I think it's because in a multiple regression setting, z and log odds don't directly correspond. In any case r doesn't make any sense in a non-Gaussian model. All of these "just because you can doesn't mean it makes sense" uses should be documented somewhere... Maybe drop Also Also also... The file names and contents for all these functions need to be made concise - it took me way too long to find Also also also, @DominiqueMakowski this thread is about Bayes! |
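To make the mismatch concrete, here is a base-R sketch of the two conversions. The formulas are the textbook ones (z to r via the sample size; log odds to d via sqrt(3)/pi, then d to r) and are only assumed to be what `convert_z_to_r()` and `convert_odds_to_r()` apply; the helper names `z_to_r` and `logodds_to_r` are hypothetical. They do reproduce the values in the table above for `cyl`, with n = 32 rows in `mtcars`.

```r
# Hypothetical stand-ins for the package converters, written out explicitly.
z_to_r <- function(z, n) {
  z / sqrt(z^2 + n)      # depends on the test statistic and the sample size
}

logodds_to_r <- function(logodds) {
  d <- logodds * sqrt(3) / pi  # log odds -> Cohen's d
  d / sqrt(d^2 + 4)            # d -> r; ignores SE and n entirely
}

r_from_z <- z_to_r(-1.89, n = 32)   # z for cyl in the table above
r_from_odds <- logodds_to_r(-1.99)  # coefficient (log odds) for cyl
round(c(r_from_z, r_from_odds), 2)
```

The first conversion depends on how precisely the coefficient is estimated, while the second depends only on its magnitude, so there is no reason for the two to agree in a multiple logistic regression.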
That's strange... Which one gives the most "accurate" r/d? For instance in a meta-analysis, if you want to convert everything to d, it's kinda problematic if two approaches give quite different values...
If you want to interpret it as variance explained, sure, but it makes sense if you just want to have all your effects on the same scale... and possibility + good documentation > impossibility (as we discussed several times ^^)
Since eta squared is equivalent to R2, which is a more generic name ("Eta Squared, however, is used specifically in ANOVA models.", here), wouldn't it be clearer to rename all those as t_to_r2 etc. (it looks better too)?
Agreed
U started! 😁 |
For d, the odds to d looks like the correct one.
It "makes sense" to want that, but it doesn't make sense to get that in this method - you should instead use
I think it's a great idea to have !
F_to_partial_R2 <- F_to_partial_eta_squared
t_to_partial_R2 <- t_to_partial_eta_squared
Then the
convert_t_to_partial_r <- function(t, df_error) sign(t) * sqrt(t_to_partial_R2(t, df_error))
(I think it is important to have the "partial" in the names or at the very least the doc title - users shouldn't think they're getting the simple correlation.)
You started! |
Going back to:
I said that de facto...
I'd like to amend that: It will work in cases where the prior is relatively lax (i.e., not strong); in these cases SD_posterior ≈ SE_frequentist (because the posterior ≈ the likelihood function). But this near-equality breaks down with strong priors, where SD_posterior < SE_frequentist, which will lead to mis-estimating the effect size. |
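A worked case makes the amendment explicit. Assume, purely for illustration (the thread does not specify any of this), a normal prior $\beta \sim N(\mu_0, \tau^2)$ and a normal approximation to the likelihood, $\hat\beta \mid \beta \sim N(\beta, \mathrm{SE}_{\text{frequentist}}^2)$. Precisions then add:

```latex
\[
\mathrm{SD}_{\text{posterior}}^{2}
  = \left( \frac{1}{\tau^{2}} + \frac{1}{\mathrm{SE}_{\text{frequentist}}^{2}} \right)^{-1}
  < \mathrm{SE}_{\text{frequentist}}^{2},
\qquad
\mathrm{E}[\beta \mid \hat\beta]
  = \mathrm{SD}_{\text{posterior}}^{2}
    \left( \frac{\mu_0}{\tau^{2}} + \frac{\hat\beta}{\mathrm{SE}_{\text{frequentist}}^{2}} \right).
\]
```

As $\tau \to \infty$ (a lax prior) the posterior SD approaches the frequentist SE and the posterior mean approaches $\hat\beta$, so the t-like ratio is close to the frequentist t. With a small $\tau$ (a strong prior) both the numerator and the denominator move away from their frequentist counterparts, and the df borrowed from the frequentist fit no longer belongs to the resulting ratio.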
Which is implemented in |
@strengejacke interesting.... Care to take a stab at that here for Gaussian models? (This will give a non-partial R2, if I understand correctly?) |
@DominiqueMakowski I suggest moving the related functions and vignette to a separate branch, as they don't yet fully work and there is an ongoing debate here about how to do this. It is already possible to get regular betas (standardized coefficients) for Bayesian models. |
Yes I agree, or rename it as something WIP |
@DominiqueMakowski What is left to do here? Standardized regression coeffs are available... Are we talking about an Eta2-like effect size? |
I am talking about a consistent and robust method to get a unified ("similar") index for different kinds of models (regressions with factors, interactions, and logistic models) 😬 |
Okay, I've come up with an idea.
library(rstanarm)
library(effectsize)

eta_squared_posterior <- function(model, partial = FALSE, nsamps = 1000, verbose = TRUE, ...) {
  if (!insight::model_info(model)$is_linear) {
    stop("Only applicable to linear models")
  }
  if (partial) {
    warning("Only non-partial PVE is supported.")
  }

  # get ppd (posterior predictive draws), sub-sampled to at most nsamps rows
  ppd <- rstantools::posterior_predict(model)
  nsamps <- min(nsamps, nrow(ppd))
  i <- sample(nrow(ppd), size = nsamps)
  ppd <- ppd[i, ]

  # get model data
  f <- insight::find_formula(model)$conditional
  X <- insight::get_predictors(model)
  resp_name <- insight::find_response(model)

  if (verbose) {
    message("Sampling effect size... This can take a while...")
  }

  # one row of the ppd = one simulated outcome vector
  res <- apply(ppd, 1, function(r) {
    # sampled outcome + predictors
    temp_dat <- X
    temp_dat[[resp_name]] <- r
    # fit a simple linear model
    temp_fit <- lm(f, temp_dat)
    # compute effect size
    ANOVA <- car::Anova(temp_fit, type = 3)
    es <- eta_squared(ANOVA, ci = NA, partial = FALSE)
    data.frame(t(setNames(es$Eta_Sq, es$Parameter)), check.names = FALSE)
  })

  # stack the per-draw results: one row per draw, one column per term
  res <- do.call("rbind", res)
  return(res)
}

model <- stan_lmer(mpg ~ wt + qsec * factor(am) + (1|cyl),
                   data = mtcars, refresh = 0)
pp_eta2 <- eta_squared_posterior(model)
#> Sampling effect size... This can take a while...
bayestestR::describe_posterior(pp_eta2)
#> # Description of Posterior Distributions
#>
#> Parameter | Median | 89% CI | pd | 89% ROPE | % in ROPE
#> ----------------------------------------------------------------------------
#> wt | 0.408 | [0.178, 0.642] | 1 | [-0.100, 0.100] | 0.000
#> qsec | 0.105 | [0.000, 0.276] | 1 | [-0.100, 0.100] | 53.984
#> factor(am) | 0.024 | [0.000, 0.100] | 1 | [-0.100, 0.100] | 100.000
#> qsec:factor(am) | 0.036 | [0.000, 0.133] | 1 | [-0.100, 0.100] | 92.368
## Compare to:
library(magrittr)
lm(mpg ~ wt + qsec * factor(am), data = mtcars) %>%
car::Anova(type = 3) %>%
eta_squared(partial = FALSE)
#> Parameter | Eta2 | 90% CI
#> -------------------------------------
#> wt | 0.44 | [0.21, 0.60]
#> qsec | 0.08 | [0.00, 0.27]
#> factor(am) | 0.05 | [0.00, 0.21]
#> qsec:factor(am) | 0.07 | [0.00, 0.24]
Created on 2020-05-25 by the reprex package (v0.3.0)
For now, this only works with non-partial eta squared, so it gives the total (unique) variance explained. For now, I only use a sub-sample of the ppd to speed it up. You can find it here: |
Finally some good news on this front, that looks super promising! So let me rephrase to see if I understood correctly: Let's say you have a model
|
Exactly! I will have to think if there is any way to make it faster, as it is currently pretty slow (working across each sample...). |
A collection of Stan threads that seem to support the logic of my method: Using the PPD (I also asked here if my method is applicable. Waiting for a response...) Using the posterior parameters |
Opening a new thread as we strayed from the topic of #7
Originally posted by @mattansb in #7 (comment)