
Would a model_avg function make sense? #82

Closed
lukeholman opened this issue May 1, 2018 · 9 comments
@lukeholman

Thanks for an interesting and well-documented package!

So, I am new to Bayesian approaches. In the frequentist world, I did many analyses like this, namely specifying a set of plausible models, ranking them by their AIC values, and averaging the models with the appropriate weightings to obtain model-averaged parameter estimates, predictions etc.

full_model <- lm(y ~ x1 + x2, data = dat, na.action = "na.fail")  # dredge() requires na.action = "na.fail"
aic_table <- MuMIn::dredge(full_model)
model_averaging_results <- MuMIn::model.avg(aic_table)

As I understand it, loo provides a way to estimate model weights, which are a lot like Akaike weights in terms of their interpretation (i.e. models with a weight near 1 are likely to be the 'best' model in the set) and intended use (i.e. the aim is then to average across models - NOT to simply pick a single top model). Hope that's right?

Assuming I understand correctly, would it make sense to add a convenience function with a similar aim to MuMIn::model.avg? Ideally, the new loo::model_avg function could be used like this (here I assume you again wrote it in a way that allows integration with brms, rstanarm, etc.):

model1 <- brm(y ~ x1 + x2, data = dat)
model2 <- brm(y ~ x1, data = dat)
loo_values <- loo_model_weights(model1, model2)
model_averaging_results <- model_avg(object = list(model1, model2), 
                                     weights = loo_values)
summary(model_averaging_results) # averaged posterior distributions for each parameter 
predict(model_averaging_results) # averaged predicted values, etc etc
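For intuition, the kind of averaging being proposed could be sketched as a mixture over posterior predictive draws: each averaged draw comes from one model, chosen with probability proportional to its weight. The function pp_mixture below is hypothetical (not part of loo or brms), just a minimal sketch assuming each model's predictive draws are available as a draws-by-observations matrix:

```r
# Hypothetical sketch, not a loo API: average posterior predictive draws from
# several models by treating the weighted set of models as a mixture.
pp_mixture <- function(draws_list, weights, ndraws = 1000) {
  # draws_list: list of matrices, one per model, each n_draws x n_observations
  # weights:    numeric vector of model weights (e.g. from loo_model_weights)
  stopifnot(length(draws_list) == length(weights))
  w <- weights / sum(weights)
  # For each averaged draw, pick which model it comes from...
  idx <- sample(seq_along(draws_list), size = ndraws, replace = TRUE, prob = w)
  # ...then take one random draw (row) from that model's draw matrix.
  rows <- lapply(idx, function(k) {
    d <- draws_list[[k]]
    d[sample(nrow(d), 1), , drop = FALSE]
  })
  do.call(rbind, rows)  # ndraws x n_observations matrix of averaged draws
}
```

This is essentially what a predictive-averaging helper would need to do internally; the per-parameter "averaged posterior" in the summary() line above is the harder part, for reasons discussed later in this thread.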

Maybe the model averaging is so easy, or so case-specific, that it doesn't need a separate function? But if that's the case, some worked examples in the vignette would be really handy. As it stands, I am not really sure what to do with the LOO values calculated for my models!

Thanks

@jgabry
Member

jgabry commented May 2, 2018

Glad you like the package, and thanks especially for recognizing the documentation. We spent a lot of time on that!

As to model averaging, I think it’s a good idea to have that convenience function. brms has posterior_average, which we’ll be putting in rstantools and rstanarm soon. So indeed, we’re thinking along the same lines. Thanks for the suggestion!

@yao-yl
Collaborator

yao-yl commented May 2, 2018

loo provides a way to estimate model weights, which are a lot like Akaike weights in terms of their interpretation (i.e. models with a weight near 1 are likely to be the 'best' model in the set) and intended use (i.e. the aim is then to average across models - NOT to simply pick a single top model). Hope that's right?

Yes, this is indeed the advantage of loo stacking.

Currently we do not have a separate summary function like the one you describe. One reason is that we allow each model to have a different parameter space.

@avehtari
Collaborator

avehtari commented May 2, 2018

@lukeholman the specific examples you show do not usually need model averaging in this way. What you seem to want is better handled by a suitable prior (like the horseshoe) and sampling. See first the AIC solution
https://atyre2.github.io/2017/06/16/rebutting_cade.html
and then the Bayesian solution
https://rawgit.com/avehtari/modelselection_tutorial/master/collinear.html

Model averaging using stacking or LOO weights is for cases where it is not easy to have a continuous version of the model space and we assume that none of the models is well specified.

Let's continue the discussion with @lukeholman on the Stan forum, and we'll get back here when it's clearer what is needed.

@jgabry
Member

jgabry commented May 2, 2018

Actually, instead of my previous suggestion of posterior_average(), I think I prefer predictive_average(), as it emphasizes that it's the posterior predictive distribution that is averaged, not the posterior draws. @paul-buerkner?

@paul-buerkner
Contributor

In brms, posterior_average averages posterior distributions (i.e. takes samples from the posterior of each model based on the model weights), while pp_average averages posterior-predictive distributions.
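For readers wanting to try this, the two brms helpers mentioned above can be called roughly as follows. This is a sketch: `dat`, `y`, `x1`, and `x2` are placeholder names, and fitting requires Stan compilation, so check ?pp_average and ?posterior_average in brms for the authoritative signatures:

```r
library(brms)

# Fit two candidate models (placeholder data/formulas).
fit1 <- brm(y ~ x1 + x2, data = dat)
fit2 <- brm(y ~ x1, data = dat)

# Average posterior-predictive distributions, with weights computed via LOO:
pp <- pp_average(fit1, fit2, weights = "loo")

# Average posterior distributions of parameters shared by both models:
post <- posterior_average(fit1, fit2, weights = "loo")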

@avehtari
Collaborator

avehtari commented May 2, 2018

posterior_average averages posterior distributions (i.e. takes samples from the posterior of each model based on the model weights),

What is the use case for this? Different models have different parameter spaces or nonlinear mappings to an interpretable scale, and thus averaging of posterior distributions rarely makes sense (I can come up with some strangely constrained cases, but I would not advertise them).

@paul-buerkner
Contributor

I agree. It basically mirrors the functionality of other (frequentist) packages doing model averaging. Not sure it was a good idea to implement this in the first place...

@lukeholman
Author

Hi all,

Thanks very much for the help!

I spent yesterday doing a bunch more reading, and I think that either horseshoe priors or the pp_average() and posterior_average() functions in brms cover my needs just fine; I'm not sure a new loo function is needed. My only suggestion is to add a quick worked example to the end of the loo vignette, illustrating that the computed weights can easily be put to useful ends (in brms, and presumably also in rstanarm, etc.). For other people finding this later, I found the following posts especially informative:

http://mc-stan.org/projpred/articles/quickstart.html
https://drewtyre.rbind.io/post/rebutting_cade/
https://rawgit.com/avehtari/modelselection_tutorial/master/collinear.html

Cheers

@jgabry
Member

jgabry commented May 3, 2018

Thanks @lukeholman. I just opened #83 to help address adding examples and automating the process in brms and rstanarm.
