get_predicted(): features and caveats #315
Do you think this function should ultimately be assigned to its own package? I would also recommend having an alias for this function. This would mean the ecosystem would have a consistent function-naming schema, and also mirror functions:

broom::tidy <-> parameters::model_parameters
broom::glance <-> performance::model_performance
broom::augment <-> prediction::model_prediction

What do you think? cc @strengejacke, @mattansb |
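To make the aliasing idea concrete, here is a minimal sketch of what such a mirror function could look like. The name `model_prediction()` is purely illustrative (it is not an actual exported function in any of these packages):

```r
# Hypothetical alias mirroring the broom-style naming scheme.
# `model_prediction()` is an illustrative name, not a real API.
model_prediction <- function(model, ...) {
  insight::get_predicted(model, ...)
}

m <- lm(mpg ~ hp, data = mtcars)
preds <- model_prediction(m)
head(as.numeric(preds))
```

This keeps the heavy lifting in `insight::get_predicted()` while exposing a name consistent with `model_parameters()` / `model_performance()`.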
Well, we have the modelbased package. I think @DominiqueMakowski just wanted to put the workhorse into insight, and then the "visible" user function will be in modelbased. |
I see, gotcha! What do you think about calling this function |
Yes, correct. The idea was to have the heavy lifting done at the insight level, mostly because it is useful elsewhere (for performance indices, for instance). Then, modelbased will be a user-facing package focusing mostly on visualization of models (including visualization of links, contrasts, marginal means, etc.). I plan to add nice vignettes to modelbased discussing a model-based approach to stats, how to make inference and derive indices from models, etc. The functions are currently named |
But we would still use both emmeans and predict, right? |
yes yes for |
Not as option for estimate_response? See https://strengejacke.github.io/ggeffects/articles/technical_differencepredictemmeans.html |
mmh we'll see, but I feel like for modelbased we could really merge your and indra's experience with model visualization to make something really neat and powerful. |
Support for other models; here I list some special cases:
|
"clmm" is only supported by emmeans; it has no predict() method. |
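For illustration, here is roughly how predictions for a `clmm` model would go through emmeans instead of `predict()` (a sketch using the standard `wine` example from the ordinal package; not the code path insight actually uses):

```r
# Sketch: ordinal::clmm models have no usable predict() method,
# so estimated means must come from emmeans instead.
library(ordinal)
library(emmeans)

m <- clmm(rating ~ temp + contact + (1 | judge), data = wine)

# Estimated marginal means on the linear-predictor (link) scale:
em <- emmeans(m, ~ temp)
em
```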
I've added prediction intervals for binomial and Poisson models:

```r
library(insight)
library(magrittr)

glm(am ~ cyl + mpg + hp, data = mtcars,
    family = binomial()) %>%
  get_predicted(predict = "prediction") %>%
  as.data.frame() %>% head() %>% zapsmall()
#>                   Predicted CI_low CI_high
#> Mazda RX4         0.1762632      0       1
#> Mazda RX4 Wag     0.1762632      0       1
#> Datsun 710        0.6499998      0       1
#> Hornet 4 Drive    0.2489592      0       1
#> Hornet Sportabout 0.2268359      0       1
#> Valiant           0.0064952      0       0

glm(gear ~ ., data = mtcars,
    family = poisson()) %>%
  get_predicted(predict = "prediction") %>%
  as.data.frame() %>% head() %>% zapsmall()
#>                   Predicted CI_low CI_high
#> Mazda RX4          4.266709      1       9
#> Mazda RX4 Wag      4.192287      1       9
#> Datsun 710         4.031290      1       8
#> Hornet 4 Drive     3.043687      0       7
#> Hornet Sportabout  2.967043      0       7
#> Valiant            2.890863      0       7
```

Created on 2021-03-10 by the reprex package (v1.0.0)

The Gaussian PIs still work as they used to (: As I noted somewhere prior, each family has its own PI method, so adding each of these analytical solutions is somewhat tedious... It might be easier to simulate population values and get the 95% ETI from there instead. Also, these columns should really be |
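The simulation idea could look roughly like this: draw outcome values from the fitted distribution at each predicted mean, then take the 2.5% / 97.5% quantiles as the ETI. This is only a sketch of the approach, not insight's actual implementation (and note it captures only sampling variation, not parameter uncertainty):

```r
# Sketch: simulation-based 95% prediction interval for a Poisson GLM.
set.seed(123)
m  <- glm(gear ~ mpg + hp, data = mtcars, family = poisson())
mu <- predict(m, type = "response")  # fitted rates, one per observation

n_sim <- 10000
# Simulate n_sim Poisson draws per observation: n_sim x n_obs matrix
sims <- sapply(mu, function(lambda) rpois(n_sim, lambda))

# 95% equal-tailed interval per observation
pi_bounds <- t(apply(sims, 2, quantile, probs = c(0.025, 0.975)))
head(data.frame(Predicted = mu,
                CI_low  = pi_bounds[, 1],
                CI_high = pi_bounds[, 2]))
```

The appeal is that the same recipe works for any family with a random-number generator, avoiding a bespoke analytical solution per family.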
@mattansb I hope you found the code clear and logical :) One thing to make sure of: following your comments, currently the PIs for Bayesian models are obtained via their bespoke function:

Line 115 in da33355

instead of manually computing them from the draws. But does their function do anything other than that? Or is it unnecessarily re-sampling draws from the posterior? (Note that the iterations were obtained via posterior_predict.) |
Nope - that's exactly what it does: |
thanks for addressing my source-code-checking laziness 😁 - I'll drop that exception then |
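Dropping the exception means computing the PIs directly from the draws, which is just column-wise quantiles on the posterior-predictive matrix. A minimal sketch (the `draws` matrix is simulated here for illustration; in practice it would come from `posterior_predict(model)`, with rows = iterations and columns = observations):

```r
# Sketch: manual 95% PI from a posterior-predictive draws matrix.
set.seed(1)
draws <- matrix(rnorm(4000 * 10, mean = 5), nrow = 4000, ncol = 10)

# One [CI_low, CI_high] pair per observation (column)
pi_manual <- t(apply(draws, 2, quantile, probs = c(0.025, 0.975)))
colnames(pi_manual) <- c("CI_low", "CI_high")
head(pi_manual)
```

Since the bespoke helpers do nothing beyond this, the manual computation avoids any risk of re-sampling from the posterior a second time.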
The code I wrote should also work with GLM GAMs. Is there any reason the same won't apply for gaussian models? |
@DominiqueMakowski Currently, some methods like |
See 4aa8e44, or else insight/tests/testthat/test-get_predicted.R (line 150 in 4aa8e44).
|
Note that by default, glmmTMB uses Wald CIs on fixed effects / log variance components, whereas lme4 uses profile-likelihood CIs on fixed effects / variance components.
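The difference shows up when you request each method explicitly via `confint()`. A sketch on the standard `sleepstudy` data (assumes lme4 and glmmTMB are installed):

```r
library(lme4)
library(glmmTMB)

m1 <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)
m2 <- glmmTMB(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# lme4 defaults to profile-likelihood CIs (includes variance components);
# Wald CIs must be requested explicitly and cover fixed effects only.
ci_profile <- confint(m1)                  # profile (lme4 default)
ci_wald    <- confint(m1, method = "Wald") # Wald; variance rows are NA

# glmmTMB defaults to Wald CIs on fixed effects / log-SD components.
ci_tmb <- confint(m2)
```

So identical models fit with the two packages can report noticeably different interval widths purely because of the default CI method.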
Here's a list of all features / issues etc., to have a global view and avoid opening multiple issues:

- `data` argument, when holding constant non-focal covariates (get_predicted_ci: better catch "data" argument #316)
- ~~Prediction~~ Confidence intervals for GAMs <edit by Mattan: these are CIs, not PIs>
- `predict = "observation"`, transformed output to match shape of observations (i.e., zeros and ones for logistic)
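For the `predict = "observation"` idea (output shaped like the raw data), one hypothetical illustration for a logistic model is to draw 0/1 outcomes from the predicted probabilities. This is only a sketch of the concept, not the implemented behavior:

```r
# Sketch: transform logistic predictions to the 0/1 shape of the response.
set.seed(42)
m <- glm(am ~ mpg + hp, data = mtcars, family = binomial())
p <- predict(m, type = "response")  # probabilities in [0, 1]

# Draw Bernoulli outcomes so each prediction is a 0 or 1, like `am` itself:
obs_scale <- rbinom(length(p), size = 1, prob = p)
table(observed = mtcars$am, simulated = obs_scale)
```

A deterministic alternative would be thresholding (`as.integer(p > 0.5)`); the stochastic version above preserves the predicted probabilities on average.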