get_predicted(): features and caveats #315

DominiqueMakowski · 2021-03-08T02:03:57Z

Here's a list of all features / issues etc., to have a global view and avoid opening multiple issues.


IndrajeetPatil · 2021-03-08T13:00:22Z

Do you think this function should ultimately be assigned to its own package?

A package like prediction, predicted, etc. (not sure which names are available on CRAN).

I would also recommend having an alias for this function called model_predictions.

This would give the ecosystem a consistent function naming scheme that also mirrors the broom functions:

broom::tidy     <->  parameters::model_parameters
broom::glance   <->  performance::model_performance
broom::augment  <->  prediction::model_prediction

What do you think? cc @strengejacke, @mattansb

strengejacke · 2021-03-08T13:02:20Z

Well, we have the modelbased package. I think @DominiqueMakowski just wanted to put the workhorse into insight, and then the "visible" user function will be in modelbased.

IndrajeetPatil · 2021-03-08T13:03:51Z

the "visible" user function will be in modelbased

I see, gotcha!

What do you think about calling this function model_prediction?

DominiqueMakowski · 2021-03-08T13:22:25Z

Yes, correct. The idea was to have the heavy lifting done at the insight level, mostly because it is useful elsewhere (for performance indices, for instance). Then modelbased will be a user-facing package focusing mostly on visualization of models (including visualization of links, contrasts, marginal means, etc.). I plan to add nice vignettes to modelbased discussing a model-based approach to stats, how to make inferences and derive indices from models, etc.

The functions are currently named estimate_link and estimate_response, but given the new API of get_predicted I might change them to something like estimate_relationship() and estimate_response().

strengejacke · 2021-03-08T13:27:27Z

But we would still use both emmeans and predict, right?

DominiqueMakowski · 2021-03-08T13:30:32Z

yes yes, for estimate_means and estimate_contrasts, as the internals of those are definitely beyond us

strengejacke · 2021-03-08T13:35:37Z

Not as option for estimate_response? See https://strengejacke.github.io/ggeffects/articles/technical_differencepredictemmeans.html

DominiqueMakowski · 2021-03-08T13:51:02Z

mmh we'll see, but I feel like for modelbased we could really merge your and indra's experience with model visualization to make something really neat and powerful.

strengejacke · 2021-03-09T08:07:31Z

Support for other models, here I list some special cases:

logistf: https://github.com/strengejacke/ggeffects/blob/master/R/get_predictions_logistf.R
bamlss: https://github.com/strengejacke/ggeffects/blob/master/R/get_predictions_bamlss.R
clm (and other ordinal / cumulative link models): https://github.com/strengejacke/ggeffects/blob/master/R/get_predictions_clm.R (these are great, you have to deal with the different outcome levels)
lrm, multinom, ols, polr, ...: https://github.com/strengejacke/ggeffects/blob/master/R/get_predictions_lrm.R (different names for "type" argument)
VGAM: https://github.com/strengejacke/ggeffects/blob/master/R/get_predictions_vglm.R (different name for "predict"?)

strengejacke · 2021-03-09T08:07:55Z

"clmm" is only supported by emmeans; it has no predict() method.

mattansb · 2021-03-10T10:09:25Z

I've added prediction intervals for Binomial and Poisson models:

library(insight)
library(magrittr)

glm(am ~ cyl + mpg + hp, data = mtcars,
    family = binomial()) %>% 
  get_predicted(predict = "prediction") %>% 
  as.data.frame() %>% head() %>% zapsmall()
#>                   Predicted CI_low CI_high
#> Mazda RX4         0.1762632      0       1
#> Mazda RX4 Wag     0.1762632      0       1
#> Datsun 710        0.6499998      0       1
#> Hornet 4 Drive    0.2489592      0       1
#> Hornet Sportabout 0.2268359      0       1
#> Valiant           0.0064952      0       0

glm(gear ~ ., data = mtcars,
    family = poisson()) %>% 
  get_predicted(predict = "prediction") %>% 
  as.data.frame() %>% head() %>% zapsmall()
#>                   Predicted CI_low CI_high
#> Mazda RX4          4.266709      1       9
#> Mazda RX4 Wag      4.192287      1       9
#> Datsun 710         4.031290      1       8
#> Hornet 4 Drive     3.043687      0       7
#> Hornet Sportabout  2.967043      0       7
#> Valiant            2.890863      0       7

Created on 2021-03-10 by the reprex package (v1.0.0)

The Gaussian PIs still work as they used to (:

As I noted somewhere earlier, each family has its own PI method, so adding each of these analytical solutions is somewhat tedious... It might be easier to simulate population values and get the 95% ETI from there instead.

Also, these columns should really be PI_low and PI_high...
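A minimal sketch of the simulation idea, assuming the fitted means are treated as fixed (so uncertainty in the coefficients is ignored). This is not the code used in insight; the model formula, `n_sims`, and the `sim_pi` object are illustrative choices only:

```r
# Sketch: simulation-based prediction interval for a Poisson GLM (base R only).
set.seed(123)

model <- glm(gear ~ mpg + hp, data = mtcars, family = poisson())

# Expected counts on the response scale, one per observation
mu <- predict(model, type = "response")

# Simulate new outcomes around each expected count:
# rows = simulated draws, columns = observations
n_sims <- 2000
sims <- sapply(mu, function(m) rpois(n_sims, lambda = m))

# 95% equal-tailed interval: 2.5% and 97.5% quantiles of each column
bounds <- t(apply(sims, 2, quantile, probs = c(0.025, 0.975)))

sim_pi <- data.frame(
  Predicted = mu,
  PI_low = bounds[, 1],
  PI_high = bounds[, 2]
)
head(sim_pi)
```

Swapping `rpois()` for another family's random generator (for example `rbinom()`) would give the same recipe for other models, which is what makes this route less tedious than one analytical formula per family.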


DominiqueMakowski · 2021-03-10T14:08:53Z

@mattansb I hope you found the code clear and logical :)

One thing to make sure of:

Following your comments, currently the PIs for Bayesian models are obtained via their bespoke function:

insight/R/get_predicted_ci.R

Line 115 in da33355

    
           out <- as.data.frame(rstantools::predictive_interval(x, newdata = data, prob = ci))

instead of manually computing them from the draws. But does their function do anything other than that? Or is it unnecessarily re-sampling draws from the posterior? (Note that the iterations were obtained via posterior_predict.)

mattansb · 2021-03-10T14:31:18Z

instead of manually computing them from the draws. But does their function do anything other than that?

Nope - that's exactly what it does:

https://github.com/stan-dev/rstanarm/blob/dee0a2d45bf42b2df791072041151b753edd6af9/R/predictive_interval.R#L82-L91
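In other words, the interval is just column-wise quantiles of the draws matrix. A minimal base-R sketch of that computation, using simulated draws as a stand-in for posterior_predict() output; `interval_from_draws` is a hypothetical helper, not part of rstantools or insight:

```r
# Sketch: an equal-tailed predictive interval computed directly from draws.
set.seed(42)

# Stand-in for posterior predictive draws: 1000 iterations x 3 observations
draws <- cbind(
  obs1 = rnorm(1000, mean = 0, sd = 1),
  obs2 = rnorm(1000, mean = 5, sd = 2),
  obs3 = rnorm(1000, mean = -2, sd = 0.5)
)

# Hypothetical helper: lower and upper quantiles per observation (column)
interval_from_draws <- function(draws, prob = 0.95) {
  alpha <- (1 - prob) / 2
  probs <- c(alpha, 1 - alpha)
  # One row per observation, one column per bound
  out <- t(apply(draws, 2, quantile, probs = probs))
  colnames(out) <- c("CI_low", "CI_high")
  as.data.frame(out)
}

manual_pi <- interval_from_draws(draws, prob = 0.95)
manual_pi
```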

DominiqueMakowski · 2021-03-10T14:35:00Z

thanks for addressing my source-code-checking laziness 😁 - I'll drop that exception then

mattansb · 2021-03-15T14:39:46Z

@strengejacke @mattansb is it possible to get PIs for GAMs? Can we use the same method as for lm/glms?

The code I wrote should also work with GLM GAMs. Is there any reason the same won't apply for gaussian models?

strengejacke · 2021-07-26T09:39:25Z

@DominiqueMakowski Currently, some methods like get_predicted.glmmTMB() directly check the predict argument, while others like get_predicted.lmerMod() call .get_predicted_args() to check the input. This leads to the situation that we have methods that have predict = c("expectation", "link", "prediction", "response", "relation") in their usage, while others just have predict = "expectation". Can this be harmonized?
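One way this could look, as a rough sketch only: every method keeps the same default in its signature and delegates validation to a single helper. `validate_predict` below is a hypothetical name and is not the existing .get_predicted_args():

```r
# Hypothetical shared validator for the `predict` argument (illustration only)
validate_predict <- function(predict = "expectation") {
  choices <- c("expectation", "link", "prediction", "response", "relation")
  # match.arg() errors on anything that is not one of `choices`
  match.arg(predict, choices = choices)
}

validate_predict()        # default value passes through
validate_predict("link")  # any listed option is accepted

# An unsupported value fails with an informative error
bad <- try(validate_predict("foo"), silent = TRUE)
inherits(bad, "try-error")
```

With this pattern, all methods could show predict = "expectation" in their usage while still accepting the full set of options.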

strengejacke · 2021-07-26T09:39:58Z

See 4aa8e44, else

insight/tests/testthat/test-get_predicted.R

Line 150 in 4aa8e44

rez <- as.data.frame(get_predicted(x, iterations = 5))

fails

bwiernik · 2021-07-26T15:40:02Z

Prediction intervals for glmmTMB: Also, why are confidence intervals so different in respect to lme4 (could be related to issue above)

Note that by default, glmmTMB uses Wald CIs on fixed effects/log variance components, whereas lme4 uses profile likelihood CIs on fixed effects/variance components

DominiqueMakowski added the enhancement 💥 Implemented features can be improved or revised label Mar 8, 2021

IndrajeetPatil mentioned this issue Mar 8, 2021

new vignette to catalog comparisons between broom and easystats ecosystems easystats/easystats#104

Open

4 tasks

DominiqueMakowski mentioned this issue Mar 8, 2021

get_predicted: type, interval, and transform : design decisions #310

Closed

mattansb added a commit that referenced this issue Mar 10, 2021

binom and pois PIs

fe08f6c

#315

strengejacke added the get_predicted Function specific issues label Jan 10, 2022

krassowski mentioned this issue Jul 25, 2022

get_predicted ordinal missing standard errors for MASS models #597

Closed
