Disambiguate terms #98

ddsjoberg · 2021-02-16T18:07:54Z

Sorry to re-visit this topic so soon. I think I didn't communicate very well last time. If possible, I think the variable name should uniquely identify the terms. To do this, we would need to combine both the "term" and the "group" columns. Below are two suggestions. What do you think?

library(broom.helpers)
packageVersion("broom.helpers")
#> [1] '1.1.0.9002'

lme4::lmer(marker ~ (1|grade) + (1|response) + (death|response), gtsummary::trial) %>%
  tidy_plus_plus() %>%
  dplyr::select(term, variable, group, var_type) %>%
  dplyr::mutate(
    variable_suggested1 = paste(term, group, sep = "."),
    variable_suggested2 = glue::glue("{term}.group({group})")
  ) %>%
  knitr::kable()
#> boundary (singular) fit: see ?isSingular
#> Registered S3 method overwritten by 'broom.mixed':
#>   method      from 
#>   tidy.gamlss broom

| term | variable | group | var_type | variable_suggested1 | variable_suggested2 |
|---|---|---|---|---|---|
| sd__(Intercept) | sd__(Intercept) | grade | ran_pars | sd__(Intercept).grade | sd__(Intercept).group(grade) |
| sd__(Intercept) | sd__(Intercept) | response | ran_pars | sd__(Intercept).response | sd__(Intercept).group(response) |
| sd__(Intercept) | sd__(Intercept) | response.1 | ran_pars | sd__(Intercept).response.1 | sd__(Intercept).group(response.1) |
| cor__(Intercept).death | cor__(Intercept).death | response.1 | ran_pars | cor__(Intercept).death.response.1 | cor__(Intercept).death.group(response.1) |
| sd__death | sd__death | response.1 | ran_pars | sd__death.response.1 | sd__death.group(response.1) |
| sd__Observation | sd__Observation | Residual | ran_pars | sd__Observation.Residual | sd__Observation.group(Residual) |

Created on 2021-02-16 by the reprex package (v1.0.0)

larmarange · 2021-02-16T18:45:41Z

Dear @ddsjoberg

I guess you are referring to the discussion in #90 regarding term disambiguation. Your suggestion to change "variable" but not "term" seems strange to me.

The thing is that with broom.mixed, terms are not always unique. This is why the authors of broom.mixed suggest in a vignette to disambiguate terms, cf. https://cran.r-project.org/web/packages/broom.mixed/vignettes/broom_mixed_intro.html

For me, disambiguation should be done directly in the term column, to restore the uniqueness of that column. This is why I was suggesting a new function tidy_disambiguate_terms() transforming "sd__(Intercept)" into "grade.sd__(Intercept)" (with an option to choose the separator, and with the creation of a new column "original_term" to identify when disambiguation was performed).
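To make the proposal concrete, here is a minimal base R sketch of what such a function could do. The helper name disambiguate_terms_sketch() is hypothetical and this is not the broom.helpers implementation; it only assumes a data frame with "term" and "group" columns, using the values from the reprex above.

```r
# Hypothetical helper for illustration only, not the broom.helpers code.
# Prefixes each term with its group and keeps the original term.
disambiguate_terms_sketch <- function(x, sep = ".") {
  # Nothing to disambiguate when there is no group column.
  if (!"group" %in% names(x)) {
    return(x)
  }
  # Keep the untouched terms so disambiguated rows can be identified later.
  x$original_term <- x$term
  # Only rows with a non-missing, non-empty group are modified.
  has_group <- !is.na(x$group) & x$group != ""
  x$term[has_group] <- paste(x$group[has_group], x$term[has_group], sep = sep)
  x
}

# term and group values copied from the reprex output above
res <- data.frame(
  term = c(
    "sd__(Intercept)", "sd__(Intercept)", "sd__(Intercept)",
    "cor__(Intercept).death", "sd__death", "sd__Observation"
  ),
  group = c(
    "grade", "response", "response.1",
    "response.1", "response.1", "Residual"
  ),
  stringsAsFactors = FALSE
)

out <- disambiguate_terms_sketch(res)
out$term
anyDuplicated(out$term) # 0 means every term is now unique
```

The separator is an argument so that callers can pick something other than ".", and "original_term" is left in place so downstream code can still group on the raw term.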

At the broom.helpers level, we do not know how the tibble will be used: whether results will be grouped by "variable_label" or simply listed by "term".

In tidy_plus_plus() the disambiguation would of course be optional.

Such disambiguation should not be a problem for gtsummary::tbl_regression().

larmarange · 2021-02-16T18:49:22Z

In broom.mixed vignette:

the categorical or continuous predictor variables that control the expected value (i.e., enter into the linear predictor for some part of the model) are called terms (term column in tidy() output); note that unlike in base broom, the term column may have duplicated values, because the same term may enter multiple model components (e.g. zero-inflated and conditional models; models for more than one response; fixed effects and random effects)

(...)

tidy(fitted_model) %>% tidyr::unite(term, group, term) will create a new term column that’s the combination of the group and term columns (which will disambiguate random-effect terms from different grouping variables); unite(term, component, term) will disambiguate conditional and zero-inflation parameters. The code below shows a slightly more complicated (but prettier) approach. (Some sort of disambiguate_terms() function could be added in a future version of the package …)
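For reference, the unite() approach quoted from the vignette can be tried on a small hand-built data frame, without fitting a model. This is only a sketch: the rows reuse term and group values from the reprex above, and sep = "." is chosen here to match the separator discussed in this thread (unite() defaults to "_").

```r
# Rows reuse term/group values from the reprex above; no model is fitted here.
res <- data.frame(
  term = c("sd__(Intercept)", "sd__(Intercept)", "sd__death"),
  group = c("grade", "response", "response.1"),
  stringsAsFactors = FALSE
)

# Combine group and term into a new term column.
# With the default remove = TRUE, the input group and term columns are dropped.
united <- tidyr::unite(res, term, group, term, sep = ".")
united$term
```

Note that this drops the group column, whereas the proposal in this thread keeps it and adds an "original_term" column instead.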

ddsjoberg · 2021-02-16T18:57:16Z

Gotcha! Yes, that is the original issue I was referring to!

I had read the word disambiguate differently than the authors of the vignette intended. I read it as separating a single column into multiple columns rather than the opposite! From there, I wasn't reading the rest of our convo correctly....sorry about that! 👀

I was suggesting to change variable rather than the term initially because I was thinking the term and group columns should remain exactly as they come from the broom.mixed tidier function. But I think disambiguating the term column makes a lot of sense, and I am for it!

larmarange · 2021-02-16T19:23:28Z

OK, I will prepare a PR tomorrow or the day after with a new function tidy_disambiguate_terms().

larmarange changed the title ~~Update variable name for ran_pars ?~~ Disambiguate terms Feb 17, 2021

larmarange closed this as completed in 51840f4 Feb 17, 2021

larmarange mentioned this issue Nov 7, 2023

Feature request: Re-order rows for merged mixed models tables that include random effects component ddsjoberg/gtsummary#1567

Open
