Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ggcoef() redesign #392

Merged
merged 40 commits into from
Dec 2, 2020
Merged

ggcoef() redesign #392

merged 40 commits into from
Dec 2, 2020

Conversation

larmarange
Copy link
Contributor

@larmarange larmarange commented Oct 18, 2020

Following #372 this PR proposes a complete redesign of ggcoef() based on broom.helpers package (also now used by gtsummary::tbl_regression()).

The PR includes 3 main functions: ggcoef_model(), ggcoef_compare() and ggcoef_multinom(). The original ggcoef() remains unchanged.

The PR also includes two new geoms: geom_stripped_rows() and geom_stripped_cols().

Pre-requisite before merging this PR:

DESCRIPTION Show resolved Hide resolved
@codecov-io
Copy link

codecov-io commented Oct 19, 2020

Codecov Report

Merging #392 (41c4e90) into master (f289584) will increase coverage by 0.20%.
The diff coverage is 94.98%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #392      +/-   ##
==========================================
+ Coverage   92.05%   92.25%   +0.20%     
==========================================
  Files          34       36       +2     
  Lines        4054     4353     +299     
==========================================
+ Hits         3732     4016     +284     
- Misses        322      337      +15     
Impacted Files Coverage Δ
R/gg-plots.R 82.88% <ø> (ø)
R/ggcoef.R 82.97% <ø> (ø)
R/gglyph.R 100.00% <ø> (ø)
R/ggnet.R 99.64% <ø> (ø)
R/ggnet2.R 99.03% <ø> (ø)
R/ggnetworkmap.R 86.53% <ø> (ø)
R/ggnostic.R 96.48% <ø> (ø)
R/ggpairs_add.R 94.40% <ø> (ø)
R/stat_cross.R 98.70% <ø> (ø)
R/stat_prop.R 68.65% <ø> (ø)
... and 5 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f289584...af9b1e8. Read the comment docs.

@ddsjoberg
Copy link

Hello @larmarange ! I was playing around with ggcoef_model() and ggcoef_plot() this morning, and they look fantastic. I wanted to share a couple of thoughts I had while using them.

  1. Perhaps a better default for ggcoef_model(significance = 0.05) would be ggcoef_model(significance = 1 - conf.level)?

  2. ggcoef_plot() does not have the same significance labelling ability as ggcoef_model(significance=, significance_labels=, show_p_values=, signif_stars=). Is there a reason they were omitted? Of course, the significance level/confidence level cannot be inferred from a table, but if a user manually supplied the level along with the tidy_plus_plus() tibble I don't think there would be an issue.

  3. For categorical variables, I love the default labelling. Is there a way, however, to avoid the duplicate variable labels for the continuous variables in the models?

    mod <- lm(ttdeath ~ marker + trt, gtsummary::trial)
    ggcoef_model(mod)

    image

@larmarange
Copy link
Contributor Author

Hello @larmarange ! I was playing around with ggcoef_model() and ggcoef_plot() this morning, and they look fantastic. I wanted to share a couple of thoughts I had while using them.

Thanks a lot for your feedback

  1. Perhaps a better default for ggcoef_model(significance = 0.05) would be ggcoef_model(significance = 1 - conf.level)?

Good idea. Implemented through f5743c8

  1. ggcoef_plot() does not have the same significance labelling ability as ggcoef_model(significance=, significance_labels=, show_p_values=, signif_stars=). Is there a reason they were omitted? Of course, the significance level/confidence level cannot be inferred from a table, but if a user manually supplied the level along with the tidy_plus_plus() tibble I don't think there would be an issue.

This is a choice by design. ggcoef_model() only have graphic options for advanced users. If a user want to create is own "significance" variable, this variable could be mapped using the shape argument of ggcoef_model().

library(GGally)
#> Le chargement a nécessité le package : ggplot2
#> Registered S3 method overwritten by 'GGally':
#>   method from   
#>   +.gg   ggplot2
library(broom.helpers)
mod <- lm(Sepal.Length ~ ., iris)
res <- mod %>% 
  tidy_plus_plus() %>%
  dplyr::mutate(stars = signif_stars(p.value))
ggcoef_plot(res, shape = "stars", shape_values = 16:19)

As you can see, any custom variable could be mapped to shape.

  1. For categorical variables, I love the default labelling. Is there a way, however, to avoid the duplicate variable labels for the continuous variables in the models?
    mod <- lm(ttdeath ~ marker + trt, gtsummary::trial)
    ggcoef_model(mod)

This is a good question. One label is needed for facets (the bold ones are in fact labels) and one is required for y-axis.

I guess I found a solution implemented in 732f5fd

library(GGally)
#> Le chargement a nécessité le package : ggplot2
#> Registered S3 method overwritten by 'GGally':
#>   method from   
#>   +.gg   ggplot2
library(broom.helpers)
mod <- lm(Sepal.Length ~ ., iris)
ggcoef_model(mod)

ggcoef_model(mod, group_one_row_variables = TRUE)

mod1 <- lm(Fertility ~ ., data = swiss)
mod2 <- step(mod1, trace = 0)
mod3 <- lm(Fertility ~ Agriculture + Education * Catholic, data = swiss)
models <- list("Full model" = mod1, "Simplified model" = mod2, "With interaction" = mod3)
ggcoef_compare(models)

ggcoef_compare(models, group_one_row_variables = TRUE)

Created on 2020-10-30 by the reprex package (v0.3.0)

@ddsjoberg
Copy link

This is all great! I am thinking I can pass tbl_regression()$table_body %>% ggcoef_plot() to get a great visual of the tabulated model. 🕺

@larmarange
Copy link
Contributor Author

@ddsjoberg If you want to pass tbl_regression()$table_body, don't forget to remove header rows who are not expected by ggcoef_plot()

@larmarange
Copy link
Contributor Author

larmarange commented Nov 1, 2020 via email

@larmarange
Copy link
Contributor Author

@ddsjoberg I have implemented an alternative to group_one_row_variables.

The functions compute a label_light where variable label is removed when identical to term label. It is still possible to map the full label if needed.

image

@larmarange
Copy link
Contributor Author

library(GGally)
#> Le chargement a nécessité le package : ggplot2
#> Registered S3 method overwritten by 'GGally':
#> method from
#> +.gg ggplot2
library(broom.helpers)
mod <- lm(Sepal.Length ~ ., iris)
ggcoef_model(mod)
image

ggcoef_model(mod, y = "label")
image

ggcoef_model(mod, facet_row = NULL)
image

mod1 <- lm(Fertility ~ ., data = swiss)
mod2 <- step(mod1, trace = 0)
mod3 <- lm(Fertility ~ Agriculture + Education * Catholic, data = swiss)
models <- list("Full model" = mod1, "Simplified model" = mod2, "With interaction" = mod3)
ggcoef_compare(models)
image

Created on 2020-11-01 by the reprex package (v0.3.0)

@ddsjoberg
Copy link

@larmarange I think these plots look fantastic!

@ddsjoberg
Copy link

@larmarange one more question. Is it possible to get this kind of output from ggcoef_plot() with the proper striping?

library(GGally)

lm(Sepal.Length ~ ., iris) %>%
  broom.helpers::tidy_plus_plus() %>%
  dplyr::mutate(label = ifelse(var_type == "continuous", " ", label)) %>%
  ggcoef_plot()

Created on 2020-10-31 by the reprex package (v0.3.0)

@larmarange
Copy link
Contributor Author

@larmarange one more question. Is it possible to get this kind of output from ggcoef_plot() with the proper striping?

library(GGally)

lm(Sepal.Length ~ ., iris) %>%
  broom.helpers::tidy_plus_plus() %>%
  dplyr::mutate(label = ifelse(var_type == "continuous", " ", label)) %>%
  ggcoef_plot()

The problem here is that due to the relabel, all empty y values have the same value (in terms of factor) and therefore have the same strip.

@larmarange
Copy link
Contributor Author

I have added a little trick using term column for computing strips colour.

library(GGally)
#> Le chargement a nécessité le package : ggplot2
#> Registered S3 method overwritten by 'GGally':
#>   method from   
#>   +.gg   ggplot2

lm(Sepal.Length ~ ., iris) %>%
  broom.helpers::tidy_plus_plus() %>%
  dplyr::mutate(label = ifelse(var_type == "continuous", "", label)) %>%
  ggcoef_plot()

Note: your assumption dplyr::mutate(label = ifelse(var_type == "continuous", " ", label) is not always correct for continuous variables using stats::poly (in that case you ave several terms for one continuous variable).

@ddsjoberg
Copy link

These figures are amazing! Thank you for your work on them!

@larmarange
Copy link
Contributor Author

broom.helpers version 1.1.0 has just been released on CRAN. We have to wait for MacOS and Windows binaries to re-run all tests and then the PR should be finally ready. :-)

@larmarange
Copy link
Contributor Author

Hi @schloerke

broom.helpers v1.1.0 is now on CRAN and windows binaries have been generated.

This PR is now ready for being merged.

Only one check failed (ubuntu 16.04 oldrel) but it seems unrelated to the PR. It fails with the installation of 'r-hub/sysreqs'.

Best regards

@schloerke
Copy link
Member

@larmarange Will look through this later tonight. Thank you!

@larmarange
Copy link
Contributor Author

Thanks to you @schloerke

Copy link
Member

@schloerke schloerke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor changes.

Great code! Thank you for awesome formatting. It was very easy to read! Your ggplot2 integrations are getting / have been beyond my working knowledge. 😳

The vignette looks great! Thank you!

R/ggcoef_model.R Outdated Show resolved Hide resolved
DESCRIPTION Outdated Show resolved Hide resolved
R/ggcoef_model.R Outdated Show resolved Hide resolved
tests/testthat/test-ggcoef_model.R Outdated Show resolved Hide resolved
R/geom_stripped_rows.R Show resolved Hide resolved
R/ggcoef_model.R Outdated Show resolved Hide resolved
R/ggcoef_model.R Outdated Show resolved Hide resolved
R/ggcoef_model.R Outdated Show resolved Hide resolved
R/ggcoef_model.R Outdated Show resolved Hide resolved
vignettes/ggcoef_model.Rmd Show resolved Hide resolved
R/ggcoef_model.R Outdated Show resolved Hide resolved
larmarange and others added 3 commits December 2, 2020 09:04
Co-authored-by: Barret Schloerke <barret@rstudio.com>
@larmarange
Copy link
Contributor Author

@schloerke

Thank you very much for your careful review. I have merged your proposed corrections.

See my answer regarding xfom and xto.

Best regards

@schloerke schloerke merged commit 9de5508 into ggobi:master Dec 2, 2020
larmarange added a commit to larmarange/ggally that referenced this pull request Dec 11, 2020
schloerke pushed a commit that referenced this pull request Dec 11, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants