Plot method for data_tabulate #293

Merged
bwiernik merged 15 commits into main from plot_data_tabulate on Aug 31, 2023

etiennebacher
Member

Close #289

Minimal working version so far (no docs, etc.). It is such a short step to convert the table into a plot that I think we should restrict the number of args to a minimum here. Otherwise it will be quicker for users to make a fancy plot themselves than to go through all possible args (e.g., the docs of sjPlot::plot_frq() are a bit overwhelming 😅). I have two "important" args in mind for now:

  • keep or remove NA? if we keep them we show raw counts and percentage, otherwise only the valid ones
  • show or hide counts and percentage on top of bars?
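
To illustrate the first option, here is a minimal base-R sketch of how the two sets of numbers differ. The vector `x` is made-up example data and the variable names are hypothetical; this is not the code used in the PR.

```r
# Hypothetical example vector with two missing values
x <- c(4, 6, 8, 4, NA, 8, 8, NA)

# Keep NA: the missing category is counted, and raw percentages
# are computed over all observations
counts_all <- table(x, useNA = "ifany")
raw_pct <- 100 * counts_all / length(x)

# Remove NA: only valid values are counted, and valid percentages
# are computed over the non-missing observations
counts_valid <- table(x)
valid_pct <- 100 * counts_valid / sum(counts_valid)

print(round(raw_pct, 1))
print(round(valid_pct, 1))
```

With missing values kept, the bars would sum to 100% including the NA category; with missing values removed, the remaining categories are rescaled to sum to 100% on their own.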

@jmgirard what do you think?

library(see)
library(datawizard)
library(patchwork)

x <- data_tabulate(mtcars, select = "cyl")
plot(x)
#> [[1]]

x <- data_tabulate(mtcars, select = c("cyl", "gear"))
my_plots <- plot(x)
wrap_plots(my_plots)

@bwiernik
Contributor

A few comments:

  1. we should include error bars by default
  2. I think we should apply theme_modern() as the default theme for all plots in see
  3. We should relabel "NA" to "(Missing)" and have an argument for the label to use, in addition to one for whether or not to include it.

@etiennebacher
Member Author

library(see)
library(datawizard)
library(patchwork)

x <- data_tabulate(mtcars, select = "cyl")
plot(x)
#> [[1]]

plot(x, remove_na = TRUE)
#> [[1]]

plot(x, na_label = "Missing values")
#> [[1]]

@bwiernik
Contributor

Looking good! Error bars?

@etiennebacher
Member Author

@bwiernik did you have something like this in mind? Default is to add error bars (I just took the code from sjPlot::plot_frq())

library(see)
library(datawizard)
library(patchwork)

x <- data_tabulate(mtcars, select = "cyl")

plot(x, remove_na = TRUE)
#> [[1]]

plot(x, remove_na = TRUE, error_bar = FALSE)
#> [[1]]

Created on 2023-06-02 with reprex v2.0.2

@etiennebacher etiennebacher marked this pull request as ready for review August 25, 2023 06:59
@etiennebacher
Member Author

The remaining test failures are not caused by this PR; they also occur on main.

@strengejacke
Member

I think this can be merged?

@bwiernik
Contributor

I'm making some tweaks

1. change `remove_na` to `show_na` and set default to conditionally show the missing category if any NA are present
2. use the Wilson confidence interval rather than the normal approximation to avoid CIs that escape the [0, 1] bounds
3. namespace ggplot2 functions
4. add arguments for the column and error bar colors, defaulting to blue hues for each
@bwiernik
Contributor

okay, I made those tweaks. The major one is to use the Wilson CI for proportions rather than the normal approximation -- this avoids getting CI bounds that escape [0, 1] or intervals collapsing to nothing at p = 0 or 1
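
As a rough illustration of the difference, the sketch below compares the two intervals in base R. The helper names `wald_ci()` and `wilson_ci()` are hypothetical and written only for this example; the code merged into see may be implemented differently.

```r
# Normal-approximation (Wald) interval for a proportion x / n
wald_ci <- function(x, n, conf = 0.95) {
  z <- stats::qnorm(1 - (1 - conf) / 2)
  p <- x / n
  se <- sqrt(p * (1 - p) / n)
  c(lower = p - z * se, upper = p + z * se)
}

# Wilson score interval for a proportion x / n
wilson_ci <- function(x, n, conf = 0.95) {
  z <- stats::qnorm(1 - (1 - conf) / 2)
  p <- x / n
  denom <- 1 + z^2 / n
  centre <- (p + z^2 / (2 * n)) / denom
  half <- z * sqrt(p * (1 - p) / n + z^2 / (4 * n^2)) / denom
  c(lower = centre - half, upper = centre + half)
}

# Rare category: the Wald lower bound drops below 0, Wilson stays in [0, 1]
print(wald_ci(1, 32))
print(wilson_ci(1, 32))

# Empty category: the Wald interval collapses to zero width, Wilson does not
print(wald_ci(0, 32))
print(wilson_ci(0, 32))
```

For a quick cross-check, `stats::prop.test(x, n, correct = FALSE)$conf.int` returns the same Wilson score interval.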

@bwiernik
Contributor

this is ready to merge assuming checks still pass

@bwiernik
Contributor

@IndrajeetPatil do you know why the R-CMD-check-strict on ubuntu is taking forever to load all of the dependencies?

@bwiernik bwiernik merged commit cd304c4 into main Aug 31, 2023
17 of 26 checks passed
@bwiernik bwiernik deleted the plot_data_tabulate branch August 31, 2023 15:23
@strengejacke
Member

The automated checks for the see package are a pain, because there are a lot of minor issues, which yield an error (on purpose), but which are difficult to iron out in order to have a "clean basis" for tests.

@IndrajeetPatil
Member

> @IndrajeetPatil do you know why the R-CMD-check-strict on ubuntu is taking forever to load all of the dependencies?

Possibly due to this issue: r-lib/vdiffr#137

@bwiernik
Contributor

bwiernik commented Sep 5, 2023

That looks like it

Development

Successfully merging this pull request may close these issues.

Plot method for data_tabulate()?
4 participants