Create a function to inject arbitrary summary tables into the codebook #53

Open
mbcann01 opened this issue May 23, 2024 · 2 comments
Labels
enhancement New feature or functionality

Comments

@mbcann01
Member

mbcann01 commented May 23, 2024

Overview

On 2024-05-23, I received the following email:

I wonder if the following is possible. I want to leave missing values out of the summary statistics, so that, for instance, the percentages for categories do not include missing as a level. I also want to add my own number for the missing values. The reason is that I have a pre-post test in long format where a number of variables are missing by design in either the pre or the post test, which sometimes gives the impression of 50 percent missing.

My response was:

Thanks for using codebookr. I think I understand your use case. Unfortunately, it isn't currently possible to omit missing values from the summary stats table for each variable in a straightforward way. It was never my intention for the summary tables to be used as tables of results. They are simply supposed to describe the data. So, in your case, the fact that 50% of the rows have a missing value isn't a problem -- the values are missing by design -- but it's still a true reflection of the state of the data: 50% of the rows do have a missing value. Of course, you may want to ignore the missing values for your analysis, but I didn't approach the codebook as the place to present the results of my analysis. I hope that makes sense.
Having said that, I want to be helpful. I can think of a few possible workarounds that may be helpful to you.

  1. You can fork the repository and make edits to the code on your end. I think you would want to modify R/cb_summary_stats_few_cats.R and R/cb_summary_stats_many_cats.R. You could add some code to filter out missing values before calculating percentages.
  2. All of the tables are built with the flextable and officer packages. I think you can create your own summary tables and then "inject" them into the codebook before printing it to a Word document. I can't give you the exact code for this off the top of my head, but it should be theoretically possible. It may take some experimenting.
  3. Finally, you could manually manipulate the summary tables in Word. Although, I am sure that doesn't sound desirable -- particularly if you have a large number of variables.
I'm sure that isn't the response you were hoping for, but I hope it's somewhat helpful.
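For option 1, the filtering step might look something like the base R sketch below. It is not code from codebookr: the helper name `summarise_without_missing` and the example data are made up for illustration, and the real files mentioned above may structure the calculation differently. The idea is only to drop missing values from the denominator and report their count separately.

```r
# Hypothetical helper (not part of codebookr): summarise a categorical
# variable while keeping missing values out of the percentage denominator.
summarise_without_missing <- function(x) {
  n_missing <- sum(is.na(x))        # reported separately, not as a category
  counts <- table(x, useNA = "no")  # counts of non-missing categories only
  stats <- data.frame(
    category = names(counts),
    n = as.integer(counts),
    percent = round(100 * as.integer(counts) / sum(counts), 1),
    stringsAsFactors = FALSE
  )
  list(stats = stats, n_missing = n_missing)
}

# Made-up pre/post example: the item is only asked at the post test,
# so the three pre-test rows are missing by design.
answer <- c(NA, NA, NA, "yes", "yes", "no")
result <- summarise_without_missing(answer)
print(result$stats)
print(result$n_missing)
```

Keeping the missing count as a separate value makes it possible to show it next to the table, or to replace it with a user-supplied number, without it affecting the percentages.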

For the purpose of this issue, I'm particularly interested in option 2 above. I wonder if it would be relatively easy to create a function that would allow users to inject arbitrary tables into the summary stats section of the codebook.
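As a rough starting point for option 2, the sketch below turns a user-supplied data frame into a flextable and appends it to a Word document with officer. It rests on several assumptions: the codebook is held as an officer `rdocx` object before printing, `officer::read_docx()` stands in for that object here, and the table values are placeholders. It also only appends the table to the end of the document; placing it inside a specific variable's summary stats section would need additional work.

```r
# Made-up summary table; the numbers are placeholders for illustration.
custom_stats <- data.frame(
  category = c("no", "yes"),
  n = c(1L, 2L),
  percent = c(33.3, 66.7)
)

# Only run the Word-related steps if both packages are installed.
if (requireNamespace("flextable", quietly = TRUE) &&
    requireNamespace("officer", quietly = TRUE)) {
  # Convert the data frame to a flextable and size the columns to fit.
  ft <- flextable::autofit(flextable::flextable(custom_stats))

  # Stand-in for the rdocx object that holds the codebook (an assumption).
  doc <- officer::read_docx()
  doc <- officer::body_add_par(doc, "Custom summary statistics", style = "Normal")
  doc <- flextable::body_add_flextable(doc, value = ft)

  # Write the document to a temporary .docx file.
  out_file <- tempfile(fileext = ".docx")
  print(doc, target = out_file)
}
```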

@mbcann01 mbcann01 self-assigned this May 23, 2024
@mbcann01 mbcann01 added the enhancement New feature or functionality label May 23, 2024
@mbcann01 mbcann01 changed the title Remove missing from the summary stats tables Create a function to inject arbitrary summary tables into the codebook May 23, 2024
@edambo

edambo commented Jun 5, 2024

@mbcann01 I think I got this to work. I created a function, cb_custom_summary_stats_to_ft, in the file cb_custom_summary_stats_to_ft.R to keep the same flextable formatting used elsewhere in the codebook. I also modified the codebook function so that the user can inject a list of data frames containing arbitrary summary statistics for specified columns. Per Morri's suggestion, I added a new argument, omit_na_columns, that prevents "Missing" from appearing as a category for specified variables in the resulting codebook file.

@edambo

edambo commented Jun 28, 2024

I've made the updates discussed on 6/14/2024 to address this issue:

  • Created a new function, cb_add_custom_summary_stats, that lets the user choose from a list of pre-defined statistics or define custom functions, and then pass the names of those functions to the codebook function.
  • Updated the function cb_custom_summary_stats_to_ft so that it adjusts the width of the summary statistics table according to the number of columns and accepts user-specified header labels.
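To illustrate the general idea of choosing statistics by name, here is a small base R sketch. It does not show the actual signature or behavior of cb_add_custom_summary_stats; the names `predefined_stats`, `compute_custom_stats`, and `range_width`, as well as the example data, are hypothetical and exist only for this illustration.

```r
# Hypothetical registry of pre-defined statistics; these names are made up
# and are not taken from codebookr.
predefined_stats <- list(
  n_non_missing = function(x) sum(!is.na(x)),
  mean = function(x) mean(x, na.rm = TRUE),
  median = function(x) stats::median(x, na.rm = TRUE)
)

# Hypothetical helper: look up the requested statistics by name, apply each
# one to a numeric column, and return the results as a data frame.
compute_custom_stats <- function(x, stat_names, extra_stats = list()) {
  available <- c(predefined_stats, extra_stats)
  unknown <- setdiff(stat_names, names(available))
  if (length(unknown) > 0) {
    stop("Unknown statistic(s): ", paste(unknown, collapse = ", "))
  }
  values <- vapply(
    available[stat_names],
    function(f) as.numeric(f(x)),
    numeric(1)
  )
  data.frame(
    statistic = stat_names,
    value = unname(values),
    stringsAsFactors = FALSE
  )
}

# Made-up data, plus a user-defined statistic passed in by name.
score <- c(10, 12, NA, 15, 11)
range_width <- function(x) diff(range(x, na.rm = TRUE))
stats_table <- compute_custom_stats(
  score,
  stat_names = c("n_non_missing", "mean", "range_width"),
  extra_stats = list(range_width = range_width)
)
print(stats_table)
```

A data frame like `stats_table` is the kind of input that could then be converted to a flextable for display in the codebook.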

Projects
Status: Active Development