Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check and fix tables that are degenerate #1089

Open
Melkiades opened this issue Mar 1, 2024 · 0 comments
Open

Check and fix tables that are degenerate #1089

Melkiades opened this issue Mar 1, 2024 · 0 comments

Comments

@Melkiades
Copy link
Contributor

Melkiades commented Mar 1, 2024

We recently encounter an issue with @pawelru (#1087) that is easily solvable with the following transformation

Here is the source code of the problem and its solution:

library(shiny)
library(teal.code)
library(teal.data)
library(teal.slice)
library(teal)
#> Registered S3 method overwritten by 'teal':
#>   method        from      
#>   c.teal_slices teal.slice
#> 
#> You are using teal version 0.15.0.9002
#> 
#> Attaching package: 'teal'
#> The following objects are masked from 'package:teal.slice':
#> 
#>     as.teal_slices, teal_slices
library(teal.transform)
library(formatters)
library(magrittr)
library(rtables)
#> 
#> Attaching package: 'rtables'
#> The following object is masked from 'package:utils':
#> 
#>     str
library(tern)
#> Registered S3 method overwritten by 'tern':
#>   method   from 
#>   tidy.glm broom
library(teal.modules.clinical)
library(scda)
#> 

ADSL <- synthetic_cdisc_dataset("latest", "adsl")
ADSL <- df_explicit_na(ADSL)
ADEG <- synthetic_cdisc_dataset("latest", "adeg")
ADEG <- df_explicit_na(ADEG)

stopifnot(rlang::hash(ADSL) == "552167ac39b643be30e10403ef01cf4e")
stopifnot(rlang::hash(ADEG) == "341c76eb8b14cd5903dc376532099b37")

ADEG <- dplyr::inner_join(x = ADEG, y = ADSL[, c("STUDYID", "USUBJID"),
  drop = FALSE
], by = c("STUDYID", "USUBJID"))

ANL_1 <- ADSL %>% dplyr::select(STUDYID, USUBJID, ARM)
ANL_2 <- ADEG %>% dplyr::select(STUDYID, USUBJID, PARAMCD, AVISIT, AVAL)
ANL_3 <- ADEG %>%
  dplyr::filter(PARAMCD == "HR") %>%
  dplyr::select(STUDYID, USUBJID, PARAMCD, AVISIT)
ANL <- ANL_1
ANL <- dplyr::inner_join(ANL, ANL_2, by = c("STUDYID", "USUBJID"))
ANL <- dplyr::inner_join(ANL, ANL_3, by = c("STUDYID", "USUBJID", "PARAMCD", "AVISIT"))
ANL <- ANL %>% teal.data::col_relabel(
  ARM = "Description of Planned Arm",
  AVAL = "Analysis Value", AVISIT = "Analysis Visit",
  PARAMCD = "Parameter Code"
)
ANL_ADSL_1 <- ADSL %>% dplyr::select(STUDYID, USUBJID, ARM)
ANL_ADSL <- ANL_ADSL_1
ANL_ADSL <- ANL_ADSL %>% teal.data::col_relabel(ARM = "Description of Planned Arm")
anl <- ANL %>% df_explicit_na(omit_columns = setdiff(names(ANL), c("AVISIT", "AVAL")), na_level = "<Missing>")
anl <- anl %>% dplyr::mutate(ARM = droplevels(ARM))
arm_levels <- levels(anl[["ARM"]])
ANL_ADSL <- ANL_ADSL %>% dplyr::filter(ARM %in% arm_levels)
ANL_ADSL <- ANL_ADSL %>% dplyr::mutate(ARM = droplevels(ARM))
ANL_ADSL <- df_explicit_na(ANL_ADSL, na_level = "<Missing>")
split_fun <- drop_split_levels
# PROBLEM ---------------------------------------------------------------------------------------
lyt <- rtables::basic_table(title = "Summary Table for AVAL by AVISIT") %>%
  rtables::split_cols_by("ARM", split_fun = drop_split_levels) %>%
  # rtables::add_overall_col("All Patients") %>% # Error in get_acolvar_vars(lyt) : length(clyt) == 1L is not TRUE
  rtables::add_colcounts() %>%
  rtables::split_rows_by("AVISIT",
    split_label = teal.data::col_labels(ANL, fill = FALSE)[["AVISIT"]],
    split_fun = split_fun, label_pos = "topleft"
  ) %>%
  split_cols_by_multivar(vars = "AVAL", varlabels = c(AVAL = "Analysis Value")) %>%
  summarize_colvars(
    na.rm = FALSE,
    denom = "N_col",
    .stats = c(
      "n",
      "mean_sd",
      "median",
      "range",
      "count"
    ),
    na_str = "<Missing>"
  )
all_zero <- function(tr) {
  if (!inherits(tr, "TableRow") || inherits(tr, "LabelRow")) {
    return(FALSE)
  }
  rvs <- unlist(unname(row_values(tr)))
  isTRUE(all(rvs == 0))
}
result <- rtables::build_table(lyt = lyt, df = anl, alt_counts_df = ANL_ADSL) %>%
  rtables::trim_rows(criteria = all_zero)

head(result) # The total is missing
#> Summary Table for AVAL by AVISIT
#> 
#> —————————————————————————————————————————————————————————————————
#>                    A: Drug X        B: Placebo     C: Combination
#>                  Analysis Value   Analysis Value   Analysis Value
#> Analysis Visit      (N=134)          (N=134)          (N=132)    
#> —————————————————————————————————————————————————————————————————
#> SCREENING                                                        
#>   n                   134              134              132      
#>   Mean (SD)       70.8 (20.1)      67.7 (18.9)      70.2 (19.3)  
#>   Median              72.2             66.3             69.9     
#>   Min - Max       11.8 - 116.5     30.3 - 124.4     22.3 - 129.1 
#> BASELINE
# SOLUTION ---------------------------------------------------------------------------------------
lyt <- rtables::basic_table(title = "Summary Table for AVAL by AVISIT") %>%
  rtables::split_cols_by("ARM", split_fun = split_fun) %>%
  rtables::add_overall_col("All Patients") %>%
  rtables::add_colcounts() %>%
  rtables::split_rows_by("AVISIT", split_label = teal.data::col_labels(ANL, fill = FALSE)[["AVISIT"]], 
                         split_fun = split_fun, label_pos = "topleft") %>%
  analyze_vars("AVAL", denom = "N_col", .stats = c("n", "mean_sd", "median", "range", "count"), na_str = "<Missing>")
result <- rtables::build_table(lyt = lyt, df = anl, alt_counts_df = ANL_ADSL) %>%
  rtables::trim_rows(criteria = all_zero)

head(result) # The total is not missing (colnames to fix)
#> Summary Table for AVAL by AVISIT
#> 
#> ————————————————————————————————————————————————————————————————————————————
#>                   A: Drug X      B: Placebo    C: Combination   All Patients
#> Analysis Visit     (N=134)        (N=134)         (N=132)         (N=400)   
#> ————————————————————————————————————————————————————————————————————————————
#> SCREENING                                                                   
#>   n                  134            134             132             400     
#>   Mean (SD)      70.8 (20.1)    67.7 (18.9)     70.2 (19.3)     69.5 (19.4) 
#>   Median             72.2           66.3            69.9            69.9    
#>   Min - Max      11.8 - 116.5   30.3 - 124.4    22.3 - 129.1    11.8 - 129.1
#> BASELINE

Created on 2024-03-01 with reprex v2.1.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant