Targets cannot find data when plotting survival fits using survminer #541

Closed
7 tasks done
arnold-c opened this issue Jul 1, 2021 · 1 comment

arnold-c commented Jul 1, 2021

Prework

  • Read and agree to the code of conduct and contributing guidelines.
  • Confirm that your issue is a genuine bug in the targets package itself and not a user error, known limitation, or issue from another package that targets depends on. For example, if you get errors running tar_make_clustermq(), try isolating the problem in a reproducible example that runs clustermq and not targets. And for miscellaneous troubleshooting, please post to discussions instead of issues.
  • If there is already a relevant issue, whether open or closed, comment on the existing thread instead of posting a new issue.
  • Post a minimal reproducible example like this one so the maintainer can troubleshoot the problems you identify. A reproducible example is:
    • Runnable: post enough R code and data so any onlooker can create the error on their own computer.
    • Minimal: reduce runtime wherever possible and remove complicated details that are irrelevant to the issue at hand.
    • Readable: format your code according to the tidyverse style guide.

Description

When using the {survminer} package to plot survival curves with the ggsurvplot() function, {targets} cannot find the data if it is a data.frame/tibble created within a notebook (specified in _targets.R with tar_render()). This is not an issue when the notebook is knit normally. A workaround is to use a modified plotting function that takes a summary data.frame (see example 3 below). I don't believe this is expected behavior with regard to the way {targets} accesses data.frames in memory, but please close if it is.

Reproducible example

library(dplyr)     # provides the filter() used on data frames below
library(survival)
library(survminer)
attach(lung)

# Create a data.frame in the notebook
test <- filter(lung, sex == 1)

# This works in a targets pipeline
reprex_survival_1 <- survfit(Surv(time, status) ~ 1, data = lung)
ggsurvplot(reprex_survival_1)

# This doesn't work in a targets pipeline
reprex_survival_2 <- survfit(Surv(time, status) ~ 1, data = test)
ggsurvplot(reprex_survival_2)

# This also works in a targets pipeline - presumably because the `surv_summary()`
# function creates a data.frame
reprex_survival_3 <- survfit(Surv(time, status) ~ 1, data = test)
ggsurvplot_df(surv_summary(reprex_survival_3))

Expected result

It should create the survival plot, as it does with examples 1 and 3 above.

Diagnostic information

sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 4.1.0 (2021-05-18)
#>  os       macOS Big Sur 10.16         
#>  system   x86_64, darwin17.0          
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       America/Toronto             
#>  date     2021-07-01                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date       lib source             
#>  cli           2.5.0   2021-04-26 [1] standard (@2.5.0)  
#>  digest        0.6.27  2020-10-24 [1] standard (@0.6.27) 
#>  evaluate      0.14    2019-05-28 [1] standard (@0.14)   
#>  fs            1.5.0   2020-07-31 [1] standard (@1.5.0)  
#>  glue          1.4.2   2020-08-27 [1] standard (@1.4.2)  
#>  highr         0.9     2021-04-16 [1] standard (@0.9)    
#>  htmltools     0.5.1.1 2021-01-22 [1] standard (@0.5.1.1)
#>  knitr         1.33    2021-04-24 [1] standard (@1.33)   
#>  magrittr      2.0.1   2020-11-17 [1] standard (@2.0.1)  
#>  reprex        2.0.0   2021-04-02 [1] standard (@2.0.0)  
#>  rlang         0.4.11  2021-04-30 [1] standard (@0.4.11) 
#>  rmarkdown     2.8     2021-05-07 [1] standard (@2.8)    
#>  rstudioapi    0.13    2020-11-12 [1] standard (@0.13)   
#>  sessioninfo   1.1.1   2018-11-05 [1] standard (@1.1.1)  
#>  stringi       1.6.2   2021-05-17 [1] standard (@1.6.2)  
#>  stringr       1.4.0   2019-02-10 [1] standard (@1.4.0)  
#>  withr         2.4.2   2021-04-18 [1] standard (@2.4.2)  
#>  xfun          0.23    2021-05-15 [1] standard (@0.23)   
#>  yaml          2.2.1   2020-02-01 [1] standard (@2.2.1)  
#> 
#> [1] /Library/Frameworks/R.framework/Versions/4.1/Resources/library
> traceback()
8: stop(error_run(...))
7: throw_run(conditionMessage(e), "\nVisit https://books.ropensci.org/targets/debugging.html ", 
       "for debugging advice.")
6: value[[3L]](cond)
5: tryCatchOne(expr, names, parentenv, handlers[[1L]])
4: tryCatchList(expr, classes, parentenv, handlers)
3: tryCatch(callr_dispatch(targets_function, targets_arguments, 
       callr_function, callr_arguments), callr_error = function(e) {
       throw_run(conditionMessage(e), "\nVisit https://books.ropensci.org/targets/debugging.html ", 
           "for debugging advice.")
   })
2: callr_outer(targets_function = tar_make_inner, targets_arguments = targets_arguments, 
       callr_function = callr_function, callr_arguments = callr_arguments)
1: (function (names = NULL, reporter = Sys.getenv("TAR_MAKE_REPORTER", 
       unset = "verbose"), callr_function = callr::r, callr_arguments = targets::callr_args_default(callr_function, 
       reporter)) 
   {
       assert_script()
       assert_flag(reporter, tar_make_reporters())
       assert_callr_function(callr_function)
       assert_list(callr_arguments, "callr_arguments mut be a list.")
       targets_arguments <- list(names_quosure = rlang::enquo(names), 
           reporter = reporter)
       out <- callr_outer(targets_function = tar_make_inner, targets_arguments = targets_arguments, 
           callr_function = callr_function, callr_arguments = callr_arguments)
       invisible(out)
   })()

wlandau commented Jul 1, 2021

I assume you mean a pipeline like this?

library(targets)
tar_script({
  library(survival)
  library(survminer)
  library(targets)
  list(
    tar_target(test, lung),
    tar_target(survival, survfit(Surv(time, status) ~ 1, data = test)),
    tar_target(plot, ggsurvplot(survival))
  )
})

tar_make()
#> Loading required package: ggplot2
#> Loading required package: ggpubr
#> • start target test
#> • built target test
#> • start target survival
#> • built target survival
#> • start target plot
#> x error target plot
#> • end pipeline
#> Error : object 'test' not found
#> Error: callr subprocess failed: object 'test' not found
#> Visit https://books.ropensci.org/targets/debugging.html for debugging advice.

Created on 2021-07-01 by the reprex package (v2.0.0)

This is an instance of #160 and not a bug in targets. For reproducibility purposes, targets runs each target in a non-global environment that inherits from globalenv(). Some packages like lme4 and apparently survminer/survival expect data to be in globalenv(), despite claims in the docs that data = NULL means the data is taken from the fit object. I can reproduce this error without targets, so I recommend following up with an issue in survminer or survival.

library(survival)
library(survminer)
#> Loading required package: ggplot2
#> Loading required package: ggpubr
envir <- new.env(parent = globalenv())
evalq({
  test <- lung
  survival <- survfit(Surv(time, status) ~ 1, data = test)
  ggsurvplot(survival)
}, envir = envir)
#> Error in eval(fit$call$data): object 'test' not found

Created on 2021-07-01 by the reprex package (v2.0.0)
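
To illustrate the mechanism with base R only, here is a minimal sketch. It assumes lm() and mtcars as stand-ins for survfit() and lung, and recover_data() is a hypothetical helper that only mimics the eval(fit$call$data) lookup shown in the error above. It is not survminer's actual code.

```r
# Fit a model inside a non-global environment, much like a target would.
envir <- new.env(parent = globalenv())
evalq({
  test <- mtcars
  fit <- lm(mpg ~ wt, data = test)
}, envir = envir)

# The fit stores the *name* of its data (a symbol), not the data itself.
stored <- envir$fit$call$data
print(stored)
#> test

# Hypothetical helper: re-evaluate the stored symbol unless data is supplied.
# It is defined outside `envir`, so `test` is not visible from here.
recover_data <- function(fit, data = NULL) {
  if (is.null(data)) eval(fit$call$data) else data
}

# Without explicit data, the lookup fails and we capture the error message.
result <- tryCatch(
  recover_data(envir$fit),
  error = function(e) conditionMessage(e)
)

# With explicit data, no lookup is needed.
recovered <- recover_data(envir$fit, data = envir$test)
```

In this sketch, result holds the "object 'test' not found" message and recovered is the original data frame, which is why supplying the data directly sidesteps the environment lookup.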

ggsurvplot() takes a data argument, so I recommend passing the data explicitly. The pipeline works if I do that.

library(targets)
tar_script({
  library(survival)
  library(survminer)
  library(targets)
  list(
    tar_target(test, lung),
    tar_target(survival, survfit(Surv(time, status) ~ 1, data = test)),
    tar_target(plot, ggsurvplot(survival, data = test))
  )
})

tar_make()
#> Loading required package: ggplot2
#> Loading required package: ggpubr
#> • start target test
#> • built target test
#> • start target survival
#> • built target survival
#> • start target plot
#> • built target plot
#> • end pipeline

Created on 2021-07-01 by the reprex package (v2.0.0)
