Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggestion: a new function ggcoxadjustedcurves to fix #67 #133

Closed
pbiecek opened this issue Feb 1, 2017 · 11 comments
Closed

Suggestion: a new function ggcoxadjustedcurves to fix #67 #133

pbiecek opened this issue Feb 1, 2017 · 11 comments

Comments

@pbiecek
Copy link
Contributor

@pbiecek pbiecek commented Feb 1, 2017

In #67 @markdanese proposed to add survival curves for cox model.
Not curves for average values of variables but curves averages over the distribution of variables observed in the data (like here: https://cran.r-project.org/web/packages/survival/vignettes/adjcurve.pdf, or in Marks' code here: https://gist.github.com/markdanese/d1ddfdb2f618373c719bdb444ca34be9)

Suggestion: add a ggcoxadjustedcurves() function, that will plot survival curve for the whole dataset (average over all individual curves) or for subsets generated by a given variable.

In addition, al individual curves may be also added to the plot.

Some examples:
For whole dataset

screen shot 2017-02-01 at 23 39 12

For selected variables

screen shot 2017-02-01 at 23 25 35

screen shot 2017-02-01 at 23 25 17

btw: I am not familiar with data.table, so this function uses dplyr and tidyr

pbiecek added a commit to pbiecek/survminer that referenced this issue Feb 1, 2017
kassambara added a commit that referenced this issue Feb 2, 2017
Fix for #133 and #64 - new function ggcoxadjustedcurves()
kassambara added a commit that referenced this issue Feb 2, 2017
@kassambara
Copy link
Owner

@kassambara kassambara commented Feb 2, 2017

R CMD check:

checking R code for possible problems ... NOTE
ggcoxadjustedcurves: no visible binding for global variable ‘time’
ggcoxadjustedcurves: no visible binding for global variable ‘value’
ggcoxadjustedcurves: no visible binding for global variable ‘.id’
Undefined global functions or variables:
.id time value
Consider adding
importFrom("stats", "time")
to your NAMESPACE file.

kassambara added a commit that referenced this issue Feb 2, 2017
@kassambara
Copy link
Owner

@kassambara kassambara commented Feb 2, 2017

In the survminer functions eval(fit$call$data) is commonly used to get the model data. This lead to repetitive issues, for example #125 & #9.

It might be better, if we add an extra argument, say data, so that user can provide the model data.

If yes, then we should update ggsurvplot(), ggcoxfunctional() & ggcoxadjustedcurves(). The R code to be added in these functions would look like:

if(missing(data)){
    warning ("...warning message...")
   data <- eval(fit$call$data)
}

What do you think @pbiecek , @MarcinKosinski ?

@kassambara
Copy link
Owner

@kassambara kassambara commented Feb 2, 2017

An error is generated with the R code below:

library(survival)
library(survminer)

fit <- coxph(Surv(time, status) ~ age + ph.ecog+sex, data =  lung)
ggcoxadjustedcurves(fit)
geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic?
Warning message:
Removed 185 rows containing missing values (geom_path). 

But, this works:

fit <- coxph(Surv(time, status) ~ age + sex, data =  lung)
ggcoxadjustedcurves(fit)
@pbiecek
Copy link
Contributor Author

@pbiecek pbiecek commented Feb 2, 2017

Regarding the eval(fit$call$data)
I had an issue when trying to use load() function with Shiny. Then the data is loaded to an other environment and the ggsurv returns and error 'data not found'.
I've checked these alternatives:
all.vars(formula(fit$call) - returns all variables from formula
colnames(broom::augment(fit)) - returns all variables and mode
broom::augment(fit) - returns an extended dataset, for some functions it may be enough.

But I agree, that the best way is to add additional data= argument.
Maybe this is a place for .get_data() function.

pbiecek added a commit to pbiecek/survminer that referenced this issue Feb 2, 2017
@pbiecek
Copy link
Contributor Author

@pbiecek pbiecek commented Feb 2, 2017

Regarding NOTEs in CHECK,
they are caused by non standard evaluation used in dplyr/tidyr.
To deal with them I've used a hack as suggested in http://stackoverflow.com/questions/9439256/how-can-i-handle-r-cmd-check-no-visible-binding-for-global-variable-notes-when (solution 2).

pbiecek added a commit to pbiecek/survminer that referenced this issue Feb 2, 2017
pbiecek added a commit to pbiecek/survminer that referenced this issue Feb 2, 2017
* 'master' of https://github.com/pbiecek/survminer:
  revert
  fix for kassambara#133, now the average survival curve is calculated even in some data is missing
pbiecek added a commit to pbiecek/survminer that referenced this issue Feb 2, 2017
…al with conflicts with NULL assignments in ggcoxadjustedcurves
@pbiecek
Copy link
Contributor Author

@pbiecek pbiecek commented Feb 2, 2017

The problem with
fit <- coxph(Surv(time, status) ~ age + ph.ecog+sex, data = lung)
was caused by missing values. Now I've added na.rm=TRUE to each mean() function and both curves are calculated properly.

@kassambara
Copy link
Owner

@kassambara kassambara commented Feb 2, 2017

oh, great!!!

@MarcinKosinski
Copy link
Contributor

@MarcinKosinski MarcinKosinski commented Feb 2, 2017

Good job with that function and with missing data solution.

Can we stick to one naming conventions for parameters? Instead of plot_all can we have plot.all or just all? The same with split_vars? It might be split.vars or just split / vars. Most of the parameters in functions from survminer don't use _ as we stick to . convention which isn't bad for parameters names : )

@MarcinKosinski
Copy link
Contributor

@MarcinKosinski MarcinKosinski commented Feb 2, 2017

@pbiecek can you also consider updating this file https://github.com/kassambara/survminer/blob/master/_pkgdown.yml with every new functionality? This is the configuration for the package site http://www.sthda.com/english/rpkgs/survminer/

@pbiecek
Copy link
Contributor Author

@pbiecek pbiecek commented Feb 2, 2017

@MarcinKosinski updated var names are in 7f5d590

@pbiecek
Copy link
Contributor Author

@pbiecek pbiecek commented Feb 2, 2017

regarding the convention for names
in the _pkgdown you have functions like 'surv_summary', 'surv_cutpoint', 'pairwise_survdiff', 'theme_classic2' so still some work to do to have clean names

pbiecek added a commit to pbiecek/survminer that referenced this issue Feb 2, 2017
kassambara added a commit that referenced this issue Feb 2, 2017
fix for #133, sorry for the mess with reverts
@pbiecek pbiecek closed this Feb 2, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
3 participants
You can’t perform that action at this time.