survival probabilities for glmnet proportional hazards model #33

topepo · 2021-03-15T15:20:10Z

We want to get survival probabilities for glmnet objects that use family = "cox". This page has details on the Cox model with glmnet and describes the special approach that is required for this model implementation.

There are a few complications that arise when building a wrapper around this model:

No formulas

glmnet() does not have a formula method, so something simple like Surv(time, event) ~ x_1 + x_2 isn't possible. In their examples, they pass the contents of the Surv() object as a matrix (or an object of class Surv) to the glmnet() function. I believe that the censored package already deals with this.

The main consequence of no formula method is that, for stratification, we cannot use the canonical function (e.g., Surv(time, event) ~ x_1 + x_2 + strata(var)). There is a different function that is used on the outcome object to store the stratification variable. The syntax looks like stratifySurv(surv_object, strata_var).
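As a rough illustration of the two points above, here is a minimal sketch on simulated data. The data, the variable names (x, time, event, strata_var), and the effect sizes are made up for the example, and stratifySurv() assumes a glmnet version that ships it (4.1 or later).

```r
library(survival)
library(glmnet)

set.seed(1)
n <- 200
x <- matrix(rnorm(n * 5), nrow = n, dimnames = list(NULL, paste0("x_", 1:5)))
time <- rexp(n, rate = exp(0.5 * x[, 1]))      # simulated event times
event <- rbinom(n, size = 1, prob = 0.7)       # 1 = event, 0 = censored

# glmnet() takes a predictor matrix and a Surv object (or a two-column
# matrix with columns "time" and "status") instead of a formula.
y <- Surv(time, event)
fit <- glmnet(x, y, family = "cox")

# Stratification is stored on the outcome rather than in a formula term.
strata_var <- rep(1:2, length.out = n)         # made-up stratification variable
y_strat <- stratifySurv(y, strata_var)
fit_strat <- glmnet(x, y_strat, family = "cox")
```

Both fits are ordinary glmnet objects that hold the full path of penalty values; the only difference is the class of the outcome passed in.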

Retain the training set

As with the survival package, survival probabilities for the Cox model are best computed by using the survfit() method on the model object. While glmnet does have a survfit() method for its Cox models, it requires the original training set data to work.

We'll have to attach x and y data to the fitted glmnet object when the model is fit.
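One way this could look is sketched below. The wrapper fit_coxnet() and the element names training_x and training_y are hypothetical and only for illustration; the sketch also keeps the data next to the fit in a plain list rather than modifying the glmnet object itself, which is an assumption about the design, not how censored necessarily does it.

```r
library(survival)
library(glmnet)

set.seed(2)
n <- 200
x <- matrix(rnorm(n * 3), nrow = n, dimnames = list(NULL, paste0("x_", 1:3)))
y <- Surv(rexp(n, rate = exp(0.5 * x[, 1])), rbinom(n, 1, 0.7))

# Hypothetical wrapper: fit the model and keep the training data with it.
fit_coxnet <- function(x, y) {
  fit <- glmnet(x, y, family = "cox")
  list(fit = fit, training_x = x, training_y = y)
}

res <- fit_coxnet(x, y)

# survfit() on a coxnet object needs the training x and y passed in again;
# newx holds the observations to predict for.
sf <- survfit(
  res$fit,
  s = 0.05,
  x = res$training_x,
  y = res$training_y,
  newx = x[1:2, , drop = FALSE]
)

# Survival probabilities at chosen time points, one column per new observation.
summary(sf, times = c(0.5, 1))$surv
```

The cost of this approach is that the stored object grows with the size of the training set.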

Predictions over penalty values

Like other glmnet objects, we can make predictions over many values of lambda for the same model object. For survival probabilities, this is also the case. When making such predictions, we also need to specify the time points. As a result, the standard nested tibble that censored produces will have a row for each combination of .time and lambda and would look something like:

# A tibble: 4 x 3
  .time .pred_survival penalty
  <dbl>          <dbl>   <dbl>
1     1          0.966    0.01
2    10          0.448    0.01
3     1          0.951    0.1 
4    10          0.421    0.1

The initial work here is to make a function similar to censored::cph_survival_prob() to get the data in this format. It looks like glmnet:::survfit.coxnet() produces a list of survfit objects, one for each value of lambda. It may not be too complex to get these predictions and then reformat them (which we have code to do for the results of survival:::survfit.coxph()).
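A minimal sketch of that reformatting step follows. The helper name coxnet_survival_prob() and its arguments are hypothetical, the data are simulated, and the result is a base data.frame rather than a tibble to keep the example dependency-free. For clarity it calls survfit() once per penalty value instead of relying on the list returned when several values are passed at once, and it handles a single new observation only.

```r
library(survival)
library(glmnet)

set.seed(3)
n <- 200
x <- matrix(rnorm(n * 3), nrow = n, dimnames = list(NULL, paste0("x_", 1:3)))
y <- Surv(rexp(n, rate = exp(0.5 * x[, 1])), rbinom(n, 1, 0.7))
fit <- glmnet(x, y, family = "cox")

# Hypothetical helper: survival probabilities for ONE new observation at
# the requested time points, with one block of rows per penalty value.
coxnet_survival_prob <- function(fit, x, y, new_x, times, penalties) {
  per_penalty <- lapply(penalties, function(penalty) {
    sf <- survfit(fit, s = penalty, x = x, y = y, newx = new_x)
    # extend = TRUE returns a value for every requested time, even past
    # the last observed time.
    surv <- summary(sf, times = times, extend = TRUE)$surv
    data.frame(
      .time = times,
      .pred_survival = as.numeric(surv),
      penalty = penalty
    )
  })
  do.call(rbind, per_penalty)
}

preds <- coxnet_survival_prob(
  fit, x, y,
  new_x = x[1, , drop = FALSE],
  times = c(0.5, 1),
  penalties = c(0.01, 0.1)
)
preds
```

The result has one row per combination of .time and penalty, matching the layout of the tibble shown above; dropping the penalty column gives the single-penalty form.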

Note: recall that, when a glmnet model is fit via parsnip, we require a single penalty value (even though the model produces all of the coefficients for the entire path of penalty values). For this reason, predict.model_fit() will only produce predictions for a single penalty value, so it should produce the above output without the penalty column. The multi_predict() method will have an argument for the penalty values and its results will look like the tibble above.


hfrick · 2021-04-16T15:13:54Z

see also #29 on multi_predict()

hfrick · 2021-07-07T09:56:13Z

Closed via #46, #61, and #70

github-actions · 2021-11-05T00:32:28Z

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

hfrick added the feature label (a feature request or enhancement) on Apr 16, 2021

hfrick mentioned this issue on Apr 21, 2021: survival probabilities for non-stratified Cox models via glmnet #46 (merged)

hfrick mentioned this issue on Jun 7, 2021: Survival probabilities for stratified Cox models via glmnet #61 (merged)

hfrick mentioned this issue on Jun 29, 2021: multi_predict() for coxnet models #70 (merged)

hfrick closed this as completed on Jul 7, 2021

github-actions bot locked and limited conversation to collaborators on Nov 5, 2021
