Add flexsurvspline support for survival_reg model #213

mattwarkentin · 2022-09-01T20:33:01Z

This PR is a start for addressing #115.

I have marked the PR as a draft because I think some changes to {parsnip} are required before this PR should be merged into {censored}.

I look forward to your feedback and hopefully getting this integrated into censored eventually.

hfrick · 2022-09-02T15:51:46Z

Hi @mattwarkentin, thanks for the PR!

Which changes in parsnip are you referring to? Let me know when you want feedback on the PR! (Now?)

mattwarkentin · 2022-09-02T15:59:46Z

I am somewhat new to contributing to parsnip-adjacent packages, but I thought that an update to the parsnip documentation would be important for showing the flexsurvspline engine and tunable arguments in the documentation (i.e., num_knots, survival_link).

Basically flexsurvspline versions of:

But maybe I'm wrong.

Sure, happy to receive feedback on the PR any time!

hfrick

This looks good already! Can you add tests though, please? Re parsnip changes: yes, the documentation updates need to happen in parsnip, like you highlighted already 👍

Next steps (at least what I can think of right now):

- add unit tests to censored
- PR to parsnip with documentation and making the engine args tunable (I left a more detailed comment on this on the code directly)
- Update vignette/articles/examples.Rmd
- Update README.Rmd
- todo for myself: add tests on case weights
- update NEWS, including PR number and contributor github name

R/survival_reg-data.R

mattwarkentin · 2022-10-20T16:50:12Z

Sorry for dropping the ball on this. I was moving across the country for a new job and just totally got sidetracked. Want me to get this back on track, @hfrick?

hfrick · 2022-10-25T14:25:35Z

No worries, there is no rush! If you want, that'd be great! The main things are the unit tests and the documentation over in parsnip.

mattwarkentin · 2022-10-27T16:35:12Z

For me:

- add unit tests to censored
- PR to parsnip with documentation and making the engine args tunable (I left a more detailed comment on this on the code directly)
- Update vignette/articles/examples.Rmd
- Update README.Rmd
- update NEWS, including PR number and contributor github name

mattwarkentin · 2022-10-27T17:25:14Z

Okay @hfrick, we are getting close. The one thing I'm stumbling on is that there surely needs to be somewhere where we connect the fact that OUR parameter is called num_knots, while flexsurv::flexsurvspline() uses k.

I thought maybe I handled this when I made the PR to dials, but it doesn't look like it. Where do we make that mapping?

This seems like something handled by parsnip::set_model_arg(), but that code was removed above. How do we do this?

hfrick · 2022-10-28T17:31:09Z

Awesome!

If the argument you want to make tunable is a main argument, i.e., an argument directly to survival_reg(), you use set_model_arg() like you initially tried. Since k is an engine-specific argument, it's more subtle. This happens in the tunable() method for survival_reg() (which sits in parsnip). You already modified the right things; the last missing bit was just to use the name of the arg as it is used in flexsurv. Then you have your link to dials and the tidymodels machinery can work. The changes you made in dials were to create the parameter object (tidymodels uses that to get possible parameter values for tuning). With the change in parsnip you link that to the engine argument and the machinery is connected.
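As a rough illustration of the link described above, the following base R sketch builds a one-row lookup table that ties the engine argument k to a dials parameter function. The column names and the lookup_dials_fun() helper are assumptions made for this illustration and are not presented as parsnip's actual internals.

```r
# Illustrative only: a minimal stand-in for the kind of table a tunable()
# method could return. Column names are assumptions, not parsnip's API.
engine_args <- data.frame(
  name = "k",                  # spelled the way flexsurv::flexsurvspline() spells it
  source = "model_spec",
  component = "survival_reg",
  component_id = "engine",
  stringsAsFactors = FALSE
)
# List column: which package and function supply the parameter object.
engine_args$call_info <- list(list(pkg = "dials", fun = "num_knots"))

# Hypothetical helper: find the parameter function linked to an engine argument.
lookup_dials_fun <- function(tbl, arg) {
  row <- which(tbl$name == arg)
  if (length(row) == 0) return(NA_character_)
  info <- tbl$call_info[[row]]
  paste0(info$pkg, "::", info$fun)
}

lookup_dials_fun(engine_args, "k")          # finds the dials function
lookup_dials_fun(engine_args, "num_knots")  # no match: the table is keyed on "k"
```

The point of the sketch is that the table is keyed on k, the engine's own spelling, while num_knots appears only as the name of the parameter function.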

Thanks for adding the tests! Could you update them so that they make use of the spline functionality? Then we know that this aspect also works! For survival probability and hazard, we don't need to test against rms if we test against flexsurv.

Re predictions of the linear predictor: I noticed in the test example that flexsurv returns negative values and censored then tries to log those, resulting in NaN. For flexsurvreg(), predictions by flexsurv are exp(x * beta), which is why censored logs them before returning them as predictions of type linear_pred (so that it's x * beta). How is that with flexsurvspline()? What exactly does it return, does it make sense to log here? See reprex below.

The main branch has changed a lot since this branch got started, but we are not seeing any merge conflicts. Maybe that's because it's a draft PR, so just a heads-up that we might still encounter some.

library(flexsurv)
#> Loading required package: survival

spline_fit <- flexsurvspline(
  Surv(time, status) ~ age + sex,
  data = lung
)
predict(spline_fit, lung[1:5,], type = "linear")
#> # A tibble: 5 × 2
#>   .time .pred_link
#>   <dbl>      <dbl>
#> 1     0      -7.63
#> 2     0      -7.72
#> 3     0      -7.92
#> 4     0      -7.90
#> 5     0      -7.85

non_spline_fit <- flexsurvreg(
  Surv(time, status) ~ age + sex,
  data = lung, dist = "weibull"
)
predict(non_spline_fit, lung[1:5,], type = "linear")
#> # A tibble: 5 × 2
#>   .time .pred_link
#>   <dbl>      <dbl>
#> 1     0       314.
#> 2     0       338.
#> 3     0       392.
#> 4     0       387.
#> 5     0       373.

Created on 2022-10-28 with reprex v2.0.2
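To make the NaN problem concrete, this base R sketch takes the (rounded) .pred_link values printed in the reprex above and applies log() to both sets. It only illustrates the arithmetic and is not the code censored runs.

```r
# Rounded values copied from the reprex output above.
spline_link  <- c(-7.63, -7.72, -7.92, -7.90, -7.85)  # from flexsurvspline()
weibull_link <- c(314, 338, 392, 387, 373)            # from flexsurvreg(), dist = "weibull"

# log() of a negative number is NaN (R also emits a warning, suppressed here).
logged_spline  <- suppressWarnings(log(spline_link))
logged_weibull <- log(weibull_link)

all(is.nan(logged_spline))      # every spline value breaks under log()
all(is.finite(logged_weibull))  # the positive Weibull values log cleanly
```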

mattwarkentin · 2022-10-31T17:06:23Z

> Could you update them so that they make use of the spline functionality? Then we know that this aspect also works! For survival probability and hazard, we don't need to test against rms if we test against flexsurv.

Tests are updated, and I removed the rms comparison.

> How is that with flexsurvspline()?

A flexsurvspline model is just a flexsurvreg model with a different distribution, so the predictions are made the same way with predict.flexsurvreg(), which ultimately relies on the machinery of summary.flexsurvreg(). So unless something is wrong, they should be returning comparable statistics, just on a different scale. Am I misunderstanding?

mattwarkentin · 2022-10-31T17:29:21Z

> For flexsurvreg(), predictions by flexsurv are exp(x * beta), which is why censored logs them before returning them as predictions of type linear_pred (so that it's x * beta).

Are we sure this is the case? Maybe we should loop in @chjackson and get his input.

chjackson · 2022-10-31T19:39:19Z

Yes, in flexsurv, predict(..., type = "linear") returns the "fitted values of the location parameter", understood as being on the natural scale, not logged. Perhaps I should disambiguate this doc.

hfrick · 2022-11-02T12:19:47Z

Thanks @chjackson! So just to be super clear: this applies to the predict methods for both the models from flexsurvreg() and flexsurvspline()?

Where I'm coming from with this question: For the existing engine, which uses flexsurvreg(), censored logs the prediction values returned from flexsurv because it standardizes across engines and most other engines return on that scale. So now I'm asking if censored needs to do the same for this new flexsurvspline engine?

chjackson · 2022-11-02T12:55:35Z

Both flexsurvspline and flexsurvreg return objects of class "flexsurvreg", so the same predict method will be used.

Is there a confusion here since flexsurv models can be based on different parametric survival distributions? The meaning of "location parameter" depends on the distribution. Sometimes this parameter is defined to be positive (such as rate and scale parameters in the exponential, Weibull or gamma), and sometimes it is unrestricted (such as meanlog in the log-normal, and the gamma0 parameter in flexsurvspline). So the "natural scale" of the parameter could either be positive or unrestricted. The "transformed scale" used for estimation is the log scale for positive parameters, and the same as the natural scale for unrestricted parameters.

I guess flexsurv conflicts here with user expectations that predict(...,type="linear") methods should always return quantities on the unrestricted scale?

chjackson · 2022-11-02T13:39:32Z

FYI if this is helpful, for a parameter named "parname" in a fitted flexsurv model x, the function x$dlist$transforms[["parname"]]() transforms a parameter value from its natural scale to the unrestricted scale. This is nearly always either log or the identity transformation, depending on the model. x$dlist$location contains the name of the location parameter.

mattwarkentin · 2022-11-02T18:02:09Z

@hfrick You may wish to use the functions in x$dlist$inv.transforms to get location parameters on the unrestricted scale. As Chris mentioned, it is often either identity() or log().

chjackson · 2022-11-02T18:09:41Z

No, it's x$dlist$transforms to go from the natural scale to the unrestricted scale, and x$dlist$inv.transforms to go the other way.
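The direction of the two lists can be sketched in base R with mock objects. The two lists below are hand-written stand-ins for the dlist element of a fitted model and mimic only the fields named in this thread (location, transforms, inv.transforms); the helper names to_unrestricted() and to_natural() are hypothetical, and the lookup by parameter name follows the description given above rather than a check against real flexsurv objects.

```r
# Mock stand-ins for x$dlist; real fitted objects contain more than this.
weibull_dlist <- list(
  location = "scale",
  transforms = list(shape = log, scale = log),      # natural -> unrestricted
  inv.transforms = list(shape = exp, scale = exp)   # unrestricted -> natural
)
spline_dlist <- list(
  location = "gamma0",
  transforms = list(gamma0 = identity, gamma1 = identity),
  inv.transforms = list(gamma0 = identity, gamma1 = identity)
)

# Hypothetical helpers: move a location-parameter value between scales.
to_unrestricted <- function(dlist, value) {
  dlist$transforms[[dlist$location]](value)
}
to_natural <- function(dlist, value) {
  dlist$inv.transforms[[dlist$location]](value)
}

to_unrestricted(weibull_dlist, 314)    # positive scale parameter is logged
to_unrestricted(spline_dlist, -7.63)   # gamma0 is already unrestricted
to_natural(weibull_dlist, to_unrestricted(weibull_dlist, 314))  # round trip
```

Choosing the transformation from the distribution's own metadata, rather than always calling log(), avoids the NaN results seen for the spline model.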

mattwarkentin · 2022-11-02T18:25:13Z

Oops, my mistake. Please ignore my previous comment.

hfrick · 2022-11-09T15:40:18Z

@mattwarkentin @chjackson Thank you both; that's really helpful to know! I've now changed the transformation for the link/linear predictor in this commit 31fd6b9 - could you confirm that this is correct now?

Other than that, I think this PR is ready!

chjackson · 2022-11-09T16:22:56Z

Can't see anything wrong with the transformation procedure there

mattwarkentin · 2022-11-09T16:27:13Z

LGTM!

hfrick · 2022-11-09T18:56:08Z

Great, thanks so much both!

github-actions · 2022-11-24T00:59:59Z

This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

Add flexsurvspline support for survival_reg model

50134a4

hfrick reviewed Sep 7, 2022

hfrick mentioned this pull request Sep 7, 2022

Rename flexsurv engine to flexsurvreg? #214

Closed

mattwarkentin mentioned this pull request Oct 27, 2022

Adds support for flexsurvspline engine for survival_reg model spec tidymodels/parsnip#831

Merged

mattwarkentin and others added 6 commits October 27, 2022 11:53

Remove code for setting model args

4861084

Update README with new engine

0d53528

Add NEWS bullet point with PR number

ac38a54

Add test coverage and example to vignette

c84c44f

no dist arg for the spline version

58af398

add github user name for Matt

ed9649e

simonpcouch mentioned this pull request Oct 28, 2022

Creating Parsnip Model Functions tidymodels/parsnip#832

Closed

Removed rms tests

20dec28

hfrick added 2 commits November 9, 2022 10:20

Merge commit '4043ce647abfb222ce03f9bdd4ffcb9f38d2e54e'

3b410e9

test against results directly from the engine

81d51e6

hfrick added 5 commits November 9, 2022 10:55

make use of k > 0 to use the splines

2c4b14e

use transformation dependent on location parameter

31fd6b9

test on case weights

d315b47

test matrix interface via fit_xy()

7bfa6ab

fix NEWS

4957763

hfrick marked this pull request as ready for review November 9, 2022 15:38

hfrick mentioned this pull request Nov 9, 2022

Transform linear_pred depending on the distribution #229

Merged

hfrick merged commit 04478f1 into tidymodels:main Nov 9, 2022

hfrick mentioned this pull request Nov 9, 2022

Support spline-based distribution for flexsurv #115

Closed

github-actions bot locked and limited conversation to collaborators Nov 24, 2022
