Predict survival probabilities at given times #58

Closed
gunesevitan opened this issue Jan 5, 2021 · 4 comments

Comments

@gunesevitan

Hi, I'm trying to predict survival probabilities of a population at 6, 12 and 24 months. lifelines.CoxPHFitter has that functionality: its predict_survival_function accepts a times parameter. I wonder if this is also possible with the DeepSurv model. If not, how can I achieve similar results?

@havakv
Owner

havakv commented Jan 8, 2021

Hi! There isn't an option to get that directly for DeepSurv here, but if you use model.predict_surv_df (like in this notebook), you get predictions in the form of a data frame where the index represents time. To get predictions for a given point in time, you just have to look at that index. Remember that the survival predictions of DeepSurv are step functions, so to get the survival prediction for time_t, you need something like

surv = model.predict_surv_df(x_test)
preds = surv[surv.index <= time_t].iloc[-1]
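
For several time points at once (for example 6, 12 and 24 months), the same step-function lookup could be wrapped in a small helper. This is only a sketch: `surv_at_times` is a hypothetical name, not part of the package, and it assumes `surv` has a sorted numeric time index with one column per individual, as returned by `model.predict_surv_df`.

```python
import numpy as np
import pandas as pd


def surv_at_times(surv: pd.DataFrame, times) -> pd.DataFrame:
    """Step-function lookup: for each requested time, take the row at the
    last index value that is <= that time (hypothetical helper)."""
    times = np.asarray(times, dtype=float)
    grid = surv.index.to_numpy(dtype=float)  # assumed sorted ascending
    # Position of the last grid point <= each requested time.
    pos = np.searchsorted(grid, times, side="right") - 1
    if (pos < 0).any():
        raise ValueError("a requested time precedes the first time in surv.index")
    out = surv.iloc[pos].copy()
    out.index = pd.Index(times, name="time")
    return out


# Toy stand-in for model.predict_surv_df(x_test): 3 time points, 2 individuals.
toy_surv = pd.DataFrame(
    [[0.9, 0.95], [0.7, 0.8], [0.5, 0.6]],
    index=[1.0, 5.0, 10.0],
)
preds = surv_at_times(toy_surv, [6, 12, 24])
```

Times beyond the last index value simply reuse the last row, which matches `surv[surv.index <= time_t].iloc[-1]`.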

Does this make sense, or do you want a more detailed explanation?

@gunesevitan
Author

gunesevitan commented Jan 19, 2021

Yeah, it makes perfect sense, but I needed the exact probabilities at the given timesteps because I was trying to evaluate my results with the AUC metric. I solved it by adding the new timesteps to the index of surv_df and then using linear interpolation to fill the missing rows. The results were acceptable.
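
For anyone finding this later, the interpolation described above might look roughly like the sketch below. It is not the exact code used here: `interpolate_surv` is a hypothetical helper name, and it assumes `surv` is the data frame from `model.predict_surv_df` with a sorted, unique, numeric time index.

```python
import numpy as np
import pandas as pd


def interpolate_surv(surv: pd.DataFrame, times) -> pd.DataFrame:
    """Add the requested times to the index and fill the new rows by
    linear interpolation along the time axis (hypothetical helper)."""
    surv = surv.copy()
    surv.index = surv.index.astype(float)
    times = pd.Index(np.asarray(times, dtype=float))
    # Union keeps the original time points and adds the requested ones, sorted.
    full_index = surv.index.union(times)
    # method="index" interpolates using the numeric index values as x-coordinates.
    filled = surv.reindex(full_index).interpolate(method="index")
    return filled.loc[times]


# Toy stand-in for surv_df: 3 time points, 1 individual.
toy_surv = pd.DataFrame({0: [1.0, 0.8, 0.4]}, index=[0.0, 10.0, 20.0])
probs = interpolate_surv(toy_surv, [5, 15])
```

Linear interpolation is a heuristic here, so requested times should lie within the range of the original index.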

@havakv
Owner

havakv commented Jan 20, 2021

That should probably work fine. A downside of the Cox model is that it only provides estimates at the event times in the training set (so the hazard is zero between these event times). While the Cox partial likelihood was a really ingenious idea for statistical purposes other than prediction, it is not obvious how best to predict between the event times of the training set.
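
To make the step-function point concrete, here is a sketch assuming the baseline hazard is estimated with the usual Breslow-type estimator. The notation is introduced only for illustration: $g(x)$ is the model output for covariates $x$, $T_i$ and $D_i$ are the observed time and event indicator of training individual $i$, and $R_i$ is the set of individuals still at risk at $T_i$.

$$
\hat H_0(t) = \sum_{i:\, T_i \le t,\ D_i = 1} \frac{1}{\sum_{j \in R_i} \exp\bigl(g(x_j)\bigr)},
\qquad
\hat S(t \mid x) = \exp\Bigl(-\hat H_0(t)\, \exp\bigl(g(x)\bigr)\Bigr)
$$

The sum only gains a term when $t$ passes an observed event time, so $\hat H_0$, and with it $\hat S$, stays constant between event times.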

@gunesevitan
Author

I see, that makes more sense now. Since the hazard is zero between event times, the model does not directly give survival probabilities at times between them. There are several workarounds, such as linear interpolation and regression, but they are probably outside the scope of this package.
