predictions on the scale of the response #17

univ12 · 2016-10-12T12:57:18Z

I am kind of "abusing" your package for fitting a GLM using the generalized gamma distribution you define in the package. To do this, I consider all observations to be events, no censoring. This works quite well in practice.
Example:

library(flexsurv)  # flexsurvreg(); Surv() comes from the survival package it depends on
df <- data.frame(y = runif(100, 1, 10), x1 = rnorm(100), x2 = rnorm(100))
m <- flexsurvreg(Surv(y) ~ x1 + x2, data = df, dist = "gengamma")

but I have difficulties predicting on new data, as the predictions are not on the scale of the response y.
I predict using

df.new <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
# covariate part of the linear predictor for the new data (rows 1-3 of m$res are mu, sigma, Q)
preds <- as.matrix(df.new) %*% m$res[-c(1:3), "est"]

But I want something like predict(m, df.new, type="response"). I could not find any information in the vignette.

chjackson · 2016-10-12T14:20:59Z

A couple of people have also suggested this feature to me. I'll put it on the long term todo list (I'm pretty busy in the next few months) but for reference, what exactly would you want this to do? For example it could behave like predict.survreg, and give a point estimate of survival time for someone with the given covariates. In survreg, the point estimate is defined by the location/scale/link parameterisation. In flexsurvreg, that could be the mean or the median of the fitted survival distribution, but a problem might be that these are not always available analytically, and numerical integration would be needed. Generalized gamma: I think the mean has a complicated form, but you could get the median with qgengamma(0.5, )
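To illustrate the point about the median and the mean, here is a minimal sketch with made-up parameter values (mu, sigma and Q below are hypothetical, not taken from any fit). It gets the median from qgengamma and the mean by numerically integrating the survivor function, using the identity that the mean of a positive variable equals the integral of its survivor function. Q = 0 is chosen because the generalized gamma then reduces to the log-normal, so the result can be checked against a closed form.

```r
library(flexsurv)

# Hypothetical parameter values, chosen only for illustration
mu <- 1.5
sigma <- 0.4
Q <- 0  # Q = 0 reduces the generalized gamma to the log-normal

# Median: available directly from the quantile function
med <- qgengamma(0.5, mu = mu, sigma = sigma, Q = Q)

# Mean: integrate the survivor function S(t) = P(T > t) over (0, Inf)
surv <- function(t) {
  pgengamma(t, mu = mu, sigma = sigma, Q = Q, lower.tail = FALSE)
}
mean.num <- integrate(surv, lower = 0, upper = Inf)$value

# Compare with the log-normal closed forms exp(mu) and exp(mu + sigma^2 / 2)
c(median = med, mean = mean.num,
  lognormal.median = exp(mu), lognormal.mean = exp(mu + sigma^2 / 2))
```

For other values of Q there is no such check, and the integration may need a finite upper limit or tighter tolerances if the distribution is heavy-tailed.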

univ12 · 2016-10-12T14:32:15Z

As mentioned, I use this in a slightly different manner. In my case, y is, say, an insurance premium, i.e. dollars or pounds. I have two covariates x1 and x2, e.g. age and income. I fit the model as above with the generalized gamma and flexsurv.
I now want to predict the premium for new customers. When I do this as above, with the matrix multiplication of the covariates, I get a prediction that is not on the scale of the response: it is not in dollars. I actually have no clue what it really is. How can I convert it to real dollars?
That would be some point estimate given the covariates, as you say.
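One possible reading of the numbers above: in flexsurvreg's generalized gamma, covariates act on the location parameter mu, which lives on the log scale of the response, so the matrix product is only the covariate contribution to mu (without the baseline). The sketch below, which is an illustration and not the package's own prediction method, adds the baseline mu and then uses qgengamma, as suggested earlier in the thread, to get a median on the scale of y. It assumes the rows of m$res are named mu, sigma, Q, x1 and x2.

```r
library(flexsurv)

set.seed(1)
df <- data.frame(y = runif(100, 1, 10), x1 = rnorm(100), x2 = rnorm(100))
m <- flexsurvreg(Surv(y) ~ x1 + x2, data = df, dist = "gengamma")

df.new <- data.frame(x1 = rnorm(5), x2 = rnorm(5))

# Estimates on the natural scale (assumed row names: mu, sigma, Q, x1, x2)
est <- m$res[, "est"]

# Location parameter for each new row: baseline mu plus covariate effects
X.new <- as.matrix(df.new[, c("x1", "x2")])
mu.new <- as.vector(est["mu"] + X.new %*% est[c("x1", "x2")])

# Predicted median of y for each new row, in the units of y
med <- qgengamma(0.5, mu = mu.new,
                 sigma = unname(est["sigma"]), Q = unname(est["Q"]))
med
```

Other quantiles follow by changing the first argument of qgengamma.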

jrdnmdhl · 2017-02-12T15:11:53Z

If I understand univ12 correctly as wanting the ability to calculate the predicted mean response, then #23 would implement this in summary.flexsurvreg.

chjackson · 2017-02-13T12:56:19Z

Yes. It would also be nice to have a predict() method to wrap this, which behaves like the predict methods for glm and/or survival.

jrdnmdhl · 2017-02-13T13:00:25Z

Perhaps summary should wrap the predict method instead? Predict usually just returns a vector of predicted values, so it is a subset of what is needed for summary.flexsurvreg as it currently operates. Though right now, summary.flexsurvreg operates a bit outside the norm of what summary functions do, by effectively generating predicted values rather than providing a more general summary of the model. Of course, changing that behavior would be a breaking change…

chjackson · 2017-02-13T13:19:01Z

print.flexsurvreg already presents the estimates, CIs, SEs, likelihood and the things that I think people would expect from a model summary. I never saw the value in having shorter output in the print method, e.g. why would you not want the standard errors alongside the coefficients? I think some of those methods are a relic from the S-Plus days when most people would work on text terminals! We have much better tools these days for auto-generation of reports and so on, where people can arrange the output in a way that's useful to them.

jrdnmdhl · 2017-02-20T03:22:52Z

I think this is also sufficiently addressed w/ #27, both in terms of means and medians.
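As a rough sketch of what that looks like from the user's side, assuming an installed flexsurv version in which summary.flexsurvreg accepts type = "mean" and type = "median" together with the newdata, ci and tidy arguments (check ?summary.flexsurvreg for the version you have):

```r
library(flexsurv)

set.seed(1)
df <- data.frame(y = runif(100, 1, 10), x1 = rnorm(100), x2 = rnorm(100))
m <- flexsurvreg(Surv(y) ~ x1 + x2, data = df, dist = "gengamma")

# Three hypothetical new customers
df.new <- data.frame(x1 = c(-1, 0, 1), x2 = c(0, 0, 0))

# ci = FALSE skips the confidence intervals; tidy = TRUE asks for a single
# data frame instead of a list with one element per covariate combination
med <- summary(m, newdata = df.new, type = "median", ci = FALSE, tidy = TRUE)
avg <- summary(m, newdata = df.new, type = "mean", ci = FALSE, tidy = TRUE)

med
avg
```

Both results are on the scale of y, which is what the original question asked for.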

chjackson · 2020-05-16T16:42:30Z

Merge issue with #39

chjackson closed this as completed May 16, 2020
