predictions on the scale of the response #17

Closed
univ12 opened this issue Oct 12, 2016 · 8 comments
Comments

@univ12

univ12 commented Oct 12, 2016

I am kind of "abusing" your package for fitting a GLM using the generalized gamma distribution you define in the package. To do this, I consider all observations to be events, no censoring. This works quite well in practice.
Example:

df <- data.frame(y = runif(100, 1, 10), x1 = rnorm(100), x2 = rnorm(100))
m <- flexsurvreg(Surv(y) ~ x1 + x2, data = df, dist = "gengamma")

but I have difficulties predicting on new data, as the predictions are not on the scale of the response y.
I predict using

df.new <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
preds <- as.matrix(df.new[, c("x1", "x2")]) %*% m$res[-c(1:3), "est"]

But I want something like predict(m, df.new, type="response"). I could not find any information in the vignette.

@chjackson
Owner

A couple of people have also suggested this feature to me. I'll put it on the long-term to-do list (I'm pretty busy for the next few months), but for reference, what exactly would you want this to do? For example, it could behave like predict.survreg and give a point estimate of survival time for someone with the given covariates. In survreg, the point estimate is defined by the location/scale/link parameterisation. In flexsurvreg, it could be the mean or the median of the fitted survival distribution, but these are not always available analytically, so numerical integration might be needed. For the generalized gamma, I think the mean has a complicated form, but you could get the median with qgengamma(0.5, )
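To make the median suggestion concrete, here is a minimal sketch that reuses the example data from the first comment. It assumes the covariates act on the location parameter `mu` (the default when they appear on the right-hand side of the formula) and that the rows of `m$res` are named `mu`, `sigma`, `Q`, `x1` and `x2`. The seed and the five-row `df.new` are illustrative choices, not part of the original example.

```r
library(flexsurv)

set.seed(1)  # illustrative seed, only for reproducibility
df <- data.frame(y = runif(100, 1, 10), x1 = rnorm(100), x2 = rnorm(100))
m <- flexsurvreg(Surv(y) ~ x1 + x2, data = df, dist = "gengamma")

df.new <- data.frame(x1 = rnorm(5), x2 = rnorm(5))

# Covariate effects are added to the baseline location parameter mu,
# so the matrix product alone is only the shift in mu, not a value of y.
beta  <- m$res[c("x1", "x2"), "est"]
mu    <- m$res["mu", "est"] +
  as.vector(as.matrix(df.new[, c("x1", "x2")]) %*% beta)
sigma <- m$res["sigma", "est"]
Q     <- m$res["Q", "est"]

# Median of the fitted distribution for each new row, on the scale of y
med <- qgengamma(0.5, mu = mu, sigma = sigma, Q = Q)
med
```

Under these assumptions, `med` holds one predicted median per row of `df.new`, in the same units as `y`.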

@univ12
Author

univ12 commented Oct 12, 2016

As mentioned, I use this in a slightly different manner. In my case, y is, say, an insurance premium in dollars or pounds. I have two covariates x1 and x2, e.g. age and income. I fit the generalized gamma model with flexsurv as above.
I now want to predict the premium for new customers. When I do this with the matrix multiplication of the covariates as above, the prediction is not on the scale of the response; it is not in dollars. I actually have no clue what that quantity really is. How can I convert it to real dollars?
That would be some point estimate given the covariates, as you say.

@jrdnmdhl
Contributor

If I understand univ12 correctly as wanting the ability to calculate the predicted mean response, then #23 would implement this in summary.flexsurvreg.
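As a rough sketch of what that could look like from the user's side: the `type = "mean"`, `type = "median"`, `ci` and `tidy` arguments of `summary.flexsurvreg` used below are assumed to be available in the installed version of flexsurv. They are not confirmed anywhere in this thread, so check `?summary.flexsurvreg` first. The seed and the five-row `df.new` are illustrative.

```r
library(flexsurv)

set.seed(1)  # illustrative seed, only for reproducibility
df <- data.frame(y = runif(100, 1, 10), x1 = rnorm(100), x2 = rnorm(100))
m <- flexsurvreg(Surv(y) ~ x1 + x2, data = df, dist = "gengamma")

df.new <- data.frame(x1 = rnorm(5), x2 = rnorm(5))

# Assumed arguments: type = "median" / "mean" return the median / mean of
# the fitted distribution for each row of newdata; ci = FALSE skips the
# simulation-based confidence intervals; tidy = TRUE returns one data frame.
pred_median <- summary(m, newdata = df.new, type = "median",
                       ci = FALSE, tidy = TRUE)
pred_mean   <- summary(m, newdata = df.new, type = "mean",
                       ci = FALSE, tidy = TRUE)

pred_median$est  # predicted medians on the scale of y
pred_mean$est    # predicted means on the scale of y
```

If those arguments exist as assumed, the `est` column is already on the scale of the response, so no manual back-transformation is needed.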

@chjackson
Owner

Yes. It would also be nice to have a predict() method that wraps this and behaves like the predict methods for glm and/or survival.

@jrdnmdhl
Contributor

jrdnmdhl commented Feb 13, 2017 via email

@chjackson
Owner

print.flexsurvreg already presents the estimates, CIs, SEs, likelihood and the things that I think people would expect from a model summary. I never saw the value in having shorter output in the print method, e.g. why would you not want the standard errors alongside the coefficients? I think some of those methods are a relic from the S-Plus days when most people would work on text terminals! We have much better tools these days for auto-generation of reports and so on, where people can arrange the output in a way that's useful to them.

@jrdnmdhl
Contributor

jrdnmdhl commented Feb 20, 2017

I think this is also sufficiently addressed with #27, in terms of both means and medians.

@chjackson
Owner

Merge issue with #39


3 participants