predict does not attempt to invert transformations to left hand side #1449

cancan101 opened this issue Mar 6, 2014 · 11 comments

@cancan101

Let's say I have a model like this:

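import numpy as np
from statsmodels.formula.api import ols
model = "np.log(y) ~ s"  # patsy looks up np.log in the calling namespace
fit = ols(model, data=data).fit()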

Presumably I am trying to predict y from s. I take the log of the left-hand side because I believe that gives me a linear model. Ultimately, though, I want the forecast value of y for a given value of s.

If I use fit.predict(s), I get predictions for log(y) rather than for y itself. Is there any way (for some subset of transformations whose inverse is known) to tell predict that I would like to predict y? Something like: fit.predict(s, 'y')
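
A minimal sketch of the manual workaround, written in plain Python so the arithmetic is visible: fit on the log scale, then apply the inverse transform to the prediction. The helper names (`fit_simple_ols`, `predict_log_y`, `predict_y`) and the data are made up for illustration and are not part of statsmodels.

```python
import math

def fit_simple_ols(x, y):
    """Closed-form least-squares intercept and slope for y ~ 1 + x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical, noise-free data: y = exp(1 + 0.5 * s)
s = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [math.exp(1.0 + 0.5 * si) for si in s]

# Fit on the transformed scale, i.e. log(y) ~ s
intercept, slope = fit_simple_ols(s, [math.log(yi) for yi in y])

def predict_log_y(s_new):
    """Prediction on the scale the model was estimated on."""
    return [intercept + slope * si for si in s_new]

def predict_y(s_new):
    """Naive back-transform: apply exp, the inverse of log, to the prediction."""
    return [math.exp(v) for v in predict_log_y(s_new)]
```

With a statsmodels fit the same idea would be applying np.exp to the output of fit.predict. Note that with noisy data and symmetric errors on the log scale, the naive back-transform estimates the median of y rather than its mean, which is one reason the question is not trivial.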

@josef-pkt
Member

Good question, but I think we need something like Margins to handle this case and get the extras (confint, ...).
I think it also breaks with our interpretation that predict returns the expected value.
A new predict_transformed?

Do we get anything that can help out of patsy?

In general we might need predefined functions, something like link functions, that also specify the inverse function.
Another case: support for the Box-Cox transformation.
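
One way such predefined, invertible transformations could look, as a hedged sketch only. The `Transform` class, `box_cox` factory, and `REGISTRY` are hypothetical names for illustration, not existing statsmodels or patsy objects.

```python
import math
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Transform:
    """Hypothetical pairing of a transform with its inverse, like a link function."""
    name: str
    forward: Callable[[float], float]
    inverse: Callable[[float], float]

LOG = Transform("log", math.log, math.exp)
SQRT = Transform("sqrt", math.sqrt, lambda z: z * z)

def box_cox(lmbda: float) -> Transform:
    """Box-Cox transform for y > 0; the lmbda == 0 case is defined as log."""
    if lmbda == 0:
        return LOG
    return Transform(
        f"boxcox({lmbda})",
        lambda y: (y ** lmbda - 1.0) / lmbda,
        lambda z: (lmbda * z + 1.0) ** (1.0 / lmbda),
    )

# Lookup table so a transform could be requested by name
REGISTRY = {"log": LOG, "sqrt": SQRT}
```

A predict variant could then look up the transform used on the left-hand side and apply its inverse to the linear prediction, for example REGISTRY["log"].inverse(2.0).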

@jseabold
Member

jseabold commented Mar 6, 2014

Stata handles this well by taking the approach I did in Margins for predict. You can use certain pre-defined transforms or pass your own IIRC.

@josef-pkt
Member

It's similar to what we do with "linear=True" in discrete models, but there we don't use any inference properties for the linear prediction, and the non-linear model is what is modeled and estimated.

Since it's related: https://groups.google.com/d/msg/pystatsmodels/51f6ZLErP8A/wySASdAk3iQJ
What are the stochastic assumptions on the model?

@josef-pkt
Member

Stata's predictnl? http://www.stata.com/manuals13/rpredictnl.pdf

That looks like a general approach.
The question is if there is anything extra that we get if the transform has already been used in estimation.
I guess Margins could just use something like predictnl when it exists. Maybe not: Margins calculates derivatives or differences, but the basics are the same.

@jseabold
Member

jseabold commented Mar 6, 2014

Yeah that sounds maybe right. I don't recall well though.

My idea was always just to take a general transformation.

mod.predict(s, transform=np.exp)

Or using pre-defined ones like Stata (though I might be misremembering)

mod.predict(s, transform="eydx")

Which is like margins for a linear model.

@cancan101
Author

What is Margins?

@josef-pkt
Member

Just another thought:

I prefer to keep it in a separate method like Stata, predict_nonlin (?), because we might want to have different ways of approximating the nonlinear prediction and its distribution.

Linear transformations don't change the distribution, assuming the underlying parameters are normal or t distributed.
For non-linear transformations we only have an approximation: a first-order approximation with the delta method, or possibly a higher-order one with bias correction. In several cases, Stata calculates the confidence intervals by transforming the confidence interval of the (original) parameters.
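
To make the two options concrete, here is a small sketch for the exp back-transform that compares a delta-method interval with an interval obtained by transforming the endpoints. The function name `exp_confidence_intervals` is made up, and the sketch assumes a normal (not t) reference distribution for simplicity.

```python
import math
from statistics import NormalDist

def exp_confidence_intervals(log_pred, log_se, alpha=0.05):
    """Two approximate intervals for exp(log_pred), given its standard error."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    y_hat = math.exp(log_pred)

    # Delta method: d/dx exp(x) = exp(x), so se(exp(x)) is roughly exp(x) * se(x)
    delta_se = y_hat * log_se
    delta_ci = (y_hat - z * delta_se, y_hat + z * delta_se)

    # Alternative: build the interval on the log scale and transform its endpoints
    endpoint_ci = (
        math.exp(log_pred - z * log_se),
        math.exp(log_pred + z * log_se),
    )
    return y_hat, delta_se, delta_ci, endpoint_ci
```

The delta-method interval is symmetric around exp(log_pred) and can reach below zero when the standard error is large, while the endpoint-transformed interval is asymmetric and always positive.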

(aside: I'm bumping into bias and bias correction for non-linear transformation in M-estimators for robust)

a semi random reference http://oregonstate.edu/instruct/fw431/sampson/LectureNotes/04-DeltaMethod.pdf

side note:
A transformation of y_hat is just a one-dimensional transformation of a univariate normal distribution.
A general nonlinear transformation of params would act on a multivariate normal distribution, R^k -> R^q.
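
For reference, the first-order delta-method approximation behind both cases can be written as follows. The symbols (theta, Sigma, g, J) are chosen here for illustration and are not notation used elsewhere in the thread.

```latex
% First-order (delta method) approximation, assuming \hat\theta is approximately normal
\hat\theta \overset{a}{\sim} N(\theta, \Sigma), \qquad g : \mathbb{R}^k \to \mathbb{R}^q
\;\Longrightarrow\;
g(\hat\theta) \overset{a}{\sim} N\!\left(g(\theta),\; J \Sigma J^\top\right),
\qquad J = \left.\frac{\partial g}{\partial \theta^\top}\right|_{\theta} \in \mathbb{R}^{q \times k}

% Special case k = q = 1, e.g. g = \exp applied to \hat y:
\operatorname{Var}\!\left(g(\hat y)\right) \approx g'(\hat y)^2 \, \operatorname{Var}(\hat y)
```

In practice J is evaluated at the estimate rather than at the unknown true value.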

@josef-pkt
Member

Related question (I never looked at this):
Do we have prediction intervals and standard errors for the mean (expectation) in discrete models, e.g. Poisson E(y|x) = lambda_hat = exp(x params)?
What's the estimation uncertainty on the resulting Poisson distribution of endog/y?
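
A rough sketch of what the two kinds of interval could look like for a Poisson mean, assuming the linear predictor x params and its standard error are already available. The function names are hypothetical, and the interval for y is only a crude heuristic (Poisson quantiles taken at the ends of the interval for the mean) with no guaranteed nominal coverage.

```python
import math
from statistics import NormalDist

def poisson_quantile(p, lam):
    """Smallest integer k with P(Y <= k) >= p for Y ~ Poisson(lam), moderate lam only."""
    pmf = math.exp(-lam)
    if pmf == 0.0:
        raise ValueError("lam too large for this simple summation")
    k = 0
    cdf = pmf
    while cdf < p:
        k += 1
        pmf *= lam / k  # recursion: P(Y = k) = P(Y = k - 1) * lam / k
        cdf += pmf
    return k

def poisson_intervals(eta, eta_se, alpha=0.05):
    """Interval for the mean exp(eta) and a heuristic interval for a new count y."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    mean = math.exp(eta)
    # Interval for E(y|x): transform the endpoints of the interval for eta
    mean_ci = (math.exp(eta - z * eta_se), math.exp(eta + z * eta_se))
    # Heuristic interval for y itself: widen the Poisson quantile range
    # by evaluating it at the lower and upper bounds of the mean
    y_interval = (
        poisson_quantile(alpha / 2.0, mean_ci[0]),
        poisson_quantile(1.0 - alpha / 2.0, mean_ci[1]),
    )
    return mean, mean_ci, y_interval
```

The interval for the mean reflects only estimation uncertainty in params, while the interval for y also has to cover the Poisson variation around that mean, which is why it is wider.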

@jseabold
Member

jseabold commented Mar 8, 2014

I'm 100% sure I've written this code, but I have no idea where it is. I thought I included it at some point. I'll look around.

@josef-pkt
Member

For Poisson prediction of the actual y values: http://stackoverflow.com/questions/17922637/prediction-intervals-for-poisson-regression-on-r

And to get confidence intervals on predicted probabilities (logit, probit, Poisson, ...):
http://scholar.google.ca/scholar?cluster=4184155442623977717&hl=en&as_sdt=0,5&sciodt=0,5

@article{xu2005confidence,
  title={Confidence intervals for predicted outcomes in regression models for categorical outcomes},
  author={Xu, Jun and Long, J Scott},
  journal={Stata Journal},
  volume={5},
  number={4},
  pages={537--559},
  year={2005},
  publisher={StataCorp LP}
}
