SUMM/Follow-up: GAM and penalized splines #2580

Open
5 tasks
josef-pkt opened this issue Aug 11, 2015 · 4 comments

josef-pkt commented Aug 11, 2015

#2435 PR with development discussion

update
#5370 latest version of GAM PR
#5296 previous PR with comments on most of the changes to get it to work correctly

TODO:

  • k-fold cross-validation needs more careful review and refactoring

  • check and enable recreating the model, e.g. llnull currently fails in GAM for discrete Logit (a generic failure in penal?). We might also want a new instance of the model for cross-validation or other penalization selection.
    old item, I'm not sure what the status of this is

Extension

  • tensor splines
  • Wood's confidence intervals
  • simultaneous confidence intervals (Kerby general PR and comment)
@josef-pkt
Member Author

scaling of penalty

I don't see an obvious range for pen_weight/alpha, especially after using a centering transform.
E.g. in a city_mpg B-spline example, GCV selects alpha = 15 million and k-fold CV selects 9 million (using a logspace grid). In one of these cases we have 12 basis functions and edf = 6.5 (with the constant included).
In other cases the selected alpha is much smaller.

It would be nice to get an estimate of alpha as a function of edf, so that we have at least an idea of a plausible range for alpha/pen_weight, and possibly a starting value for the GCV optimization.
I have seen this in other packages, but because edf is a nonlinear function of alpha, there is no immediate (closed-form) inverse that gives alpha for a target edf.
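
A minimal sketch of a numerical inverse, under stated assumptions: for a linear penalized smoother with penalty alpha * S and full-rank X'X, edf(alpha) = sum_i 1 / (1 + alpha * d_i), where d_i are the generalized eigenvalues of S relative to X'X. This is decreasing in alpha, so bisection on log(alpha) recovers alpha for a target edf. The function names and the eigenvalues below are hypothetical and only for illustration; this is not what the GAM PR implements.

```python
import math


def edf(alpha, eigs):
    """Effective degrees of freedom, sum_i 1 / (1 + alpha * d_i).

    eigs: generalized eigenvalues d_i of the penalty matrix S relative
    to X'X (assumed given; they could be computed with, for example,
    scipy.linalg.eigh(S, X.T @ X) when X'X is positive definite).
    """
    return sum(1.0 / (1.0 + alpha * d) for d in eigs)


def alpha_for_edf(target, eigs, lo=1e-12, hi=1e12, tol=1e-10):
    """Invert edf(alpha) by bisection on log(alpha).

    edf is monotonically decreasing in alpha, so the bracket [lo, hi]
    contains a solution whenever edf(hi) <= target <= edf(lo).
    """
    if not edf(hi, eigs) <= target <= edf(lo, eigs):
        raise ValueError("target edf not reachable within [lo, hi]")
    a, b = math.log(lo), math.log(hi)
    while b - a > tol:
        mid = 0.5 * (a + b)
        if edf(math.exp(mid), eigs) > target:
            a = mid  # fit still too flexible: move towards larger alpha
        else:
            b = mid
    return math.exp(0.5 * (a + b))


# Hypothetical spectrum for 12 basis functions: two zero eigenvalues
# (unpenalized directions) plus ten penalized directions.
eigs = [0.0, 0.0] + [10.0 ** -k for k in range(3, 13)]
alpha = alpha_for_edf(6.5, eigs)
print(alpha, edf(alpha, eigs))
```

The alpha found this way scales inversely with the eigenvalues, so rescaling or centering the basis shifts the plausible alpha range by orders of magnitude. A value obtained like this could serve as a starting point or as grid bounds for the GCV search.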

@josef-pkt
Member Author

Some of the underlying code still requires numpy arrays and does not support pandas objects.
#6487 (comment)

@josef-pkt
Member Author

smoke test for llnull works, but correctness is not verified.

@josef-pkt
Member Author

GAM does not extend _init_keys itself, so I expected problems when creating a new model instance.

Maybe not: GAM inherits from PenalizedMixin, which extends it:
self._init_keys.extend(['penal', 'pen_weight'])

However, the naming is inconsistent: 'pen_weight' is called alpha in GLMGAM.__init__.
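
To illustrate why the naming matters, here is a toy sketch of the pattern, assuming a model is recreated by passing the attributes listed in _init_keys back to __init__ as keywords. All classes here (ToyModel, ToyPenalizedMixin, ToyGAM) and the recreate method are hypothetical stand-ins, not the statsmodels implementation, and whether the real GAM class fails in this way is not confirmed in this thread.

```python
class ToyModel:
    """Hypothetical base class that records its init keywords."""

    def __init__(self, endog, **kwargs):
        self.endog = endog
        self._init_keys = list(kwargs)
        for key, value in kwargs.items():
            setattr(self, key, value)

    def _get_init_kwds(self):
        # Collect the stored keywords needed to rebuild the model.
        return {key: getattr(self, key, None) for key in self._init_keys}

    def recreate(self):
        # Rebuild the model from the stored data and keywords.
        return self.__class__(self.endog, **self._get_init_kwds())


class ToyPenalizedMixin:
    """Hypothetical mixin that registers its own keywords."""

    def __init__(self, *args, penal=None, pen_weight=1.0, **kwargs):
        super().__init__(*args, **kwargs)
        self.penal = penal
        self.pen_weight = pen_weight
        self._init_keys.extend(['penal', 'pen_weight'])


class ToyGAM(ToyPenalizedMixin, ToyModel):
    """Exposes the penalty weight as `alpha`, not `pen_weight`."""

    def __init__(self, endog, alpha=1.0):
        super().__init__(endog, penal="spline-penalty", pen_weight=alpha)


model = ToyGAM([1.0, 2.0, 3.0], alpha=5.0)
print(model._get_init_kwds())

try:
    model.recreate()
    recreated = True
except TypeError as exc:
    # ToyGAM.__init__ accepts `alpha`, but the stored keys are
    # `penal` and `pen_weight`, so the keyword names do not match.
    recreated = False
    print("recreate failed:", exc)
```

In this toy setup the fix would be either to store the weight under the same name the constructor accepts, or to map the stored keys back to constructor arguments before recreating the model.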
