Type safe regularization path segment selection #29

Merged: 4 commits into JuliaStats:master from the typesafe_segselect branch on May 4, 2019

AsafManela
Collaborator

The current approach to selecting a regularization path segment uses the select::Symbol keyword argument, as in coef(path; select=:AIC) or predict(path, newX; select=:AIC).
Because one of the options is to select coefficients from all segments with select=:all (the default), the type of the result depends on the value of select, so these methods are not type safe.
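
To make the concern concrete, here is a minimal, self-contained sketch of the pattern. It is not Lasso.jl code: `symbol_coef`, `typed_coef`, `LastSegment`, and the coefficient matrix are hypothetical stand-ins, and it assumes that "all segments" means a matrix with one column per segment.

```julia
# Hypothetical stand-in for a path's coefficients: one column per segment.
const segments = [0.0 0.5 0.9;
                  0.0 0.0 0.3]

# Symbol style: the shape of the result depends on the *value* of `select`.
function symbol_coef(B::Matrix{Float64}, select::Symbol)
    if select == :all
        return B            # every segment, as a matrix
    elseif select == :last
        return B[:, end]    # a single segment, as a vector
    else
        throw(ArgumentError("unknown selector $select"))
    end
end
symbol_coef(B; select::Symbol=:all) = symbol_coef(B, select)

# Typed style: the selector is a type, so dispatch fixes the result type.
struct LastSegment end
typed_coef(B::Matrix{Float64}, ::LastSegment) = B[:, end]

all_segments = symbol_coef(segments)                 # default select=:all
one_segment  = symbol_coef(segments; select=:last)

# What the compiler can infer from argument types alone.
inferred_symbol = Base.return_types(symbol_coef, (Matrix{Float64}, Symbol))
inferred_typed  = Base.return_types(typed_coef, (Matrix{Float64}, LastSegment))
```

In the Symbol version the compiler cannot narrow the result to a single concrete type, while the typed version has one concrete return type per selector.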

This PR deprecates these methods, and replaces them with similar methods that take a path and a SegSelect struct, which takes care of the logic of selecting a particular segment.
It implements MinAIC, MinAICc, MinBIC, MinCVmse, and MinCV1se segment selectors, and makes it easy to create new ones by defining a new SegSelect struct and implementing its segselect() method.
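
The extension pattern described here (a new struct plus one method) can be sketched roughly as follows. Everything in this block is a toy stand-in with hypothetical names (`ToySegSelect`, `ToyPath`, `toy_segselect`, `ToyFixedSegment`, and the criterion vectors); the actual SegSelect and segselect() definitions live in src/segselect.jl and may differ in signature.

```julia
# Toy stand-ins; none of these are Lasso.jl's actual definitions.
abstract type ToySegSelect end

struct ToyPath
    coefs::Matrix{Float64}   # one column per segment
    aic::Vector{Float64}     # hypothetical per-segment criterion values
    bic::Vector{Float64}
end

struct ToyMinAIC <: ToySegSelect end
struct ToyMinBIC <: ToySegSelect end

# Each selector only has to say which segment index it picks.
toy_segselect(path::ToyPath, ::ToyMinAIC) = argmin(path.aic)
toy_segselect(path::ToyPath, ::ToyMinBIC) = argmin(path.bic)

# Generic code is written once against the abstract type.
toy_coef(path::ToyPath, select::ToySegSelect) =
    path.coefs[:, toy_segselect(path, select)]

# Adding a new rule: one struct and one method, no changes to toy_coef.
struct ToyFixedSegment <: ToySegSelect
    index::Int
end
toy_segselect(path::ToyPath, s::ToyFixedSegment) = s.index

path = ToyPath([0.0 0.5 0.9; 0.0 0.0 0.3],
               [10.0, 7.0, 8.0],
               [10.0, 9.5, 9.0])

by_aic   = toy_coef(path, ToyMinAIC())          # segment 2
by_bic   = toy_coef(path, ToyMinBIC())          # segment 3
by_fixed = toy_coef(path, ToyFixedSegment(1))   # segment 1
```

Because every selector returns a single segment index, the generic accessor always returns a vector, whichever selector is passed.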

This PR also provides a simpler interface for fitting a lasso model and selecting the segment all in one call with a fit(RegularizedModel, X, y, dist, link; <kwargs>) method.
It returns a LinearModel or GeneralizedLinearModel representing the selected segment of a regularization path.

For example,

fit(LassoModel, X, y; select=MinBIC()) # BIC minimizing LinearModel 
fit(LassoModel, X, y, Binomial(), Logit(); 
    select=MinCVmse(path, 5)) # 5-fold CV mse minimizing model

This approach has the advantage that the model can be described (with coef) and used for prediction (with predict), without rerunning the selector, which can be expensive for cross-validating selectors.
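
A rough illustration of that advantage, under stated assumptions: the toy types below are hypothetical and not part of Lasso.jl, and a call counter stands in for an expensive selector such as cross-validation. The sketch compares querying a path (selection happens on every call) with querying a model that stored the selected coefficients once.

```julia
# Hypothetical toy types; a counter stands in for an expensive selector.
const SELECTOR_RUNS = Ref(0)

struct CVToyPath
    coefs::Matrix{Float64}    # one column per segment
    cvloss::Vector{Float64}   # made-up cross-validation loss per segment
end

function expensive_select(path::CVToyPath)
    SELECTOR_RUNS[] += 1      # count how often selection is performed
    return argmin(path.cvloss)
end

# Path-based style: every query reruns the selector.
path_coef(path::CVToyPath) = path.coefs[:, expensive_select(path)]
path_predict(path::CVToyPath, X::Matrix{Float64}) = X * path_coef(path)

# Model-based style: select once, then store the chosen coefficients.
struct SelectedToyModel
    coefs::Vector{Float64}
end
select_model(path::CVToyPath) = SelectedToyModel(path_coef(path))
model_predict(m::SelectedToyModel, X::Matrix{Float64}) = X * m.coefs

path = CVToyPath([0.0 0.5 0.9; 0.0 0.0 0.3], [3.0, 1.0, 2.0])
newX = [1.0 2.0; 3.0 4.0]

path_coef(path)
path_predict(path, newX)
runs_path_style = SELECTOR_RUNS[]     # one run per query

SELECTOR_RUNS[] = 0
m = select_model(path)
predictions = model_predict(m, newX)
runs_model_style = SELECTOR_RUNS[]    # a single run at selection time
```

The stored-model style pays the selection cost once, no matter how many times the coefficients or predictions are requested afterwards.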

@coveralls

Pull Request Test Coverage Report for Build 160

  • 34 of 58 (58.62%) changed or added relevant lines in 4 files are covered.
  • 5 unchanged lines in 1 file lost coverage.
  • Overall coverage increased (+1.5%) to 88.399%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| src/segselect.jl | 29 | 32 | 90.63% |
| src/deprecated.jl | 0 | 21 | 0.0% |

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| src/Lasso.jl | 5 | 62.91% |

| Totals | Coverage Status |
|---|---|
| Change from base Build 159 | 1.5% |
| Covered Lines | 541 |
| Relevant Lines | 612 |

💛 - Coveralls

1 similar comment
@coveralls

coveralls commented Apr 20, 2019

@AsafManela merged commit 92a07c9 into JuliaStats:master on May 4, 2019
@AsafManela deleted the typesafe_segselect branch on May 4, 2019 at 22:45