Skip to content

CV Attributes #2733

Open
wants to merge 3 commits into from

4 participants

@eshilts
eshilts commented Jan 9, 2014

Partially completed #2709. I'm opening this pull request to get feedback on the implementation of best_*_ attributes for LassoCV and ElasticNetCV and ideas for how to test this further. I noticed some inconsistencies between LassoCV/ElasticNetCV and GridCV with Lasso/ElasticNet which makes it difficult to validate against GridCV.

I'll continue with RidgeCV, LassoLarsCV, etc assuming this looks good so far.

@eshilts eshilts commented on the diff Jan 9, 2014
sklearn/linear_model/coordinate_descent.py
@@ -1003,6 +1005,7 @@ def fit(self, X, y):
self.coef_ = model.coef_
self.intercept_ = model.intercept_
self.dual_gap_ = model.dual_gap_
+ self.best_estimator_ = model
@eshilts
eshilts added a note Jan 9, 2014

Note that I'm always returning ElasticNet(...), even for Lasso models. This shouldn't make a difference as far as actual predictions go, but it may lead to confusion for a user who expects back a Lasso(...) model from LassoCV.

@agramfort
scikit-learn member
agramfort added a note Jan 9, 2014

agreed

we should sync with #2598 as it contains fixes too.

I need to find the time to review all this. Sorry guys

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@eshilts eshilts commented on the diff Jan 9, 2014
sklearn/linear_model/coordinate_descent.py
@@ -1080,6 +1083,20 @@ class LassoCV(LinearModelCV, RegressorMixin):
The dual gap at the end of the optimization for the optimal alpha
(``alpha_``).
+ ``best_estimator_`` : estimator
+ Estimator that was chosen by the search, i.e. estimator
+ which gave the lowest mean squared error on the left out data. The
+ estimator will be of type ElasticNet.
+
+ ``best_score_`` : float
+ Score of ``best_estimator_`` on the left out data
+ (i.e. best mean squared error). Note that by default GridSearchCV
@eshilts
eshilts added a note Jan 9, 2014

I note the default differences with GridCV. It's a little confusing if someone wants to compare the results of, say, LassoCV and GridCV of an SVM. It'd be nice to have a consistent default metric.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@coveralls

Coverage Status

Coverage remained the same when pulling 670cfb8 on eshilts:CV-attributes into 7de3d96 on scikit-learn:master.

@MechCoder
scikit-learn member

@eshilts What is the status on this PR? I would like to see this merged

@eshilts
eshilts commented Oct 11, 2014

Sorry for the delay. I'll get on it this week.

@MechCoder
scikit-learn member

Hi, if you are busy. I could cherry-pick your commits onto a new branch and finish this up. Should not take me much time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.