CV Attributes #2733

Open
eshilts wants to merge 3 commits into scikit-learn:master from eshilts:CV-attributes

4 participants

@eshilts

Partially completes #2709. I'm opening this pull request to get feedback on the implementation of the best_*_ attributes for LassoCV and ElasticNetCV, and to gather ideas on how to test this further. I noticed some inconsistencies between LassoCV/ElasticNetCV and GridSearchCV with Lasso/ElasticNet, which make it difficult to validate against GridSearchCV.

I'll continue with RidgeCV, LassoLarsCV, etc assuming this looks good so far.
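For reviewers, a minimal sketch of how the proposed attributes would be used on this branch (synthetic data and an arbitrary alpha grid, purely for illustration):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV

X, y = make_regression(n_samples=100, n_features=20, random_state=0)

# On this branch, LassoCV exposes GridSearchCV-style attributes in
# addition to the existing alpha_ and mse_path_.
lasso_cv = LassoCV(alphas=[0.1, 1.0, 10.0], cv=3).fit(X, y)
print(lasso_cv.best_params_)     # e.g. {'alpha': 0.1, 'l1_ratio': 1}
print(lasso_cv.best_score_)      # best mean squared error on left-out data
print(lasso_cv.best_estimator_)  # the refit model (an ElasticNet instance)
```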

@eshilts eshilts commented on the diff
sklearn/linear_model/coordinate_descent.py
@@ -1003,6 +1005,7 @@ def fit(self, X, y):
self.coef_ = model.coef_
self.intercept_ = model.intercept_
self.dual_gap_ = model.dual_gap_
+ self.best_estimator_ = model
@eshilts
eshilts added a note

Note that I'm always returning ElasticNet(...), even for Lasso models. This shouldn't make a difference as far as actual predictions go, but it may lead to confusion for a user who expects back a Lasso(...) model from LassoCV.
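A quick sketch of why predictions are unaffected: Lasso is exactly the elastic net with a pure L1 penalty, i.e. ``l1_ratio=1`` (synthetic data and an arbitrary alpha, for illustration):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso

X, y = make_regression(n_samples=100, n_features=20, random_state=0)

lasso = Lasso(alpha=0.5).fit(X, y)
# An elastic net with l1_ratio=1 applies no L2 penalty, so it solves
# the same problem as the Lasso above.
enet = ElasticNet(alpha=0.5, l1_ratio=1.0).fit(X, y)

print(np.allclose(lasso.coef_, enet.coef_))  # True
```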

@agramfort Owner

agreed

we should sync with #2598 as it contains fixes too.

I need to find the time to review all this. Sorry guys

@eshilts eshilts commented on the diff
sklearn/linear_model/coordinate_descent.py
@@ -1080,6 +1083,20 @@ class LassoCV(LinearModelCV, RegressorMixin):
The dual gap at the end of the optimization for the optimal alpha
(``alpha_``).
+ ``best_estimator_`` : estimator
+ Estimator that was chosen by the search, i.e. estimator
+ which gave the lowest mean squared error on the left out data. The
+ estimator will be of type ElasticNet.
+
+ ``best_score_`` : float
+ Score of ``best_estimator_`` on the left out data
+ (i.e. best mean squared error). Note that by default GridSearchCV
@eshilts
eshilts added a note

I note the default differences with GridSearchCV. It's a little confusing if someone wants to compare the results of, say, LassoCV and a GridSearchCV over an SVM. It'd be nice to have a consistent default metric.
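Until the defaults are reconciled, the two metrics are easy to translate for a fixed set of predictions, since R^2 = 1 - MSE / Var(y). A small sketch checking the identity on arbitrary data:

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

rng = np.random.RandomState(0)
y_true = rng.randn(50)
y_pred = y_true + 0.1 * rng.randn(50)

mse = mean_squared_error(y_true, y_pred)
# R^2 = 1 - SS_res / SS_tot; dividing both sums of squares by n gives
# the MSE and the (biased) variance, so the n's cancel.
r2 = 1.0 - mse / np.var(y_true)
print(np.allclose(r2, r2_score(y_true, y_pred)))  # True
```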

@coveralls

Coverage remained the same when pulling 670cfb8 on eshilts:CV-attributes into 7de3d96 on scikit-learn:master.

@MechCoder
Owner

@eshilts What is the status on this PR? I would like to see this merged

@eshilts

Sorry for the delay. I'll get on it this week.

@MechCoder
Owner

Hi, if you are busy, I could cherry-pick your commits onto a new branch and finish this up. It should not take me much time.

35 sklearn/linear_model/coordinate_descent.py
@@ -608,7 +608,7 @@ def fit(self, X, y):
if not self.warm_start or self.coef_ is None:
coef_ = np.zeros((n_targets, n_features), dtype=np.float64,
- order='F')
+ order='F')
else:
coef_ = self.coef_
if coef_.ndim == 1:
@@ -992,6 +992,8 @@ def fit(self, X, y):
self.alpha_ = best_alpha
self.alphas_ = np.asarray(alphas)
self.mse_path_ = np.squeeze(all_mse_paths)
+ self.best_score_ = best_mse
+ self.best_params_ = {'alpha': best_alpha, 'l1_ratio': best_l1_ratio}
# Refit the model with the parameters selected
model = ElasticNet()
@@ -1006,6 +1008,7 @@ def fit(self, X, y):
self.coef_ = model.coef_
self.intercept_ = model.intercept_
self.dual_gap_ = model.dual_gap_
+ self.best_estimator_ = model
return self
@@ -1091,6 +1094,20 @@ class LassoCV(LinearModelCV, RegressorMixin):
The dual gap at the end of the optimization for the optimal alpha
(``alpha_``).
+ ``best_estimator_`` : estimator
+ Estimator that was chosen by the search, i.e. estimator
+ which gave the lowest mean squared error on the left out data. The
+ estimator will be of type ElasticNet.
+
+ ``best_score_`` : float
+ Score of ``best_estimator_`` on the left out data
+ (i.e. best mean squared error). Note that by default GridSearchCV
+ computes ``best_score_`` based on the R^2, not the mean squared error,
+ which explains any difference between the two.
+
+ ``best_params_`` : dict
+ Parameter setting that gave the best results on the left out data.
+
Notes
-----
See examples/linear_model/lasso_path_with_crossvalidation.py
@@ -1106,6 +1123,8 @@ class LassoCV(LinearModelCV, RegressorMixin):
LassoLars
Lasso
LassoLarsCV
+ ElasticNetCV
+ ElasticNet
"""
path = staticmethod(lasso_path)
@@ -1200,6 +1219,19 @@ class ElasticNetCV(LinearModelCV, RegressorMixin):
Mean square error for the test set on each fold, varying l1_ratio and
alpha.
+ ``best_estimator_`` : estimator
+ Estimator that was chosen by the search, i.e. estimator
+ which gave the lowest mean squared error on the left out data.
+
+ ``best_score_`` : float
+ Score of ``best_estimator_`` on the left out data
+ (i.e. best mean squared error). Note that by default GridSearchCV
+ computes ``best_score_`` based on the R^2, not the mean squared error,
+ which explains any difference between the two.
+
+ ``best_params_`` : dict
+ Parameter setting that gave the best results on the left out data.
+
Notes
-----
See examples/linear_model/lasso_path_with_crossvalidation.py
@@ -1229,6 +1261,7 @@ class ElasticNetCV(LinearModelCV, RegressorMixin):
--------
enet_path
ElasticNet
+ LassoCV
"""
path = staticmethod(enet_path)
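As a further sanity check on the hunks above, ``best_score_`` should coincide with the minimum of the fold-averaged ``mse_path_`` that the estimator already stores; a sketch under that assumption (synthetic data, arbitrary grid):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNetCV

X, y = make_regression(n_samples=100, n_features=20, random_state=0)

enet_cv = ElasticNetCV(alphas=[0.1, 1.0, 10.0], cv=3).fit(X, y)

# mse_path_ has one column per fold; best_score_ is expected to match
# the smallest per-alpha average across folds.
expected = enet_cv.mse_path_.mean(axis=-1).min()
print(np.allclose(enet_cv.best_score_, expected))  # True
```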
34 sklearn/linear_model/least_angle.py
@@ -882,9 +882,18 @@ class LarsCV(Lars):
the mean square error on left-out for each fold along the path
(alpha values given by ``cv_alphas``)
+ ``best_score_`` : float
+ Best cross validation score on the left out data
+ (i.e. best mean squared error). Note that by default GridSearchCV
+ computes ``best_score_`` based on the R^2, not the mean squared error,
+ which explains any difference between the two.
+
+ ``best_params_`` : dict
+ Parameter setting that gave the best results on the left out data.
+
See also
--------
- lars_path, LassoLars, LassoLarsCV
+ lars_path, LassoLars, LassoLarsCV, Lasso, LassoCV
"""
method = 'lar'
@@ -965,11 +974,14 @@ def fit(self, X, y):
# Select the alpha that minimizes left-out error
i_best_alpha = np.argmin(mse_path.mean(axis=-1))
best_alpha = all_alphas[i_best_alpha]
+ best_mse = mse_path[i_best_alpha].mean()  # average across folds
# Store our parameters
self.alpha_ = best_alpha
self.cv_alphas_ = all_alphas
self.cv_mse_path_ = mse_path
+ self.best_score_ = best_mse
+ self.best_params_ = {'alpha': best_alpha}
# Now compute the full model
# it will call a lasso internally when self if LassoLarsCV
@@ -1055,6 +1067,15 @@ class LassoLarsCV(LarsCV):
the mean square error on left-out for each fold along the path
(alpha values given by ``cv_alphas``)
+ ``best_score_`` : float
+ Best cross validation score on the left out data
+ (i.e. best mean squared error). Note that by default GridSearchCV
+ computes ``best_score_`` based on the R^2, not the mean squared error,
+ which explains any difference between the two.
+
+ ``best_params_`` : dict
+ Parameter setting that gave the best results on the left out data.
+
Notes
-----
@@ -1135,6 +1156,15 @@ class LassoLarsIC(LassoLars):
``alpha_`` : float
the alpha parameter chosen by the information criterion
+ ``best_score_`` : float
+ Mean squared error at the ``alpha_`` selected by the information
+ criterion. Note that, unlike the CV estimators above, no data is
+ left out here: the score is computed on the training set.
+
+ ``best_params_`` : dict
+ Parameter setting that gave the best results on the left out data.
+
Examples
--------
>>> from sklearn import linear_model
@@ -1234,4 +1264,6 @@ def fit(self, X, y, copy_X=True):
self.alpha_ = alphas_[n_best]
self.coef_ = coef_path_[:, n_best]
self._set_intercept(Xmean, ymean, Xstd)
+ self.best_score_ = mean_squared_error[n_best]
+ self.best_params_ = {'alpha': self.alpha_}
return self
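For the LassoLarsIC hunk above, the selected ``alpha_`` can likewise be cross-checked against the stored criterion values; a small sketch (AIC, synthetic data):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoLarsIC

X, y = make_regression(n_samples=100, n_features=20, random_state=0)

clf = LassoLarsIC(criterion='aic').fit(X, y)

# alpha_ is the point on the Lars path that minimizes the information
# criterion stored in criterion_.
print(clf.alpha_ == clf.alphas_[np.argmin(clf.criterion_)])  # True
```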
31 sklearn/linear_model/tests/test_coordinate_descent.py
@@ -17,6 +17,10 @@
from sklearn.utils.testing import assert_warns
from sklearn.utils.testing import ignore_warnings
+from sklearn.cross_validation import KFold
+
+from sklearn.grid_search import GridSearchCV
+
from sklearn.linear_model.coordinate_descent import Lasso, \
LassoCV, ElasticNet, ElasticNetCV, MultiTaskLasso, MultiTaskElasticNet, \
lasso_path
@@ -370,6 +374,33 @@ def test_multioutput_enetcv_error():
assert_raises(ValueError, clf.fit, X, y)
+def test_lasso_cv_best_score():
+    X, y, X_test, y_test = build_dataset(n_features=20)
+    # Use the same folds for both searches so the scores are comparable.
+    cv = KFold(len(y), 3)
+    grid_cv = GridSearchCV(
+        Lasso(), {'alpha': [0.1, 1., 10.]}, scoring='mean_squared_error',
+        cv=cv
+    ).fit(X, y)
+    lasso_cv = LassoCV(alphas=[0.1, 1., 10.], cv=cv).fit(X, y)
+    # GridSearchCV negates the MSE so that greater is better, hence -1.
+    assert_equal(lasso_cv.best_score_, -1 * grid_cv.best_score_)
+
+
+def test_elasticnet_cv_best_score():
+    X, y, X_test, y_test = build_dataset(n_features=20)
+    cv = KFold(len(y), 3)
+    # Search the same (alpha, l1_ratio) grid with both estimators.
+    grid_cv = GridSearchCV(
+        ElasticNet(),
+        {'alpha': [0.1, 1., 10.], 'l1_ratio': [0.1, 0.5, 0.9, 0.95, 0.99]},
+        scoring='mean_squared_error', cv=cv
+    ).fit(X, y)
+    enet_cv = ElasticNetCV(alphas=[0.1, 1., 10.],
+                           l1_ratio=[0.1, 0.5, 0.9, 0.95, 0.99],
+                           cv=cv).fit(X, y)
+    assert_equal(enet_cv.best_score_, -1 * grid_cv.best_score_)
+
+
+def test_lasso_cv_l1ratio():
+ X, y, X_test, y_test = build_dataset(n_features=20)
+ lasso_cv = LassoCV().fit(X, y)
+ assert_equal(lasso_cv.best_params_['l1_ratio'], 1)
+
+
if __name__ == '__main__':
import nose
nose.runmodule()
2 sklearn/linear_model/tests/test_ridge.py
@@ -461,7 +461,7 @@ def test_ridgecv_store_cv_values():
"""
Test _RidgeCV's store_cv_values attribute.
"""
- rng = rng = np.random.RandomState(42)
+ rng = np.random.RandomState(42)
n_samples = 8
n_features = 5