Nonnegative lars #5131

Merged
merged 25 commits into scikit-learn:master on Aug 28, 2015

@michigraber
Contributor

michigraber commented Aug 18, 2015

I extended lars_path and the Lars and LassoLars estimators in the scikit-learn least_angle.py module with the option to restrict coefficients to be >= 0, using the method described in the original paper by Efron et al., 2004, section 3.4.
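A minimal sketch of what the option enables, on the diabetes data that the tests below also use (assuming the public lars_path signature; the positive keyword is the one this PR adds):

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import lars_path

diabetes = load_diabetes()
X, y = diabetes.data, diabetes.target

# Unconstrained path: some coefficients go negative along the way.
_, _, coefs = lars_path(X, y, method='lasso')
print((coefs < 0).any())  # True for the diabetes data

# With positive=True, every coefficient stays >= 0 along the whole path.
alphas, active, pos_coefs = lars_path(X, y, method='lasso', positive=True)
assert (pos_coefs >= 0).all()
```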

@michigraber

Contributor

michigraber commented Aug 18, 2015

I realize it's necessary to extend the positive parameter to the other estimators in this module as well.

@jnothman

@amueller

Member

amueller commented Aug 18, 2015

Thanks. I'm sure @agramfort is interested (and can review ;)

@michigraber

Contributor

michigraber commented Aug 18, 2015

will do so!
mi
On Aug 18, 2015 5:56 PM, "jnothman" notifications@github.com wrote:

Please add tests comparable to
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/linear_model/tests/test_coordinate_descent.py#L179



@@ -235,7 +247,10 @@ def lars_path(X, y, Xy=None, Gram=None, max_iter=500,
# #
##########################################################
    sign_active[n_active] = np.sign(C_)
    if positive:
        sign_active[n_active] = np.ones(C_.shape)

@agramfort

agramfort Aug 24, 2015

Member

np.ones(C_.shape) -> np.ones_like(C_)
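For context, the suggested change swaps an explicit shape lookup for the idiomatic helper, which also copies the dtype; a minimal illustration:

```python
import numpy as np

C_ = np.array(2.5, dtype=np.float32)

a = np.ones(C_.shape)   # shape copied, but dtype defaults to float64
b = np.ones_like(C_)    # shape *and* dtype copied from C_

print(a.dtype, b.dtype)  # float64 float32
```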

@@ -354,7 +369,10 @@ def lars_path(X, y, Xy=None, Gram=None, max_iter=500,
g1 = arrayfuncs.min_pos((C - Cov) / (AA - corr_eq_dir + tiny))
g2 = arrayfuncs.min_pos((C + Cov) / (AA + corr_eq_dir + tiny))

@agramfort

agramfort Aug 24, 2015

Member

you don't need g2 if positive

@@ -978,8 +1018,9 @@ class LarsCV(Lars):
def __init__(self, fit_intercept=True, verbose=False, max_iter=500,
             normalize=True, precompute='auto', cv=None,
             max_n_alphas=1000, n_jobs=1, eps=np.finfo(np.float).eps,
-            copy_X=True):
+            copy_X=True, positive=False):

@agramfort

agramfort Aug 24, 2015

Member

The position of the positive param is not consistent between the two classes. I would always put it as the last param to avoid breaking existing code.

@@ -772,13 +801,15 @@ class LassoLars(Lars):
def __init__(self, alpha=1.0, fit_intercept=True, verbose=False,
             normalize=True, precompute='auto', max_iter=500,
-            eps=np.finfo(np.float).eps, copy_X=True, fit_path=True):
+            positive=False, eps=np.finfo(np.float).eps, copy_X=True,
+            fit_path=True):

@agramfort

agramfort Aug 24, 2015

Member

here positive is in the middle.

@agramfort

Member

agramfort commented Aug 24, 2015

you need to add tests. See how the positive option is tested for the Lasso class.

@agramfort

Member

agramfort commented Aug 24, 2015

ping me when done. thanks @michigraber

@michigraber

Contributor

michigraber commented Aug 25, 2015

thanks for the input @agramfort! it's all incorporated.

@michigraber

Contributor

michigraber commented Aug 25, 2015

@agramfort and @jnothman, I added tests on the diabetes dataset that check the coefficients are nonnegative when the positive option is used, for the lars_path function and all estimator classes (without the option some coefficients are negative, which is also tested).

        self.alpha = alpha
        self.fit_intercept = fit_intercept
        self.max_iter = max_iter
        self.verbose = verbose
        self.normalize = normalize
        self.method = 'lasso'
        self.positive=positive

@agramfort

agramfort Aug 25, 2015

Member

pep8 : space around =

estimator = getattr(linear_model, estname)(positive=True, **params)
estimator.fit(diabetes['data'], diabetes['target'])
assert_true(min(estimator.coef_) >= 0)

@agramfort

agramfort Aug 25, 2015

Member

how long does it take to run?

can you add a test that compares the results of Lasso and LassoLars with positive=True? There is already a test; just update it to also test with positive=True.

@michigraber

michigraber Aug 26, 2015

Contributor

how long does it take to run?

In [2]: %timeit test_least_angle.test_estimatorclasses_positive_constraint()
10 loops, best of 3: 33 ms per loop
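A sketch of what the requested comparison could look like (assuming current estimator names; the alpha value and the norm used here are illustrative, not the values of the actual test, and as the discussion below shows, agreement is only expected away from very small alphas):

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso, LassoLars

diabetes = load_diabetes()
X, y = diabetes.data, diabetes.target

alpha = 0.5  # illustrative choice
lars = LassoLars(alpha=alpha, positive=True).fit(X, y)
cd = Lasso(alpha=alpha, positive=True, max_iter=10000).fit(X, y)

# Both solvers must respect the constraint ...
assert (lars.coef_ >= 0).all() and (cd.coef_ >= 0).all()
# ... and for a 'reasonable' alpha their solutions should roughly agree.
err = np.linalg.norm(lars.coef_ - cd.coef_)
```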

@michigraber

Contributor

michigraber commented Aug 26, 2015

@agramfort : About the test comparing Lasso and LassoLars with the positive option:

I had to disentangle this one a bit because the tests do not trivially pass, so I added two more functions instead of simply extending the existing one.

Two parts validate without problems; they're in test_lasso_lars_vs_lasso_cd_positive(). The middle part, however, where a range of alphas is used, leads to coefficient differences that are too large.

I isolated this test in another function called evaluate_lasso_lars_vs_lasso_cd_positive_middle_part() to get an idea of why it fails. Here is some printed output:

In [19]: test_least_angle.evaluate_lasso_lars_vs_lasso_cd_positive_middle_part()
alpha=0.01 err=25.7243223179 num_coeffs=(5/5) (LassoLars / Lasso)
alpha=0.0615789473684 err=18.2545548874 num_coeffs=(5/5) (LassoLars / Lasso)
alpha=0.113157894737 err=10.7847873501 num_coeffs=(5/5) (LassoLars / Lasso)
alpha=0.164736842105 err=3.31501982266 num_coeffs=(5/5) (LassoLars / Lasso)
alpha=0.216315789474 err=3.69727628935e-06 num_coeffs=(4/4) (LassoLars / Lasso)
alpha=0.267894736842 err=3.56502617927e-06 num_coeffs=(4/4) (LassoLars / Lasso)
alpha=0.319473684211 err=3.43302342569e-06 num_coeffs=(4/4) (LassoLars / Lasso)
alpha=0.371052631579 err=1.05509699804e-06 num_coeffs=(3/3) (LassoLars / Lasso)
alpha=0.422631578947 err=5.9269674114e-07 num_coeffs=(3/3) (LassoLars / Lasso)
alpha=0.474210526316 err=1.56511823248e-06 num_coeffs=(3/3) (LassoLars / Lasso)
alpha=0.525789473684 err=1.28253471214e-06 num_coeffs=(3/3) (LassoLars / Lasso)
alpha=0.577368421053 err=1.00002954531e-06 num_coeffs=(3/3) (LassoLars / Lasso)
alpha=0.628947368421 err=7.38618104242e-07 num_coeffs=(3/3) (LassoLars / Lasso)
alpha=0.680526315789 err=6.78234533087e-07 num_coeffs=(3/3) (LassoLars / Lasso)
alpha=0.732105263158 err=2.25315273789e-06 num_coeffs=(3/3) (LassoLars / Lasso)
alpha=0.783684210526 err=1.98825668767e-06 num_coeffs=(3/3) (LassoLars / Lasso)
alpha=0.835263157895 err=1.72877180183e-06 num_coeffs=(3/3) (LassoLars / Lasso)
alpha=0.886842105263 err=1.58242049123e-06 num_coeffs=(3/3) (LassoLars / Lasso)
alpha=0.938421052632 err=1.50923193078e-06 num_coeffs=(3/3) (LassoLars / Lasso)
alpha=0.99 err=1.43604831986e-06 num_coeffs=(3/3) (LassoLars / Lasso)

By the way, the diabetes dataset has 10 regressors, but even for very small alphas no more than 5 are selected with the positive option, for both estimators.

We find that the comparison fails for small alphas. This is consistent with a statement in the original publication of the least angle regression algorithm by Efron et al., 2004:

  "The positive Lasso usually does not converge to the full OLS solution beta_m, even for very large choices of t."

This is something of a 'flaw' of the method. For 'reasonable' alphas (i.e., including the default value alpha=1.0), however, the algorithm is of value.

How shall we account for this? A comment in the docstring?
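The sweep behind that table can be sketched as follows (a sketch under current class names; the alpha grid matches the printed output above, but the error metric of the original helper may differ):

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso, LassoLars

diabetes = load_diabetes()
X, y = diabetes.data, diabetes.target

# Same grid as the output above: 20 alphas from 0.01 to 0.99.
for alpha in np.linspace(0.01, 0.99, 20):
    lars = LassoLars(alpha=alpha, positive=True).fit(X, y)
    cd = Lasso(alpha=alpha, positive=True, max_iter=10000).fit(X, y)
    err = np.linalg.norm(lars.coef_ - cd.coef_)
    n_lars = int(np.sum(lars.coef_ > 0))
    n_cd = int(np.sum(cd.coef_ > 0))
    print(f"alpha={alpha:.4g} err={err:.4g} "
          f"num_coeffs=({n_lars}/{n_cd}) (LassoLars / Lasso)")
```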

@agramfort

Member

agramfort commented Aug 26, 2015

@michigraber

Contributor

michigraber commented Aug 27, 2015

I made an IPython notebook gist for this: https://gist.github.com/michigraber/7e7d7c75eca694c7a6ff

For the range of available LassoLars alphas/solutions there is very good agreement between the two solutions. After the last LassoLars alpha step (the smallest alpha) has been reached, the Lasso solution continues to evolve and diverges from the last LassoLars solution.

Ultimately the LassoLars path is very accurate down to the smallest alpha value it reaches, but not below.

@ogrisel

Member

ogrisel commented Aug 27, 2015

How shall we account for this? A comment in the docstring?

+1 for a comment in the docstring and an inline comment in the tests that explains why we don't check the equality of the coefficients for small values of alpha.

@michigraber

Contributor

michigraber commented Aug 28, 2015

@ogrisel and @agramfort : I added notes about the usage of the positive option to the functions and estimators in question. Please let me know if you do not agree with them or find them unclear.

I also updated the test function to compare only a restricted range of alpha values, and added some inline comments.

@agramfort

Member

agramfort commented Aug 28, 2015

thanks @michigraber merging.

I'll push a pep8 cosmit in master

agramfort added a commit that referenced this pull request Aug 28, 2015

@agramfort agramfort merged commit 4170473 into scikit-learn:master Aug 28, 2015

2 checks passed:
continuous-integration/appveyor: AppVeyor build succeeded
continuous-integration/travis-ci/pr: The Travis CI build passed
@agramfort

Member

agramfort commented Aug 28, 2015

cosmit pushed.

next time avoid merging master on the way. It prevented me from doing a clean rebase.

@agramfort

Member

agramfort commented Aug 28, 2015

for the record a8585a4

@michigraber

Contributor

michigraber commented Aug 29, 2015

cool! thanks for the support, guys!
