Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG + 1] Fix numerical instability in LassoLars when alpha=0 (#7778) #7849

Merged
merged 10 commits into from Nov 12, 2016

Conversation

@jmontoyam
Copy link

@jmontoyam jmontoyam commented Nov 9, 2016

This pull request fixes issue #7778: sklearn LassoLars implementation does not give the same result that the LassoLars implementation available in R (lars library).

What does this implement/fix? Explain your changes.

This mismatch is due to a bug in the file least_angle.py, in lines 406 and 407 (lars_path method).
The bug is related to the way these two lines use the np.resize function.
According to the docs of the np.resize function, If the new array is larger than the original array, then the new array is filled with repeated copies of a, which causes a subtle error, because in the lars_path implementation those new values are used as if they were equal to zero.

jmontoyam added 2 commits Nov 9, 2016
@jmontoyam
Copy link
Author

@jmontoyam jmontoyam commented Nov 9, 2016

@agramfort, @tguillemot, thank you very much for all your help and advices. This is my first pull request to sklearn, I'm very happy to start contributing to this amazing project!

@raghavrv raghavrv changed the title Fix bug 7778 [MRG] Fix numerical instability in LassoLars when alpha=0 (#7778) Nov 9, 2016
@raghavrv raghavrv added this to the 0.19 milestone Nov 9, 2016
@raghavrv raghavrv added the Bug label Nov 9, 2016
@raghavrv
Copy link
Member

@raghavrv raghavrv commented Nov 9, 2016

@jmontoyam Yohooo! Looking forward to more such awesome PRs :)

coefs = np.resize(coefs, (n_iter + add_features, n_features))
alphas = np.resize(alphas, n_iter + add_features)
coefs = np.concatenate((coefs,
np.zeros((add_features, n_features))),

This comment has been minimized.

@raghavrv

raghavrv Nov 9, 2016
Member

Could you add a space before np... to match the indentation

This comment has been minimized.

@jmontoyam

jmontoyam Nov 9, 2016
Author

@raghavrv Thank you very much!, I have added the space you suggested me ;)

Copy link
Author

@jmontoyam jmontoyam left a comment

@raghavrv Thank you very much!, I have added the space you suggested me ;)

coefs = np.resize(coefs, (n_iter + add_features, n_features))
alphas = np.resize(alphas, n_iter + add_features)
coefs = np.concatenate((coefs,
np.zeros((add_features, n_features))),

This comment has been minimized.

@jmontoyam

jmontoyam Nov 9, 2016
Author

@raghavrv Thank you very much!, I have added the space you suggested me ;)

alphas = np.resize(alphas, n_iter + add_features)
coefs = np.concatenate((coefs,
np.zeros((add_features, n_features))),
axis=0)

This comment has been minimized.

@agramfort

agramfort Nov 10, 2016
Member

can you bench if this is faster:

coefs = np.resize(coefs, (n_iter + add_features, n_features))
coefs[:-add_features, :] = 0.

This comment has been minimized.

@jmontoyam

jmontoyam Nov 10, 2016
Author

I think you mean:
coefs[-add_features:] = 0

The version you propose is a little bit faster (but the difference is very tiny, it is approx. 0.2ms faster according to my toy benchmark).

I will modify the code following your advice ;)

Thanks!

@agramfort
Copy link
Member

@agramfort agramfort commented Nov 10, 2016

thx @jmontoyam

can you check why the tests don't pass?

@agramfort
Copy link
Member

@agramfort agramfort commented Nov 10, 2016

LGTM

can you just update what's new page to document the bug fix?

thanks heaps !

@raghavrv raghavrv changed the title [MRG] Fix numerical instability in LassoLars when alpha=0 (#7778) [MRG + 1] Fix numerical instability in LassoLars when alpha=0 (#7778) Nov 10, 2016
@srmcc
Copy link

@srmcc srmcc commented Nov 10, 2016

Thank you for figuring this out! It's so much help...

jmontoyam
@jmontoyam
Copy link
Author

@jmontoyam jmontoyam commented Nov 11, 2016

@agramfort @raghavrv
Please help me:
Conflicting files: doc/whats_new.rst is
This is because this file has changed since the time I made the PR.
Please excuse me!, I know this is a very beginner question: what do you suggest me to do to solve this error?
If I update the upstream and at the same time some else modify again this file, there will be a conflict again, right?

@jmontoyam
Copy link
Author

@jmontoyam jmontoyam commented Nov 11, 2016

@agramfort @raghavrv
I think I have solved the conflict...but I don't know if my solution was the cleanest one :)

@NelleV NelleV dismissed raghavrv’s stale review Nov 12, 2016

Review is out of date.

@NelleV NelleV merged commit 5230382 into scikit-learn:master Nov 12, 2016
3 checks passed
3 checks passed
ci/circleci Your tests passed on CircleCI!
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@NelleV
Copy link
Member

@NelleV NelleV commented Nov 12, 2016

Thanks!

@raghavrv raghavrv added this to the 0.18.1 milestone Nov 12, 2016
@raghavrv raghavrv removed this from the 0.19 milestone Nov 12, 2016
@raghavrv
Copy link
Member

@raghavrv raghavrv commented Nov 12, 2016

@amueller Maybe you can sqeeze this into 0.18.1

@tguillemot
Copy link
Contributor

@tguillemot tguillemot commented Nov 13, 2016

@jmontoyam Congrats for your first PR !!!

@jmontoyam
Copy link
Author

@jmontoyam jmontoyam commented Nov 13, 2016

@tguillemot , thank you very much for all the advices and suggestions!, and for answering very kindly all the beginner questions that I asked you :)

sergeyf added a commit to sergeyf/scikit-learn that referenced this pull request Feb 28, 2017
…-learn#7778) (scikit-learn#7849)

* Fix bug 7778

* Add test_lasso_lars_vs_R_implementation

* Add a space to match the indentation

* Solve E501 line too long (80 > 79 characters)

* assert_array_almost_equal up to 12 decimals

* Tiny modification for increasing performance

* Update what's new page

* Trying to solve conflicts

* Solve conflict in doc/whats_new.rst
Sundrique added a commit to Sundrique/scikit-learn that referenced this pull request Jun 14, 2017
…-learn#7778) (scikit-learn#7849)

* Fix bug 7778

* Add test_lasso_lars_vs_R_implementation

* Add a space to match the indentation

* Solve E501 line too long (80 > 79 characters)

* assert_array_almost_equal up to 12 decimals

* Tiny modification for increasing performance

* Update what's new page

* Trying to solve conflicts

* Solve conflict in doc/whats_new.rst
NelleV added a commit to NelleV/scikit-learn that referenced this pull request Aug 11, 2017
…-learn#7778) (scikit-learn#7849)

* Fix bug 7778

* Add test_lasso_lars_vs_R_implementation

* Add a space to match the indentation

* Solve E501 line too long (80 > 79 characters)

* assert_array_almost_equal up to 12 decimals

* Tiny modification for increasing performance

* Update what's new page

* Trying to solve conflicts

* Solve conflict in doc/whats_new.rst
paulha added a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017
…-learn#7778) (scikit-learn#7849)

* Fix bug 7778

* Add test_lasso_lars_vs_R_implementation

* Add a space to match the indentation

* Solve E501 line too long (80 > 79 characters)

* assert_array_almost_equal up to 12 decimals

* Tiny modification for increasing performance

* Update what's new page

* Trying to solve conflicts

* Solve conflict in doc/whats_new.rst
maskani-moh added a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017
…-learn#7778) (scikit-learn#7849)

* Fix bug 7778

* Add test_lasso_lars_vs_R_implementation

* Add a space to match the indentation

* Solve E501 line too long (80 > 79 characters)

* assert_array_almost_equal up to 12 decimals

* Tiny modification for increasing performance

* Update what's new page

* Trying to solve conflicts

* Solve conflict in doc/whats_new.rst
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

6 participants