
[MRG+2]: Fix Ridge GCV with centered data #6178

Merged
merged 11 commits into scikit-learn:master on Oct 5, 2016

Conversation

@bthirion (Member) commented Jan 18, 2016

Addresses #1807
The following code solves it correctly as far as I can see. I've tried to deal cleanly with sample weights and sparse matrices, as well as the direct/kernel formulations.

Comments welcome!
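
A minimal way to sanity-check the fix (synthetic data, not code from this PR): on data with a large offset, the efficient leave-one-out/GCV path of RidgeCV should now agree with a brute-force leave-one-out grid search over the same alphas.

import numpy as np
from sklearn.linear_model import Ridge, RidgeCV
from sklearn.model_selection import GridSearchCV, LeaveOneOut

rng = np.random.RandomState(0)
X = rng.randn(40, 5)
y = X.dot(rng.randn(5)) + 100.0 + 0.5 * rng.randn(40)  # large intercept

alphas = np.logspace(-3, 3, 7)

# Efficient GCV / leave-one-out path (the code path touched by this PR).
gcv = RidgeCV(alphas=alphas, fit_intercept=True).fit(X, y)

# Brute-force leave-one-out over the same grid, refitting Ridge each time.
loo = GridSearchCV(Ridge(fit_intercept=True), {"alpha": alphas},
                   cv=LeaveOneOut(),
                   scoring="neg_mean_squared_error").fit(X, y)

# After the fix the two selections are expected to coincide.
print(gcv.alpha_, loo.best_params_["alpha"])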

@@ -854,7 +859,11 @@ def _errors_and_values_helper(self, alpha, y, v, Q, QT_y):
         -----
         We don't construct matrix G, instead compute action on y & diagonal.
         """
-        w = 1.0 / (v + alpha)
+        w = 1. / (v + alpha)
+        dummy_column = np.var(Q, 0) < 1.e-12

@GaelVaroquaux (Member), Jan 27, 2016

I think that I would call this "constant_column" rather than "dummy_column", no?

if centered_kernel:
    X_ = np.hstack((X, np.ones((X.shape[0], 1))))
else:
    X_ = X
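
As a side note, a tiny illustration (not from the PR) of what the hstack above does in the kernel formulation: appending a column of ones to X just adds a constant to the linear kernel, which is how the intercept enters the centered-kernel path.

import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(4, 3)

X_aug = np.hstack((X, np.ones((X.shape[0], 1))))

# The augmented linear kernel equals the original kernel plus a constant term.
assert np.allclose(X_aug.dot(X_aug.T), X.dot(X.T) + 1.0)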

@GaelVaroquaux (Member), Jan 27, 2016

You don't need the intermediate variable "X_". You can just do "X = np.hstack(...", it won't modify the original X that will keep living outside the function.
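
A quick sketch of that point, with a hypothetical augment helper: rebinding the parameter name inside the function only changes the local binding, because np.hstack allocates a new array; the caller's X is untouched.

import numpy as np

def augment(X):
    # Rebinds the local name only; np.hstack returns a new array.
    X = np.hstack((X, np.ones((X.shape[0], 1))))
    return X

X = np.zeros((3, 2))
X_aug = augment(X)
print(X.shape, X_aug.shape)  # (3, 2) (3, 3): the caller's X is unchanged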

c = np.dot(U, self._diag_dot(w, UT_y)) + y
G_diag = self._decomp_diag(w, U) + 1
c /= alpha # for compatibility with textbook version
G_diag /= alpha # for compatibility with textbook version

@GaelVaroquaux (Member), Jan 27, 2016

You have transferred the scaling in alpha from scaling w to scaling G_diag after. While at first glance it gives the same result, could you tell us why you did that?

@bthirion (Member), Feb 1, 2016

At some point, I was fighting with numerical issues when one of the singular values v is 0 and alpha very small. I am not sure whether this is actually the case. I am sure, however, that the result is identical to the previous version.

@bthirion (Member), Feb 1, 2016

Actually, I can revert that change: this will reduce the number of lines.
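
For the record, a small numerical sketch (synthetic U, v and y, not the exact expressions from the diff) of why deferring the 1 / alpha scaling gives the same c and G_diag while keeping the per-eigenvalue weights bounded even when an eigenvalue is exactly zero and alpha is tiny:

import numpy as np

rng = np.random.RandomState(0)
alpha = 1e-8

# Orthonormal eigenvectors and eigenvalues of a kernel, one eigenvalue zero.
U, _ = np.linalg.qr(rng.randn(5, 5))
v = np.array([3.0, 1.5, 0.7, 0.1, 0.0])
y = rng.randn(5)

# Textbook form: w blows up to ~1 / alpha on the null eigenvalue.
w_direct = 1. / (v + alpha)
c_direct = U.dot(w_direct * U.T.dot(y))
G_diag_direct = (U ** 2).dot(w_direct)

# Rewritten form: w stays in (-1, 0]; the 1 / alpha scaling is applied last.
w = -v / (v + alpha)
c = (U.dot(w * U.T.dot(y)) + y) / alpha
G_diag = ((U ** 2).dot(w) + 1) / alpha

assert np.allclose(c, c_direct)
assert np.allclose(G_diag, G_diag_direct)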

-        weighted_alpha = (sample_weight * alpha
-                          if sample_weight is not None
-                          else alpha)
+        weighted_alpha = alpha

@GaelVaroquaux (Member), Jan 30, 2016

If we are no longer reweighting alpha (as it seems to be the case if I read this right), shouldn't we avoid the renaming, and keep the name "alpha"?

@GaelVaroquaux (Member) commented Jan 30, 2016

A few minor comments that I would like addressed :)

centered_kernel = True
if sparse.issparse(X) or not self.fit_intercept:
    centered_kernel = False

@ogrisel (Member), Sep 13, 2016

This could be written:

centered_kernel = not sparse.issparse(X) and self.fit_intercept
@ogrisel (Member) commented Sep 13, 2016

LGTM as well. @mblondel do you want to have a look? Otherwise we can merge.

@ogrisel changed the title from "MRG: Fix gcv" to "[MRG+2]: Fix Ridge GCV with centered data" on Sep 13, 2016

@bthirion (Member) commented Sep 13, 2016

@ogrisel, addressed your comments and rebased on master.

@mblondel (Member) commented Sep 13, 2016

Thanks for the fix! Feel free to merge without my review.

@amueller (Member) commented Sep 13, 2016

flake8 is complaining ;)

@bthirion (Member) commented Sep 14, 2016

Fixed.

@ogrisel merged commit b32897f into scikit-learn:master on Oct 5, 2016

3 checks passed

ci/circleci: Your tests passed on CircleCI!
continuous-integration/appveyor/pr: AppVeyor build succeeded
continuous-integration/travis-ci/pr: The Travis CI build passed
@ogrisel (Member) commented Oct 5, 2016

Merged. I will add a what's new entry to master directly.

ogrisel added a commit that referenced this pull request Oct 5, 2016

amueller added a commit to amueller/scikit-learn that referenced this pull request Oct 14, 2016

DOC add what's new for scikit-learn#6178
# Conflicts:
#	doc/whats_new.rst

afiodorov added a commit to unravelin/scikit-learn that referenced this pull request Apr 25, 2017

Sundrique added a commit to Sundrique/scikit-learn that referenced this pull request Jun 14, 2017

paulha added a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017
