
[MRG+1] Fix omp non normalize #10071

Merged: 5 commits into scikit-learn:master, Nov 16, 2017

3 participants
agramfort (Member) commented Nov 5, 2017:

Reference Issues/PRs

There was no open issue; I fixed it directly. Basically, OMP was returning
garbage as soon as the norms of the columns of X were not 1, which happens
when the normalize=False option is set.

What does this implement/fix? Explain your changes.

See above.

Any other comments?

Thanks @josephsalmon for pointing out the bug.

@@ -91,7 +92,6 @@ def _cholesky_omp(X, y, n_nonzero_coefs, tol=None, copy_X=True,
# old scipy, we need the garbage upper triangle to be non-Inf
L = np.zeros((max_features, max_features), dtype=X.dtype)
L[0, 0] = 1.

agramfort (Member) commented Nov 5, 2017:

L is the Cholesky factor of the Gram matrix; L[0, 0] is 1 only for normalized columns.
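To illustrate the point, here is a minimal numpy sketch (the random data and variable names are illustrative, not from the PR): the leading diagonal entry of the Cholesky factor of the Gram matrix G = X.T @ X equals the norm of the first column, so hard-coding it to 1 is only valid for unit-norm columns.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))          # columns are not unit-norm

# Cholesky factor of the Gram matrix: L @ L.T == X.T @ X
L = np.linalg.cholesky(X.T @ X)

# The leading diagonal entry is the norm of the first column...
assert np.isclose(L[0, 0], np.linalg.norm(X[:, 0]))

# ...and it is 1 only once the columns are normalized.
Xn = X / np.linalg.norm(X, axis=0)
Ln = np.linalg.cholesky(Xn.T @ Xn)
assert np.isclose(Ln[0, 0], 1.0)
```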

@@ -110,17 +111,23 @@ def _cholesky_omp(X, y, n_nonzero_coefs, tol=None, copy_X=True,
overwrite_b=True,
**solve_triangular_args)
v = nrm2(L[n_active, :n_active]) ** 2
if 1 - v <= min_float: # selected atoms are dependent
Lkk = linalg.norm(X[:, lam]) ** 2 - v

agramfort (Member) commented Nov 5, 2017:

Same here: it's not 1 - v but ||Xk||^2 - v.
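The corrected update can be sketched by building the Cholesky factor of the Gram matrix of the selected atoms incrementally (a numpy sketch; the atom order idx and all variable names are made up for illustration, and this omits OMP's atom-selection step). The new diagonal entry is sqrt(||x_k||^2 - v), which reduces to the old sqrt(1 - v) only when the columns have unit norm.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 4))   # columns not unit-norm
idx = [2, 0, 3]                # hypothetical order of selected atoms

# Incrementally factor the Gram matrix of the selected atoms.
G = X[:, idx].T @ X[:, idx]
n = len(idx)
L = np.zeros((n, n))
L[0, 0] = np.linalg.norm(X[:, idx[0]])    # not 1 in general
for k in range(1, n):
    # Forward solve L[:k, :k] @ w = G[k, :k] gives the new row of L.
    w = np.linalg.solve(L[:k, :k], G[k, :k])
    L[k, :k] = w
    v = w @ w
    Lkk = np.linalg.norm(X[:, idx[k]]) ** 2 - v   # the fix: not 1 - v
    L[k, k] = np.sqrt(Lkk)

# The incremental factor matches a direct Cholesky of the Gram matrix.
assert np.allclose(L, np.linalg.cholesky(G))
```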

@@ -22,6 +22,9 @@
n_samples, n_features, n_nonzero_coefs, n_targets = 20, 30, 5, 3
y, X, gamma = make_sparse_coded_signal(n_targets, n_features, n_samples,
n_nonzero_coefs, random_state=0)
# Make X not of norm 1 for testing
X *= 10
y *= 10

agramfort (Member) commented Nov 5, 2017:

Now all tests are run with non-normalized X.

coef_normalized = omp.coef_[0].copy()
omp.set_params(fit_intercept=True, normalize=False)
omp.fit(X, y[:, 0])
assert_array_almost_equal(coef_normalized, omp.coef_)

agramfort (Member) commented Nov 5, 2017:

Here I check that coef_ is the same whether you normalize during fit or not.
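The equivalence this test asserts can be sketched outside of OMP with a plain least-squares fit on a fixed support (toy data; the variable names are illustrative, and the intercept is omitted for simplicity): fitting on column-normalized data and rescaling the coefficients back gives the same result as fitting on the raw data.

```python
import numpy as np

rng = np.random.default_rng(2)
X = 10 * rng.normal(size=(20, 3))    # columns far from unit norm
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=20)

# Fit on column-normalized data, then rescale the coefficients back.
norms = np.linalg.norm(X, axis=0)
coef_n, *_ = np.linalg.lstsq(X / norms, y, rcond=None)
coef_normalized = coef_n / norms

# Fit directly on the raw data.
coef_raw, *_ = np.linalg.lstsq(X, y, rcond=None)

# Both routes give the same coefficients on the original scale.
assert np.allclose(coef_normalized, coef_raw)
```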

agramfort (Member) commented Nov 8, 2017:

Maybe @vene, can you have a look?

@TomDLT TomDLT changed the title from Fix omp non normalize to [MRG+1] Fix omp non normalize Nov 15, 2017

TomDLT approved these changes Nov 15, 2017:

LGTM

@jnothman

I don't know OMP or the linalg well, but from a non-expert perspective, this looks well-justified and tested. LGTM

@@ -120,6 +120,10 @@ Classifiers and regressors
error for prior list which summed to 1.
:issue:`10005` by :user:`Gaurav Dhingra <gxyd>`.
- Fixed a bug in :class:`linear_model.OrthogonalMatchingPursuit` that was

jnothman (Member) commented Nov 15, 2017:

I suppose this should be mentioned in changed models above so that users expect different results without reading the full changelog.

doc/whats_new/v0.20.rst (outdated)
codecov bot commented Nov 16, 2017:

Codecov Report

Merging #10071 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #10071      +/-   ##
==========================================
+ Coverage   96.19%   96.19%   +<.01%     
==========================================
  Files         336      336              
  Lines       62739    62749      +10     
==========================================
+ Hits        60353    60363      +10     
  Misses       2386     2386
Impacted Files Coverage Δ
sklearn/linear_model/omp.py 95.95% <100%> (+0.06%) ⬆️
sklearn/linear_model/tests/test_omp.py 100% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update abb43c1...7fdff59. Read the comment docs.


agramfort (Member) commented Nov 16, 2017:

@jnothman I rephrased the what's new entry.

jnothman (Member) commented Nov 16, 2017:

I'd not meant to "request changes". Thanks.

@jnothman jnothman merged commit f8b09ce into scikit-learn:master Nov 16, 2017

3 of 6 checks passed:

- codecov/patch: CI failed
- codecov/project: CI failed
- continuous-integration/appveyor/pr: AppVeyor build failed
- ci/circleci: Your tests passed on CircleCI!
- continuous-integration/travis-ci/pr: The Travis CI build passed
- lgtm analysis (Python): No alert changes

jwjohnson314 pushed a commit to jwjohnson314/scikit-learn that referenced this pull request Dec 18, 2017

Fix omp non normalize (scikit-learn#10071)
* fix OMP when columns of X are not normalized