[MRG+2] Relax assumption on the data for the SkewedChi2Sampler. #7573

Merged
merged 1 commit into from May 18, 2017

Conversation

Projects
None yet
5 participants
@RomainBrault
Contributor

RomainBrault commented Oct 4, 2016

What does this implement/fix? Explain your changes.

Since the Skewed-Chi^2 kernel is defined on the open interval (-self.skewedness; +\infty)^d, the transform function should not check whether X < 0 but whether X < -self.skewedness.

The line after the check does X += self.skewedness.
So if X >= self.skewedness then (X += self.skewedness.) >= 0. Then we can take the logarithm of this without any trouble.

@agramfort

This comment has been minimized.

Show comment
Hide comment
@agramfort

agramfort Oct 4, 2016

Member

can you add a test?

Member

agramfort commented Oct 4, 2016

can you add a test?

@RomainBrault

This comment has been minimized.

Show comment
Hide comment
@RomainBrault

RomainBrault Oct 5, 2016

Contributor

Sure!

I just saw I need to change the error message raised.

Contributor

RomainBrault commented Oct 5, 2016

Sure!

I just saw I need to change the error message raised.

@RomainBrault

This comment has been minimized.

Show comment
Hide comment
@RomainBrault

RomainBrault Oct 5, 2016

Contributor

I added tests checking:

  • if X is negative and greater than -skewedness should pass
  • if X is negative and greater than -skewedness then the transform should not contain any NaN
  • if X is negative and smaller than -skewedness should raise
Contributor

RomainBrault commented Oct 5, 2016

I added tests checking:

  • if X is negative and greater than -skewedness should pass
  • if X is negative and greater than -skewedness then the transform should not contain any NaN
  • if X is negative and smaller than -skewedness should raise
@@ -1,7 +1,8 @@
import numpy as np
from scipy.sparse import csr_matrix
-from sklearn.utils.testing import assert_array_equal, assert_equal, assert_true
+from sklearn.utils.testing import assert_array_equal, assert_equal, \

This comment has been minimized.

@amueller

amueller Oct 5, 2016

Member

parenthesis are better ;)

@amueller

amueller Oct 5, 2016

Member

parenthesis are better ;)

sklearn/utils/testing.py
@@ -149,6 +149,13 @@ def assert_greater_equal(a, b, msg=None):
assert a >= b, message
+def assert_no_nan(a, msg=None):

This comment has been minimized.

@amueller

amueller Oct 5, 2016

Member

there's assert_all_finite, doesn't that work?

@amueller

amueller Oct 5, 2016

Member

there's assert_all_finite, doesn't that work?

sklearn/utils/testing.py
@@ -149,6 +149,13 @@ def assert_greater_equal(a, b, msg=None):
assert a >= b, message
+def assert_no_nan(a, msg=None):

This comment has been minimized.

@amueller

amueller Oct 5, 2016

Member

there's assert_all_finite, doesn't that work?

@amueller

amueller Oct 5, 2016

Member

there's assert_all_finite, doesn't that work?

This comment has been minimized.

@RomainBrault

RomainBrault Oct 6, 2016

Contributor

I think if we check with assert_all_finite then we should check whether
X <= -self.skewedness
instead of
X < -self.skewedness

To avoid having some log(0)=infty. I don't have any preferences though.

@RomainBrault

RomainBrault Oct 6, 2016

Contributor

I think if we check with assert_all_finite then we should check whether
X <= -self.skewedness
instead of
X < -self.skewedness

To avoid having some log(0)=infty. I don't have any preferences though.

This comment has been minimized.

@jnothman

jnothman Oct 6, 2016

Member

assert_all_finite raises ValueError not AssertionError. As such it is a validation, not a testing, helper...

@jnothman

jnothman Oct 6, 2016

Member

assert_all_finite raises ValueError not AssertionError. As such it is a validation, not a testing, helper...

This comment has been minimized.

@RomainBrault

RomainBrault Oct 6, 2016

Contributor

So I should keep the function assert_no_nan?

@RomainBrault

RomainBrault Oct 6, 2016

Contributor

So I should keep the function assert_no_nan?

@amueller

This comment has been minimized.

Show comment
Hide comment
@amueller

amueller Oct 5, 2016

Member

lgtm apart from comments

Member

amueller commented Oct 5, 2016

lgtm apart from comments

sklearn/kernel_approximation.py
@@ -181,7 +181,8 @@ def transform(self, X, y=None):
----------
X : array-like, shape (n_samples, n_features)
New data, where n_samples in the number of samples
- and n_features is the number of features.
+ and n_features is the number of features. All values of X must be
+ greater than minus the "skewedness" parameter of the kernel.

This comment has been minimized.

@jnothman

jnothman Oct 6, 2016

Member

I think this is awkward. I think the following is clear enough:

greater than `-skewedness`.
@jnothman

jnothman Oct 6, 2016

Member

I think this is awkward. I think the following is clear enough:

greater than `-skewedness`.

This comment has been minimized.

@RomainBrault

RomainBrault Oct 6, 2016

Contributor

ok

@jnothman

This comment has been minimized.

Show comment
Hide comment
@jnothman

jnothman Nov 6, 2016

Member

LGTM. Please update master and rebase.

Member

jnothman commented Nov 6, 2016

LGTM. Please update master and rebase.

@jnothman jnothman changed the title from [MRG] Relax assumption on the data for the SkewedChi2Sampler. to [MRG+2] Relax assumption on the data for the SkewedChi2Sampler. Nov 6, 2016

@jnothman

This comment has been minimized.

Show comment
Hide comment
@jnothman

jnothman Nov 6, 2016

Member

Also please note this enhancement in whats_new.rst

Member

jnothman commented Nov 6, 2016

Also please note this enhancement in whats_new.rst

@ngoix

This comment has been minimized.

Show comment
Hide comment
@ngoix

ngoix Apr 20, 2017

Contributor

Looks good to me, +1 for merge after rebase

Contributor

ngoix commented Apr 20, 2017

Looks good to me, +1 for merge after rebase

@agramfort

This comment has been minimized.

Show comment
Hide comment
@agramfort

agramfort Apr 22, 2017

Member

@RomainBrault you need to rebase

Member

agramfort commented Apr 22, 2017

@RomainBrault you need to rebase

doc/whats_new.rst
@@ -86,6 +86,12 @@ Enhancements
attribute of ``best_estimator_``. :issue:`7661` and :issue:`8295`
by :user:`Alyssa Batula <abatula>`, :user:`Dylan Werner-Meier <unautre>`,
and :user:`Stephen Hoover <stephen-hoover>`.

This comment has been minimized.

@agramfort

agramfort May 11, 2017

Member

insert empty line

@agramfort

agramfort May 11, 2017

Member

insert empty line

doc/whats_new.rst
@@ -86,6 +86,12 @@ Enhancements
attribute of ``best_estimator_``. :issue:`7661` and :issue:`8295`
by :user:`Alyssa Batula <abatula>`, :user:`Dylan Werner-Meier <unautre>`,
and :user:`Stephen Hoover <stephen-hoover>`.
+ - Relax assumption on the data for the SkewedChi2Sampler. Since the
+ Skewed-Chi^2 kernel is defined on the open interval (-self.skewedness;
+ +\infty)^d, the transform function should not check whether X < 0 but

This comment has been minimized.

@agramfort

agramfort May 11, 2017

Member

use :math: so this renders fine

use also double `` around code

@agramfort

agramfort May 11, 2017

Member

use :math: so this renders fine

use also double `` around code

sklearn/utils/testing.py
@@ -96,6 +96,44 @@
assert_raises_regexp = assert_raises_regex
+<<<<<<< HEAD
+=======

This comment has been minimized.

@agramfort

agramfort May 11, 2017

Member

oups :)

@agramfort

agramfort May 11, 2017

Member

oups :)

@RomainBrault

This comment has been minimized.

Show comment
Hide comment
@RomainBrault

RomainBrault May 12, 2017

Contributor

Done.

Contributor

RomainBrault commented May 12, 2017

Done.

cythonize.dat
@@ -0,0 +1,41 @@
+sklearn/neighbors/dist_metrics.pyx 836b25ae304f8ccbcb2e93f8e9fe990c20b55fd3 62b33194eb7aac6260b247fd66e0a8e51b744c3d 5b41e65f5c233a93deb8862126e8c2edc8c35742

This comment has been minimized.

@agramfort

agramfort May 12, 2017

Member

this file should not be there

@agramfort

agramfort May 12, 2017

Member

this file should not be there

@jnothman

looks good otherwise

@@ -103,10 +109,12 @@ def test_skewed_chi2_sampler():
kernel_approx = np.dot(X_trans, Y_trans.T)
assert_array_almost_equal(kernel, kernel_approx, 1)
+ assert_no_nan(kernel, 'NaNs found in the Gram matrix')

This comment has been minimized.

@jnothman

jnothman May 17, 2017

Member

Is assert_true(np.isfinite(kernel)) sufficient?

@jnothman

jnothman May 17, 2017

Member

Is assert_true(np.isfinite(kernel)) sufficient?

RomainBrault
relax skewness assumption
relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

last corrections

whats_new

merge

complying whats_new

removed unnecessary _assert_X

increased coverage

relax skewness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

last corrections

whats_new

merge

complying whats_new

removed unnecessary _assert_X

increased coverage

remove cythonize.dat and merge

simplify tests

simplify tests

@amueller amueller merged commit 15b7d47 into scikit-learn:master May 18, 2017

5 checks passed

ci/circleci Your tests passed on CircleCI!
Details
codecov/patch 100% of diff hit (target 95.53%)
Details
codecov/project 95.53% (+<.01%) compared to 33f50ad
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@amueller

This comment has been minimized.

Show comment
Hide comment
@amueller

amueller May 18, 2017

Member

thanks :)

Member

amueller commented May 18, 2017

thanks :)

Sundrique added a commit to Sundrique/scikit-learn that referenced this pull request Jun 14, 2017

relax skewness assumption (#7573)
relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

last corrections

whats_new

merge

complying whats_new

removed unnecessary _assert_X

increased coverage

relax skewness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

last corrections

whats_new

merge

complying whats_new

removed unnecessary _assert_X

increased coverage

remove cythonize.dat and merge

simplify tests

simplify tests

dmohns added a commit to dmohns/scikit-learn that referenced this pull request Aug 7, 2017

relax skewness assumption (#7573)
relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

last corrections

whats_new

merge

complying whats_new

removed unnecessary _assert_X

increased coverage

relax skewness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

last corrections

whats_new

merge

complying whats_new

removed unnecessary _assert_X

increased coverage

remove cythonize.dat and merge

simplify tests

simplify tests

dmohns added a commit to dmohns/scikit-learn that referenced this pull request Aug 7, 2017

relax skewness assumption (#7573)
relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

last corrections

whats_new

merge

complying whats_new

removed unnecessary _assert_X

increased coverage

relax skewness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

last corrections

whats_new

merge

complying whats_new

removed unnecessary _assert_X

increased coverage

remove cythonize.dat and merge

simplify tests

simplify tests

NelleV added a commit to NelleV/scikit-learn that referenced this pull request Aug 11, 2017

relax skewness assumption (#7573)
relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

last corrections

whats_new

merge

complying whats_new

removed unnecessary _assert_X

increased coverage

relax skewness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

last corrections

whats_new

merge

complying whats_new

removed unnecessary _assert_X

increased coverage

remove cythonize.dat and merge

simplify tests

simplify tests

paulha added a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017

relax skewness assumption (#7573)
relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

last corrections

whats_new

merge

complying whats_new

removed unnecessary _assert_X

increased coverage

relax skewness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

last corrections

whats_new

merge

complying whats_new

removed unnecessary _assert_X

increased coverage

remove cythonize.dat and merge

simplify tests

simplify tests

AishwaryaRK added a commit to AishwaryaRK/scikit-learn that referenced this pull request Aug 29, 2017

relax skewness assumption (#7573)
relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

last corrections

whats_new

merge

complying whats_new

removed unnecessary _assert_X

increased coverage

relax skewness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

last corrections

whats_new

merge

complying whats_new

removed unnecessary _assert_X

increased coverage

remove cythonize.dat and merge

simplify tests

simplify tests

maskani-moh added a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017

relax skewness assumption (#7573)
relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

last corrections

whats_new

merge

complying whats_new

removed unnecessary _assert_X

increased coverage

relax skewness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

last corrections

whats_new

merge

complying whats_new

removed unnecessary _assert_X

increased coverage

remove cythonize.dat and merge

simplify tests

simplify tests

jwjohnson314 pushed a commit to jwjohnson314/scikit-learn that referenced this pull request Dec 18, 2017

Romain Brault Jeremiah Johnson
relax skewness assumption (#7573)
relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

last corrections

whats_new

merge

complying whats_new

removed unnecessary _assert_X

increased coverage

relax skewness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

relax skewedness assumption

last corrections

whats_new

merge

complying whats_new

removed unnecessary _assert_X

increased coverage

remove cythonize.dat and merge

simplify tests

simplify tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment