Unstable test_common.test_transformers under Windows with Python 32-bit for some estimators #3255

Closed
kastnerkyle opened this Issue · 6 comments

4 participants

@kastnerkyle
Owner

I am seeing failing tests with both Python 2.7 and Python 3.4 on Windows, using:

scipy 0.14
numpy 1.8.1
all 32-bit

CCA, LLE, and KernelPCA seem to be the primary culprits. Here is a sample traceback:

======================================================================
FAIL: sklearn.tests.test_common.test_transformers('KernelPCA', <class 'sklearn.decomposition.kernel_pca.KernelPCA
ay([[ 2.51189522,  2.6430893 ,  2.54847718],
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Python27\lib\site-packages\nose\case.py", line 197, in runTest
    self.test(*self.arg)
  File "C:\Python27\lib\site-packages\sklearn\tests\test_common.py", line 269, in check_transformer
    % Transformer)
  File "C:\Python27\lib\site-packages\numpy\testing\utils.py", line 811, in assert_array_almost_equal
    header=('Arrays are not almost equal to %d decimals' % decimal))
  File "C:\Python27\lib\site-packages\numpy\testing\utils.py", line 599, in assert_array_compare
    raise AssertionError(msg)
AssertionError:
Arrays are not almost equal to 2 decimals
consecutive fit_transform outcomes not consistent in <class 'sklearn.decomposition.kernel_pca.KernelPCA'>
(shapes (30, 15), (30, 14) mismatch)
 x: array([[  1.87664949e+00,   8.57398986e-02,   4.20312700e-02,
          3.31837404e-08,   0.00000000e+00,   0.00000000e+00,
          0.00000000e+00,   2.24505178e-08,   0.00000000e+00,...
 y: array([[  1.87664949e+00,   8.57398986e-02,   4.20312700e-02,
          3.31844220e-08,   0.00000000e+00,   0.00000000e+00,
          0.00000000e+00,   2.24490761e-08,   0.00000000e+00,...

======================================================================
FAIL: sklearn.tests.test_common.test_transformers('LocallyLinearEmbedding', <class 'sklearn.manifold.locally_line
llyLinearEmbedding'>, array([[ 2.51189522,  2.6430893 ,  2.54847718],
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Python27\lib\site-packages\nose\case.py", line 197, in runTest
    self.test(*self.arg)
  File "C:\Python27\lib\site-packages\sklearn\tests\test_common.py", line 269, in check_transformer
    % Transformer)
  File "C:\Python27\lib\site-packages\numpy\testing\utils.py", line 811, in assert_array_almost_equal
    header=('Arrays are not almost equal to %d decimals' % decimal))
  File "C:\Python27\lib\site-packages\numpy\testing\utils.py", line 644, in assert_array_compare
    raise AssertionError(msg)
AssertionError:
Arrays are not almost equal to 2 decimals
consecutive fit_transform outcomes not consistent in <class 'sklearn.manifold.locally_linear.LocallyLinearEmbeddi
(mismatch 25.0%)
 x: array([[ -2.27507872e-01,   2.98382398e-01],
       [  1.22093549e-01,  -1.92026395e-11],
       [  1.22093549e-01,  -2.04742612e-11],...
 y: array([[ -2.35941411e-01,   2.98382398e-01],
       [  1.04872863e-01,  -1.23895338e-12],
       [  1.04872863e-01,   1.95896077e-12],...

----------------------------------------------------------------------
Ran 3257 tests in 279.742s

Names of estimators that cause the failure:

  • KernelPCA
  • LocallyLinearEmbedding
  • CCA
@ogrisel
Owner

Note that those failures are random. I saw them with both the numpy + ATLAS package from numpy.org and the numpy + MKL package from Christoph Gohlke.

I tried changing the data of the test_common.test_transformers test to use a well-conditioned input matrix, and it does not seem to stabilize the test.
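For readers following along, conditioning of an input matrix can be checked with numpy's 2-norm condition number. This is only a rough sketch: the random data, the shape, and the variable names are assumptions made for illustration, not the data used in test_common.

```python
import numpy as np

rng = np.random.RandomState(0)

# Hypothetical input: 30 samples, 3 features (shape chosen for illustration only).
X = rng.uniform(size=(30, 3))

# Append a near-duplicate of the first column to get an ill-conditioned matrix.
near_duplicate = X[:, :1] + 1e-12 * rng.uniform(size=(30, 1))
X_bad = np.hstack([X, near_duplicate])

# Ratio of largest to smallest singular value; large values mean ill-conditioned.
cond_good = np.linalg.cond(X)
cond_bad = np.linalg.cond(X_bad)

print("well conditioned: %.3g" % cond_good)
print("ill conditioned:  %.3g" % cond_bad)
```

A small condition number rules out the input itself as the source of instability, which is consistent with the observation that changing the data did not help.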

I am not sure whether this is a numpy bug under Windows or a real stability bug in those algorithms that is only triggered under Windows for some reason.

@ogrisel ogrisel added the Bug label
@ogrisel ogrisel added this to the 0.15 milestone
@kastnerkyle
Owner

Depending on luck, a different subset of (it appears) 3 tests fails on each run. Most interestingly, the transformed shape differs between consecutive runs! I have been investigating the KernelPCA case so far.

The lines that make me most suspicious here are:

if self.remove_zero_eig or self.n_components is None:
    self.alphas_ = self.alphas_[:, self.lambdas_ > 0]
    self.lambdas_ = self.lambdas_[self.lambdas_ > 0]

If the values returned for self.lambdas_ (the eigenvalues) are on the edge of numerical stability and fluctuate between slightly positive and slightly negative, the > 0 mask would drop a different number of columns on each run and change the shape of the returned array (I think). The eigenvalues come from either ARPACK's eigsh or linalg.eigh; unless the solver is set explicitly, the choice depends on a heuristic: if K.shape[0] > 200 and n_components < 10:.
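To illustrate the suspicion, here is a minimal sketch of how a strict > 0 mask can return a different number of columns when the near-zero eigenvalues carry round-off noise of either sign. The spectrum, the noise patterns, and both helper functions are hypothetical; the tolerance-based variant is just one possible mitigation, not what KernelPCA does.

```python
import numpy as np


def keep_positive(lambdas, alphas):
    """Mimic the filtering quoted above: keep strictly positive eigenvalues."""
    mask = lambdas > 0
    return lambdas[mask], alphas[:, mask]


def keep_above_tolerance(lambdas, alphas, rel_tol=1e-12):
    """Hypothetical alternative: drop eigenvalues below a relative tolerance."""
    mask = lambdas > rel_tol * lambdas.max()
    return lambdas[mask], alphas[:, mask]


n_samples = 30
alphas = np.eye(n_samples)  # stand-in eigenvectors, one column per eigenvalue
idx = np.arange(n_samples - 3)

# Three clearly positive eigenvalues, then "numerical zeros" whose sign
# differs between two simulated runs (assumed round-off patterns).
run_a = np.concatenate([[1.8, 0.08, 0.04], 1e-17 * np.where(idx % 2 == 0, 1.0, -1.0)])
run_b = np.concatenate([[1.8, 0.08, 0.04], 1e-17 * np.where(idx % 3 == 0, 1.0, -1.0)])

shape_a = keep_positive(run_a, alphas)[1].shape  # (30, 17)
shape_b = keep_positive(run_b, alphas)[1].shape  # (30, 12)
print(shape_a, shape_b)

# With a relative tolerance, both runs keep the same 3 columns.
tol_a = keep_above_tolerance(run_a, alphas)[1].shape  # (30, 3)
tol_b = keep_above_tolerance(run_b, alphas)[1].shape  # (30, 3)
print(tol_a, tol_b)
```

The strict mask reproduces the kind of shape mismatch seen in the traceback, while the relative tolerance is insensitive to the sign of the noise.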

Also, I see fit_transform for KernelPCA has a **params argument ... should this be changed?

The error in question:

AssertionError:
Arrays are not almost equal to 2 decimals
consecutive fit_transform outcomes not consistent in <class 'sklearn.decomposition.kernel_pca.KernelPCA'>
(shapes (30, 17), (30, 13) mismatch)
 x: array([[  1.87664949e+00,   8.57398986e-02,   4.20312700e-02,
          3.13413026e-08,   0.00000000e+00,   0.00000000e+00,
          0.00000000e+00,   0.00000000e+00,   0.00000000e+00,...
 y: array([[  1.87664949e+00,   8.57398986e-02,   4.20312700e-02,
          3.26217481e-08,   0.00000000e+00,   0.00000000e+00,
          0.00000000e+00,   0.00000000e+00,   0.00000000e+00,...
@GaelVaroquaux
@ogrisel
Owner

I investigated further and can confirm that this only happens on 32-bit Python. All tests pass on 64-bit Python. I will work on a PR to skip those tests when they run on 32-bit Python.
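A skip along those lines could look roughly like the sketch below. The helper and decorator names are hypothetical, and this is not necessarily how the eventual PR does it; it only relies on the standard library (struct, functools, unittest.SkipTest).

```python
import functools
import struct
from unittest import SkipTest


def is_32bit_python():
    """Return True when the running interpreter uses 32-bit pointers."""
    return struct.calcsize("P") * 8 == 32


def skip_if_32bit(func):
    """Hypothetical decorator: skip the wrapped test on 32-bit interpreters."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if is_32bit_python():
            raise SkipTest("Known to be unstable on 32-bit Python")
        return func(*args, **kwargs)
    return wrapper


@skip_if_32bit
def example_check():
    # Placeholder body standing in for the unstable transformer check.
    return "ran"
```

Checking the pointer size keeps the skip tied to the interpreter bitness rather than to the operating system, so 64-bit Python on Windows would still run the tests.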

@ogrisel ogrisel changed the title from Failing tests in Windows to Unstable test_common.test_transformers under Windows for some estimators
@ogrisel ogrisel changed the title from Unstable test_common.test_transformers under Windows for some estimators to Unstable test_common.test_transformers under Windows with Python 32-bit for some estimators
@amueller amueller modified the milestone: 0.15.1, 0.15
@amueller
Owner

@ogrisel should we close this one?

@ogrisel
Owner

Yes. Closing it.

@ogrisel ogrisel closed this