
lasso_cd with multiprocessing fails on large dataset #4772

Closed · arthurmensch opened this issue May 27, 2015 · 17 comments

@arthurmensch (Contributor)

From sklearn.decomposition.dict_learning, calling sparse_encode with algorithm 'lasso_cd' triggers an error when the input X is large. I was not able to reproduce this bug using simple examples, as it seems to stem from a memory-mapping error that only occurs when X is large.

Call line:

            maps = sparse_encode(data_flat.T, U, alpha=self.alpha, algorithm='lasso_'+self.method, n_jobs=3)

Output:

Traceback (most recent call last):
  File "/media/data/Documents/Work/parietal-python/examples/plot_dict_learning.py", line 64, in <module>
    model.fit(data)
  File "/media/data/Documents/Work/parietal-python/parietal/learn/decompositions/dictionary_model.py", line 180, in fit
    maps = sparse_encode(data_flat.T, U, alpha=self.alpha, algorithm='lasso_'+self.method, n_jobs=self.n_jobs)
  File "/home/arthur/.virtualenvs/parietal-python/local/lib/python2.7/site-packages/sklearn/decomposition/dict_learning.py", line 256, in sparse_encode
    for this_slice in slices)
  File "/home/arthur/.virtualenvs/parietal-python/local/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py", line 666, in __call__
    self.retrieve()
  File "/home/arthur/.virtualenvs/parietal-python/local/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py", line 549, in retrieve
    raise exception_type(report)
sklearn.externals.joblib.my_exceptions.JoblibValueError: JoblibValueError
___________________________________________________________________________
Multiprocessing exception:
    ...........................................................................
/media/data/Documents/Work/parietal-python/examples/plot_dict_learning.py in <module>()
     59 #                                 reduction='encode', reduction_factor=1.2*np.sqrt(N_SUBJECTS))
     60 models = [dict_model_none]
     61 
     62 # Fit models
     63 for model in models:
---> 64     model.fit(data)
     65     # maps_ = model.maps_
     66     # n_maps_, _ = maps_.shape
     67     # # if n_maps_ != N_SUBJECT:
     68     # #     maps_ = np.concatenate((maps_, np.zeros((n_maps-n_maps_, n_voxels))), axis=0)

...........................................................................
/media/data/Documents/Work/parietal-python/parietal/learn/decompositions/dictionary_model.py in fit(self=DictModel(alpha=3, maps_only=False, mem=Memory(c...,
     reduction_factor=1, tol=0.0001, verbose=2), data=array([[[-0.78251106, -0.40772033, -0.03013193, ...0.60377657,
         -0.58464795, -0.88078177]]]), data_flat=array([[-0.6502697 , -0.28052581,  0.03125997, .... -0.60377831,
        -0.58465951, -0.88077778]]), V_init=array([[-0.63966742,  0.        ,  0.        , ....  0.        ,
         0.        ,  0.        ]]), pre_fit=True)
    175                 data_flat = pca.components_.dot(data_flat)
    176                 self.reduction_time_ += time.time() - t0
    177             if self.verbose >= 1:
    178                 print 'Learning code :'
    179             t0 = time.time()
--> 180             maps = sparse_encode(data_flat.T, U, alpha=self.alpha, algorithm='lasso_'+self.method, n_jobs=self.n_jobs)
        maps = undefined
        data_flat.T = array([[-0.6502697 ,  1.26799078,  0.40838867, .... -1.05512315,
        -1.60722143, -0.88077778]])
        U = array([[-0.00236816,  0.02024865, -0.00139323, .... -0.0444645 ,
        -0.03973441, -0.01998085]])
        self.alpha = 3
        self.method = 'cd'
        self.n_jobs = 3
    181             self.encoding_time_ += time.time() - t0
    182             if self.verbose >= 1:
    183                 print 'Done'
    184         else:

...........................................................................
/home/arthur/.virtualenvs/parietal-python/local/lib/python2.7/site-packages/sklearn/decomposition/dict_learning.py in sparse_encode(X=array([[-0.6502697 ,  1.26799078,  0.40838867, .... -1.05512315,
        -1.60722143, -0.88077778]]), dictionary=array([[-0.00236816,  0.02024865, -0.00139323, .... -0.0444645 ,
        -0.03973441, -0.01998085]]), gram=array([[ 1.        , -0.09361734, -0.1616115 , ....  0.59338606,
         0.04236759,  1.        ]]), cov=array([[ -3.62598394,  -2.31830304,  -2.34356363...1.67244792,
         -0.50818722,  -5.37480025]]), algorithm='lasso_cd', n_nonzero_coefs=None, alpha=3, copy_cov=False, init=None, max_iter=1000, n_jobs=3)
    251         delayed(_sparse_encode)(
    252             X[this_slice], dictionary, gram, cov[:, this_slice], algorithm,
    253             regularization=regularization, copy_cov=copy_cov,
    254             init=init[this_slice] if init is not None else None,
    255             max_iter=max_iter)
--> 256         for this_slice in slices)
        this_slice = undefined
        slices = [slice(0, 22124, None), slice(22124, 44248, None), slice(44248, 66371, None)]
    257     for this_slice, this_view in zip(slices, code_views):
    258         code[this_slice] = this_view
    259     return code
    260 

...........................................................................
/home/arthur/.virtualenvs/parietal-python/local/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py in __call__(self=Parallel(n_jobs=3), iterable=<generator object <genexpr>>)
    661             if pre_dispatch == "all" or n_jobs == 1:
    662                 # The iterable was consumed all at once by the above for loop.
    663                 # No need to wait for async callbacks to trigger to
    664                 # consumption.
    665                 self._iterating = False
--> 666             self.retrieve()
        self.retrieve = <bound method Parallel.retrieve of Parallel(n_jobs=3)>
    667             # Make sure that we get a last message telling us we are done
    668             elapsed_time = time.time() - self._start_time
    669             self._print('Done %3i out of %3i | elapsed: %s finished',
    670                         (len(self._output),

    ---------------------------------------------------------------------------
    Sub-process traceback:
    ---------------------------------------------------------------------------
    ValueError                                         Wed May 27 10:03:03 2015
PID: 14053Python 2.7.6: /home/arthur/.virtualenvs/parietal-python/bin/python
...........................................................................
/home/arthur/.virtualenvs/parietal-python/local/lib/python2.7/site-packages/sklearn/decomposition/dict_learning.pyc in _sparse_encode(X=memmap([[-0.6502697 ,  1.26799078,  0.40838867, ...  2.16620431,
         0.26053854, -0.7225649 ]]), dictionary=array([[-0.00236816,  0.02024865, -0.00139323, .... -0.0444645 ,
        -0.03973441, -0.01998085]]), gram=array([[ 1.        , -0.09361734, -0.1616115 , ....  0.59338606,
         0.04236759,  1.        ]]), cov=array([[ -3.62598394,  -2.31830304,  -2.34356363...5.80205829,
          4.13207823,  -3.81881073]]), algorithm='lasso_cd', regularization=3, copy_cov=False, init=None, max_iter=1000)
    109         alpha = float(regularization) / n_features  # account for scaling
    110         clf = Lasso(alpha=alpha, fit_intercept=False, precompute=gram,
    111                     max_iter=max_iter, warm_start=True)
    112         clf.coef_ = init
    113         # Copying X to avoid buffer source-array read only error
--> 114         clf.fit(dictionary.T, X.T)
    115         new_code = clf.coef_
    116 
    117     elif algorithm == 'lars':
    118         try:

...........................................................................
/home/arthur/.virtualenvs/parietal-python/local/lib/python2.7/site-packages/sklearn/linear_model/coordinate_descent.pyc in fit(self=Lasso(alpha=0.00483870967742, copy_X=True, fit_i... selection='cyclic', tol=0.0001, warm_start=True), X=array([[-0.00236816,  0.06091933,  0.0514941 , .... -0.0432963 ,
        -0.02150148, -0.01998085]]), y=array([[-0.6502697 , -0.28052581,  0.03125997, .... -0.25777117,
        -0.40672367, -0.7225649 ]]))
    668                           fit_intercept=False, normalize=False, copy_X=True,
    669                           verbose=False, tol=self.tol, positive=self.positive,
    670                           X_mean=X_mean, X_std=X_std, return_n_iter=True,
    671                           coef_init=coef_[k], max_iter=self.max_iter,
    672                           random_state=self.random_state,
--> 673                           selection=self.selection)
    674             coef_[k] = this_coef[:, 0]
    675             dual_gaps_[k] = this_dual_gap[0]
    676             self.n_iter_.append(this_iter[0])
    677 

...........................................................................
/home/arthur/.virtualenvs/parietal-python/local/lib/python2.7/site-packages/sklearn/linear_model/coordinate_descent.pyc in enet_path(X=array([[-0.00236816,  0.06091933,  0.0514941 , .... -0.0432963 ,
        -0.02150148, -0.01998085]]), y=array([ -6.50269696e-01,   1.26799078e+00,   4.0...e+00,
         1.35652531e+00,   4.22439627e-01]), l1_ratio=1.0, eps=None, n_alphas=1, alphas=array([ 0.00483871]), precompute=array([[ 1.        , -0.09361734, -0.1616115 , ....  0.59338606,
         0.04236759,  1.        ]]), Xy=array([ -3.62598394,  -6.28233266,   6.36262432,... -7.18900381,
        -5.29046887,  -7.51199686]), copy_X=True, coef_init=array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  ...0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]), verbose=False, return_n_iter=True, positive=False, **params={'X_mean': array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  ...0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]), 'X_std': array([ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  ...1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.]), 'fit_intercept': False, 'max_iter': 1000, 'normalize': False, 'random_state': None, 'selection': 'cyclic', 'tol': 0.0001})
    423                 coef_, l1_reg, l2_reg, X, y, max_iter, tol, rng, random)
    424         elif isinstance(precompute, np.ndarray):
    425             precompute = check_array(precompute, 'csc', dtype=np.float64, order='F')
    426             model = cd_fast.enet_coordinate_descent_gram(
    427                 coef_, l1_reg, l2_reg, precompute, Xy, y, max_iter,
--> 428                 tol, rng, random, positive)
    429         elif precompute is False:
    430             model = cd_fast.enet_coordinate_descent(
    431                 coef_, l1_reg, l2_reg, X, y, max_iter, tol, rng, random,
    432                 positive)

...........................................................................
/home/arthur/.virtualenvs/parietal-python/local/lib/python2.7/site-packages/sklearn/linear_model/cd_fast.so in sklearn.linear_model.cd_fast.enet_coordinate_descent_gram (sklearn/linear_model/cd_fast.c:5584)()
    485 
    486 
    487 
    488 
    489 
--> 490 
    491 
    492 
    493 
    494 

...........................................................................
/home/arthur/.virtualenvs/parietal-python/local/lib/python2.7/site-packages/sklearn/linear_model/cd_fast.so in View.MemoryView.memoryview_cwrapper (sklearn/linear_model/cd_fast.c:15515)()
    609 
    610 
    611 
    612 
    613 
--> 614 
    615 
    616 
    617 
    618 

...........................................................................
/home/arthur/.virtualenvs/parietal-python/local/lib/python2.7/site-packages/sklearn/linear_model/cd_fast.so in View.MemoryView.memoryview.__cinit__ (sklearn/linear_model/cd_fast.c:12061)()
    316 
    317 
    318 
    319 
    320 
--> 321 
    322 
    323 
    324 
    325 

ValueError: buffer source array is read-only
___________________________________________________________________________
@agramfort (Member) commented May 27, 2015 via email

@arthurmensch (Contributor, Author)

No, only with n_jobs > 1. A simple (though ugly) fix is to copy X in _sparse_encode (where X is a slice of the design matrix); see pull request #4773.
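For reference, a minimal sketch of what such a copy workaround could look like; `_sparse_encode_cd` is a made-up stand-in for the 'lasso_cd' branch of _sparse_encode, and the actual change is the one in #4773:

```python
import numpy as np
from sklearn.linear_model import Lasso

def _sparse_encode_cd(X, dictionary, gram, alpha, max_iter=1000, init=None):
    # Copy X so that the Cython coordinate-descent code never receives
    # the read-only memmap slice that joblib hands to worker processes.
    X = np.array(X, copy=True)
    clf = Lasso(alpha=alpha, fit_intercept=False, precompute=gram,
                max_iter=max_iter, warm_start=True)
    clf.coef_ = init  # warm start from the previous code, as in _sparse_encode
    clf.fit(dictionary.T, X.T)
    return clf.coef_
```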

@arthurmensch (Contributor, Author)

The bug can be reproduced using this example:

https://gist.github.com/arthurmensch/f099ddf2eb6e1afaa68a
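The large-input requirement is consistent with joblib's automatic memmapping: above its max_nbytes threshold, Parallel hands read-only np.memmap views of the inputs to the workers. A standalone sketch of that behavior (illustrative, assuming a joblib version with auto-memmapping):

```python
import numpy as np
from joblib import Parallel, delayed

def describe(a):
    # Report what the worker process actually receives.
    return type(a).__name__, a.flags.writeable

X = np.random.randn(1000000)  # 8 MB of float64, well above max_nbytes='1M'
print(Parallel(n_jobs=2, max_nbytes='1M')(
    delayed(describe)(X) for _ in range(2)))
# Expected on affected setups: [('memmap', False), ('memmap', False)]
```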

@arthurmensch (Contributor, Author)

I tried to design a test that would currently fail: using synthetic data of the same size as the data in the previous example, I am not able to reproduce the error.

@ogrisel (Member) commented May 27, 2015

This is a limitation of Cython typed memory views, which do not work with read-only numpy arrays such as np.memmap instances with mode='r'.

I fixed a similar issue in pandas with a workaround here: pandas-dev/pandas#10070
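The read-only condition can also be exercised directly, without joblib or the full dict_learning pipeline, by flagging y as non-writeable before fitting. A sketch that should raise the same ValueError on affected versions (the array names here are illustrative):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.RandomState(0)
D = rng.randn(100, 30)                      # plays the role of dictionary.T
Y = np.asfortranarray(rng.randn(100, 5))    # plays the role of X.T
gram = np.dot(D.T, D)

# Mimic the read-only memmap slice that joblib passes to workers.
Y.setflags(write=False)

clf = Lasso(alpha=0.1, fit_intercept=False, precompute=gram, max_iter=1000)
clf.fit(D, Y)
# On affected versions: ValueError: buffer source array is read-only
```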

@agramfort (Member) commented May 27, 2015 via email

@ogrisel (Member) commented May 27, 2015

We have to investigate which array is in read-only mode; from the traceback this is ambiguous. If it is Q and the data arrays, we don't need to modify them and we can keep using the numpy memmap in read-only mode without a copy. We could investigate whether there is a performance penalty to using the ndarray type instead of typed memory views, such as:

diff --git a/sklearn/linear_model/cd_fast.pyx b/sklearn/linear_model/cd_fast.pyx
index ad2fd0d..9c85c76 100644
--- a/sklearn/linear_model/cd_fast.pyx
+++ b/sklearn/linear_model/cd_fast.pyx
@@ -487,7 +487,9 @@ def sparse_enet_coordinate_descent(double[:] w,
 @cython.wraparound(False)
 @cython.cdivision(True)
 def enet_coordinate_descent_gram(double[:] w, double alpha, double beta,
-                                 double[:, :] Q, double[:] q, double[:] y,
+                                 np.ndarray[double, ndim=2] Q,
+                                 np.ndarray[double, ndim=1] q,
+                                 np.ndarray[double, ndim=1] y,
                                  int max_iter, double tol, object rng,
                                  bint random=0, bint positive=0):
     """Cython version of the coordinate descent algorithm

@ogrisel (Member) commented May 27, 2015

I had a quick chat IRL with @arthurmensch. He is going to do a benchmark and, if successful, will write a non-regression test with non-writeable inputs plus the fix itself in a PR.

@arthurmensch (Contributor, Author)

I adapted bench_lasso to benchmark cd_fast using double[:] and ndarray in cd_fast.pyx. Here are the results: it looks like we do not lose performance (surprisingly enough, we even seem to gain for large numbers of samples).

[benchmark plot: benchmark_ctype]

Source code (to be run on both versions of cd_fast.pyx):

https://gist.github.com/arthurmensch/0aa4f7dce4f3c8cef29d
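The gist above is the actual script; roughly, such a benchmark has the following shape (illustrative sizes only, to be run once per build of cd_fast.pyx and compared):

```python
import time
import numpy as np
from sklearn.linear_model import Lasso

# precompute=True forces the Gram variant, i.e. the
# enet_coordinate_descent_gram code path touched by the diff above.
for n_samples in (1000, 10000, 100000):
    rng = np.random.RandomState(0)
    X = rng.randn(n_samples, 200)
    y = rng.randn(n_samples)
    t0 = time.time()
    Lasso(alpha=0.01, precompute=True, max_iter=1000).fit(X, y)
    print(n_samples, time.time() - t0)
```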

@ogrisel (Member) commented May 27, 2015

Thanks @arthurmensch, this looks good. Please open a PR with the fix and a non-regression test on non-writeable arrays.

@amueller (Member)

Did anyone try to submit a bugfix to cython btw?

@amueller (Member)

I am very surprised by the benchmark. Where is the time spent?

@amueller (Member)

Never mind, the typed memory views are unpacked anyhow.

@ogrisel (Member) commented May 28, 2015

> Did anyone try to submit a bugfix to cython btw?

That would be nice but it seems quite complicated:

https://mail.python.org/pipermail/cython-devel/2015-February/004316.html

@lesteve (Member) commented Dec 5, 2016

Closed by #4775.

@ShichengChen

scikit-learn==0.19 works well

@lesteve (Member) commented Sep 14, 2017

@ShichengChen there is no need to comment on closed issues; find a tutorial about GitHub if you are not sure how this works.
