ValueError: assignment destination is read-only, when paralleling with n_jobs > 1 #5956

Closed
coldfog opened this Issue Dec 4, 2015 · 22 comments

Comments

coldfog commented Dec 4, 2015

When I run SparseCoder with n_jobs > 1, it sometimes raises ValueError: assignment destination is read-only. The code is shown as follows:

from sklearn.decomposition import SparseCoder
import numpy as np

data_dims = 4103
init_dict = np.random.rand(500, 64)
data = np.random.rand(data_dims, 64)
c = SparseCoder(init_dict, transform_algorithm='omp', n_jobs=8).fit_transform(data)

The larger data_dims is, the higher the chance of failure. When data_dims is small (below 2000, as I verified), everything works fine. Once data_dims exceeds 2000, there is a chance of hitting the exception; when data_dims is above 5000, it is raised every time.
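(Editor's note: a back-of-envelope check of why the cutoff sits near 2000 rows, assuming float64 data and joblib's default memmapping threshold of max_nbytes='1M'; this anticipates the diagnosis given later in this thread and is not part of the original report.)

# Rows at which an array of shape (data_dims, 64) crosses 1M (2**20 bytes):
threshold_rows = 2**20 // (64 * 8)
print(threshold_rows)  # 2048, which matches the ~2000 cutoff observed above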

My version info:

OS: OS X 10.11.1
python: Python 2.7.10 |Anaconda 2.2.0
numpy: 1.10.1
sklearn: 0.17

The full error information is shown as follows:

---------------------------------------------------------------------------
JoblibValueError                          Traceback (most recent call last)
<ipython-input-24-d745e5de1eae> in <module>()
----> 1 learned_dict = dict_learn(init_dict, patches)

<ipython-input-23-50e8dab30ec4> in dict_learn(dictionary, data)
      6         # Sparse coding stage
      7         coder = SparseCoder(dictionary, transform_algorithm='omp', n_jobs=8, transform_n_nonzero_coefs=3)
----> 8         code = coder.fit_transform(data)
      9         #print iteration, ' ', linalg.norm(data - np.dot(code, dictionary)), ' +',
     10         # update stage

/Users/fengyuyao/anaconda/lib/python2.7/site-packages/sklearn/base.pyc in fit_transform(self, X, y, **fit_params)
    453         if y is None:
    454             # fit method of arity 1 (unsupervised transformation)
--> 455             return self.fit(X, **fit_params).transform(X)
    456         else:
    457             # fit method of arity 2 (supervised transformation)

/Users/fengyuyao/anaconda/lib/python2.7/site-packages/sklearn/decomposition/dict_learning.pyc in transform(self, X, y)
    816             X, self.components_, algorithm=self.transform_algorithm,
    817             n_nonzero_coefs=self.transform_n_nonzero_coefs,
--> 818             alpha=self.transform_alpha, n_jobs=self.n_jobs)
    819 
    820         if self.split_sign:

/Users/fengyuyao/anaconda/lib/python2.7/site-packages/sklearn/decomposition/dict_learning.pyc in sparse_encode(X, dictionary, gram, cov, algorithm, n_nonzero_coefs, alpha, copy_cov, init, max_iter, n_jobs, check_input, verbose)
    298             max_iter=max_iter,
    299             check_input=False)
--> 300         for this_slice in slices)
    301     for this_slice, this_view in zip(slices, code_views):
    302         code[this_slice] = this_view

/Users/fengyuyao/anaconda/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in __call__(self, iterable)
    810                 # consumption.
    811                 self._iterating = False
--> 812             self.retrieve()
    813             # Make sure that we get a last message telling us we are done
    814             elapsed_time = time.time() - self._start_time

/Users/fengyuyao/anaconda/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in retrieve(self)
    760                         # a working pool as they expect.
    761                         self._initialize_pool()
--> 762                 raise exception
    763 
    764     def __call__(self, iterable):

JoblibValueError: JoblibValueError
___________________________________________________________________________
Multiprocessing exception:
...........................................................................
/Users/fengyuyao/anaconda/lib/python2.7/runpy.py in _run_module_as_main(mod_name='IPython.kernel.__main__', alter_argv=1)
    157     pkg_name = mod_name.rpartition('.')[0]
    158     main_globals = sys.modules["__main__"].__dict__
    159     if alter_argv:
    160         sys.argv[0] = fname
    161     return _run_code(code, main_globals, None,
--> 162                      "__main__", fname, loader, pkg_name)
        fname = '/Users/fengyuyao/anaconda/lib/python2.7/site-packages/IPython/kernel/__main__.py'
        loader = <pkgutil.ImpLoader instance>
        pkg_name = 'IPython.kernel'
    163 
    164 def run_module(mod_name, init_globals=None,
    165                run_name=None, alter_sys=False):
    166     """Execute a module's code without importing it

...........................................................................
/Users/fengyuyao/anaconda/lib/python2.7/runpy.py in _run_code(code=<code object <module> at 0x10596bdb0, file "/Use...ite-packages/IPython/kernel/__main__.py", line 1>, run_globals={'__builtins__': <module '__builtin__' (built-in)>, '__doc__': None, '__file__': '/Users/fengyuyao/anaconda/lib/python2.7/site-packages/IPython/kernel/__main__.py', '__loader__': <pkgutil.ImpLoader instance>, '__name__': '__main__', '__package__': 'IPython.kernel', 'app': <module 'IPython.kernel.zmq.kernelapp' from '/Us.../site-packages/IPython/kernel/zmq/kernelapp.pyc'>}, init_globals=None, mod_name='__main__', mod_fname='/Users/fengyuyao/anaconda/lib/python2.7/site-packages/IPython/kernel/__main__.py', mod_loader=<pkgutil.ImpLoader instance>, pkg_name='IPython.kernel')
     67         run_globals.update(init_globals)
     68     run_globals.update(__name__ = mod_name,
     69                        __file__ = mod_fname,
     70                        __loader__ = mod_loader,
     71                        __package__ = pkg_name)
---> 72     exec code in run_globals
        code = <code object <module> at 0x10596bdb0, file "/Use...ite-packages/IPython/kernel/__main__.py", line 1>
        run_globals = {'__builtins__': <module '__builtin__' (built-in)>, '__doc__': None, '__file__': '/Users/fengyuyao/anaconda/lib/python2.7/site-packages/IPython/kernel/__main__.py', '__loader__': <pkgutil.ImpLoader instance>, '__name__': '__main__', '__package__': 'IPython.kernel', 'app': <module 'IPython.kernel.zmq.kernelapp' from '/Us.../site-packages/IPython/kernel/zmq/kernelapp.pyc'>}
     73     return run_globals
     74 
     75 def _run_module_code(code, init_globals=None,
     76                     mod_name=None, mod_fname=None,

...........................................................................
/Users/fengyuyao/anaconda/lib/python2.7/site-packages/IPython/kernel/__main__.py in <module>()
      1 
      2 
----> 3 
      4 if __name__ == '__main__':
      5     from IPython.kernel.zmq import kernelapp as app
      6     app.launch_new_instance()
      7 
      8 
      9 
     10 

...........................................................................
/Users/fengyuyao/anaconda/lib/python2.7/site-packages/IPython/config/application.py in launch_instance(cls=<class 'IPython.kernel.zmq.kernelapp.IPKernelApp'>, argv=None, **kwargs={})
    569         
    570         If a global instance already exists, this reinitializes and starts it
    571         """
    572         app = cls.instance(**kwargs)
    573         app.initialize(argv)
--> 574         app.start()
        app.start = <bound method IPKernelApp.start of <IPython.kernel.zmq.kernelapp.IPKernelApp object>>
    575 
    576 #-----------------------------------------------------------------------------
    577 # utility functions, for convenience
    578 #-----------------------------------------------------------------------------

...........................................................................
/Users/fengyuyao/anaconda/lib/python2.7/site-packages/IPython/kernel/zmq/kernelapp.py in start(self=<IPython.kernel.zmq.kernelapp.IPKernelApp object>)
    369     def start(self):
    370         if self.poller is not None:
    371             self.poller.start()
    372         self.kernel.start()
    373         try:
--> 374             ioloop.IOLoop.instance().start()
    375         except KeyboardInterrupt:
    376             pass
    377 
    378 launch_new_instance = IPKernelApp.launch_instance

...........................................................................
/Users/fengyuyao/anaconda/lib/python2.7/site-packages/zmq/eventloop/ioloop.py in start(self=<zmq.eventloop.ioloop.ZMQIOLoop object>)
    146             PollIOLoop.configure(ZMQIOLoop)
    147         return PollIOLoop.instance()
    148     
    149     def start(self):
    150         try:
--> 151             super(ZMQIOLoop, self).start()
        self.start = <bound method ZMQIOLoop.start of <zmq.eventloop.ioloop.ZMQIOLoop object>>
    152         except ZMQError as e:
    153             if e.errno == ETERM:
    154                 # quietly return on ETERM
    155                 pass

...........................................................................
/Users/fengyuyao/anaconda/lib/python2.7/site-packages/tornado/ioloop.py in start(self=<zmq.eventloop.ioloop.ZMQIOLoop object>)
    835                 self._events.update(event_pairs)
    836                 while self._events:
    837                     fd, events = self._events.popitem()
    838                     try:
    839                         fd_obj, handler_func = self._handlers[fd]
--> 840                         handler_func(fd_obj, events)
        handler_func = <function null_wrapper>
        fd_obj = <zmq.sugar.socket.Socket object>
        events = 1
    841                     except (OSError, IOError) as e:
    842                         if errno_from_exception(e) == errno.EPIPE:
    843                             # Happens when the client closes the connection
    844                             pass

...........................................................................
/Users/fengyuyao/anaconda/lib/python2.7/site-packages/tornado/stack_context.py in null_wrapper(*args=(<zmq.sugar.socket.Socket object>, 1), **kwargs={})
    270         # Fast path when there are no active contexts.
    271         def null_wrapper(*args, **kwargs):
    272             try:
    273                 current_state = _state.contexts
    274                 _state.contexts = cap_contexts[0]
--> 275                 return fn(*args, **kwargs)
        args = (<zmq.sugar.socket.Socket object>, 1)
        kwargs = {}
    276             finally:
    277                 _state.contexts = current_state
    278         null_wrapper._wrapped = True
    279         return null_wrapper

...........................................................................
/Users/fengyuyao/anaconda/lib/python2.7/site-packages/zmq/eventloop/zmqstream.py in _handle_events(self=<zmq.eventloop.zmqstream.ZMQStream object>, fd=<zmq.sugar.socket.Socket object>, events=1)
    428             # dispatch events:
    429             if events & IOLoop.ERROR:
    430                 gen_log.error("got POLLERR event on ZMQStream, which doesn't make sense")
    431                 return
    432             if events & IOLoop.READ:
--> 433                 self._handle_recv()
        self._handle_recv = <bound method ZMQStream._handle_recv of <zmq.eventloop.zmqstream.ZMQStream object>>
    434                 if not self.socket:
    435                     return
    436             if events & IOLoop.WRITE:
    437                 self._handle_send()

...........................................................................
/Users/fengyuyao/anaconda/lib/python2.7/site-packages/zmq/eventloop/zmqstream.py in _handle_recv(self=<zmq.eventloop.zmqstream.ZMQStream object>)
    460                 gen_log.error("RECV Error: %s"%zmq.strerror(e.errno))
    461         else:
    462             if self._recv_callback:
    463                 callback = self._recv_callback
    464                 # self._recv_callback = None
--> 465                 self._run_callback(callback, msg)
        self._run_callback = <bound method ZMQStream._run_callback of <zmq.eventloop.zmqstream.ZMQStream object>>
        callback = <function null_wrapper>
        msg = [<zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>]
    466                 
    467         # self.update_state()
    468         
    469 

...........................................................................
/Users/fengyuyao/anaconda/lib/python2.7/site-packages/zmq/eventloop/zmqstream.py in _run_callback(self=<zmq.eventloop.zmqstream.ZMQStream object>, callback=<function null_wrapper>, *args=([<zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>],), **kwargs={})
    402         close our socket."""
    403         try:
    404             # Use a NullContext to ensure that all StackContexts are run
    405             # inside our blanket exception handler rather than outside.
    406             with stack_context.NullContext():
--> 407                 callback(*args, **kwargs)
        callback = <function null_wrapper>
        args = ([<zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>],)
        kwargs = {}
    408         except:
    409             gen_log.error("Uncaught exception, closing connection.",
    410                           exc_info=True)
    411             # Close the socket on an uncaught exception from a user callback

...........................................................................
/Users/fengyuyao/anaconda/lib/python2.7/site-packages/tornado/stack_context.py in null_wrapper(*args=([<zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>],), **kwargs={})
    270         # Fast path when there are no active contexts.
    271         def null_wrapper(*args, **kwargs):
    272             try:
    273                 current_state = _state.contexts
    274                 _state.contexts = cap_contexts[0]
--> 275                 return fn(*args, **kwargs)
        args = ([<zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>],)
        kwargs = {}
    276             finally:
    277                 _state.contexts = current_state
    278         null_wrapper._wrapped = True
    279         return null_wrapper

...........................................................................
/Users/fengyuyao/anaconda/lib/python2.7/site-packages/IPython/kernel/zmq/kernelbase.py in dispatcher(msg=[<zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>])
    247         if self.control_stream:
    248             self.control_stream.on_recv(self.dispatch_control, copy=False)
    249 
    250         def make_dispatcher(stream):
    251             def dispatcher(msg):
--> 252                 return self.dispatch_shell(stream, msg)
        msg = [<zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>, <zmq.sugar.frame.Frame object>]
    253             return dispatcher
    254 
    255         for s in self.shell_streams:
    256             s.on_recv(make_dispatcher(s), copy=False)

...........................................................................
/Users/fengyuyao/anaconda/lib/python2.7/site-packages/IPython/kernel/zmq/kernelbase.py in dispatch_shell(self=<IPython.kernel.zmq.ipkernel.IPythonKernel object>, stream=<zmq.eventloop.zmqstream.ZMQStream object>, msg={'buffers': [], 'content': {u'allow_stdin': True, u'code': u'learned_dict = dict_learn(init_dict, patches)', u'silent': False, u'stop_on_error': True, u'store_history': True, u'user_expressions': {}}, 'header': {u'msg_id': u'D61C0C0F1F89441EB2C232BAE352E9B6', u'msg_type': u'execute_request', u'session': u'21C58290AD9A4368BCFCB05D17E87C41', u'username': u'username', u'version': u'5.0'}, 'metadata': {}, 'msg_id': u'D61C0C0F1F89441EB2C232BAE352E9B6', 'msg_type': u'execute_request', 'parent_header': {}})
    208         else:
    209             # ensure default_int_handler during handler call
    210             sig = signal(SIGINT, default_int_handler)
    211             self.log.debug("%s: %s", msg_type, msg)
    212             try:
--> 213                 handler(stream, idents, msg)
        handler = <bound method IPythonKernel.execute_request of <IPython.kernel.zmq.ipkernel.IPythonKernel object>>
        stream = <zmq.eventloop.zmqstream.ZMQStream object>
        idents = ['21C58290AD9A4368BCFCB05D17E87C41']
        msg = {'buffers': [], 'content': {u'allow_stdin': True, u'code': u'learned_dict = dict_learn(init_dict, patches)', u'silent': False, u'stop_on_error': True, u'store_history': True, u'user_expressions': {}}, 'header': {u'msg_id': u'D61C0C0F1F89441EB2C232BAE352E9B6', u'msg_type': u'execute_request', u'session': u'21C58290AD9A4368BCFCB05D17E87C41', u'username': u'username', u'version': u'5.0'}, 'metadata': {}, 'msg_id': u'D61C0C0F1F89441EB2C232BAE352E9B6', 'msg_type': u'execute_request', 'parent_header': {}}
    214             except Exception:
    215                 self.log.error("Exception in message handler:", exc_info=True)
    216             finally:
    217                 signal(SIGINT, sig)

...........................................................................
/Users/fengyuyao/anaconda/lib/python2.7/site-packages/IPython/kernel/zmq/kernelbase.py in execute_request(self=<IPython.kernel.zmq.ipkernel.IPythonKernel object>, stream=<zmq.eventloop.zmqstream.ZMQStream object>, ident=['21C58290AD9A4368BCFCB05D17E87C41'], parent={'buffers': [], 'content': {u'allow_stdin': True, u'code': u'learned_dict = dict_learn(init_dict, patches)', u'silent': False, u'stop_on_error': True, u'store_history': True, u'user_expressions': {}}, 'header': {u'msg_id': u'D61C0C0F1F89441EB2C232BAE352E9B6', u'msg_type': u'execute_request', u'session': u'21C58290AD9A4368BCFCB05D17E87C41', u'username': u'username', u'version': u'5.0'}, 'metadata': {}, 'msg_id': u'D61C0C0F1F89441EB2C232BAE352E9B6', 'msg_type': u'execute_request', 'parent_header': {}})
    357         if not silent:
    358             self.execution_count += 1
    359             self._publish_execute_input(code, parent, self.execution_count)
    360         
    361         reply_content = self.do_execute(code, silent, store_history,
--> 362                                         user_expressions, allow_stdin)
        user_expressions = {}
        allow_stdin = True
    363 
    364         # Flush output before sending the reply.
    365         sys.stdout.flush()
    366         sys.stderr.flush()

...........................................................................
/Users/fengyuyao/anaconda/lib/python2.7/site-packages/IPython/kernel/zmq/ipkernel.py in do_execute(self=<IPython.kernel.zmq.ipkernel.IPythonKernel object>, code=u'learned_dict = dict_learn(init_dict, patches)', silent=False, store_history=True, user_expressions={}, allow_stdin=True)
    176 
    177         reply_content = {}
    178         # FIXME: the shell calls the exception handler itself.
    179         shell._reply_content = None
    180         try:
--> 181             shell.run_cell(code, store_history=store_history, silent=silent)
        shell.run_cell = <bound method ZMQInteractiveShell.run_cell of <I....kernel.zmq.zmqshell.ZMQInteractiveShell object>>
        code = u'learned_dict = dict_learn(init_dict, patches)'
        store_history = True
        silent = False
    182         except:
    183             status = u'error'
    184             # FIXME: this code right now isn't being used yet by default,
    185             # because the run_cell() call above directly fires off exception

...........................................................................
/Users/fengyuyao/anaconda/lib/python2.7/site-packages/IPython/core/interactiveshell.py in run_cell(self=<IPython.kernel.zmq.zmqshell.ZMQInteractiveShell object>, raw_cell=u'learned_dict = dict_learn(init_dict, patches)', store_history=True, silent=False, shell_futures=True)
   2863                 self.displayhook.exec_result = result
   2864 
   2865                 # Execute the user code
   2866                 interactivity = "none" if silent else self.ast_node_interactivity
   2867                 self.run_ast_nodes(code_ast.body, cell_name,
-> 2868                    interactivity=interactivity, compiler=compiler, result=result)
        interactivity = 'last_expr'
        compiler = <IPython.core.compilerop.CachingCompiler instance>
   2869 
   2870                 # Reset this so later displayed values do not modify the
   2871                 # ExecutionResult
   2872                 self.displayhook.exec_result = None

...........................................................................
/Users/fengyuyao/anaconda/lib/python2.7/site-packages/IPython/core/interactiveshell.py in run_ast_nodes(self=<IPython.kernel.zmq.zmqshell.ZMQInteractiveShell object>, nodelist=[<_ast.Assign object>], cell_name='<ipython-input-24-d745e5de1eae>', interactivity='none', compiler=<IPython.core.compilerop.CachingCompiler instance>, result=<IPython.core.interactiveshell.ExecutionResult object>)
   2967 
   2968         try:
   2969             for i, node in enumerate(to_run_exec):
   2970                 mod = ast.Module([node])
   2971                 code = compiler(mod, cell_name, "exec")
-> 2972                 if self.run_code(code, result):
        self.run_code = <bound method ZMQInteractiveShell.run_code of <I....kernel.zmq.zmqshell.ZMQInteractiveShell object>>
        code = <code object <module> at 0x10abcef30, file "<ipython-input-24-d745e5de1eae>", line 1>
        result = <IPython.core.interactiveshell.ExecutionResult object>
   2973                     return True
   2974 
   2975             for i, node in enumerate(to_run_interactive):
   2976                 mod = ast.Interactive([node])

...........................................................................
/Users/fengyuyao/anaconda/lib/python2.7/site-packages/IPython/core/interactiveshell.py in run_code(self=<IPython.kernel.zmq.zmqshell.ZMQInteractiveShell object>, code_obj=<code object <module> at 0x10abcef30, file "<ipython-input-24-d745e5de1eae>", line 1>, result=<IPython.core.interactiveshell.ExecutionResult object>)
   3027         outflag = 1  # happens in more places, so it's easier as default
   3028         try:
   3029             try:
   3030                 self.hooks.pre_run_code_hook()
   3031                 #rprint('Running code', repr(code_obj)) # dbg
-> 3032                 exec(code_obj, self.user_global_ns, self.user_ns)
        code_obj = <code object <module> at 0x10abcef30, file "<ipython-input-24-d745e5de1eae>", line 1>
        self.user_global_ns = {'In': ['', u'import skimage\nimport skimage.data as data\ni...klearn.preprocessing import normalize\nimport os', u"get_ipython().magic(u'matplotlib inline')", u"data_path = '/Users/fengyuyao/Research/experim...th) if '.png' in i])\n\ndata = data.mean(axis=3)", u'img = data[0, ...]\n#img = sktrans.resize(img, (150, 150))', u"pimg = extract_patches_2d(img, (8,8))\nnimg = ...#ccc =reconstruct_from_patches_2d(bbb, (50, 50))", u'pimg = normalize(pimg)\nnpimg = normalize(npimg)', u"pimg = extract_patches_2d(img, (8,8))\nnimg = ...#ccc =reconstruct_from_patches_2d(bbb, (50, 50))", u'patches = np.array([extract_patches_2d(d, (8,8)) for d in data[:10,...]]).reshape(-1, 64)', u'init_dict = patches[np.random.choice(np.arange... = np.ones(64)\ninit_dict = normalize(init_dict)', u"def dict_learn(dictionary, data):\n    diction...        #yield dictionary\n    return dictionary", u'learned_dict = dict_learn(init_dict, patches)', u"def dict_learn(dictionary, data):\n    diction...        #yield dictionary\n    return dictionary", u'init_dict = patches[np.random.choice(np.arange... = np.ones(64)\ninit_dict = normalize(init_dict)', u'learned_dict = dict_learn(init_dict, patches)', u"def dict_learn(dictionary, data):\n    diction...        #yield dictionary\n    return dictionary", u'learned_dict = dict_learn(init_dict, patches)', u'patches = np.array([extract_patches_2d(d, (8,8)) for d in data[:10,...]]).reshape(-1, 64)', u'index = np.arange(patches.shape[0])\nnp.random.shuffle(index)\nindex = index[:20000]', u'patches = patches[index]', ...], 'Out': {20: (20000, 64)}, 'SparseCoder': <class 'sklearn.decomposition.dict_learning.SparseCoder'>, '_': (20000, 64), '_20': (20000, 64), '__': '', '___': '', '__builtin__': <module '__builtin__' (built-in)>, '__builtins__': <module '__builtin__' (built-in)>, '__doc__': 'Automatically created module for IPython interactive environment', ...}
        self.user_ns = {'In': ['', u'import skimage\nimport skimage.data as data\ni...klearn.preprocessing import normalize\nimport os', u"get_ipython().magic(u'matplotlib inline')", u"data_path = '/Users/fengyuyao/Research/experim...th) if '.png' in i])\n\ndata = data.mean(axis=3)", u'img = data[0, ...]\n#img = sktrans.resize(img, (150, 150))', u"pimg = extract_patches_2d(img, (8,8))\nnimg = ...#ccc =reconstruct_from_patches_2d(bbb, (50, 50))", u'pimg = normalize(pimg)\nnpimg = normalize(npimg)', u"pimg = extract_patches_2d(img, (8,8))\nnimg = ...#ccc =reconstruct_from_patches_2d(bbb, (50, 50))", u'patches = np.array([extract_patches_2d(d, (8,8)) for d in data[:10,...]]).reshape(-1, 64)', u'init_dict = patches[np.random.choice(np.arange... = np.ones(64)\ninit_dict = normalize(init_dict)', u"def dict_learn(dictionary, data):\n    diction...        #yield dictionary\n    return dictionary", u'learned_dict = dict_learn(init_dict, patches)', u"def dict_learn(dictionary, data):\n    diction...        #yield dictionary\n    return dictionary", u'init_dict = patches[np.random.choice(np.arange... = np.ones(64)\ninit_dict = normalize(init_dict)', u'learned_dict = dict_learn(init_dict, patches)', u"def dict_learn(dictionary, data):\n    diction...        #yield dictionary\n    return dictionary", u'learned_dict = dict_learn(init_dict, patches)', u'patches = np.array([extract_patches_2d(d, (8,8)) for d in data[:10,...]]).reshape(-1, 64)', u'index = np.arange(patches.shape[0])\nnp.random.shuffle(index)\nindex = index[:20000]', u'patches = patches[index]', ...], 'Out': {20: (20000, 64)}, 'SparseCoder': <class 'sklearn.decomposition.dict_learning.SparseCoder'>, '_': (20000, 64), '_20': (20000, 64), '__': '', '___': '', '__builtin__': <module '__builtin__' (built-in)>, '__builtins__': <module '__builtin__' (built-in)>, '__doc__': 'Automatically created module for IPython interactive environment', ...}
   3033             finally:
   3034                 # Reset our crash handler in place
   3035                 sys.excepthook = old_excepthook
   3036         except SystemExit as e:

...........................................................................
/Users/fengyuyao/Research/ppts/dictionary_learning_2015.11.25/code/<ipython-input-24-d745e5de1eae> in <module>()
----> 1 
      2 
      3 
      4 
      5 
      6 learned_dict = dict_learn(init_dict, patches)
      7 
      8 
      9 
     10 

...........................................................................
/Users/fengyuyao/Research/ppts/dictionary_learning_2015.11.25/code/<ipython-input-23-50e8dab30ec4> in dict_learn(dictionary=array([[ 0.125     ,  0.125     ,  0.125     , ....  0.10416518,
         0.06896773,  0.0757119 ]]), data=array([[ 0.50559053,  0.49227671,  0.48265361, ....  0.15035063,
         0.1782305 ,  0.19739984]]))
      3     iteration = 0
      4     last_iter_norm = 1e5
      5     while True:
      6         # Sparse coding stage
      7         coder = SparseCoder(dictionary, transform_algorithm='omp', n_jobs=8, transform_n_nonzero_coefs=3)
----> 8         code = coder.fit_transform(data)
      9         #print iteration, ' ', linalg.norm(data - np.dot(code, dictionary)), ' +',
     10         # update stage
     11         for i in range(dictionary.shape[0]):
     12             _dictionary = dictionary.copy()

...........................................................................
/Users/fengyuyao/anaconda/lib/python2.7/site-packages/sklearn/base.py in fit_transform(self=SparseCoder(dictionary=None, n_jobs=8, split_sig...rm_alpha=None,
      transform_n_nonzero_coefs=3), X=array([[ 0.50559053,  0.49227671,  0.48265361, ....  0.15035063,
         0.1782305 ,  0.19739984]]), y=None, **fit_params={})
    450         """
    451         # non-optimized default implementation; override when a better
    452         # method is possible for a given clustering algorithm
    453         if y is None:
    454             # fit method of arity 1 (unsupervised transformation)
--> 455             return self.fit(X, **fit_params).transform(X)
        self.fit = <bound method SparseCoder.fit of SparseCoder(dic...m_alpha=None,
      transform_n_nonzero_coefs=3)>
        X = array([[ 0.50559053,  0.49227671,  0.48265361, ....  0.15035063,
         0.1782305 ,  0.19739984]])
        fit_params.transform = undefined
    456         else:
    457             # fit method of arity 2 (supervised transformation)
    458             return self.fit(X, y, **fit_params).transform(X)
    459 

...........................................................................
/Users/fengyuyao/anaconda/lib/python2.7/site-packages/sklearn/decomposition/dict_learning.py in transform(self=SparseCoder(dictionary=None, n_jobs=8, split_sig...rm_alpha=None,
      transform_n_nonzero_coefs=3), X=array([[ 0.50559053,  0.49227671,  0.48265361, ....  0.15035063,
         0.1782305 ,  0.19739984]]), y=None)
    813         n_samples, n_features = X.shape
    814 
    815         code = sparse_encode(
    816             X, self.components_, algorithm=self.transform_algorithm,
    817             n_nonzero_coefs=self.transform_n_nonzero_coefs,
--> 818             alpha=self.transform_alpha, n_jobs=self.n_jobs)
        self.transform_alpha = None
        self.n_jobs = 8
    819 
    820         if self.split_sign:
    821             # feature vector is split into a positive and negative side
    822             n_samples, n_features = code.shape

...........................................................................
/Users/fengyuyao/anaconda/lib/python2.7/site-packages/sklearn/decomposition/dict_learning.py in sparse_encode(X=array([[ 0.50559053,  0.49227671,  0.48265361, ....  0.15035063,
         0.1782305 ,  0.19739984]]), dictionary=array([[ 0.125     ,  0.125     ,  0.125     , ....  0.10416518,
         0.06896773,  0.0757119 ]]), gram=array([[ 1.        ,  0.99706708,  0.8669373 , ....  0.94511259,
         0.93221472,  1.        ]]), cov=array([[ 3.49867539,  1.93651123,  2.05015994, ....  4.82561002,
         0.62133361,  2.87358633]]), algorithm='omp', n_nonzero_coefs=3, alpha=None, copy_cov=False, init=None, max_iter=1000, n_jobs=8, check_input=True, verbose=0)
    295             algorithm,
    296             regularization=regularization, copy_cov=copy_cov,
    297             init=init[this_slice] if init is not None else None,
    298             max_iter=max_iter,
    299             check_input=False)
--> 300         for this_slice in slices)
        this_slice = undefined
        slices = [slice(0, 2500, None), slice(2500, 5000, None), slice(5000, 7500, None), slice(7500, 10000, None), slice(10000, 12500, None), slice(12500, 15000, None), slice(15000, 17500, None), slice(17500, 20000, None)]
    301     for this_slice, this_view in zip(slices, code_views):
    302         code[this_slice] = this_view
    303     return code
    304 

...........................................................................
/Users/fengyuyao/anaconda/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py in __call__(self=Parallel(n_jobs=8), iterable=<generator object <genexpr>>)
    807             if pre_dispatch == "all" or n_jobs == 1:
    808                 # The iterable was consumed all at once by the above for loop.
    809                 # No need to wait for async callbacks to trigger to
    810                 # consumption.
    811                 self._iterating = False
--> 812             self.retrieve()
        self.retrieve = <bound method Parallel.retrieve of Parallel(n_jobs=8)>
    813             # Make sure that we get a last message telling us we are done
    814             elapsed_time = time.time() - self._start_time
    815             self._print('Done %3i out of %3i | elapsed: %s finished',
    816                         (len(self._output), len(self._output),

---------------------------------------------------------------------------
Sub-process traceback:
---------------------------------------------------------------------------
ValueError                                         Fri Dec  4 10:21:33 2015
PID: 35032              Python 2.7.10: /Users/fengyuyao/anaconda/bin/python
...........................................................................
/Users/fengyuyao/anaconda/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in __call__(self=<sklearn.externals.joblib.parallel.BatchedCalls object>)
     67     def __init__(self, iterator_slice):
     68         self.items = list(iterator_slice)
     69         self._size = len(self.items)
     70 
     71     def __call__(self):
---> 72         return [func(*args, **kwargs) for func, args, kwargs in self.items]
     73 
     74     def __len__(self):
     75         return self._size
     76 

...........................................................................
/Users/fengyuyao/anaconda/lib/python2.7/site-packages/sklearn/decomposition/dict_learning.pyc in _sparse_encode(X=memmap([[ 0.50559053,  0.49227671,  0.48265361, ...  0.99596078,
         0.99738562,  1.        ]]), dictionary=array([[ 0.125     ,  0.125     ,  0.125     , ....  0.10416518,
         0.06896773,  0.0757119 ]]), gram=memmap([[ 1.        ,  0.99706708,  0.8669373 , ...  0.94511259,
         0.93221472,  1.        ]]), cov=memmap([[ 3.49867539,  1.93651123,  2.05015994, ...  5.77883725,
         3.55803798,  7.21968383]]), algorithm='omp', regularization=3, copy_cov=False, init=None, max_iter=1000, check_input=False, verbose=0)
    147     elif algorithm == 'omp':
    148         # TODO: Should verbose argument be passed to this?
    149         new_code = orthogonal_mp_gram(
    150             Gram=gram, Xy=cov, n_nonzero_coefs=int(regularization),
    151             tol=None, norms_squared=row_norms(X, squared=True),
--> 152             copy_Xy=copy_cov).T
        algorithm = 'omp'
        alpha = undefined
    153     else:
    154         raise ValueError('Sparse coding method must be "lasso_lars" '
    155                          '"lasso_cd",  "lasso", "threshold" or "omp", got %s.'
    156                          % algorithm)

...........................................................................
/Users/fengyuyao/anaconda/lib/python2.7/site-packages/sklearn/linear_model/omp.pyc in orthogonal_mp_gram(Gram=array([[ 1.        ,  0.99706708,  0.8669373 , ....  0.94511259,
         0.93221472,  1.        ]]), Xy=array([[ 3.49867539,  1.93651123,  2.05015994, ....  5.77883725,
         3.55803798,  7.21968383]]), n_nonzero_coefs=3, tol=None, norms_squared=array([ 12.37032493,   4.36747488,   4.2134112 ,... 37.00901994,
        16.6505497 ,  58.97107498]), copy_Gram=True, copy_Xy=False, return_path=False, return_n_iter=False)
    518     for k in range(Xy.shape[1]):
    519         out = _gram_omp(
    520             Gram, Xy[:, k], n_nonzero_coefs,
    521             norms_squared[k] if tol is not None else None, tol,
    522             copy_Gram=copy_Gram, copy_Xy=copy_Xy,
--> 523             return_path=return_path)
    524         if return_path:
    525             _, idx, coefs, n_iter = out
    526             coef = coef[:, :, :len(idx)]
    527             for n_active, x in enumerate(coefs.T):

...........................................................................
/Users/fengyuyao/anaconda/lib/python2.7/site-packages/sklearn/linear_model/omp.pyc in _gram_omp(Gram=array([[ 1.        ,  0.99010866,  0.82197346, ....  0.94511259,
         0.93221472,  1.        ]]), Xy=array([ 3.49867539,  3.48729003,  2.91977933,  3...4,  3.39029937,
        3.45356109,  3.35550344]), n_nonzero_coefs=3, tol_0=None, tol=None, copy_Gram=True, copy_Xy=False, return_path=False)
    240                 break
    241             L[n_active, n_active] = np.sqrt(1 - v)
    242         Gram[n_active], Gram[lam] = swap(Gram[n_active], Gram[lam])
    243         Gram.T[n_active], Gram.T[lam] = swap(Gram.T[n_active], Gram.T[lam])
    244         indices[n_active], indices[lam] = indices[lam], indices[n_active]
--> 245         Xy[n_active], Xy[lam] = Xy[lam], Xy[n_active]
        return_path = False
    246         n_active += 1
    247         # solves LL'x = y as a composition of two triangular systems
    248         gamma, _ = potrs(L[:n_active, :n_active], Xy[:n_active], lower=True,
    249                          overwrite_b=False)

ValueError: assignment destination is read-only
___________________________________________________________________________
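(Editor's note: a minimal sketch, independent of scikit-learn, of the failure mode in the last frame above: the memmapped Xy array that joblib hands to the worker is read-only, so the in-place swap in _gram_omp fails.)

import numpy as np

a = np.arange(4.0)
a.setflags(write=False)  # stand-in for joblib's read-only memmap
try:
    a[0], a[1] = a[1], a[0]  # in-place swap, as in Xy[n_active], Xy[lam] = ...
except ValueError as e:
    print(e)  # assignment destination is read-only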

vighneshbirodkar (Contributor) commented Dec 8, 2015

I am taking a look at this

amueller added the Bug label Dec 8, 2015

amueller added this to the 0.18 milestone Dec 8, 2015

lesteve (Member) commented Dec 9, 2015

Is it not related to #5481, which seems more generic?

vighneshbirodkar (Contributor) commented Dec 9, 2015

It is, but SparseEncoder is not an estimator

lesteve (Member) commented Dec 10, 2015

It is, but SparseEncoder is not an estimator

Not that it matters but SparseCoder is an estimator:

from sklearn.base import BaseEstimator
from sklearn.decomposition import SparseCoder

issubclass(SparseCoder, BaseEstimator)  # True
arthurmensch (Contributor) commented Dec 11, 2015

I guess the error wasn't detected in #4807 as it is raised only when using algorithm='omp'. It should be raised when testing read-only data on OrthogonalMatchingPursuit, though.

alichaudry commented Apr 8, 2016

Was there a resolution to this bug? I've run into something similar when using n_jobs=-1 with RandomizedLogisticRegression, and didn't know whether I should open a new issue here. Here's the top of my stack:

/Users/ali/.pyenv/versions/mvenv/lib/python2.7/site-packages/sklearn/linear_model/randomized_l1.py in _randomized_logistic(X=memmap([[ -4.24636666e-03,  -5.10115749e-03,  -1...920913e-03,  -1.46599832e-03,   2.91083847e-03]]), y=array([1, 0, 0, ..., 0, 0, 0]), weights=array([ 0. ,  0.5,  0.5,  0. ,  0. ,  0. ,  0.5,... 0.5,  0.5,  0. ,  0. ,  0.5,  0. ,
        0.5]), mask=array([ True,  True, False, ...,  True,  True,  True], dtype=bool), C=1.5, verbose=0, fit_intercept=True, tol=0.001)
    351     if issparse(X):
    352         size = len(weights)
    353         weight_dia = sparse.dia_matrix((1 - weights, 0), (size, size))
    354         X = X * weight_dia
    355     else:
--> 356         X *= (1 - weights)
        X = memmap([[ -4.24636666e-03,  -5.10115749e-03,  -1...920913e-03,  -1.46599832e-03,   2.91083847e-03]])
        weights = array([ 0. ,  0.5,  0.5,  0. ,  0. ,  0. ,  0.5,... 0.5,  0.5,  0. ,  0. ,  0.5,  0. ,
        0.5])
    357 
    358     C = np.atleast_1d(np.asarray(C, dtype=np.float))
    359     scores = np.zeros((X.shape[1], len(C)), dtype=np.bool)
    360 

ValueError: output array is read-only

Someone ran into exactly the same problem on StackOverflow - ValueError: output array is read-only. Both solutions provided on SO are useless (the first doesn't even attempt to solve the problem, and the second solves it by bypassing joblib completely).


lesteve (Member) commented Apr 12, 2016

@alichaudry I just commented on a similar issue here.

fkrasnov commented Apr 19, 2016

I confirm that there is an error, and it is intermittent in nature.

sklearn.decomposition.SparseCoder(D, transform_algorithm='omp', n_jobs=64).transform(X)

If X.shape[0] > 4000, it fails with ValueError: assignment destination is read-only.
If X.shape[0] < 100, it works fine.

OS: Linux 3.10.0-327.13.1.el7.x86_64
python: Python 2.7.5
numpy: 1.10.1
sklearn: 0.17

zaino commented Jul 20, 2016

Hi there,
I'm running into the same problem, using MiniBatchDictionaryLearning with n_jobs > 1.
I see a lot of references to other issues, but was there ever a solution to this?
Sorry in advance if a solution was mentioned and I missed it.

OS: OS X
python: 3.5
numpy: 1.10.1
sklearn: 0.17

amueller (Member) commented Jul 27, 2016

The problem is in modifying arrays in-place. @lesteve close as duplicate of #5481?
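(Editor's note: a minimal sketch of that in-place-modification failure with joblib alone, no scikit-learn; the shapes are arbitrary, and the exact message and behavior depend on the joblib version and backend.)

import numpy as np
from joblib import Parallel, delayed

def inplace_scale(a):
    a *= 2.0  # writes into the array the worker received
    return a.sum()

X = np.random.rand(4000, 64)  # ~2 MB, above the default max_nbytes of 1M
# With n_jobs > 1, joblib memmaps X read-only for the workers, so the
# in-place write above raises a read-only ValueError in the subprocess.
Parallel(n_jobs=2)(delayed(inplace_scale)(X) for _ in range(2))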

amueller modified the milestones: 0.18, 0.19 Sep 22, 2016

amueller modified the milestone: 0.19 Sep 29, 2016

williamdjones commented Jul 13, 2017

I am still dealing with this issue, and it is nearly a year on. This is still an open issue.

jnothman (Member) commented Jul 14, 2017

If you have a solution, please contribute it, @williamdjones

agramfort (Member) commented Jul 14, 2017

#4807 is probably the more advanced effort to address this.

amueller (Member) commented Jul 15, 2017

@williamdjones I was not suggesting that it's solved, but that it's an issue that is reported at a different place, and having multiple issues related to the same problem makes keeping track of it harder.

ghost commented Aug 22, 2017

Not sure where to report this, or if it's related, but I get the ValueError: output array is read-only when using n_jobs > 1 with RandomizedLasso and other functions.

williamdjones commented Aug 22, 2017

@JGH1000 NOT A SOLUTION, but I would try using a random forest for feature selection instead, since it is stable and has working joblib functionality.

ghost commented Aug 22, 2017

Thanks @williamdjones, I used several different methods but found that RandomizedLasso works best for a couple of particular datasets. In any case, it works, just a bit slowly. Not a deal breaker.

williamdjones commented Aug 22, 2017

@JGH1000 No problem. If you don't mind, I'm curious about the dimensionality of the datasets for which RLasso was useful versus those for which it was not.

ghost commented Aug 22, 2017

@williamdjones it was a small-sample (40-50 samples), high-dimensional (40,000-50,000 features) dataset. I would not say that other methods were bad, but RLasso provided results/rankings that were much more consistent with several univariate tests plus domain knowledge. These might not be the 'right' features, but I had more trust in this method. Shame to hear it will be removed from scikit-learn.

garciaev commented Feb 4, 2018

The problem still seems to exist with RLasso, n_jobs=-1, and sklearn 0.19.1 on a 24-core Ubuntu machine.

ogrisel added a commit to ogrisel/scikit-learn that referenced this issue Jun 22, 2018

ogrisel modified the milestones: 0.19, 0.20 Jun 23, 2018

lvermue commented Sep 13, 2018

@coldfog @garciaev I don't know if it is still relevant for you, but I ran into the same problem using joblib without scikit-learn.

The cause is the max_nbytes parameter of joblib's Parallel invocation when you set n_jobs > 1, which defaults to 1M. This parameter is defined as the "threshold on the size of arrays passed to the workers that triggers automated memory mapping in temp_folder". More details can be found here: https://joblib.readthedocs.io/en/latest/generated/joblib.Parallel.html#

So once an array exceeds 1M, joblib passes it to the workers as a read-only memmap, and in-place writes then raise "ValueError: assignment destination is read-only". To overcome this, the parameter has to be set higher, e.g. max_nbytes='50M'.

If you want a quick fix, you can add max_nbytes='50M' to the Parallel invocation in "sklearn/decomposition/dict_learning.py" (around line 297) to increase the size of arrays that are passed to the workers without memory mapping.
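(Editor's note: a minimal sketch of that workaround in plain joblib, assuming the caller controls the Parallel invocation; max_nbytes=None disables memmapping entirely.)

import numpy as np
from joblib import Parallel, delayed

def double_sum(a):
    a *= 2.0  # in-place write: would fail on a read-only memmap
    return a.sum()

X = np.random.rand(4000, 64)  # ~2 MB, above the default 1M threshold
# Raising max_nbytes makes joblib pickle X to the workers instead of
# memmapping it read-only, so the in-place write above succeeds.
results = Parallel(n_jobs=2, max_nbytes='50M')(
    delayed(double_sum)(X) for _ in range(2))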

chriys commented Sep 18, 2018

Just to complement @lvermue's answer: I did what he suggested, but the Parallel class instantiated inside sklearn/decomposition/dict_learning.py doesn't have max_nbytes set. What worked for me was to increase the default max_nbytes inside sklearn/externals/joblib/parallel.py from "1M" to "10M". I think you can put more if needed.
