
Can't run DEAP program with SCOOP #57

Closed
gvdb opened this issue Feb 12, 2015 · 10 comments

@gvdb

gvdb commented Feb 12, 2015

I'm trying to run a Python script that uses DEAP and SCOOP to process evaluations in parallel; however, I get the following traceback.

[2015-02-12 15:24:25,021] brokerzmq (127.0.0.1:57627) DEBUG   SHUTDOWN command received.
Traceback (most recent call last):
  File "/Applications/Canopy.app/appdata/canopy-1.1.0.1371.macosx-x86_64/Canopy.app/Contents/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/Applications/Canopy.app/appdata/canopy-1.1.0.1371.macosx-x86_64/Canopy.app/Contents/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/Users/gvdb/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 302, in <module>
    b.main()
  File "/Users/gvdb/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 92, in main
    self.run()
  File "/Users/gvdb/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 290, in run
    futures_startup()
  File "/Users/gvdb/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 271, in futures_startup
    run_name="__main__"
  File "/Users/gvdb/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/scoop/futures.py", line 64, in _startup
    result = _controller.switch(rootFuture, *args, **kargs)
  File "/Users/gvdb/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/scoop/_control.py", line 253, in runController
    raise future.exceptionValue
IndexError: string index out of range

I've tried running it under a debugger, but I can never reach the code in _control.py because the program terminates first with the following traceback.

[2015-02-12 15:51:56,111] scoopzmq  (127.0.0.1:51799) ERROR   An instance could not find its base reference on a worker. Ensure that your objects have their definition available in the root scope of your program.
'module' object has no attribute 'Individual'
[2015-02-12 15:51:56,111] brokerzmq (127.0.0.1:51893) DEBUG   SHUTDOWN command received.
[2015-02-12 15:51:56,113] scoopzmq  (127.0.0.1:52298) ERROR   A worker exited unexpectedly. Read the worker logs for more information. SCOOP pool will now shutdown.
Traceback (most recent call last):
  File "/Applications/Canopy.app/appdata/canopy-1.1.0.1371.macosx-x86_64/Canopy.app/Contents/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/Applications/Canopy.app/appdata/canopy-1.1.0.1371.macosx-x86_64/Canopy.app/Contents/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/Users/gvdb/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 302, in <module>
    b.main()
  File "/Users/gvdb/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 92, in main
    self.run()
  File "/Users/gvdb/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 290, in run
    futures_startup()
  File "/Users/gvdb/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 271, in futures_startup
    run_name="__main__"
  File "/Users/gvdb/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/scoop/futures.py", line 64, in _startup
    result = _controller.switch(rootFuture, *args, **kargs)
  File "/Users/gvdb/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/scoop/_control.py", line 207, in runController
    future = execQueue.pop()
  File "/Users/gvdb/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/scoop/_types.py", line 320, in pop
    self.updateQueue()
  File "/Users/gvdb/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/scoop/_types.py", line 343, in updateQueue
    for future in self.socket.recvFuture():
  File "/Users/gvdb/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/scoop/_comm/scoopzmq.py", line 279, in recvFuture
    received = self._recv()
  File "/Users/gvdb/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/scoop/_comm/scoopzmq.py", line 197, in _recv
    raise ReferenceBroken(e)
scoop._comm.scoopexceptions.ReferenceBroken: 'module' object has no attribute 'Individual'

I understand that this may not necessarily be an issue with DEAP, so I will also post this query to SCOOP developers for help. Any help or advice you can provide would be appreciated.

@cmd-ntrf
Member

This is not an issue with DEAP itself, so I am tagging it as invalid. Please use GitHub issues to report bugs and the DEAP mailing list to ask questions about usage. That said, I see that we omitted a link to the mailing list in the README. Here is the link:
https://groups.google.com/forum/#!forum/deap-users

I will fix that, thank you for the involuntary bug report ;).

Now for your problem, the answer is in the final line of the stack trace:

scoop._comm.scoopexceptions.ReferenceBroken: 'module' object has no attribute 'Individual'

The module object SCOOP is referring to is the creator module. The problem is that you probably called creator.create somewhere other than the global scope of your main module. The global scope of your main module is executed by every SCOOP worker. Since the creator constructs new classes dynamically, if the creator.create lines are not executed on every worker, the definition of your Individual will not be available to the workers.

I cannot blame you; this is a common mistake and, as far as I know, we have not documented this behaviour properly, although all of our examples explicitly call creator.create at the top level of the main module. So we have another documentation bug here.
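The failure mode can be reproduced without DEAP at all. A minimal sketch of why module-scope definitions matter (class names here are illustrative, not from the issue):

```python
import pickle

# A dynamically created class (which is what deap.creator.create builds)
# pickles *by name*: pickle records "module.qualname", and the receiving
# worker looks that name up at module scope. Binding the class at module
# scope makes the lookup succeed.
Individual = type("Individual", (list,), {})

data = pickle.dumps(Individual([1, 2, 3]))
restored = pickle.loads(data)  # worker-side lookup succeeds

# A class created inside a function is NOT reachable by name at module
# scope, so pickling its instances fails -- the same situation SCOOP
# workers hit when creator.create runs outside the main module's
# global scope.
def make_hidden():
    return type("Hidden", (list,), {})

try:
    pickle.dumps(make_hidden()([1, 2, 3]))
    hidden_pickled = True
except pickle.PicklingError:
    hidden_pickled = False  # expected: 'Hidden' cannot be found at module scope
```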

That said, I have recently made a breakthrough in one of my branches that will free DEAP from this infamous pickling problem and save us from having to write more documentation. I will not cover the details here; you can see the commit here: cmd-ntrf@c71b773

The change has not made it into DEAP's master branch yet because I am a bit busy and still have to work out some details. However, if you want to give it a try, here is the link to my fork:
https://github.com/cmd-ntrf/deap-1

Simply replace your installation of DEAP with this one, and it should solve your problem. If it does not... well, I will be glad to hear feedback from you :). Use my fork's Issues page to report any problems.

I am closing this issue for now and will modify our README to link to the mailing list. Use the mailing list if you have further questions about DEAP.

@yhamadi75

Hello, is there a definitive solution to this? I am running into a similar situation on Windows (with SCOOP or multiprocessing.Pool).
By definitive I mean a new version of DEAP integrating the changes shown in the deap-1 branch above.
Best,
Youssef

@tdb-alcorn

If you want to avoid this problem but don't want to move all your DEAP code into your main module, a workaround is to do all your calls to creator.create in the main module (which SCOOP executes) and then pass that creator down to the lower module where the rest of your DEAP code lives.

E.g.

# main.py
from deap import creator, base
from some_module import set_creator
set_creator(creator)
creator.create("Fitness", base.Fitness, weights=(1.0,))
creator.create("Individual", list, fitness=creator.Fitness)

and then in the other module

# some_module.py
from deap import base, tools, algorithms

creator = None  # injected from the main module via set_creator

def set_creator(cr):
    global creator
    creator = cr

#... deap stuff

@abast

abast commented Apr 18, 2017

Another workaround is to use dill for the serialization. Unlike pickle and cloudpickle, dill seems able to serialize the objects in question, even when they were not created in the main module.

Edit: dill does not work robustly here.

@abast

abast commented Jul 19, 2017

Is there any progress on this? If DEAP requires messing with the main module just to make parallelization work, that is a huge drawback. To be honest, I don't understand the invalid tag here.

@paddymccrudden

I am having the same issue. Would love to hear an update.

@fmder
Member

fmder commented Sep 12, 2017

@paddymccrudden Have you tried the proposed workaround?

@paddymccrudden

paddymccrudden commented Sep 13, 2017

Hi @fmder. Thanks! I was able to get that part running, but ran into another problem with pickling:


WARNING Pickling Error: Can't pickle : attribute lookup __builtin__.instancemethod failed

which led to:


scoop._comm.scoopexceptions.ReferenceBroken: 
This element could not be pickled:
FutureId(worker='127.0.0.1:50597', rank=1):partial(Individual([ 0.61961557,  0.38038443]),)=None.

To make this work, I had to refactor my code base to expose the functions, toolboxes, stats objects, etc. at the top level of main, along with making some variables global (not pretty).

I also had an issue with the use of decorators in the code. I have been using decorators once to enforce stochastic fitness (see the wiki), and once to process individuals to enforce constraints. I should be able to get around these by using inline functions rather than decorators and changing the flow of the code.

I've learnt that it pays to think parallel first. Easier than refactoring!
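One pickling-safe pattern along the lines paddymccrudden describes (all names here are illustrative, not from this thread) is to inline the constraint check in a plain module-level evaluation function instead of building it with a decorator, since module-level functions pickle by reference while decorator-produced closures do not:

```python
# Module-level functions pickle by reference, so moving the constraint
# logic out of a decorator and into a plain top-level function avoids
# the instancemethod/closure pickling error. Names are illustrative.

def raw_fitness(individual):
    return (sum(individual),)

def feasible(individual):
    return all(0.0 <= x <= 1.0 for x in individual)

def evaluate(individual):
    # Inlined constraint handling, replacing a decorator wrapper.
    if not feasible(individual):
        return (-1000.0,)  # fixed penalty for infeasible individuals
    return raw_fitness(individual)
```

Registering `evaluate` directly with the toolbox then keeps everything workers need resolvable by name.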

@dgabriele

dgabriele commented May 26, 2021

No updates to the documentation on this matter? It's still plaguing me. I designed my DEAP system before trying to swap in SCOOP, and now it's proving a huge headache to get this working. Not only that, but it requires me to turn my otherwise modular, tested, and easily reusable framework code into DEAP's idiosyncratic boilerplate, which now has to be copied and pasted into any DEAP application I may have. It's not a great feeling, to be honest. The examples in DEAP's docs should clearly show the main entrypoint in an if __name__ == '__main__' block, don't you think?
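For what it's worth, the layout being asked for can be sketched without any DEAP calls (names assumed, DEAP specifics elided): every definition workers need stays at module scope, and only the run itself sits behind the guard.

```python
# Everything workers must see -- created classes, evaluation functions,
# toolbox registrations -- lives at module scope; only orchestration is
# guarded. Names here are illustrative, not from the issue.

def evaluate(individual):
    return (sum(individual),)

def run():
    population = [[1, 2], [3, 4]]  # stand-in for a DEAP population
    return [evaluate(ind) for ind in population]

if __name__ == "__main__":
    print(run())
```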

@abast

abast commented May 26, 2021

So I can add one more workaround. My Individual's genome is fully specified by a simple list of numbers, so for evaluation, inside the map function that triggers the parallel evaluation in the HPC environment, I convert the Individual to a plain Python list, which can be pickled. That way, all the DEAP specifics stay local. When the results come back, I update the local Individuals from what the HPC environment returns.
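A sketch of that pattern (helper names are mine, not abast's): strip the DEAP-specific type before the parallel map, then write the fitnesses back locally.

```python
# Convert each Individual to a plain list before the parallel map so the
# payload pickles without needing deap.creator on the workers; reattach
# the results afterwards. In real DEAP code the write-back would be
# ind.fitness.values = fit.

def evaluate_genome(genome):
    # Pure function of a plain list -- pickles with no DEAP dependency.
    return (sum(genome),)

def parallel_evaluate(map_fn, population):
    genomes = [list(ind) for ind in population]  # strip the DEAP type
    fitnesses = list(map_fn(evaluate_genome, genomes))
    for ind, fit in zip(population, fitnesses):
        ind.fitness = fit
    return population
```

With SCOOP, `map_fn` would be `scoop.futures.map`; the builtin `map` works the same way serially.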
