Pickling error on Windows #12

Closed
guenp opened this issue Jan 19, 2016 · 14 comments

@guenp
Contributor

guenp commented Jan 19, 2016

When running the example script.py on Windows, the following error occurs:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\ccmradmin\Anaconda3\lib\multiprocessing\spawn.py", line 106, in spawn_main
    exitcode = _main(fd)
  File "C:\Users\ccmradmin\Anaconda3\lib\multiprocessing\spawn.py", line 116, in _main
    self = pickle.load(from_parent)
AttributeError: Can't get attribute 'TestProcedure' on <module '__main__' (built-in)>

This may have to do with the multiprocessing behavior on Windows.

@guenp guenp self-assigned this Jan 19, 2016
@cjermain
Member

OK, it looks like the difference between spawn and fork. On Unix, fork is the default, which means that TestProcedure gets duplicated in the new process's environment, so the Worker has access to the class. On Windows, however, spawn is the default, which starts a fresh environment.

It looks like the best approach is to test by forcing a spawn process. This may require changes in how the process accesses the Procedure class. What do you think?
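
To check which start method a given interpreter uses, here is a minimal sketch using only the standard multiprocessing API. The default depends on the platform and the Python version, so the printed value is deliberately not assumed here.

```python
import multiprocessing as mp

# Start methods supported on this platform. 'spawn' is available everywhere,
# while 'fork' exists only on Unix-like systems.
available_methods = mp.get_all_start_methods()

# Default start method for this interpreter (platform and version dependent).
default_method = mp.get_start_method()

if __name__ == "__main__":
    print("available:", available_methods)
    print("default:", default_method)
```

If the default is not 'spawn', code that only works because the child inherits the parent's memory can pass locally and still fail on Windows.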

@guenp
Contributor Author

guenp commented Jan 21, 2016

Yes that's right, here is another explanation:
http://stackoverflow.com/questions/9670926/multiprocessing-on-windows-breaks

I ran into this issue with my measurement code as well and solved it by passing only picklable objects to processes. I'll have a look and see if I can make it work.

FYI, to debug this on Mac or Unix we can force the spawn behavior:

# force windows multiprocessing behavior on mac
import multiprocessing as mp
mp.set_start_method('spawn')

@cjermain
Member

Ok, I agree that forcing the spawn multiprocessing behaviour is necessary to be platform agnostic. I will test this on Linux when I get the chance. Are you able to run a Worker on Windows with this code added?

@guenp
Contributor Author

guenp commented Jan 27, 2016

@cjermain I think the problem originates from the procedure object being passed to the worker, which it is then not able to retrieve because it's not a picklable object. One solution might be to create a string representation of the procedure, and then have the worker create it inside the run loop. This string should include the procedure class and the parameters. I'll give it a try.

@guenp
Contributor Author

guenp commented Jan 27, 2016

@cjermain OK, so what I just described is in fact exactly what pickling does. :) The worker just needs to be able to access the procedure class. This can be done by saving the procedure in a file that is importable from the path. See [https://github.com/ralph-group/pymeasure/commit/f599f0a122337feb13371bbde32cfe3208dddc00] and [https://github.com/ralph-group/pymeasure/commit/391eb192e9d8e20ef6448bbade6317a8be41d2d3] for an example. Do you agree with these changes? It would require users to save their procedures in a file, so scripting won't work.

@cjermain
Member

I'm not yet able to reproduce the pickling bug on Linux. Besides the logging no longer showing up properly, I am able to run the examples/script.py file without any exceptions.

I'd rather not restrict users to having to define their Procedures in a flat file. Let me think about this for a bit. Do you have any other ideas?

@cjermain
Member

In the initial error, is the AttributeError raised because TestProcedure is not defined in the worker process? I'm confused as to whether the issue is in TestProcedure's definition in the worker process environment or whether we just need to ensure TestProcedure is pickle-able.

@guenp
Contributor Author

guenp commented Jan 28, 2016

It works for you even with using spawn? Hmm. On my mac I can definitely reproduce the error. I agree it's a bit limiting but I'd rather not restrict users to using Linux ;) I think there's not really another solution, but maybe I'm missing something. Perhaps you could import the procedure from the script file (if it's accessible in the path)? That wouldn't work for ipython notebook though.

@guenp
Contributor Author

guenp commented Jan 28, 2016

Worker is trying to access self.procedure, and it does so through pickling: the object's properties are saved in a serialized representation, and the worker then tries to recreate the object from it. However, Worker does not have access to the class TestProcedure, hence the AttributeError.
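
The behavior described above can be reproduced in a single process. This is a minimal sketch, not pymeasure code: the TestProcedure class here is a hypothetical stand-in, and the module name "user_script" is made up so the demo does not depend on how the file is run. It shows that the pickle payload stores only the module and class names plus the instance attributes, and that loading fails with an AttributeError once the class can no longer be looked up.

```python
import pickle
import sys
import types


class TestProcedure:
    """Hypothetical stand-in for the procedure class in script.py."""

    def __init__(self, iterations):
        self.iterations = iterations


# Assumption for illustration: register the class under a made-up module
# name so that pickle looks it up there.
user_script = types.ModuleType("user_script")
user_script.TestProcedure = TestProcedure
TestProcedure.__module__ = "user_script"
sys.modules["user_script"] = user_script

payload = pickle.dumps(TestProcedure(iterations=10))

# The payload names the module and the class; it does not contain the class body.
names_in_payload = b"user_script" in payload and b"TestProcedure" in payload

# While the class can be looked up by name, unpickling works.
restored = pickle.loads(payload)

# Simulate a fresh process in which the module does not define the class.
del user_script.TestProcedure
try:
    pickle.loads(payload)
    lookup_error = None
except AttributeError as exc:
    lookup_error = exc

if __name__ == "__main__":
    print(restored.iterations)
    print(type(lookup_error).__name__)
```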

@cjermain
Member

I can reproduce the pickling problem with the following example from Stack Overflow.

from multiprocessing import Process, set_start_method
set_start_method('spawn', force=True)

if __name__ == '__main__':
    def f(I):
        print('hello world!',I)

    for I in [1,2,3]:
        Process(target=f,args=(I,)).start()

This gives me:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python3.4/multiprocessing/spawn.py", line 106, in spawn_main
  File "/usr/lib/python3.4/multiprocessing/spawn.py", line 106, in spawn_main
    exitcode = _main(fd)
  File "/usr/lib/python3.4/multiprocessing/spawn.py", line 116, in _main
    exitcode = _main(fd)
  File "/usr/lib/python3.4/multiprocessing/spawn.py", line 116, in _main
    self = pickle.load(from_parent)
AttributeError: Can't get attribute 'f' on <module '__mp_main__' from '/home/colin/Desktop/test.py'>
    self = pickle.load(from_parent)
AttributeError: Can't get attribute 'f' on <module '__mp_main__' from '/home/colin/Desktop/test.py'>
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python3.4/multiprocessing/spawn.py", line 106, in spawn_main
    exitcode = _main(fd)
  File "/usr/lib/python3.4/multiprocessing/spawn.py", line 116, in _main
    self = pickle.load(from_parent)
AttributeError: Can't get attribute 'f' on <module '__mp_main__' from '/home/colin/Desktop/test.py'>

In this simple example, moving the function definition out of the __main__ if statement solves the problem. Have you tested examples/script.py to see if that gives the problem? I suspect it wouldn't based on this result, since the procedure definition is still importable automatically during the pickle process.
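
For reference, a sketch of the variant described above, with the function moved to module level and only the process creation left under the __main__ guard. It uses get_context('spawn') instead of set_start_method, which is a substitution made here to avoid changing the global start method. Whether this runs cleanly may still depend on the platform and on how the script is launched.

```python
import multiprocessing as mp


def f(i):
    # Defined at module level, so a spawned child can find it by name
    # when it re-imports this file.
    print('hello world!', i)


def main():
    # A 'spawn' context forces Windows-like behavior on any platform.
    ctx = mp.get_context('spawn')
    processes = [ctx.Process(target=f, args=(i,)) for i in [1, 2, 3]]
    for process in processes:
        process.start()
    for process in processes:
        process.join()


if __name__ == '__main__':
    main()
```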

@guenp
Contributor Author

guenp commented Jan 29, 2016

I tried the simple example. Moving the function out of the __main__ if statement didn't solve the problem for me. Maybe there's still a difference between Mac/Linux? Also, the TestProcedure definition is already outside the __main__ statement so that explains why the error didn't occur on your machine (except for the logging stuffing up).

@cjermain
Member

I can reproduce what you say on Windows. It looks like there are still some differences in Linux spawn. It may be worth a bug report for CPython.

This behaviour implies that Procedures must be importable from a common file. In Jupyter you could use the %%writefile magic to write the procedure to a file. However, this eliminates the ability to programmatically modify the Procedure, which is a really nice feature. What do you think?

@guenp
Contributor Author

guenp commented Jan 31, 2016

The %%writefile magic works for me! :) Because the Worker loads the procedure module into the spawned environment from scratch, you can still edit your procedure programmatically by overwriting the procedures.py file: example. You could optionally also save a copy of the script.py or procedures.py file in the data folder/file or in a git repo for logging purposes.
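
The idea can be sketched outside of Jupyter and pymeasure. The example below writes a procedure class to a file, pickles an instance, and unpickles it in a completely fresh interpreter that only knows where the file lives. The module name demo_procedures, the class body, and the use of subprocess in place of a real Worker are all assumptions made for illustration.

```python
import importlib
import pickle
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical procedure source, standing in for what %%writefile would save.
PROCEDURE_SOURCE = (
    "class TestProcedure:\n"
    "    def __init__(self, iterations):\n"
    "        self.iterations = iterations\n"
)

# Code for the fresh interpreter: add the folder to the path, then unpickle.
CHILD_CODE = (
    "import pickle, sys\n"
    "sys.path.insert(0, sys.argv[1])\n"
    "procedure = pickle.load(sys.stdin.buffer)\n"
    "print(type(procedure).__name__, procedure.iterations)\n"
)

with tempfile.TemporaryDirectory() as workdir:
    Path(workdir, "demo_procedures.py").write_text(PROCEDURE_SOURCE)

    # Make the file importable in this process and create an instance.
    sys.path.insert(0, workdir)
    importlib.invalidate_caches()
    demo_procedures = importlib.import_module("demo_procedures")
    payload = pickle.dumps(demo_procedures.TestProcedure(iterations=5))

    # The child starts from scratch, much like a spawned worker would.
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE, workdir],
        input=payload,
        stdout=subprocess.PIPE,
        check=True,
    )
    sys.path.remove(workdir)

child_output = result.stdout.decode().strip()

if __name__ == "__main__":
    print(child_output)
```

Because the child imports the file itself, overwriting the file before starting a new child changes the class that child sees.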

@cjermain
Member

cjermain commented Feb 8, 2016

This should be resolved by pull request #13. Feel free to reopen the issue if not.
