initializer does not override global variable #206

Closed
luerhard opened this issue Apr 13, 2019 · 4 comments

Comments

@luerhard

Hey there,
I am currently trying to use the initializer feature of the get_reusable_executor function.

Because I could not get the expected results, I compared get_reusable_executor with ProcessPoolExecutor and saw a difference in the outputs.

The reusable executor does not seem to override the global variable. Can you help me figure this out?

A minimal example can be found below. The results are identical on loky versions 2.4.2 and 2.5.0dev0.

from concurrent.futures import ProcessPoolExecutor
import os
import sys

import loky

print(f"loky: {loky.__version__}")
print(f"Python: {sys.version}")

testvar = 1

def func(i):
    global testvar
    return testvar

def init(i):
    global testvar
    testvar = i + os.getpid()
    print(testvar)

exe = loky.get_reusable_executor(max_workers=2, initializer=init, initargs=(2,))
print(list(exe.map(func, range(4))))
exe.shutdown()

print("-"*30)

with ProcessPoolExecutor(max_workers=2, initializer=init, initargs=(2,)) as exe:
    print(list(exe.map(func, range(4))))

loky: 2.5.0dev0
Python: 3.7.1 (default, Dec 14 2018, 19:28:38) 
[GCC 7.3.0]
28849
[1, 1, 1, 1]
28850
------------------------------
28853
28854
[28853, 28853, 28854, 28853]

Thank you very much in advance.

Greetings,
Lukas

@tomMoral
Collaborator

Hi Lukas,

This has to do with how cloudpickle pickles the __main__ module: at each call of a function in a worker, the global variables are overridden with their values from the main process, so the value set by the initializer is lost.
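As a rough illustration of the explanation above, here is a toy model using only the standard library. It is not cloudpickle's actual code: `worker_globals`, `load_by_value`, and the fixed offset `100` (used instead of `os.getpid()` to keep the result deterministic) are assumptions made up for this sketch. It only mimics the idea that every function shipped from `__main__` brings the parent's values of its globals along and resets them in the worker.

```python
import types

# Hypothetical stand-in for the worker's copy of the __main__ namespace.
worker_globals = {"__builtins__": __builtins__}


def load_by_value(function, parent_snapshot):
    """Rebuild `function` in the worker namespace.

    The globals shipped with the function (the parent's values) overwrite
    whatever the worker currently holds under the same names.
    """
    worker_globals.update(parent_snapshot)
    return types.FunctionType(function.__code__, worker_globals, function.__name__)


def func(i):
    return testvar


def init(i):
    global testvar
    testvar = i + 100  # fixed offset instead of os.getpid(), for determinism


# In the parent process, testvar is still 1 when each function is sent.
parent_snapshot = {"testvar": 1}

# The initializer runs in the worker and sets the global there.
worker_init = load_by_value(init, parent_snapshot)
worker_init(2)
value_after_init = worker_globals["testvar"]

# Shipping the task resets the global to the parent's value.
worker_func = load_by_value(func, parent_snapshot)
value_seen_by_task = worker_func(0)

print(value_after_init, value_seen_by_task)  # 102 1
```

In this model the initializer does run and does set the value, which matches the PIDs printed in the report, but the task still sees `1`.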

To use a global variable in an initializer, the function and the initializer should be moved into a separate module:

# module.py
import os

testvar = 1

def func(i):
    global testvar
    return testvar

def init(i):
    global testvar
    testvar = i + os.getpid()
    print(testvar)

# main.py
from concurrent.futures import ProcessPoolExecutor
import os
import sys

import loky

from module import init, func

print(f"loky: {loky.__version__}")
print(f"Python: {sys.version}")

exe = loky.get_reusable_executor(max_workers=2, initializer=init, initargs=(2,))
print(list(exe.map(func, range(4))))
exe.shutdown()

print("-"*30)

with ProcessPoolExecutor(max_workers=2, initializer=init, initargs=(2,)) as exe:
    print(list(exe.map(func, range(4))))

$ python main.py 
loky: 2.4.2
Python: 3.7.0+ (heads/3.7:f9ae07441c, Aug 16 2018, 11:02:05) 
[GCC 8.2.0]
13685
13684
[13685, 13684, 13685, 13685]
------------------------------
13688
13689
[13688, 13689, 13688, 13688]
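The reason the separate module helps can be checked with the standard `pickle` module: a function that lives in an importable module is pickled by reference, that is, as a module name plus a function name, and the receiving side looks it up in its own copy of that module. The sketch below uses `json.dumps` purely as an example of such a function; it does not involve loky or cloudpickle.

```python
import json
import pickle

# A function from an importable module is stored as "module + name" only.
payload = pickle.dumps(json.dumps)
restored = pickle.loads(payload)

# The payload contains the names, not the function's code or globals.
print(b"json" in payload, b"dumps" in payload)  # True True

# Unpickling resolves to the very same function object of the local module.
print(restored is json.dumps)  # True
```

Because nothing but the name travels, the globals of `module.py` in each worker are left alone, and the value written by `init` is still there when `func` runs.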

@luerhard
Author

Thank you very much for your answer! Would you happen to have an idea how this could be achieved inside a single Jupyter Notebook? :)

I just found out about this project a couple of weeks ago and like it very much. Thanks again for developing it!

@tomMoral
Collaborator

tomMoral commented May 1, 2019

Thanks for your kind words!

If you really cannot have an extra module, a quick and dirty way to do it is to store the value as an attribute on an existing module (such as the os module) instead of in a global variable of __main__:

from concurrent.futures import ProcessPoolExecutor
import os
import sys

import loky

print(f"loky: {loky.__version__}")
print(f"Python: {sys.version}")

testvar = 1

def func(i):
    testvar = os.testvar
    return testvar

def init(i):
    os.testvar = i + os.getpid()
    print(os.testvar)

exe = loky.get_reusable_executor(max_workers=2, initializer=init, initargs=(2,))
print(list(exe.map(func, range(4))))
exe.shutdown()

print("-"*30)

with ProcessPoolExecutor(max_workers=2, initializer=init, initargs=(2,)) as exe:
    print(list(exe.map(func, range(4))))

This gives

loky: 2.4.2
Python: 3.7.0+ (heads/3.7:f9ae07441c, Aug 16 2018, 11:02:05) 
[GCC 8.2.0]
21246
21247
[21246, 21246, 21246, 21247]
------------------------------
21250
21251
[21250, 21251, 21250, 21250]

Note that this is "unsafe": it can lead to weird behavior, so it should not be used for code that needs to be reliable.
We are working on providing a better way to do this, but it is the only solution I can come up with quickly.
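A variant of the same trick, sketched here as an assumption rather than a recommended loky pattern, keeps the state on a dedicated module object registered in `sys.modules` instead of attaching attributes to `os`. The module name `_worker_state` and the helper `get_state` are hypothetical names chosen for this example, and the sketch has not been verified against loky itself; the calls at the bottom only exercise it inside a single process.

```python
import os
import sys
import types


def get_state():
    """Return a per-process module object used only to hold worker state."""
    state = sys.modules.get("_worker_state")  # hypothetical module name
    if state is None:
        state = types.ModuleType("_worker_state")
        sys.modules["_worker_state"] = state
    return state


def init(i):
    # Store the value on the state module rather than in a __main__ global.
    get_state().testvar = i + os.getpid()


def func(i):
    # Read the value back from the same per-process state module.
    return get_state().testvar


# In-process check of the two functions (no executor involved).
init(2)
print(func(0) == 2 + os.getpid())  # True
```

The same caveat applies as for the `os` version: this is hidden per-process state, so it is best kept to exploratory notebook code.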

@luerhard
Author

luerhard commented May 1, 2019

Oh wow, there is something new I learned about Python today. Thank you!
