Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pebble.concurrent.process can not be used dynamically #80

Closed
wimglenn opened this issue Feb 10, 2021 · 2 comments
Closed

pebble.concurrent.process can not be used dynamically #80

wimglenn opened this issue Feb 10, 2021 · 2 comments
Labels

Comments

@wimglenn
Copy link
Contributor

wimglenn commented Feb 10, 2021

Reproducer:

import pebble.concurrent

def hello(who):
    print(f"Hello {who}!")

if __name__ == "__main__":
    hello = pebble.concurrent.process(hello)
    future = hello("world")
    future.result()

This errors out with a traceback like:

pebble.common.RemoteTraceback: Traceback (most recent call last):
  File "/private/tmp/.venv/lib/python3.9/site-packages/pebble/concurrent/process.py", line 189, in _function_lookup
    return _registered_functions[name]
KeyError: 'hello'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/private/tmp/.venv/lib/python3.9/site-packages/pebble/common.py", line 174, in process_execute
    return function(*args, **kwargs)
  File "/private/tmp/.venv/lib/python3.9/site-packages/pebble/concurrent/process.py", line 178, in _trampoline
    function = _function_lookup(name, module)
  File "/private/tmp/.venv/lib/python3.9/site-packages/pebble/concurrent/process.py", line 195, in _function_lookup
    return _registered_functions[name]
KeyError: 'hello'


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/private/tmp/eg.py", line 10, in <module>
    future.result()
  File "/usr/local/Cellar/python@3.9/3.9.1_6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/concurrent/futures/_base.py", line 440, in result
    return self.__get_result()
  File "/usr/local/Cellar/python@3.9/3.9.1_6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
KeyError: 'hello'

The problem goes away if the hello = pebble.concurrent.process(hello) line is moved outside of the __name__ == "__main__" guard (or if the decorator form is used).

It seems the problem is that the handling in

except KeyError: # force function registering
__import__(module)
mod = sys.modules[module]
getattr(mod, name)
return _registered_functions[name]
assumes the function is decorated eagerly at import time, but that's not generally the case. Another manifestation of the same issue is that the decorator won't be working from an inner scope:

import pebble.concurrent

def outer():

    @pebble.concurrent.process
    def inner():
        pass

    future = inner()
    future.result()   # KeyError: 'outer.<locals>.inner'

if __name__ == "__main__":
    outer()

The reason I want to apply pebble.concurrent.process dynamically is that it allows passing daemon=False, which is a feature the process pool doesn't seem to offer (described more in #76). And the reason I don't have a simple callable that I can decorate with @pebble.concurrent.process at the module level scope is that my callables are actually dynamically loaded entrypoints.

The one-line fix in wimglenn@19bcc94 works for my use-case, but it makes a bunch of the tests fail in ways I couldn't understand, so it's probably not as simple as that.

I'm not sure what a fix here might look like, perhaps even just documenting the restricted situations in which these decorators can be used. There is an easy workaround: just defining a new module level wrapper function which accepts the dynamically generated callable and calls it, and then putting the decorator on that function instead.

noxdafox added a commit that referenced this issue Mar 4, 2021
Signed-off-by: Matteo Cafasso <noxdafox@gmail.com>
noxdafox added a commit that referenced this issue Mar 4, 2021
Signed-off-by: Matteo Cafasso <noxdafox@gmail.com>
@noxdafox
Copy link
Owner

noxdafox commented Mar 5, 2021

This is a fairly nasty can of worms.

To better understand what is happening under the hood, we need to understand how multiprocessing works across multiple platforms.

Currently, multiprocessing supports 3 process creation methods of which 2 are of interest to you (as you said in #76 to have a mac).

Fork

Posix standard process creation consists in cloning the parent process into a child which is the exact copy but with a separate memory address space.

In our context, this implies that when we run our decorated function, we simply pass the reference to it and everything works fine. In fact, both your examples above work on Linux and on MAC prior to Python 3.8.

Spawn

NT operating systems do not support the fork primitive and Apple has declared it unsafe to use on its platform (33725, Ruby take on that).

When a new process is started, it begins as a blank one. A brand new Python interpreter is launched and the target function module is then loaded. As the function reference cannot be inherited, its name is passed instead. The interpreter looks up the name within the module and runs the function.

This is where things get complicated. As we decorate the function, when the child invokes it, it actually runs the decorator anew and the cycle of life continues (if daemon=False) until the computer runs out of memory.

This is why we use a trampoline.

The parent actually calls a trampoline function and not the decorated one. When the child process starts and imports the module, the decorator stores the real function reference into a dict within the child memory. When the trampoline is finally executed, it looks up for the right function in the dict and executes it instead of the decorated one.

This works but has certain drawbacks. Among those, the fact that we cannot support inner scoping with the spawn start method. Reason for this is pretty obvious. As the child process did not get a chance to run the outer scope, our trampoline logic will not work.

Regarding the first issue instead (the dynamic use of the decorator) this is something I did not take into account and has been fixed.

I will update the documentation to inform regarding this known limitation.

noxdafox added a commit that referenced this issue Mar 6, 2021
Signed-off-by: Matteo Cafasso <noxdafox@gmail.com>
@noxdafox noxdafox added the bug label Mar 6, 2021
@noxdafox
Copy link
Owner

noxdafox commented Mar 6, 2021

Fix released in 4.6.1.

@noxdafox noxdafox closed this as completed Mar 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants