Allow registering at-fork handlers #60704
I propose the addition of an 'afterfork' module. The module shall fulfill a task similar to the 'atexit' module, except that it handles process forks instead of process shutdown. The 'afterfork' module shall allow libraries to register callbacks that are executed on fork() inside the child process, as soon as possible. Python already has a function that must be called by C code: PyOS_AfterFork(). The 'afterfork' callbacks are called as the last step in PyOS_AfterFork().

Use case example: the tempfile module has a specialized RNG that re-initializes the RNG after fork() by comparing os.getpid() to an instance variable every time the RNG is accessed. The check can be replaced with an afterfork callback.

Open questions:

Implementation:
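For illustration, a sketch of what the tempfile use case could look like with the at-fork API this thread eventually converged on (os.register_at_fork(), available in Python 3.7+); the _rng name is made up and does not reflect tempfile's actual internals:

    import os
    import random

    _rng = random.Random()   # hypothetical module-level RNG

    def _reseed_in_child():
        # Reseed so parent and child do not generate the same "random"
        # temporary file names after the fork.
        _rng.seed()

    if hasattr(os, "register_at_fork"):   # Python 3.7+
        os.register_at_fork(after_in_child=_reseed_in_child)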
pthread_atfork() allows registering three types of callbacks: "prepare" handlers run before the fork, "parent" handlers run in the parent after the fork, and "child" handlers run in the child after the fork.

I think all three should be supported. I also think that a recursive "fork lock" should be introduced which is held during the fork. This can be acquired around critical sections during which forks must not occur. This is more or less a duplicate of bpo-6923. See also bpo-6721.
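A minimal sketch of how such a recursive fork lock could sit on top of the three hooks; the module-level names here are hypothetical, and the registration call shown is the os.register_at_fork() API that appears later in this thread:

    import os
    import threading
    from contextlib import contextmanager

    _fork_lock = threading.RLock()   # hypothetical process-wide fork lock

    @contextmanager
    def fork_guard():
        """Critical section during which a fork() must not occur."""
        with _fork_lock:
            yield

    def _before_fork():
        _fork_lock.acquire()          # "prepare" hook: block concurrent forks

    def _after_fork_parent():
        _fork_lock.release()          # parent side: simply release again

    def _after_fork_child():
        global _fork_lock
        # The child inherits the lock in the acquired state; replacing it
        # with a fresh lock is safer than trying to release the copy.
        _fork_lock = threading.RLock()

    if hasattr(os, "register_at_fork"):   # Python 3.7+
        os.register_at_fork(before=_before_fork,
                            after_in_parent=_after_fork_parent,
                            after_in_child=_after_fork_child)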
Thanks Richard! My first reaction was YAGNI, but after I read the two tickets I now understand the need for three different hooks. I suggest that we implement our own hooks like the pthread_atfork() function (http://linux.die.net/man/3/pthread_atfork), especially the order of function calls: "The parent and child fork handlers shall be called in the order in which they were established by calls to pthread_atfork(). The prepare fork handlers shall be called in the opposite order." I'd like to focus on three hooks + the Python API and leave the usage of the hooks to other developers.

Proposal: I'm not yet sure how to implement the Python API. I could either implement six methods:

    atfork.register_before_fork(callable, *args, **kwargs)
    atfork.register_after_fork_child(callable, *args, **kwargs)
    atfork.register_after_fork_parent(callable, *args, **kwargs)
    atfork.unregister_before_fork(callable)
    atfork.unregister_after_fork_child(callable)
    atfork.unregister_after_fork_parent(callable)

or two:

    atfork.register(prepare=None, parent=None, child=None, *args, **kwargs)
    atfork.unregister(prepare=None, parent=None, child=None)
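As a rough illustration of the intended calling order (not code from any actual patch; the module-level lists are made up), the two-function form could look like:

    import os

    _prepare_hooks = []   # run before fork(), most recently registered first
    _parent_hooks = []    # run in the parent after fork(), registration order
    _child_hooks = []     # run in the child after fork(), registration order

    def register(prepare=None, parent=None, child=None):
        if prepare is not None:
            _prepare_hooks.append(prepare)
        if parent is not None:
            _parent_hooks.append(parent)
        if child is not None:
            _child_hooks.append(child)

    def fork():
        """os.fork() wrapper honoring the pthread_atfork() calling order."""
        for hook in reversed(_prepare_hooks):
            hook()
        pid = os.fork()
        if pid == 0:
            for hook in _child_hooks:
                hook()
        else:
            for hook in _parent_hooks:
                hook()
        return pid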
Note that Gregory P. Smith has written
I also started a pure Python patch but did not get round to posting it. (It also implements the fork lock idea.) I'll attach it here.

How do you intend to handle the propagation of exceptions? I decided that after

    atfork.atfork(prepare1, parent1, child1)
    atfork.atfork(prepare2, parent2, child2)
    ...
    atfork.atfork(prepareN, parentN, childN)

calling "pid = os.fork()" should be equivalent to

    pid = None
    prepareN()
    try:
        ...
            prepare2()
            try:
                prepare1()
                try:
                    pid = posix.fork()
                finally:
                    parent1() if pid != 0 else child1()
            finally:
                parent2() if pid != 0 else child2()
        ...
    finally:
        parentN() if pid != 0 else childN()
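The same unwinding can be written as a loop; a sketch, assuming (as discussed just below) that the parent/child handlers themselves do not raise:

    import os

    def fork_with_hooks(hooks):
        """hooks: list of (prepare, parent, child) triples in registration order."""
        pid = None
        entered = []      # triples whose prepare hook completed successfully
        try:
            for prepare, parent, child in reversed(hooks):
                prepare()
                entered.append((prepare, parent, child))
            pid = os.fork()
        finally:
            # Unwind like the nested finally blocks above: innermost first.
            for prepare, parent, child in reversed(entered):
                parent() if pid != 0 else child()
        return pid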
I would not allow exceptions to propagate. No caller is expecting them.
pthread_atfork() cannot be used to implement this. Another non-python

At best, this can be implemented manually as we do with some of the before
Meh! Exception handling takes all the fun out of the API and is going to make it MUCH more complicated. pthread_atfork() ignores error handling for a good reason. It's going to be hard to get it right. :/

IFF we are going to walk the hard and rocky road of exception handling, then we are going to need at least four hooks and a register function that takes four callables as arguments: register(prepare, error, parent, child). Each prepare() call pushes an error handler onto a stack. In case of an exception in a prepare handler, the error stack is popped until all error handlers are called. This approach allows a prepare handler to actually prevent a fork() call from succeeding. The parent and child hooks are always called no matter what. Exceptions are recorded and a warning is emitted when at least one hook fails. We might raise an exception, but it has to be a special exception that carries information about whether fork() has succeeded, whether the code runs in the child or the parent, and the child's PID. I fear it's going to be *really* hard to get everything right.

Gregory made a good point, too. We can't rely on pthread_atfork() as we are unable to predict how third party code is using fork(): "Take cover, dead locks ahead!" :)

A cooperative design of the C API with three functions is my preferred way, too. PyOS_AfterForkParent() should take an argument to signal a failed fork() call.
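A sketch of what that four-hook registration might look like in pure Python (purely illustrative; none of these names exist anywhere):

    import os
    import warnings

    _hooks = []   # (prepare, error, parent, child) in registration order

    def register(prepare, error, parent, child):
        _hooks.append((prepare, error, parent, child))

    def fork():
        error_stack = []
        # Prepare hooks run in reverse registration order; each success
        # pushes its error handler.
        for prepare, error, parent, child in reversed(_hooks):
            try:
                prepare()
            except Exception:
                # Pop and run the error handlers pushed so far, then
                # re-raise: the failed prepare hook prevents the fork().
                while error_stack:
                    error_stack.pop()()
                raise
            error_stack.append(error)
        pid = os.fork()
        # Parent and child hooks always run, no matter what; failures are
        # only recorded and reported as a warning.
        failed = []
        for prepare, error, parent, child in _hooks:
            hook = parent if pid != 0 else child
            try:
                hook()
            except Exception as exc:
                failed.append(exc)
        if failed:
            warnings.warn("%d at-fork hook(s) failed" % len(failed),
                          RuntimeWarning)
        return pid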
FWIW, PyPy already has a notion of fork hooks, used by various subsystems (threads cleanup, import lock, threading.local...). You may want to experiment from there :-)
I think there are two main options if a prepare callback fails:

1. Let the exception propagate, preventing the fork from happening.
2. Just print the exception and carry on with the fork.

I favour option 1 since, if they want, users can always wrap their prepare callbacks with try: ... except: pass. With option 1 I don't see why error callbacks are necessary. Just unwind the stack of imaginary try...finally... clauses and let any exceptions propagate out, using exception chaining if necessary. This is what pure-python-atfork.patch does. Note, however, that if the fork succeeds then any subsequent exception is only printed.
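For example, a library that prefers pthread_atfork()-style "errors are ignored" behaviour for its own callback could wrap it itself (a hypothetical helper, not part of any proposed API):

    import functools

    def ignore_errors(callback):
        """Wrap a prepare callback so its failure never prevents the fork."""
        @functools.wraps(callback)
        def wrapper(*args, **kwargs):
            try:
                return callback(*args, **kwargs)
            except Exception:
                pass
        return wrapper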
Amaury:

Richard: I can think of six exception scenarios that must be handled:

(1) exception in a prepare hook -> don't call the remaining prepare hooks, run all related parent hooks in FILO order, prevent the fork() call

Do you agree?
Probably because nobody thought about it.
I think you are solving a non-problem if you want to expose exceptions from
Agreed.
Your suggestion is that the hooks are called as:

    for hook in hooks:
        try:
            hook()
        except:
            try:
                sys.excepthook(*sys.exc_info())
            except:
                pass

That makes the implementation much easier. :)

"The tempfile module has a specialized RNG that re-initializes the RNG after fork() by comparing os.getpid() to an instance variable every time the RNG is accessed. The check can be replaced with an afterfork callback."

By the way, OpenSSL expects that its PRNG is reseeded somehow (by calling RNG_add) after a fork. I wrote a patch for OpenSSL, but I don't remember if I sent it to OpenSSL. Reseeding the tempfile PRNG is useless (but spends CPU/memory, or hangs until we have enough entropy?) if tempfile is not used after the fork. I like the current approach. -- I'm not saying that a new atfork module would not help, just that the specific case of tempfile should be discussed :-) I like the idea of a generic module to call code after fork.
Might make sense to put this in atexit.atfork() to avoid small-module inflation? |
It sounds strange to mix "at exit" and "at fork" in the same module.
Apparently not, and according to this thread, they don't think http://openssl.6102.n7.nabble.com/recycled-pids-causes-PRNG-to-repeat-td41669.html

Note that you don't have to reseed the RNG, just make sure that the
That's true. The sys module would probably be the right place for both functionalities.
Richard, do you have time to get your patch ready for 3.4?
Yes. But we don't seem to have consensus on how to handle exceptions. The main question is whether a failed prepare callback should prevent the fork from happening, or just be printed.
Yes, I think an exception should prevent the fork from happening. It's fail-safe for the PRNG case (you can guarantee that a fork won't occur without properly re-seeding a PRNG), and it makes bugs easier to catch in unit testing.
+1 for exception prevents fork
I have a couple of random remarks:
API-wise, I went for the minimal route. This avoids any discussion of adding a separate module for a tiny functionality that is only going to be used by a couple of libraries (and probably no application code). Comparisons with atexit are not really relevant, IMO, since the use cases are very different. As for passing explicit arguments to the callable, people can use a lambda or functools.partial. I don't want to complicate the C implementation with matters that are not really important.
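For instance, arguments can be bound up front with functools.partial instead of being passed to the registration call; a small sketch using the keyword-only os.register_at_fork() form shown later in the thread (the callback and pool names are made up):

    import functools
    import os

    def drop_connections(pool):
        pool.clear()

    _connection_pool = {}   # hypothetical per-process state

    # Bind the argument up front instead of register(callback, *args):
    os.register_at_fork(
        after_in_child=functools.partial(drop_connections, _connection_pool))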
I agree that with lambdas and functools.partial the support for arguments is not needed. Unless someone has good reasons for supporting explicit passing of arguments, I'm fine with your design.

If user code manually calls fork() and the (now deprecated) PyOS_AfterFork(), it runs the "child" functions, but not the "before" and "parent" functions. Is it worth emitting a runtime warning if the "before" or "parent" lists are not empty? Or even immediately exiting the child with an error?
I don't think exiting would be a good idea at all. I'm not sure about emitting a warning: the problem is that the people seeing the warning (plain users) can't do anything about it; worse, they probably will not even understand what it is about, and will get confused. The package maintainer should see a deprecation warning when compiling code using PyOS_AfterFork() (thanks to the Py_DEPRECATED attribute). Of course, I'll also update the docs to add a deprecation once the design is settled.
PyOS_AfterFork() is called by one piece of code, but the "before" and "parent" handlers are registered by other code (likely code of another library). The author of the program that uses both libraries (the one that uses PyOS_AfterFork() and the one that registers handlers) can notice the warning and report the bug in the first library. I think silently skipping registered handlers would be worse.

Let's allow the user to control the behavior by setting the warnings filter: if the corresponding warnings are ignored, nothing happens; if they are errors, the child is exited; otherwise the warning message is output on stderr as for other warnings.
Right now PyOS_AfterFork() doesn't return an error code. It is not obvious how the caller would react: simply print the error? raise a fatal error? Something else? The only third-party use of PyOS_AfterFork() I found is in uwsgi (and I'm not sure they're using it correctly, since I don't know if the parent process is using Python at all...).
Could you please update the documentation, Antoine?
Can threading._after_fork() be rewritten using the new API?
It should be possible indeed. Let me see.
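A sketch of what such a rewrite could amount to (whether CPython ended up doing exactly this is not shown here):

    import os
    import threading

    # Instead of the interpreter calling threading._after_fork() explicitly
    # from C, the threading module could register the hook itself at import
    # time.
    if hasattr(os, "register_at_fork"):
        os.register_at_fork(after_in_child=threading._after_fork)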
Can multiprocessing.util.register_after_fork() be rewritten using the new API?
It wouldn't benefit much from it, and there might be a timing issue given the comments in BaseProcess._bootstrap():

    old_process = _current_process
    _current_process = self
    try:
        util._finalizer_registry.clear()
        util._run_after_forkers()
    finally:
        # delay finalization of the old process object until after
        # _run_after_forkers() is executed
        del old_process
In PR 1834 Gregory proposes an alternate API:

    os.register_at_fork(*, before=None, after_in_parent=None, after_in_child=None)

Maybe open a new issue for this?
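A quick usage sketch of that keyword-only signature (the handler names are made up):

    import os

    def grab_locks():
        print("before fork, in the parent")

    def release_locks():
        print("after fork, in the parent")

    def reinit_state():
        print("after fork, in the child")

    # Any subset of the three keyword arguments may be given.
    os.register_at_fork(before=grab_locks,
                        after_in_parent=release_locks,
                        after_in_child=reinit_state)

    pid = os.fork()
    if pid == 0:
        os._exit(0)       # child
    os.waitpid(pid, 0)    # parent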
I'll just reuse this issue; I started that work just to try and clean up the new API and docs added in this one to be more modern. Thanks for the code review, good suggestions!
One of the potential advantages of the API proposed by Gregory is that os.register_at_fork() can be made atomic: either register all callbacks, or do nothing in case of error. But the currently proposed implementation is not atomic: if resizing one of the lists fails due to MemoryError (or maybe KeyboardInterrupt), some callbacks may already have been registered. Is it worth guaranteeing the atomicity of os.register_at_fork()?
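A pure-Python sketch of what an atomic variant could look like (the module-level lists are hypothetical stand-ins for the interpreter's internal ones): build the new lists completely before publishing any of them.

    _before = []
    _after_in_parent = []
    _after_in_child = []

    def register_at_fork(*, before=None, after_in_parent=None, after_in_child=None):
        global _before, _after_in_parent, _after_in_child
        # Any MemoryError raised while building the new lists leaves the
        # currently registered callbacks untouched.
        new_before = _before + [before] if before is not None else _before
        new_parent = (_after_in_parent + [after_in_parent]
                      if after_in_parent is not None else _after_in_parent)
        new_child = (_after_in_child + [after_in_child]
                     if after_in_child is not None else _after_in_child)
        _before, _after_in_parent, _after_in_child = new_before, new_parent, new_child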
I don't think so, honestly.
Since there was a disagreement, FYI I also prefer this API. Just the

Anyway, it's nice to see this very old issue finally fixed! It was

I also prefer to have a builtin registered callback in the random
This may also allow us to do something reasonable for http://bugs.python.org/issue6721 as well.