Python crashes on macOS after fork with no exec #77906
Comments
This issue seems to have been reported a few times on various GitHub projects. I've also reproduced it using a brew install of Python 2.7.15. I haven't been able to reproduce it with Python 3.6. Note that this requires a framework build of Python. Background on the underlying cause: a change in macOS High Sierra. The workaround seems to be setting the environment variable OBJC_DISABLE_INITIALIZE_FORK_SAFETY prior to executing Python. Other reports: https://bugs.python.org/issue30837 |
A better solution is to avoid using fork mode for multiprocessing. The spawn and forkserver modes should work fine. The underlying problem is that macOS system frameworks (basically anything higher level than libc) are not safe with respect to fork(2), and fixing that appears to have no priority at all at Apple. |
(As a side note, the macOS Pythons provided by python.org installers should not behave differently on macOS 10.13 High Sierra since none of them are built with a 10.13 SDK.) |
I understand that Apple, with their limited resources, cannot spend expensive engineer manpower on improving POSIX support in macOS </snark>. In any case, I'm unsure this bug can be fixed at the Python level. If macOS APIs don't like fork(), they don't like fork(), full stop. As Ronald says, on 3.x you should use "forkserver" (for multiple reasons, not only this issue). On 2.7 you're stuck dealing with the issue by yourself. |
Antoine, the issue is not necessarily related to POSIX compliance; AFAIK strictly POSIX-compliant code should work just fine. The problem is in higher-level APIs (CoreFoundation, Foundation, AppKit, ...), and appears to be related to using multi-threading in those libraries without spending effort on pre/post fork handlers to ensure that new processes are in a sane state after fork(). In older macOS versions this could result in hard-to-debug issues; in newer versions the APIs seem to guard against this by aborting when they detect that the pid changed. Anyway... I agree that we shouldn't try to work around this in CPython; there are bound to be more problems that are hidden by the proposed workaround. --- <http://www.sealiesoftware.com/blog/archive/2017/6/5/Objective-C_and_fork_in_macOS_1013.html> describes what the environment variable does. It "just" changes the behavior of the ObjC runtime, and doesn't make using macOS system frameworks after a fork any safer. |
@ned: In the long run the macOS installers should be built using the latest SDK, primarily to get full API coverage and access to all system APIs. AFAIK building using the macOS 10.9 SDK still excludes a number of libSystem APIs that would be made available through the posix module when building with a newer SDK. That's something that would require some effort though to ensure that the resulting binary still works on older versions of macOS (basically similar to the work I've done in the past to weak link some other symbols in the posix module). |
(Note: this is not particularly relevant to the issue here.) Ronald:
I agree that being able to build with the latest SDK would be nice but it's also true it would require effort on our part, both one-time and ongoing, at least for every new macOS SDK release and update to test with each older system. It would also require that the third-party libraries we build for an installer also behave correctly. And to make full use of it, third-party Python packages with extension modules would also need to behave correctly. I see one of the primary use cases for the python.org macOS installers as being for Python app developers who want to provide apps that run on a range of macOS releases. It seems to me that the safest and simplest way to guarantee that python.org macOS Pythons fulfill that need is to continue to always build them on the oldest supported system. Yes, that means that users may miss out on a few features only supported on the more recent macOS releases but I think that's the right trade-off until we have the resources to truly investigate and decide to support weak linking from current systems. |
bpo-35219 is where I've run into this problem. I'm still trying to figure out all the details in my own case, but I can confirm that setting the environment variable does not always help. |
Hoo boy. I'm not sure I have the full picture, but things are starting to come into focus. After much debugging, I've narrowed down at least one crash to urllib.request.getproxies(). On macOS (darwin), this ends up calling _scproxy.get_proxies(), which calls into the SystemConfiguration framework. I'll bet dollars to donuts that that calls into the ObjC runtime. Thus it is unsafe to call between fork and exec. This certainly seems to be the case even if the environment variable is set. The problem is that I think requests.post() probably also ends up in here somehow (still untraced), because by removing our call to urllib.request.getproxies(), we just crash later on when requests.post() is called. I don't know what, if anything, can be done in Python, except perhaps to document that anything that calls into the ObjC runtime between fork and exec can potentially crash the subprocess. |
A few other things I don't understand:
|
FWIW, I suspect that setting the environment variable only helps if it's done before the process starts. You cannot set it before the fork and have it affect the child. |
Barry's effort as well as comments in other links seem to all suggest that OBJC_DISABLE_INITIALIZE_FORK_SAFETY is not comprehensive in its ability to make other threads "safe" before forking. "Objective-C classes defined by the OS frameworks remain fork-unsafe" (from @kapilt's first link) suggests we furthermore remain at risk using certain MacOS system libraries prior to any call to fork. "To guarantee that forking is safe, the application must not be running any threads at the point of fork" (from @kapilt's second link) is an old truth that we continue to fight with even when we know very well that it's the truth. For newly developed code, we have the alternative to employ spawn instead of fork to avoid these problems in Python, C, Ruby, etc. For existing legacy code that employed fork and now surprises us by failing-fast on MacOS 10.13 and 10.14, it seems we are forced to face a technical debt incurred back when the choice was first made to spin up threads and afterwards to use fork. If we didn't already have an "obvious" (zen of Python) way to avoid such problems with spawn versus fork, I would feel this was something to solve in Python. As to helping the poor unfortunate souls who must fight the good fight with legacy code, I am not sure what to do to help though I would like to be able to help. |
Legacy code is easy to migrate as long as it uses Python 3. Just call mp.set_start_method('forkserver') at the top of your code and you're done. Some use cases may fail (if sharing non-picklable types), but they're probably not very common. |
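A minimal sketch of that migration, assuming Python 3 on a POSIX system. The helper names `choose_start_method` and `parallel_factorials` are hypothetical and only used for illustration; `math.factorial` stands in for any picklable, importable worker function.

```python
import math
import multiprocessing as mp


def choose_start_method(preferred="forkserver"):
    """Return `preferred` if this platform supports it, else fall back to 'spawn'."""
    return preferred if preferred in mp.get_all_start_methods() else "spawn"


def parallel_factorials(values):
    # math.factorial is importable by name, so it can be pickled and sent
    # to workers that were not created by a plain fork().
    with mp.Pool(processes=2) as pool:
        return pool.map(math.factorial, values)


if __name__ == "__main__":
    # Select the start method once, before any pools or processes are created.
    mp.set_start_method(choose_start_method())
    print(parallel_factorials([1, 2, 3, 4]))
```

The `if __name__ == "__main__":` guard matters here: with "spawn" and "forkserver" the workers may re-import the main module, so process creation must not run at import time.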
_scproxy has been known to be problematic for some time, see for instance bpo-31818. That issue also gives a simple workaround: setting urllib's "no_proxy" environment variable to "*" will prevent the calls to the System Configuration framework. |
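The `no_proxy` workaround could look roughly like the sketch below. It assumes the variable is set before anything calls `urllib.request.getproxies()`; the helper name `proxies_bypassed` and the host `example.org` are illustrative only.

```python
import os
import urllib.request

# Set this early, before any library looks up proxy settings. With a
# non-empty environment result, urllib uses it instead of consulting
# the System Configuration framework on macOS.
os.environ["no_proxy"] = "*"


def proxies_bypassed(host="example.org"):
    """Return True if urllib would bypass proxies for `host`."""
    return bool(urllib.request.proxy_bypass(host))
```

Note that this also disables any proxies the application legitimately needs, so it only suits environments where direct connections are acceptable.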
Given the original post mentioned 2.7.15, I wonder if it is feasible to fork near the beginning of execution, then maintain and pass around a multiprocessing.Pool to be used when needed instead of dynamically forking? Working with legacy code is almost always more interesting than you want it to be. |
On Nov 14, 2018, at 10:11, Davin Potts <report@bugs.python.org> wrote:
Right. Setting the env var will definitely not make it thread safe. My understanding (please correct me if I’m wrong!) isn’t that this env var makes it safe, just that it prevents the ObjC runtime from core dumping. So it’s still up to the developer to know whether threads are involved or not. In our cases, these are single threaded applications. I’ve read elsewhere that ObjC doesn’t care if threads have actually been spun up or not.
Actually, it’s unsafe to call anything between fork and exec. Note that this doesn’t just affect Python; this is a pretty common idiom in other scripting languages too, from what I can tell. It’s certainly very common in Python. Note too that urllib.request.getproxies() will end up calling into the ObjC runtime via _scproxy, so you can’t even use requests after a fork but before exec. What I am still experimenting with is to see if I can define a pthread_atfork handler that will initialize the ObjC runtime before fork is actually called. I saw a Ruby approach like this, but it’s made more difficult in Python because pthread_atfork isn’t exposed to Python. I’m trying to see if I can implement it in ctypes, before I write an extension.
True, but do realize this problem affects you even in single threaded applications.
It’s tech debt you incur even if you don’t spin up threads. Just fork and do some work in the child before calling exec. If that work enters the ObjC runtime (as in the getproxies example), your child will coredump.
*If* we can provide a hook to initialize the ObjC runtime in pthread_atfork, I think that’s something we could expose in Python. Then we can say legacy code can just invoke that, and at least you will avoid the worst outcome. |
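For reference, Python 3.7 and later provide a Python-level analogue of `pthread_atfork` in `os.register_at_fork()`. The sketch below only shows where such hooks run; it is not the ctypes approach described above, and nothing in it initializes the ObjC runtime or makes system frameworks fork-safe. The names `events` and `fork_and_wait` are hypothetical.

```python
import os

events = []


def _before_fork():
    # Runs in the parent just before os.fork() creates the child.
    events.append("before")


def _after_in_parent():
    # Runs in the parent right after the fork returns.
    events.append("parent")


os.register_at_fork(before=_before_fork, after_in_parent=_after_in_parent)


def fork_and_wait():
    """Fork a child that exits immediately and return the recorded hook events."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: leave at once without doing any work
    os.waitpid(pid, 0)
    return list(events)
```

These hooks fire only for forks where control returns to the Python interpreter, such as `os.fork()`; a typical `subprocess` launch does not trigger them.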
I have a reliable way to call *something* in the pthread_atfork prepare handler, but I honestly don't know what to call to prevent the crash. In the Ruby thread, it seemed to say that you could just dlopen /System/Library/Frameworks/Foundation.framework/Foundation but that does not work for me. Neither does also loading the CoreFoundation and SystemConfiguration frameworks. If anybody has something that will reliably initialize the runtime, I can post my approach (there are a few subtleties). Short of that, I think there's nothing that can be done except ensure that exec is called right after fork. |
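A minimal sketch of "exec right after fork", assuming a POSIX system. The function name `run_child` is hypothetical; the point is that the child does no work between `os.fork()` and `os.execv()`.

```python
import os
import sys


def run_child(argv):
    """Fork, exec argv immediately in the child, and return its exit status."""
    pid = os.fork()
    if pid == 0:
        # Child: replace the process image straight away. No proxy lookups,
        # no logging, no imports between fork and exec.
        try:
            os.execv(argv[0], argv)
        finally:
            # Only reached if exec itself failed.
            os._exit(127)
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)


if __name__ == "__main__":
    print(run_child([sys.executable, "-c", "print('hello from the child')"]))
```

In practice `subprocess.run()` follows the same fork-then-exec pattern and is usually the simpler choice.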
AFAIK there is nothing you can do after calling fork(2) to "reinitialise" the ObjC runtime. And I don't think that's the issue anyway: I suspect that the actual problem is that Apple's system frameworks use multithreading (in particular libdispatch) and don't have code to ensure a sane state after calling fork. In Python 3 there is another workaround to avoid problems using multiprocessing: use multiprocessing.set_start_method() to switch away from the "fork" startup handler to "spawn" or "forkserver" (the latter only when calling set_start_method before calling any code that might call into Apple system frameworks). |
Since it looks like multiprocessing_fork is not going to be fixable for macOS, the main issue remaining is how to help users avoid this trap (literally). Should we add a check and issue a warning or error at run time? Or is a doc change sufficient? In the meantime, I've merged changes to disable running test_multiprocessing_fork, which will sometimes (but not always) segfault on 10.14 Mojave. I should apologize to Barry and others who have run into this. I did notice the occasional segfault when testing with Mojave just prior to its release but it wasn't always reproducible and I didn't follow up on it. Now that the change in 10.14 behavior makes this existing problem with fork without exec more obvious, it's clear that the test segfaults are another manifestation of this. |
Do we really need to disable the running of test_multiprocessing_fork entirely on MacOS? My understanding so far is that not *all* of the system libraries on the mac are spinning up threads and so we should expect that there are situations where fork alone may be permissible, but of course we don't yet know what those are. Pragmatically speaking, I have not yet seen a report of test_multiprocessing_fork tests triggering this problem but I would like to see/hear that when it is observed (that's my pitch for leaving the tests enabled). |
@ned.deily: Apologies, I misread what you wrote -- I would like to see the random segfaults that you were seeing on Mojave if you can still point me to a few. |
We run pytest with `--forked` in nixpkgs, to reduce side effects that can occur when multiple tests mutate their environment in incompatible ways. Forking on macOS 10.13 and later is unsafe when an application does work between calls to fork() and its follow-up exec(). This may lead to crashes when calls into the Objective-C runtime are issued, which will in turn coredump the Python interpreter. One good reproducer for this scenario is when the urllib module tries to look up proxy configurations in `urllib.request.getproxies()` through `get_proxies_macos_sysconf` into the native `_scproxy` module. This is a class of issues that is of course not limited to the urllib module. The general recommendation is to use `spawn` instead of `fork`, but we don't have any influence on upstream developers to do one or the other. One often cited workaround would be to disable fork safety entirely on calls to `initialize()`, which is probably a better solution than running without multithreading (slow) or without the `--forked` (prone to side effects) mode. This currently happens on aarch64-darwin only, where we use the more recent 11.0 SDK version, while x86_64-darwin has been stuck on 10.12 for a while now. python/cpython#77906 (comment) http://www.sealiesoftware.com/blog/archive/2017/6/5/Objective-C_and_fork_in_macOS_1013.html Closes: NixOS#194290
In Python multiprocessing, start method name is platform dependent [1]. For macOS, it is "spawn" for reasons listed in [2], vs. "fork" for Linux. [1] https://docs.python.org/3/library/multiprocessing.html#multiprocessing.get_start_method [2] python/cpython#77906
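To see which start method a given interpreter defaults to, a short sketch like the following may help. It relies on the documented behavior that the first entry of `multiprocessing.get_all_start_methods()` is the default; the helper names are hypothetical. The result depends on both platform and Python version: macOS has defaulted to "spawn" since Python 3.8, and other POSIX platforms moved from "fork" to "forkserver" in Python 3.14.

```python
import multiprocessing as mp
import sys


def default_start_method():
    """Return the platform default without fixing the global start method."""
    # The first supported method listed is the default for this platform.
    return mp.get_all_start_methods()[0]


def uses_fork_by_default():
    """True if plain multiprocessing calls would fork without exec here."""
    return default_start_method() == "fork"


if __name__ == "__main__":
    print(sys.platform, default_start_method(), uses_fork_by_default())
```

Unlike `mp.get_start_method()`, this check does not lock in the default as a side effect, so a later `mp.set_start_method()` call still works.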