-
-
Notifications
You must be signed in to change notification settings - Fork 30.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Forking in a thread raises RuntimeError #51491
Comments
Python 2.6.4 (r264:75706, Oct 29 2009, 12:00:12) [C] on sunos5 On my sunos5 system if I call os.fork() in a thread created by The problem can be re-produced by running the script attached. I've looked into the source and it seems to me the following: Based on the the change above, it calls fork1() system call between a 3635 » _PyImport_AcquireLock(); _PyImport_ReleaseLock is called in both the child and parent. Digging
In the above code, if I interpret correctly, it compares the result of Based on my results on solaris the 'me' variable will be different in I'm using pthreads on both linux and solaris. On linux the code above is |
Well, there has been no such change between 2.6.3 and 2.6.4. |
Sorry, the working version is not 2.6.3 (I mistyped the version), it's |
This only appears to happen on Solaris. What version of Solaris are you I haven't look closely enough at the code yet, but reinitializing the |
I've tested it only on solaris 8, 32-bit. |
solaris 10 x86, 32-bit, sun-studio 11 is ok (in this case the parent's So it seems it's independent from sun-cc but depends from the |
I've attached a patch which seems to fix this issue. It sets |
Additional info: |
Could you please provide any advise on this bug? |
One relevant difference may be the change to the fork semantics in Solaris 10: fork() now means the same as fork1(). Before, fork() meant the same as forkall(), unless the applications was linked with -lpthread, in which case fork() also meant fork1(). See fork(2). So I'd be curious if a) the test case passes on Solaris 8 if fork1 is being used, and b) whether it passes if you make sure -lpthread is being used. If so, I'd rather fix this issue by changing the linkage for Solaris 8 (or declaring that system as unsupported altogether), instead of fiddling yet again with the fork handling. In Python, posix.fork definitely needs to mean "POSIX fork". If it's something else (i.e. the thread ID changes in Solaris 8 whether fork1 or forkall is used), then the patch is fine in principle. However, I would always reset the thread ID. In your patch, it is not clear why you apply this resetting to Solaris and AIX, but not other systems. |
I compile it with -lpthread. With os.fork1() I have the same results, RuntimeError. I was not able to compile it without pthread because I haven't found any configure options for that. If it's possible I'm happy to try it. In my patch I wanted to reduce the effect on systems where forking in thread is working (eg. linux), that's the reason why I added "(defined (__SVR4) && defined (__sun)". But it just checks for solaris, not the OS version (on solaris 10/intel my demo is working). (btw forking in thread actually happens in a unittest related to BaseHTTPServer, which obviously fails on my platform) |
I think that's inappropriate, please change that. |
Ok, here's the new patch. I've removed the #ifdef-#endif lines. It passed the test thread_test.py on linux (and as well on solaris). |
I verified patch_2.diff on Solaris 10 w/SOS11 and it actually resolves a number of issues I had with Mercurial. |
I verified that patch_2.diff resolves the problem for me on Solaris 9/SPARC. |
The patch should really add an unit test for this. |
There's a bundled unittest in test_httpservers which actually fails if this patch is not applied. See the unittest attached. Unfortunatelly RuntimError is raised in the child process after fork() so I cannot re-raise it to the parent, instead I send a message from the child to the parent via a pipe if the child is ok. |
I've run unittest with both patched and vanilla versions of python 2.6.4 and I confirm that it fails when it's run by vanilla on solaris 8, but passes on the patched version. |
I spent some time working on and testing a unit test as well. It's the same basic idea as Zsolt Cserna's, but with a slightly different approach. See 7242_unittest.diff. My unittest fails pre-patch and succeeds post-patch. However, I still have reservations about the patch. The existing test test_threading.ThreadJoinOnShutdown.test_3_join_in_forked_from_thread hangs with the patch in place. Vanilla 2.6.2 - test passes Note: the code of the test_threading test is identical in all 3 cases. I'd feel more confident about the patch if this test didn't hang with the patch in place. |
Using bpo-7242-gps01.diff on release26-maint and a freshly downloaded opensolaris 2009-06 VM test_thread, test_threading and test_subprocess all pass for me both before -and- after the patch. Nor does the original thread_test.py cause the problem for me (meaning I'm unable to reproduce the problem in this VM in the first place so... someone else needs to) Regardless, I altered patch_2 a bit in that diff to do what I though _PyImport_ReInitLock() should really do. I also added the thread_unittest testcase to that diff. % uname -a Can someone who could reproduce the problem in the first place please test that patch. The logic change makes sense to me. I don't know why test_threading.ThreadJoinOnShutdown.test_3_join_in_forked_from_thread would be changing behavior for you. |
The problem only seems to appear on Solaris 9 and earlier. I'll try to test the updated patch tonight or tomorrow and let you know what I find. |
If you have a chance tonight that'd be awesome. I'd love to get this in before 2.6.5rc1 (being cut tomorrow) but as its platform specific (and a pretty-old platform at that) its not worth holding up the release. |
I tested the updated patch, and the new unit test passes on my Sol 8 sparc, but the test_threading test still hangs on my system. However, given that the test is skipped on several platforms and it does work on more relevant versions of Solaris, it's probably OK. It's possible that an OS bug is causing that particular hang. Plus, the original patch fixed the 'real world' scenario I was running into, so I'd like to see it get into the release candidate if you're OK with it. |
Agreed. Committed to trunk r78527. I'll wait for the buildbots before merging to release26-maint. |
r78543 backports this to release26-maint, it should be in 2.6.5 assuming no show stopper issue comes up with this during the release candidates. |
Just adding some info: On Python 2.6.4, thread_test.py fails with the same RunTime error exception. On Python 2.6.6, it passes and things look good. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: