-
-
Notifications
You must be signed in to change notification settings - Fork 30.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
socketserver.ForkingMixIn leaves zombies, also fails to reap all zombies in one pass #55318
Comments
This is the same issue as was reported here: http://bugs.python.org/issue1954. It is still a problem in Python 3.1. I'm writing a server that will be receiving a massive number of requests and I'd like to eliminate the zombie problem. Once I figured out what was going on, I tried adding a call to collect_children() in the serve_forever() loop. It worked very well. I've included a patch of what I did, however, I obviously can't leave this change in my socketserver.py because we will be deploying this on a lot of servers. Is there any reason not to collect_children() in the serve_forever() loop? It seems like the place to do it to me. The patch will only collect children if there are any, so it doesn't have to call it every time through the loop. |
I guess I didn't really explain the issue much. The problem is that if the server receives say, 10 requests at once, and then forks a process for each, after those processes finish they will sit as zombies until process_request() is called again, which calls collect_children(). So that patch simply moves the call to collect_children() into the serve_forever() loop so that it doesn't leave zombies sitting around. |
Rather than depending on the internal details of ForkingMixIn in your BaseServer.serve_forever modification I'd prefer to see that simply call self._cleanup() Define a do-nothing _periodic_cleanup method in BaseServer. ForkingMixIn should implement its own _periodic_cleanup method that does the active_children test and calls collect_children as appropriate. |
Good point. I was just writing up something quick that works. Here's another patch, is this acceptable? |
I believe that is good. I'll commit it after the 3.2 release has been cut (we're in release candidate release blocker only lockdown right now). Looking at ForkingMixIn.collect_children() there appears to be another buglet: it loops over self.active_children and calls self.active_children.remove(pid). This modification of the list while looping over it will cause it to skip the next item in the list. For every child waited on successfully, it skips checking one of the others. |
I noticed that ForkingMixIn also was overriding handle_timeout() trying to cleanup zombies after 300 seconds of inactivity, which is useless on a busy server. I'm replacing the patch with one that also removes handle_timeout(). |
I hope I did that last patch right. I did a 'diff -u' instead of a 'diff -c'. If you need something different, let me know. |
Sorry I keep plaguing this with comments and files, but I got to thinking, anyone should be able to override _loop_actions() and implement what they need in the loop. While it's probably a bad idea in most cases, there may be legitimate needs and it's always good to allow the flexibility. So, I'm adding one more patch that changes the name to loop_actions() and adds it to the list of methods that can be overridden in the BaseServer docstring. If you like it, keep it, if not, just use the last one. |
Just following up on this. Now that 3.2 is out, has the patch been committed? |
not yet, thanks for the reminder. if any other committers feel like jumping on this and doing it before I get around to it, feel free. |
Don't mean to nag. Just checking to see if anyone has taken it upon themselves to commit this yet since it's been a couple more months :P |
Justin, The patch and logic is okay. We can have this is 3.3.
A suggestion for better method name is a must! :) Thanks! |
New changeset e2363b1c4bca by Senthil Kumaran in branch 'default': New changeset 3e3cd0ed82bb by Senthil Kumaran in branch 'default': |
The feature is in 3.3. I did not remove the handle_timeout method from the Mixin class, because it might be used in the production by the existing servers. It is not appropriate to remove methods all of sudden without deprecation warnings and also it is not required to remove in this case. Added the Documentation and News entry too. |
This is fixed in 3.3 now. Keeping it open for test_socketserver update. After a ForkingServer service call, it should be asserted the collect_children routine is run. |
Please try to format NEWS entries using "Issue #xxx: xxx" instead of "Issue #xxx - xxx". |
Victor - Sure, I understand Issue #xxx: desc must be useful while generation reST docs. |
Sorry, I haven't had a chance to look at this in a couple days. I've been very busy with work. I'm not sure exactly how to write the test for this, so I don't know that I'd be much help there. One last thing, I was just looking through the commits and I noticed in the NEWS update (http://hg.python.org/cpython/rev/3e3cd0ed82bb/) you have my name as Justin Wark. The last name is actually Warkentin, I just didn't have anything showing that. That's my fault, sorry. I just updated my profile to show my full name. |
New changeset 991c24b8969d by R David Murray in branch '3.3': New changeset 1234300bc056 by R David Murray in branch 'default': |
I've created bpo-45935 for the missing test and will close this as resolved in version 3.3. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: