os.popen in gunicorn ggevent worker class raises "IOError: [Errno 10] No child processes" #696

Closed
frostyplanet opened this issue Dec 6, 2015 · 3 comments

Comments

@frostyplanet

This issue was introduced in v1.1b4 (it is not present in v1.1b3).

How to reproduce:

  1. Write a Flask app named test_dll2.py:

#!/usr/bin/env python
# coding:utf-8

from flask import Flask
app = Flask("test")

from ctypes.util import find_library
print find_library("X11")

@app.route("/")
def foo():
    return "test"

  2. Run test_dll2.py with gunicorn 19.4.1 and the gevent worker:

gunicorn -k gevent test_dll2:app

[2015-12-06 14:56:39 +0000] [726] [INFO] Starting gunicorn 19.4.1
[2015-12-06 14:56:39 +0000] [726] [INFO] Listening at: http://127.0.0.1:8000 (726)
[2015-12-06 14:56:39 +0000] [726] [INFO] Using worker: gevent
[2015-12-06 14:56:39 +0000] [731] [INFO] Booting worker with pid: 731
[2015-12-06 14:56:39 +0000] [731] [ERROR] Exception in worker process:
Traceback (most recent call last):
  File "/root/gunicorn/gunicorn/arbiter.py", line 515, in spawn_worker
    worker.init_process()
  File "/root/gunicorn/gunicorn/workers/ggevent.py", line 201, in init_process
    super(GeventWorker, self).init_process()
  File "/root/gunicorn/gunicorn/workers/base.py", line 122, in init_process
    self.load_wsgi()
  File "/root/gunicorn/gunicorn/workers/base.py", line 130, in load_wsgi
    self.wsgi = self.app.wsgi()
  File "/root/gunicorn/gunicorn/app/base.py", line 67, in wsgi
    self.callable = self.load()
  File "/root/gunicorn/gunicorn/app/wsgiapp.py", line 65, in load
    return self.load_wsgiapp()
  File "/root/gunicorn/gunicorn/app/wsgiapp.py", line 52, in load_wsgiapp
    return util.import_app(self.app_uri)
  File "/root/gunicorn/gunicorn/util.py", line 354, in import_app
    __import__(module)
  File "/root/gevent/gevent/builtins.py", line 53, in __import__
    result = _import(*args, **kwargs)
  File "/root/test_dll2.py", line 8, in <module>
    print find_library("X11")
  File "/usr/lib64/python2.7/ctypes/util.py", line 244, in find_library
    return _findSoname_ldconfig(name) or _get_soname(_findLib_gcc(name))
  File "/usr/lib64/python2.7/ctypes/util.py", line 237, in _findSoname_ldconfig
    f.close()
IOError: [Errno 10] No child processes

[2015-12-06 14:56:39 +0000] [731] [INFO] Worker exiting (pid: 731)
[2015-12-06 14:56:39 +0000] [726] [INFO] Shutting down: Master
[2015-12-06 14:56:39 +0000] [726] [INFO] Reason: Worker failed to boot.
@jamadden
Member

jamadden commented Dec 6, 2015

Can you reproduce this without any third-party dependencies? Based on your description, I would expect to be able to reproduce this with a simple use of os.popen or ctypes.util.find_library in a monkey-patched system. But I cannot reproduce it (testing on Ubuntu 15.10). The following code works as expected:

Python 2.7.10 (default, Oct 14 2015, 16:09:02)
[GCC 5.2.1 20151010] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import gevent.monkey
>>> gevent.monkey.patch_all()
>>> import gevent
>>> gevent.__version__
'1.1rc1'
>>> import ctypes.util
>>> ctypes.util.find_library('X11')
'libX11.so.6'
>>> import os
>>> f = os.popen('type gcc')
>>> f.read()
'gcc is /usr/bin/gcc\n'
>>> f.close()
>>> os.fork.func_code # prove we're patched
<code object fork at 0x7faf91c57cb0, file "/.../gevent/local/lib/python2.7/site-packages/gevent/os.py", line 372>

@frostyplanet
Author

OK, I'll try to figure out how to reproduce this with only gevent.

@jamadden
Member

jamadden commented Dec 6, 2015

This program produces the error without using gevent at all:

import os
import signal

def handle(*args):
    os.waitpid(-1, os.WNOHANG)
signal.signal(signal.SIGCHLD, handle)

pid = os.fork()

if pid: # parent
    os.waitpid(-1, 0)
else: # child
    import ctypes.util
    print ctypes.util.find_library('X11')

Traceback (most recent call last):
  File "/home/adminuser/test.py", line 17, in <module>
    print ctypes.util.find_library('X11')
  File "/usr/lib/python2.7/ctypes/util.py", line 253, in find_library
    return _findSoname_ldconfig(name) or _get_soname(_findLib_gcc(name))
  File "/usr/lib/python2.7/ctypes/util.py", line 246, in _findSoname_ldconfig
    f.close()
IOError: [Errno 10] No child processes

That is, if any SIGCHLD handler is installed that does a global waitpid, there's a race condition with os.popen: os.popen returns a file object that calls pclose when it is closed, and pclose wants to wait for the child; any error in waiting for the child is raised by pclose/closing the file.
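The mechanism can be made deterministic for illustration. The following is a minimal Python 3 sketch, not code from gevent or gunicorn: the function name `demonstrate_stolen_status` and the polling loop are assumptions made for the example. It deliberately lets a global-waitpid SIGCHLD handler reap the child first, then shows that a later wait on that specific pid fails with ECHILD, the same errno that pclose reports through os.popen in the traceback above.

```python
import errno
import os
import signal
import time


def demonstrate_stolen_status():
    """Return the errno raised when waiting for a child the handler already reaped."""
    reaped = []

    def reap_any(signum, frame):
        # Global wait: collects whichever child has exited, not a specific pid.
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            return
        if pid:
            reaped.append(pid)

    previous = signal.signal(signal.SIGCHLD, reap_any)
    try:
        pid = os.fork()
        if pid == 0:
            os._exit(0)  # child: exit immediately so SIGCHLD reaches the parent

        # Let the handler win the race on purpose (illustrative polling loop).
        deadline = time.time() + 5
        while pid not in reaped and time.time() < deadline:
            time.sleep(0.01)

        try:
            # This is what pclose effectively does: wait for one specific child.
            os.waitpid(pid, 0)
        except ChildProcessError as exc:
            return exc.errno
        return None
    finally:
        # Restore whatever handler was installed before the demonstration.
        signal.signal(
            signal.SIGCHLD,
            previous if previous is not None else signal.SIG_DFL,
        )


if __name__ == "__main__":
    print(demonstrate_stolen_status() == errno.ECHILD)
```

In a real program the ordering is not forced like this, which is why the failure shows up only some of the time.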

Gunicorn's Arbiter (master) process installs a SIGCHLD handler that ultimately does exactly this. Workers try to reset this to SIG_DFL, but usage of the subprocess module would reset that again. I think I can make the race condition smaller, but I don't know if I can close it completely.
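One possible application-level mitigation, sketched here in Python 3 under stated assumptions, is to put SIGCHLD back to SIG_DFL only for the duration of the child-spawning call. This is not the change made in gevent or gunicorn; `default_sigchld` and `run_and_read` are hypothetical helper names introduced for the example, and `signal.signal` can only be called from the main thread.

```python
import contextlib
import signal
import subprocess
import sys


@contextlib.contextmanager
def default_sigchld():
    """Temporarily restore the default SIGCHLD disposition (hypothetical helper)."""
    previous = signal.signal(signal.SIGCHLD, signal.SIG_DFL)
    try:
        yield
    finally:
        # Reinstall the handler that was active before entering the block.
        signal.signal(
            signal.SIGCHLD,
            previous if previous is not None else signal.SIG_DFL,
        )


def run_and_read(argv):
    """Run a command with no global reaper active; return (returncode, stdout)."""
    with default_sigchld():
        proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
        out, _ = proc.communicate()
    return proc.returncode, out


if __name__ == "__main__":
    print(run_and_read([sys.executable, "-c", "print('ok')"]))
```

The trade-off is that any other child exiting inside the `with` block does not trigger the original handler, so a supervising process that relies on SIGCHLD would need to reap again afterwards.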
