Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RuntimeError: maximum recursion depth exceeded #1034

Closed
fatelei opened this issue May 20, 2015 · 11 comments
Closed

RuntimeError: maximum recursion depth exceeded #1034

fatelei opened this issue May 20, 2015 · 11 comments

Comments

@fatelei
Copy link

fatelei commented May 20, 2015

Work gunicorn with tornado,when tornado server worker accpet a new connection it sometimes will lead to RuntimeError: maximum recursion depth exceeded.

The gunicorn always calls the code below until reaches the maximum recursion depth:

def __getattr__(self, name):
     return getattr(self.sock, name)

Related package version:

  • tornado == 4.0.2
  • gunicorn == 19.1.0
@fatelei
Copy link
Author

fatelei commented May 20, 2015

How can I fix this bug?

@benoitc
Copy link
Owner

benoitc commented May 20, 2015

@fatelei you can set in the post_fork hook the recursion limit using sys.setrecursionlimit

@tilgovi @berkerpeksag what do you think about increasing it in the tornado worker directly?

@berkerpeksag
Copy link
Collaborator

I'm not familiar with both gunicorn/sock.py and Tornado, but I think we shouldn't play with sys.setrecursionlimit. Recursive error is probably sign of a bug in Gunicorn.

Perhaps something like

def __getattr__(self, name):
        if name == 'sock':
            raise AttributeError
        return getattr(self.sock, name)

would work (didn't tested).

@berkerpeksag
Copy link
Collaborator

@fatelei could you please paste the full traceback?

@berkerpeksag
Copy link
Collaborator

If *Socket.set_options() fails with an exception, *Socket.__getattr__ can cause the recursion error.

@benoitc
Copy link
Owner

benoitc commented May 20, 2015

@berkerpeksag right! the change looks OK for me. We should have a way to reproduce it though

@fatelei
Copy link
Author

fatelei commented May 20, 2015

I have figured out the reason. When gunicorn graceful restarts the worker, in tornado ioloop there is a READ event occurs so that the server socket start to accept a new connection. This code locates in tornado.netutil line 194. I past it below.

 def accept_handler(fd, events):
        # More connections may come in while we're handling callbacks;
        # to prevent starvation of other tasks we must limit the number
        # of connections we accept at a time.  Ideally we would accept
        # up to the number of connections that were waiting when we
        # entered this method, but this information is not available
        # (and rearranging this method to call accept() as many times
        # as possible before running any callbacks would have adverse
        # effects on load balancing in multiprocess configurations).
        # Instead, we use the (default) listen backlog as a rough
        # heuristic for the number of connections we can reasonably
        # accept at once.
        for i in xrange(_DEFAULT_BACKLOG):
            try:
                connection, address = sock.accept()
            except socket.error as e:
                # _ERRNO_WOULDBLOCK indicate we have accepted every
                # connection that is available.
                if errno_from_exception(e) in _ERRNO_WOULDBLOCK:
                    return
                # ECONNABORTED indicates that there was a connection
                # but it was closed while still in the accept queue.
                # (observed on FreeBSD).
                if errno_from_exception(e) == errno.ECONNABORTED:
                    continue
                raise
            callback(connection, address)

But the sock variable has been wrapped by gunicorn's BaseSocket that locates in gunicorn.sock.
When it accesses the accept attribute, it will goto the code:

def __getattr__(self, name):
     return getattr(self.sock, name)

When gunicorn graceful reloads, this sock has been setted to None. Call sock.accept will be getattr(self.sock, 'accept') -> ... -> getattr(self.sock, 'accept') until the error occurs. Because current sock instance has no attribute named accept.

Maybe, the code can be changed to:

def __getattr__(self, name):
      if self.sock:
         return getattr(self.sock, name)
      else:
         raise AttributeError(name)

@benoitc

@tilgovi
Copy link
Collaborator

tilgovi commented May 20, 2015

Sounds good. Open a pull request and let's get this fixed.

@tilgovi
Copy link
Collaborator

tilgovi commented May 20, 2015

Thanks for the investigation.

@fatelei
Copy link
Author

fatelei commented May 21, 2015

I find pull-1033 will fix this bug. it ensure when server close the socket, the ioloop will not occur READ event.

I will close this issue, when gunicorn-19.4 is released.

Thanks everyone.

@berkerpeksag
Copy link
Collaborator

Fixed by d9b8959.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants