pywsgi cause file-descriptor's leak #649

Closed
byaka opened this Issue Sep 6, 2015 · 8 comments

byaka commented Sep 6, 2015

First of all, I have JavaScript code installed on several sites. Total traffic from all these sites is about 1.5 million requests per day (roughly 20 requests per second). The script calls the server via AJAX using POST requests.

Now I run this simple server:

# -*- coding: utf-8 -*-
import gevent
from gevent.pywsgi import WSGIServer
import gevent.monkey

def application(env, start_response):
   # Minimal WSGI app: always answer 200 with CORS headers and a tiny body.
   start_response('200 OK', [
      ('Content-Type', 'application/json'),
      ('Access-Control-Allow-Headers', 'Origin, Authorization, X-Requested-With, Content-Type, Accept'),
      ('Access-Control-Max-Age', '0'),
      ('Access-Control-Allow-Methods', 'GET, POST, OPTIONS'),
      ('Access-Control-Allow-Origin', '*'),
   ])
   return [b"test\n"]

if __name__ == '__main__':
   print(gevent.__version__)
   print('Running..')
   gevent.monkey.patch_all()
   # log=None disables access logging.
   WSGIServer(('0.0.0.0', 6001), application, log=None).serve_forever()

Then I added a test AJAX request to this server in my JavaScript code.

Next, I watch the number of file descriptors used by the process:

watch -d 'cd /proc/$PID$/fd && ls -l | wc -l'

After waiting one hour, I see more than 200 file descriptors opened by the server process.
The number keeps growing until it reaches the OS limit, and then the process reports errors, of course.
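The same count can also be taken from Python. The following is a minimal sketch, assuming Linux with `/proc` mounted; the helper name `count_open_fds` is made up for illustration. Note that `ls -l` prints a leading "total" line, so the `wc -l` figure above is one higher than the number of directory entries.

```python
import os


def count_open_fds(pid="self"):
    """Return the number of entries in /proc/<pid>/fd (Linux only).

    The listing itself briefly opens one descriptor for the directory,
    so absolute values for the current process are inflated by one;
    differences between two calls are unaffected.
    """
    return len(os.listdir("/proc/{}/fd".format(pid)))


# Demonstration on the current process: one open file adds one descriptor.
before = count_open_fds()
handle = open(os.devnull)
after = count_open_fds()
handle.close()
print("descriptors added by one open file:", after - before)
```

Passing the server's PID instead of `"self"` gives the figure for another process, provided the caller is allowed to read that process's `/proc` entry.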

Debian, Python 2.7, Gevent 1.0.2.

Any ideas?

Member

jamadden commented Sep 15, 2015

I cannot reproduce any leak using command-line applications like httpie, wget or curl. Perhaps your client application is using HTTP connection keep-alive? (Using HTTP 1.1 and not setting the Connection: close request header.) gevent implements connection keep-alive and does not place an arbitrary time limit on how long it will allow a client to keep a connection alive, so many clients using keep-alive connections will occupy as many file descriptors as there are clients.

You can try setting the Connection: close response header to disable keep-alive even if the client supports it (thus forcing each request to use a new socket); however, this can significantly add to the cost of requests, especially if TLS/SSL is in use.
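One way to try this without touching the application itself is a small WSGI middleware that adds the header to every response. This is only a sketch: `force_connection_close` and `demo_app` are hypothetical names, and it relies on the server honoring an application-supplied `Connection: close` header, as described above.

```python
def force_connection_close(app):
    """Wrap a WSGI app so that every response carries 'Connection: close'."""

    def wrapped(environ, start_response):
        def closing_start_response(status, headers, exc_info=None):
            # Drop any Connection header the app set, then add our own.
            headers = [(k, v) for (k, v) in headers if k.lower() != "connection"]
            headers.append(("Connection", "close"))
            return start_response(status, headers, exc_info)

        return app(environ, closing_start_response)

    return wrapped


def demo_app(environ, start_response):
    # Stand-in for the real application.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"test\n"]


closing_app = force_connection_close(demo_app)
```

The wrapped callable would then be passed to the server in place of the original application, for example `WSGIServer(('0.0.0.0', 6001), closing_app)`.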

jamadden added the question label Sep 15, 2015

Author

byaka commented Sep 18, 2015

I know it cannot be reproduced with command-line tools. I have been trying to debug this problem for more than two months. It happens only with some JavaScript clients, not all of them, and my own browsers do not trigger it. It's very strange. It happens on all my servers (4 for now), with different Debian versions (6 and 7), different Python versions (2.7 and 2.6), and different gevent versions (0.13.6 and 1.0.2).
I added a workaround to my servers: the server checks the number of descriptors in the background and restarts the listener socket when the count becomes critical. It works, but it's ugly :(

The clients really do use keep-alive. But the clients are simple web pages (on different sites), and it is not realistic that people never close those pages. Is it possible to add a connection timeout to the server so that connections are eventually closed for good?
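The "count is critical" check in such a workaround could look roughly like the sketch below. It assumes Linux (`/proc`) and compares the current descriptor count with the process's soft `RLIMIT_NOFILE`; the function names and the 0.8 threshold are illustrative assumptions, not what the servers in this report actually run.

```python
import os
import resource


def is_critical(open_fds, soft_limit, threshold=0.8):
    """Return True when open_fds has reached threshold * soft_limit."""
    if soft_limit == resource.RLIM_INFINITY:
        # No limit configured, so it can never be exhausted.
        return False
    return open_fds >= threshold * soft_limit


def fd_usage_is_critical(threshold=0.8):
    """Check the current process against its own descriptor limit (Linux only)."""
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    open_fds = len(os.listdir("/proc/self/fd"))
    return is_critical(open_fds, soft_limit, threshold)


print("critical:", fd_usage_is_critical())
```

A background greenlet or thread could call `fd_usage_is_critical()` periodically and trigger whatever recovery the application uses.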

jamadden added the invalid label Sep 18, 2015

Member

jamadden commented Sep 18, 2015

Given that this only occurs for some clients and isn't reproducible without using those particular clients, this doesn't seem to be any sort of bug or problem in gevent; it is behaving as designed. If you have clients that are using "too many" connections, you have several options, including:

  • have your application use the Connection: close header to close connections
  • use an upstream reverse proxy such as HAProxy or nginx to close connections

Any of these options can be implemented with the logic that best suits your application (e.g., for certain clients, after certain times, etc.).

jamadden closed this Sep 18, 2015

Author

byaka commented Sep 18, 2015

I understand, but what about a connection timeout for pywsgi?

Member

jamadden commented Sep 18, 2015

I would be happy to review a PR adding that feature. As a new feature this late in the beta stage, it's unlikely to be merged for 1.1.

I left it off the list of options, but you can probably subclass gevent.pywsgi.WSGIHandler pretty easily to implement a "timeout" in your application.

Author

byaka commented Sep 18, 2015

Can you give me a simple example, please?

Member

jamadden commented Sep 18, 2015

This is just off the top of my head and not tested but you should get the basic idea:

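import gevent
from gevent.pywsgi import WSGIHandler

my_timeout = 60  # seconds; placeholder value, pick what suits your application

class MyHandler(WSGIHandler):

    def handle(self):
        # The timer covers the whole connection, including keep-alive requests.
        timeout = gevent.Timeout.start_new(my_timeout)
        try:
            WSGIHandler.handle(self)
        except gevent.Timeout as ex:
            if ex is timeout:
                # We timed out, take appropriate action.
                # NOTE: The socket is already closed at this point
                pass
            else:
                raise
        finally:
            # Cancel the timer so it cannot fire after a normal return.
            timeout.cancel()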
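
To put such a handler to use, it can be passed to the server through the `handler_class` keyword argument of `WSGIServer`. The sketch below is untested against the gevent versions in this thread; `TimeoutHandler`, `CONNECTION_LIFETIME`, `make_server`, and the 30-second value are illustrative assumptions.

```python
import gevent
from gevent.pywsgi import WSGIHandler, WSGIServer

CONNECTION_LIFETIME = 30  # seconds; arbitrary illustrative value


class TimeoutHandler(WSGIHandler):
    """Hypothetical handler that caps how long one client connection may live."""

    def handle(self):
        timeout = gevent.Timeout.start_new(CONNECTION_LIFETIME)
        try:
            WSGIHandler.handle(self)
        except gevent.Timeout as ex:
            if ex is not timeout:
                # A different timeout, not ours: let it propagate.
                raise
        finally:
            # Stop the timer whether or not it fired.
            timeout.cancel()


def application(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"test\n"]


def make_server(host="127.0.0.1", port=6001):
    # Building the server does not start it; the caller does that.
    return WSGIServer((host, port), application,
                      handler_class=TimeoutHandler, log=None)
```

Calling `make_server().serve_forever()` would start serving. Because the timer wraps the whole `handle()` call, it bounds the total lifetime of a keep-alive connection rather than its idle time.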

Author

byaka commented Sep 18, 2015

Thx😊
