fd leaks #9
It's clear that the leak is coming from the client side of powerhose, since there are no crypto workers running on stage2, and most gunicorn processes eat more than 10k fds. On token2:
```
[root@token2 tarek]# ps aux|grep guni
token 847 19.9 0.5 19398712 393676 ? Sl 01:24 0:04 /usr/bin/python /usr/bin/gunicorn -k gevent -w 12 -b 127.0.0.1:8000 tokenserver.run:application
token 9619 11.7 0.7 32770976 487452 ? Sl 01:24 0:05 /usr/bin/python /usr/bin/gunicorn -k gevent -w 12 -b 127.0.0.1:8000 tokenserver.run:application
token 10343 24.8 0.2 6670920 155860 ? Sl 01:25 0:02 /usr/bin/python /usr/bin/gunicorn -k gevent -w 12 -b 127.0.0.1:8000 tokenserver.run:application
token 10984 13.7 0.7 33275668 498964 ? Sl 01:24 0:06 /usr/bin/python /usr/bin/gunicorn -k gevent -w 12 -b 127.0.0.1:8000 tokenserver.run:application
token 11796 0.3 0.0 109372 11800 ? S Jun01 12:51 /usr/bin/python /usr/bin/gunicorn -k gevent -w 12 -b 127.0.0.1:8000 tokenserver.run:application
token 14189 6.7 0.7 31951172 468020 ? Sl 01:24 0:04 /usr/bin/python /usr/bin/gunicorn -k gevent -w 12 -b 127.0.0.1:8000 tokenserver.run:application
token 14191 8.8 0.7 39765532 487920 ? Sl 01:24 0:06 /usr/bin/python /usr/bin/gunicorn -k gevent -w 12 -b 127.0.0.1:8000 tokenserver.run:application
token 14193 7.3 0.7 34042692 465516 ? Sl 01:24 0:05 /usr/bin/python /usr/bin/gunicorn -k gevent -w 12 -b 127.0.0.1:8000 tokenserver.run:application
token 14201 10.1 0.7 41437216 503880 ? Sl 01:24 0:07 /usr/bin/python /usr/bin/gunicorn -k gevent -w 12 -b 127.0.0.1:8000 tokenserver.run:application
token 14203 11.0 0.7 42490944 517628 ? Sl 01:24 0:07 /usr/bin/python /usr/bin/gunicorn -k gevent -w 12 -b 127.0.0.1:8000 tokenserver.run:application
token 23989 14.2 0.7 28161028 464228 ? Sl 01:24 0:05 /usr/bin/python /usr/bin/gunicorn -k gevent -w 12 -b 127.0.0.1:8000 tokenserver.run:application
token 26006 11.1 0.7 40459876 509476 ? Sl 01:24 0:07 /usr/bin/python /usr/bin/gunicorn -k gevent -w 12 -b 127.0.0.1:8000 tokenserver.run:application
token 30224 15.8 0.7 25082540 472260 ? Sl 01:24 0:04 /usr/bin/python /usr/bin/gunicorn -k gevent -w 12 -b 127.0.0.1:8000 tokenserver.run:application
[root@token2 tarek]# ls /proc/14189/fd|wc -l
9674
[root@token2 tarek]# ls /proc/14191/fd|wc -l
14481
[root@token2 tarek]# ls /proc/14193/fd|wc -l
14260
[root@token2 tarek]# ls /proc/14201/fd|wc -l
17261
[root@token2 tarek]# ls /proc/14203/fd|wc -l
18490
```

Now looking at what's happening with the powerhose pool.
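The per-PID checks above can also be scripted. A minimal sketch (an assumption of mine, not part of the issue; it relies on Linux's `/proc/<pid>/fd`, the same source as the `ls | wc -l` commands above):

```python
import os

def count_fds(pid):
    """Number of open file descriptors for a pid, via Linux's /proc/<pid>/fd."""
    return len(os.listdir("/proc/%d/fd" % pid))

def gunicorn_fd_counts():
    """Map pid -> open-fd count for every process whose cmdline mentions gunicorn."""
    counts = {}
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        pid = int(entry)
        try:
            with open("/proc/%d/cmdline" % pid, "rb") as f:
                cmdline = f.read()
            if b"gunicorn" in cmdline:
                counts[pid] = count_fds(pid)
        except OSError:  # process exited between listdir and open, or no permission
            continue
    return counts

print(gunicorn_fd_counts())
```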
I was unable to find any leaks in Powerhose with these tests:
However, running gunicorn on stage2, I found that the number of used FDs is very high even right after startup. The current formula is:

FDs = 12 + NUMWORKERS + for I in NUMWORKERS: (1305 + I)

For 12 workers, just running gunicorn already eats 15715 fds, which is a lot. We have 60 to 65 SQL connectors per worker, and 50 connectors for the crypto clients pool, so I don't know where the 1000+ extra fds are going... maybe membase? :s Continuing the investigation.
Each powerhose worker eats 25 fds. That's why we have 1250+ fds per gunicorn worker. Now looking at why, and whether this can be reduced. So far I have seen no leaks, just large numbers of fds used by pyzmq.
We have 6 KQUEUEs and 19 sockets opened per client. The number of KQUEUEs can be reduced to 2 by lowering the iothreads value from 5 to 1 at https://github.com/mozilla-services/powerhose/blob/master/powerhose/client.py#L42. That should not impact speed, and it brings us back to 21 fds per worker, so down to 1050 per gunicorn worker. I don't think I can reduce the number of sockets -- looking.
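For illustration, a minimal sketch of that change (assuming pyzmq, where `io_threads` is the `zmq.Context` constructor argument controlling how many I/O threads, and hence kernel polling fds, a context owns):

```python
import zmq

# One I/O thread instead of five: each extra I/O thread costs extra
# kernel polling fds (KQUEUE on BSD/macOS, epoll on Linux).
ctx = zmq.Context(io_threads=1)
s = ctx.socket(zmq.REQ)

# ... use the socket as before ...

s.close()
ctx.term()
```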
These lines (which are the gist of the client) already create 2 KQUEUEs and 10 sockets:

```python
import zmq

c = zmq.Context()
s = c.socket(zmq.REQ)
poller = zmq.Poller()
poller.register(s, zmq.POLLIN)
```
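One rough way to check that cost yourself (my own sketch, not from the issue; assumes Linux's `/proc/self/fd` and pyzmq) is to diff the fd count around the snippet:

```python
import os
import zmq

def open_fds():
    """Count this process's open fds via Linux's /proc/self/fd."""
    return len(os.listdir("/proc/self/fd"))

before = open_fds()
c = zmq.Context()
s = c.socket(zmq.REQ)
poller = zmq.Poller()
poller.register(s, zmq.POLLIN)
after = open_fds()

# The exact number varies by platform and zmq version (KQUEUE on
# BSD/macOS vs epoll on Linux); the context is the dominant cost.
print("fds created:", after - before)

s.close()
c.term()
```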
I have found a way to share some FDs between clients of the same pool. Making the change now and trying it on stage2.
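The issue doesn't show the actual patch, but one plausible shape for it (hypothetical `ClientPool`, assuming pyzmq) is sharing a single `zmq.Context` across the pool, since the context, not the socket, owns the I/O threads and their polling fds:

```python
import zmq

class ClientPool(object):
    """Hypothetical pool sharing one zmq.Context across all clients.

    A shared context means the I/O-thread fds are paid once per pool
    instead of once per client; only the per-client sockets remain.
    """

    def __init__(self, endpoint, size=10):
        self.ctx = zmq.Context(io_threads=1)  # shared by every client
        self.sockets = []
        for _ in range(size):
            s = self.ctx.socket(zmq.REQ)      # sockets stay per-client
            s.connect(endpoint)
            self.sockets.append(s)

    def close(self):
        for s in self.sockets:
            s.close(0)  # linger=0: drop pending messages on close
        self.ctx.term()
```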
@tarekziade how did the change and retest go? Is this considered fixed?
Closing this out since we no longer have powerhose code directly in this repo. If it's still a problem, it can be filed on the powerhose repo.
The token server is leaking fds on stage2.
This is probably in powerhose, either in the client Pool, or in the workers restarting.
Will write a test that counts the number of fds before and after each request to find out where the problem happens
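Such a test could look roughly like this (a sketch under my own assumptions: `request_fn` is a hypothetical hook that issues one request, and the fd count comes from Linux's `/proc/self/fd`):

```python
import os

def fd_count():
    """Open-fd count for this process, via Linux's /proc/self/fd."""
    return len(os.listdir("/proc/self/fd"))

def assert_no_fd_leak(request_fn, iterations=100):
    """Fail if the open-fd count grows across repeated requests."""
    baseline = fd_count()
    for _ in range(iterations):
        request_fn()
    leaked = fd_count() - baseline
    assert leaked == 0, "leaked %d fds over %d requests" % (leaked, iterations)
```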
/cc @fetep @ametaireau