ssl socket leak #74056
When upgrading to 3.5.3 we noticed that the requests module was leaking memory rather quickly, which led me to log this issue: https://github.com/kennethreitz/requests/issues/3933. After more investigation I found that the leak is caused by the raw Python SSL sockets. I've created a test file here: https://gist.github.com/thehesiod/ef79dd77e2df7a0a7893dfea6325d30a which reproduces the leak with a raw Python SSL socket (CLIENT_TYPE = ClientType.RAW), aiohttp, or requests. They all leak in a similar way because they all use the Python SSL socket objects. I tried tracing memory usage with tracemalloc but nothing interesting popped up, so I believe this is a leak in native code. A docker cloud image is available here: amohr/testing:stretch_request_leak based on:
I believe this issue was introduced in Python 3.5.3, as we're not seeing the leak with 3.5.2. I haven't yet verified whether this happens on non-Debian systems; I'll update if I have any more info. I believe 3.6 is similarly impacted but am not 100% certain yet. |
Validated that 3.6 on Fedora is affected as well; see the GitHub bug for charts. So it seems all 3.5.3+ versions are affected. I'm guessing it was introduced by one of the SSL changes in 3.5.3: https://docs.python.org/3.5/whatsnew/changelog.html#python-3-5-3 |
adding valgrind log of 3.5.3 on debian: jessie |
interestingly the valgrind run doesn't show a leak in the profile |
Can you record sys.getallocatedblocks() to see whether it grows continuously? |
@pitrou: sys.getallocatedblocks does not seem to increase |
I see. This may mean the leak is in memory that's not managed directly by Python (e.g. some OpenSSL structure). Is there a way to reproduce without third-party libraries such as requests? |
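The distinction drawn above can be checked with a small sketch that samples both counters around a workload. This is an illustration only: `read_rss_anon_kb` and `sample_growth` are hypothetical helper names, the RssAnon field assumes a Linux `/proc` filesystem, and the lambda is a stand-in for the SSL request under test.

```python
import sys


def read_rss_anon_kb(status_path="/proc/self/status"):
    """Return the RssAnon value in kB, or None if it cannot be read."""
    try:
        with open(status_path) as f:
            for line in f:
                if line.startswith("RssAnon"):
                    # Line looks like "RssAnon:    4376 kB".
                    return int(line.split()[1])
    except OSError:
        pass
    return None


def sample_growth(workload, iterations=1000):
    """Run workload() repeatedly and return (blocks_delta, rss_delta_kb).

    blocks_delta covers memory managed by Python's allocator;
    rss_delta_kb covers the whole process, including native libraries.
    """
    blocks_before = sys.getallocatedblocks()
    rss_before = read_rss_anon_kb()
    for _ in range(iterations):
        workload()
    blocks_delta = sys.getallocatedblocks() - blocks_before
    rss_after = read_rss_anon_kb()
    if rss_before is None or rss_after is None:
        rss_delta = None
    else:
        rss_delta = rss_after - rss_before
    return blocks_delta, rss_delta


if __name__ == "__main__":
    # Hypothetical stand-in workload; replace with the request being tested.
    blocks, rss = sample_growth(lambda: bytes(64))
    print("allocated blocks delta:", blocks)
    print("RssAnon delta (kB):", rss)
```

If the block count stays flat while RssAnon keeps climbing across runs, that points toward memory allocated outside Python's allocator, which is the situation described in this thread.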
yes, in the gist I created you can switch between the various clients, by default right now it uses raw sockets. |
After adapting your test script to run against a local openssl server ( Does it need a specific server to test against to showcase the leak? |
the interesting part is it doesn't leak with a local https server, it appears to need to be a remote server. |
Is there a fast enough remote server that shows the leak? I've tested with my own remote server (https://pitrou.net/), but it doesn't leak. |
ya, my sample script hits google.com, it's pretty fast. It just does a "HEAD". |
Google is not very fast here (a couple of requests / sec at most). How many requests does it take to see a clear tendency? |
see graphs here: https://github.com/kennethreitz/requests/issues/3933, x-axis is number of requests not what it says (seconds). |
Ok, thank you. I've tweaked the script to remove most threads and use maps.google.com (which is faster here), and I managed to bisect the leak to deduce that the offending changeset is 598894f. |
The following addition fixes the leak:

diff --git a/Modules/_ssl.c b/Modules/_ssl.c
index bb40051..8f5facd 100644
--- a/Modules/_ssl.c
+++ b/Modules/_ssl.c
@@ -1203,6 +1203,8 @@ _get_crl_dp(X509 *certificate) {
     Py_XDECREF(lst);
 #if OPENSSL_VERSION_NUMBER < 0x10001000L
     sk_DIST_POINT_free(dps);
+#else
+    CRL_DIST_POINTS_free(dps);
 #endif
     return res;
 }

Christian, what do you think? |
CRL_DIST_POINTS_free() should be available in all supported OpenSSL versions. The function is defined by DECLARE_ASN1_FUNCTIONS(CRL_DIST_POINTS). |
So we should use it instead of sk_DIST_POINT_free()? I'd like to minimize potential breakage here. |
awesome! Thanks for finding and proposing a fix, pitrou! btw I found an example of freeing this structure here: http://www.zedwood.com/article/c-openssl-parse-x509-certificate-pem |
Yes, I'm currently testing the change with a bunch of OpenSSL and LibreSSL versions. By the way, the memory issue can be reproduced with any certificate that contains a CRL distribution point. Let's Encrypt certs don't have a CRL DP. I guess Alexander's test cert doesn't have a CRL DP either. The Nokia test cert in our test suite contains one.

import _ssl
import sys

PEM = 'Lib/test/nokia.pem'

def mem():
    # Print the process's anonymous resident memory (Linux only).
    with open('/proc/self/status') as f:
        for line in f:
            if line.startswith('RssAnon'):
                print(line, end='')

for i in range(10000):
    if i % 1000 == 0:
        mem()
    # Decoding a cert with a CRL distribution point triggers the leak.
    d = _ssl._test_decode_cert(PEM)
    assert d['crlDistributionPoints']
mem()

Without fix:
$ ./python t.py
RssAnon: 4376 kB
RssAnon: 4840 kB
RssAnon: 5224 kB
RssAnon: 5608 kB
RssAnon: 6120 kB
RssAnon: 6504 kB
RssAnon: 6888 kB
RssAnon: 7272 kB
RssAnon: 7656 kB
RssAnon: 8040 kB
RssAnon: 8424 kB

With fix:
$ ./python t.py
RssAnon: 4376 kB
RssAnon: 4376 kB
RssAnon: 4376 kB
RssAnon: 4376 kB
RssAnon: 4376 kB
RssAnon: 4376 kB
RssAnon: 4376 kB
RssAnon: 4376 kB
RssAnon: 4376 kB
RssAnon: 4376 kB
RssAnon: 4376 kB |
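Since the leak depends on the certificate carrying a CRL distribution point, it can help to check a decoded certificate for that field before testing against a given server. The following is a minimal sketch, assuming a dict shaped like the one the reproducer above reads from `_ssl._test_decode_cert()`, which is the same shape `ssl.SSLSocket.getpeercert()` returns for a validated certificate. The helper names and the sample dicts are hypothetical.

```python
def crl_distribution_points(cert):
    """Return the CRL distribution point URIs of a decoded certificate dict.

    The key is absent when the certificate has no CRL DP extension,
    in which case an empty tuple is returned.
    """
    return tuple(cert.get("crlDistributionPoints", ()))


def has_crl_distribution_point(cert):
    """True if the decoded certificate lists at least one CRL DP."""
    return bool(crl_distribution_points(cert))


if __name__ == "__main__":
    # Hypothetical sample dicts, not taken from any real certificate.
    with_crl = {
        "subject": ((("commonName", "with-crl.example.invalid"),),),
        "crlDistributionPoints": ("http://crl.example.invalid/ca.crl",),
    }
    without_crl = {
        "subject": ((("commonName", "no-crl.example.invalid"),),),
    }
    print(has_crl_distribution_point(with_crl))
    print(has_crl_distribution_point(without_crl))
```

A server whose certificate has no CRL DP would not be expected to show the leak, which is consistent with some of the test servers earlier in this thread not reproducing it.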
Antoine, you might find multissl.py helpful. I wrote a script to automate testing with multiple versions of OpenSSL and libressl. The first time it takes about half an hour to download, compile and install all versions locally. https://github.com/tiran/multissl/blob/master/multissl.py |
Very nice, thank you! |
I believe this was fixed in https://bugs.python.org/issue29738, so I'm closing this. |