TLS/SSL asyncio leaks memory #109534
Comments
Can confirm that the issue exists: Python 3.11 and uvloop 0.17.0, when setting up many client SSL connections that sometimes reconnect. Sometimes it leaked several megabytes per second for me. I was able to track this by attaching to a running Python process.
@1st1, @asvetlov, @graingert, @gvanrossum, @kumaraditya303 (as asyncio module experts)
Unfortunately I am really bad at this type of low-level stuff. :-( If you suspect a leak in the C accelerator, you can disable it and see if the leak goes away (or at least becomes less severe).
OTOH if the problem is in the SSL code, we need a different kind of expert.
Finally, please figure out whether this is in uvloop or not. If it's tied to uvloop, this is the wrong tracker.
Unfortunately, this bug was reproduced only in our production, and we weren't able to cause it in other environments. It's already mitigated, as the root cause was poor connection stability. I don't think I would be able to reproduce it again in production. But I'm ready to assist and provide more details if anyone wants to ask me clarifying questions about it. Also, there are several related tickets, and I'm not sure if they are all about the same root cause.
I actually was able to reproduce this leak in isolation several days ago, but it leaked only ~5 megabytes in several hours. I adapted a repro script from a comment on the aiohttp bug mentioned above so it is ready to run, and ran it for about 4 hours. The initial memory consumption (RSS) had initially stabilised before slowly growing.

```python
#!/usr/bin/env python3
import aiohttp
import tracemalloc
import ssl
import asyncio


def init():
    timeout = aiohttp.ClientTimeout(connect=5, sock_read=5)
    ssl_ctx = ssl._create_unverified_context()
    conn = aiohttp.TCPConnector(ssl=ssl_ctx, enable_cleanup_closed=True)
    session = aiohttp.ClientSession(connector=conn, timeout=timeout,
                                    cookie_jar=aiohttp.CookieJar(unsafe=True))
    return session


async def fetch(client):
    try:
        async with client.request('GET', url='https://api.pro.coinbase.com/products/BTC-USD/ticker') as r:
            msg = await r.text()
            print(msg)
    except asyncio.CancelledError:
        raise
    except Exception as err:
        print("error", err)


async def main():
    requests = 600
    clients = [init() for _ in range(requests)]
    tracemalloc.start()
    try:
        while True:
            await asyncio.gather(*[fetch(client) for client in clients])
            await asyncio.sleep(5)
    except asyncio.CancelledError:
        pass  # end and clean things up
    finally:
        memory_used = tracemalloc.get_traced_memory()
        snapshot = tracemalloc.take_snapshot()
        stats = snapshot.statistics('lineno')
        for stat in stats[:10]:
            print(stat)
        try:
            await asyncio.gather(*[client.close() for client in clients])
        except Exception:
            pass


asyncio.run(main())
```
Could you find a reproducer without aiohttp?
Unsure if this is helpful or related, but I came across encode/uvicorn#2078, in which the thread discusses and concludes that the issue of memory not being released is not isolated to uvicorn but is seen in granian/gunicorn/hypercorn as well, and as a result could be at the interpreter level (apologies for butchering the summary). The thread has some great charts/analysis. It is at the server level, but the different implementations of granian and uvicorn can help approximate where the issue might be surfacing, if it's related. Example repo: https://github.com/Besedo/memory_issue/tree/main by @EBazarov. Apologies in advance if it is unrelated; I will remove the comment/create one in the right place if so.
Upstream is pretty hard pressed to debug this (I'm no web developer or admin). Can I ask one of the framework owners or users who are struggling with this to help us reduce their example to the point where we have a small self-contained program that demonstrates the issue?
I provided a working example of the leak; testing with ab gives a 1+ GB/min leak, OpenSSL version 1.1. IMPORTANT: there is no problem using Python 3.9 WITHOUT uvloop; with uvloop it leaks, without it does not.
shorten the mvp even more
Updated first post for clarity.
example certificates
I only have a Mac, but I got the example to work and even managed to install the apache2 utils. I modified the example to print the process size once a second (rather than bothering with graphs).

Now I've got an idea: given the complexity of the example, maybe someone who is interested in getting to the bottom of this issue could rewrite it without the third-party pieces.
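A sketch of what that once-a-second process-size print might look like (a hypothetical helper, not the code actually used above; note `ru_maxrss` is the peak, reported in KiB on Linux but bytes on macOS):

```python
import asyncio
import resource


async def report_memory(interval: float = 1.0, iterations: int = 3) -> None:
    """Periodically print peak RSS alongside whatever workload is running."""
    for _ in range(iterations):
        peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        print(f"peak RSS: {peak}")
        await asyncio.sleep(interval)


asyncio.run(report_memory(interval=0.1))
```

In a real repro this coroutine would run as a background task next to the server, via `asyncio.ensure_future`.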
@gvanrossum The biggest leak I saw is on Python 3.12 (without uvloop).
This doesn’t really help. We need someone to try to find which objects are being leaked.
I tried different Python memory profilers with no result at all; I don't know how to create a useful dump, but inside the coredump there are tons of server certificate info.
Might one of the magical functions in the gc module tell you which objects are being leaked?
objgraph is great for this: https://objgraph.readthedocs.io/en/stable/#memory-leak-example
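If installing objgraph isn't an option, a rough stdlib-only version of the same idea (a hypothetical sketch, not from this thread) is to diff counts of gc-tracked objects by type between two points in time:

```python
import gc
from collections import Counter


def snapshot_types() -> Counter:
    """Count gc-tracked objects by type name. Only container objects are
    tracked, so raw bytes/bytearray buffers are under-counted."""
    return Counter(type(o).__name__ for o in gc.get_objects())


def print_growth(before: Counter, limit: int = 5) -> Counter:
    """Print the types whose live-instance counts grew the most."""
    after = snapshot_types()
    for name, n in (after - before).most_common(limit):
        print(f"{name:25s} +{n}")
    return after


base = snapshot_types()
leaked = [[i] for i in range(1000)]  # simulate a leak of list objects
print_growth(base)
```

objgraph's `show_growth()` does this more conveniently and can also render reference graphs to show what keeps the leaked objects alive.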
Sorry, I don't have a repro yet (but I will try to make one), but I also hit the same issue with Tornado + SSL. And I can confirm such a leak exists even without uvloop, and sometimes (if I respond 404 to a Cloudflare proxy a lot) it becomes significant.
Summary

Findings:
Minimum replication

This snippet runs the server and runs two separate pings (one insecure) in a subprocess, capturing the traced memory before and after. The output is as follows:

```
Δ Memory Allocations = 2727.56kb
Δ Memory Allocations = 6.66kb  # without SSL
```

Notice the difference when SSL verification is used.

```python
import asyncio
import asyncio.sslproto
import ssl
import tracemalloc


class HTTP(asyncio.Protocol):
    def __init__(self):
        self.transport = None

    def connection_made(self, transport):
        self.transport = transport

    def data_received(self, data):
        self.transport.write(
            b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\nConnection: keep-alive\r\n\r\n"
        )
        self.transport.close()


def make_tls_context():
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.load_cert_chain(".jamie/iss109534/server.crt", ".jamie/iss109534/server.key")
    return ctx


async def start_server(loop):
    tls_context = make_tls_context()
    return await loop.create_server(
        HTTP, "127.0.0.1", 4443, backlog=65535, ssl=tls_context, start_serving=True
    )


async def ping(delay: float = 1.0, n_iter: int = 1, insecure: bool = False):
    await asyncio.sleep(delay)
    # Before
    current_1, _ = tracemalloc.get_traced_memory()
    # Run the requests
    if insecure:
        cmd = "curl --insecure"
    else:
        cmd = "curl"
    for _ in range(n_iter):
        proc = await asyncio.create_subprocess_shell(
            f"{cmd} https://127.0.0.1:4443",
            stderr=asyncio.subprocess.PIPE,
            stdout=asyncio.subprocess.PIPE,
        )
        await proc.communicate()
    # After
    current_2, _ = tracemalloc.get_traced_memory()
    print(f"Δ Memory Allocations = {(current_2 - current_1)/1000:.2f}kb")


if __name__ == "__main__":
    tracemalloc.start()
    loop = asyncio.new_event_loop()
    loop.run_until_complete(start_server(loop))
    # Run with SSL verification
    loop.run_until_complete(ping(delay=0.5, n_iter=10))
    # Run without SSL verification
    loop.run_until_complete(ping(delay=0.5, insecure=True, n_iter=10))
    loop.close()
```

Tracemalloc snapshot

I updated `ping` to take snapshots and compare them:

```python
async def ping(delay: float = 1.0, n_iter: int = 1, insecure: bool = False):
    # ...
    snapshot_1 = tracemalloc.take_snapshot()
    # ...
    # Same as before
    # ...
    snapshot_2 = tracemalloc.take_snapshot()
    print('-' * 40)
    if insecure:
        print("Insecure")
    print(f"Δ Inner Memory Allocations = {(current_2 - current_1)/1000:.2f}kb")
    top_stats = snapshot_2.compare_to(snapshot_1, 'traceback')
    print("\n[ Top stat ]")
    for stat in top_stats[:1]:
        print(stat)
        for line in stat.traceback.format(limit=25):
            print("\t", line)
    print('-' * 40)
```

And now we can see where the allocations are happening.
Dev Mode

We can further confirm this when we run in dev mode.
Edit 2, updated minimal replication

A simpler code example for replicating the issue:

```python
import asyncio
import ssl
import tracemalloc


async def main(certfile, keyfile):
    tracemalloc.start()

    # Start server with SSL
    ssl_context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ssl_context.load_cert_chain(certfile, keyfile)
    await asyncio.start_server(
        lambda _, w: w.write(b"\0"), "127.0.0.1", "4443", ssl=ssl_context
    )
    current_1, _ = tracemalloc.get_traced_memory()

    # Ping the server with cURL
    proc = await asyncio.create_subprocess_shell(
        "curl https://127.0.0.1:4443 2> /dev/null"
    )
    await proc.communicate()
    current_2, _ = tracemalloc.get_traced_memory()
    print(f"{(current_2 - current_1)/1000:.2f}KB")


if __name__ == "__main__":
    asyncio.run(
        main(certfile="server.crt", keyfile="server.key")
    )
```
@mjpieters Just curious -- have you ever experienced anything similar?
Excellent find!! Can you think of a suitable place to free this memory? I think it must be passed into OpenSSL via a C wrapper, and that C wrapper must somehow forget to free it? So it would have to be in the C code for the ssl module. I don't think the leak is in the Python code, even though that's where it's being allocated (Python objects don't easily leak attribute values).
I haven't; the code using asyncio HTTPS is relatively short-running (a few minutes at a time at most).
Hi @stalkerg, please, no worries. But I'm afraid you must let go of refcounts and assume this is a problem with a malloc or realloc somewhere. Try this:
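One way to test the "it's malloc, not refcounts" hypothesis (a sketch of my own, not the procedure suggested above) is to compare what tracemalloc sees with the OS-reported RSS; if RSS climbs while traced Python allocations stay flat, the growth lives below the Python object layer:

```python
import os
import tracemalloc


def rss_kb() -> int:
    """Resident set size in KiB, read from /proc (Linux only; -1 elsewhere)."""
    try:
        with open(f"/proc/{os.getpid()}/status") as f:
            for line in f:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])
    except OSError:
        pass
    return -1


tracemalloc.start()
data = [bytes(1000) for _ in range(10_000)]  # ~10 MB of Python-level allocations
traced, _ = tracemalloc.get_traced_memory()
print(f"traced: {traced // 1024} KiB, RSS: {rss_kb()} KiB")
# A leak in C code (or glibc retaining freed pages) shows up in RSS
# but never in the traced total, because tracemalloc only hooks the
# Python memory allocators.
```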
With the preventive fix in geraldog@e08719a, memory resource usage is 866 megabytes of RAM. Without the preventive fix in the mentioned commit, memory resource usage is 3155+ megabytes of RAM, and it never gets freed even after the connections are closed.

Something is obviously awry with bytearray/memoryview as class members here. I'm on Gentoo, by the way, but I doubt this matters much. I believe you will have the same excessive memory usage without the preventive fix, and much lighter memory usage by never declaring the bytearray/memoryview as a class member.
Hi @stalkerg -- there has been a fix merged and back-ported for 3.11 and 3.12, although I'm not sure if you're building from source.
Hi @geraldog -- I believe the earlier discussion was just making people aware that the issue was observed differently across distros/OpenSSL versions (something I have yet to commit time to investigate). I was not suggesting there is a fix (or no fix) in OpenSSL.
One issue was observed and has now been patched. @geraldog's suggested patch overrides the default buffer handling.
In the PR, there is a brief discussion of why this may be the case. Please let me know if you continue to experience a memory issue post-fix :)
Hi @ordinary-jamie, thanks for working on a fix. Unfortunately I must say I had the opportunity to test your fix very early on, as soon as it appeared on this thread, and it does not fix the underlying issue for me, presumably because of how the connections behave.

With the preventive fix in geraldog@e08719a, memory usage is ~800 megabytes. Without the preventive fix, 3100+ megabytes of RAM, never deallocated. Why? The real reason is to be found deep within the guts of how memoryview and bytearray are allocated. I still don't know the real reason. That's why I called my commit a preventive fix: it does not solve the underlying issue of bytearray making the leak happen when present as a class member (I tested, and the leak happens independent of memoryview). valgrind has been of little use to me in tracing it, by the way, and that gdb script helps but is very confusing and may lead to false positives.
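For illustration, a minimal sketch (hypothetical classes, not the actual sslproto code) of the two buffer strategies being compared: a receive buffer pinned as an instance attribute versus a fresh per-call buffer.

```python
class MemberBuffer:
    """Buffer kept as an instance attribute: the 256 KiB bytearray stays
    allocated for as long as the connection object itself is alive."""
    MAX_SIZE = 256 * 1024

    def __init__(self):
        self._buffer = bytearray(self.MAX_SIZE)
        self._view = memoryview(self._buffer)

    def get_buffer(self, hint: int) -> memoryview:
        return self._view


class LocalBuffer:
    """Fresh buffer per read: nothing pins the allocation once the
    caller drops the returned view."""
    MAX_SIZE = 256 * 1024

    def get_buffer(self, hint: int) -> memoryview:
        want = hint if 0 < hint <= self.MAX_SIZE else self.MAX_SIZE
        return memoryview(bytearray(want))
```

With thousands of live connections, the first strategy keeps thousands of maximum-size buffers resident even when idle, which matches the RSS numbers reported above.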
@stalkerg @ordinary-jamie perhaps I am going insane, or maybe my Gentoo setup is haunted... could you guys replicate the experiment? Did you also get 3100+ megabytes of RAM that never gets deallocated with default Python?
That's why I created this issue. The problem exists and leaks like hell on my production servers. So, are there any patches I can test?
@rojamit good to hear from you. Could you give this a try:

```diff
diff --git a/Lib/asyncio/selector_events.py b/Lib/asyncio/selector_events.py
index 8e888d26ea..85cb05a2de 100644
--- a/Lib/asyncio/selector_events.py
+++ b/Lib/asyncio/selector_events.py
@@ -989,7 +989,7 @@ def _read_ready__get_buffer(self):
             return
 
         try:
-            self._protocol.buffer_updated(nbytes)
+            self._protocol.buffer_updated(nbytes, buf[:nbytes])
         except (SystemExit, KeyboardInterrupt):
             raise
         except BaseException as exc:
diff --git a/Lib/asyncio/sslproto.py b/Lib/asyncio/sslproto.py
index fa99d4533a..3d6944d05a 100644
--- a/Lib/asyncio/sslproto.py
+++ b/Lib/asyncio/sslproto.py
@@ -275,8 +275,8 @@ def __init__(self, loop, app_protocol, sslcontext, waiter,
         if ssl is None:
             raise RuntimeError("stdlib ssl module not available")
 
-        self._ssl_buffer = bytearray(self.max_size)
-        self._ssl_buffer_view = memoryview(self._ssl_buffer)
+        self._ssl_buffer = None
+        self._ssl_buffer_view = None
 
         if ssl_handshake_timeout is None:
             ssl_handshake_timeout = constants.SSL_HANDSHAKE_TIMEOUT
@@ -427,13 +427,12 @@ def get_buffer(self, n):
         want = n
         if want <= 0 or want > self.max_size:
             want = self.max_size
-        if len(self._ssl_buffer) < want:
-            self._ssl_buffer = bytearray(want)
-            self._ssl_buffer_view = memoryview(self._ssl_buffer)
-        return self._ssl_buffer_view
+        _ssl_buffer = bytearray(want)
+        _ssl_buffer_view = memoryview(_ssl_buffer)
+        return _ssl_buffer_view
 
-    def buffer_updated(self, nbytes):
-        self._incoming.write(self._ssl_buffer_view[:nbytes])
+    def buffer_updated(self, nbytes, buffer):
+        self._incoming.write(buffer)
         if self._state == SSLProtocolState.DO_HANDSHAKE:
             self._do_handshake()
```
Maybe I need to clarify the situation even more?
The leak is INSANE.
Python 3.12.0 without any patches, loop = uvloop:
Python 3.9.2 without any patches, loop = asyncio:
Python 3.9.2 without any patches, loop = uvloop:
Every run of ab increases allocated RAM.
@geraldog Python 3.12.0. Just realised: with your patch, RAM usage doesn't grow after multiple ab runs!

Untouched Python (with or without uvloop) leaks more and more after every ab re-run, but it looks like this patch solves it. But there is a new interesting detail: sometimes RAM usage grows a bit after the ab run is already done:
@rojamit thanks for the thorough testing. I appreciate the confirmation that my patch seems to mitigate the issue for you. This last note of yours:

This is definitely not good; it probably means there's another source of leaks. Also, the bad news is my patch is only a workaround: it doesn't clarify why having a bytearray as a class member generates leaks, it just avoids doing that and goes its merry way, but there must be an underlying reason (I bet on the C side of Python memory allocation) why it leaks. Since you confirmed the patch is a bona fide workaround that prevents memory from leaking, I'll go ahead and file a PR. Hopefully we can work together with the Python maintainers to end this nightmare of sorts.
Workaround for horrendous memory leaks; does not solve the underlying issue of why a bytearray as a class member like that will cause leaks, at least in this case.
I can reproduce it on 3.12 on Gentoo, but I can't see it inside a container based on Alpine or Debian; there it deallocates properly.
While I do confirm @stalkerg's observation, I think we need to explore the angle that when connection_lost() is called there's actually memory cleanup. I'll try to confirm that hypothesis. But @stalkerg, please don't kid yourself that this is "pooling" of some sort. While I did observe, like you, repeated...
Oh, that was a good hint. It can confuse everything even more, but still: I tried to use jemalloc.

By the way, on my Gentoo I use glibc 2.38; on Debian it's 2.36, and Alpine doesn't use glibc.
I confirm @stalkerg's observation that with jemalloc there is deallocation of the buffers after...

I also confirm @stalkerg's observation that issuing...
@graingert @rojamit seems like any value for that env var has the effect.

UPDATE: OK, it happened because such an env var disables the dynamic threshold: https://github.com/lattera/glibc/blob/master/malloc/malloc.c#L5031

UPDATE: these are the actual values during our test:
Confirming that setting the variable works here as well.
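Related to the glibc-retention theory: besides the trim-threshold env var, glibc can be asked at runtime to return free heap pages to the OS via `malloc_trim(3)`. A hedged sketch (glibc-specific; my own helper, not code from this thread):

```python
import ctypes
import ctypes.util


def trim_glibc_heap() -> int:
    """Call glibc's malloc_trim(0) to release free heap pages to the OS.
    Returns malloc_trim's result (1 = memory released, 0 = nothing to do),
    or -1 when the C library has no malloc_trim (musl, macOS)."""
    libc_path = ctypes.util.find_library("c")
    if libc_path is None:
        return -1
    libc = ctypes.CDLL(libc_path)
    if not hasattr(libc, "malloc_trim"):
        return -1
    return libc.malloc_trim(0)


print(trim_glibc_heap())
```

If RSS drops sharply after calling this, the "leak" is glibc holding freed arenas rather than live Python objects.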
@geraldog maybe. Some tests with ab:

Python 3.12 with patch, asyncio loop:
Python 3.9, asyncio loop:
Python 3.12 with patch, uvloop, desired result:

(The patch doesn't affect uvloop; as I remember, it has its own SSL module.) Looks like...
@rojamit why does your first test have many more requests? Anyway, you should try updating glibc and maybe the Linux kernel.

As I said, it's just a mitigation; it also doesn't solve my issue in the long run. Did you try replacing the allocator?
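Replacing the allocator doesn't require rebuilding Python; it can be preloaded at startup. A hedged example (the `.so` path is distro-dependent; this one is typical for Debian/Ubuntu, and `server.py` stands in for whatever entry point you run):

```shell
# Run the server under jemalloc instead of glibc malloc.
# Install it first, e.g. `apt install libjemalloc2` on Debian/Ubuntu.
LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2 python server.py
```

If RSS stays flat under jemalloc but grows under glibc malloc, that points at allocator retention behaviour rather than a Python-level leak.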
@stalkerg the higher request count is not linked with memory consumption in this test.

uname -r
Debian GNU/Linux 11 (bullseye)
Sorry @stalkerg, but I think the burden is on you to write a bug report to https://sourceware.org/mailman/listinfo/libc-stable and Cc the relevant people involved in glibc development; it was you who discovered it. There's a high possibility they won't fix it, however, citing that it's an actual optimization for some cases and that the burden is on the user to tell glibc to deallocate. And then we're basically stuck; all that is left is noting in the documentation that this is a known issue...
Hello, I experienced the problem with my Django ASGI web app and finally found this topic. I use railway.app as a hosting service, I also described how to reproduce the problem with minimal Django ASGI app on stackoverflow: https://stackoverflow.com/questions/78339166/python-django-asgi-memory-leak-updated-2 |
@fifdee did you try using a different allocator? You can find instructions in this topic.
@stalkerg I haven't. I'm not sure if it's possible with a PaaS host like Railway; currently I don't have much experience with this kind of stuff.
I tried setting the environment variable mentioned above.

And for those who use Django ASGI and suffer from the same problem, this is the modified setup:
Bug report

Bug description:

python3.9 without uvloop doesn't leak memory (or leaks noticeably less).
python3.11+ (and others?) leaks memory A LOT under load (with or without uvloop) - up to +2 GB per every test!
test commands:

```
ab -n50000 -c15000 -r https://127.0.0.1/
```

(apt install apache2-utils)
CPython versions tested on:
3.9, 3.11, 3.12
Operating systems tested on:
Debian Linux
Linked PRs