Not waiting for connection_lost() when transport closed #1925
Comments
I think the waiting for the transport shutdown would have to happen within the __aexit__ coroutine in ClientSession, which currently just calls an ordinary function to perform the close. I don't see any other stage in program execution where the wait could be done. |
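To make the idea above concrete, here is a minimal sketch of an `__aexit__` that waits for shutdown to finish instead of only starting it. `ToySession` and its private helpers are hypothetical names invented for illustration; this is not aiohttp's ClientSession, and the "transport" is simulated with a single `call_soon` callback.

```python
import asyncio


class ToySession:
    """Hypothetical stand-in for a client session (not aiohttp's class)."""

    def __init__(self):
        self.closed = False
        self._closed_event = asyncio.Event()

    def _start_close(self):
        # Simulates transport.close(): it returns at once and the real
        # teardown happens on a later loop iteration.
        asyncio.get_running_loop().call_soon(self._connection_lost)

    def _connection_lost(self):
        # Simulates the protocol's connection_lost() callback.
        self.closed = True
        self._closed_event.set()

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        self._start_close()
        # The wait discussed above: do not return until teardown finished.
        await self._closed_event.wait()


async def demo():
    async with ToySession() as session:
        pass
    return session.closed
```

With this shape, code that runs after the `async with` block can assume the simulated teardown has completed.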
What is the reason to wait until close procedure completes in client code?
|
Well, graceful closing all connections before exit from |
connection_lost has nothing to do with client session. Also why should we care? @asvetlov adding graceful close support to session is a bad idea. |
Maybe graceful close is bad wording but I want to finish Otherwise people should put |
So what is so important in close? If the session is alive we can wait for connection_close, but if the developer dropped the session, why should aiohttp wait for connection_lost? That is expected behavior; otherwise we will get the same as with old |
Another problem with implementing this at the moment is that the default SSL transport never calls
As to why proper shutdown behaviour is required, it's because without it it is not safe for the client to close the event loop after the session is closed, and the client has no reliable way to wait for a time at which it is safe to discard the loop. SSL has a shutdown sequence that requires multiple steps, which in asyncio are naturally implemented asynchronously. Calling
The only reason that these problems are not currently occurring is that asyncio's default SSL transport isn't proceeding with its multi-step SSL shutdown sequence, so there are no tasks pending on the queue. This is a bug in that transport, and means that it never calls
If I'm understanding it correctly, this is a very nasty 2-way bug-bug dependency between aiohttp and asyncio. Fixing either of the two bugs will result in breakage: if you fix asyncio's SSL transport to do a correct SSL shutdown sequence without fixing aiohttp to wait for it to complete then existing clients who discard the event loop immediately after the aiohttp session is closed will be destroying the event loop with tasks pending; if you make aiohttp wait for the transport to do its shutdown without fixing the SSL transport to actually proceed with its shutdown, you will cause existing clients to hang. It's a bit of a Gordian knot, if I'm understanding it correctly. |
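The hazard described in the comment above can be reproduced without SSL at all. The sketch below uses a toy two-step shutdown (the names `close` and `finish_shutdown` are made up for illustration, and the 0.05 second delay is an arbitrary stand-in for the remaining shutdown steps) and shows that work deferred to the loop simply never runs if the loop is closed without waiting for it.

```python
import asyncio


def run_then_close_loop(wait_for_shutdown):
    """Return the toy shutdown steps that ran before the loop was closed."""
    loop = asyncio.new_event_loop()
    steps = []
    finished = loop.create_future()

    def finish_shutdown():
        # Second step of the toy shutdown; stands in for connection_lost().
        steps.append("connection_lost")
        finished.set_result(None)

    def close():
        # First step returns immediately and defers the rest to the loop.
        steps.append("close")
        loop.call_later(0.05, finish_shutdown)

    async def main():
        close()
        if wait_for_shutdown:
            await finished

    try:
        loop.run_until_complete(main())
    finally:
        loop.close()
    return steps
```

Calling `run_then_close_loop(False)` records only the first step, while `run_then_close_loop(True)` records both, which is the difference between discarding the transport immediately and waiting for its final callback.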
Oops, I didn't mean to close the issue! |
I don't think we should do anything until asyncio gets fixed or changes. I still don't see why we should care about transports if the developer dropped the session. |
OK, let's just put the issue on hold until asyncio fixes the SSL transport. |
I still don't see a reason why aiohttp needs to wait for the connection_close call. Let's just create an issue in the cpython repo and reference it. |
At least because without it user will get a warning about non-closed resources in debug mode IIRC. |
Or they need to put a sleep after closing the session but before the program finishes. |
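For reference, the sleep-based workaround mentioned above might look roughly like this. It is only a sketch: `close_with_grace_period` is a hypothetical helper, `session` is assumed to be any object with an awaitable `close()`, and the 0.25 second default is an arbitrary guess rather than a guaranteed-safe value.

```python
import asyncio


async def close_with_grace_period(session, grace=0.25):
    """Close the session, then give the loop time to finish transport shutdown.

    Assumption: `session.close()` is awaitable. The grace period is a guess;
    nothing guarantees that shutdown completes within it.
    """
    await session.close()
    # Keep the loop running so already-scheduled shutdown callbacks can run.
    await asyncio.sleep(grace)
```

The weakness is visible in the signature: the caller has to pick a delay, and there is no signal that shutdown actually finished.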
This will complicate the logic, and the only reason is a debug message. That doesn't sound like a good win to me. |
See https://docs.python.org/3/library/asyncio-dev.html#pending-task-destroyed The warnings are there for a reason. Closing the queue with tasks pending means that those tasks will never run. The implications of that depend on exactly what the tasks were supposed to do, but could include leaks, incorrect protocol behaviour, crashes, anything really. I only listed the warnings because they're a specific, predictable result, but the warnings are there to alert you to the less predictable and more serious potential consequences. If you don't think the warnings are serious enough to warrant any effort to avoid them you could raise an issue suggesting they be removed, but I doubt it would get very far. |
Oh, just one detail: These aren't "debug messages" (with the implication that they are a detail that can be ignored in production). They are warnings. |
But close is not a task, it is just a bunch of callbacks; it is the transport's responsibility to execute all of them |
This boils down to programming contracts. When I use If that's not possible due to the way asyncio is designed, fine, document that – and urge people to ultimately find a better way of handling async code. Like, well, https://github.com/python-trio/trio for example. NB: is there an actual issue which this bug is on hold for? |
We should support it finally. |
I really need this to be properly supported ASAP and am ready to assist in any way I can. The project I’m working on relies heavily on async calls (via aiohttp) and, since recently, on SSL async calls |
Is the issue solved? If so the stable doc is not up to date (dunno):
|
Only partially; something still needs to be done |
Hacky workaround for those who want everything and right now:
transports = 0
all_is_lost = asyncio.Event()
for conn in session.connector._conns.values():
    for handler, _ in conn:
        transports += 1
        proto = handler.transport._ssl_protocol
        orig_lost = proto.connection_lost
        def connection_lost(exc):
            orig_lost(exc)
            nonlocal transports
            transports -= 1
            if transports == 0:
                all_is_lost.set()
        proto.connection_lost = connection_lost
await session.close()
await all_is_lost.wait() |
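The core trick in the workaround above (wrap the protocol's connection_lost() so it sets an event, then wait on that event after closing) can be tried in isolation on a plain asyncio transport, without aiohttp or SSL. The sketch below is an illustration under assumptions: `QuietProtocol` and `demo_wrapped_connection_lost` are hypothetical names, and a local socket pair stands in for a real connection.

```python
import asyncio
import socket


class QuietProtocol(asyncio.Protocol):
    """Hypothetical protocol used only for this demonstration."""

    def connection_lost(self, exc):
        pass


async def demo_wrapped_connection_lost():
    loop = asyncio.get_running_loop()
    left, right = socket.socketpair()
    transport, proto = await loop.create_connection(QuietProtocol, sock=left)

    lost = asyncio.Event()
    orig_lost = proto.connection_lost

    def connection_lost(exc):
        # Call the original callback first, then signal the waiter.
        orig_lost(exc)
        lost.set()

    # The instance attribute shadows the method the transport will look up.
    proto.connection_lost = connection_lost

    transport.close()
    # close() only schedules connection_lost(); it has not run yet.
    called_synchronously = lost.is_set()
    await asyncio.wait_for(lost.wait(), timeout=5)
    right.close()
    return called_synchronously, lost.is_set()
```

The first element of the result shows that close() returns before connection_lost() has been called, which is why something has to wait for it.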
@vmarkovtsev wow, this code really helped me; I spent so much time on this issue.
It ignored, however still print in console. Here is the code with these fixes; I also added a timeout to the wait, so that it will not wait forever in any case.
transports = 0
all_is_lost = asyncio.Event()
sess_conn = session.connector
if sess_conn is not None:
    sess_conn_vals = sess_conn._conns.values()
    if len(sess_conn_vals) == 0:
        all_is_lost.set()
    for conn in sess_conn_vals:
        for handler, _ in conn:
            transports += 1
            proto = handler.transport._ssl_protocol # type: ignore
            orig_lost = proto.connection_lost
            def connection_lost(exc):
                orig_lost(exc)
                nonlocal transports
                transports -= 1
                if transports == 0:
                    all_is_lost.set()
            proto.connection_lost = connection_lost
else:
    all_is_lost.set()
async def close_and_wait():
    await session.close()
    await all_is_lost.wait()
try:
    # aiohttp.ClientTimeout is a settings object, not a context manager,
    # so the 30 second limit is applied with asyncio.wait_for instead.
    await asyncio.wait_for(close_and_wait(), timeout=30.0)
except (AttributeError, asyncio.TimeoutError):
    pass |
Yes, I simplified the real working code to demo the main idea. Indeed, there were some unhandled edge cases. |
FWIW, here is my full version that survived a month of testing:
transports = 0
all_is_lost = asyncio.Event()
if len(session.connector._conns) == 0:
    all_is_lost.set()
for conn in session.connector._conns.values():
    for handler, _ in conn:
        proto = getattr(handler.transport, "_ssl_protocol", None)
        if proto is None:
            continue
        transports += 1
        orig_lost = proto.connection_lost
        orig_eof_received = proto.eof_received
        def connection_lost(exc):
            orig_lost(exc)
            nonlocal transports
            transports -= 1
            if transports == 0:
                all_is_lost.set()
        def eof_received():
            try:
                orig_eof_received()
            except AttributeError:
                # It may happen that eof_received() is called after
                # _app_protocol and _transport are set to None.
                pass
        proto.connection_lost = connection_lost
        proto.eof_received = eof_received
await session.close()
if transports > 0:
    await all_is_lost.wait() |
@vmarkovtsev Thanks for posting your comment #1925 (comment) with the workaround. It was very helpful in pointing me in the right direction. I saw lint errors when I implemented your fix that My understanding is that this means the code will set I made a quick tweak to remove this issue. I'm posting in case people copy the above implementation without linting to discover the potential problem:
import asyncio
import functools
def create_aiohttp_closed_event(session) -> asyncio.Event:
    """Work around aiohttp issue that doesn't properly close transports on exit.
    See https://github.com/aio-libs/aiohttp/issues/1925#issuecomment-592596034
    Returns:
        An event that will be set once all transports have been properly closed.
    """
    transports = 0
    all_is_lost = asyncio.Event()
    if len(session.connector._conns) == 0:
        all_is_lost.set()
        return all_is_lost
    def connection_lost(exc, orig_lost):
        nonlocal transports
        try:
            orig_lost(exc)
        finally:
            transports -= 1
            if transports == 0:
                all_is_lost.set()
    def eof_received(orig_eof_received):
        try:
            orig_eof_received()
        except AttributeError:
            # It may happen that eof_received() is called after
            # _app_protocol and _transport are set to None.
            pass
    for conn in session.connector._conns.values():
        for handler, _ in conn:
            proto = getattr(handler.transport, "_ssl_protocol", None)
            if proto is None:
                continue
            transports += 1
            orig_lost = proto.connection_lost
            orig_eof_received = proto.eof_received
            proto.connection_lost = functools.partial(connection_lost, orig_lost=orig_lost)
            proto.eof_received = functools.partial(eof_received, orig_eof_received=orig_eof_received)
    return all_is_lost |
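The problem the linter flagged in the earlier versions is presumably Python's late-binding closure behaviour: a function defined inside a loop looks up the loop's variables when it is called, not when it is defined, so every wrapper ends up using the values from the last iteration. A small standalone illustration (the function names here are made up and have nothing to do with aiohttp):

```python
import functools


def late_bound_results():
    callbacks = []
    for name in ("a", "b", "c"):
        def report():
            # `name` is looked up when report() runs, after the loop ended.
            return name
        callbacks.append(report)
    return [cb() for cb in callbacks]


def partial_bound_results():
    def report(bound_name):
        return bound_name

    # functools.partial captures the current value on each iteration.
    callbacks = [
        functools.partial(report, bound_name=name) for name in ("a", "b", "c")
    ]
    return [cb() for cb in callbacks]
```

Here `late_bound_results()` returns `["c", "c", "c"]` while `partial_bound_results()` returns `["a", "b", "c"]`, which is why the version above passes orig_lost and orig_eof_received through functools.partial.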
Indeed. Your linter is better than mine |
@vmarkovtsev A combination of |
I noticed that with your function, if we only have one connection which is not using ssl,
import asyncio
import functools
def create_aiohttp_closed_event(session) -> asyncio.Event:
    """Work around aiohttp issue that doesn't properly close transports on exit.
    See https://github.com/aio-libs/aiohttp/issues/1925#issuecomment-639080209
    Returns:
        An event that will be set once all transports have been properly closed.
    """
    transports = 0
    all_is_lost = asyncio.Event()
    def connection_lost(exc, orig_lost):
        nonlocal transports
        try:
            orig_lost(exc)
        finally:
            transports -= 1
            if transports == 0:
                all_is_lost.set()
    def eof_received(orig_eof_received):
        try:
            orig_eof_received()
        except AttributeError:
            # It may happen that eof_received() is called after
            # _app_protocol and _transport are set to None.
            pass
    for conn in session.connector._conns.values():
        for handler, _ in conn:
            proto = getattr(handler.transport, "_ssl_protocol", None)
            if proto is None:
                continue
            transports += 1
            orig_lost = proto.connection_lost
            orig_eof_received = proto.eof_received
            proto.connection_lost = functools.partial(
                connection_lost, orig_lost=orig_lost
            )
            proto.eof_received = functools.partial(
                eof_received, orig_eof_received=orig_eof_received
            )
    # No SSL transports were found, so there is nothing to wait for.
    if transports == 0:
        all_is_lost.set()
    return all_is_lost
This function can be used like this:
closed_event = create_aiohttp_closed_event(session)
await session.close()
await closed_event.wait() |
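Since the event is only set if every patched connection_lost() actually fires, it may be safer to bound the final wait. A possible wrapper, as a sketch: `close_and_wait_bounded` is a hypothetical helper, `session` is assumed to have an awaitable `close()`, `closed_event` is assumed to be the event returned by create_aiohttp_closed_event, and the 10 second default is arbitrary.

```python
import asyncio


async def close_and_wait_bounded(session, closed_event, timeout=10.0):
    """Close the session and wait at most `timeout` seconds for the event.

    Returns True if the event was set in time, False if the wait timed out.
    """
    await session.close()
    try:
        await asyncio.wait_for(closed_event.wait(), timeout)
    except asyncio.TimeoutError:
        return False
    return True
```

Returning a boolean instead of swallowing the timeout lets the caller log or otherwise react when shutdown did not complete.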
Agreed, we had to make that change internally as well and I forgot to update the comment. Thanks @leszekhanusz for posting it. |
Please check master |
So to be clear, is this bug fixed in release 3.7.0 and the workaround no longer necessary? |
It will be fixed in aiohttp 4.0.0 which is not released yet |
Long story short
aiohttp calls close() on the transport and then immediately discards it without waiting for connection_lost(). This is a problem for SSL transports, which have to shut down asynchronously, and it appears aiohttp only "works" with the standard library's SSL transport because of a bug in that transport.
Expected behaviour
aiohttp waiting for connection_lost() before discarding transport
Actual behaviour
It is not doing so
Steps to reproduce
Run this program and breakpoint connection_made() and connection_lost() in client_proto.py. The former is called, but not the latter.
The program appears to run successfully, but to convince you that something is severely amiss, add a __del__ method to the class _SelectorSocketTransport in asyncio's selector_events.py and breakpoint it and that transport's close() method. You'll see that close() is never called and __del__ gets called during shutdown, after the loop has been closed and the module finalised, because it still has a reader registered and is therefore leaked (while still waiting on I/O) in a reference cycle with the event loop.
If I'm understanding it right, the overall behaviour appears to be due to two bugs, one in asyncio and one in aiohttp:
1. When close() is called on the SSLProtocolTransport in asyncio's ssl_proto.py it initiates an asynchronous shutdown process, which should ultimately result in it calling connection_lost(). However, this doesn't progress for reasons I haven't fully figured out, so connection_lost() is never called. The shutdown process is clearly broken because it never results in close() being called on the underlying transport, which is leaked as shown above.
2. aiohttp does not wait for connection_lost(), so the hang that one would expect as a result of this never happens. By luck, the broken shutdown sequence never actually causes any errors.
I'm actually more interested in the latter, the aiohttp behaviour, because I've written my own SSL transport (based on pyopenssl), which does a correct asynchronous shutdown sequence. When aiohttp discards my transport without waiting for connection_lost(), it causes errors, which I don't think are my fault.
Your environment
Tested on Windows and Linux with python 3.5.1 and 3.5.2 and aiohttp 2.0.7 and 2.1.0.