StreamReader.readexactly() raises GeneratorExit on ProactorEventLoop #83297

Closed
twisteroidambassador mannequin opened this issue Dec 21, 2019 · 18 comments
Labels: pending, topic-asyncio, type-bug

Comments

@twisteroidambassador

twisteroidambassador mannequin commented Dec 21, 2019

BPO 39116
Nosy @asvetlov, @gsalgado, @1st1, @twisteroidambassador, @MatthewScholefield
Files
  • prettysocks.py
  • error_log_on_linux_python38.txt
  Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2019-12-21.16:04:36.435>
    labels = ['type-bug', '3.8', '3.9', 'expert-asyncio']
    title = 'StreamReader.readexactly() raises GeneratorExit on ProactorEventLoop'
    updated_at = <Date 2020-11-29.00:12:06.904>
    user = 'https://github.com/twisteroidambassador'

    bugs.python.org fields:

    activity = <Date 2020-11-29.00:12:06.904>
    actor = 'matthew'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['asyncio']
    creation = <Date 2019-12-21.16:04:36.435>
    creator = 'twisteroid ambassador'
    dependencies = []
    files = ['49532', '49534']
    hgrepos = []
    issue_num = 39116
    keywords = []
    message_count = 5.0
    messages = ['358774', '378931', '379149', '379205', '382027']
    nosy_count = 5.0
    nosy_names = ['asvetlov', 'salgado', 'yselivanov', 'twisteroid ambassador', 'matthew']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = None
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue39116'
    versions = ['Python 3.8', 'Python 3.9']

    @twisteroidambassador

    twisteroidambassador mannequin commented Dec 21, 2019

    I have been getting these strange exceptions since Python 3.8 on my Windows 10 machine. The external symptoms are many errors like "RuntimeError: aclose(): asynchronous generator is already running" and "Task was destroyed but it is pending!".

    By adding try/except logging around my code, I found that my StreamReaders would raise GeneratorExit from readexactly(). Digging deeper, it seems that the following line in StreamReader._wait_for_data():

    await self._waiter

    would raise a GeneratorExit.

    There are only two other methods on StreamReader that actually do anything to _waiter, set_exception() and _wakeup_waiter(), but neither of them was called before GeneratorExit was raised. In fact, both methods set self._waiter to None, so normally after _wait_for_data() does "await self._waiter", self._waiter is None. However, after GeneratorExit is raised, I can see that self._waiter is not None. So the GeneratorExit seems to come from nowhere.

    I have not been able to reproduce this behavior in other code. This is with Python 3.8.1 on the latest Windows 10 1909, using ProactorEventLoop. I don't remember ever seeing this on Python 3.7.
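
    A minimal sketch of the kind of instrumentation described above (the wrapper name and the two-byte read are hypothetical, not taken from the attached script):

    import asyncio
    import logging

    async def read_header(reader: asyncio.StreamReader) -> bytes:
        # Hypothetical wrapper: log any exception escaping readexactly(),
        # including GeneratorExit (a BaseException), then re-raise it.
        try:
            return await reader.readexactly(2)
        except BaseException:
            logging.exception('exception escaped readexactly()')
            raise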

    @twisteroidambassador

    twisteroidambassador mannequin commented Oct 19, 2020

    This problem still exists on Python 3.9 and latest Windows 10.

    I tried to catch the GeneratorExit and turn it into a normal Exception, and things only got weirder from there. Often, several lines later, another await statement would raise another GeneratorExit, such as writer.write() or even asyncio.sleep(). Whether or not I catch the additional GeneratorExit, once control leaves this coroutine a RuntimeError('coroutine ignored GeneratorExit') is raised. And no matter what I do with that RuntimeError, the outermost coroutine's Task always produces an 'asyncio Task was destroyed but it is pending!' error message.

    Taking a step back from this specific problem: does a "casual" user of asyncio need to worry about handling GeneratorExit? Can I assume that I should not see GeneratorExit in user code?
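
    For background, GeneratorExit is delivered at an await when the coroutine object itself is closed, which is what happens to the coroutine of a Task that gets destroyed. A minimal, self-contained illustration (not related to the reproducer):

    class _Suspend:
        # Minimal awaitable that suspends the awaiting coroutine once.
        def __await__(self):
            yield

    async def leaf():
        try:
            await _Suspend()
        except GeneratorExit:
            print('GeneratorExit delivered at the await point')
            # It must propagate; swallowing it turns into
            # "RuntimeError: coroutine ignored GeneratorExit".
            raise

    coro = leaf()
    coro.send(None)  # advance the coroutine to the await, where it suspends
    coro.close()     # close() throws GeneratorExit into the suspended coroutine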

    @twisteroidambassador

    twisteroidambassador mannequin commented Oct 20, 2020

    I have attached a script that should be able to reproduce this problem. It's not a minimal reproduction, but hopefully it is easy enough to trigger.

    The script is a SOCKS5 proxy server listening on localhost:1080. In its current form it does not need any external dependencies. Run it on Windows 10 + Python 3.9, set a browser to use the proxy server, and browse a little bit; it should soon start printing mysterious errors involving GeneratorExit.

    @twisteroidambassador

    twisteroidambassador mannequin commented Oct 21, 2020

    Well, this is unexpected: the same code running on Linux throws mysterious GeneratorExit-related exceptions as well. I'm not sure whether this is the same problem, but this one has a clearer traceback. I will attach the full error log, but the most pertinent part seems to be this:

    During handling of the above exception, another exception occurred:
    Traceback (most recent call last):
      File "/usr/lib/python3.8/contextlib.py", line 662, in __aexit__
        cb_suppress = await cb(*exc_details)
      File "/usr/lib/python3.8/contextlib.py", line 189, in __aexit__
        await self.gen.athrow(typ, value, traceback)
      File "/opt/prettysocks/prettysocks.py", line 332, in closing_writer
        await writer.wait_closed()
      File "/usr/lib/python3.8/asyncio/streams.py", line 376, in wait_closed
        await self._protocol._get_close_waiter(self)
    RuntimeError: cannot reuse already awaited coroutine

    closing_writer() is an async context manager that calls close() and awaits wait_closed() on the given StreamWriter. So it looks like wait_closed() can occasionally reuse a coroutine?
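
    For context, such a helper is typically just a few lines built on contextlib.asynccontextmanager. A rough sketch of the shape described above (not the actual prettysocks.py code):

    import asyncio
    import contextlib

    @contextlib.asynccontextmanager
    async def closing_writer(writer: asyncio.StreamWriter):
        # Hand the writer to the caller, then always close it and wait for
        # the close to complete, even if the body raised.
        try:
            yield writer
        finally:
            writer.close()
            await writer.wait_closed()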

    @matthewscholefield

    matthewscholefield mannequin commented Nov 29, 2020

    Let me preface this by declaring that I am very new to Python async, so it is very possible that I am missing something seemingly obvious. That being said, I've been looking at various resources to try to understand the internals of asyncio, and it hasn't led to any insights on this problem thus far.
    -----------------

    This all sounds quite similar to an experience I am dealing with. I'm working with pub/sub in aioredis, which internally uses a StreamReader with a function equivalent to readexactly(). This all started with debugging "Task was destroyed but it is pending!", and my attempted fixes then led to multiple "RuntimeError: aclose(): asynchronous generator is already running" errors.

    I did the same thing, adding try/excepts everywhere in my code to understand what was happening, and this led me to identify that a regular async function would raise GeneratorExit during an await. However, even if I suppress this, the caller awaiting this function would also raise a GeneratorExit. Suppressing this exception at the top level leads to an unexpected (to me) error, "coroutine ignored GeneratorExit".

    I understand that GeneratorExit is raised in unfinished generators when they are garbage collected, to handle cleanup. And I understand that async functions are essentially generators in the sense that they yield when they await. So, if the entire coroutine were garbage collected, this might trigger a GeneratorExit in each nested coroutine. However, from all of my logging I am sure that, prior to the GeneratorExit, nothing returns upwards, so there should still be valid references to every object.
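
    For reference, the plain-generator behavior mentioned here can be shown in a few lines; g.close() below does explicitly what garbage collection of an unfinished generator does implicitly:

    def gen():
        try:
            yield 1
            yield 2
        finally:
            # Runs because closing the unfinished generator raises
            # GeneratorExit at the suspended yield.
            print('cleanup via GeneratorExit')

    g = gen()
    next(g)    # suspend the generator at "yield 1"
    g.close()  # throws GeneratorExit into it, running the finally block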

    I'll include some errors below, in case they are relevant:

    === Exception in await of inner async function ===

    Traceback (most recent call last):
      File ".../site-packages/uvicorn/protocols/http/httptools_impl.py", line 165, in data_received
        self.parser.feed_data(data)
      File "httptools/parser/parser.pyx", line 196, in httptools.parser.parser.HttpParser.feed_data
    httptools.parser.errors.HttpParserUpgrade: 858
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File ".../my_code.py", line 199, in wait_for_update
        return await self.waiter.wait_for_value()
    GeneratorExit

    === Exception when suppressing GeneratorExit on the top level ===

    Exception ignored in: <coroutine object parent_async_function at 0x0b...>
    Traceback (most recent call last):
      File ".../site-packages/websockets/protocol.py", line 229, in __init__
        self.reader = asyncio.StreamReader(limit=read_limit // 2, loop=loop)
    RuntimeError: coroutine ignored GeneratorExit

    @kumaraditya303

    First, the reproducer is very large; you should try to reduce it. I tried it and wasn't able to reproduce the reported errors. It is possible that this is already fixed, as the bug is 3 years old. @twisteroidambassador Can you try it on the latest main?

    @gvanrossum

    I wonder if the problem is solved by GH-96323? There was a task created by connection_made() (only when there was a callback for that event) that wasn't kept alive, so a poorly timed GC run would destroy the task.

    But the reproducer is definitely too big to definitively declare that this was the root cause.

    @twisteroidambassador

    Reproducing this bug has always been difficult. It happens maybe once or twice a day for my program that runs all day long. The only thing I can say for certain is that it still exists on Python 3.9.9 + Windows 10.

    I see that GH-96323 has been merged and backported to 3.10 and 3.11, but it hasn't made it into any release yet. I am going to build 3.10 from git and see whether it still has this bug.

    @twisteroidambassador

    twisteroidambassador commented Oct 11, 2022

    I just caught this bug in 3.10.7 (official installer).

    The traceback is no help at all. It's only one layer deep, like this:

    Traceback (most recent call last):
      File "***", line ***, in ***
        buf = await ureader.readexactly(2)
    GeneratorExit
    

    It is happening when a freshly-connected StreamReader is being read for the first time (although the associated StreamWriter has been written to once already), so GH-96323 does sound plausible.

    EDIT: oh, and a few lines below:

    ERROR    asyncio Task was destroyed but it is pending!
    task: <Task pending name='Task-2850' coro=<*** done, defined at ***> wait_for=<Future pending cb=[Task.task_wakeup()]>>
    

    @gvanrossum

    Could you post the exact code to reproduce? The traceback by itself is (as you say) not helpful.

    @gvanrossum

    FWIW that error about task destroyed but pending could mean you have code that calls create_task() but doesn't hold onto the task object until it's completed. (We added a note about this to the docs recently.)
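
    The pattern from that docs note looks roughly like this (the spawn() helper and the background_tasks set are illustrative names, not an asyncio API):

    import asyncio

    background_tasks = set()

    def spawn(coro) -> asyncio.Task:
        # Keep a strong reference so the Task cannot be garbage-collected
        # before it finishes; drop the reference once it is done.
        task = asyncio.create_task(coro)
        background_tasks.add(task)
        task.add_done_callback(background_tasks.discard)
        return task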

    @gvanrossum

    [me]

    Could you post the exact code to reproduce? The traceback by itself is (as you say) not helpful.

    Never mind, I realize now that you're observing this very occasionally in a large production app. And that the 3.10.7 you're using there does not have the GH-96323 fix. And that the orphan task being reaped is an internal task (which that PR fixes). Also, as we all know by now from earlier comments, GeneratorExit comes from a coroutine (the result of calling an async function) getting closed. So this basically boils down to "no new info yet".

    @gvanrossum

    Also, Python 3.10.8 was just released, for an unrelated security fix, but it has the fix. So you can upgrade your production to that (I know, that's probably a pain) and eventually you can report that it's gone... Or not!

    @twisteroidambassador

    twisteroidambassador commented Oct 12, 2022

    FWIW that error about task destroyed but pending could mean you have code that calls create_task() but doesn't hold onto the task object until it's completed. (We added a note about this to the docs recently.)

    Thanks for pointing this out! I believe this may be my immediate problem. I do have a task that I start up and forget about around the problematic area.

    By having a background thread call gc.collect() in a loop, I was able to trigger a GeneratorExit in a relatively short time. I have added a strong reference to this task, and will see whether it fixes my issue.
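
    The trigger described here amounts to something like the following (the helper name and interval are arbitrary); it forces frequent collections so a Task that is not strongly referenced gets reaped quickly instead of once or twice a day:

    import gc
    import threading
    import time

    def _gc_hammer(interval: float = 0.05) -> None:
        # Stress helper: collect continuously to surface premature Task GC.
        while True:
            gc.collect()
            time.sleep(interval)

    threading.Thread(target=_gc_hammer, daemon=True).start()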

    Side note: I'm having a hard time wrapping my head around the "task being gc'd" concept. Say I write a coroutine that does not explicitly refer to its own task (no calling asyncio.current_task() or anything similar). I create a Task from the coroutine and do not keep a strong reference to it. Now the Task has no strong references at all. When might the Task be garbage collected? Does it matter whether the Task is currently being run by the event loop or not?

    Say I write another coroutine, which keeps a reference to asyncio.current_task(). I create a Task from this new coroutine, but still do not otherwise keep a strong reference to it. Now the Task has a circular reference to itself, so will it be garbage collected by a sufficiently motivated GC?

    In the case of GH-96323, the fix is to keep a reference to <Task _client_connected_cb> in the StreamReaderProtocol instance. This task is called with arguments (StreamReader, StreamWriter), so once the Task starts to run it will hold a circular reference to itself. All of this is in the context of an asyncio.base_events.Server receiving a new connection, so the Task is started by the asyncio machinery, and the programmer only gains access to this Task once it starts running, and only from within the Task itself.

    So, is there a possibility that a) the StreamReaderProtocol gets gc'd before <Task _client_connected_cb> starts running, or b) the StreamReaderProtocol, <Task _client_connected_cb> and related instances all get gc'd after the Task has started but before it finishes, if the programmer does not explicitly save a strong reference to the Task somewhere?

    @twisteroidambassador

    I'll try to trace the creation of <Task _client_connected_cb> with a SelectorEventLoop:

    When an incoming connection is being accepted, the loop calls selector_events.BaseSelectorEventLoop._accept_connection, which calls selector_events.BaseSelectorEventLoop._accept_connection2. It creates an instance of StreamReaderProtocol by calling protocol_factory, which is defined here. (StreamReaderProtocol holds a strong reference to the StreamReader it needs.) Then a Transport is created (I'm only tracing the non-SSL transport here), which holds a strong reference to the Protocol, and saves a strong reference to the Protocol and itself in the event loop via call_soon.

    The loop first runs streams.StreamReaderProtocol.connection_made. It creates a StreamWriter (which holds a strong reference to the Transport), then creates a Task with client_connected_cb, binding the StreamReader and StreamWriter to the Task, and saves a strong reference to the Task (GH-96323). It then removes the Protocol's direct strong reference to the StreamReader.

    The loop then runs selector_events._SelectorTransport._add_reader, binding the Transport to the event loop.

    So at this point, the event loop keeps the Transport alive, the Transport keeps the Protocol alive, the Protocol has a strong reference to <Task _client_connected_cb> which keeps StreamReader and StreamWriter alive. Sounds good. I believe the event loop should keep the Transport alive until the underlying connection is done?

    @gvanrossum

    I think that's correct. I've never heard of problems where transports or protocols are GC'ed prematurely -- AFAICT this is only an issue with Tasks. I don't actually know (because it's so hard to trace) what state the coroutine is in when this is happening, but it must be in a state where closing it will send a GeneratorExit exception into the coroutine (have fun with genobject.c if you want to trace this more precisely).

    I hope that at some point you can confirm that, either by updating to 3.10.8 or by adding a hard ref to your own task(s), the problem is gone for you.

    One final note, probably a red herring. In your attached error log I notice that one of your traceback lines occurs in concurrent/futures/thread.py -- that's a totally different kind of Future (though it has a similar API). If you're not being careful with threads (I can't tell from the error alone if you are or aren't), some very different caveats apply -- you should never do anything from a different thread than the one where the loop is running, except calling loop.call_soon_threadsafe().
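
    For reference, that thread-safety rule boils down to handing work to the loop instead of calling into it directly. A small self-contained example (function names are illustrative):

    import asyncio
    import threading

    def notify_from_thread(loop: asyncio.AbstractEventLoop) -> None:
        # The only safe way to touch a running loop from another thread is to
        # schedule work onto it; asyncio.run_coroutine_threadsafe() is the
        # coroutine counterpart of call_soon_threadsafe().
        loop.call_soon_threadsafe(print, 'hello from the event loop thread')

    async def main() -> None:
        loop = asyncio.get_running_loop()
        threading.Thread(target=notify_from_thread, args=(loop,)).start()
        await asyncio.sleep(0.1)  # give the scheduled callback a chance to run

    asyncio.run(main())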

    @twisteroidambassador

    Since making sure that every task I create has a hard reference, I haven't seen a single GeneratorExit in the past month. I think it's safe to declare the problem solved!

    @gvanrossum

    Thanks for checking!
