asyncio as_completed() thrashes adding and removing callbacks #64765

glangford · 2014-02-08T20:18:14Z

BPO	20566
Nosy	@gvanrossum, @pitrou, @vstinner, @1st1
Files	test_thrash.py: Test code output.txt: Test output (with instrumented code in _wait()

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = None
closed_at = <Date 2014-02-13.01:58:53.772>
created_at = <Date 2014-02-08.20:18:14.254>
labels = ['library', 'performance']
title = 'asyncio as_completed() thrashes adding and removing callbacks'
updated_at = <Date 2014-03-17.06:30:46.450>
user = 'https://bugs.python.org/glangford'

bugs.python.org fields:

activity = <Date 2014-03-17.06:30:46.450>
actor = 'python-dev'
assignee = 'none'
closed = True
closed_date = <Date 2014-02-13.01:58:53.772>
closer = 'gvanrossum'
components = ['Library (Lib)']
creation = <Date 2014-02-08.20:18:14.254>
creator = 'glangford'
dependencies = []
files = ['34000', '34001']
hgrepos = []
issue_num = 20566
keywords = []
message_count = 9.0
messages = ['210679', '210680', '210683', '210690', '210709', '211054', '211055', '211122', '213801']
nosy_count = 6.0
nosy_names = ['gvanrossum', 'pitrou', 'vstinner', 'python-dev', 'yselivanov', 'glangford']
pr_nums = []
priority = 'normal'
resolution = 'fixed'
stage = None
status = 'closed'
superseder = None
type = 'performance'
url = 'https://bugs.python.org/issue20566'
versions = ['Python 3.4', 'Python 3.5']

glangford · 2014-02-08T20:18:14Z

In asyncio, tasks.py as_completed() appears to trigger adding and removing callbacks multiple times for the pending set of futures, each time a single future completes.

For example, to wait for 5 futures which complete at different times:

as_completed() uses _wait()
_wait() will add 5 callbacks, one for each future
when one future completes, a callback is triggered and all 5 callbacks are removed; this is because _wait() was called with FIRST_COMPLETED
for the 4 remaining futures - 4 callbacks have to be added back again
when the next future completes, the 4 callbacks are removed
etc…

The worst case is if as_completed() is called to wait for all of a larger number of futures, which complete at different times. For example, with 100 futures worst case, ~10,000 callback adds and removes would be performed.

(I am very new to the asyncio code, so I don't have a patch to offer at this point).

gvanrossum · 2014-02-08T21:16:59Z

Yup, I remember feeling a bit guilty doing it this way, but at least the semantics are correctly, and there were bigger fish to fry. Thanks for the test code that proves the issue!

I assume it can be fixed by not using _wait() but some custom approach. If we get this done by RC2, fine, otherwise we'll just have to documented that this is O(N**2) and not to use it for large numbers of tasks until the fix lands, perhaps in 3.4.1. (Usually there's a solution that avoids as_completed() altogether.)

I've created upstream bug http://code.google.com/p/tulip/issues/detail?id=127 to track this.

gvanrossum · 2014-02-08T21:41:22Z

BTW, just curious: Glenn, what led you to discover this?

glangford · 2014-02-08T22:02:36Z

Glenn, what led you to discover this?

It was found by code inspection. I recently started looking at concurrent.futures, really just curious as to how futures were implemented. Because one of the concurrent.futures bugs I raised also applied to asyncio, I started poking into the source for it as well.

gvanrossum · 2014-02-09T01:17:19Z

I'm looking into a solution for this. The idea is pretty straightforward: http://codereview.appspot.com/61210043.

This needs more code to support the optional timeout feature, and it now returns Futures instead of coroutines (which I think is fine).

But to my surprise, test_as_completed_reverse_wait() failed. After nearly an hour of debugging my own code I realized that this test specifically verifies the following weird behavior: if you get two values (futures/coroutines) out of as_completed() without waiting for either, and then wait for the *second* one, it will wait for the *first* result. I guess this is defensible because it is the first one you wait for, but I find it hard to believe that this is desirable behavior -- even though I wrote the code and the test! (http://code.google.com/p/tulip/source/detail?r=674355412f33.)

So I'd like permission to just change these semantics. They aren't clear from the docs or from PEP-3156, and concurrent.futures.as_completed() doesn't have the same issue (there, __next__() on the iterator blocks until the result is ready).

gvanrossum · 2014-02-12T04:31:46Z

Everyone interested, I plan to push the latest version on view in Rietveld tomorrow: http://codereview.appspot.com/61210043

It's not as drastic a rewrite as my original attempt; Glenn's idea of using a Queue worked out great!

gvanrossum · 2014-02-12T04:35:18Z

Note: The new version does *not* change the semantics as mentioned in msg210709. Nobody should depend on those semantics anyway.

python-dev · 2014-02-13T01:58:31Z

New changeset 6e04027ed53e by Guido van Rossum in branch 'default':
asyncio: Change as_completed() to use a Queue, to avoid O(N**2) behavior. Fixes issue bpo-20566.
http://hg.python.org/cpython/rev/6e04027ed53e

python-dev · 2014-03-17T06:30:46Z

New changeset b52113fb58a5 by Guido van Rossum in branch '3.4':
asyncio: Change as_completed() to use a Queue, to avoid O(N**2) behavior. Fixes issue bpo-20566.
http://hg.python.org/cpython/rev/b52113fb58a5

glangford mannequin added stdlib Python modules in the Lib dir performance Performance or resource usage labels Feb 8, 2014

gvanrossum closed this as completed Feb 13, 2014

ezio-melotti transferred this issue from another repository Apr 10, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

asyncio as_completed() thrashes adding and removing callbacks #64765

asyncio as_completed() thrashes adding and removing callbacks #64765

glangford mannequin commented Feb 8, 2014

glangford mannequin commented Feb 8, 2014

gvanrossum commented Feb 8, 2014

gvanrossum commented Feb 8, 2014

glangford mannequin commented Feb 8, 2014

gvanrossum commented Feb 9, 2014

gvanrossum commented Feb 12, 2014

gvanrossum commented Feb 12, 2014

python-dev mannequin commented Feb 13, 2014

python-dev mannequin commented Mar 17, 2014

asyncio as_completed() thrashes adding and removing callbacks #64765

asyncio as_completed() thrashes adding and removing callbacks #64765

Comments

glangford mannequin commented Feb 8, 2014

glangford mannequin commented Feb 8, 2014

gvanrossum commented Feb 8, 2014

gvanrossum commented Feb 8, 2014

glangford mannequin commented Feb 8, 2014

gvanrossum commented Feb 9, 2014

gvanrossum commented Feb 12, 2014

gvanrossum commented Feb 12, 2014

python-dev mannequin commented Feb 13, 2014

python-dev mannequin commented Mar 17, 2014