Reduce overhead to add/remove asyncio readers and writers #106527

bdraco · 2023-07-07T18:17:17Z

Pitch

The current code path for adding and removing asyncio readers and writers always has to go through a try/except KeyError to add a new reader/writer:

cpython/Lib/asyncio/selector_events.py

Line 316 in b3648f0

except KeyError:

For use cases where readers are added and removed frequently (hard to change the design without breaking changes) this adds up quickly.

Linked PRs

gh-106527: asyncio: optimize to add/remove readers and writers #106528

gvanrossum · 2023-07-07T18:58:43Z

Interesting. Have you measured your alternative? Is it faster? I'm a little surprised that calling .get_map().get(key) is slower than calling .get_key(key) plus raising and catching an exception -- that would suggest a function call is faster than raising and catching an exception.

Maybe we would do better trying to make exception handling faster? (I guess this code is unusual because in the common case the exception gets raised, so our "zero-overhead exceptions" don't help, since it's only zero overhead when it doesn't get raised.)
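
To illustrate the point about exceptions only being free when they are not raised, here is a minimal sketch using a plain dict. The names `table`, `eafp`, and `lbyl` are hypothetical and not part of asyncio or selectors; the timings depend on the interpreter version and machine, so none are shown.

```python
import timeit

# Toy lookup table; "present" is a hit, anything else is a miss.
table = {"present": 1}


def eafp(key):
    """Look up with try/except; the exception machinery runs only on a miss."""
    try:
        return table[key]
    except KeyError:
        return None


def lbyl(key):
    """Look up with dict.get; no exception is raised on either path."""
    return table.get(key)


if __name__ == "__main__":
    for label, key in (("hit", "present"), ("miss", "absent")):
        t_eafp = timeit.timeit(lambda: eafp(key), number=200_000)
        t_lbyl = timeit.timeit(lambda: lbyl(key), number=200_000)
        print(f"{label}: try/except {t_eafp:.3f}s, dict.get {t_lbyl:.3f}s")
```

If the miss case dominates, as it appears to when a new reader is added, the try/except variant pays for raising and catching on most calls.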

@kumaraditya303 Thoughts? The PR seems well thought out.

bdraco · 2023-07-07T19:09:59Z

Have you measured your alternative? Is it faster?

The flames disappear from the py-spy after the change. I also did a cProfile and it's faster. This isn't a great comparison though, as it doesn't quantify the actual performance improvement and only shows more of the real-world impact. I'll see if I can put together something that benchmarks it.

bdraco · 2023-07-07T19:36:45Z

original: 1.048221042030491
new: 0.626125167007558

I think this is close enough to benchmark the change:

import asyncio
import timeit

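# Create a loop explicitly; get_event_loop() without a running loop is deprecated.
# Assumes a selector-based loop (the default on Unix), which has _selector.
loop = asyncio.new_event_loop()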

original_code = """
try:
    key = selector.get_key(1)
except KeyError:
    pass
"""

new_code = """
key = selector.get_map().get(1)
if key is None:
    pass
"""

original_time = timeit.timeit(
    original_code,
    number=1000000,
    globals={"selector": loop._selector},
)
new_time = timeit.timeit(
    new_code,
    number=1000000,
    globals={"selector": loop._selector},
)

print("original: %s" % original_time)
print("new: %s" % new_time)

gvanrossum · 2023-07-07T23:25:51Z

Yeah, that’s acceptable.

bdraco · 2023-07-08T19:03:10Z

_SelectorMapping.__getitem__ has the same pattern

probably something like this would offer the same speed up

cpython/Lib/selectors.py

Line 73 in da98ed0

except KeyError:

    def __getitem__(self, fileobj):
        fd = self._selector._fileobj_lookup(fileobj)
        key = self._selector._fd_to_key.get(fd)
        if key is None:
            raise KeyError("{!r} is not registered".format(fileobj))
        return key

I'll wait for the review of the linked PR and, depending on the outcome there, will open another PR for this. In the meantime I'll push the above code to production and run it for a bit.

That helps a bit, as it avoids one exception catch, but in the add_reader case the last KeyError still has to be raised.

bdraco · 2023-07-09T01:31:05Z

It also looks like the selectors could remove _key_from_fd and replace it with a simple self._fd_to_key.get() unless _key_from_fd is a defined API promise. That's roughly a 10% reduction of run time of the code in selectors.py when running an asyncio event loop with ~10000 calls to select per minute since it doesn't have the extra function depth overhead.

That should also likely be another PR. Probably worth doing since it's called in the select loop and this was on a mostly idle system.

@@ -332,7 +332,7 @@ def select(self, timeout=None):
             if fd in w:
                 events |= EVENT_WRITE
 
-            key = self._key_from_fd(fd)
+            key = self._fd_to_key.get(fd)
             if key:
                 ready.append((key, events & key.events))
         return ready
@@ -422,7 +422,7 @@ def select(self, timeout=None):
             if event & ~self._EVENT_WRITE:
                 events |= EVENT_READ
 
-            key = self._key_from_fd(fd)
+            key = self._fd_to_key.get(fd)
             if key:
                 ready.append((key, events & key.events))
         return ready
@@ -475,7 +475,7 @@ def select(self, timeout=None):
                 if event & ~select.EPOLLOUT:
                     events |= EVENT_READ
 
-                key = self._key_from_fd(fd)
+                key = self._fd_to_key.get(fd)
                 if key:
                     ready.append((key, events & key.events))
             return ready
@@ -570,7 +570,7 @@ def select(self, timeout=None):
                 if flag == select.KQ_FILTER_WRITE:
                     events |= EVENT_WRITE
 
-                key = self._key_from_fd(fd)
+                key = self._fd_to_key.get(fd)
                 if key:
                     ready.append((key, events & key.events))
             return ready
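
A rough sketch of what the diff above changes, using a toy class rather than the real selectors code. `ToySelector`, `ready_via_helper`, and `ready_via_get` are hypothetical names, and the body of `_key_from_fd` here is an assumed try/except wrapper written for illustration. Both paths return the same keys; the second skips one Python-level call per file descriptor.

```python
import timeit


class ToySelector:
    """Hypothetical stand-in; not the real selectors implementation."""

    def __init__(self):
        # Map file descriptors to placeholder "keys".
        self._fd_to_key = {fd: f"key-{fd}" for fd in range(16)}

    def _key_from_fd(self, fd):
        # Assumed shape of the helper: a wrapper that hides the KeyError.
        try:
            return self._fd_to_key[fd]
        except KeyError:
            return None

    def ready_via_helper(self, fds):
        ready = []
        for fd in fds:
            key = self._key_from_fd(fd)  # extra function call per fd
            if key:
                ready.append(key)
        return ready

    def ready_via_get(self, fds):
        ready = []
        for fd in fds:
            key = self._fd_to_key.get(fd)  # direct dict lookup, no wrapper
            if key:
                ready.append(key)
        return ready


if __name__ == "__main__":
    sel = ToySelector()
    fds = list(range(16))
    print("helper:", timeit.timeit(lambda: sel.ready_via_helper(fds), number=50_000))
    print("get:   ", timeit.timeit(lambda: sel.ready_via_get(fds), number=50_000))
```

Whether the saving is close to the 10% quoted above depends on the workload; this only isolates the per-lookup call overhead.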

Rough profile, before and after (different points in time with similar workloads): [profile images not included]

AlexWaygood · 2023-07-12T09:15:04Z

Maybe we would do better trying to make exception handling faster? (I guess this code is unusual because in the common case the exception gets raised, so our "zero-overhead exceptions" don't help, since it's only zero overhead when it doesn't get raised.)

I think that would also be worthwhile, FWIW. I was quite surprised how much switching to LBYL idioms sped things up in #103318, largely because of how slow exception handling is.

(N.B. I don't think longterm goals like that should impact whether this proposed change is accepted or not.)

kumaraditya303 · 2023-07-14T09:49:52Z

FWIW I think it is much better to optimize selectors module rather than asyncio in this case.

bdraco · 2023-07-14T17:46:11Z

FWIW I think it is much better to optimize selectors module rather than asyncio in this case.

I agree that selectors are the primary source of the bottleneck I'm seeing. (nothing else is called 100000x per minute on busy systems besides _run_once)

More discussion on that in #106555

bdraco · 2023-07-14T17:48:13Z

The linked #106528 can use the new get implementation from #106665, which will avoid the exceptions in the happy path of _add_reader. I think that is still valuable on its own when you have a situation where readers are added and removed several times per second.
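
As a sketch of why a dedicated get method helps: the default `get` inherited from `collections.abc.Mapping` calls `__getitem__` and catches the KeyError, so a miss always raises internally. The classes below are hypothetical toys, not the actual `_SelectorMapping` code; the `key_errors` counter exists only to make the raised exceptions visible.

```python
from collections.abc import Mapping


class CountingMapping(Mapping):
    """Toy read-only mapping that counts how often __getitem__ raises."""

    def __init__(self, data):
        self._data = dict(data)
        self.key_errors = 0

    def __getitem__(self, key):
        try:
            return self._data[key]
        except KeyError:
            self.key_errors += 1
            raise KeyError(f"{key!r} is not registered") from None

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


class FastGetMapping(CountingMapping):
    """Same mapping, but get() goes straight to the underlying dict."""

    def get(self, key, default=None):
        return self._data.get(key, default)


if __name__ == "__main__":
    slow = CountingMapping({1: "a"})
    fast = FastGetMapping({1: "a"})
    slow.get(2)  # inherited Mapping.get: raises and catches a KeyError
    fast.get(2)  # overridden get: no exception involved
    print(slow.key_errors, fast.key_errors)
```

With the override, a caller that expects misses can use `get` and never enter the exception path.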

bdraco · 2023-07-14T17:55:55Z

edit: sorry I screwed up and linked the wrong PR above. Should have been #106665

kumaraditya303 · 2023-07-24T09:08:28Z

Thanks for working on this.

bdraco added the type-feature A feature request or enhancement label Jul 7, 2023

AlexWaygood added performance Performance or resource usage topic-asyncio labels Jul 7, 2023

bdraco mentioned this issue Jul 7, 2023

gh-106527: asyncio: optimize to add/remove readers and writers #106528

Merged

This was referenced Jul 12, 2023

Adding selectors has two KeyError exceptions in the success path #106664

Closed

gh-106664: selectors: add get method to _SelectorMapping #106665

Merged

methane pushed a commit that referenced this issue Jul 22, 2023

gh-106527: asyncio: optimize to add/remove readers and writers (#106528)

b7dc795

jtcave pushed a commit to jtcave/cpython that referenced this issue Jul 23, 2023

pythongh-106527: asyncio: optimize to add/remove readers and writers (p…

636f995

…ython#106528)

mementum pushed a commit to mementum/cpython that referenced this issue Jul 23, 2023

pythongh-106527: asyncio: optimize to add/remove readers and writers (p…

8259583

…ython#106528)

kumaraditya303 closed this as completed Jul 24, 2023

This was referenced Dec 12, 2023

Calling loop.sock_connect has a KeyError exception in the success path #112989

Closed

gh-112989: asyncio: Reduce overhead to connect sockets with SelectorEventLoop #112991

Merged
