Speedup DefaultSelectors.modify() by 2x #74200
Comments
Patch in attachment modifies DefaultSelector.modify() so that it uses the underlying selector's modify() method instead of unregister() and register() resulting in a 2x speedup. Without patch: With patch: |
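The benchmark numbers from the attachment are not reproduced above. As a rough stand-in, here is a minimal sketch of how the two strategies could be timed against each other. It is not the attached benchmark: `modify_via_reregister` and `run_benchmark` are hypothetical names, and the timings depend on the platform and on whether the running Python already includes this change.

```python
import itertools
import selectors
import socket
import timeit


def modify_via_reregister(sel, fileobj, events, data=None):
    """Hypothetical helper: emulate modify() with unregister() + register()."""
    sel.unregister(fileobj)
    return sel.register(fileobj, events, data)


def run_benchmark(iterations=10_000):
    """Time sel.modify() against the unregister()/register() emulation."""
    sel = selectors.DefaultSelector()
    a, b = socket.socketpair()
    sel.register(a, selectors.EVENT_READ)
    # Alternate the event mask so that every call really changes something.
    flip = itertools.cycle([selectors.EVENT_WRITE, selectors.EVENT_READ])
    try:
        t_modify = timeit.timeit(
            lambda: sel.modify(a, next(flip)), number=iterations)
        t_reregister = timeit.timeit(
            lambda: modify_via_reregister(sel, a, next(flip)), number=iterations)
        final_events = sel.get_key(a).events
    finally:
        sel.unregister(a)
        sel.close()
        a.close()
        b.close()
    return t_modify, t_reregister, final_events


if __name__ == "__main__":
    t_modify, t_reregister, _ = run_benchmark()
    print(f"modify():                {t_modify:.4f}s")
    print(f"unregister()+register(): {t_reregister:.4f}s")
```

The sketch deliberately prints raw timings only and makes no claim about the ratio between them.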
Hm, do you have a realistic benchmark which would show the benefit? |
modify() can be used often in verbose protocols such as FTP and SMTP where a lot of requests and responses are continuously exchanged between client and server, so you need to switch from EVENT_READ to EVENT_WRITE all the time. I did a similar change some years ago in pyftpdlib but I don't have another benchmark other than this one. |
Hi Giampaolo Rodola'! It seems you proposed the same idea 4 years ago and I wrote a similar patch: issue bpo-18932 :-) I suggest you use my perf module to produce more reliable benchmarks. Here are my results on my computer smithers, tuned for benchmarks:
haypo@smithers$ ./python bench_selectors.py -o ref.json
Not significant (1): SelectSelector.modify
@neologix: "Hm, do you have a realistic benchmark which would show the benefit?" I don't think that selector.modify() can be a bottleneck, but IMHO the change is simple and safe enough to be worth it. In a network server with 10k clients, an optimization making .modify() 1.52x faster is welcome. |
@giampaolo Rodola: CPython development moved to GitHub, can you please create a pull request instead of a patch? Thank you. Hum, I see that my old patch of issue bpo-18932 (selectors_optimize_modify-2.patch) is different. I tried to factorize code. What do you think of my change? |
Hey Stinner, thanks for chiming in! Your old patch uses unregister() and register() so I'm not sure what speedup that'll take as the whole point is to avoid doing that. You may want to rewrite it and benchmark it but it looks like it'll be slower. |
PR: #1030 |
@neologix: Do you have a GitHub account? It's hard for me to tell whether https://github.com/neologix is you or not, and your GitHub username is not filled in in the Bug Tracker database. |
Giampaolo Rodola': "Your old patch uses unregister() and register() so I'm not sure what speedup that'll take as the whole point is to avoid doing that." My patch calls generic unregister() and register() of the base classes, but it only uses one syscall per modify() call. Oh by the way, it seems like my patch changes KqueueSelector, but your change doesn't. Am I right? KqueueSelector is the default selector on FreeBSD and macOS. IMHO it's worth it to optimize it as well. |
Doesn't that mean doing 3 operations (unregister(), register(), modify()) instead of the current 2 (unregister(), register())? I don't see how it can be faster than a single modify() syscall.
You are right but it looks like you end up doing the same thing as unregister() and register(). kqueue() has no modify() method so I don't think it can benefit from this change. |
The idea is to reuse _BaseSelectorImpl.register() and _BaseSelectorImpl.unregister() to factorize the code. These methods don't make any syscall; they create the SelectorKey object and update _fd_to_key, so each class doesn't have to redo these things. I don't insist on redoing what I did, I'm just trying to explain my change, because your change basically copies and pastes the same code 3 times, and you forgot KqueueSelector, so you may even have to copy it a 4th time ;-) |
You can't factorize the logic of modify() into those as they do two different things. I also don't like repeating the same thing 3 times but given how the module is organized I'm not sure how to do that as I need to pass 3 things around: the low-level selector (epoll, poll, whatever) and the read and write constants (POLLIN, EPOLLIN) which change depending on the selector being used. |
IMHO it complicates the code for little benefit: that's why I asked |
In certain protocols modify() is supposed to be used on every interaction between client and server. E.g. an FTP server does this:
...so it's two calls for each command received. |
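The snippet referred to above is not reproduced in this copy of the thread. Purely as an illustration of the pattern being described (wait for a command, switch to EVENT_WRITE to send the reply, switch back to EVENT_READ), here is a hypothetical sketch. The function names and the "200 " reply prefix are invented for the example and are not taken from pyftpdlib.

```python
import selectors
import socket


def handle_one_command(sel, conn):
    """Serve one request/response cycle.

    Returns the reply that was sent and the number of modify() calls made.
    """
    modify_calls = 0
    reply = b""
    while True:
        ready = sel.select(timeout=1.0)
        if not ready:
            raise TimeoutError("no event within 1 second")
        for _key, mask in ready:
            if mask & selectors.EVENT_READ:
                # A command arrived: build the reply, then wait for writability.
                command = conn.recv(1024)
                reply = b"200 " + command
                sel.modify(conn, selectors.EVENT_WRITE)
                modify_calls += 1
            elif mask & selectors.EVENT_WRITE:
                # The socket is writable: send the reply, then go back to reading.
                conn.send(reply)
                sel.modify(conn, selectors.EVENT_READ)
                modify_calls += 1
                return reply, modify_calls


def demo():
    """Run one cycle over a local socket pair (no network needed)."""
    client, server = socket.socketpair()
    server.setblocking(False)
    sel = selectors.DefaultSelector()
    sel.register(server, selectors.EVENT_READ)
    try:
        client.sendall(b"NOOP\r\n")
        reply, calls = handle_one_command(sel, server)
        received = client.recv(1024)
    finally:
        sel.unregister(server)
        sel.close()
        client.close()
        server.close()
    return reply, calls, received
```

Each command handled this way costs two modify() calls, which is the per-command overhead the comment above is pointing at.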
@neologix: here's a PR which refactors the poll-related classes: https://github.com/python/cpython/pull/1035/files |
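For readers following the DRY discussion, one way the three varying pieces (the low-level selector class and the two event constants) could be supplied is through class attributes on a shared base class. The sketch below is hypothetical and heavily simplified; it is not the content of the PR, all class and method names are invented, and it is Unix-only because it relies on select.poll.

```python
import select
import socket

EVENT_READ = 1
EVENT_WRITE = 2


class PollLikeSketch:
    """Hypothetical base class; subclasses supply the three varying pieces."""
    _selector_cls = None
    _EVENT_READ = None
    _EVENT_WRITE = None

    def __init__(self):
        self._selector = self._selector_cls()
        self._events_by_fd = {}

    def _to_native(self, events):
        # Translate the portable mask into the selector-specific one.
        native = 0
        if events & EVENT_READ:
            native |= self._EVENT_READ
        if events & EVENT_WRITE:
            native |= self._EVENT_WRITE
        return native

    def register(self, fd, events):
        self._selector.register(fd, self._to_native(events))
        self._events_by_fd[fd] = events

    def modify(self, fd, events):
        if fd not in self._events_by_fd:
            raise KeyError(f"{fd!r} is not registered")
        if events != self._events_by_fd[fd]:
            # A single native modify() instead of unregister() + register().
            self._selector.modify(fd, self._to_native(events))
            self._events_by_fd[fd] = events

    def poll_now(self):
        # A timeout of 0 means "do not block" for both poll and epoll.
        ready = []
        for fd, native in self._selector.poll(0):
            events = 0
            if native & self._EVENT_READ:
                events |= EVENT_READ
            if native & self._EVENT_WRITE:
                events |= EVENT_WRITE
            ready.append((fd, events))
        return ready

    def close(self):
        # epoll objects have close(); poll objects do not.
        close = getattr(self._selector, "close", None)
        if close is not None:
            close()


class PollSketch(PollLikeSketch):
    _selector_cls = select.poll
    _EVENT_READ = select.POLLIN
    _EVENT_WRITE = select.POLLOUT


if hasattr(select, "epoll"):  # Linux only
    class EpollSketch(PollLikeSketch):
        _selector_cls = select.epoll
        _EVENT_READ = select.EPOLLIN
        _EVENT_WRITE = select.EPOLLOUT


def demo(selector_cls):
    """Register for reads, then flip to writes with a single modify()."""
    a, b = socket.socketpair()
    sel = selector_cls()
    try:
        sel.register(a.fileno(), EVENT_READ)
        before = sel.poll_now()
        sel.modify(a.fileno(), EVENT_WRITE)
        after = [events for _fd, events in sel.poll_now()]
    finally:
        sel.close()
        a.close()
        b.close()
    return before, after
```

With this layout modify() is written once, and each concrete selector only declares its class and constants. The sketch omits SelectorKey bookkeeping, error handling and blocking timeouts, which differ in units between poll and epoll.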
This refactoring was already suggested a long time ago, and at the Also, this whole thread is a repeat of: At the time, I already asked for one realistic use case demonstrating |
For the sake of experiment I'm attaching a toy echo server which uses modify() to switch between EVENT_READ and EVENT_WRITE. Without patch I get 35000 req/sec, with patch around 39000 req/sec (11.4% faster).
import socket
sock = socket.socket()
sock.connect(('google.com', 80))
sock.setblocking(False)
print(sock.send(b'x' * 50000))
print(sock.send(b'x' * 50000))  # raises BlockingIOError
So basically my benchmark is emulating a worst-case scenario in which send() always blocks on the first call and succeeds on the next one, in order to mimic recv() which blocks half of the time. |
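The blocking behaviour of send() on a non-blocking socket can also be observed without a network connection. The following is a minimal sketch, assuming a local socket pair whose peer never reads; `fill_send_buffer` is a hypothetical name, and the number of bytes accepted before blocking depends on the operating system's buffer sizes.

```python
import socket


def fill_send_buffer(chunk_size=65536, max_chunks=10_000):
    """Send on a non-blocking socket until the kernel buffer is full.

    Returns (bytes accepted, whether BlockingIOError was raised).
    """
    a, b = socket.socketpair()
    a.setblocking(False)
    sent_total = 0
    blocked = False
    try:
        for _ in range(max_chunks):
            try:
                # The peer never reads, so the buffer eventually fills up.
                sent_total += a.send(b"x" * chunk_size)
            except BlockingIOError:
                blocked = True
                break
    finally:
        a.close()
        b.close()
    return sent_total, blocked


if __name__ == "__main__":
    total, blocked = fill_send_buffer()
    print(f"accepted {total} bytes before blocking: {blocked}")
```

In a selector-based server this is the point at which the socket would be switched to EVENT_WRITE until it becomes writable again.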
The rationale for rejecting wouldn't be "DRY does not apply in this |
I also like the plan starting with refactoring ;-) |
OK, #1030 should be good to go. |
PR was merged 6 months ago, closing the issue. |