Add asyncio.BufferedProtocol #76432
A couple of emails from async-sig for context:
I propose to add another Protocol base class to asyncio: BufferedProtocol. It will have 'get_buffer()' and 'buffer_updated(nbytes)' methods instead of 'data_received()':

    class asyncio.BufferedProtocol:

        def get_buffer(self) -> memoryview:

        def buffer_updated(self, nbytes: int):

When the protocol's transport is ready to receive data, it will call get_buffer() to obtain a writable buffer to receive into. When data has been received into that buffer, buffer_updated(nbytes) will be called.

I've implemented the proposed design in uvloop (branch 'get_buffer', [1]) and adjusted your benchmark [2] to use it. Here are benchmark results from my machine (macOS):

vanilla asyncio: 120-135 Mb/s

[1] https://github.com/MagicStack/uvloop/tree/get_buffer
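For illustration, here is a rough sketch of what a protocol built on the proposed callbacks might look like, following the signatures quoted above. The class name, buffer size, and echo behavior are all invented for this example, and the API that eventually shipped may differ in details (for instance, extra arguments to get_buffer()).

```python
import asyncio

class EchoBufferedProtocol(asyncio.BaseProtocol):
    """Illustrative only: receives data directly into a preallocated buffer."""

    def __init__(self):
        # One buffer allocated up front; the transport writes into it via
        # recv_into() instead of creating a new bytes object per read.
        self._buffer = bytearray(256 * 1024)
        self._transport = None

    def connection_made(self, transport):
        self._transport = transport

    def get_buffer(self) -> memoryview:
        # Called when the transport is ready to receive: return a writable view.
        return memoryview(self._buffer)

    def buffer_updated(self, nbytes: int):
        # Called after nbytes bytes were written into the buffer by the transport.
        self._transport.write(bytes(self._buffer[:nbytes]))  # simple echo

    def connection_lost(self, exc):
        self._transport = None
```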
I've made a PR that implements the change for selector_events.py. With the change:

vanilla asyncio: 120-135 Mb/s

If we decide to go forward with the proposed design, I'll update the PR with support for proactor_events.
Numbers are great! A couple of questions.
I have another question: what happens if there is a partial read? For example, let's say I return a 1024-byte buffer in get_buffer(), but recv_into() receives data in 512-byte chunks. Is it:
or is it:
Let's say we have 2Kb of data in the socket's network buffer, and we only preallocated 0.5Kb in our buffer. We will then receive 0.5Kb into our buffer, 'Protocol.buffer_updated()' will be called with nbytes=512, and 1.5Kb of data will be left in the network buffer. So the loop will call get_buffer()/buffer_updated() again, and the cycle will continue until there's no data left.
Flow control will continue working absolutely the same for BufferedProtocols.
It will be as follows: recv_into() will read 512 bytes into the 1024-byte buffer, and buffer_updated() will be called with nbytes=512. Now it's the responsibility of the Protocol to return a correct view over the buffer the next time get_buffer() is called. The general idea is to preallocate a buffer, track how much of it has been filled, and return a view over the unfilled part from get_buffer().
I've implemented precisely this approach here: https://gist.github.com/1st1/1c606e5b83ef0e9c41faf21564d75ad7#file-get_buffer_bench-py-L27-L43
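A minimal sketch of that idea (names are invented for illustration; signatures follow the proposal in this issue, not necessarily the final API): the protocol tracks how much of its preallocated buffer is already filled and hands the transport a view over the remaining tail, so a 512-byte partial read into a 1024-byte buffer never overwrites earlier data.

```python
import asyncio

class PartialReadProtocol(asyncio.BaseProtocol):
    """Illustrative only: returns a view over the unfilled part of the buffer."""

    BUFSIZE = 1024

    def __init__(self):
        self._buffer = bytearray(self.BUFSIZE)
        self._filled = 0  # bytes of the buffer holding not-yet-processed data

    def get_buffer(self) -> memoryview:
        # Only expose the empty tail of the buffer to the next recv_into().
        return memoryview(self._buffer)[self._filled:]

    def buffer_updated(self, nbytes: int):
        self._filled += nbytes
        consumed = self._parse(bytes(self._buffer[:self._filled]))
        # Shift any unparsed remainder to the front and reuse the buffer.
        remaining = self._filled - consumed
        self._buffer[:remaining] = self._buffer[consumed:self._filled]
        self._filled = remaining

    def _parse(self, data: bytes) -> int:
        """Parse complete messages from data; return how many bytes were consumed."""
        return len(data)  # placeholder: consume everything
```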
Do you think it's possible to fold BufferedProtocol into Protocol? This would allow StreamReader to implement readinto().
It would be a backwards incompatible change :( Coincidentally, there might be protocols that already implement 'get_buffer()' and 'buffer_updated()' methods that do something completely different.
We can easily refactor StreamReader to use 'BufferedProtocol'. Methods like 'readexactly()' would benefit from that.
New protocol may speed up not only …
+1 on the idea, I would use this.
Looks nice. Can it speed up aiohttp too?
Yes.
See https://eklitzke.org/goroutines-nonblocking-io-and-memory-usage for an interesting discussion of the drawbacks of some buffer handling idioms.
Thanks for the link! It does make sense to use a pool of buffers for the proposed BufferedProtocol when you need to keep thousands of long-open connections. The current design makes that easy: when BufferedProtocol.get_buffer() is called you either take a buffer from the pool or allocate a temporary new one. For use-cases like DB connections (asyncpg) a permanently allocated buffer per protocol instance is a good solution too, as usually there's a fairly limited number of open DB connections.
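A minimal sketch of that pooling pattern, with invented names and sizes: get_buffer() borrows a fixed-size buffer from a shared pool (or allocates a temporary one when the pool is empty), and buffer_updated() returns it once the received bytes have been copied out.

```python
import asyncio
from collections import deque

class BufferPool:
    """Reuse fixed-size receive buffers across many long-lived connections."""

    def __init__(self, bufsize=64 * 1024, max_pooled=128):
        self._bufsize = bufsize
        self._max_pooled = max_pooled
        self._free = deque()

    def acquire(self) -> bytearray:
        # Take a pooled buffer if one is available, otherwise allocate a
        # temporary one (it is simply garbage-collected if the pool is full).
        return self._free.popleft() if self._free else bytearray(self._bufsize)

    def release(self, buf: bytearray):
        if len(self._free) < self._max_pooled:
            self._free.append(buf)


class PooledProtocol(asyncio.BaseProtocol):
    """Illustrative only: borrows a receive buffer from the pool per read."""

    def __init__(self, pool: BufferPool):
        self._pool = pool
        self._buf = None

    def get_buffer(self) -> memoryview:
        self._buf = self._pool.acquire()
        return memoryview(self._buf)

    def buffer_updated(self, nbytes: int):
        data = bytes(self._buf[:nbytes])  # copy out the received bytes
        self._pool.release(self._buf)
        self._buf = None
        self.process(data)

    def process(self, data: bytes):
        pass  # parse messages, feed a parser, etc.
```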
I did some benchmarks, and it looks like BufferedProtocol can make httptools up to 5% faster for relatively small requests (< 10Kb), so don't expect big speedups there. For asyncpg, benchmarks that fetch a lot of data (50-100Kb) get up to 15% faster, so we'll definitely use BufferedProtocol in asyncpg. For applications that need to handle megabytes of data per request (like Antoine's benchmark) the speedup will be up to 2x.