Avoiding copies #35
Comments
Yeah, it's an interesting thing to think about. I'm not sure there's much point in worrying about these copies, for several reasons. First, almost anything in Python is slow compared to

The other option in principle would be to use a fancier buffer structure, like a linked list of "chunks", or a ring buffer. But the problem with these kinds of constructs is that they require significant amounts of Python-level logic to construct, search through, etc. h11's strategy for speed is to use as little Python as possible; it leans heavily on C methods like

All in all, it seems difficult to get a meaningful advantage from using

(Note though that h11 does support zero-copy sends of data by using
For the reasons discussed above, I don't think there are any practical changes we can make here in the short-to-medium term, so closing.
These are some uneducated musings (I am not really knowledgeable about network programming).
Let's say we are writing a simple-minded server over a TCP socket. For simplicity I am only talking about the receive path, not the send path. The basic flow using h11 is:
`recv` from the socket into a buffer.

A side note about the first step (not the topic of this issue): this is usually done using `data = socket.recv(BUF_SIZE)`. From my reading of the CPython code, the way Python does this magic is that it just allocates a fresh buffer of size `BUF_SIZE` and reads into that. If `BUF_SIZE - 1 < SMALL_REQUEST_THRESHOLD (= 512)`, this might come from some memory pool, otherwise it's just `malloc`. So for a C programmer, this seems very wasteful. But fortunately it seems possible to reuse a buffer instead (I haven't checked to see if it actually makes a difference).

What I do want to talk about (and what is actually relevant to h11...) is the fact that we make two copies of the data: kernel -> buffer, buffer -> h11. It seems natural to ask: can we reduce this to one copy?
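On the side note about reusing a receive buffer: the snippet from the original post did not survive here, so what follows is only a minimal sketch of one way to do it, using the standard library's `socket.recv_into`. The helper name `recv_reusing_buffer`, the 4096-byte `BUF_SIZE`, and the `socketpair` demo are assumptions made for illustration, and, as noted above, whether this is measurably faster has not been checked.

```python
import socket

BUF_SIZE = 4096  # assumed size for illustration, not from the original post


def recv_reusing_buffer(sock, buf):
    """Receive into a preallocated buffer and return a view of the bytes read.

    Hypothetical helper: no new bytes object is allocated per call; the
    caller owns `buf` and reuses it across calls.
    """
    view = memoryview(buf)
    nbytes = sock.recv_into(view)
    return view[:nbytes]


# Demo over a local socket pair so the sketch runs without a network.
a, b = socket.socketpair()
buf = bytearray(BUF_SIZE)  # allocated once, reused for every read

a.sendall(b"GET / HTTP/1.1\r\n")
first = bytes(recv_reusing_buffer(b, buf))

a.sendall(b"Host: example.com\r\n\r\n")
second = bytes(recv_reusing_buffer(b, buf))  # overwrites the same storage

a.close()
b.close()
```

Note that the returned view is only valid until the next read into the same buffer, so anything that needs to outlive that read still has to be copied out (here with `bytes(...)`).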
I don't think this is a pressing issue for h11, as any copying overhead is pretty minor compared to other overhead, currently. But it seems interesting to ask in relation to the sans-io methodology in general.
One way I imagine this could work, without inverting the logic again and losing the advantage of sans-io, is to have a way for h11 itself to provide a buffer for the application to `recv` into. Like maybe the application tells h11 how much it wants to `recv`, and gets back a `memoryview` of h11's buffer. Then it receives into that and tells h11 how much it read. But there are probably better ways.
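To make the idea a bit more concrete, here is a rough sketch of what such an interface could look like. Everything in it is hypothetical: `ReceiveBuffer`, `get_recv_view`, `commit`, and `pending` are names invented for this illustration and are not part of h11's API, and the fixed-capacity buffer is a simplification.

```python
import socket


class ReceiveBuffer:
    """Hypothetical parser-owned buffer (NOT h11's API).

    The application asks for a writable view, receives directly into it,
    then reports how many bytes were written.
    """

    def __init__(self, capacity=65536):
        self._data = bytearray(capacity)
        self._used = 0  # number of valid bytes at the start of _data

    def get_recv_view(self, max_bytes):
        """Return a writable memoryview over up to max_bytes of free space."""
        free = len(self._data) - self._used
        if free == 0:
            raise BufferError("receive buffer is full")
        n = min(max_bytes, free)
        return memoryview(self._data)[self._used:self._used + n]

    def commit(self, nbytes):
        """Record that nbytes were written into the view handed out last."""
        if nbytes < 0 or self._used + nbytes > len(self._data):
            raise ValueError("commit size out of range")
        self._used += nbytes

    def pending(self):
        """Return a copy of the bytes committed so far (for inspection)."""
        return bytes(self._data[:self._used])


# Demo: the only copy of the payload is kernel -> parser-owned buffer.
a, b = socket.socketpair()
rb = ReceiveBuffer()

a.sendall(b"GET / HTTP/1.1\r\n\r\n")
view = rb.get_recv_view(4096)  # application says how much it wants to recv
nbytes = b.recv_into(view)     # socket writes straight into the buffer
rb.commit(nbytes)              # application tells the buffer how much it read
view.release()

result = rb.pending()
a.close()
b.close()
```

The I/O still happens entirely in the application, so the sans-io split is preserved; the parser only lends out storage. A real version would also need to handle growing or compacting the buffer, which this sketch leaves out.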