
Tune default receive buffer size #1139

Open
njsmith opened this issue Jul 4, 2019 · 0 comments

njsmith commented Jul 4, 2019

Since #1123, our streams have a default receive size baked in. For example, the initial default for SocketStream is 64KiB. Is this the best size? We have no idea. It would be nice to have some idea.
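
For concreteness, the default in question is what you get when you call receive_some() with no explicit size. A minimal usage sketch, assuming the post-#1123 API where max_bytes is optional (the host and request here are just placeholders):

```python
import trio

async def main():
    stream = await trio.open_tcp_stream("example.com", 80)
    await stream.send_all(b"GET / HTTP/1.0\r\nHost: example.com\r\n\r\n")
    # No max_bytes argument: the stream falls back to its built-in
    # default receive size (currently 64 KiB for SocketStream).
    data = await stream.receive_some()
    print(f"received {len(data)} bytes")

trio.run(main)
```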

On #1123, @oremanj raised some of the issues that might affect this:

One wrinkle: AFAIK, each call to socket.recv() allocates a new bytes object large enough for the entire given chunk size. If large allocations are more expensive than small ones, passing a too-large buffer is probably bad for performance. (The allocators I know of use 128 KB as their threshold for "this is big, mmap it instead of finding a free chunk", but if one used 64 KB instead and we got a mmap/munmap pair on each receive, that feels maybe bad?)

My intuition favors a much lower buffer size, like 4KB or 8KB, but I also do most of my work on systems that are rarely backlogged, so my intuition might well be off when it comes to a high-throughput Trio application.

Another option we could consider: the socket owns a receive buffer (a bytearray) that it reuses, calling recv_into() and copying just the bytes actually received into a new bytes object to return. Downside: it spends 64 KB (or whatever) per socket in steady state. Counterpoint: the OS-level socket buffers are probably much larger than that (but I don't know how much memory they occupy when the socket isn't backlogged).
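
A minimal sketch of that reuse pattern, using the stdlib socket module synchronously just to illustrate the idea (this is not Trio's actual implementation; the class name and buffer size are made up):

```python
import socket

RECV_BUF_SIZE = 64 * 1024  # hypothetical per-socket buffer size


class BufferedReceiver:
    """Reuse one bytearray per socket instead of letting recv()
    allocate a fresh, oversized bytes object on every call."""

    def __init__(self, sock: socket.socket):
        self._sock = sock
        self._buf = bytearray(RECV_BUF_SIZE)  # allocated once, reused

    def receive_some(self) -> bytes:
        # recv_into() fills the reused buffer and reports how many
        # bytes actually arrived...
        nbytes = self._sock.recv_into(self._buf)
        # ...and only that many bytes are copied out into the bytes
        # object handed back to the caller.
        return bytes(memoryview(self._buf)[:nbytes])
```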

It's true that if you do sock.recv(N), Python has to malloc an N-byte buffer and then realloc back down to the size actually received, so there is some cost to using a large N. The consequences of that aren't very obvious to me, though. Most allocators have countermeasures against repeatedly growing and shrinking the heap like that (e.g. search for "malloc hysteresis"). If doing our own buffer management turns out to be worthwhile, note that we could potentially share a single buffer between all the sockets in a given thread. But of course the real answer to all of this is that we have to measure.
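
As a starting point for that measurement, here is a rough micro-benchmark sketch over a local socketpair, comparing a fresh oversized recv(N) against a reused recv_into() buffer. The sizes and iteration count are arbitrary, and the results will depend heavily on platform, libc, and allocator:

```python
import socket
import time

CHUNK = 16 * 1024       # bytes actually sent per iteration
BIG_N = 64 * 1024       # the "too large" requested receive size
ITERATIONS = 50_000


def bench(use_recv_into: bool) -> float:
    a, b = socket.socketpair()
    payload = b"x" * CHUNK
    buf = bytearray(BIG_N)
    start = time.perf_counter()
    for _ in range(ITERATIONS):
        a.sendall(payload)
        received = 0
        while received < CHUNK:
            if use_recv_into:
                n = b.recv_into(buf)               # reuse one buffer
                data = bytes(memoryview(buf)[:n])  # copy out, like a stream would return
            else:
                data = b.recv(BIG_N)               # fresh oversized allocation each call
            received += len(data)
    elapsed = time.perf_counter() - start
    a.close()
    b.close()
    return elapsed


print(f"recv(64 KiB):      {bench(False):.2f}s")
print(f"recv_into + reuse: {bench(True):.2f}s")
```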
