_handle_read and _try_inline_read do not allow processing of data as long as new data arrives #597

Closed
fredde-fisk opened this Issue Sep 24, 2012 · 2 comments

_handle_read and _try_inline_read never allow buffered data to be processed as long as new data continues to arrive. The cause is the loop in those methods, which keeps reading as long as there is more data in the stream:

while True:
    if self._read_to_buffer() == 0:
        break

As a result, a read_bytes call can end up buffering 100 MiB of data and raising an "IOError: Reached maximum read buffer size" exception even though it only asked for 2 bytes.
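To make the failure mode concrete, here is a minimal simulation of it. This is not Tornado's code: `FakeSocket`, `greedy_read`, and `checked_read` are hypothetical names invented for illustration, and the fake socket is assumed to always have another chunk ready, which is the situation described above.

```python
class FakeSocket:
    """Hypothetical stand-in for a socket on which new data never stops arriving."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size

    def recv(self):
        # Always returns a full chunk, so a "read until empty" loop never ends.
        return b"x" * self.chunk_size


def greedy_read(sock, num_bytes, max_buffer_size):
    """Mimics the loop in the report: keep reading until the socket is empty."""
    buffer = bytearray()
    while True:
        chunk = sock.recv()
        if not chunk:
            break
        buffer += chunk
        if len(buffer) >= max_buffer_size:
            raise IOError("Reached maximum read buffer size")
    return bytes(buffer[:num_bytes])


def checked_read(sock, num_bytes, max_buffer_size):
    """Stops reading as soon as the pending request can be satisfied."""
    buffer = bytearray()
    while len(buffer) < num_bytes:
        chunk = sock.recv()
        if not chunk:
            break
        buffer += chunk
        if len(buffer) >= max_buffer_size:
            raise IOError("Reached maximum read buffer size")
    return bytes(buffer[:num_bytes])
```

With a 2-byte request, `greedy_read` fills the buffer to the limit and raises, while `checked_read` returns after the first chunk.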

I discovered this problem when running a websocket server where the client sends a continuous stream of messages.

Owner

bdarnell commented Sep 29, 2012

For reference, the responsible commit was 41463a9

This problem can only occur if data is arriving as fast as read_to_buffer can pull it out of the socket (the socket buffer itself is limited in size so the data must be newly-arriving); I had assumed this wouldn't be an issue in practice, but I guess it can happen.

Calling read_from_buffer after every read_to_buffer is unacceptably slow in the read_until case, although it would be fine for read_bytes. Our options include: A) attempt to read_from_buffer only if we're reading a number of bytes, and keep the current behavior for other reads, B) read_from_buffer after every N calls to read_to_buffer, regardless of the type of read, C) increase read_chunk_size, which may let us revert 41463a9 while keeping acceptable performance, or D) be clever in read_from_buffer for delimited reads, taking advantage of the fact that the first part of the buffer has already been searched.
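Option D could look roughly like the sketch below. It is only an illustration of the idea, not the approach Tornado adopted: `DelimiterScanner` and its attributes are hypothetical names, and it assumes a single contiguous `bytearray` buffer. After each failed search it records how much of the buffer has been ruled out, so the next search rescans only the tail.

```python
class DelimiterScanner:
    """Hypothetical incremental search for a delimiter in a growing buffer."""

    def __init__(self, delimiter):
        self.delimiter = delimiter
        self.buffer = bytearray()
        # Index from which the next search must start; everything before it
        # is known not to contain the start of a complete delimiter.
        self._search_from = 0

    def feed(self, chunk):
        """Append a chunk; return one delimited message if available, else None."""
        self.buffer += chunk
        pos = self.buffer.find(self.delimiter, self._search_from)
        if pos == -1:
            # The delimiter may straddle the boundary with the next chunk, so
            # keep the last len(delimiter) - 1 bytes eligible for rescanning.
            self._search_from = max(0, len(self.buffer) - len(self.delimiter) + 1)
            return None
        end = pos + len(self.delimiter)
        message = bytes(self.buffer[:end])
        del self.buffer[:end]
        self._search_from = 0
        return message
```

For simplicity `feed` returns at most one message per call, so a caller would need to call it again (for example with an empty chunk) to drain any further messages already in the buffer.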

Owner

bdarnell commented Apr 28, 2014

This has been fixed in the master branch for read_bytes and read_until_close. For read_until and read_until_regex the max_bytes parameter can be used to mitigate the problem by setting a limit below the max_buffer_size.

bdarnell closed this Apr 28, 2014
