Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Websocket compression for send_bytes on by default #2856

Closed
frederikaalund opened this issue Mar 20, 2018 · 9 comments
Closed

Websocket compression for send_bytes on by default #2856

frederikaalund opened this issue Mar 20, 2018 · 9 comments
Labels

Comments

@frederikaalund
Copy link
Contributor

Long story short

I recently upgraded from Aiohttp 2.2.5 to 3.0.6. After that, I got a lot of dropped websocket connections. Through investigation, I found that 2.3.0 introduced the compression=True parameter for websockets. It turns out, that this change caused the dropped connections. Simply setting compression=False (and thereby returning to the pre-2.3.0 behaviour) solved the issue.

I can understand that having compression on by default makes a lot of sense for text messages (e.g., send_str or send_json messages). We can naturally assume statistical redundancy for text messages.
However, I find it strange to find compression on by default for binary messages (e.g., send_bytes messages). I don't think we can assume anything about binary messages. Said messages may very well be near-random (and therefore compress very poorly).

In my application, I send a lot of binary data over a websocket (roughly 500 KBps) that doesn't compress very well at all. The (needless) compression caused the python process take up all CPU resources and, ultimately, drop the connection due to timeout.

Expected behaviour

I expected send_bytes to have compression off by default.

Actual behaviour

Since 2.3.0, WebSocketResponse has compression on by default. Since 3.0.0, it is possible to disable compression on a per-method basis (send_bytes(..., compression=False)). However, compression is on by default, which I think is very questionable.

@asvetlov
Copy link
Member

What is exact reason for dropping connections? Timeouts on sending/receiving data?

If both client and server negotiates that they want and able to use compression -- why force disabling it?

I don't see why aiohttp should make assumption about data content. Binary data could be random but very often they are structured and could be compressed.
You can send a random text over websocket as well :)

If you don't need a compression -- just disable it.

The parameter's default value is kind of frozen. To don't make a mess we should support a consistency for all aiohttp 3.x releases.

We can return to the question at aiohttp 4.0 development time but I still not convinced that we should. I see very many cases when compression is useful (maybe with lower throughput)

@frederikaalund
Copy link
Contributor Author

Thanks for the quick reply.

What is exact reason for dropping connections? Timeouts on sending/receiving data?

Timeout on send_bytes. I've wrapped send_bytes with async_timeout.timeout(5).

Binary data could be random but very often they are structured and could be compressed.

Emphasis mine. If that is indeed the case, then just go ahead and close this issue. In my experience, with the data that I'm used to seeing, binary data is very often near-random (and therefore incompressible). Only rarely do I send structured binary data.

I haven't done any studies of the usual structure (or non-structure) of binary data in HTTP-based applications. It may very well be that structured binary case is the common case (and that my case is rare).

We can return to the question at aiohttp 4.0 development time but I still not convinced that we should. I see very many cases when compression is useful (maybe with lower throughput)

That makes a lot of sense. Let's see if others are having the same issue that I had. If not, just keep things as they are. Closing this for now.

@hubo1016
Copy link
Contributor

@asvetlov I've noticed that aiohttp uses the default level for zlib.compressobj, which is 6. I think that is too large for a transport compression. We should set it to 1 (Z_BEST_SPEED). Setting the compress level to Z_BEST_SPEED improves nearly 100% performance for common text data like JSON, with only 2% larger result.

There are a lot of binary data that are already compressed (e.g. JPEG images, MP4 videos, MP3 audio, ZIP archives, etc). Processing such data causes a lot of CPU time, and user should disable compression explicitly (even Z_BEST_SPEED is too slow)

@asvetlov
Copy link
Member

What https://tools.ietf.org/html/rfc7692 recommends?
What other implementations does?

@asvetlov
Copy link
Member

Compressing already compressed data is pointless, you are right.
I think it is up to user do disable compression for sending such payload.
Websocket library itself has no knowledge about the data nature.

@hubo1016
Copy link
Contributor

@asvetlov Yes you are right. Users are responsible for these situations.

RFC said nothing about compress level, I think it is just an implementation detail, because it is transparent to the decompression side.

@hubo1016
Copy link
Contributor

Syntax:	gzip_comp_level level;
Default:	
gzip_comp_level 1;
Context:	http, server, location
Sets a gzip compression level of a response. Acceptable values are in the range from 1 to 9.

Nginx gzip module uses 1 by default.

@asvetlov
Copy link
Member

Level 1 makes sense. Does anybody want to make a pull request?

@lock
Copy link

lock bot commented Oct 28, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a [new issue] for related bugs.
If you feel like there's important points made in this discussion, please include those exceprts into that [new issue].
[new issue]: https://github.com/aio-libs/aiohttp/issues/new

@lock lock bot added the outdated label Oct 28, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Oct 28, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

3 participants