Performance of POST body processing speed is 10x slower in Cowboy 2.10.0 compared to 1.1.2 #1611

Open
EzoeRyou opened this issue Jun 28, 2023 · 3 comments

Comments

@EzoeRyou

We're porting Erlang software that depends on the now-deprecated Cowboy 1.1.2 to the recent Cowboy 2.10.0.

During the porting process, we found that when processing POST bodies, Cowboy 2.10.0 is 10x slower in terms of bandwidth than Cowboy 1.1.2 without the JIT enabled. It is still 8.4x slower with the JIT enabled.

This performance regression prevents us from updating Cowboy in our software.

Here is minimal benchmark code to reproduce the issue, along with a summary of the benchmark results.

https://github.com/AoiMoe/cowboy_post_bench

@essen
Member

essen commented Jun 28, 2023

You may want to tweak the read_body options or the HTTP/1.1 option active_n, maybe others.
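For readers following along, here is a minimal sketch of where these options live in Cowboy 2. It is not taken from the benchmark repository: the module name `post_bench_h`, the listener name, port 8080, and every numeric value are assumptions chosen for illustration, not recommended settings.

```erlang
-module(post_bench_h).
-behaviour(cowboy_handler).
-export([start/0, init/2]).

%% Start a clear-text listener. active_n is an HTTP/1.1 protocol option:
%% the number of packets Cowboy asks the socket for before it has to
%% re-arm active mode. 1000 is an arbitrary illustrative value.
start() ->
    {ok, _} = application:ensure_all_started(cowboy),
    Dispatch = cowboy_router:compile([{'_', [{"/", ?MODULE, []}]}]),
    cowboy:start_clear(post_bench_listener,
        [{port, 8080}],
        #{env => #{dispatch => Dispatch},
          active_n => 1000}).

%% Read the whole body, count its bytes, and reply with the count.
init(Req0, State) ->
    {ok, Size, Req1} = read_all(Req0, 0),
    Req = cowboy_req:reply(200, #{}, integer_to_binary(Size), Req1),
    {ok, Req, State}.

%% length and period are read_body options: read_body returns once
%% roughly `length` bytes have arrived or `period` milliseconds have
%% passed. Both values here are arbitrary.
read_all(Req0, Acc) ->
    case cowboy_req:read_body(Req0, #{length => 1000000, period => 5000}) of
        {ok, Data, Req} -> {ok, Acc + byte_size(Data), Req};
        {more, Data, Req} -> read_all(Req, Acc + byte_size(Data))
    end.
```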

@EzoeRyou
Author

Thanks for the suggestions.

We tweaked various options, but changing active_n and length doesn't solve the performance regression.

We found that changing the buffer size via socket setopts was effective, but it's still 10-20% slower than cowboy1. buffer needs to be set to a really huge value to compensate for the regression introduced in cowboy2.

The detailed micro benchmark code and results are noted here; see Test 3.

https://github.com/AoiMoe/cowboy_post_bench

To summarize the buffer size tweaking: cowboy2 with the default buffer size of 1460 is 10x slower than cowboy1. Performance improves as we increase the buffer size. We saw dramatic improvement (or, as I'd rather call it, compensation) up to a buffer size of 32768. Beyond that the returns diminish, but we still saw some improvement up to 262144. A buffer size of 524288 was worse than 262144. It never reaches the performance of cowboy1.
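As a rough illustration of this kind of tuning, the sketch below passes the `buffer` socket option through the listener's transport options. It assumes the option is set on the listener; the benchmark repository may set it differently. The module name, listener name, and port are hypothetical, and 32768 is simply one of the sizes mentioned above.

```erlang
-module(buffer_listener).
-export([start/1]).

%% Dispatch is a compiled cowboy_router dispatch table supplied by the caller.
%% socket_opts are handed to the underlying TCP socket; {buffer, Size}
%% sets the size of the user-level buffer used by the inet driver.
start(Dispatch) ->
    TransportOpts = #{socket_opts => [{port, 8080}, {buffer, 32768}]},
    cowboy:start_clear(buffer_listener, TransportOpts,
        #{env => #{dispatch => Dispatch}}).
```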

Increasing the buffer size somewhat mitigates the performance regression in cowboy2. The micro benchmark ran on a loopback device rather than over a real Internet route, so it is not a real-world scenario, but we still think a 10-20% performance regression is too much to risk the upgrade. We also think the default behaviour should be sane.

Is there any way to completely fix the performance regression introduced in cowboy2?

@essen
Member

essen commented Jul 14, 2023

The changes that result in a performance drop are related to the support for HTTP/2, which performs better than HTTP/1.1 in real use cases. In the future Cowboy will also support HTTP/3, which performs even better (the http3 branch is a work in progress).

There's likely room for improvement for HTTP/1.1 still, I'll take a look when time allows. But right now my priority is HTTP/3.

There's not much point measuring performance using loopback, for what it's worth, although I'm sure the code performs worse in Cowboy 2 due to how it is structured. One thing you can do with Cowboy 2, however, is write your own stream handler to handle these requests, as stream handlers execute in the connection process and have the same performance properties as Cowboy 1 had.
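A minimal sketch of what such a stream handler might look like, assuming the goal is only to consume a POST body in the connection process and reply with its size. The module name `upload_stream_h` is hypothetical and this code does not come from the thread or from Cowboy itself. It relies on the default initial flow window to receive the first chunks and then requests more data with `{flow, Size}` as each chunk is consumed.

```erlang
-module(upload_stream_h).
-behaviour(cowboy_stream).

-export([init/3, data/4, info/3, terminate/3, early_error/5]).

%% The state is the number of body bytes seen so far.
init(_StreamID, Req, _Opts) ->
    case cowboy_req:has_body(Req) of
        true -> {[], 0};          %% wait for data/4 to be called
        false -> {reply(0), 0}    %% no body: reply right away
    end.

%% Last chunk: send the response and stop the stream.
data(_StreamID, fin, Data, Count) ->
    Total = Count + byte_size(Data),
    {reply(Total), Total};
%% Empty intermediate chunk: nothing to account for.
data(_StreamID, nofin, <<>>, Count) ->
    {[], Count};
%% Intermediate chunk: count it and ask for as many bytes again.
data(_StreamID, nofin, Data, Count) ->
    Size = byte_size(Data),
    {[{flow, Size}], Count + Size}.

info(_StreamID, _Info, State) ->
    {[], State}.

terminate(_StreamID, _Reason, _State) ->
    ok.

early_error(_StreamID, _Reason, _PartialReq, Resp, _Opts) ->
    Resp.

%% content-length is set explicitly because the response does not go
%% through cowboy_req:reply/4 here.
reply(Total) ->
    Body = integer_to_binary(Total),
    Headers = #{<<"content-length">> => integer_to_binary(byte_size(Body))},
    [{response, 200, Headers, Body}, stop].
```

A handler like this would be enabled with `stream_handlers => [upload_stream_h]` in the protocol options. Because it replaces the default `cowboy_stream_h`, no request process is spawned and routing is bypassed, so every request on that listener is answered by this module.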
