add an async http benchmark #16

Merged

Conversation

anuragsoni
Contributor

Hello! I'm not sure if there is any interest in adding a comparison with an async-based HTTP server, so feel free to close this if it adds too much overhead to running the benchmarks 😄. I've been experimenting with some async-based buffered channels, and so far the results on my test box look pretty promising.

@anuragsoni
Contributor Author

anuragsoni commented Sep 8, 2021

I was able to get a test run working (using the default make target from this repo). The benchmark ran inside Docker on a 4-core i7-8559U CPU @ 2.70GHz (the base OS is Ubuntu 21.04), with 1000 connections and 60-second runs.

[Screenshot: benchmark results with 1000 connections, 60-second runs]

Collaborator

@talex5 talex5 left a comment

Looks good to me! I do wonder why eio (io-uring) is slower with 1000 connections though. I'm hoping it's a Linux kernel problem and will get fixed at some point without me doing anything though ;-)

@talex5 talex5 merged commit c4311aa into ocaml-multicore:master Sep 9, 2021
@anuragsoni anuragsoni deleted the add-an-async-http-server-test branch September 9, 2021 12:07
@anuragsoni
Contributor Author

I'm hoping it's a Linux kernel problem and will get fixed at some point without me doing anything though ;-)

That is quite possible :) I've been reading about io_uring performance improvements with every new kernel release, so this might just be one of those things that is resolved by waiting a little longer 😄

@gasche

gasche commented Oct 1, 2021

(I ended up here from the Multicore newsletter.)

Thanks for the nice benchmark! Almost all implementations seem to hit a breakdown point where they stop being able to serve more requests -- sometimes we see them dropping some requests beforehand, sometimes not. The only two implementations that never reach their breakdown point in the test are shuttle_async and rust_hyper. Could the number of requests per second be increased to reach those breakdown points as well?

In other words: if you try to measure implementations by "maximum number of requests/s they can handle under this benchmark", the current measurement for shuttle_async and rust_hyper is "infinitely many". This suggests that we may need to extend the measurement range to get real numbers for these two implementations, and actually know how many more requests they can handle than the others.
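As a rough illustration of how those breakdown points could be located, here is a minimal sketch that sweeps the target rate upward and stops once the achieved rate falls noticeably below it. It assumes a wrk2-style load generator (the wrk binary built from the wrk2 fork, which adds the -R rate flag); the URL, thread/connection counts, rate range, and the 95% threshold are all illustrative, not this repo's actual harness.

```python
import re
import subprocess

URL = "http://localhost:8080/"  # placeholder endpoint, not the repo's actual target


def achieved_rate(target, connections=1000, duration="60s"):
    """Run one constant-rate test and parse the achieved requests/sec."""
    out = subprocess.run(
        ["wrk", "-t", "8", "-c", str(connections), "-d", duration,
         "-R", str(target), "--latency", URL],
        capture_output=True, text=True, check=True,
    ).stdout
    m = re.search(r"Requests/sec:\s*([\d.]+)", out)
    return float(m.group(1)) if m else 0.0


# Sweep the offered load until the server can no longer keep up.
for target in range(100_000, 220_001, 20_000):
    got = achieved_rate(target)
    print(f"target={target:>7} achieved={got:>10.0f}")
    if got < 0.95 * target:  # achieved rate fell well below the offered rate
        print(f"breakdown point reached somewhere below {target} req/s")
        break
```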

@anuragsoni
Contributor Author

@gasche Could the number of requests per second be increased to reach those breakdown points as well?

I was curious about this as well! I found the test outputs from one of my older test runs where I tried to increase the request rate to 200,000 requests per second.

[Screenshot: benchmark results at 200,000 requests per second]

I was also curious what things look like with a much higher number of connections, so I did another test run with 10_000 connections.

[Screenshot: benchmark results with 10_000 connections]

What's missing from these graphs are the latency numbers. The Rust server fared better than any of the OCaml options on the latency front.

@gasche

gasche commented Oct 2, 2021

Thanks! Ideally the complete benchmark (comparing all implementations) would include the rust_hyper breakdown point, by going up to 160_000 or 200_000 requests/s. It would give a better visual intuition of the performance ratio between the implementations.

P.S.: Congratulations on Shuttle, it looks like it's doing really well :-)

@anuragsoni
Contributor Author

@gasche P.S.: Congratulations on Shuttle, it looks like it's doing really well :-)

Thanks! I've been pretty happy with how far I could stretch async so far!

Ideally the complete benchmark (comparing all implementations) would include the rust_hyper breakdown point, by going up to 160_000 or 200_000 requests/s. It would give a better visual intuition of the performance ratio between the implementations.

Agreed! I'll propose a pull request with this change. I also want to take some time to see how much work it'll be to generate graphs for the latency numbers, as that seems like an important metric to keep an eye on. Serving a lot of connections isn't very useful if latency degrades too much.
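For the latency graphs, here is a minimal sketch of the kind of plotting script I have in mind, assuming the percentiles have already been scraped from the load generator's output into a CSV (the file name and column names below are placeholders, not an existing format in this repo):

```python
import csv
import matplotlib.pyplot as plt

# Hypothetical input: one row per server with latency percentiles in ms,
# e.g. "server,p50_ms,p99_ms" -- not the repo's actual output format.
servers, p50, p99 = [], [], []
with open("latency.csv") as f:
    for row in csv.DictReader(f):
        servers.append(row["server"])
        p50.append(float(row["p50_ms"]))
        p99.append(float(row["p99_ms"]))

# Grouped bar chart: one p50 and one p99 bar per server implementation.
x = range(len(servers))
plt.bar([i - 0.2 for i in x], p50, width=0.4, label="p50")
plt.bar([i + 0.2 for i in x], p99, width=0.4, label="p99")
plt.xticks(list(x), servers, rotation=45, ha="right")
plt.ylabel("latency (ms)")
plt.legend()
plt.tight_layout()
plt.savefig("latency.png")
```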
