Benchmarking test - statistical results #101
It's quite fun watching all 16 hyperthreads get pegged to 100%, I have to say. ^.^ The first 'big' run was done with just a 1s warmup and 1s test run time, so it is likely not accurate, but it is a quick way to make sure it is all working. The results:

Total Cores: 16
```
Processing: bin/server_rust_iron
Shutting down server: bin/server_rust_iron
Processing: bin/server_rust_nickel
Shutting down server: bin/server_rust_nickel
Processing: bin/server_rust_rocket
Shutting down server: bin/server_rust_rocket
Processing: bin/server_crystal_kemal
Shutting down server: bin/server_crystal_kemal
Processing: bin/server_crystal_raze
Shutting down server: bin/server_crystal_raze
Processing: bin/server_crystal_router_cr
Shutting down server: bin/server_crystal_router_cr
Processing: bin/server_go_echo
Shutting down server: bin/server_go_echo
Processing: bin/server_go_fasthttprouter
Shutting down server: bin/server_go_fasthttprouter
Processing: bin/server_go_gin
Shutting down server: bin/server_go_gin
Processing: bin/server_go_gorilla_mux
Shutting down server: bin/server_go_gorilla_mux
Processing: bin/server_go_iris
Shutting down server: bin/server_go_iris
Processing: bin/server_python_flask
Shutting down server: bin/server_python_flask
Processing: bin/server_python_sanic
Shutting down server: bin/server_python_sanic
Processing: bin/server_elixir_phoenix
Shutting down server: bin/server_elixir_phoenix
Processing: bin/server_elixir_plug
Shutting down server: bin/server_elixir_plug
Processing: bin/server_node_clusterexpress
Shutting down server: bin/server_node_clusterexpress
Processing: bin/server_node_express
Shutting down server: bin/server_node_express
```
That looks so much better on github than in the terminal... ^.^; Good to see no errors happened on any of them! Some are clearly faster than others, but most are quite close together. It is sorted by name at this point, so ignore the ordering (what should I sort by?). It is also quite interesting to see the speed differences between the 3 different routes for a single given server. From a quick look, it seems python's flask is the overall slowest (by far!), with node's express not far behind. Go had the most impressive showing (I really, really dislike the 'language' itself though, blegh...), and Rust was not far behind Go. Elixir was a good middle ground. Crystal was pretty far behind though; I guess its multi-threading is not up to par quite yet, unlike many of the other frameworks (I am running the latest/nightly versions of crystal/rust/go/python3, though my elixir is one version out of date now since a new version just came out). Now I think I may let it run with a higher connection count with 5s warmup and 30s runtime per test.
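The warmup-then-measure pattern described above (a short discarded warmup run, then a timed measurement run) can be sketched roughly as follows. This is a minimal illustration, not the actual stats script; the function name and parameters are my own invention:

```python
import time

def benchmark(fn, warmup_s=1.0, run_s=1.0):
    """Call fn repeatedly: first for warmup_s (results discarded so caches and
    connections settle), then for run_s (measured).
    Returns (iterations, elapsed_seconds, iterations_per_second)."""
    # Warmup phase: throw results away.
    deadline = time.perf_counter() + warmup_s
    while time.perf_counter() < deadline:
        fn()
    # Measurement phase: count completed iterations against the clock.
    count = 0
    start = time.perf_counter()
    deadline = start + run_s
    while time.perf_counter() < deadline:
        fn()
        count += 1
    elapsed = time.perf_counter() - start
    return count, elapsed, count / elapsed
```

A 1s/1s run like the one above is quick but noisy; longer warmup and run times average out scheduler and GC jitter.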
First I decided to test 'just' rust with 10k concurrent connections. Whoops, I need to increase one of my OS limits; glad I did this test first... >.> On the plus side, for as many threads as it was able to create (EDIT: misread, amended this sentence ^.^), Rust did quite well, though it did not really scale any better with the connection count. The results:

Total Cores: 16
Processing servers:
```
Processing: bin/server_rust_iron
Processing: bin/server_rust_nickel
Processing: bin/server_rust_rocket
```
With my system limit raised, I decided to try elixir with 50K connections (I set the file descriptor limit to 65536), with a warmup of 3s and a runtime of 10s. The results:

Total Cores: 16
Processing servers:
```
Processing: bin/server_elixir_phoenix
Processing: bin/server_elixir_plug
```
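Raising the file-descriptor limit mentioned above can also be done from inside the process, at least on Unix-like systems; the soft limit can be raised up to the hard limit without privileges. A small sketch (the helper name and target value are mine) using Python's `resource` module, equivalent in spirit to `ulimit -n 65536` in the shell:

```python
import resource

def raise_fd_limit(target=65536):
    """Raise this process's open-file-descriptor soft limit toward `target`.
    Only root can raise the hard limit, so we cap at it."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # RLIM_INFINITY means "no hard cap"; otherwise stay within the hard limit.
    new_soft = target if hard == resource.RLIM_INFINITY else min(target, hard)
    if new_soft > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)
```

Each concurrent connection consumes a descriptor on both the sieging side and the server side, which is why 50k connections blows past the common 1024/4096 defaults.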
50k is definitely a bit more than my system can handle, it seems, so I think 1k is a good default value; after all, even rust didn't scale to higher concurrent connection counts later, though it did seem to sustain them well. Now it is time for the 5s warmup, 30s runtime tests with 1k connections. I went ahead and pasted in the shell command I used to launch it as well so all settings can be seen; it defaults to 1000 concurrent connections, and it is limited to half the total core count for sending the surge of requests. 'Most' servers absolutely cap out every CPU at 100%, though some are obviously single-threaded, like python and node.

Memory usage did not seem to fluctuate much among any of the servers. The system baseline was about 390MB in use. Rust seemed to spike to between 420-600MB (the higher-memory one was the faster one, as expected; I should have it report CPU/memory usage of the system at baseline and during tests... maybe later). Crystal seemed to average ~500MB, though it seemed to be using only 1 CPU total (~102% CPU usage instead of the expected ~1200%). Shouldn't crystal be heavily multi-threaded? That would definitely explain its poor ranking, though perhaps it's trying to minimize context-switching cost by passing information between threads (going multi-threaded will definitely add a hit on speed; it's just a question of whether the multi-threading itself will make up for it or not. For note, the sieging engine is using ~10% of each core).

And wow, when the go servers start running, all cores are pegged to 100% again, with surprisingly little memory usage, only around ~438MB of system usage (why does the language syntax suck so much, blegh...). All 4(5?) of the other go servers used almost identical memory too; fasthttprouter was just a 'tiny' bit less than the others (~432MB) and iris just a tiny bit more (~450MB). Python's flask ate about ~418MB and only 1 core; python's sanic, however, hit near 100% CPU usage(!) and ate ~520MB, and it is multi-process, as I saw many processes start up. Then elixir phoenix started: it ate about 80% of each CPU (its VM tries to be as CPU-friendly as possible) and used 550MB; elixir plug was identical, which makes sense since phoenix is just a full web framework on top of plug. Node's clusterexpress is obviously multi-core (it hits 100% of all CPUs), however it eats an amazing 1.24GB of RAM (!?); express by itself barely hits 100% of one CPU, but eats only ~477MB (which is still high compared to most servers here). And the output:

╰─➤ ../stats.exs -c 1000 -w 5 -d 30 rust crystal go flask sanic elixir node
Processing servers:
```
Processing: bin/server_rust_iron
Processing: bin/server_rust_nickel
Processing: bin/server_rust_rocket
Processing: bin/server_crystal_kemal
Processing: bin/server_crystal_raze
Processing: bin/server_crystal_router_cr
Processing: bin/server_go_echo
Processing: bin/server_go_fasthttprouter
Processing: bin/server_go_gin
Processing: bin/server_go_gorilla_mux
Processing: bin/server_go_iris
Processing: bin/server_python_flask
Processing: bin/server_python_sanic
Processing: bin/server_elixir_phoenix
Processing: bin/server_elixir_plug
Processing: bin/server_node_clusterexpress
Processing: bin/server_node_express
```
It seems to show about the same relative values as the 1s test (^.^;): go just stomps everything else with rust close behind, etc... etc... Interestingly, flask got overloaded badly enough that it started producing a small number of errors; still poor compared to everything else, which was error-free. ^.^ Most things seem to have pretty good worst-case latency as well (Rust had by far the best worst-case of anything). To continue the tests I think I'd need to start overwhelming the servers and seeing when they start to fail. Should I toss my script into the repo? It's not pretty at all, but it works.
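The latency figures being compared above (average, deviation, worst case) come down to simple summary statistics over per-request timings. A small sketch of how such a report could be computed; the function, field names, and nearest-rank percentile choice are my own, not the stats script's:

```python
import statistics

def latency_report(samples_ms):
    """Summarise request latencies (milliseconds): average, standard deviation,
    median, tail percentile, and worst case."""
    ordered = sorted(samples_ms)

    def pct(p):
        # Nearest-rank percentile over the sorted samples.
        idx = min(len(ordered) - 1, int(p / 100.0 * len(ordered)))
        return ordered[idx]

    return {
        "avg": statistics.fmean(ordered),
        "std": statistics.stdev(ordered) if len(ordered) > 1 else 0.0,
        "p50": pct(50),
        "p99": pct(99),
        "max": ordered[-1],  # the "worst-case latency" discussed above
    }
```

The p99 and max columns are usually more revealing than the average when a server starts to buckle: averages hide the stalls that an overload produces.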
Also, apparently I had req/s and throughput backwards; ignore that, it's fixed in the code now. :-)
I think we should monitor memory usage; it should weigh heavily in the analysis. I also missed Lucky (Crystal): I didn't find it in your analysis.
I think we should improve the presentation of the results and even create some kind of graphics in this style; we could also show the results from different environments (Windows, Linux, Mac) along with their configs, e.g.
Yeah, this is a very raw dump of data; it could easily be output in any other way. And Lucky must not have been in the latest git master, as that is what I used. ^.^
I whipped up a C++ server using libevent (libevent is the library that a lot, like a LOT, of servers use, such as most servers in node, go, etc...) using its http(s) server API (this is a full server, routing framework and all). I figure it should be a good baseline, as most things build on top of libevent (should I PR it in?). Results:

╰─➤ ../stats.exs -w 1 -d 2 evhtp rust crystal go
Processing servers:
```
Processing: bin/server_cpp_evhtp
Processing: bin/server_rust_iron
Processing: bin/server_rust_nickel
Processing: bin/server_rust_rocket
Processing: bin/server_crystal_kemal
Processing: bin/server_crystal_raze
Processing: bin/server_crystal_router_cr
Processing: bin/server_go_echo
Processing: bin/server_go_fasthttprouter
Processing: bin/server_go_gin
Processing: bin/server_go_gorilla_mux
Processing: bin/server_go_iris
```
As expected, C++ was absolutely faster than everything else by a fairly healthy (though not order-of-magnitude) margin, with Go's fasthttprouter coming in second. Should I PR it in to use as a baseline, @tbrand and others?
I added a final listing of 'Rankings' based on the average requests per second. Running the above again gives the following (skipping all the setup/running output as it's identical, and as usual a 2-second testing time does not give any real difference from a 30-second testing time, so 2s is what I used so I'm not sitting here for half an hour while it runs). The result columns are:

|Path|URL|Errors|Total Requests Count|Total Requests/s|Total Requests Throughput|Total Throughput/s|Req/s Avg|Req/s Std|

Rankings

Ranking by Average Requests per second:
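The 'Rankings' listing is conceptually just a descending sort over the per-server averages with 1-based rank numbers attached. A tiny sketch of that step; the numbers below are purely hypothetical placeholders, not measured results:

```python
def rank_by_req_s(results):
    """Sort (name, avg_req_s) pairs descending by req/s and attach 1-based ranks."""
    ordered = sorted(results, key=lambda r: r[1], reverse=True)
    return [(rank, name, req_s) for rank, (name, req_s) in enumerate(ordered, start=1)]

# Hypothetical numbers purely for illustration:
sample = [("go_fasthttprouter", 180_000), ("rust_iron", 150_000), ("python_flask", 9_000)]
for rank, name, req_s in rank_by_req_s(sample):
    print(f"{rank}. {name}: {req_s} req/s")
```

Sorting on the measured average (rather than by name, as the earlier runs did) makes the relative standings readable at a glance.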
No, I don't know why crystal is so very slow compared to the rest. This is a highly multi-core system, so I can only guess that it's not taking advantage of threading or something... This would of course be less visible the fewer cores you have (as was tested in another issue somewhere). Is there anyone who can take a look at the crystal servers and see how to get them working with many cores?
And for the next update of the stats displayer it adds a space around the Example:
Rankings

Ranking by Average Requests per second:
Ran a larger test; results below (with the very long 'running' part snipped out... I need to put that behind a verbose flag... hmm, actually, doing that now. Done):

╰─➤ ../stats.exs -w 1 -d 3 cpp crystal elixir go node rust python
Processing servers:
```
Processing: bin/server_cpp_evhtp
Processing: bin/server_crystal_amber
unable to connect to 127.0.0.1:3000 Connection refused
unable to connect to 127.0.0.1:3000 Connection refused
unable to connect to 127.0.0.1:3000 Connection refused
Processing: bin/server_crystal_kemal
Processing: bin/server_crystal_lucky
Processing: bin/server_crystal_raze
Processing: bin/server_crystal_router_cr
Processing: bin/server_elixir_phoenix
Processing: bin/server_elixir_plug
Processing: bin/server_go_echo
Processing: bin/server_go_fasthttprouter
Processing: bin/server_go_gin
Processing: bin/server_go_gorilla_mux
Processing: bin/server_go_iris
Processing: bin/server_python_django
unable to connect to 127.0.0.1:3000 Connection refused
unable to connect to 127.0.0.1:3000 Connection refused
unable to connect to 127.0.0.1:3000 Connection refused
Processing: bin/server_node_clusterexpress
Processing: bin/server_node_clusterpolka
Processing: bin/server_node_express
Processing: bin/server_node_polka
Processing: bin/server_rust_iron
Processing: bin/server_rust_nickel
Processing: bin/server_rust_rocket
Processing: bin/server_python_django
unable to connect to 127.0.0.1:3000 Connection refused
unable to connect to 127.0.0.1:3000 Connection refused
unable to connect to 127.0.0.1:3000 Connection refused
Processing: bin/server_python_flask
Processing: bin/server_python_flask.py
Processing: bin/server_python_japronto
Processing: bin/server_python_sanic
```
Rankings

Ranking by Average Requests per second:
Got ruby installed on the server (that was painful). New results of everything I can compile so far (maybe install nim or dotnetcore next):

╰─➤ ../stats.exs -w 1 -d 3 _
Processing servers:
```
Processing: bin/server_cpp_evhtp
unable to connect to 127.0.0.1:3000 Connection refused
unable to connect to 127.0.0.1:3000 Connection refused
Processing: bin/server_crystal_kemal
```
Rankings

Ranking by Average Requests per second:
For note, amber fails to run because of:

```
╰─➤ bin/server_crystal_amber
Environment file not found for ./config/environments/production (Amber::Exceptions::Environment)
  from ???
  from crystal/amber/lib/amber/src/amber.cr:24:3 in '__crystal_main'
  from /usr/share/crystal/src/crystal/main.cr:0:3 in 'main'
  from __libc_start_main
  from _start
  from ???
```

I'm guessing this is a bug in this server and it should be corrected (pathing maybe?).
Still awaiting some Nim results ;) Here is a quick link for you: https://nim-lang.org/install_unix.html. If you're using a package manager, be sure to verify that you get at least nim v0.18.0.
@dom96 Oh hey, thanks, you've spurred me on to go ahead and install nim then. ^.^ Running the tests on all servers will take a while after I install nim; I will post results when complete. :-)
Also decided to install dotnet too, and wow, that csharp server was a pain to get running: dependency after dependency after dependency that it does not acquire itself, just assuming (you know what they say about assuming ;-) ) that they are all installed globally. Doesn't dotnetcore have a dependency-management system like decent languages? o.O Tests are finally running now, will post when complete...
Wow! Initial tests show that nim is fast, like whupping everything but C++ itself. I'm curious to see how everything will fare when the final results come in!
Oy, well, one of the nim frameworks is super fast and the other is abysmally slow, and I'm surprised just how poorly dotnet is doing considering it is a JIT'd language (it's even beaten by Elixir, which is a bytecode-interpreted language!). Probably just a really poorly made framework:

╰─➤ ../stats.exs -w 1 -d 3 _
Processing servers:
```
Processing: bin/server_cpp_evhtp
unable to connect to 127.0.0.1:3000 Connection refused
unable to connect to 127.0.0.1:3000 Connection refused
Processing: bin/server_crystal_kemal
```
Rankings

Ranking by Average Requests per second:
Python's django is missing some |
Setting up scala now (ah, it's been a while; I miss this language, hate the JVM, but scala is fun, and sbt is awesome). I have no clue how to do anything with obj and swift; does anyone have some easy instructions on setting those up and getting them working?
I installed every module I could find for python that contains |
I wouldn't call that abysmally slow ;) That's my framework, which I haven't put any optimisation effort into at all yet. That's where https://github.com/dom96/httpbeast comes in. Happy to see mofuw beating me to the punch; it shows what you can achieve if you just put a little effort into optimisation (and winning benchmarks ;)). CC @2vg. You'll like the results, keep up the awesome work! :)
Oh and btw. Keep in mind that Jester is single threaded, it does not take advantage of extra cores at all. |
Aaaand the scala webserver is borked: it tries to start up multiple copies of itself, all on the same port, consequently crashing... leaving that out then... >.>
@dom96 That is a BIG issue then, you need to multi-thread that stuff! ^.^ Want to fix the server for that? :-) For note, it's still almost half of crystal's speed, and crystal is single-core too, so add multi-core to that server plus some optimizations and let's get it up near the top! :-)
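One common way servers like node's clusterexpress put every core to work is to have each worker process open its own listening socket on the same port with `SO_REUSEPORT`, letting the kernel spread incoming connections among them (Linux 3.9+; the helper below is an illustrative sketch of the mechanism, not code from any of the benchmarked servers):

```python
import socket

def make_reuseport_listener(port):
    """Create a listening TCP socket with SO_REUSEPORT so several worker
    processes can each bind the same port and run their own accept loop;
    the kernel load-balances new connections between them."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind(("127.0.0.1", port))
    s.listen(128)
    return s

# Two "workers" sharing one port; in a real server each would live in its
# own forked process.
a = make_reuseport_listener(0)        # port 0: let the OS pick a free port
port = a.getsockname()[1]
b = make_reuseport_listener(port)     # second bind succeeds thanks to SO_REUSEPORT
```

The alternative is the pre-fork model (one shared listening socket, workers forked after bind), which is roughly what multi-process python servers do; either way, each core gets its own accept loop.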
I was curious, so I set the C++ webserver to single-threaded (I made the source configurable, so I just change the thread count constant at the top of the source):

╰─➤ ../stats.exs -w 1 -d 3 cpp crystal
Processing servers:
```
Processing: bin/server_cpp_evhtp
unable to connect to 127.0.0.1:3000 Connection refused
unable to connect to 127.0.0.1:3000 Connection refused
Processing: bin/server_crystal_kemal
```
Rankings

Ranking by Average Requests per second:
So crystal is decent single-threaded, but considering the C++ code still has locks and I'm not sure the crystal code does, it is still not an accurate comparison (especially as the C++ code is still a good deal faster).
@OvermindDL1 To be honest, I'm not convinced these benchmarks should mean much for most users. I've been using Jester happily for years and never had any performance problems; the Nim forum (even though it's ugly) has been running in production for years. But yeah, I'll play the benchmark game, I've been meaning to for years as well :)
Btw, I'm very surprised that Rust isn't winning these benchmarks; they're always raving about zero-cost abstractions, so what's happening? :)
Oh indeed, remember what repo this is: it is purely for testing maximal response time and load from servers in entirely unrealistic scenarios; essentially it's only testing framework overhead. The time for 99.9999% of webservers will easily be dominated by the user code; it's really only when the framework is beyond horribly slow (*cough*ruby*cough*python*cough*) that it starts to matter. Though a few times it will matter, like with IoT devices where you will potentially have millions upon millions of active connections. :-)
Yep, it's entirely just a fun thing, nothing serious at all. ^.^ I love putting in some concentration at times to really focus on some trivial problem though, it means that I can be quicker at work since I get my focusing out elsewhere. Heh...
It's mostly the backends they use. If you check their assembly they are fast, like should-be-faster-than-C++ kind of fast, but they are using the normal OS primitives, and it doesn't look like any of the rust frameworks are using anything like libevent (though they could; it's a simple C API). I might make a Rust server sometime that uses evhtp from Rust if anyone pings me enough about it; I'd expect it to be about equal to C++ then. EDIT: For note, I think both Go and nodejs use libevent. If NIM's mofuw framework doesn't use libevent, then I have to say major, major kudos on its speed!
It doesn't. It uses
All the libraries you mentioned are also using these primitives; perhaps they are very well optimised, but with effort I'm sure Rust could get up to the same speed. I just expected it to be there already, since they really care about zero-cost abstractions.
Heh, true, though zero-cost abstractions do not mean that they know the most efficient kernel calls. ^.^;
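For context, the kernel primitives being debated here (epoll on Linux, kqueue on BSD) are exactly what libraries like libevent wrap: register sockets with the OS poller and react when they become readable. A toy sketch of that readiness-driven loop using Python's `selectors` module, which picks epoll on Linux (the function and flow are my own illustration, not any benchmarked server):

```python
import selectors
import socket

def echo_once(host="127.0.0.1"):
    """Accept one client via a readiness-based event loop (epoll on Linux,
    through `selectors`), echo one message back, and return what the client got."""
    sel = selectors.DefaultSelector()
    srv = socket.socket()
    srv.bind((host, 0))          # port 0: let the OS pick a free port
    srv.listen()
    srv.setblocking(False)
    sel.register(srv, selectors.EVENT_READ, "accept")
    port = srv.getsockname()[1]

    cli = socket.create_connection((host, port))
    cli.sendall(b"ping")

    reply = None
    while reply is None:
        for key, _ in sel.select(timeout=1.0):   # wait for readiness events
            if key.data == "accept":
                conn, _ = key.fileobj.accept()
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ, "read")
            else:
                data = key.fileobj.recv(4096)
                key.fileobj.sendall(data)        # echo back
                reply = cli.recv(4096)
    sel.close()
    srv.close()
    cli.close()
    return reply
```

Real servers run the same shape of loop over thousands of sockets at once, which is why the choice of poller (and how efficiently the framework drives it) dominates these hello-world benchmarks.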
So my new server is setup, nice and empty and ready to run tests on (for now), so I'm playing with it. :-)
8-core (16-hyperthreads), 3.7ghz, 16gigs ram (for now), etc... etc...
First, the current result sets built into this git project are not indicative of the actual throughput a server can sustain, as they always test a simple iteration of commands and how fast they complete, rather than how long each request takes, the average, the longest a request took, etc... etc...
So I whipped up a quick script to test the statistical parts in far, far greater detail than the current git HEAD does; these are the results for a set of servers. I used rust, crystal, go, python, elixir, and node for the servers, mostly because I haven't installed (or figured out how, depending) the rest of the stuff, plus these run the fastest overall. I did have an issue with `server_python_japronto`: it refused to run... The `server_python_flask` did not install `flask`, so I had to do that manually; that should be fixed... `japronto` does not seem to be in pip, so it remained missing. I'll post the results here as I get them (it outputs markdown format, so the posts will each be a run with a bit of description at the top).