
Use wrk as a proper sieger #223

Merged
merged 10 commits into from Jun 11, 2018

Conversation

waghanza
Collaborator

@waghanza waghanza commented Jun 1, 2018

Hi,

This PR enables wrk. The siege process is handled by wrk plus a custom Lua script, which writes the results to a CSV file.

The Crystal part parses the CSV and computes the rank from the average latency.
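As a sketch of that parse-and-rank step (the CSV column names `framework` and `latency_ms` here are assumptions for illustration, not the actual format the Lua script emits):

```python
import csv
import io
from statistics import mean

def rank_by_avg_latency(csv_text):
    """Group rows by framework and rank by ascending average latency.
    Column names are hypothetical, not the PR's actual CSV layout."""
    latencies = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        latencies.setdefault(row["framework"], []).append(float(row["latency_ms"]))
    # Lower average latency ranks first
    return sorted(latencies, key=lambda fw: mean(latencies[fw]))

data = "framework,latency_ms\nrouter.cr,1.2\nrouter.cr,1.4\nrails,55.0\nrails,60.0\n"
print(rank_by_avg_latency(data))  # fastest first
```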

I'm not quite sure how to define this rank (which metrics should be included).

@OvermindDL1 Any idea ?

Regards,

@OvermindDL1
Collaborator

OvermindDL1 commented Jun 1, 2018

Honestly I'd probably use 2 ranks:

  1. Throughput, showing average as well as the deviation (max/min at least).
  2. Latency, showing average as well as the deviation (max/min at least).

If only one had to be picked then I'd pick throughput (per second or so), as it tells 'more' information than latency alone while also including some of the latency information within it, whereas latency does not do the opposite. In addition, showing the max tested latency is a massive boon, as knowing the 'max' time that a request can take (there are a lot of GC horrors out there) is very useful: if 1 out of every 100 requests on your server suddenly takes 3 seconds instead of 50ms, that is very bad for the user experience (hence why showing the deviation is useful, to know how often it occurs).

Having both throughput and latency as rankings gives you a more complete overall picture.
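The point about rare slow requests can be made concrete with a toy example: a service that answers 99 requests at 50 ms and one at 3000 ms still looks almost fine on average, but the tail percentiles and the max expose the outlier (all numbers here are illustrative):

```python
from statistics import mean, quantiles

# 99 fast requests plus one GC-horror outlier, in milliseconds
samples = [50.0] * 99 + [3000.0]

avg = mean(samples)                  # 79.5 ms: looks almost fine
p99 = quantiles(samples, n=100)[98]  # 99th percentile exposes the outlier
worst = max(samples)                 # 3000 ms worst case

print(avg, p99, worst)
```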

@waghanza
Collaborator Author

waghanza commented Jun 1, 2018

@OvermindDL1 interesting ... and are percentiles a proper metric to include?

@OvermindDL1
Collaborator

Percentile is part of the deviation yep.

@waghanza
Collaborator Author

waghanza commented Jun 1, 2018

@OvermindDL1 sorry, I do not understand how those metrics are calculated.
could you explain how those metrics (those defined here) are calculated (and what they represent)?

@OvermindDL1
Collaborator

@waghanza You can get the full deviation via that latency:percentile(float) function it looks like, so probably something like this to get the full statistical deviation:

   for _, p in ipairs({ 50, 90, 99, 99.999 }) do
      local n = latency:percentile(p)
      io.write(string.format("%g%%,%d\n", p, n))
   end

Which of course you can output to dot format to generate pretty charts or output to csv or whatever. :-)

And of course the summary.requests/summary.duration should get you the requests/second I think.

Oh, it is also good to report errors, though only if non-zero; those signify major issues and that test needs to be checked on if that happens, so we definitely don't want to ignore those!
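On the summary fields: wrk reports summary.duration in microseconds, so the requests/second computation needs a unit conversion. A sketch of the arithmetic (the values below are illustrative, not real measurements):

```python
# Hypothetical values as wrk's done() callback would see them;
# wrk reports duration in microseconds
summary = {"requests": 150_000, "duration": 15_000_000}

seconds = summary["duration"] / 1_000_000
req_per_sec = summary["requests"] / seconds
print(req_per_sec)  # 10000.0
```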

@OvermindDL1
Collaborator

In addition to the other latency values of course*

@OvermindDL1
Collaborator

OvermindDL1 commented Jun 1, 2018

All of that information could be put on a whisker chart or some other kind of deviation chart to show how things relate in more detail.

For an overall ranking number, I'd probably just use requests-per-second, maybe with a penalty weighting for ones with high latency.stdev values (as those indicate wildly fluctuating latencies).
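One possible weighting along those lines (purely hypothetical, not the PR's actual formula): rank by requests/second, discounted when latency.stdev is large relative to the mean latency, so wildly fluctuating frameworks score lower:

```python
def score(req_per_sec, latency_mean, latency_stdev):
    """Discount throughput by relative latency jitter.
    One possible weighting for illustration only."""
    jitter = latency_stdev / latency_mean  # coefficient of variation
    return req_per_sec / (1.0 + jitter)

# A steady server outranks a slightly faster but wildly fluctuating one
steady = score(10_000, latency_mean=5.0, latency_stdev=1.0)   # jitter 0.2
jumpy  = score(11_000, latency_mean=5.0, latency_stdev=10.0)  # jitter 2.0
print(steady > jumpy)
```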

@waghanza waghanza force-pushed the master branch 5 times, most recently from f5dbdbf to 80b7718 Compare June 5, 2018 11:56
@waghanza
Collaborator Author

waghanza commented Jun 6, 2018

I've pushed a new README, with some results from wrk

Actually I only use the first rule (GET on /) to compute results (so this PR is not mergeable).

The reason I pushed the new results is to find the best way to display them.

@tbrand Do you find the results understandable? (Results are based on a 5-minute wrk siege)

@waghanza
Collaborator Author

waghanza commented Jun 8, 2018

@tbrand I have updated the README; I think the structure is clearer

DISCLAIMER: results are computed with the first rule only: GET on /

@OvermindDL1 the results seem more realistic 😜

@tbrand
Collaborator

tbrand commented Jun 8, 2018

👍

@OvermindDL1 OvermindDL1 (Collaborator) left a comment

Just primarily some questions.

OS: Linux (version: 4.16.7-200.fc27.x86_64, arch: x86_64)
CPU Cores: 4
OS: Linux (version: 4.16.11-100.fc26.x86_64, arch: x86_64)
CPU Cores: 8
Collaborator

Is this the literal number of cores, or is it the number of hyper-threaded cores? I showed both values in my benchmarker as it is an important value to state and I'd think it should be distinguished here as well.

Collaborator Author

it is the number of physical cores -> my processor is an AMD FX-8320E Eight-Core Processor
in fact, I'm running on different computers / kernels

README.md Outdated
37. [rails](https://github.com/rails/rails) (ruby)
35. [sinatra](https://github.com/sinatra/sinatra) (ruby)
36. [django](https://github.com/django/django) (python)
37. [akkahttp](https://github.com/akka/akka-http) (scala)
Collaborator

Wow, scala's akka one should not be that slow! I wonder if the author forgot to enable threading or so... o.O

Collaborator Author

yes, I think results are wrong

README.md Outdated
| python | tornado | 726.349087 | 706.362335 | 718.087107 |
| nim | jester | 246.613563 | 246.253752 | 246.379956 |
| nim | mofuw | 147.627614 | 83.375723 | 108.593340 |
| ruby | rails | 0 | 5244.0 | 0.88MB |
Collaborator

Hmm, should this list be sorted by language name, or should it be sorted by req/s or so since the above is not listing the req/s on the language?

Collaborator Author

req/s, but one thing at a time ... (I'll sort after the results stabilize)

README.md Outdated
OS: Linux (version: 4.16.7-200.fc27.x86_64, arch: x86_64)
CPU Cores: 4
OS: Linux (version: 4.16.11-100.fc26.x86_64, arch: x86_64)
CPU Cores: 8
```

### Ranking (Framework)
Collaborator

Should the req/s and maybe throughput be listed with these as well, so people can see the differences in the ranking, not just the positional ranking? Perhaps it should be a markdown table too?

seventy_five: String,
ninety: String,
ninety_nine: String
)
Collaborator

Showing the 4-or-5-nines percentile stat is also useful and standard, as it shows the extreme outliers.
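For those extreme percentiles (99.99, 99.999), the nearest-rank method is a simple way to pull them out of a sorted sample. A sketch of that method (for illustration only, not the PR's implementation; wrk computes its own percentiles internally):

```python
import math

def percentile(sorted_samples, p):
    """Nearest-rank percentile: the smallest value such that at least
    p% of the samples are less than or equal to it."""
    rank = math.ceil(p / 100 * len(sorted_samples))
    return sorted_samples[max(rank - 1, 0)]

# 9999 fast requests at 50 ms plus a single 3000 ms outlier
samples = sorted([50.0] * 9_999 + [3000.0])
print(percentile(samples, 99.0), percentile(samples, 99.999))
```

The p99 misses a one-in-ten-thousand outlier entirely; only the five-nines percentile catches it.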

@waghanza
Collaborator Author

waghanza commented Jun 8, 2018

@OvermindDL1 I have published other results on https://github.com/tbrand/which_is_the_fastest#ranking-framework

  • 16 threads
  • 15 seconds
  • 1000 connections

but still GET on / => next step is to combine all routes

@OvermindDL1 OvermindDL1 (Collaborator) left a comment

16 saturated threads on 4 native cores is going to create a LOT of contention, which will hobble any properly multi-threaded communicative libraries (which would be most), making the ones that are not as well multi-threaded look like they perform better than they actually do. 4 cores is not enough for a good test regardless, unless the test is performed across two machines with a fiber interlink and the benchmark sieger (the one running wrk) has more cores than the server being tested (assuming the server being tested has 4 dedicated native cores).

README.md Outdated
```
OS: Linux (version: 4.16.7-200.fc27.x86_64, arch: x86_64)
OS: Linux (version: 4.16.13-300.fc28.x86_64, arch: x86_64)
CPU Cores: 4
Collaborator

4 cores is nowhere near enough for a good test! o.O

@waghanza waghanza changed the title [WIP] Use wrk as a proper sieger Use wrk as a proper sieger Jun 11, 2018
@waghanza waghanza merged commit 1a556ed into the-benchmarker:master Jun 11, 2018
3 participants