
Update of TechEmpower web framework benchmark for latest Colossus version #660

Closed · plokhotnyuk opened this issue Jan 14, 2018 · 7 comments

@plokhotnyuk

plokhotnyuk commented Jan 14, 2018

Please review my changes, which are already merged into the benchmarks: TechEmpower/FrameworkBenchmarks@72003b9

...and also a pending PR with JVM options tuned for better throughput: TechEmpower/FrameworkBenchmarks#3184

Now waiting for the final round 15 (or the next preview-4). Here are the results of preview-3, which was run in Nov 2017: http://tfb-logs.techempower.com/round-15/preview-3/colossus

@DanSimon
Contributor

Changes look good to me. It will be very interesting to see how the latest version performs, as well as how it runs on Scala 2.12.

@benblack86
Contributor

We were going to update the code to use the same JSON library as everyone else, but if this one is faster, I suppose we can go with it. Thanks.

@plokhotnyuk
Author

plokhotnyuk commented Jan 16, 2018

I used wrk to send requests on the same host, with the following commands for the plaintext & JSON tests (running from a different host easily saturates my home 1 Gb network):

```
wrk --latency -d 15 -c 256 --timeout 8 -t 8 http://localhost:9007/plaintext
wrk --latency -d 15 -c 256 --timeout 8 -t 8 http://localhost:9007/json
```

Here are the results for the top Scala & Java web frameworks:

|    Framework    |   Route    |  Req/sec  |
|-----------------|------------|-----------|
| Rapidoid (fast) | /plaintext | 691358.72 |
| Rapidoid (fast) | /json      | 656923.79 |
| Netty           | /plaintext | 656598.18 |
| Colossus        | /json      | 629668.62 |
| Netty           | /json      | 629367.79 |
| Colossus        | /plaintext | 628368.22 |
| Fintrospect     | /plaintext | 279797.49 |
| Finatra         | /plaintext | 272258.09 |
| Fintrospect     | /json      | 265387.21 |
| Finatra         | /json      | 262322.79 |

Environment:

- Intel(R) Core(TM) i7-7700 CPU @ 3.6 GHz (max 4.2 GHz)
- 16 GB DDR4-2400 RAM
- Ubuntu 16.04, Linux desktop 4.13.0-26-generic
- Oracle JDK build 9.0.1+11, 64-bit

IMHO it is an ugly method of testing HTTP servers, but it seems the maintainers take the same approach for these benchmarks: http://tfb-logs.techempower.com/round-15/preview-3/colossus/

I would prefer to spend a couple of nights taking the red pill: set up a realistic environment and test how the systems behave under load, measuring response times properly with wrk2.
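For a concrete idea of what that would look like: wrk2 pins the offered load at a constant request rate via its -R flag and reports latency percentiles corrected for coordinated omission. A sketch with placeholder rate and duration values (note that wrk2's binary is also named wrk):

```
# -R sets the constant request rate; the 100k req/s target here is an assumption
wrk -t 8 -c 256 -d 60 -R 100000 --latency http://localhost:9007/plaintext
```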

@benblack86
Contributor

I would expect the plaintext test to be faster than the JSON one, since the service has to do less work; it doesn't have to create JSON.

I'm not a fan of the JSON test since, at least in the case of Colossus, we are just testing the selected JSON library. Changing the JSON library will change the results. Maybe we should only be in the plaintext test...

@plokhotnyuk
Author

plokhotnyuk commented Jan 17, 2018

IMHO it is because the error between runs is greater than the serialization time of the response.

In the message_benchmark branch I have added a benchmark that compares plain-text serialization vs. JSON serialization: https://github.com/plokhotnyuk/jsoniter-scala/compare/message_benchmark
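The comparison boils down to timing String.getBytes against a jsoniter-scala codec for a small message. A minimal sketch of what such a JMH benchmark might look like; the Message class, field value, and codec setup below are illustrative assumptions rather than the exact code from the branch, and jsoniter-scala's API details vary between versions:

```scala
import java.nio.charset.StandardCharsets.UTF_8
import com.github.plokhotnyuk.jsoniter_scala.core._
import com.github.plokhotnyuk.jsoniter_scala.macros._
import org.openjdk.jmh.annotations._

case class Message(message: String)

@State(Scope.Thread)
class MessageBenchmark {
  // Codec derived at compile time by jsoniter-scala's macros
  implicit val codec: JsonValueCodec[Message] = JsonCodecMaker.make

  val message = Message("Hello, World!")

  @Benchmark
  def getBytes: Array[Byte] =
    message.message.getBytes(UTF_8) // the plaintext response body

  @Benchmark
  def writeJsoniter: Array[Byte] =
    writeToArray(message) // the JSON response body: {"message":"Hello, World!"}
}
```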

It can be started from the root of the jsoniter-scala project directory with the following command:

```
sbt -no-colors clean 'benchmark/jmh:run -prof gc .*MessageBenchmark.*'
```

Here are its results on my notebook:

```
[info] MessageBenchmark.getBytes                                        thrpt    5  23783496.718 ± 411760.044   ops/s
[info] MessageBenchmark.getBytes:·gc.alloc.rate                         thrpt    5      1330.161 ±     22.505  MB/sec
[info] MessageBenchmark.getBytes:·gc.alloc.rate.norm                    thrpt    5        88.000 ±      0.001    B/op
[info] MessageBenchmark.getBytes:·gc.churn.PS_Eden_Space                thrpt    5      1288.581 ±    583.497  MB/sec
[info] MessageBenchmark.getBytes:·gc.churn.PS_Eden_Space.norm           thrpt    5        85.243 ±     38.384    B/op
[info] MessageBenchmark.getBytes:·gc.churn.PS_Survivor_Space            thrpt    5         0.087 ±      0.067  MB/sec
[info] MessageBenchmark.getBytes:·gc.churn.PS_Survivor_Space.norm       thrpt    5         0.006 ±      0.004    B/op
[info] MessageBenchmark.getBytes:·gc.count                              thrpt    5        19.000               counts
[info] MessageBenchmark.getBytes:·gc.time                               thrpt    5        17.000                   ms
[info] MessageBenchmark.writeJsoniter                                   thrpt    5  21217203.827 ± 434285.134   ops/s
[info] MessageBenchmark.writeJsoniter:·gc.alloc.rate                    thrpt    5       862.956 ±     17.460  MB/sec
[info] MessageBenchmark.writeJsoniter:·gc.alloc.rate.norm               thrpt    5        64.000 ±      0.001    B/op
[info] MessageBenchmark.writeJsoniter:·gc.churn.PS_Eden_Space           thrpt    5       871.731 ±    707.762  MB/sec
[info] MessageBenchmark.writeJsoniter:·gc.churn.PS_Eden_Space.norm      thrpt    5        64.683 ±     53.147    B/op
[info] MessageBenchmark.writeJsoniter:·gc.churn.PS_Survivor_Space       thrpt    5         0.075 ±      0.072  MB/sec
[info] MessageBenchmark.writeJsoniter:·gc.churn.PS_Survivor_Space.norm  thrpt    5         0.006 ±      0.005    B/op
[info] MessageBenchmark.writeJsoniter:·gc.count                         thrpt    5        13.000               counts
[info] MessageBenchmark.writeJsoniter:·gc.time                          thrpt    5        11.000                   ms
```

Both are too efficient to show any impact on request handling.

@DanSimon
Contributor

Ah, when you're using wrk you need to use pipelining. It is not enabled by default, and you need a Lua script to make it work properly. See the benchmarking section in CONTRIBUTING.md; luckily, we have a working version of the script here.
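For reference, the standard TechEmpower pipeline.lua looks roughly like the following; treat this as a sketch rather than the exact script linked above. It concatenates N copies of the request once in init and replays the whole batch on every request call:

```lua
-- pipeline.lua: batch N requests per write so wrk exercises HTTP pipelining.
-- The depth is taken from the first argument after "--" on the command line.
init = function(args)
  local r = {}
  local depth = tonumber(args[1]) or 1
  for i = 1, depth do
    r[i] = wrk.format()
  end
  req = table.concat(r)
end

request = function()
  return req
end
```

With `-- 16` at the end of the wrk command line, each socket write then carries 16 pipelined requests instead of one.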

In the TechEmpower benchmarks, they do not use pipelining in the JSON test, but they do in the plaintext test, with a pipelining factor of 16. I don't know why they do it this way, but when we run benchmarks ourselves we need to make sure we replicate their parameters to get comparable results.

Personally, I agree the JSON benchmark is not particularly useful for us, but I would prefer that we be in as many tests as possible, and if that's the case then I'd also prefer we just use whatever is fastest for us. A hello-world benchmark is never going to be truly representative of actual performance, and I think the whole thing is just a publicity stunt, so we may as well play the game.

@plokhotnyuk
Author

Using your script with a pipeline depth of 16:

```
wrk --latency -d 15 -c 256 --timeout 8 -t 8 -s pipeline.lua http://localhost:9007/plaintext -- 16
wrk --latency -d 15 -c 256 --timeout 8 -t 8 -s pipeline.lua http://localhost:9007/json -- 16
```

I got the following results in the same environment:

|    Framework    |   Route    |  Req/sec   |
|-----------------|------------|------------|
| Rapidoid (fast) | /plaintext | 2859868.14 |
| Rapidoid (fast) | /json      | 2597297.65 |
| Netty           | /plaintext | 2188014.11 |
| Colossus        | /plaintext | 2105560.54 |
| Colossus        | /json      | 2039088.10 |
| Netty           | /json      | 2013778.14 |
| Fintrospect     | /plaintext |  415463.89 |
| Finatra         | /plaintext |  387467.39 |
| Fintrospect     | /json      |  375411.84 |
| Finatra         | /json      |  371878.56 |
