Http4k-netty performs really badly on all benchmarks (with lots of errors) #141

Closed
adam-arold opened this issue Jun 28, 2018 · 17 comments · Fixed by #480
Comments

@adam-arold

I've checked the TechEmpower benchmarks and it seems that http4k-netty performs really badly, with a lot of errors. Is this an inherent problem with Netty, or is the benchmark at fault?

@daviddenton
Member

daviddenton commented Jun 28, 2018

TBH, we are unsure. The Netty backend is:

  1. the one that seems to have the most problems in the TEB - even though the routing implementation is identical to the other backends. Then again, the SunHttp implementation is also problematic in the TEB (it registers absolutely no successful calls), but that isn't expected to be a prod-ready server.
  2. the most complicated of all the supported backends, although there isn't actually that much code in it!
  3. the least used of all the supported backends - we generally default to using Undertow or Jetty if we require websocket support.
  4. one we haven't done any real load testing with in any prod project (as opposed to Undertow and Jetty). The TEB doesn't give us much in the way of feedback about what is going on. Potentially it's a problem with the worker thread pool being exhausted and subsequent requests then being rejected, but I can't support that with evidence.

We would appreciate any help that the community can give in locating the source of the problem. :)

@adam-arold
Author

I've got my hands full right now, but if I start using http4k I'll check the TEB repo - maybe I can find something. According to the TEB, the best-performing option is http4k-apache, so I'll go with that one. Full disclosure: I'm interested in http4k because I want to use SubstrateVM, and http4k works on that VM out of the box (mostly).

@daviddenton
Member

@adam-arold I second this approach - Apache would seem to be the best option!

If there are tweaks to the Apache server implementation that we can make then please let us know - although it had previously been indicated that it works completely OOTB, so "mostly" intrigues me...

@adam-arold
Author

adam-arold commented Jun 28, 2018

It is not an Apache / http4k problem, but a Substrate problem (reflection is a bit problematic). More about this here.

@daviddenton
Member

According to this tweet, Apache should work with no modification (once again, I cannot confirm): http://twitter.com/gyula_voros/status/1009711381510283265

@adam-arold
Author

adam-arold commented Jun 28, 2018

Hah, that's funny. I know Gyula personally. I think I'll just talk to him. 👍 I guess you are right. The article I linked was about Netty (which is problematic apparently 😃 ).

@strohel

strohel commented Aug 24, 2020

I was able to reproduce http4k-netty benchmark problems on a different http4k microservice and my own benchmarking testbed that also uses wrk (HTTP load generator).

When looking at the traffic in Wireshark, I can see that the server responses on Netty have a Connection: close header and no Content-Length header (which is the probable cause of the Connection: close). All the other tested server engines behave differently and more efficiently: they serve a Content-Length header and keep the TCP connection open for subsequent requests.

In addition to hurting performance, this seems to fool wrk's hand-written HTTP parser into reporting a read error for every request (even though the HTTP response looks correct to both me and Wireshark):

$ wrk -c1 -t1 -d1 --latency --timeout=15 'http://localhost:8080/city/v1/get?id=101748111&language=cs'
Running 1s test @ http://localhost:8080/city/v1/get?id=101748111&language=cs
  1 threads and 1 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     3.38ms    0.93ms  12.12ms   95.33%
    Req/Sec   289.45     33.24   330.00     81.82%
  Latency Distribution
     50%    3.21ms
     75%    3.51ms
     90%    3.82ms
     99%    8.65ms
  317 requests in 1.10s, 56.96KB read
  Socket errors: connect 0, read 317, write 0, timeout 0
Requests/sec:    288.12
Transfer/sec:     51.77KB

Perhaps there is something in the http4k <-> Netty integration that prevents Content-Length from being correctly passed?

(Netty itself seems to be well capable of persistent connections when used by Ktor) CC @goodhoko.
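
To illustrate the diagnosis, here is a minimal Kotlin sketch (my own illustration built on Netty's HttpUtil helpers, not code from the http4k adapter): an HTTP/1.1 response that carries neither a Content-Length nor a chunked Transfer-Encoding can only mark the end of its body by closing the connection, so keep-alive is impossible.

import io.netty.buffer.Unpooled
import io.netty.handler.codec.http.DefaultFullHttpResponse
import io.netty.handler.codec.http.HttpResponseStatus
import io.netty.handler.codec.http.HttpUtil
import io.netty.handler.codec.http.HttpVersion

fun main() {
  val body = Unpooled.wrappedBuffer("hello".toByteArray())
  val response = DefaultFullHttpResponse(HttpVersion.HTTP_1_1, HttpResponseStatus.OK, body)

  // As observed in Wireshark: without Content-Length the peer cannot know
  // where the body ends, so the server must close the connection instead.
  println(HttpUtil.isContentLengthSet(response))  // false

  // What a keep-alive-friendly response needs instead:
  HttpUtil.setContentLength(response, body.readableBytes().toLong())
  HttpUtil.setKeepAlive(response, true)
  println(response.headers())
}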

@daviddenton
Member

That's amazing investigation work - thank you!

I wonder if this can be fixed easily - I'm not yet sure of the best approach, but there may be something built into the Netty APIs that sets the header automatically, instead of us doing it manually.

Presumably it should be easy to replicate locally with wrk? The Netty integration is actually only a single file, which lives here, so it could easily be modified outside the http4k codebase to work out the best approach: https://github.com/http4k/http4k/blob/master/http4k-server-netty/src/main/kotlin/org/http4k/server/Netty.kt
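
One candidate for "something built into the Netty APIs" (an assumption, not a confirmed fix): Netty ships an HttpServerKeepAliveHandler that manages the Connection header and channel closing based on each response's framing headers. A pipeline sketch:

import io.netty.channel.ChannelInitializer
import io.netty.channel.socket.SocketChannel
import io.netty.handler.codec.http.HttpServerCodec
import io.netty.handler.codec.http.HttpServerKeepAliveHandler

class KeepAliveInitializer : ChannelInitializer<SocketChannel>() {
  override fun initChannel(ch: SocketChannel) {
    ch.pipeline()
      .addLast(HttpServerCodec())
      // Decides keep-alive vs close from Content-Length/Transfer-Encoding;
      // it does not compute Content-Length, so the adapter must still set it.
      .addLast(HttpServerKeepAliveHandler())
    // ...the http4k handler adapter would follow here
  }
}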

@daviddenton
Member

daviddenton commented Aug 24, 2020

I've speculatively made this PR (on a very small response) to set the Content-Length or the Transfer-Encoding. Second run on a warm JVM:

$ wrk -c1 -t1 -d1 --latency --timeout=30 'http://localhost:9000'
Running 1s test @ http://localhost:9000
  1 threads and 1 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   360.08us    2.24ms  25.89ms   97.94%
    Req/Sec    14.61k     2.64k   17.26k    81.82%
  Latency Distribution
     50%   58.00us
     75%   70.00us
     90%   96.00us
     99%   14.60ms
  15957 requests in 1.10s, 1.07MB read
Requests/sec:  14515.34
Transfer/sec:      0.97MB

PR: #480

@strohel Could you try the implementation on this branch and see what the effect is on your test? (It will be a bit more real-world than my 5-minute effort... :) )
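
For readers following along, a paraphrased sketch of the kind of change (not the exact code in PR #480): if the http4k Body reports a length, set Content-Length; otherwise mark the response as chunked, so the connection can stay open either way.

import io.netty.handler.codec.http.HttpResponse
import io.netty.handler.codec.http.HttpUtil
import org.http4k.core.Response

fun applyBodyFraming(netty: HttpResponse, http4k: Response) {
  val length = http4k.body.length  // null for stream bodies of unknown size
  if (length != null) HttpUtil.setContentLength(netty, length)
  else HttpUtil.setTransferEncodingChunked(netty, true)
}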

@strohel

strohel commented Aug 24, 2020

Thanks for a quick response, @daviddenton.

What would be the easiest way to use your branch in my Gradle project? I've tried jitpack.io to try my luck, but the build fails there: https://jitpack.io/com/github/http4k/http4k/netty-set-transfer-encoding-or-content-length-ef6164ee52-1/build.log

@daviddenton
Member

The easiest thing is to just copy the entire Netty.kt file from the branch into your project and use that version inside the call to asServer().
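
For example (assuming the copied file lands in your own package - org.example.patched here is a made-up name):

import org.example.patched.Netty  // the copied Netty.kt, shadowing the released one
import org.http4k.core.HttpHandler
import org.http4k.core.Response
import org.http4k.core.Status.Companion.OK
import org.http4k.server.asServer

fun main() {
  val app: HttpHandler = { Response(OK).body("hello") }
  app.asServer(Netty(8080)).start()  // uses the patched ServerConfig
}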

@strohel

strohel commented Aug 24, 2020

Cool - I tried it now and can confirm that the TCP connection is now persistent (Netty even adds a connection: keep-alive header) and wrk is not fooled anymore. Let me run more representative and lengthier benchmarks and report the results.

@strohel

strohel commented Aug 24, 2020

The full benchmark report is here: https://storage.googleapis.com/strohel-pub/bench-http4k-server-engines-excerpt/bench-results.html

The summary is that the fix makes Netty a very viable server engine (yay!). Latency-wise it is the clear winner among the tested engines, at the cost of higher memory demand, apparently.

@daviddenton
Member

That's great! I'll merge that in and get it released, then get it upgraded into the TechEmpower benchmark suite. We should see the results in the next few days.

As an aside, it might also be worth testing the old apache engine (http4k-server-apache4) and the sunhttp engine that comes with core, just to see how they compare. I’ve a suspicion that apache4 might outperform apache5... 🙃

Thanks again!

@daviddenton
Member

Fixed, to be released in 3.259.0

@strohel

strohel commented Aug 26, 2020

As an aside, it might also be worth testing the old apache engine (http4k-server-apache4) and the sunhttp engine that comes with core, just to see how they compare. I've a suspicion that apache4 might outperform apache5... 🙃

Here you go: https://storage.googleapis.com/strohel-pub/bench-http4k-server-engines-excerpt/bench-results.html

  • SunHttp is definitely usable only for development (how can it always have > 40 ms latency when the endpoint handling itself is < 2 ms?)
  • apache4 seems to be a tiiiiny bit faster than apache (5); both have issues with high memory consumption under load (I'll open a separate issue)

@bt-wil

bt-wil commented Jul 4, 2023

I'm seeing similar behavior with slow performance. It occurs when the request.body is used as the response, which generates a connection: close.

package org.example

import org.http4k.core.HttpHandler
import org.http4k.core.Request
import org.http4k.core.Response
import org.http4k.core.Status.Companion.OK
import org.http4k.server.Netty
import org.http4k.server.asServer

fun main() {
  val app: HttpHandler = { request: Request -> Response(OK).body(request.body) }  // echo the request body back
  app.asServer(Netty(8000)).start()
  println("Started")
}


$ wrk --latency http://localhost:8000/test
Running 10s test @ http://localhost:8000/test
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     8.16ms    7.61ms  80.02ms   95.12%
    Req/Sec   201.81    154.68   450.00     57.69%
  Latency Distribution
     50%    7.03ms
     75%    9.66ms
     90%   13.15ms
     99%   37.12ms
  1147 requests in 10.02s, 42.56KB read
  Socket errors: connect 0, read 1147, write 0, timeout 0
Requests/sec:    114.51
Transfer/sec:      4.25KB


$ curl --verbose http://localhost:8000/test
*   Trying 127.0.0.1:8000...
* Connected to localhost (127.0.0.1) port 8000 (#0)
> GET /test HTTP/1.1
> Host: localhost:8000
> User-Agent: curl/7.79.1
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< connection: close
<
* Closing connection 0


$ curl --verbose http://localhost:8000/test -d 'foo'
*   Trying 127.0.0.1:8000...
* Connected to localhost (127.0.0.1) port 8000 (#0)
> POST /test HTTP/1.1
> Host: localhost:8000
> User-Agent: curl/7.79.1
> Accept: */*
> Content-Length: 3
> Content-Type: application/x-www-form-urlencoded
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< connection: close
<
* Closing connection 0
foo

Performance looks good if the response is static.

package org.example

import org.http4k.core.HttpHandler
import org.http4k.core.Request
import org.http4k.core.Response
import org.http4k.core.Status.Companion.OK
import org.http4k.server.Netty
import org.http4k.server.asServer

fun main() {
  val app: HttpHandler = { _: Request -> Response(OK).body("OK") }  // static String body with known length
  app.asServer(Netty(8000)).start()
  println("Started")
}


$ wrk --latency http://localhost:8000/test
Running 10s test @ http://localhost:8000/test
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   324.41us    3.29ms  75.70ms   99.33%
    Req/Sec    53.84k    18.90k   91.49k    70.15%
  Latency Distribution
     50%   68.00us
     75%   94.00us
     90%  157.00us
     99%  465.00us
  1075995 requests in 10.10s, 65.67MB read
Requests/sec: 106500.73
Transfer/sec:      6.50MB


$ curl --verbose http://localhost:8000/test
*   Trying 127.0.0.1:8000...
* Connected to localhost (127.0.0.1) port 8000 (#0)
> GET /test HTTP/1.1
> Host: localhost:8000
> User-Agent: curl/7.79.1
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< content-length: 2
< connection: keep-alive
<
* Connection #0 to host localhost left intact
OK


$ curl --verbose http://localhost:8000/test -d 'foo'
*   Trying 127.0.0.1:8000...
* Connected to localhost (127.0.0.1) port 8000 (#0)
> POST /test HTTP/1.1
> Host: localhost:8000
> User-Agent: curl/7.79.1
> Accept: */*
> Content-Length: 3
> Content-Type: application/x-www-form-urlencoded
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< content-length: 2
< connection: keep-alive
<
* Connection #0 to host localhost left intact
OK

But any stream-like response will trigger the slow performance.

package org.example

import java.io.ByteArrayInputStream
import org.http4k.core.Body
import org.http4k.core.HttpHandler
import org.http4k.core.Request
import org.http4k.core.Response
import org.http4k.core.Status.Companion.OK
import org.http4k.server.Netty
import org.http4k.server.asServer

fun main() {
  val app: HttpHandler = { _: Request ->
    // Body(InputStream) without an explicit length has unknown size,
    // so no Content-Length can be derived for the response
    val body = Body(ByteArrayInputStream("OK".toByteArray()))
    Response(OK).body(body)
  }
  app.asServer(Netty(8000)).start()
  println("Started")
}

This issue appears in the current release (5.2.1.0) and is fine on 3.265.0 -- it was introduced in 3.266.0. It looks like it stems from https://github.com/http4k/http4k/pull/496/files
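
Until that's fixed, a possible workaround sketch (my assumption based on the Body API, not something confirmed in this thread): pass an explicit length when constructing the stream Body, so a Content-Length can be derived again.

import java.io.ByteArrayInputStream
import org.http4k.core.Body
import org.http4k.core.HttpHandler
import org.http4k.core.Request
import org.http4k.core.Response
import org.http4k.core.Status.Companion.OK

val app: HttpHandler = { _: Request ->
  val bytes = "OK".toByteArray()
  // Body(InputStream, Long?) - supplying the length yields a sized stream body
  Response(OK).body(Body(ByteArrayInputStream(bytes), bytes.size.toLong()))
}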
