possible HTTP load problem #17395
Comments
Hi and thanks for opening this issue. To reproduce the problem it should be enough to run the SearchService class and the corresponding Gatling test. On my machine (Windows 7, 64 Bit, 16GB RAM) around 40% of the requests end up in an error message saying
|
Would you mind using the |
@rkuhn that's also a good reminder to add the missing Java overloads for specifying settings. @win1imb The result of such a load test will very much depend on benchmarking tools and setup. "10000 concurrent users" can mean a lot of different things. E.g. if you setup Increasing the number of connections from this point will only mean that connection establishment and request handling will battle for resources which means that timeouts are likely. So, in summary, try to separate testing for performance (How many RPS are possible?) and scalability over connections (What is the performance impact of running requests over more connections?). From what I've seen scalability is ok while performance still needs to be improved. /cc @sirthias |
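One way to see why "10000 concurrent users" is ambiguous is Little's law, which relates the mean number of requests in flight to throughput and mean latency. The sketch below is illustrative arithmetic only: the class name, method names and numbers are made up here and are not measurements from this issue.

```java
// Hypothetical helper for back-of-the-envelope load estimates (not part of Akka or Gatling).
final class LoadEstimate {
    private LoadEstimate() {}

    /** Little's law: mean requests in flight = throughput (req/s) * mean latency (s). */
    static double inFlight(double requestsPerSecond, double meanLatencySeconds) {
        return requestsPerSecond * meanLatencySeconds;
    }

    /** The same law solved for throughput: req/s = requests in flight / mean latency (s). */
    static double requestsPerSecond(double inFlight, double meanLatencySeconds) {
        if (meanLatencySeconds <= 0) {
            throw new IllegalArgumentException("mean latency must be positive");
        }
        return inFlight / meanLatencySeconds;
    }

    public static void main(String[] args) {
        // Assumed numbers: 10000 users, each keeping one request in flight at all times,
        // with a mean latency of 0.1 s, would require this sustained throughput.
        System.out.println(requestsPerSecond(10_000, 0.1) + " req/s");
        // A server sustaining 2000 req/s at 0.05 s mean latency has about this many in flight.
        System.out.println(inFlight(2_000, 0.05) + " requests in flight");
    }
}
```

Under these assumptions, the same user count can translate into very different request rates depending on latency and think time, which is why it helps to measure RPS and connection scalability separately.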
@rkuhn I had a quick look, but I'm not familiar with the internals of akka. If I've understood it correctly, I have to use the Http.get(system).bind(....) method, right? The overload where I can specify the backlog size as a parameter has a lot of other parameters, and I have no clue what to pass for them. @jrudolph you're right, that's why I provided the stress test setup with gatling in my example project, too. As you said, increasing the number of connections will end up in a battle for resources. But if you have to choose between two technologies, one that copes with that and answers the requests, even if not very fast, and another that refuses connections because it cannot manage the battle for resources, which one would you use? In a world with SLAs and zero downtime, given the choice between answering a request in 150ms (which was the mean for spring boot in my comparison) and 40% of connections being refused, I would choose the first option ;) Don't get me wrong, I like akka and actor based programming, especially what it does to handle concurrency and resilience in a very concise way. As I said, possibly the error sits in front of the computer. But at the end of the day, we have to be confident in the technology we use. |
BTW, had to change my username, so the code examples are now at https://github.com/n1ko-w1ll/akka-poc |
My plan was to look into this today, but might not be able to, depending on how the traveling goes. I’ll definitely try out your project to see what’s wrong. |
This assumes that the problem is related to latency (i.e. the processing pipeline is too long but the server still has some CPU capacity left). If you assume the problem is throughput-related (i.e. the server just cannot keep up with work) then the result is not slow responses but a DOS scenario where response times will increase and eventually the server won't answer any more at all. If the problem is throughput-related then in the spirit of reactivity you would actually choose the server that backpressures new connections (e.g. by letting connection attempts time out and relying on clients reconnecting with a backoff strategy) instead of a server which lets itself get overwhelmed with connections/requests. What I meant before with "From what I've seen scalability is ok while performance still needs to be improved." is that akka-http may currently be just too slow to run your example but it doesn't really depend on the number of connections. So, I suspect that all that you are seeing is a consequence of akka-http being too slow. That said, I agree with you that akka-http should be at least in the same ballpark performance-wise as spring. :) Btw. thanks for including the gatling script and sorry that I overlooked it before. |
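The client-side half of that idea, reconnecting with a backoff strategy, could look roughly like the following. This is a minimal sketch of capped exponential backoff with random jitter; the class, the method names and the base/cap values are hypothetical and are not taken from akka-http, Gatling or the reproducer project.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical client-side helper; not part of akka-http or Gatling.
final class ReconnectBackoff {
    private ReconnectBackoff() {}

    /** Upper bound of the wait before retry `attempt` (0-based): base doubled per attempt, capped. */
    static long maxDelayMillis(int attempt, long baseMillis, long capMillis) {
        if (attempt < 0 || baseMillis <= 0 || capMillis < baseMillis) {
            throw new IllegalArgumentException("invalid backoff parameters");
        }
        long delay = baseMillis;
        for (int i = 0; i < attempt && delay < capMillis; i++) {
            // Stop doubling once the next step would exceed the cap (also avoids overflow).
            delay = (delay > capMillis / 2) ? capMillis : delay * 2;
        }
        return delay;
    }

    /** Random wait in [1, maxDelayMillis], so that many clients do not retry in lockstep. */
    static long jitteredDelayMillis(int attempt, long baseMillis, long capMillis) {
        long max = maxDelayMillis(attempt, baseMillis, capMillis);
        return ThreadLocalRandom.current().nextLong(max) + 1;
    }

    public static void main(String[] args) {
        // Assumed values: 100 ms base, 2000 ms cap.
        for (int attempt = 0; attempt <= 5; attempt++) {
            System.out.println("attempt " + attempt
                    + ": wait up to " + maxDelayMillis(attempt, 100, 2000) + " ms"
                    + ", e.g. " + jitteredDelayMillis(attempt, 100, 2000) + " ms");
        }
    }
}
```

With the assumed values the upper bounds grow as 100, 200, 400, 800, 1600 and then stay at 2000 ms, so a refused or timed-out client backs off instead of adding to the overload.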
@jrudolph okay, I agree with that. But I don't think akka-http is too slow. If I reduce the number of concurrent users to 1000 in the given test, akka-http is even a bit faster than spring. |
@n1ko-w1ll well, it is slow :) We first want to make it stable with the minimal set of features to be usable, but we will need to optimize after. I hope we get there soon! |
@drewhk but still faster than spring ;) |
@n1ko-w1ll As @rkuhn suggested Increasing the backlog will then probably help, but unfortunately it is next to impossible to do it in Java code right now. If you can create a Scala class in your project you could create your own version of |
@jrudolph that could be an option. Have to see when I can find time to test it. |
@jrudolph is the IO layer of akka-http pluggable? |
@hepin1989 There is a certain kind of pluggability that allows you to use the HTTP layers without TCP, if that's what you mean. This mechanism is (will be) used to support HTTPS. You can even use a transport layer that is implemented with other implementations of reactive-streams. The implementation itself depends on akka-stream. What kind of application do you have in mind? |
@jrudolph that's really what I have in mind. I am thinking about using https://github.com/netty/netty-tcnative for ssl and netty/aio for tcp, but using akka-http for the programming. So all that's left to do is a reactive-streams layer around this, and then we could connect them all together, right? |
@hepin1989 in theory, it could work and I'd be interested in the results. From our experience, "just a reactive-streams API around this" will still be lots of work and optimizing performance while mixing several stacks together will probably be even harder than trying to optimize just the single stack... |
@jrudolph yes, that's it, so currently we are building our game server plus gm around akka and spray. |
Currently it is extremely ugly to set the backlog parameter from Java, but I have hacked it together here. Running this will not succeed unless setting a few kernel parameters:
When doing so, the test result is quite different:
What we see here is that some requests time out because the server is being overloaded, and we possibly don’t handle all non-nominal processing conditions with perfect grace yet, but it is quite clear that Akka HTTP can handle quite some load already—we will definitely improve, though! Johannes, it might be interesting how only some of the requests get “stuck”, there is a clear latency gap between almost everything going nice and fast and some requests seem to starve. I don’t have the Spring comparison data, so it would be great if Niko could comment on these findings. |
@rkuhn maybe we could add some doc about the server side env setup. like |
Hm, that does not look very impressive yet. I mean, I don't know your hardware and mine is quite powerful for a laptop (i7-4800MQ @ 2.70 GHz with 16 GB RAM). The results for spring on my machine are quite similar and my spring application already implements a query parser with parboiled, translates the query to mongo criterias and retrieves the results from MongoDB.
|
By the way... the application will be deployed on cloudfoundry. I'm not sure if we can edit these environment variables there somehow :( |
@n1ko-w1ll That was what @drewhk was trying to say all along and what we have been very vocal about: you are testing a pre-release version of the first iteration of a new HTTP stack that has not yet been optimized at all. Of course this will not be representative of the performance when it reaches production quality. Concerning the parameters: you cannot possibly have tested successfully with 10000 clients with the MacOS X default limit of 256 open file descriptors per process and 12000 file descriptors in the whole system, independent of which HTTP stack is used, and to my knowledge none of the popular operating system kernels come preconfigured with limits that are suitable for your specific test case. Doing load tests and optimization will always require dedicated configuration of all aspects of the deployment. Another note on my measurement: the point was to show that the refused connections are indeed due to the limited queue of incoming connection requests, I have not tuned the JVM settings at all—in fact I don’t even know how much memory it had available. Finding out how to do these things with Maven goes beyond what time I can spend on this issue today, perhaps you can repeat the measurement with the patched version to get an even comparison. |
I'm not sure if it makes any difference, but as mentioned in my initial post, I'm running Windows 7, not MacOS X ;) I know that all of this is not released yet. I just want to point out, that it works out of the box without any adjustments on operating system level or any special configuration with spring boot and I wondered why this is not possible with akka when I initially tried it. It's the same for me, I don't think I have the time today to test this again. Maybe next week. |
Maybe we could make an online chart which shows the benchmark results across release iterations, because akka has websocket and http now, which is best for microservices and REST facades. |
From this I conclude that on Windows these kernel settings are not needed, and all that was required was to switch the higher backlog parameter value in your test case. |
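For readers unfamiliar with the backlog parameter discussed here: it is the requested length of the queue of pending, not yet accepted connections on a listening socket. The sketch below uses plain JDK sockets to show where that value goes; it is not the Akka HTTP API and not the patched version mentioned above, and the class name and the value 2048 are assumptions for illustration. The operating system may silently cap the requested value.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Illustration with plain JDK sockets; hypothetical class, unrelated to Akka HTTP internals.
final class BacklogSketch {
    private BacklogSketch() {}

    /** Opens a listening socket on an ephemeral loopback port, requesting the given backlog. */
    static ServerSocket listen(int backlog) throws IOException {
        ServerSocket server = new ServerSocket();
        // The second argument is the requested maximum number of pending connections.
        // Its exact semantics are platform-specific and the OS may apply its own limit.
        server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0), backlog);
        return server;
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket server = listen(2048)) {
            System.out.println("listening on port " + server.getLocalPort()
                    + " with a requested backlog of 2048");
        }
    }
}
```

When the queue is full, further connection attempts are typically refused or dropped by the operating system before the application sees them, which matches the "Connection refused" symptom reported in this issue.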
@rkuhn I tested your code on Windows and still received the "Connection refused" errors :( |
We’ll have to revisit this when we are in a position to actually perform real benchmarks and performance analyses, thanks for providing the test case. |
Closing as obsolete, we did a lot of work in the area recently, reopen if needed please |
Just for the record (ticket can stay closed): I repeated the test with the current Akka version (2.4.10) on the same Linux box I used on March 25th. Again the performance increased significantly (by about 30 %):
This result is even better than the result using the Spring version:
TODO repeat the runs on a Windows box. |
reported on twitter by Niko Will (@win1imb), reproducer is at https://github.com/win1imb/akka-poc