Performance of HashedWheelTimer in Netty 4.0.19 was way much better than 4.0.20 #2571
@lokeshhctm we will need more details to help here. In all our benchmarks 4.0.20.Final out-performs 4.0.19.Final. |
How can I get a 4.0.x JAR built at different commits between 4.0.19 and 4.0.20? I am not able to check out the commit numbers listed on the 4.0 branch. Please let me know the commands to check out 4.0.x commits. |
@lokeshhctm you will need to checkout the 4.0 branch and then reset to the specific commit. |
I tried that, but it is not helping. I would appreciate it if you could type out the commands, please. |
Check the commit history on the 4.0 branch: https://github.com/netty/netty/commits/4.0 Then |
I may be doing something wrong: git clone -b 4.0 https://github.com/netty/netty.git Getting an error "fatal: reference is not a tree: 1709113" |
Use git reset --hard $sha On 17 June 2014 at 17:41:52, lokeshhctm (notifications@github.com) wrote: I may be doing something wrong: git clone -b 4.0 https://github.com/netty/netty.git Getting an error "fatal: reference is not a tree: 1709113" |
Should be working though (it does for me)?! Which version of git do you use? |
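For reference, the clone-then-reset steps described above can be sketched as a small shell helper. The function name checkout_at and its argument order are made up for illustration; only git checkout and git reset --hard (two plain ASCII hyphens) are real git commands.

```shell
# Hypothetical helper: point a local branch of an existing clone at one commit.
# usage: checkout_at <repo-dir> <branch> <sha>
checkout_at() {
  repo=$1; branch=$2; sha=$3
  # Make sure the branch is checked out, then move it (and the working
  # tree) to the requested commit. Local changes are discarded.
  git -C "$repo" checkout -q "$branch" &&
  git -C "$repo" reset -q --hard "$sha"
}
```

Against a clone made with git clone -b 4.0 https://github.com/netty/netty.git, this would be called as checkout_at netty 4.0 <sha>, with <sha> taken from the commit history page linked above. Building a JAR from that checkout is a separate step not shown here.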
git bisect might be useful in this situation although in our case the range isn't very wide. |
I used git bisect and found the exact commit that introduced the performance issue we ran into: commit ID 6248db5 |
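A search like this can also be automated when the regression can be detected by a command's exit status. The sketch below is an assumption about how one might script it, not what was actually run in this thread; find_first_bad is a hypothetical helper name, while git bisect start, git bisect run, and git bisect reset are standard git subcommands.

```shell
# Hypothetical helper: print the first bad commit between a known-good and a
# known-bad revision, using a test command to classify each step.
# usage: find_first_bad <repo-dir> <bad-rev> <good-rev> <test-command...>
find_first_bad() {
  repo=$1; bad=$2; good=$3
  shift 3
  # Start the bisection with the bad revision first, then the good one.
  git -C "$repo" bisect start "$bad" "$good" >/dev/null || return 1
  # Run the test command at every step; its exit status marks the commit.
  git -C "$repo" bisect run "$@" >/dev/null
  status=$?
  # Once bisect has finished, refs/bisect/bad names the first bad commit.
  [ "$status" -eq 0 ] && git -C "$repo" rev-parse refs/bisect/bad
  # Return the working tree to where it was before bisecting.
  git -C "$repo" bisect reset >/dev/null 2>&1
  return "$status"
}
```

git bisect run treats exit status 0 as good, 125 as skip, and any other status from 1 to 127 as bad. For a performance regression that only shows up under production load, marking each step by hand with git bisect good and git bisect bad is the usual alternative.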
@lokeshhctm awesome! Could you give me some more details on how the HashedWheelTimer is used here? /cc @slandelle |
To be very honest, I am not aware of how it is being used. We are using Async HTTP Client to manage the connection pools and to get the responses back from the Netty provider (Netty NIO 4.0.x). I don't know which functionality uses it internally. |
Ok I will figure it out with @slandelle then :) Stay tuned. |
@normanmaurer Here's what we do (defaults):
|
|
.setMaxConnectionsTotal(defaultMaxConnections)
.setMaxConnectionsPerHost(totalConnections)
.setConnectionTimeoutInMs(defaultConnectionTimeout)
.setRequestTimeoutInMs(defaultSocketTimeout) |
Values? |
.setMaxConnectionsTotal(64000) |
OK, so if I get it right, you're creating 12K timeouts/s. That's just huge. |
@slandelle I will try to write a micro benchmark and test 4.0.19.Final + later |
@normanmaurer Let me know if I can help. |
@normanmaurer are you able to reproduce the issue? |
@lokeshhctm @slandelle so far I was not able to reproduce it with a micro benchmark :( If there is any way you could provide me with a reproducer, GC logs, a profiler snapshot, etc., please do. |
Even worse... the current 4.0 branch out-performs the old HashedWheelTimer all the time. |
@slandelle could you somehow help me to pin this down ? |
@normanmaurer I don't have more information than you have :( |
@lokeshhctm Do you observe some request timeouts and, if so, do you trigger some special time-expensive operations in AHC Handler's |
@lokeshhctm another thing that may give me some clue would be a GC log of running with 4.0.19.Final and the hwt_test branch. |
I have GC logs for 4.0.19 and current. Let me know how I should send them to you. GitHub does not allow attaching any file other than an image. |
@lokeshhctm norman.maurer at googlemail dot com |
I just sent you the GC logs by email |
OK, so when you were talking about 12K QPS, that was for a whole cluster, not a single node, right? |
I am really sorry about this confusion. Actually, the QPS I mentioned above is the incoming QPS per box. For each incoming request we send out several HTTP requests, almost 10 per incoming request. 4.0.19 : 9000 |
@lokeshhctm seems like current does more GC pauses than 4.0.19.Final... This is quite interesting. Is it reproducible all the time? |
Yes, the problem is reproducible all the time. |
@lokeshhctm ok thanks... This may help me :) We will see |
@lokeshhctm actually I missed that the two logs cover different running times. Could you provide me the GC logs for both versions while running them for the same time (like 5 mins)? |
How many OPS did you get with 4.0.20.Final?
|
I just sent you the GC logs for a 5 min interval over the same time. Currently in production I see: Box with 4.0.19 - 8200 outgoing |
@lokeshhctm not much I can see from the logs... I guess I need a reproducer or maybe profiler snapshots. Sorry :( |
I will try to write a reproducer and will update you |
Thanks!
|
@lokeshhctm any updates ? |
@normanmaurer I sent you an email. Let me know if it is fine to talk |
@normanmaurer I tried after the fix you mentioned above but I still see the issue. Did you try with the sample I sent you by email? |
@lokeshhctm Does this sample involve AHC? If so, I'd gladly have it too. |
Yeah it involves AHC. here you go. /*
import java.util.Timer; /**
|
The issue has been fixed after the commit yesterday. |
Nice :)
|
Fixed by 08426d5 |
We are not using SSL, but we make several HTTP calls for every incoming request to our server. After switching to the latest Netty 4.0.20, I am seeing 10% higher latency than with 4.0.19, which is deployed on other servers, and hence 4.0.20 is not able to support as much QPS: I end up with 10% less QPS on 4.0.20 in comparison to 4.0.19.