New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[UNDERTOW-1377] Test to reproduce concurrent dispatch failure #657
Conversation
Linux Build 2733 outcome was FAILURE using a merge of 1393a97 Failed tests
|
Windows Build 2486 outcome was FAILURE using a merge of 1393a97 Failed tests
|
Linux Build 2734 outcome was FAILURE using a merge of c32831f Failed tests
|
Windows Build 2487 outcome was FAILURE using a merge of c32831f Failed tests
|
Linux Build 2735 outcome was FAILURE using a merge of c32831f Failed tests
|
Windows Build 2488 outcome was FAILURE using a merge of c32831f Failed tests
|
@@ -241,11 +248,18 @@ private void resumeReads(boolean wakeup) { | |||
if(anyAreSet(state, FLAG_DATA_TO_UNWRAP) || wakeup || unwrappedData != null) { | |||
runReadListener(true); | |||
} else { | |||
delegate.getSourceChannel().resumeReads(); | |||
delegateSourceResumeReads(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Invoking resumeReads off the IO thread here causes the failure.
Linux Build 2737 outcome was UNKNOWN using a merge of 77c493f |
Windows Build 2490 outcome was UNKNOWN using a merge of 77c493f |
Client gets a socket timeout where the server is unaware of an active request. This is only reproducible in parallel AND with an exchange.dispatch. I am unable to reproduce this failure without TLS.
Unsure if this is necessary. Previous commit fixes the issue that I can reproduce, it stands to reason that all other invocations should run on the IO thread as well.
77c493f
to
42f35ab
Compare
Linux Build 2739 outcome was FAILURE using a merge of 42f35ab Failed tests
|
Windows Build 2492 outcome was FAILURE using a merge of 42f35ab Failed tests
|
Thanks for the test case, but I don't think this is the correct fix (although the fact that this made the test passed gave me a big clue as to what the actual problem was). In SslReadReadyHandler there is a suspendReads() that is in the wrong place, and was happening after the read listener was invoked. In some circumstances this could happen after the exchange was completed by the work thread, which means the channel ended up suspended when it should be resumed to read the next request. |
Ah, that makes sense, thanks @stuartwdouglas! Once the fix is pushed is there any chance you could tag a release? I've run into this issue on production systems. |
Looks like this test has been merged to master |
2.0.11.Final is tagged |
Client gets a socket timeout where the server is unaware of
an active request. This is only reproducible in parallel AND
with an exchange.dispatch.
I am unable to reproduce this failure without TLS.