-
-
Notifications
You must be signed in to change notification settings - Fork 15.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SSLHandler may throw AssertionError if writes occur before channelAct… #8486
Conversation
…ive fired Motivation: If you attempt to write to a channel with an SslHandler prior to channelActive being called you can hit an assertion. In particular - if you write to a channel it forces some handshaking (through flush calls) to occur. The AssertionError only happens on Java11+. Modifications: - Replace assert by an "early return" in case of the handshake be done already. - Add unit test that verifies we do not hit the AssertionError anymore and that the future is correctly failed. Result: Fixes #8479.
@tbrooks8 PTAL as well... Thanks for reporting! |
@netty-bot test this please |
You can ignore the JDK12 failure on the CI. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it should work but I would have personally thought that the right thing to do would be to defer writes/flushes until the channel was active. It looks like the flush is the key reason we get to this state: is it normal to want to flush a pipeline before it's considered active?
@@ -1745,9 +1745,12 @@ public void operationComplete(Future<Channel> future) throws Exception { | |||
// is in progress. See https://github.com/netty/netty/issues/4718. | |||
return; | |||
} else { | |||
if (handshakePromise.isDone()) { | |||
// If the handshake is done already lets just return directly as there is no need to trigger it again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we include the condition under which we can expect this, specifically that the handshake must have already happened/started (and failed?) before the channelActive event was fired.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bryce-anderson yep.. Let me update the comment
@bryce-anderson the thing is the channel is active already but we did not propagate the channelActive(...) already so yep I think this is expected to work. The thing is that all the promises will fire before the even is propagated through the pipeline, so yep its expected to work |
@bryce-anderson made the scenario more clear in the comments... PTAL. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll take your word on the event ordering issue: I believe that is what's happening but it still seems strange not to wait on the handshake until after we get the channelActive event, but I suspect it's not a big deal.
@bryce-anderson yeah it is really about sequencing here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this fixes my particular issue.
I have not spent a significant amount of time going through the SslHandler
code in a while. But I am suspicious that there being multiple different code paths to start a handshake is introducing other issues.
If the handshake at the engine level gets kickstarted by a flush, it looks like there is other setup that does not happen:
private void handshake(final Promise<Channel> newHandshakePromise) {
final Promise<Channel> p;
if (newHandshakePromise != null) {
final Promise<Channel> oldHandshakePromise = handshakePromise;
if (!oldHandshakePromise.isDone()) {
// There's no need to handshake because handshake is in progress already.
// Merge the new promise into the old one.
oldHandshakePromise.addListener(new FutureListener<Channel>() {
@Override
public void operationComplete(Future<Channel> future) throws Exception {
if (future.isSuccess()) {
newHandshakePromise.setSuccess(future.getNow());
} else {
newHandshakePromise.setFailure(future.cause());
}
}
});
return;
}
handshakePromise = p = newHandshakePromise;
} else if (engine.getHandshakeStatus() != HandshakeStatus.NOT_HANDSHAKING) {
// Not all SSLEngine implementations support calling beginHandshake multiple times while a handshake
// is in progress. See https://github.com/netty/netty/issues/4718.
return;
} else {
// Forced to reuse the old handshake.
p = handshakePromise;
assert !p.isDone();
}
// Begin handshake.
final ChannelHandlerContext ctx = this.ctx;
try {
engine.beginHandshake();
wrapNonAppData(ctx, false);
} catch (Throwable e) {
setHandshakeFailure(ctx, e);
} finally {
forceFlush(ctx);
}
applyHandshakeTimeout(p);
}
For example, in handshake(Promise)
it looks like you will leave the method early if the engine's handshaking status is something besides HandshakeStatus.NOT_HANDSHAKING
. Which means that a handshake timeout never gets scheduled.
@tbrooks8 yep seems like we should cleanup this a bit more. Let me take a look |
@tbrooks8 actually I think the current implementation is fine as if the handshake status is anything different we should have already finished the handshake or it should be still in progress which means a timeout should already have been scheduled. |
So to make it short I think what we have here as fix is fine... if there will be other issues we should fix these in a separate PR with a unit test included. |
I agree the fix here is fine for my case. But I think the following is possible currently (a different issue):
The |
Let me investigate this and open a new issue if needed.
Thanks for all the help btw 👌
… Am 10.11.2018 um 18:12 schrieb Tim Brooks ***@***.***>:
@normanmaurer -
I agree the fix here is fine for my case. But I think the following is possible currently (a different issue):
A client (like my case) initiates the handshake in a flush call. The client hello message is generated and flushed. The SSLEngine now returns a handshake status of NEED_UNWRAP.
The channelActive call happens, since the handshake status is not HandshakeStatus.NOT_HANDSHAKING (line 1740), the method returns immediately without scheduling a timeout. There is now no timeout for this handshake.
The flush route for starting a handshake does not called startHandshakeProcessing. That is only called when the handler is added and the channel is already active or in the channelActive call. The flush route allows for the SSLEngine handshake to start prior to these things happening. And then the status of the engine, causes the handshake(final Promise<Channel> newHandshakePromise) method to exit early.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.
|
@tbrooks8 #8493 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…ive fired
Motivation:
If you attempt to write to a channel with an SslHandler prior to channelActive being called you can hit an assertion. In particular - if you write to a channel it forces some handshaking (through flush calls) to occur.
The AssertionError only happens on Java11+.
Modifications:
Result:
Fixes #8479.