Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SSLHandler can fail if writes occur before channelActive fired #8479

Closed
Tim-Brooks opened this issue Nov 8, 2018 · 4 comments
Closed

SSLHandler can fail if writes occur before channelActive fired #8479

Tim-Brooks opened this issue Nov 8, 2018 · 4 comments
Assignees
Labels
Milestone

Comments

@Tim-Brooks
Copy link
Contributor

Expected behavior

Attempting to write to a channel with an SslHandler when the channel.writeable.isWritable() returns true should be allowed.

Actual behavior

If you attempt to write to a channel with an SslHandler prior to channelActive being called you can hit an assertion. In particular - if you write to a channel it forces some handshaking (through flush calls) to occur. If the handshake fails - the promise is set to failed. This happens around line 1573 in the SslHandler.java.

            if (handshakePromise.tryFailure(cause) || alwaysFlushAndClose) {
                SslUtils.handleHandshakeFailure(ctx, cause, notify);
            }

I can tell that channelActive has not been called because I am in the debugger and the handshakeStarted field returns false. However, ctx.channel().isWriteable() returns true. So I do not think I am doing anything incorrect.

Eventually netty gets around to calling channelActive. This call makes its way down to:

    private void handshake(final Promise<Channel> newHandshakePromise) {
        final Promise<Channel> p;
        if (newHandshakePromise != null) {
            final Promise<Channel> oldHandshakePromise = handshakePromise;
            if (!oldHandshakePromise.isDone()) {
                // There's no need to handshake because handshake is in progress already.
                // Merge the new promise into the old one.
                oldHandshakePromise.addListener(new FutureListener<Channel>() {
                    @Override
                    public void operationComplete(Future<Channel> future) throws Exception {
                        if (future.isSuccess()) {
                            newHandshakePromise.setSuccess(future.getNow());
                        } else {
                            newHandshakePromise.setFailure(future.cause());
                        }
                    }
                });
                return;
            }

            handshakePromise = p = newHandshakePromise;
        } else if (engine.getHandshakeStatus() != HandshakeStatus.NOT_HANDSHAKING) {
            // Not all SSLEngine implementations support calling beginHandshake multiple times while a handshake
            // is in progress. See https://github.com/netty/netty/issues/4718.
            return;
        } else {
            // Forced to reuse the old handshake.
            p = handshakePromise;
            assert !p.isDone();
        }

        // Begin handshake.
        final ChannelHandlerContext ctx = this.ctx;
        try {
            engine.beginHandshake();
            wrapNonAppData(ctx, false);
        } catch (Throwable e) {
            setHandshakeFailure(ctx, e);
        } finally {
           forceFlush(ctx);
        }
        applyHandshakeTimeout(p);
    }

via:

	at io.netty.handler.ssl.SslHandler.handshake(SslHandler.java:1747)
	at io.netty.handler.ssl.SslHandler.startHandshakeProcessing(SslHandler.java:1666)
	at io.netty.handler.ssl.SslHandler.channelActive(SslHandler.java:1807)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:213)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:199)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelActive(AbstractChannelHandlerContext.java:192)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelActive(DefaultChannelPipeline.java:1422)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:213)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:199)
	at io.netty.channel.DefaultChannelPipeline.fireChannelActive(DefaultChannelPipeline.java:941)
	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:311)
	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:341)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:632)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:544)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:498)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:458)
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:897)
	at java.base/java.lang.Thread.run(Thread.java:834)

There is an assertion at line 1747 that triggers because the handshake status is set to HandshakeStatus.NOT_HANDSHAKING and the handshake promise is done:

        } else if (engine.getHandshakeStatus() != HandshakeStatus.NOT_HANDSHAKING) {
            // Not all SSLEngine implementations support calling beginHandshake multiple times while a handshake
            // is in progress. See https://github.com/netty/netty/issues/4718.
            return;
        } else {
            // Forced to reuse the old handshake.
            p = handshakePromise;
            assert !p.isDone();
        }

Steps to reproduce

I reliably reproduce this by attaching a listener to the netty connect future that initiates an application level write. When the connection is complete we start a write the does enough to cause the handshake to fail before the channelActive is called. The reason the handshake is failing is a specific scenario we are testing:

javax.net.ssl.SSLHandshakeException: No appropriate protocol (protocol is disabled or cipher suites are inappropriate)
	at sun.security.ssl.HandshakeContext.<init>(HandshakeContext.java:163) ~[?:?]
	at sun.security.ssl.ClientHandshakeContext.<init>(ClientHandshakeContext.java:95) ~[?:?]
	at sun.security.ssl.TransportContext.kickstart(TransportContext.java:217) ~[?:?]
	at sun.security.ssl.SSLEngineImpl.writeRecord(SSLEngineImpl.java:167) ~[?:?]
	at sun.security.ssl.SSLEngineImpl.wrap(SSLEngineImpl.java:136) ~[?:?]
	at sun.security.ssl.SSLEngineImpl.wrap(SSLEngineImpl.java:116) ~[?:?]

Which explains why the handshake is failing so fast.

The stacktrace for the connection callback -> write process is:

	at io.netty.handler.ssl.SslHandler.setHandshakeFailure(SslHandler.java:1573)
	at io.netty.handler.ssl.SslHandler.setHandshakeFailure(SslHandler.java:1542)
	at io.netty.handler.ssl.SslHandler.flush(SslHandler.java:776)
	at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:776)
	at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:768)
	at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:749)
	at io.netty.handler.logging.LoggingHandler.flush(LoggingHandler.java:265)
	at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:776)
	at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:768)
	at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:749)
	at io.netty.channel.ChannelDuplexHandler.flush(ChannelDuplexHandler.java:117)
	at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:776)
	at io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:802)
	at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:814)
	at io.netty.channel.AbstractChannelHandlerContext.writeAndFlush(AbstractChannelHandlerContext.java:794)
	at io.netty.channel.DefaultChannelPipeline.writeAndFlush(DefaultChannelPipeline.java:1066)
	at io.netty.channel.AbstractChannel.writeAndFlush(AbstractChannel.java:305)
	at org.elasticsearch.transport.netty4.Netty4TcpChannel.sendMessage(Netty4TcpChannel.java:142)

<Irrelevant Elasticsearch code>

org.elasticsearch.transport.netty4.Netty4TcpChannel.lambda$new$1(Netty4TcpChannel.java:67)
	at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:511)
	at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:504)
	at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:483)
	at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:424)
	at io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:103)
	at io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:84)
	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:306)
	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:341)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:632)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:544)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:498)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:458)
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:897)
	at java.base/java.lang.Thread.run(Thread.java:834)

Netty version

Netty 4.1.30-Final

JVM version (e.g. java -version)

openjdk version "11" 2018-09-25

OS version (e.g. uname -a)

Darwin 18.2.0 Darwin Kernel Version 18.2.0

@normanmaurer
Copy link
Member

@tbrooks8 thanks... I will try to find some time to fix this or if you are interested in providing a PR I am happy to review as well :)

@Tim-Brooks
Copy link
Contributor Author

I might have time tomorrow to put together a PR. I'll let you know.

@normanmaurer
Copy link
Member

normanmaurer commented Nov 9, 2018

@tbrooks8 don't worry I am able to reproduce with a unit test... Seems to only happen on Java11 tho :) Should have a fix soon.

@normanmaurer normanmaurer added this to the 4.1.32.Final milestone Nov 9, 2018
@normanmaurer normanmaurer self-assigned this Nov 9, 2018
normanmaurer added a commit that referenced this issue Nov 9, 2018
…ive fired

Motivation:

If you attempt to write to a channel with an SslHandler prior to channelActive being called you can hit an assertion. In particular - if you write to a channel it forces some handshaking (through flush calls) to occur.

The AssertionError only happens on Java11+.

Modifications:

- Replace assert by an "early return" in case of the handshake be done already.
- Add unit test that verifies we do not hit the AssertionError anymore and that the future is correctly failed.

Result:

Fixes #8479.
@normanmaurer
Copy link
Member

Should be fixed by #8486

normanmaurer added a commit that referenced this issue Nov 11, 2018
#8486)

Motivation:

If you attempt to write to a channel with an SslHandler prior to channelActive being called you can hit an assertion. In particular - if you write to a channel it forces some handshaking (through flush calls) to occur.

The AssertionError only happens on Java11+.

Modifications:

- Replace assert by an "early return" in case of the handshake be done already.
- Add unit test that verifies we do not hit the AssertionError anymore and that the future is correctly failed.

Result:

Fixes #8479.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants