org.jboss.netty.channel.ChannelFuture.awaitUninterruptibly() hangs when closing a channel using SSL/TLS #343

Closed
charlesk40 opened this Issue May 19, 2012 · 23 comments

5 participants

@charlesk40

First issue was asked here.
http://stackoverflow.com/questions/9804982/org-jboss-netty-channel-channelfuture-awaituninterruptibly-hangs-when-closing

We are also noticing the same problem with Netty 3.3.1.Final where awaitUninterruptibly hangs and channel doesn't seem to get closed.

As a work around, we tried this which seems to do the trick.


  • public class SslIdleStateHandler extends IdleStateHandler {

  • public SslIdleStateHandler(org.jboss.netty.util.Timer timer,
  • int readerIdleTimeSeconds, int writerIdleTimeSeconds,
  • int allIdleTimeSeconds) {
  • super(timer, readerIdleTimeSeconds, writerIdleTimeSeconds, allIdleTimeSeconds);
  • }

  • @Override
  • protected void channelIdle(
  • ChannelHandlerContext ctx, IdleState state, long lastActivityTimeMillis) throws Exception {
  • if (config.getSslContext()!=null){
  • ctx.getChannel().close(); //We close here which seems to do the trick.
  • }
  • super.channelIdle(ctx, state, lastActivityTimeMillis);
  • }
  • } }
@normanmaurer
The Netty Project member

@charlesk40 Let me try to understand whats going on later.. BTW, could you upgrade to netty 3.4.5.Final and see if you still see this ? Just to be sure we not fixed it..

@charlesk40

Tried with netty 3.4.5.Final and I can still see that awaitUninterruptibly hangs.

@normanmaurer
The Netty Project member
@normanmaurer normanmaurer was assigned May 19, 2012
@normanmaurer
The Netty Project member

@charlesk40 I tried to reproduce it via testcase but did not work.

Can you see if I did it the right way:
4bf2257

@charlesk40

Okay so by sending bad SSL request to a SSL enabled Server does close the Channel.

However, sending a good SSL request to a SSL enable Server with IdleStateHandler enabled cause this state.
(Triggering a timeout)

Trying to reproduce this with a testcase but no luck either.
However, I was able to locate where it might be causing the problem for Channel to NOT close in the SslHandler:

write(ctx, future, msg) method inside of an wrapNonAppData called from closeOutboundAndChannel method could fail to set the future to completed which may cause ClosingChannelFutureListener to NOT close the channel properly. I'm unable to reproduce this with an unit test but can cause it by using curl to send https request.

private ChannelFuture wrapNonAppData(ChannelHandlerContext ctx, Channel channel) throws SSLException {
ChannelFuture future = null;
ByteBuffer outNetBuf = bufferPool.acquire();

    SSLEngineResult result;
    try {
        for (;;) {
            synchronized (handshakeLock) {
                result = engine.wrap(EMPTY_BUFFER, outNetBuf);
            }

            if (result.bytesProduced() > 0) {
                outNetBuf.flip();
                ChannelBuffer msg = ChannelBuffers.buffer(outNetBuf.remaining());
                msg.writeBytes(outNetBuf.array(), 0, msg.capacity());
                outNetBuf.clear();

                future = future(channel);
                future.addListener(new ChannelFutureListener() {
                    public void operationComplete(ChannelFuture future)
                            throws Exception {
                        if (future.getCause() instanceof ClosedChannelException) {
                            synchronized (ignoreClosedChannelExceptionLock) {
                                ignoreClosedChannelException ++;
                            }
                        }
                    }
                });

                write(ctx, future, msg);
            }

            final HandshakeStatus handshakeStatus = result.getHandshakeStatus();
            handleRenegotiation(handshakeStatus);
            switch (handshakeStatus) {
            case FINISHED:
                setHandshakeSuccess(channel);
                runDelegatedTasks();
                break;
            case NEED_TASK:
                runDelegatedTasks();
                break;
            case NEED_UNWRAP:
                if (!Thread.holdsLock(handshakeLock)) {
                    // unwrap shouldn't be called when this method was
                    // called by unwrap - unwrap will keep running after
                    // this method returns.
                    unwrap(ctx, channel, ChannelBuffers.EMPTY_BUFFER, 0, 0);
                }
                break;
            case NOT_HANDSHAKING:
            case NEED_WRAP:
                break;
            default:
                throw new IllegalStateException(
                        "Unexpected handshake status: " + handshakeStatus);
            }

            if (result.bytesProduced() == 0) {
                break;
            }
        }
    } catch (SSLException e) {
        setHandshakeFailure(channel, e);
        throw e;
    } finally {
        bufferPool.release(outNetBuf);
    }

    if (future == null) {
        future = succeededFuture(channel);
    }

    return future;
}
@normanmaurer
The Netty Project member

@charlesk40 Can you show me how you can reproduce via curl ?

@normanmaurer
The Netty Project member

@charlesk40 also I think the code is correct as it will send the event upstream if it could not add the future and so it should be closed.

@charlesk40

I'm just using "cacert" option in curl to send using self-signed certicate. Example: curl -v --cacert outcert.pem https://localhost:8081/echo?sleep=1000 to send a request.

You mean it will send a downstream event.

Further tracing through the code, it shows that when write(ctx, future, msg);
gets invoked, it gets to the NioWorker thread which schedules for the write and marks future to completed. But "write" never happens (Don't know why at this moment. Still needs to investigate) and we never get notified which is causing the hang.

So, now the question is do I need to add a downstream handler when SslHandler invokes write(ctx, future, msg); to handle the event? (Like I have done with closing the channel myself from my first comment)

@normanmaurer
The Netty Project member

@charlesk40 to make it more easy can you upload your server code somewhere so I can just run it and use curl ?

@charlesk40

Below is a unit test that you can use to reproduce the issue. Just replace the code that generates the SSLContext.

public class SslCloseTestCase {

@Test
public void requireThatServerCanCloseToSslIdleRequestTestCaseForNetty() throws IOException, InterruptedException {
    HashedWheelTimer idleTimer = new HashedWheelTimer();
    ServerBootstrap sb = new ServerBootstrap(new NioServerSocketChannelFactory());
    Executor channelWorkerExecutor = Executors.newCachedThreadPool(Executors.defaultThreadFactory());
    SslKeyStore sslKeyStore = new SslKeyStore("src/test/resources/ssl_keystore_test.jks");
    sslKeyStore.setKeyStorePassword("secret");
    SSLContext sslContext = SslContextFactory.newInstance(sslKeyStore, sslKeyStore).getServerSSLContext();
    SSLEngine sse = sslContext.createSSLEngine();
    sse.setUseClientMode(false);
    sb.getPipeline().addFirst("ssl", new SslHandler(sse));
    sb.getPipeline().addLast("idler",new IdleStateHandler(idleTimer, 0, 0, 40));  //Setting the idle timeout to 40 seconds.
    TestIdleStateAwareChannelUpstreamHandler testIdleStateAwareChannelUpstreamHandler = new TestIdleStateAwareChannelUpstreamHandler();
    sb.getPipeline().addLast("handler", testIdleStateAwareChannelUpstreamHandler); 
    channelWorkerExecutor.execute(testIdleStateAwareChannelUpstreamHandler);
    Channel serverChannel = sb.bind(new InetSocketAddress(0));
    //Client
    ClientBootstrap cb = new ClientBootstrap(new NioClientSocketChannelFactory());
    SslKeyStore sslKeyStoreClient = new SslKeyStore("src/test/resources/ssl_keystore_test.jks");
    sslKeyStoreClient.setKeyStorePassword("secret");
    SSLContext sslContextClient = SslContextFactory.newInstanceFromTrustStore(sslKeyStoreClient).getServerSSLContext();
    SSLEngine sseClient = sslContextClient.createSSLEngine();
    sseClient.setUseClientMode(true);
    cb.getPipeline().addLast("ssl", new SslHandler(sseClient));
    Channel cc = cb.connect(serverChannel.getLocalAddress()).awaitUninterruptibly().getChannel();
    ChannelFuture f = cc.write(ChannelBuffers.copiedBuffer("encrypted", CharsetUtil.US_ASCII));
    f.await();
    cc.getCloseFuture().awaitUninterruptibly();
    cb.releaseExternalResources();
    sb.releaseExternalResources();
}

class TestIdleStateAwareChannelUpstreamHandler extends IdleStateAwareChannelUpstreamHandler implements Runnable {
    private volatile Channel serverChannel;
    private boolean readyToClose;
    @Override
    public void channelConnected(ChannelHandlerContext ctx, ChannelStateEvent e) throws Exception {
        // Called by Netty thread.
        if (serverChannel != null) {
            throw new IllegalStateException();
        }
        serverChannel = ctx.getChannel();
    }

    @Override
    public void messageReceived(ChannelHandlerContext ctx, MessageEvent e) {
        testIt(true);
    }

    private void testIt(boolean fail){
        SleepHandler sleepHandler = new SleepHandler();
        if (fail){
            //Same thread. This LOCKS f.awaitUninterruptibly!!!
            sleepHandler.run();
        }else{
            //New thread. Passes.
            Executor executor = Executors.newCachedThreadPool(Executors.defaultThreadFactory());
            executor.execute(sleepHandler);
        }
    }

    @Override
    public void channelIdle(ChannelHandlerContext ctx, IdleStateEvent e) throws Exception {
        System.out.println("Close channel");
        readyToClose = true;
    }

    @Override
    public void run() {
        while(true){
            if (readyToClose){
                ChannelFuture f = serverChannel.close();
                Assert.assertTrue(f.awaitUninterruptibly(TimeUnit.MILLISECONDS.convert(5, TimeUnit.SECONDS))); 
                serverChannel = null;
                break;
            }else{
                try {
                    Thread.sleep(5000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        }
    }

    class SleepHandler implements Runnable{
        @Override
        public void run() {
            while (true){

            }
        }
    }
}

}

@normanmaurer
The Netty Project member

@charlesk40 Thanks! Will have a look in a few..

@normanmaurer
The Netty Project member

@charlesk40 the problem is that you keep the thread/cpu busy. change while(true) to :

            while (!Thread.interrupted()) {

            }

And you will see it works

@charlesk40

Well, SleepHandler just mimics the handler that could take long time,
and in a real handler senario, while(true) loop is a valid usage.

@charlesk40

Something more like which does some job for long time.

    class SleepHandler implements Runnable{
        @Override
        public void run() {
            while (true){
                //do something here
                try {

// System.out.println("Sleeping...");
Thread.sleep(1000);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
}

@trustin
The Netty Project member

We see this issue when: for some reason SslHandler's close_notify message was not written completely and Netty didn't get any notification of write failure. The workaround is to set some timer and force-close the connection:

pipeline.remove(SslHandler.class);
channel.close();

In Netty 4, SslHandler uses its internal timer to schedule a force close.

@alainw

Are there plans to have the fix before Netty 4? Any idea of the timeline, or when Netty 4 will be released? Thanks.

@trustin
The Netty Project member

The first alpha of 4.0.0 is out. Announcement pending.

@trustin
The Netty Project member

Fixed in the 4.0.0.Alpha. (probably 4 or 5)

@trustin trustin closed this Oct 26, 2012
@normanmaurer
The Netty Project member

@cloudaloe do you have any easy way for me to reproduce ?

@cloudaloe

Sure. But I'm first ruling out the possibility that I need to explicitly initiate a handshake or explicitly wait for the automatically initiated handshake to finish before issuing .write on the channel. I found some mentions of that aspect and would like to rule it out first (here and related).

@cloudaloe

@normanmaurer My issue was something else (resolved through debugging with -Djavax.net.debug=ssl:handshake). Sorry for the irrelevance to this thread.

@normanmaurer
The Netty Project member

@cloudaloe you may want to share what your issue was just in case someone else has the same issue...

@cloudaloe

Sure, somewhere along rounds of refactoring while upgrading to Netty 4 Beta 1, my client side code was inadvertently left using setUseClientMode(false). It seems this was causing the handshake against my server component to just hang (without propagating any error) which could be observed using -Djavax.net.debug=ssl:handshake. Netty was actually consistent in not firing the listener for the handshake and in not letting the write operation through. Again, not relevant to the bug this thread is actually about.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment