
Using channel Local : java.lang.IllegalStateException: await*() in I/O thread causes a dead lock or sudden performance drop. Use addListener() instead or call await*() from a different thread. #387

Closed
fredericBregier opened this issue Jun 9, 2012 · 17 comments

@fredericBregier
Member

Hi,

Between version 3.5.0.beta and 3.5.0 final, a change occurred concerning Local Channels, and it causes an issue in my project.

Previously, there was no problem with calling await*() on such a channel, since it is in-thread only.
Now it always raises the following error:

java.lang.IllegalStateException: await*() in I/O thread causes a dead lock or sudden performance drop. Use addListener() instead or call await*() from a different thread.

My point of view is that, for a Local Channel, it should not be necessary to add an extra thread layer between the channel handling and the work to be done. Being forced to add such a thread could IMHO have a huge performance impact. Most of the time, those Local Channels are already running within a thread (in my case, always) that is separate from a real (network) Channel.

Is there any reason for this change?

Not to mention it will have a huge impact on my code; if there is a real reason, OK, I will make the necessary changes on my side, but I do think it seems unnecessary for Local Channel handling...

@fredericBregier
Member Author

Oh, and by the way, I just tried to add a new ExecutionHandler in the pipeline to ensure I have one, and...
(I had one on the server side but none on the client side, so I simply added one on the client side.)

The error is still there! So I suppose there is an issue in the detection?

But it does not impact "Network" channels...

@fredericBregier
Member Author

The only thing I saw that might be related is the following fix:

7800187#src/main/java/org/jboss/netty

but I am not sure it is related... It seems connected to the new:

public final void run() {
    try {
        DeadLockProofWorker.PARENT.set(executor);
        runTask();
    } finally {
        DeadLockProofWorker.PARENT.remove();
    }
}
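To see why any await*() inside a task wrapped like this must throw, here is a hypothetical, simplified re-implementation of the DeadLockProofWorker / checkDeadLock pair using only java.util.concurrent. The names PARENT and checkDeadLock come from Netty 3's source as quoted in this thread; the helper names (start, awaitTripsCheck) and the exact structure are illustrative, not Netty's actual code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;

// Simplified sketch of Netty 3's dead-lock detection: any thread that
// entered through a wrapped Runnable has PARENT set, and await() refuses
// to run on such a thread.
public class DeadLockCheckDemo {

    // Mirrors DeadLockProofWorker.PARENT: non-null while a wrapped task
    // submitted through the executor is running on this thread.
    static final ThreadLocal<Executor> PARENT = new ThreadLocal<>();

    // Mirrors DefaultChannelFuture.checkDeadLock(): a set PARENT means
    // "this thread belongs to an executor" and blocking is forbidden.
    static void checkDeadLock() {
        if (PARENT.get() != null) {
            throw new IllegalStateException(
                "await*() in I/O thread causes a dead lock ...");
        }
    }

    // Mirrors the run() method quoted above: set PARENT around the task.
    static void start(Executor executor, Runnable task) {
        executor.execute(() -> {
            PARENT.set(executor);
            try {
                task.run();
            } finally {
                PARENT.remove();
            }
        });
    }

    // Returns true if checkDeadLock() threw inside the wrapped task.
    public static boolean awaitTripsCheck() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        AtomicBoolean tripped = new AtomicBoolean();
        CountDownLatch done = new CountDownLatch(1);
        start(pool, () -> {
            try {
                checkDeadLock();   // the first thing await() does
            } catch (IllegalStateException e) {
                tripped.set(true);
            } finally {
                done.countDown();
            }
        });
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdown();
        return tripped.get();
    }

    public static void main(String[] args) {
        System.out.println("tripped=" + awaitTripsCheck()); // prints tripped=true
    }
}
```

Note how the check does not distinguish between Netty's own I/O workers and a user-supplied pool: anything submitted through the wrapping executor marks its thread, which is the behavior this thread goes on to debug.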

I used a different ExecutorService, but with the same Executor behind it, on the server and client sides.
I even tried to use a completely different Executor, but that does not work either.

Any idea ?

@fredericBregier
Member Author

Small correction on the behavior; here is the scenario I found:

  1. A network channel receives a message (an OrderedMemoryAwareThreadPoolExecutor sits before the handler).
  2. It forwards the message to a local channel.
  3. The local channel server receives it (again with an OrderedMemoryAwareThreadPoolExecutor before the handler) in LocalServerHandler.messageReceived.
  4. It then decides to send a message back to the remote client over the network channel (writeAbstractLocalPacket).
  5. It waits for the message to be sent (same writeAbstractLocalPacket).
     At that point "checkDeadLock" raises the error.

But there is an OMATPE in each step, so why?
I believe the checkDeadLock test is not correct, then?

Here is the exception trace:

   Unexpected exception from downstream Ref Channel: [id: 0x52cab854, local:ephemeral => local:0]
    java.lang.IllegalStateException: await*() in I/O thread causes a dead lock or sudden performance drop. Use addListener() instead or call await*() from a different thread.
at org.jboss.netty.channel.DefaultChannelFuture.checkDeadLock(DefaultChannelFuture.java:343) ~[netty-3.5.0.Final.jar:na]
at org.jboss.netty.channel.DefaultChannelFuture.await(DefaultChannelFuture.java:230) ~[netty-3.5.0.Final.jar:na]
at openr66.protocol.utils.ChannelUtils.writeAbstractLocalPacket(ChannelUtils.java:278) ~[classes/:na]
at openr66.protocol.localhandler.LocalServerHandler.request(LocalServerHandler.java:1209) ~[classes/:na]
at openr66.protocol.localhandler.LocalServerHandler.messageReceived(LocalServerHandler.java:327) ~[classes/:na]
at org.jboss.netty.handler.execution.ChannelUpstreamEventRunnable.doRun(ChannelUpstreamEventRunnable.java:45) [netty-3.5.0.Final.jar:na]
at org.jboss.netty.handler.execution.ChannelEventRunnable.run(ChannelEventRunnable.java:64) [netty-3.5.0.Final.jar:na]
at org.jboss.netty.handler.execution.OrderedMemoryAwareThreadPoolExecutor$ChildExecutor.run(OrderedMemoryAwareThreadPoolExecutor.java:315) [netty-3.5.0.Final.jar:na]
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [na:1.6.0_31]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [na:1.6.0_31]
at java.lang.Thread.run(Thread.java:662) [na:1.6.0_31]

So the exception is raised while the wait is really not called from an I/O thread (see the stack trace)...
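The exception message above recommends addListener() over await*(). As a sketch of the difference between the two styles, here is a self-contained analogue using java.util.concurrent.CompletableFuture as a stand-in for Netty's ChannelFuture. This is not Netty API; it only mirrors the shape of the two patterns.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Blocking vs. listener-based completion, with CompletableFuture standing
// in for ChannelFuture. The listener style is what the exception message
// asks for: follow-up work runs in a callback, not on the handler thread.
public class ListenerVsAwait {

    // Blocking style: the calling thread parks until the write completes.
    // Done from an I/O thread, this is exactly what the check forbids.
    static String blockingStyle(CompletableFuture<String> writeFuture) {
        return writeFuture.join(); // analogous to future.awaitUninterruptibly()
    }

    // Listener style: register a callback and return immediately,
    // analogous to future.addListener(...). The close (or any follow-up)
    // happens in the callback once the "write" completes.
    static String listenerStyle() {
        CompletableFuture<String> write = new CompletableFuture<>();
        AtomicReference<String> out = new AtomicReference<>();
        CountDownLatch done = new CountDownLatch(1);

        write.thenAccept(result -> {   // the "addListener" step
            out.set(result);           // e.g. close the local channel here
            done.countDown();
        });

        ExecutorService io = Executors.newSingleThreadExecutor();
        io.execute(() -> write.complete("written")); // "I/O" finishes the write
        try {
            done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        io.shutdown();
        return out.get();
    }

    public static void main(String[] args) {
        System.out.println(listenerStyle()); // prints written
    }
}
```

In Netty 3 terms, step 5 of the scenario (await on the write before closing) would become a ChannelFutureListener registered on the write future, which is why the checker pushes users in that direction.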

@fredericBregier
Member Author

Finally, by stepping through, it seems the issue is more in ChannelEventRunnable.
When it sets "DeadLockProofWorker.PARENT.set(executor);", the stack trace is the following:

at org.jboss.netty.handler.execution.ChannelEventRunnable.run(ChannelEventRunnable.java:64)
at org.jboss.netty.handler.execution.OrderedMemoryAwareThreadPoolExecutor$ChildExecutor.run(OrderedMemoryAwareThreadPoolExecutor.java:315)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)

and later on, when the wait occurs:

   DeadLockProofWorker.PARENT.get() != null    evaluates to true

So, do you have any idea?
I know this code is very new in 3.5, but I do not know its impact...

@normanmaurer
Member

Could you show me some code to reproduce it?

@fredericBregier
Member Author

And to finish with my analysis:

I double-checked, and the Executor that is set in ChannelEventRunnable and checked later on is my local OrderedMemoryAwareThreadPoolExecutor, not an Executor from Netty's I/O thread side...

@fredericBregier
Member Author

Hmm, for code to reproduce it, I will try (I have to go to a meeting right now).
But I think the code that sets the Executor into PARENT should maybe first check whether that Executor is an OMATPE. But how to do that?

@trustin
Member

trustin commented Jun 9, 2012

DefaultChannelFuture.setUseDeadLockChecker(false) will fix your problem although it's not recommended.
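DefaultChannelFuture.setUseDeadLockChecker(false) is a real static switch in Netty 3 that disables the check globally. The self-contained sketch below imitates its effect on the simplified checker from earlier in the thread; only the setUseDeadLockChecker idea mirrors Netty, the class and helper names here are illustrative.

```java
import java.util.concurrent.Executor;

// Sketch of a globally switchable dead-lock check, in the spirit of
// DefaultChannelFuture.setUseDeadLockChecker(false). Disabling it makes
// the symptom disappear but leaves the underlying blocking call in place.
public class DeadLockSwitchDemo {
    static final ThreadLocal<Executor> PARENT = new ThreadLocal<>();
    static volatile boolean useDeadLockChecker = true;

    static void setUseDeadLockChecker(boolean use) {
        useDeadLockChecker = use;
    }

    // The check short-circuits entirely when the flag is off.
    static void checkDeadLock() {
        if (useDeadLockChecker && PARENT.get() != null) {
            throw new IllegalStateException("await*() in I/O thread ...");
        }
    }

    // Returns true if the check threw with PARENT set on this thread.
    static boolean trips() {
        PARENT.set(Runnable::run); // pretend we are inside a wrapped task
        try {
            checkDeadLock();
            return false;
        } catch (IllegalStateException e) {
            return true;
        } finally {
            PARENT.remove();
        }
    }

    public static void main(String[] args) {
        System.out.println(trips());  // true: checker enabled
        setUseDeadLockChecker(false); // the suggested bypass
        System.out.println(trips());  // false: check skipped
    }
}
```

As the follow-up comments note, this is a bypass rather than a fix: it silences the warning everywhere, including the cases where blocking in an I/O thread really would deadlock.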

@trustin trustin closed this as completed Jun 9, 2012
@fredericBregier
Member Author

@trustin I know that option, but as you said, it is not recommended.
And I think it might hide other issues besides mine.
So I am trying right now to create an example that illustrates the issue. If I can, I will reopen the issue, since it is really not fixed. What you proposed is only a bypass, not a fix.

@fredericBregier
Member Author

Hi,

I managed to make an almost clean example. Almost, since the client does not finish correctly with the answer, but that was not the goal of this example.
The main goal is to illustrate what happens on the server side.

The logic is:

  1. Client side: an HTTP codec that simply sends one request.
  2. Server side: an HTTP codec on a network channel, with all the necessary OMATPEs.
    a) At reception, it creates a connection to a local channel, again with all the necessary OMATPEs (new ones).
    b) It sends a simple message so that the local channel server will be able to retrieve the network channel.
    c) The local channel server receives the message and sends an answer back to the client directly on the network channel.
    d) The local channel server waits on the write and closes the local channel afterwards (I know I can do it differently, but this is only to illustrate the wait, not the close).

In Netty 3.5.beta, everything is OK (except that the client does not finish with the data, but that is out of scope here).
In Netty 3.5.final, I get the exception again at step 2d, but I should not. So I believe there is something to fix in Netty.

The code can be downloaded from the following link:

https://docs.google.com/open?id=0B-tOt1H9VwrFbkFTM3FyOVhvbE0

@trustin
Member

trustin commented Jun 10, 2012

Please do not create a milestone when not necessary. Setting the milestone to 3.5.1.Final.

@fredericBregier
Member Author

First, sorry, I mistyped the milestone when I added my comment.

I am trying to understand how to fix it. For the moment, I am able to fix it by testing whether the executor is of type MATPE (and therefore also OMATPE and its subtypes).

But I would like to highlight that I don't understand how a single static object can be shared among all threads to test whether the executor behaviour is correct, when this logic is local to one channel. So there might be another, more stable or cleaner solution...

By the way, I simplified the test example and uploaded it again to the link below. I also tried using only LocalChannel, but no issue was encountered. So the issue appears when we have both a network and a local channel (which I do, to enable multiplexing).

https://docs.google.com/open?id=0B-tOt1H9VwrFQ2QzMm9ZUDFJVWc
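The type-check fix described in the comment can be sketched in a self-contained way as follows. MemoryAwareThreadPoolExecutor is a real Netty 3 class; here a hypothetical HandlerPool marker interface stands in for it, and the guard in start() skips the PARENT bookkeeping for such application-level pools.

```java
import java.util.concurrent.Executor;

// Hypothetical sketch of the proposed fix: only mark threads belonging
// to a "real" executor, and skip application-level handler pools
// (Netty's MemoryAwareThreadPoolExecutor and subtypes in the real code).
public class TypeCheckGuardDemo {
    static final ThreadLocal<Executor> PARENT = new ThreadLocal<>();

    // Marker stand-in for MemoryAwareThreadPoolExecutor.
    interface HandlerPool extends Executor {}

    static void start(Executor executor, Runnable task) {
        executor.execute(() -> {
            boolean mark = !(executor instanceof HandlerPool); // the guard
            if (mark) PARENT.set(executor);
            try {
                task.run();
            } finally {
                if (mark) PARENT.remove();
            }
        });
    }

    // Returns whether PARENT was visible inside the task.
    static boolean parentSeen(Executor executor) {
        final boolean[] seen = new boolean[1];
        start(executor, () -> seen[0] = PARENT.get() != null);
        return seen[0];
    }

    public static void main(String[] args) {
        Executor plain = Runnable::run;   // stands in for an I/O worker
        HandlerPool pool = Runnable::run; // stands in for an OMATPE
        System.out.println(parentSeen(plain)); // true: would be checked
        System.out.println(parentSeen(pool));  // false: check skipped
    }
}
```

Note that parentSeen only works synchronously here because the sample executors run tasks inline; with a real thread pool it would need a latch. The sketch also shows the limitation the commenter raises: the decision is per-executor and global, not per-channel.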

@normanmaurer
Member

@fredericBregier sorry, I was quite busy... I will try to have a look today, or tomorrow at the latest.

@normanmaurer
Member

@fredericBregier Will check tomorrow... Too busy today :(

normanmaurer added a commit that referenced this issue Jun 13, 2012
@normanmaurer
Member

@fredericBregier could you please retest with the latest code from the 3 branch?

@fredericBregier
Member Author

@normanmaurer Yes, I confirm that it is OK with the latest 3 branch.
Making the PARENT different did the trick!
Thank you Norman!

I am closing this issue and the proposal.

@dimmacro

@normanmaurer Hey, I am facing the same issue in the same situation. My workaround is to save the Executor in a temporary variable before calling write and read on another Netty server, and to set it back after the connection finishes, like this:

Executor temp = DeadLockProofWorker.PARENT.get();
DeadLockProofWorker.PARENT.set(null);
obj.getChannel().write(request).awaitUninterruptibly();
Object o = obj.getHandler().read(NettyConnectionPoolUtil.read_timeout, TimeUnit.MILLISECONDS);
DeadLockProofWorker.PARENT.set(temp);

Though it works, I doubt whether this approach is OK?
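One risk of the save/clear/restore workaround above is that an exception during the blocking call leaves PARENT cleared for the rest of the thread's life. A self-contained sketch of the same idea, with the restore moved into a finally block, looks like this (PARENT here is a simplified stand-in for DeadLockProofWorker.PARENT, and runUnchecked is a hypothetical helper name):

```java
import java.util.concurrent.Executor;

// Save/clear/restore of the dead-lock marker around a blocking section,
// made exception-safe with try/finally. Mirrors the workaround in the
// comment above, on a simplified stand-in for DeadLockProofWorker.PARENT.
public class SaveRestoreDemo {
    static final ThreadLocal<Executor> PARENT = new ThreadLocal<>();

    static void checkDeadLock() {
        if (PARENT.get() != null) {
            throw new IllegalStateException("await*() in I/O thread ...");
        }
    }

    // Runs a blocking section with the marker cleared, restoring it
    // afterwards even if the section throws.
    static void runUnchecked(Runnable blockingSection) {
        Executor saved = PARENT.get();
        PARENT.set(null);
        try {
            blockingSection.run(); // e.g. write(...).awaitUninterruptibly()
        } finally {
            PARENT.set(saved);     // restore, exception or not
        }
    }

    public static void main(String[] args) {
        PARENT.set(Runnable::run);                    // inside a wrapped task
        runUnchecked(SaveRestoreDemo::checkDeadLock); // no longer throws
        System.out.println(PARENT.get() != null);     // true: marker restored
    }
}
```

Even with the finally block, this shares the downside of setUseDeadLockChecker(false): it defeats the detection locally, so it only makes sense where you are certain the blocked-on work cannot depend on the thread you are blocking.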
