Druid Broker deadlock case #13949

@myname00707

My production environment stopped responding to query requests. I took a thread dump of the Broker, and it reported the deadlock below.

Found one Java-level deadlock:

"qtp199907649-183[scan_[hgs_test001]_queryng-85a87803-a253-46d6-b854-8f1d8f6858a2]":
waiting to lock monitor 0x000055d414d9ac68 (object 0x00000000ead4c318, a java.lang.Object),
which is held by "HttpClient-Netty-Worker-15"
"HttpClient-Netty-Worker-15":
waiting to lock monitor 0x000055d40f104ee8 (object 0x00000000ead4c330, a java.lang.Object),
which is held by "qtp199907649-183[scan_[hgs_test001]_queryng-85a87803-a253-46d6-b854-8f1d8f6858a2]"

Java stack information for the threads listed above:

"qtp199907649-183[scan_[hgs_test001]_queryng-85a87803-a253-46d6-b854-8f1d8f6858a2]":
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.cleanUpWriteBuffer(AbstractNioWorker.java:398)
- waiting to lock <0x00000000ead4c318> (a java.lang.Object)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.writeFromUserCode(AbstractNioWorker.java:128)
at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink.eventSunk(NioClientSocketPipelineSink.java:84)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream(DefaultChannelPipeline.java:779)
at org.jboss.netty.channel.Channels.write(Channels.java:725)
at org.jboss.netty.channel.Channels.write(Channels.java:686)
at org.jboss.netty.handler.ssl.SslHandler.wrapNonAppData(SslHandler.java:1111)
at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1253)
- locked <0x00000000ead4c330> (a java.lang.Object)
at org.jboss.netty.handler.ssl.SslHandler.unwrapNonAppData(SslHandler.java:1167)
at org.jboss.netty.handler.ssl.SslHandler.closeOutboundAndChannel(SslHandler.java:1490)
at org.jboss.netty.handler.ssl.SslHandler.handleDownstream(SslHandler.java:514)
at org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:591)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream(DefaultChannelPipeline.java:784)
at org.jboss.netty.handler.codec.oneone.OneToOneEncoder.handleDownstream(OneToOneEncoder.java:54)
at org.jboss.netty.handler.codec.http.HttpClientCodec.handleDownstream(HttpClientCodec.java:97)
at org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:591)
at org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:582)
at org.jboss.netty.channel.Channels.close(Channels.java:812)
at org.jboss.netty.channel.AbstractChannel.close(AbstractChannel.java:205)
at org.apache.druid.java.util.http.client.pool.ChannelResourceFactory.close(ChannelResourceFactory.java:203)
at org.apache.druid.java.util.http.client.pool.ChannelResourceFactory.close(ChannelResourceFactory.java:48)
at org.apache.druid.java.util.http.client.pool.ResourcePool$ImmediateCreationResourceHolder.get(ResourcePool.java:238)
at org.apache.druid.java.util.http.client.pool.ResourcePool.take(ResourcePool.java:96)
at org.apache.druid.java.util.http.client.NettyHttpClient.go(NettyHttpClient.java:124)
at org.apache.druid.client.DirectDruidClient.run(DirectDruidClient.java:462)
at org.apache.druid.client.CachingClusteredClient$SpecificQueryRunnable.getSimpleServerResults(CachingClusteredClient.java:723)
at org.apache.druid.client.CachingClusteredClient$SpecificQueryRunnable.lambda$addSequencesFromServer$9(CachingClusteredClient.java:685)
at org.apache.druid.client.CachingClusteredClient$SpecificQueryRunnable$$Lambda$474/1645410643.accept(Unknown Source)
at java.util.TreeMap.forEach(TreeMap.java:1005)
at org.apache.druid.client.CachingClusteredClient$SpecificQueryRunnable.addSequencesFromServer(CachingClusteredClient.java:669)
at org.apache.druid.client.CachingClusteredClient$SpecificQueryRunnable.lambda$run$2(CachingClusteredClient.java:397)
at org.apache.druid.client.CachingClusteredClient$SpecificQueryRunnable$$Lambda$464/224122020.get(Unknown Source)
at org.apache.druid.java.util.common.guava.LazySequence.toYielder(LazySequence.java:46)
at org.apache.druid.java.util.common.guava.WrappingSequence$2.get(WrappingSequence.java:88)
at org.apache.druid.java.util.common.guava.WrappingSequence$2.get(WrappingSequence.java:84)
at org.apache.druid.java.util.common.guava.SequenceWrapper.wrap(SequenceWrapper.java:55)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
at org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.onFillable(SslConnection.java:555)
at org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:410)
at org.eclipse.jetty.io.ssl.SslConnection$2.succeeded(SslConnection.java:164)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:338)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:315)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:173)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:386)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:883)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1034)
at java.lang.Thread.run(Thread.java:750)
"HttpClient-Netty-Worker-15":
at org.jboss.netty.handler.ssl.SslHandler.channelDisconnected(SslHandler.java:572)
- waiting to lock <0x00000000ead4c330> (a java.lang.Object)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:102)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.jboss.netty.channel.Channels.fireChannelDisconnected(Channels.java:396)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.close(AbstractNioWorker.java:360)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.close(AbstractNioWorker.java:79)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.clearOpWrite(AbstractNioWorker.java:335)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.write0(AbstractNioWorker.java:284)
- locked <0x00000000ead4c318> (a java.lang.Object)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.writeFromTaskLoop(AbstractNioWorker.java:151)
at org.jboss.netty.channel.socket.nio.AbstractNioChannel$WriteTask.run(AbstractNioChannel.java:292)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.processTaskQueue(AbstractNioSelector.java:391)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:315)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)

Found 1 deadlock.
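
Reading the two stacks together: the query thread holds 0x00000000ead4c330 (taken in SslHandler.unwrap) and wants 0x00000000ead4c318 (in AbstractNioWorker.cleanUpWriteBuffer), while the Netty worker holds 0x00000000ead4c318 (taken in AbstractNioWorker.write0) and wants 0x00000000ead4c330 (in SslHandler.channelDisconnected). The small sketch below just walks that wait-for cycle using the values from the dump. The class and method names are hypothetical, the thread names are shortened for readability, and none of this is Druid or Netty code.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class WaitForCycle {

    /**
     * Follows "thread waits on lock, lock is held by thread" edges from start.
     * Returns the visited threads if the walk comes back to start,
     * otherwise an empty set.
     */
    static Set<String> findCycle(String start,
                                 Map<String, String> waitsOn,
                                 Map<String, String> heldBy) {
        Set<String> seen = new LinkedHashSet<>();
        String current = start;
        // Stop when a thread is not blocked (null) or when we revisit a thread.
        while (current != null && seen.add(current)) {
            String lock = waitsOn.get(current);
            current = (lock == null) ? null : heldBy.get(lock);
        }
        return start.equals(current) ? seen : new LinkedHashSet<>();
    }

    public static void main(String[] args) {
        // Values copied from the thread dump above (thread names shortened).
        Map<String, String> waitsOn = new HashMap<>();
        waitsOn.put("qtp199907649-183", "0x00000000ead4c318");
        waitsOn.put("HttpClient-Netty-Worker-15", "0x00000000ead4c330");

        Map<String, String> heldBy = new HashMap<>();
        heldBy.put("0x00000000ead4c318", "HttpClient-Netty-Worker-15");
        heldBy.put("0x00000000ead4c330", "qtp199907649-183");

        System.out.println(findCycle("qtp199907649-183", waitsOn, heldBy));
    }
}
```

Each thread took its first lock and then asked for the other thread's lock, so neither can make progress. That is a lock-ordering inversion between the SslHandler monitor and the channel write lock.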

Affected Version

Druid 0.21.1

Description

When the problem occurred, the Broker appeared to have been hung for a long time, and I could not find any useful information in broker.log.
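
Since the log showed nothing, a JVM-side check might help surface this kind of hang earlier. Below is a minimal sketch, assuming a standalone test program: it reproduces the same two-lock inversion with daemon threads and then asks the standard JDK `ThreadMXBean.findDeadlockedThreads()` for the result. The class name, method names, and thread names are hypothetical. This is not an existing Druid feature, only an illustration of what such a check could look like.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class DeadlockWatchdogSketch {

    /** Starts a daemon thread that locks first, waits for its peer, then requests second. */
    static void startLocker(String name, Object first, Object second,
                            CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    bothHoldFirst.await(); // make sure the peer already holds its lock
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                synchronized (second) {
                    // Not reached once both threads hold their first lock.
                }
            }
        }, name);
        t.setDaemon(true); // daemon threads do not keep the JVM alive
        t.start();
    }

    /** Two threads take the same two monitors in opposite order, as in the dump. */
    static void reproduceInversion() {
        Object writeLock = new Object();     // stands in for 0x00000000ead4c318
        Object handshakeLock = new Object(); // stands in for 0x00000000ead4c330
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        startLocker("sketch-query-thread", handshakeLock, writeLock, bothHoldFirst);
        startLocker("sketch-netty-worker", writeLock, handshakeLock, bothHoldFirst);
    }

    /** Polls the JVM until it reports deadlocked threads or the timeout expires. */
    static int reportDeadlocks(long timeoutMillis) throws InterruptedException {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            long[] ids = mx.findDeadlockedThreads(); // null when nothing is deadlocked
            if (ids != null) {
                for (ThreadInfo info : mx.getThreadInfo(ids, true, true)) {
                    if (info != null) {
                        System.out.println(info.getThreadName() + " waits for "
                                + info.getLockName() + " held by " + info.getLockOwnerName());
                    }
                }
                return ids.length;
            }
            Thread.sleep(50);
        }
        return 0;
    }

    public static void main(String[] args) throws InterruptedException {
        reproduceInversion();
        System.out.println("deadlocked threads: " + reportDeadlocks(5000));
    }
}
```

In a real service, the polling part could run on a scheduled background thread and write to the application log, so that a hang like this one leaves a trace without anyone having to run jstack by hand.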
