[BUG] Deadlock #183

Closed
a8156268 opened this issue Aug 13, 2021 · 1 comment

a8156268 commented Aug 13, 2021

A deadlock occurred in a modified version of bt. I think the issue may also exist in the standard version.
The deadlock involved two peers. Each was trying to flush data to the other while its own data-receiver thread was blocked, so neither flush could ever complete and both fell into an endless loop.

Stack of one peer:

"6881.bt.net.message-dispatcher-10.61.97.171:-1" #63080 daemon prio=5 os_prio=0 tid=0x000000005ed13000 nid=0x14ef1 runnable [0x00007f04c6bcd000]
   java.lang.Thread.State: RUNNABLE
	at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
	at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
	at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
	at sun.nio.ch.IOUtil.write(IOUtil.java:51)
	at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
	- locked <0x00000001c9000078> (a java.lang.Object)
	at bt.net.pipeline.SocketChannelHandler.flush(SocketChannelHandler.java:148)
	- locked <0x0000000298e8e740> (a java.lang.Object)
	at bt.net.pipeline.SocketChannelHandler.send(SocketChannelHandler.java:71)
	at bt.net.SocketPeerConnection.postMessage(SocketPeerConnection.java:134)
	- locked <0x0000000298e8e790> (a bt.net.SocketPeerConnection)
	at com.ctrip.flight.intl.common.bt.MultiThreadMessageDispatcher$MessageDispatchingLoop.processSupplier(MultiThreadMessageDispatcher.java:179)
	at com.ctrip.flight.intl.common.bt.MultiThreadMessageDispatcher$MessageDispatchingLoop.run(MultiThreadMessageDispatcher.java:108)
	at com.ctrip.flight.intl.common.bt.MultiThreadMessageDispatcher.lambda$createAndSubmitTask$3(MultiThreadMessageDispatcher.java:289)
	at com.ctrip.flight.intl.common.bt.MultiThreadMessageDispatcher$$Lambda$746/1406993695.run(Unknown Source)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
	

"6881.bt.runtime.shutdown.worker-4" #63231 daemon prio=5 os_prio=0 tid=0x0000000004fdc800 nid=0x14f8f waiting for monitor entry [0x00007f04d0c49000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at bt.net.pipeline.SocketChannelHandler.close(SocketChannelHandler.java:169)
	- waiting to lock <0x0000000298e8e740> (a java.lang.Object)
	- locked <0x0000000298e8e750> (a java.lang.Object)
	at bt.net.SocketPeerConnection.close(SocketPeerConnection.java:166)
	at bt.net.SocketPeerConnection.closeQuietly(SocketPeerConnection.java:154)
	at bt.net.PeerConnectionPool$$Lambda$845/1275417506.accept(Unknown Source)
	at bt.net.Connections$$Lambda$799/596951729.accept(Unknown Source)
	at java.util.concurrent.ConcurrentHashMap$ValuesView.forEach(ConcurrentHashMap.java:4707)
	at bt.net.Connections.visitConnections(PeerConnectionPool.java:334)
	at bt.net.PeerConnectionPool.shutdown(PeerConnectionPool.java:270)
	at bt.net.PeerConnectionPool$$Lambda$630/585840387.run(Unknown Source)
	at bt.runtime.BtRuntime.lambda$toRunnable$7(BtRuntime.java:324)
	at bt.runtime.BtRuntime$$Lambda$678/1139650115.run(Unknown Source)
	at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

Stack of the other peer:


"6881.bt.net.data-receiver" #64560 prio=5 os_prio=0 tid=0x000000004a832000 nid=0x158c7 waiting for monitor entry [0x00007f18d6109000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at bt.net.pipeline.SocketChannelHandler.processInboundData(SocketChannelHandler.java:115)
        - waiting to lock <0x00000001979717f8> (a java.lang.Object)
        at bt.net.pipeline.SocketChannelHandler.read(SocketChannelHandler.java:82)
        at bt.net.pipeline.DefaultChannelPipeline$DefaultChannelHandlerContext.readFromChannel(DefaultChannelPipeline.java:181)
        at bt.net.DataReceivingLoop.processKey(DataReceivingLoop.java:187)
        at bt.net.DataReceivingLoop.run(DataReceivingLoop.java:128)
        at bt.net.DataReceivingLoop.lambda$null$1(DataReceivingLoop.java:65)
        at bt.net.DataReceivingLoop$$Lambda$679/99718958.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
		
"6881.bt.net.pool.cleaner" #64562 prio=5 os_prio=0 tid=0x000000003897d800 nid=0x158c9 waiting for monitor entry [0x00007f18d840b000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at bt.net.pipeline.SocketChannelHandler.close(SocketChannelHandler.java:169)
        - waiting to lock <0x00000001aff3dc48> (a java.lang.Object)
        - locked <0x00000001979717f8> (a java.lang.Object)
        at bt.net.SocketPeerConnection.close(SocketPeerConnection.java:166)
        at bt.net.SocketPeerConnection.closeQuietly(SocketPeerConnection.java:154)
        at bt.net.PeerConnectionPool.purgeConnection(PeerConnectionPool.java:264)
        at bt.net.PeerConnectionPool.access$200(PeerConnectionPool.java:48)
        at bt.net.PeerConnectionPool$Cleaner.lambda$run$0(PeerConnectionPool.java:249)
        at bt.net.PeerConnectionPool$Cleaner$$Lambda$795/1375733778.accept(Unknown Source)
        at bt.net.Connections$$Lambda$796/1447003786.accept(Unknown Source)
        at java.util.concurrent.ConcurrentHashMap$ValuesView.forEach(ConcurrentHashMap.java:4707)
        at bt.net.Connections.visitConnections(PeerConnectionPool.java:334)
        at bt.net.PeerConnectionPool$Cleaner.run(PeerConnectionPool.java:239)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
		
		

"6881.bt.net.message-dispatcher-10.61.96.233:6881" #64794 daemon prio=5 os_prio=0 tid=0x0000000056929000 nid=0x159af runnable [0x00007f18c7e1e000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
        at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
        at sun.nio.ch.IOUtil.write(IOUtil.java:51)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
        - locked <0x00000001aff3dbf8> (a java.lang.Object)	writeLock
        at bt.net.pipeline.SocketChannelHandler.flush(SocketChannelHandler.java:148)
        - locked <0x00000001aff3dc48> (a java.lang.Object)  outboundBufferLock
        at bt.net.pipeline.SocketChannelHandler.send(SocketChannelHandler.java:71)
        at bt.net.SocketPeerConnection.postMessage(SocketPeerConnection.java:134)
        - locked <0x00000001aff3dc88> (a bt.net.SocketPeerConnection)   synchronized
        at com.ctrip.flight.intl.common.bt.MultiThreadMessageDispatcher$MessageDispatchingLoop.processSupplier(MultiThreadMessageDispatcher.java:179)
        at com.ctrip.flight.intl.common.bt.MultiThreadMessageDispatcher$MessageDispatchingLoop.run(MultiThreadMessageDispatcher.java:108)
        at com.ctrip.flight.intl.common.bt.MultiThreadMessageDispatcher.lambda$createAndSubmitTask$3(MultiThreadMessageDispatcher.java:289)
        at com.ctrip.flight.intl.common.bt.MultiThreadMessageDispatcher$$Lambda$743/1741032101.run(Unknown Source)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
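
In the dumps above, close() appears to hold one buffer lock (the one the data-receiver is waiting for) while it waits for the lock held by the spinning flush(). One possible way to break that chain is for close() to acquire the contested lock with a timeout before holding anything else. The sketch below is only an illustration under that assumption: `GuardedClose`, `tryClose`, and the two `ReentrantLock` fields are hypothetical names, not the actual SocketChannelHandler code and not necessarily what the referenced fix does.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a channel handler with separate inbound/outbound locks.
final class GuardedClose {
    final ReentrantLock inboundLock = new ReentrantLock();
    final ReentrantLock outboundLock = new ReentrantLock();
    private volatile boolean closed;

    boolean isClosed() {
        return closed;
    }

    /**
     * Tries to close without holding the inbound lock while waiting for the
     * outbound lock. Returns false if the outbound lock could not be acquired
     * within the timeout, so the caller can retry later.
     */
    boolean tryClose(long timeout, TimeUnit unit) throws InterruptedException {
        if (!outboundLock.tryLock(timeout, unit)) {
            return false; // a writer is still busy; nothing is held at this point
        }
        try {
            inboundLock.lock(); // readers hold this only briefly
            try {
                closed = true;
                return true;
            } finally {
                inboundLock.unlock();
            }
        } finally {
            outboundLock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        GuardedClose handler = new GuardedClose();
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        // Simulates a dispatcher thread stuck in flush() while holding the outbound lock.
        Thread writer = new Thread(() -> {
            handler.outboundLock.lock();
            try {
                held.countDown();
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                handler.outboundLock.unlock();
            }
        });
        writer.start();
        held.await();

        System.out.println("while writer is stuck: " + handler.tryClose(50, TimeUnit.MILLISECONDS));
        release.countDown();
        writer.join();
        System.out.println("after writer finishes: " + handler.tryClose(50, TimeUnit.MILLISECONDS));
    }
}
```

With this shape, a stuck writer delays the close but no longer blocks the data-receiver, because the closing thread holds no lock while it waits.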

In both peers, the message-dispatcher thread is stuck in an endless loop:

                while (buffer.hasRemaining()) {
                    channel.write(buffer);
                }

possibly because the TCP write buffer is full, which happens because the remote peer's TCP read buffer is full, which in turn happens because the remote peer's data-receiver thread is blocked.
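
A minimal sketch of one way to avoid spinning in that loop, assuming the channel is in non-blocking mode (where `write` returns 0 when the send buffer is full): stop as soon as no bytes are accepted and let the caller release its locks and retry later, for example after the selector reports the channel writable. `BoundedFlush`, `tryFlush`, and `StubChannel` are hypothetical names introduced for illustration; this is not the code from the referenced commit.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

final class BoundedFlush {

    /** Hypothetical stand-in for a socket whose send buffer fills up after `capacity` bytes. */
    static final class StubChannel implements WritableByteChannel {
        private int capacity;

        StubChannel(int capacity) {
            this.capacity = capacity;
        }

        @Override
        public int write(ByteBuffer src) {
            int n = Math.min(capacity, src.remaining());
            src.position(src.position() + n); // pretend n bytes were sent
            capacity -= n;
            return n;
        }

        @Override
        public boolean isOpen() {
            return true;
        }

        @Override
        public void close() {
        }
    }

    /**
     * Writes as much of the buffer as the channel accepts right now.
     * Returns true if the buffer was fully drained, false if the channel
     * stopped accepting bytes (write returned 0).
     */
    static boolean tryFlush(WritableByteChannel channel, ByteBuffer buffer) throws IOException {
        while (buffer.hasRemaining()) {
            if (channel.write(buffer) == 0) {
                return false; // send buffer full: give up instead of spinning
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(10);
        boolean drained = tryFlush(new StubChannel(4), buffer);
        System.out.println("drained=" + drained + ", remaining=" + buffer.remaining());
    }
}
```

The unsent bytes stay in the buffer, so a later call can pick up where this one stopped. The key point is that the thread leaves the loop (and any synchronized block around it) instead of holding the lock while the remote peer is not reading.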

a8156268 added the bug label Aug 13, 2021
a8156268 pushed a commit to a8156268/bt that referenced this issue Aug 13, 2021
atomashpolskiy added a commit that referenced this issue Aug 13, 2021
Fix a dead-lock in SocketChannelHandler #183
@atomashpolskiy
Owner

Thanks for reporting and making a PR!
