Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data structure corruption and resource leak in ChannelOutboundBuffer #1596

Closed
trustin opened this issue Jul 17, 2013 · 2 comments
Closed

Data structure corruption and resource leak in ChannelOutboundBuffer #1596

trustin opened this issue Jul 17, 2013 · 2 comments
Assignees
Labels
Milestone

Comments

@trustin
Copy link
Member

trustin commented Jul 17, 2013

I observe frequently hanging tests in our CI machine although I could not reproduce. Looking from the heap dump, the circular array queue implementation seems to get corrupted on a certain condition. I also find any unflushed writes will leak when the channel is closed.

Full thread dump from CI:

Full thread dump Java HotSpot(TM) 64-Bit Server VM (23.7-b01 mixed mode):
"defaultEventExecutorGroup-6-2" prio=10 tid=0x00007f85c8013000 nid=0x6251 waiting on condition [0x00007f861ccc0000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000000db490380> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
    at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
    at io.netty.util.concurrent.SingleThreadEventExecutor.takeTask(SingleThreadEventExecutor.java:219)
    at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:34)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
    at java.lang.Thread.run(Thread.java:722)
"defaultEventExecutorGroup-6-1" prio=10 tid=0x00007f85b8011000 nid=0x6250 waiting on condition [0x00007f861d4c8000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000000db4a0110> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
    at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
    at io.netty.util.concurrent.SingleThreadEventExecutor.takeTask(SingleThreadEventExecutor.java:219)
    at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:34)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
    at java.lang.Thread.run(Thread.java:722)
"testsuite-oio-boss-4-1" daemon prio=10 tid=0x00007f85d0036000 nid=0x624f waiting on condition [0x00007f861cdc1000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000000db565c50> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
    at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
    at io.netty.util.concurrent.SingleThreadEventExecutor.takeTask(SingleThreadEventExecutor.java:219)
    at io.netty.channel.ThreadPerChannelEventLoop.run(ThreadPerChannelEventLoop.java:51)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
    at java.lang.Thread.run(Thread.java:722)
"testsuite-nio-worker-3-3" daemon prio=10 tid=0x00007f85c0001000 nid=0x624e runnable [0x00007f861cec2000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
    - locked <0x00000000db484298> (a io.netty.channel.nio.SelectedSelectionKeySet)
    - locked <0x00000000db484288> (a java.util.Collections$UnmodifiableSet)
    - locked <0x00000000db4841d0> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
    at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:635)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:319)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
    at java.lang.Thread.run(Thread.java:722)
"testsuite-oio-worker-5-1" daemon prio=10 tid=0x00007f85d0035000 nid=0x624d runnable [0x00007f861cfc3000]
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:150)
    at java.net.SocketInputStream.read(SocketInputStream.java:121)
    at java.net.SocketInputStream.read(SocketInputStream.java:107)
    at io.netty.buffer.UnpooledUnsafeDirectByteBuf.setBytes(UnpooledUnsafeDirectByteBuf.java:388)
    at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:857)
    at io.netty.channel.oio.OioByteStreamChannel.doReadBytes(OioByteStreamChannel.java:110)
    at io.netty.channel.socket.oio.OioSocketChannel.doReadBytes(OioSocketChannel.java:138)
    at io.netty.channel.oio.AbstractOioByteChannel.doRead(AbstractOioByteChannel.java:84)
    at io.netty.channel.oio.AbstractOioChannel$1.run(AbstractOioChannel.java:40)
    at io.netty.channel.ThreadPerChannelEventLoop.run(ThreadPerChannelEventLoop.java:53)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
    at java.lang.Thread.run(Thread.java:722)
"testsuite-nio-boss-1-2" daemon prio=10 tid=0x00007f85d0030800 nid=0x624c runnable [0x00007f861d0c4000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
    - locked <0x00000000db4b0e60> (a io.netty.channel.nio.SelectedSelectionKeySet)
    - locked <0x00000000db4b0e50> (a java.util.Collections$UnmodifiableSet)
    - locked <0x00000000db4b0d98> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
    at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:635)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:319)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
    at java.lang.Thread.run(Thread.java:722)
"testsuite-nio-worker-3-2" daemon prio=10 tid=0x00007f85d4002000 nid=0x624b runnable [0x00007f861d1c5000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
    - locked <0x00000000db4a02c0> (a io.netty.channel.nio.SelectedSelectionKeySet)
    - locked <0x00000000db4a02b0> (a java.util.Collections$UnmodifiableSet)
    - locked <0x00000000db4a01f8> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
    at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:635)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:319)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
    at java.lang.Thread.run(Thread.java:722)
"testsuite-nio-worker-3-1" daemon prio=10 tid=0x00007f85d0029800 nid=0x624a runnable [0x00007f861d2c6000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
    - locked <0x00000000db490530> (a io.netty.channel.nio.SelectedSelectionKeySet)
    - locked <0x00000000db490520> (a java.util.Collections$UnmodifiableSet)
    - locked <0x00000000db490468> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
    at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:635)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:319)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
    at java.lang.Thread.run(Thread.java:722)
"testsuite-nio-boss-1-1" daemon prio=10 tid=0x00007f85d001e800 nid=0x6249 runnable [0x00007f861d3c7000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
    - locked <0x00000000db4a0458> (a io.netty.channel.nio.SelectedSelectionKeySet)
    - locked <0x00000000db4a0448> (a java.util.Collections$UnmodifiableSet)
    - locked <0x00000000db4a0390> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
    at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:635)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:319)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
    at java.lang.Thread.run(Thread.java:722)
"Service Thread" daemon prio=10 tid=0x00007f863c24a000 nid=0x6244 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
"C1 CompilerThread3" daemon prio=10 tid=0x00007f863c247800 nid=0x6243 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
"C2 CompilerThread2" daemon prio=10 tid=0x00007f863c23d000 nid=0x6242 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
"C2 CompilerThread1" daemon prio=10 tid=0x00007f863c23b800 nid=0x6241 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" daemon prio=10 tid=0x00007f863c238800 nid=0x6240 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
"Signal Dispatcher" daemon prio=10 tid=0x00007f863c236000 nid=0x623f waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
"Finalizer" daemon prio=10 tid=0x00007f863c1ec000 nid=0x623e in Object.wait() [0x00007f861e0f7000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000000db4a8960> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
    - locked <0x00000000db4a8960> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:177)
"Reference Handler" daemon prio=10 tid=0x00007f863c1e9800 nid=0x623d in Object.wait() [0x00007f861e1f8000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000000db498588> (a java.lang.ref.Reference$Lock)
    at java.lang.Object.wait(Object.java:503)
    at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
    - locked <0x00000000db498588> (a java.lang.ref.Reference$Lock)
"main" prio=10 tid=0x00007f863c008000 nid=0x6233 waiting on condition [0x00007f8642744000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at io.netty.testsuite.transport.socket.SocketEchoTest.testSimpleEcho0(SocketEchoTest.java:151)
    at io.netty.testsuite.transport.socket.SocketEchoTest.testSimpleEchoWithAdditionalExecutor(SocketEchoTest.java:76)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at io.netty.testsuite.transport.socket.AbstractSocketTest.run(AbstractSocketTest.java:65)
    at io.netty.testsuite.transport.socket.SocketEchoTest.testSimpleEchoWithAdditionalExecutor(SocketEchoTest.java:72)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
    at org.junit.rules.RunRules.evaluate(RunRules.java:20)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
    at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
    at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
    at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
    at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
"VM Thread" prio=10 tid=0x00007f863c1e1000 nid=0x623c runnable 
"GC task thread#0 (ParallelGC)" prio=10 tid=0x00007f863c015800 nid=0x6234 runnable 
"GC task thread#1 (ParallelGC)" prio=10 tid=0x00007f863c017800 nid=0x6235 runnable 
"GC task thread#2 (ParallelGC)" prio=10 tid=0x00007f863c019000 nid=0x6236 runnable 
"GC task thread#3 (ParallelGC)" prio=10 tid=0x00007f863c01b000 nid=0x6237 runnable 
"GC task thread#4 (ParallelGC)" prio=10 tid=0x00007f863c01d000 nid=0x6238 runnable 
"GC task thread#5 (ParallelGC)" prio=10 tid=0x00007f863c01e800 nid=0x6239 runnable 
"GC task thread#6 (ParallelGC)" prio=10 tid=0x00007f863c020800 nid=0x623a runnable 
"GC task thread#7 (ParallelGC)" prio=10 tid=0x00007f863c022800 nid=0x623b runnable 
"VM Periodic Task Thread" prio=10 tid=0x00007f863c24c800 nid=0x6245 waiting on condition 
JNI global references: 228
Heap
 PSYoungGen      total 35072K, used 11537K [0x00000000d7200000, 0x00000000db800000, 0x0000000100000000)
  eden space 31424K, 25% used [0x00000000d7200000,0x00000000d79bff50,0x00000000d90b0000)
  from space 3648K, 98% used [0x00000000db470000,0x00000000db7f4530,0x00000000db800000)
  to   space 5184K, 0% used [0x00000000dade0000,0x00000000dade0000,0x00000000db2f0000)
 ParOldGen       total 83712K, used 2832K [0x0000000085600000, 0x000000008a7c0000, 0x00000000d7200000)
  object space 83712K, 3% used [0x0000000085600000,0x00000000858c40a0,0x000000008a7c0000)
 PSPermGen       total 21248K, used 10540K [0x0000000080400000, 0x00000000818c0000, 0x0000000085600000)
  object space 21248K, 49% used [0x0000000080400000,0x0000000080e4b1f8,0x00000000818c0000)
@normanmaurer
Copy link
Member

@trustin I suspect it's a bug in the OIO transport

@ghost ghost assigned trustin Jul 17, 2013
@trustin
Copy link
Member Author

trustin commented Jul 17, 2013

Fixed via a8d67b0 and cc9b956

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants