In 1.0.3: when many requests come to the Hector pool and more than one node has a problem #647

Open
fbwotjq opened this Issue · 0 comments


In version 1.0.3, when many requests come to the Hector pool and more than one node has a problem (the node does not send responses but is still alive), I see the following:

1. The application had no socket connections.
I tried this command ==> netstat -na | grep | wc -l ==> the result is 0

2. Threads that use the Hector pool become blocked, and if the application or Cassandra does not recover, many threads end up in the TIMED_WAITING state.

3. The Hector pool status is: blocked: 400, active: 0, idle: 0

This is the thread dump. Almost all threads look like this:

"[CASSANDRA_JOB_WORKER-2]thread-91" prio=10 tid=0x0000000041e2a000 nid=0x7e01 waiting on condition [0x00007f56f6c3b000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000006c6b0fe20> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:196)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2025)
at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:340)
at me.prettyprint.cassandra.connection.ConcurrentHClientPool.waitForConnection(ConcurrentHClientPool.java:114)
at me.prettyprint.cassandra.connection.ConcurrentHClientPool.borrowClient(ConcurrentHClientPool.java:82)
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:238)
at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:131)
at me.prettyprint.cassandra.service.KeyspaceServiceImpl.getSlice(KeyspaceServiceImpl.java:289)
at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:53)
at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:49)
at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85)
at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery.execute(ThriftSliceQuery.java:48)
at com.xxx.xxx.xxxxxx.cassandra.dao.xxxxxxxxx.xxxxxxxxxxxxxxxxxxDaoCassandra.select(xxxxxxxxxxxxxxxxxxDaoCassandra.java:181)
at com.xxx.xxx.xxxxxxx.xxxxxxxxxxxxxxxxxxxxx.xxxxxxxxxxxxxxxxxxxxxx(xxxxxxxxxxxxxxxxxx.java:449)
at com.xxx.xxx.xxxxx.xxxxxxxxxxxxx.xxxxxxxxxxxxxxxxx(xxxxxxxxxxxxxxxx.java:463)
at com.xxx.xxx.xxxxx.xxxxxxxxxxxxx.access$3400(xxxxxxxxxxxxxxxx.java:97)
at com.xxx.xxx.xxxxx.xxxxxxxxxxxxx$xxxxxxxxxxxxxxxxxxxx.run(QueueManager.java:1506)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
"[CASSANDRA_JOB_WORKER-2]thread-90" prio=10 tid=0x0000000041e2a000 nid=0x7e01 waiting on condition [0x00007f56f6c3b000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000006c6b0fe20> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:196)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2025)
at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:340)
at me.prettyprint.cassandra.connection.ConcurrentHClientPool.waitForConnection(ConcurrentHClientPool.java:114)
at me.prettyprint.cassandra.connection.ConcurrentHClientPool.borrowClient(ConcurrentHClientPool.java:82)
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:238)
at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:131)
at me.prettyprint.cassandra.service.KeyspaceServiceImpl.getSlice(KeyspaceServiceImpl.java:289)
at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:53)
at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:49)
at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85)
at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery.execute(ThriftSliceQuery.java:48)
at com.xxx.xxx.xxxxxx.cassandra.dao.xxxxxxxxx.xxxxxxxxxxxxxxxxxxDaoCassandra.select(xxxxxxxxxxxxxxxxxxDaoCassandra.java:181)
at com.xxx.xxx.xxxxxxx.xxxxxxxxxxxxxxxxxxxxx.xxxxxxxxxxxxxxxxxxxxxx(xxxxxxxxxxxxxxxxxx.java:449)
at com.xxx.xxx.xxxxx.xxxxxxxxxxxxx.xxxxxxxxxxxxxxxxx(xxxxxxxxxxxxxxxx.java:463)
at com.xxx.xxx.xxxxx.xxxxxxxxxxxxx.access$3400(xxxxxxxxxxxxxxxx.java:97)
at com.xxx.xxx.xxxxx.xxxxxxxxxxxxx$xxxxxxxxxxxxxxxxxxxx.run(QueueManager.java:1506)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
"[CASSANDRA_JOB_WORKER-2]thread-89" prio=10 tid=0x0000000041e2a000 nid=0x7e01 waiting on condition [0x00007f56f6c3b000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000006c6b0fe20> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:196)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2025)
at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:340)
at me.prettyprint.cassandra.connection.ConcurrentHClientPool.waitForConnection(ConcurrentHClientPool.java:114)
at me.prettyprint.cassandra.connection.ConcurrentHClientPool.borrowClient(ConcurrentHClientPool.java:82)
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:238)
at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:131)
at me.prettyprint.cassandra.service.KeyspaceServiceImpl.getSlice(KeyspaceServiceImpl.java:289)
at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:53)
at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:49)
at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85)
at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery.execute(ThriftSliceQuery.java:48)
at com.xxx.xxx.xxxxxx.cassandra.dao.xxxxxxxxx.xxxxxxxxxxxxxxxxxxDaoCassandra.select(xxxxxxxxxxxxxxxxxxDaoCassandra.java:181)
at com.xxx.xxx.xxxxxxx.xxxxxxxxxxxxxxxxxxxxx.xxxxxxxxxxxxxxxxxxxxxx(xxxxxxxxxxxxxxxxxx.java:449)
at com.xxx.xxx.xxxxx.xxxxxxxxxxxxx.xxxxxxxxxxxxxxxxx(xxxxxxxxxxxxxxxx.java:463)
at com.xxx.xxx.xxxxx.xxxxxxxxxxxxx.access$3400(xxxxxxxxxxxxxxxx.java:97)
at com.xxx.xxx.xxxxx.xxxxxxxxxxxxx$xxxxxxxxxxxxxxxxxxxx.run(QueueManager.java:1506)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
"[CASSANDRA_JOB_WORKER-2]thread-87" prio=10 tid=0x0000000041e2a000 nid=0x7e01 waiting on condition [0x00007f56f6c3b000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000006c6b0fe20> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:196)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2025)
at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:340)
at me.prettyprint.cassandra.connection.ConcurrentHClientPool.waitForConnection(ConcurrentHClientPool.java:114)
at me.prettyprint.cassandra.connection.ConcurrentHClientPool.borrowClient(ConcurrentHClientPool.java:82)
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:238)
at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:131)
at me.prettyprint.cassandra.service.KeyspaceServiceImpl.getSlice(KeyspaceServiceImpl.java:289)
at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:53)
at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:49)
at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85)
at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery.execute(ThriftSliceQuery.java:48)
at com.xxx.xxx.xxxxxx.cassandra.dao.xxxxxxxxx.xxxxxxxxxxxxxxxxxxDaoCassandra.select(xxxxxxxxxxxxxxxxxxDaoCassandra.java:181)
at com.xxx.xxx.xxxxxxx.xxxxxxxxxxxxxxxxxxxxx.xxxxxxxxxxxxxxxxxxxxxx(xxxxxxxxxxxxxxxxxx.java:449)
at com.xxx.xxx.xxxxx.xxxxxxxxxxxxx.xxxxxxxxxxxxxxxxx(xxxxxxxxxxxxxxxx.java:463)
at com.xxx.xxx.xxxxx.xxxxxxxxxxxxx.access$3400(xxxxxxxxxxxxxxxx.java:97)
at com.xxx.xxx.xxxxx.xxxxxxxxxxxxx$xxxxxxxxxxxxxxxxxxxx.run(QueueManager.java:1506)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
"[CASSANDRA_JOB_WORKER-2]thread-86" prio=10 tid=0x0000000041e2a000 nid=0x7e01 waiting on condition [0x00007f56f6c3b000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000006c6b0fe20> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:196)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2025)
at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:340)
at me.prettyprint.cassandra.connection.ConcurrentHClientPool.waitForConnection(ConcurrentHClientPool.java:114)
at me.prettyprint.cassandra.connection.ConcurrentHClientPool.borrowClient(ConcurrentHClientPool.java:82)
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:238)
at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:131)
at me.prettyprint.cassandra.service.KeyspaceServiceImpl.getSlice(KeyspaceServiceImpl.java:289)
at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:53)
at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:49)
at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85)
at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery.execute(ThriftSliceQuery.java:48)
at com.xxx.xxx.xxxxxx.cassandra.dao.xxxxxxxxx.xxxxxxxxxxxxxxxxxxDaoCassandra.select(xxxxxxxxxxxxxxxxxxDaoCassandra.java:181)
at com.xxx.xxx.xxxxxxx.xxxxxxxxxxxxxxxxxxxxx.xxxxxxxxxxxxxxxxxxxxxx(xxxxxxxxxxxxxxxxxx.java:449)
at com.xxx.xxx.xxxxx.xxxxxxxxxxxxx.xxxxxxxxxxxxxxxxx(xxxxxxxxxxxxxxxx.java:463)
at com.xxx.xxx.xxxxx.xxxxxxxxxxxxx.access$3400(xxxxxxxxxxxxxxxx.java:97)
at com.xxx.xxx.xxxxx.xxxxxxxxxxxxx$xxxxxxxxxxxxxxxxxxxx.run(QueueManager.java:1506)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
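
Every worker in the dump is parked inside ArrayBlockingQueue.poll, called from ConcurrentHClientPool.waitForConnection. A minimal sketch of that behaviour with only the JDK, assuming the pool keeps its idle clients in a bounded queue (the class name PollTimeoutSketch, the method borrowOrNull and the String "client" are made up for illustration and are not Hector code):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.TimeUnit;

final class PollTimeoutSketch {
    // Hypothetical stand-in for "borrow a client from the idle queue".
    // While poll() waits, the calling thread is TIMED_WAITING (parking);
    // if nothing is returned to the queue in time, the result is null.
    static String borrowOrNull(ArrayBlockingQueue<String> idleClients, long timeoutMillis)
            throws InterruptedException {
        return idleClients.poll(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        // An empty queue plays the role of a pool with active: 0 and idle: 0.
        ArrayBlockingQueue<String> idleClients = new ArrayBlockingQueue<String>(1);
        String client = borrowOrNull(idleClients, 100);
        System.out.println(client == null ? "timed out" : client);
    }
}
```

With an empty queue every caller waits for the full timeout and gets nothing back, which would match a pool that reports only blocked threads.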

So I guess that my application lost its socket connections and, for whatever reason, did not re-establish them.
So I am thinking: how about adding a new routing policy whose criterion is blocked / client count?
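
A rough sketch of what such a selection rule could look like, in plain Java. PoolSnapshot, blockedPerClient and pickLeastBlocked are hypothetical names and do not come from Hector; the sketch assumes "client count" means active plus idle clients, and that a pool with no clients at all should never be preferred:

```java
import java.util.Arrays;
import java.util.List;

final class BlockedRatioSketch {
    /** Hypothetical snapshot of one host pool; not a Hector class. */
    static final class PoolSnapshot {
        final String host;
        final int blocked;  // threads waiting to borrow a client
        final int clients;  // assumed: active + idle clients in the pool

        PoolSnapshot(String host, int blocked, int clients) {
            this.host = host;
            this.blocked = blocked;
            this.clients = clients;
        }

        // Blocked threads per client; a pool without clients ranks worst.
        double blockedPerClient() {
            if (clients == 0) {
                return Double.POSITIVE_INFINITY;
            }
            return (double) blocked / clients;
        }
    }

    // Returns the pool with the lowest blocked/client ratio, or null if the list is empty.
    static PoolSnapshot pickLeastBlocked(List<PoolSnapshot> pools) {
        PoolSnapshot best = null;
        for (PoolSnapshot p : pools) {
            if (best == null || p.blockedPerClient() < best.blockedPerClient()) {
                best = p;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<PoolSnapshot> pools = Arrays.asList(
                new PoolSnapshot("node-a", 400, 0),  // the state reported in this issue
                new PoolSnapshot("node-b", 3, 50),
                new PoolSnapshot("node-c", 10, 50));
        System.out.println(pickLeastBlocked(pools).host);  // prints node-b
    }
}
```

A pool in the state from point 3 (blocked: 400, active: 0, idle: 0) gets an infinite ratio here, so requests would be routed to the other nodes while it stays that way.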

@fbwotjq fbwotjq added a commit to fbwotjq/hector that referenced this issue
@fbwotjq fbwotjq #647 code 9890a73
@fbwotjq fbwotjq added a commit to fbwotjq/hector that referenced this issue
@fbwotjq fbwotjq Issue #647 BlockedHealthBalncingPolicy 8e12f8d
@fbwotjq fbwotjq added a commit to fbwotjq/hector that referenced this issue
@fbwotjq fbwotjq #647 code 3acf1d1