We have applied text indexes to our partial-upsert tables. During segment commit, we observe that the thread handling the CONSUMING -> ONLINE transition gets blocked for hours in this flow:
```
"HelixTaskExecutor-message_handle_thread_22" #132 daemon prio=5 os_prio=0 cpu=36015.35ms elapsed=367482.10s tid=0x00007efc6f8ed800 nid=0xf7 in Object.wait() [0x00007efc656fc000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(java.base@11.0.23/Native Method)
    - waiting on <no object reference available>
    at org.apache.lucene.index.IndexWriter.doWait(IndexWriter.java:5410)
    - waiting to re-lock in wait() <0x00007f11147aaa10> (a org.apache.lucene.index.IndexWriter)
    at org.apache.lucene.index.IndexWriter.abortMerges(IndexWriter.java:2721)
    - waiting to re-lock in wait() <0x00007f11147aaa10> (a org.apache.lucene.index.IndexWriter)
    at org.apache.lucene.index.IndexWriter.rollbackInternalNoCommit(IndexWriter.java:2469)
    - waiting to re-lock in wait() <0x00007f11147aaa10> (a org.apache.lucene.index.IndexWriter)
    at org.apache.lucene.index.IndexWriter.rollbackInternal(IndexWriter.java:2449)
    - locked <0x00007f11147b03d0> (a java.lang.Object)
    at org.apache.lucene.index.IndexWriter.rollback(IndexWriter.java:2441)
    at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1364)
    at org.apache.pinot.segment.local.segment.creator.impl.text.LuceneTextIndexCreator.close(LuceneTextIndexCreator.java:195)
    at org.apache.pinot.segment.local.realtime.impl.invertedindex.RealtimeLuceneTextIndex.close(RealtimeLuceneTextIndex.java:179)
    at org.apache.pinot.segment.local.indexsegment.mutable.MutableSegmentImpl$IndexContainer.lambda$close$0(MutableSegmentImpl.java:1330)
    at org.apache.pinot.segment.local.indexsegment.mutable.MutableSegmentImpl$IndexContainer$$Lambda$2460/0x00007ee57b296108.accept(Unknown Source)
    at org.apache.pinot.segment.local.indexsegment.mutable.MutableSegmentImpl$IndexContainer$$Lambda$2461/0x00007ee57b295cb0.accept(Unknown Source)
    at java.util.HashMap.forEach(java.base@11.0.23/HashMap.java:1337)
    at org.apache.pinot.segment.local.indexsegment.mutable.MutableSegmentImpl$IndexContainer.close(MutableSegmentImpl.java:1338)
    at org.apache.pinot.segment.local.indexsegment.mutable.MutableSegmentImpl.destroy(MutableSegmentImpl.java:990)
    at org.apache.pinot.core.data.manager.realtime.RealtimeSegmentDataManager.doDestroy(RealtimeSegmentDataManager.java:1343)
    at org.apache.pinot.segment.local.data.manager.SegmentDataManager.destroy(SegmentDataManager.java:82)
    at org.apache.pinot.core.data.manager.BaseTableDataManager.closeSegment(BaseTableDataManager.java:393)
    at org.apache.pinot.core.data.manager.BaseTableDataManager.releaseSegment(BaseTableDataManager.java:382)
    at org.apache.pinot.server.starter.helix.SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel.onBecomeOnlineFromConsuming(SegmentOnlineOfflineStateModelFactory.java:142)
    at jdk.internal.reflect.GeneratedMethodAccessor2034.invoke(Unknown Source)
    at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@11.0.23/DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(java.base@11.0.23/Method.java:566)
    at org.apache.helix.messaging.handling.HelixStateTransitionHandler.invoke(HelixStateTransitionHandler.java:350)
    at org.apache.helix.messaging.handling.HelixStateTransitionHandler.handleMessage(HelixStateTransitionHandler.java:278)
    at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:97)
    at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:49)
    at java.util.concurrent.FutureTask.run(java.base@11.0.23/FutureTask.java:264)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.23/ThreadPoolExecutor.java:1128)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.23/ThreadPoolExecutor.java:628)
    at java.lang.Thread.run(java.base@11.0.23/Thread.java:829)
```
Because partial-upsert tables wait for the segment build to complete, the new CONSUMING segment is not added on the replica where this long wait occurs.
Some more details:
- Pinot version: 1.1
- This does not happen consistently; it occurs erratically on one of the replicas and blocks consumption on the stuck replica.
- Restarting the affected host is currently the only way to bring table freshness back up to the latest segment.
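
Since a restart is the only recovery right now, it may help to detect the stuck state early. Below is a minimal sketch of an in-process check that lists live threads whose stack currently contains a given frame. The frame names passed in `main` are taken from the stack trace above; the class `StuckThreadFinder` and the method `findThreadsIn` are hypothetical names for illustration and are not part of Pinot or Lucene. The check only shows that a thread is inside the frame at the moment of sampling, so a caller would have to poll and compare samples to decide that a thread is actually stuck.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Pinot or Lucene.
class StuckThreadFinder {

    /**
     * Returns the names of live threads whose current stack contains
     * a frame for className.methodName.
     */
    static List<String> findThreadsIn(String className, String methodName) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            for (StackTraceElement frame : entry.getValue()) {
                if (frame.getClassName().equals(className)
                        && frame.getMethodName().equals(methodName)) {
                    names.add(entry.getKey().getName());
                    break; // one match per thread is enough
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Frame taken from the stack trace in this report.
        List<String> waiting =
                findThreadsIn("org.apache.lucene.index.IndexWriter", "abortMerges");
        System.out.println("Threads inside IndexWriter.abortMerges: " + waiting);
    }
}
```

Sampling this periodically and alerting when the same Helix message-handler thread shows up in consecutive samples would be one way to flag the affected replica before freshness falls far behind.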
@itschrispeck shared a similar issue in ES: elastic/elasticsearch#107513