Closing IndexWriter might wait a long time for merges to complete #107513

Open
DaveCTurner opened this issue Apr 16, 2024 · 2 comments
Labels
>bug · :Distributed/Engine (Anything around managing Lucene and the Translog in an open shard) · Team:Distributed (Meta label for distributed team)

Comments

@DaveCTurner (Contributor)

Another instance of #89821: I saw a case where a cluster took 3+ minutes to apply a cluster state, and on closer inspection it was waiting on aborted merges to complete here:

 0.0% [cpu=0.0%, other=0.0%] (0s out of 500ms) cpu usage by thread 'elasticsearch[es-es-index-8d94bb688-8n6dg][clusterApplierService#updateTask][T#1]'
  10/10 snapshots sharing following 29 elements
    java.base@21.0.2/java.lang.Object.wait0(Native Method)
    java.base@21.0.2/java.lang.Object.wait(Object.java:366)
    app/org.apache.lucene.core@9.10.0/org.apache.lucene.index.IndexWriter.doWait(IndexWriter.java:5500)
    app/org.apache.lucene.core@9.10.0/org.apache.lucene.index.IndexWriter.abortMerges(IndexWriter.java:2738)
    app/org.apache.lucene.core@9.10.0/org.apache.lucene.index.IndexWriter.rollbackInternalNoCommit(IndexWriter.java:2493)
    app/org.apache.lucene.core@9.10.0/org.apache.lucene.index.IndexWriter.rollbackInternal(IndexWriter.java:2462)
    app/org.apache.lucene.core@9.10.0/org.apache.lucene.index.IndexWriter.rollback(IndexWriter.java:2454)
    app/org.elasticsearch.server@8.14.0/org.elasticsearch.index.engine.InternalEngine.closeNoLock(InternalEngine.java:2620)
    app/org.elasticsearch.server@8.14.0/org.elasticsearch.index.engine.Engine.close(Engine.java:1956)
...
    app/org.elasticsearch.server@8.14.0/org.elasticsearch.index.shard.IndexShard.close(IndexShard.java:1718)
    app/org.elasticsearch.server@8.14.0/org.elasticsearch.index.IndexService.closeShard(IndexService.java:589)
    app/org.elasticsearch.server@8.14.0/org.elasticsearch.index.IndexService.removeShard(IndexService.java:569)
    app/org.elasticsearch.server@8.14.0/org.elasticsearch.index.IndexService.close(IndexService.java:365)
    app/org.elasticsearch.server@8.14.0/org.elasticsearch.indices.IndicesService.removeIndex(IndicesService.java:914)
    app/org.elasticsearch.server@8.14.0/org.elasticsearch.indices.cluster.IndicesClusterStateService.removeIndicesAndShards(IndicesClusterStateService.java:409)
    app/org.elasticsearch.server@8.14.0/org.elasticsearch.indices.cluster.IndicesClusterStateService.applyClusterState(IndicesClusterStateService.java:247)
    app/org.elasticsearch.server@8.14.0/org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:540)
    app/org.elasticsearch.server@8.14.0/org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:526)
    app/org.elasticsearch.server@8.14.0/org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:499)
    app/org.elasticsearch.server@8.14.0/org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:430)
    app/org.elasticsearch.server@8.14.0/org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:155)
    app/org.elasticsearch.server@8.14.0/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:917)
    app/org.elasticsearch.server@8.14.0/org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:217)
    app/org.elasticsearch.server@8.14.0/org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:183)
    java.base@21.0.2/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
    java.base@21.0.2/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
    java.base@21.0.2/java.lang.Thread.runWith(Thread.java:1596)
    java.base@21.0.2/java.lang.Thread.run(Thread.java:1583)

The associated merge thread was busy here:

 3.0% [cpu=3.0%, other=0.0%] (15ms out of 500ms) cpu usage by thread 'elasticsearch[es-es-index-8d94bb688-8n6dg][[.ds-traces-apm-default-2024.04.15-000064][0]: Lucene Merge Thread #21]'
  10/10 snapshots sharing following 30 elements
...
    app/org.apache.lucene.core@9.10.0/org.apache.lucene.store.BufferedChecksumIndexInput.readBytes(BufferedChecksumIndexInput.java:46)
    app/org.apache.lucene.core@9.10.0/org.apache.lucene.store.DataInput.readBytes(DataInput.java:73)
    app/org.apache.lucene.core@9.10.0/org.apache.lucene.store.ChecksumIndexInput.skipByReading(ChecksumIndexInput.java:79)
    app/org.apache.lucene.core@9.10.0/org.apache.lucene.store.ChecksumIndexInput.seek(ChecksumIndexInput.java:64)
    app/org.apache.lucene.core@9.10.0/org.apache.lucene.codecs.CodecUtil.checksumEntireFile(CodecUtil.java:619)
    app/org.apache.lucene.core@9.10.0/org.apache.lucene.codecs.lucene90.compressing.Lucene90CompressingStoredFieldsReader.checkIntegrity(Lucene90CompressingStoredFieldsReader.java:725)
    app/org.apache.lucene.core@9.10.0/org.apache.lucene.codecs.lucene90.compressing.Lucene90CompressingStoredFieldsWriter.merge(Lucene90CompressingStoredFieldsWriter.java:609)
    app/org.apache.lucene.core@9.10.0/org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:234)
    app/org.apache.lucene.core@9.10.0/org.apache.lucene.index.SegmentMerger$$Lambda/0x00000028029fb1f8.merge(Unknown Source)
    app/org.apache.lucene.core@9.10.0/org.apache.lucene.index.SegmentMerger.mergeWithLogging(SegmentMerger.java:273)
    app/org.apache.lucene.core@9.10.0/org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:110)
    app/org.apache.lucene.core@9.10.0/org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:5252)
    app/org.apache.lucene.core@9.10.0/org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4740)
    app/org.apache.lucene.core@9.10.0/org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:6541)
    app/org.apache.lucene.core@9.10.0/org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:639)
    app/org.elasticsearch.server@8.14.0/org.elasticsearch.index.engine.ElasticsearchConcurrentMergeScheduler.doMerge(ElasticsearchConcurrentMergeScheduler.java:118)
    app/org.apache.lucene.core@9.10.0/org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:700)

AFAICT today there's no mechanism to abort the big read that `checksumEntireFile` requires when the corresponding merge is aborted. In this context it is plausible that this read might take a few minutes to complete (the data was coming in chunks from S3). I think we should have such a mechanism.
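For illustration, here is a minimal sketch of the kind of cooperative cancellation this would need: a checksum loop that reads in bounded chunks and polls an abort signal between reads, instead of running the whole multi-minute read to completion. The `BooleanSupplier` hook and the class name are hypothetical, not existing Lucene API; this is just the shape of the mechanism.

```java
// Hypothetical sketch only: the abort hook does not exist in Lucene today.
// The idea is that a long checksum pass checks for merge abortion between
// bounded reads instead of running to completion unconditionally.
import java.io.IOException;
import java.util.function.BooleanSupplier;

import org.apache.lucene.index.MergePolicy;
import org.apache.lucene.store.ChecksumIndexInput;

final class AbortableChecksum {

    private static final int CHUNK_SIZE = 8192; // small chunks keep abort checks frequent

    /** Reads the rest of the file through the checksumming input, bailing out if the merge is aborted. */
    static long checksumRemainder(ChecksumIndexInput in, BooleanSupplier mergeAborted) throws IOException {
        final byte[] buffer = new byte[CHUNK_SIZE];
        long remaining = in.length() - in.getFilePointer();
        while (remaining > 0) {
            if (mergeAborted.getAsBoolean()) {
                throw new MergePolicy.MergeAbortedException("merge aborted during checksum");
            }
            final int toRead = (int) Math.min(buffer.length, remaining);
            in.readBytes(buffer, 0, toRead);
            remaining -= toRead;
        }
        return in.getChecksum();
    }
}
```

With something along these lines, the merge thread in the stack trace above would surface the abort promptly rather than finishing the full read from the object store.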

@DaveCTurner DaveCTurner added >bug :Distributed/Engine Anything around managing Lucene and the Translog in an open shard. labels Apr 16, 2024
@elasticsearchmachine elasticsearchmachine added the Team:Distributed Meta label for distributed team label Apr 16, 2024
@elasticsearchmachine (Collaborator)

Pinging @elastic/es-distributed (Team:Distributed)

DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this issue May 1, 2024
Moves the work to close an `IndexShard`, including any final flush and
waiting for merges to complete, off the cluster applier thread to avoid
delaying the application of cluster state updates.

Relates elastic#89821
Relates elastic#107513
Relates elastic#108096
Relates ES-8334
elasticsearchmachine pushed a commit that referenced this issue May 3, 2024
Moves the work to close an `IndexShard`, including any final flush and
waiting for merges to complete, off the cluster applier thread to avoid
delaying the application of cluster state updates.

Relates #89821 Relates #107513 Relates #108096 Relates ES-8334
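As a rough sketch of the approach in those commits (with placeholder names, not the actual Elasticsearch code): the cluster applier no longer performs the blocking close inline, but hands it to a background executor, so cluster state application is not held up by the final flush or by waiting for aborted merges.

```java
// Hedged sketch with hypothetical names (genericExecutor, closeShardBlocking);
// the real change lives in IndicesClusterStateService / IndexService.
import java.util.concurrent.Executor;

final class AsyncShardCloser {
    private final Executor genericExecutor;

    AsyncShardCloser(Executor genericExecutor) {
        this.genericExecutor = genericExecutor;
    }

    /** Called from the cluster applier; returns immediately instead of blocking on the close. */
    void removeShard(Runnable closeShardBlocking) {
        // The flush and the wait for aborted merges now happen off the applier thread.
        genericExecutor.execute(closeShardBlocking);
    }
}
```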
@DaveCTurner (Contributor, Author)

I opened apache/lucene#13354 because it looks like we'd need to change something in Lucene to address this.
