New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Internal: DistributorDirectory should not invoke distributor when reading an existing file #7306

Closed
mikemccand opened this Issue Aug 18, 2014 · 1 comment

Comments

Projects
None yet
2 participants
@mikemccand
Contributor

mikemccand commented Aug 18, 2014

I noticed some hot threads doing this while computing node stats:

java.io.UnixFileSystem.getSpace(Native Method)
       java.io.File.getUsableSpace(File.java:1862)
       org.elasticsearch.index.store.distributor.AbstractDistributor.getUsableSpace(AbstractDistributor.java:60)
       org.elasticsearch.index.store.distributor.LeastUsedDistributor.doAny(LeastUsedDistributor.java:45)
       org.elasticsearch.index.store.distributor.AbstractDistributor.any(AbstractDistributor.java:52)
       org.elasticsearch.index.store.DistributorDirectory.getDirectory(DistributorDirectory.java:176)
       org.elasticsearch.index.store.DistributorDirectory.getDirectory(DistributorDirectory.java:144)
       org.elasticsearch.index.store.DistributorDirectory.fileLength(DistributorDirectory.java:113)
       org.apache.lucene.store.FilterDirectory.fileLength(FilterDirectory.java:63)
       org.elasticsearch.common.lucene.Directories.estimateSize(Directories.java:43)
       org.elasticsearch.index.store.Store.stats(Store.java:174)
       org.elasticsearch.index.shard.service.InternalIndexShard.storeStats(InternalIndexShard.java:524)
       org.elasticsearch.action.admin.indices.stats.CommonStats.<init>(CommonStats.java:130)
       org.elasticsearch.action.admin.indices.stats.ShardStats.<init>(ShardStats.java:49)
       org.elasticsearch.action.admin.indices.stats.TransportIndicesStatsAction.shardOperation(TransportIndicesStatsAction.java:195)
       org.elasticsearch.action.admin.indices.stats.TransportIndicesStatsAction.shardOperation(TransportIndicesStatsAction.java:53)
       org.elasticsearch.action.support.broadcast.TransportBroadcastOperationAction$ShardTransportHandler.messageReceived(TransportBroadcastOperationAction.java:338)
       org.elasticsearch.action.support.broadcast.TransportBroadcastOperationAction$ShardTransportHandler.messageReceived(TransportBroadcastOperationAction.java:324)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:275)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:745)

Which is odd because why would we invoke the least_used distributor when checking fileLength (or opening for read, in other cases) an already-existing file? Seems like we should only check this when writing a new file.

Looking at line 176 of 1.x of DistributorDirectory.java, it looks like we do this to simplify concurrency (so we can use CHM.putIfAbsent), but I think we should fix this code to only invoke the distributor when it's writing a new file?

mikemccand added a commit that referenced this issue Aug 19, 2014

Core: DistributorDirectory shouldn't search for directory when readin…
…g existing file

This was causing too much work e.g. when pulling node stats or when
opening a new reader, because the least_used distributor would
unnecessarily check free disk space on all path.data entires every
time we try to open a file for reading or check its length.

Closes #7306

Closes #7323

mikemccand added a commit that referenced this issue Aug 19, 2014

Core: DistributorDirectory shouldn't search for directory when readin…
…g existing file

This was causing too much work e.g. when pulling node stats or when
opening a new reader, because the least_used distributor would
unnecessarily check free disk space on all path.data entires every
time we try to open a file for reading or check its length.

Closes #7306

Closes #7323

@clintongormley clintongormley changed the title from Core: DistributorDirectory should not invoke distributor when reading an existing file to Internal: DistributorDirectory should not invoke distributor when reading an existing file Sep 8, 2014

mikemccand added a commit that referenced this issue Sep 8, 2014

Core: DistributorDirectory shouldn't search for directory when readin…
…g existing file

This was causing too much work e.g. when pulling node stats or when
opening a new reader, because the least_used distributor would
unnecessarily check free disk space on all path.data entires every
time we try to open a file for reading or check its length.

Closes #7306

Closes #7323

@s1monw s1monw added >bug and removed >enhancement labels Sep 30, 2014

@s1monw

This comment has been minimized.

Contributor

s1monw commented Sep 30, 2014

FYI - I relabelled this since it's a bug

mute pushed a commit to mute/elasticsearch that referenced this issue Jul 29, 2015

Core: DistributorDirectory shouldn't search for directory when readin…
…g existing file

This was causing too much work e.g. when pulling node stats or when
opening a new reader, because the least_used distributor would
unnecessarily check free disk space on all path.data entires every
time we try to open a file for reading or check its length.

Closes elastic#7306

Closes elastic#7323
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment