Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Internal: DistributorDirectory should not invoke distributor when reading an existing file #7306

Closed
mikemccand opened this issue Aug 18, 2014 · 1 comment

Comments

@mikemccand
Copy link
Contributor

I noticed some hot threads doing this while computing node stats:

java.io.UnixFileSystem.getSpace(Native Method)
       java.io.File.getUsableSpace(File.java:1862)
       org.elasticsearch.index.store.distributor.AbstractDistributor.getUsableSpace(AbstractDistributor.java:60)
       org.elasticsearch.index.store.distributor.LeastUsedDistributor.doAny(LeastUsedDistributor.java:45)
       org.elasticsearch.index.store.distributor.AbstractDistributor.any(AbstractDistributor.java:52)
       org.elasticsearch.index.store.DistributorDirectory.getDirectory(DistributorDirectory.java:176)
       org.elasticsearch.index.store.DistributorDirectory.getDirectory(DistributorDirectory.java:144)
       org.elasticsearch.index.store.DistributorDirectory.fileLength(DistributorDirectory.java:113)
       org.apache.lucene.store.FilterDirectory.fileLength(FilterDirectory.java:63)
       org.elasticsearch.common.lucene.Directories.estimateSize(Directories.java:43)
       org.elasticsearch.index.store.Store.stats(Store.java:174)
       org.elasticsearch.index.shard.service.InternalIndexShard.storeStats(InternalIndexShard.java:524)
       org.elasticsearch.action.admin.indices.stats.CommonStats.<init>(CommonStats.java:130)
       org.elasticsearch.action.admin.indices.stats.ShardStats.<init>(ShardStats.java:49)
       org.elasticsearch.action.admin.indices.stats.TransportIndicesStatsAction.shardOperation(TransportIndicesStatsAction.java:195)
       org.elasticsearch.action.admin.indices.stats.TransportIndicesStatsAction.shardOperation(TransportIndicesStatsAction.java:53)
       org.elasticsearch.action.support.broadcast.TransportBroadcastOperationAction$ShardTransportHandler.messageReceived(TransportBroadcastOperationAction.java:338)
       org.elasticsearch.action.support.broadcast.TransportBroadcastOperationAction$ShardTransportHandler.messageReceived(TransportBroadcastOperationAction.java:324)
       org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:275)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:745)

Which is odd because why would we invoke the least_used distributor when checking fileLength (or opening for read, in other cases) an already-existing file? Seems like we should only check this when writing a new file.

Looking at line 176 of 1.x of DistributorDirectory.java, it looks like we do this to simplify concurrency (so we can use CHM.putIfAbsent), but I think we should fix this code to only invoke the distributor when it's writing a new file?

mikemccand added a commit that referenced this issue Aug 19, 2014
…g existing file

This was causing too much work e.g. when pulling node stats or when
opening a new reader, because the least_used distributor would
unnecessarily check free disk space on all path.data entires every
time we try to open a file for reading or check its length.

Closes #7306

Closes #7323
mikemccand added a commit that referenced this issue Aug 19, 2014
…g existing file

This was causing too much work e.g. when pulling node stats or when
opening a new reader, because the least_used distributor would
unnecessarily check free disk space on all path.data entires every
time we try to open a file for reading or check its length.

Closes #7306

Closes #7323
@clintongormley clintongormley changed the title Core: DistributorDirectory should not invoke distributor when reading an existing file Internal: DistributorDirectory should not invoke distributor when reading an existing file Sep 8, 2014
mikemccand added a commit that referenced this issue Sep 8, 2014
…g existing file

This was causing too much work e.g. when pulling node stats or when
opening a new reader, because the least_used distributor would
unnecessarily check free disk space on all path.data entires every
time we try to open a file for reading or check its length.

Closes #7306

Closes #7323
@s1monw s1monw added >bug and removed >enhancement labels Sep 30, 2014
@s1monw
Copy link
Contributor

s1monw commented Sep 30, 2014

FYI - I relabelled this since it's a bug

mute pushed a commit to mute/elasticsearch that referenced this issue Jul 29, 2015
…g existing file

This was causing too much work e.g. when pulling node stats or when
opening a new reader, because the least_used distributor would
unnecessarily check free disk space on all path.data entires every
time we try to open a file for reading or check its length.

Closes elastic#7306

Closes elastic#7323
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants