Shard size estimations for Slice API do not target shards #843

jbaiera · 2016-09-08T22:19:44Z

When passing a shard id to the RestClient#count method, I noticed that it renders the request parameter as &preference=0 when it should be &preference=_shards:0. According to the Elasticsearch documentation:

_shards:2,3: Restricts the operation to the specified shards.
Custom (string) value: A custom value will be used to guarantee that the same shards will be used for the same custom value.

When using &preference=0, the value 0 is treated as a custom string rather than a shard restriction, so the count method pulls back the count for the entire index instead of just the count for the shard it's targeting.
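As a minimal sketch of the difference between the two parameter values, the helper below builds a count request path with the shard-restricting form of the preference. The function name count_path and the use of the _count endpoint are assumptions made for illustration; this is not the actual RestClient#count code.

```python
from typing import Optional
from urllib.parse import urlencode


def count_path(index: str, shard_id: Optional[int] = None) -> str:
    """Build a count request path, optionally restricted to one shard.

    Hypothetical helper for illustration only; not es-hadoop code.
    """
    path = f"{index}/_count"
    if shard_id is None:
        return path
    # "_shards:<id>" restricts the operation to that shard. A bare "<id>"
    # would be read as a custom preference string and hit every shard.
    query = urlencode({"preference": f"_shards:{shard_id}"}, safe=":")
    return f"{path}?{query}"


print(count_path("myindex", 0))  # myindex/_count?preference=_shards:0
```

Keeping the colon unescaped (safe=":") is only so the output matches the form quoted in this report; a percent-encoded colon would be decoded to the same value by the server.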

For example, if we have 100 documents in an index with 5 shards, we can assume that about 20 documents will land in each shard. If we set es.input.maxdocsperpartition to 10, we would expect about 10 input splits (20 docs per shard divided by 10 maximum docs per partition gives 2 splits per shard, times 5 shards). Currently, since the count method returns the count for the entire index instead of the count for each shard, we get 50 input splits (100 docs divided by 10 maximum docs per partition gives 10 splits per shard, times 5 shards).
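The arithmetic in this example can be checked with a short sketch. The function estimate_splits is hypothetical, and rounding each shard's split count up is an assumption; the connector's real partitioning logic may differ in detail.

```python
import math
from typing import List


def estimate_splits(counts_per_shard: List[int], max_docs_per_partition: int) -> int:
    """Sum the number of splits needed per shard (hypothetical sketch)."""
    return sum(
        math.ceil(count / max_docs_per_partition)  # assumed: round up per shard
        for count in counts_per_shard
    )


# Expected behaviour: each of the 5 shards reports its own ~20 documents.
expected = estimate_splits([20] * 5, 10)

# Reported bug: each shard's count request returns the whole index (100 docs).
buggy = estimate_splits([100] * 5, 10)

print(expected, buggy)  # 10 50
```

With the whole-index count repeated for every shard, the split estimate is inflated by a factor equal to the number of shards.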

jbaiera added bug :Rest v5.0.0-beta1 labels Sep 8, 2016

jbaiera closed this as completed in 432e8f5 Sep 8, 2016
