shard_size makes it possible to increase the accuracy of the returned term entries.
The size parameter defines how many top terms should be returned out
of the overall terms list. By default, the node coordinating the
search asks each shard for its own top size terms and, once all shards
have responded, reduces the results to the final list that is sent back
to the client. This means that if the number of unique terms is greater
than size, the returned list may be inaccurate: the term counts can be
slightly off, and a term that should have been in the top size entries
may not be returned at all.
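The inaccuracy can be reproduced with a small simulation. This is only a sketch, not Elasticsearch code: the shard contents and the helper names (ranked, exact_top, per_shard_top) are made up for illustration, and breaking ties alphabetically is an assumption.

```python
from collections import Counter

def ranked(counts, n):
    """Return the n highest-count (term, count) pairs; ties broken by term (assumption)."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]

def exact_top(shards, size):
    """Ground truth: sum every term over every shard, then keep the top `size`."""
    total = Counter()
    for shard in shards:
        total.update(shard)
    return ranked(total, size)

def per_shard_top(shards, size):
    """Default reduce: each shard sends only its own top `size` terms."""
    merged = Counter()
    for shard in shards:
        merged.update(dict(ranked(shard, size)))
    return ranked(merged, size)

# Hypothetical per-shard term counts.
shards = [
    {"a": 10, "b": 6, "c": 5},
    {"a": 10, "d": 6, "c": 5, "b": 1},
]

exact = exact_top(shards, 2)       # [("a", 20), ("c", 10)]
approx = per_shard_top(shards, 2)  # [("a", 20), ("b", 6)]
```

Here "c" is third on both shards, so neither shard reports it and it never reaches the final list, even though it is second overall. The count for "b" is also off (6 instead of 7) because the second shard did not include it in its top 2.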
The higher the requested size, the more accurate the results, but also
the more expensive it is to compute them (both because of the bigger
priority queues managed at the shard level and because of the bigger
data transfers between the nodes and the client). To minimize the extra
work that comes with a bigger requested size, a shard_size parameter was
introduced. Once defined, it determines how many terms the coordinating
node requests from each shard. Once all the shards have responded, the
coordinating node reduces them to a final result based on the size
parameter. This way, one can increase the accuracy of the returned terms
while avoiding the overhead of streaming a big list of terms back to the
client.
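A minimal sketch of the two-level reduce described above, assuming made-up shard data and a hypothetical helper named facet_top (none of this is the actual Elasticsearch implementation). Each shard contributes its top shard_size terms, and the coordinating node trims the merged result to size.

```python
from collections import Counter

def ranked(counts, n):
    """Return the n highest-count (term, count) pairs; ties broken by term (assumption)."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]

def facet_top(shards, size, shard_size=None):
    """Each shard sends its top `shard_size` terms; the coordinator keeps the top `size`."""
    if shard_size is None:
        shard_size = size  # default behaviour: ask each shard for `size` terms
    merged = Counter()
    for shard in shards:
        merged.update(dict(ranked(shard, shard_size)))
    return ranked(merged, size)

# Hypothetical per-shard term counts.
shards = [
    {"a": 10, "b": 6, "c": 5},
    {"a": 10, "d": 6, "c": 5, "b": 1},
]

default_result = facet_top(shards, size=2)               # [("a", 20), ("b", 6)]
wider_result = facet_top(shards, size=2, shard_size=3)   # [("a", 20), ("c", 10)]
```

With shard_size=3 both shards report "c", so it correctly takes second place, while the client still receives only two entries.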
Note that shard_size cannot be smaller than size. If it is,
elasticsearch will override it and reset it to be equal to size.
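The override rule amounts to a simple clamp. The sketch below mirrors it on the client side purely for illustration; effective_shard_size and terms_facet_body are hypothetical helpers, the "tags" field is invented, and the request layout assumes shard_size sits next to size inside the terms facet body.

```python
def effective_shard_size(size, shard_size=None):
    """Return the per-shard term count actually used: never smaller than `size`."""
    if shard_size is None or shard_size < size:
        return size
    return shard_size

def terms_facet_body(field, size, shard_size=None):
    """Build a terms facet request body (layout is an assumption, see lead-in)."""
    return {
        "facets": {
            field: {
                "terms": {
                    "field": field,
                    "size": size,
                    "shard_size": effective_shard_size(size, shard_size),
                }
            }
        }
    }

# A too-small shard_size is reset to size; a larger one is kept as given.
clamped = terms_facet_body("tags", size=10, shard_size=5)
kept = terms_facet_body("tags", size=10, shard_size=50)
```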
…he "shard_size" is the number of term entries each shard will send back to the coordinating node. "shard_size" > "size" will increase the accuracy (both in terms of the counts associated with each term and the terms that will actually be returned to the user). Of course, the higher "shard_size" is, the more expensive the processing becomes, as bigger queues are maintained on a shard level and larger lists are streamed back from the shards.
closes #3821