[Performance] Batch shard requests lying in the queue #4763
Labels
discuss
Issues intended to help drive brainstorming and decision making
distributed framework
enhancement
Enhancement or improvement to existing feature or request
Indexing
Indexing, Bulk Indexing and anything related to indexing
The optimal bulk size is a function of memory, shard count, thread count, the number of coordinating nodes, and so on. Many of these factors change over time for a given customer, and customers may not revisit their bulk size settings after a certain period of time. For example, a cluster with 10 nodes and 10 shards may have an optimal bulk size of 2000, which gives a shard-level bulk size of 200. Once the customer scales to 100 nodes and 100 shards, a bulk size of 2000 is no longer optimal: the coordination overhead is much higher, and the shard-level bulk size drops to just 20, which means more fsync calls. One could argue that customers could simply set a larger bulk size themselves, but a larger bulk size means larger requests waiting in the coordinator queue, which increases memory overhead.
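The arithmetic behind the example can be sketched as follows. This is only an illustration, assuming documents are routed uniformly across shards; the function names `per_shard_batch_size` and `bulk_size_for_target` are hypothetical and not part of any existing API.

```python
def per_shard_batch_size(bulk_size: int, shard_count: int) -> float:
    """Average number of documents each shard receives from one bulk request.

    Assumption: documents are spread uniformly across all shards.
    """
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    return bulk_size / shard_count


def bulk_size_for_target(target_per_shard: int, shard_count: int) -> int:
    """Client-side bulk size needed to keep a given shard-level batch size.

    Same uniform-routing assumption as above.
    """
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    return target_per_shard * shard_count


if __name__ == "__main__":
    # The two cluster sizes from the example above.
    for shards in (10, 100):
        print(shards, "shards:", per_shard_batch_size(2000, shards), "docs per shard")
    # Bulk size a client would need at 100 shards to keep 200 docs per shard.
    print("bulk size needed:", bulk_size_for_target(200, 100))
```

Under this assumption, keeping 200 documents per shard at 100 shards would require the client to send bulk requests ten times larger, which is the coordinator-queue memory cost described above.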