
Sharding based on shard size or index size #16828

Closed
sherry-ger opened this issue Feb 26, 2016 · 3 comments

Labels
:Core/Infra/Core Core issues without another label >feature Meta

Comments

@sherry-ger

Describe the feature:

It would be nice to be able to specify a target shard size in the ES config file so that, when a shard approaches that size, ES automatically creates a new "shard" for that index. Our use case is log indices whose volume changes dynamically with organic traffic pattern changes and traffic shifts from one datacenter to another. These changes result in shards of undesirable size, sometimes up to 150 GB, that are relocated frequently and destabilize the entire cluster. Tuning the shard size is not a straightforward process, and there may not be a single solution to the equation.

@nik9000
Member

nik9000 commented Feb 26, 2016

A few of the Elasticsearch committers have been kicking this idea around for a while. I wasn't the one who originated it but I'm online now so I'll describe it:

Right now if you have time based data like logs then we (Elastic the company, Elasticsearch committers, helpful people on discuss.elastic.co) recommend you go with an index per X time period. It makes you choose a few things:

  • What time period should you use? Daily was pretty traditional, but it's not always right.
  • How many shards should these indexes have? More shards mean higher write performance, sometimes lower search performance, and always more overhead on the cluster.
  • What do I do if my time periods don't have a predictable volume? I end up with some big time periods and some small ones and that is a pain.

So we have a plan to make all of these choices simpler! The basics go like this: create an index with enough shards to get maximal write performance. Something like (num_nodes - 1)/num_replicas shards. Eventually some event will occur (the index gets to be a certain size, probably) and we'll make a new index just like the old one automatically. Then we smash the old one down to one shard. Because the number of shards changes after the triggering event, you get to live in the best of both worlds with regard to bullet number 2 above. Because this is dynamic, bullets 1 and 3 shouldn't be a problem either.
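
For reference, the "smash it down to one shard" step is what later shipped as the _shrink API mentioned in the closing comment below. A minimal sketch of that step, with the index names and node name as illustrative placeholders only:

```
# Make the old write-optimized index read-only and collect a copy of
# every shard onto a single node (both are prerequisites for shrinking).
PUT /logs-000001/_settings
{
  "settings": {
    "index.routing.allocation.require._name": "shrink-node-1",
    "index.blocks.write": true
  }
}

# Shrink it into a new single-shard, storage-optimized index.
POST /logs-000001/_shrink/logs-000001-shrunk
{
  "settings": {
    "index.number_of_shards": 1
  }
}
```

Whether to also force-merge the shrunken index is a separate, optional choice, as discussed further down.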

Now, you may ask lots of interesting questions. Like:

Why just one shard?
It's simpler to implement one shard given routing. I suppose you could go to more shards if you were careful to pick a number that doesn't break the routing but for this use case it doesn't seem worth it. From where I sit right now it seems fine just to have more time based indexes rather than more shards on a single time based index. Better, even.
How do you make this invisible to the client application? What if it wants to write to this magical growing index?
For now I don't want to make this invisible. I think it's safer to think of this as a nifty indexing behavior that you can opt in to rather than as some perfect abstraction over an index. So when these new time based indexes are created they are just indexes. You can write to them if you want to, though I imagine most folks won't want to.
What if you want to change the mapping on a new index? This was a way that we recommended people roll changes into their time based indexes. Things like turning on doc_values or new analysis configuration.
This'll have to be supported somehow, probably by creating the new write-optimized indexes using a template. The storage-optimized single-shard indexes have to have the same mapping as the write-optimized indexes, or the shard merging operation would be even more non-trivial than it already is.
What about _optimize aka _force_merge?
We'll probably make that an optional part of the creation of the storage-optimized indexes from the write-optimized indexes. It might even be on by default. I dunno. It is a reasonable choice if you aren't going to be writing to the storage-optimized indexes. And I think that is the normal use case.
What rollover rules will you support at first?
Certainly index size and document count. Anything else is probably phase 2. (There's a sketch of how these rules look as API conditions just after this list.)
Will the rollover rules be exact or approximate?
This is a leading question! They'll be approximate because they'll be calculated asynchronously at the shard level. So both their async nature and their shard-by-shard nature make them far from exact.
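
For what it's worth, the rollover rules described above ended up expressed as request-level conditions in the _rollover API that eventually closed this issue (see below). A minimal sketch, assuming a write alias named logs-write and thresholds chosen purely for illustration; exact condition names vary by Elasticsearch version:

```
# Roll over when the index behind the logs-write alias passes either
# threshold; max_docs shipped first, size-based conditions came later.
POST /logs-write/_rollover
{
  "conditions": {
    "max_docs": 100000000,
    "max_size": "50gb"
  }
}
```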

I'll add to this list/change it as I think of more things.

Does that cover it?

@nik9000
Member

nik9000 commented Feb 26, 2016

target shard size in ES config file

Almost certainly we'll do this as a part of the index's configuration, probably near where you'd set number_of_shards.
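
For context, number_of_shards is an index-level setting supplied at creation time, which is where such a target would sit. A minimal sketch of creating a write-optimized index, with the index name, alias, and shard counts as illustrative assumptions:

```
# Create the first write-optimized index and point a write alias at it;
# the -000001 suffix lets later rollovers generate the next index name.
PUT /logs-000001
{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 1
  },
  "aliases": {
    "logs-write": {}
  }
}
```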

@jasontedor
Member

This is closed by the rollover and shrink APIs. Closing.
