Creating Index painfully slow on cluster with large indices #18776

puuzll · 2016-06-08T07:29:40Z

I have a cluster with 2 nodes and approximately 10,000 indices. Each index has one replica. Creating an index with a preset mapping on this cluster is painfully slow (about 1 minute to create an index with 1 replica). There is sufficient memory, Java heap, CPU, and disk available while the index is being created. Using the hot_threads API, I found that 95% of the time is spent running the following code on the master node:

    at com.google.common.collect.Iterators$3.hasNext(Iterators.java:164)
    at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyDeletedShards(IndicesClusterStateService.java:256)
    at org.elasticsearch.indices.cluster.IndicesClusterStateService.clusterChanged(IndicesClusterStateService.java:167)
    - locked <0x00000000f96ea8b0> (a java.lang.Object)
    at org.elasticsearch.cluster.service.InternalClusterService.runTasksForExecutor(InternalClusterService.java:610)
    at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:772)
    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:231)
    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:194)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Is this a bug? Can I avoid this by setting any configuration?

Elasticsearch version: 2.3.3
JVM version: 1.8
OS version: debian7

jasontedor · 2016-06-08T19:46:47Z

is this a bug?

When you create an index, that causes a change to the routing table. A cluster state update task is submitted to the nodes in the cluster. When that cluster state update task arrives, each node must process the new routing table to see whether it needs to remove indices, delete shards, start shards, etc. Currently, applying deleted shards is O(number of indices * number of shards). I opened #18788 to address this.
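To illustrate why a cost of that shape hurts at this scale, here is a minimal sketch in Python. It is not the Elasticsearch code and not necessarily the change made in #18788; the function names and the (index name, shard number) tuples are hypothetical. It only contrasts a nested scan with a single hashed lookup when working out which local shards are no longer assigned to a node.

```python
from typing import List, Tuple

# Hypothetical shard identifier: (index name, shard number).
ShardId = Tuple[str, int]


def stale_shards_nested(local: List[ShardId], assigned: List[ShardId]) -> List[ShardId]:
    """For every local shard, scan the whole assigned list.

    Cost grows as len(local) * len(assigned), which is the kind of
    behaviour that becomes painful with tens of thousands of shards.
    """
    return [s for s in local if not any(s == a for a in assigned)]


def stale_shards_hashed(local: List[ShardId], assigned: List[ShardId]) -> List[ShardId]:
    """Build a set once, then do one constant-time lookup per local shard.

    Cost grows as len(local) + len(assigned).
    """
    assigned_set = set(assigned)
    return [s for s in local if s not in assigned_set]


if __name__ == "__main__":
    local = [("users-a", 0), ("users-b", 0), ("users-c", 0)]
    assigned = [("users-a", 0), ("users-c", 0)]
    # Both variants agree on the result; only the amount of work differs.
    print(stale_shards_nested(local, assigned))
    print(stale_shards_hashed(local, assigned))
```

Both functions return the same list, so the difference shows up only in how the running time scales as the number of indices and shards grows.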

However, you're still going to be hurting here. Having 10000 indices on two nodes with one replica is asking for pain. This means that you have at a minimum 10000 shards on each node if you have one shard per index, and maybe 50000 shards on each node if you're using the default number of shards per index. Either way, this is way too many shards. So #18788 is not meant to address your issue directly, just improve performance for the general case. You'll still need to do something about how many indices and shards that you have.
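The per-node shard counts above can be checked with a little arithmetic. The sketch below assumes shards are spread evenly across the nodes and that "the default number of shards per index" means 5 primaries, which was the default in Elasticsearch 2.x; the function name is hypothetical.

```python
def shards_per_node(indices: int, primaries_per_index: int,
                    replicas: int, nodes: int) -> float:
    """Average number of shard copies per node.

    Each primary shard has `replicas` extra copies, so the cluster holds
    indices * primaries_per_index * (1 + replicas) shard copies in total.
    Assumes an even spread across nodes (an assumption, not a guarantee).
    """
    total_copies = indices * primaries_per_index * (1 + replicas)
    return total_copies / nodes


if __name__ == "__main__":
    # One primary shard per index, one replica, two nodes.
    print(shards_per_node(10_000, 1, 1, 2))
    # Five primary shards per index (assumed 2.x default), one replica, two nodes.
    print(shards_per_node(10_000, 5, 1, 2))
```

With the numbers from this issue, the two calls give 10,000 and 50,000 shard copies per node respectively, matching the estimates above.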

Can I avoid this by setting any configuration?

No.

jilen · 2016-10-08T01:33:15Z

@jasontedor I have run into this as well. Is there any way to improve index creation speed?

jasontedor · 2016-10-10T11:00:55Z

@jilen Creating an index requires a cluster state update which can be a slow thing indeed. The issue here was about the degradation in index-creation speed as the number of indices increased; that's what #18788 addressed. I'd say that if you rely on index creation being fast, you probably have an architecture that needs to be reconsidered.

bleskes · 2016-10-10T12:04:45Z

@jilen to quantify what @jasontedor said (which is very true) - index creation is slow when compared to data level operations like indexing and search. You should expect it to run within a couple of seconds. Also note that we now wait (since 5.0) for the primaries to be fully allocated before responding to the call.

jilen · 2016-10-11T02:22:00Z

@jasontedor @bleskes I am currently using a one-index-per-user pattern, and there are actually more than 20k shards.

Automatic index creation in parallel (via the bulk or update API) makes the cluster unresponsive.

What do you suggest for my situation? Should I disable automatic index creation?

nik9000 · 2016-10-11T03:28:03Z

Don't have an index per user.

NelsonBurton · 2021-07-29T03:33:30Z

An alternative to an index per user is to put all customers in one index and use each customer's identifier as the Elasticsearch routing value: https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-routing-field.html . This works well in our system, with millions of users and 1 index.
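As a rough sketch of that pattern, the Python below only builds the REST requests (method, path, JSON body) rather than sending them. The index name `customers` and the field name `customer_id` are hypothetical, and the sketch assumes `customer_id` is mapped as a keyword so that a term filter matches exactly. Because several customers can hash to the same shard, the search request still filters on the customer identifier; routing alone only narrows which shard is queried.

```python
import json
from urllib.parse import quote, urlencode

INDEX = "customers"  # hypothetical shared index name


def index_request(customer_id: str, doc_id: str, doc: dict):
    """Build an index request routed by the customer's identifier."""
    query_string = urlencode({"routing": customer_id})
    path = f"/{INDEX}/_doc/{quote(doc_id, safe='')}?{query_string}"
    # Store the identifier in the document so searches can filter on it.
    body = dict(doc, customer_id=customer_id)
    return "PUT", path, json.dumps(body)


def search_request(customer_id: str, query: dict):
    """Build a search that targets the customer's shard and filters to them."""
    query_string = urlencode({"routing": customer_id})
    path = f"/{INDEX}/_search?{query_string}"
    body = {
        "query": {
            "bool": {
                "must": query,
                # Other customers may share the shard, so filter explicitly.
                "filter": {"term": {"customer_id": customer_id}},
            }
        }
    }
    return "POST", path, json.dumps(body)


if __name__ == "__main__":
    print(index_request("42", "order-1", {"item": "book"}))
    print(search_request("42", {"match": {"item": "book"}}))
```

The same routing value must be supplied on both indexing and searching; otherwise a search fans out to every shard, which is correct but gives up the benefit.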

jasontedor mentioned this issue Jun 8, 2016

Improve performance of applyDeletedShards #18788

Merged

jasontedor closed this as completed Jun 8, 2016
