Alias creation operation goes very slow when we have more than 100000 aliases #16853

Closed
malaki12003 opened this issue Feb 29, 2016 · 7 comments
Labels: :Data Management/Indices APIs (APIs to create and manage indices and templates), discuss

Comments

@malaki12003

I wrote a snippet of code that tries to create 100000 aliases on Elasticsearch. I found that as the number of aliases increases, the time it takes to create each alias rises as well. It seems the complexity of this operation is O(n), which makes no sense.

@ebuildy
Contributor

ebuildy commented Feb 29, 2016

Hmmmmmm

Can you post your method and machine configuration please?

@jasontedor
Member

> It seems the complexity of this operation is O(n), which makes no sense.

Every alias change requires a cluster state update. Cluster state updates are published using cluster state diffs. The diff is calculated by checking every existing alias in the before cluster state, and whether or not it has been removed in the after cluster state. This is clearly linear, so it makes perfect sense. The benefit of sending cluster state diffs vastly outweighs the advantages of not executing this linear time operation, especially since having 100000 aliases is an anti-pattern. What is more, even if this linear diff operation were removed, the published cluster state updates grow linearly in the number of aliases.
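To illustrate why the diff is linear, here is a minimal sketch in Python. It is a simplified model, not the actual Elasticsearch implementation: the cluster state is reduced to a plain dict of alias name to settings, and `diff_aliases` is a hypothetical helper.

```python
def diff_aliases(before, after):
    """Return (removed, upserted) between two alias maps.

    Simplified model of a cluster state diff; not Elasticsearch's real code.
    """
    # Every alias in the before state is checked for removal: O(n).
    removed = [name for name in before if name not in after]
    # Every alias in the after state is checked for addition or change: O(n).
    upserted = {
        name: cfg for name, cfg in after.items() if before.get(name) != cfg
    }
    return removed, upserted


# 100000 existing aliases, then a single new one is added.
before = {f"user_{i}": {"routing": str(i)} for i in range(100_000)}
after = dict(before)
after["user_100000"] = {"routing": "100000"}

removed, upserted = diff_aliases(before, after)
print(len(removed), list(upserted))
```

Even though only one alias changed, both passes walk all existing entries, so the cost of each update grows with the number of aliases already present.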

@malaki12003
Author

In my snippet of code I send requests serially (one by one). The first request takes around 10 ms, and the time grows linearly, reaching 170 ms by the 40000th request. I need to create aliases on demand, which is why I can't use the bulk format. Note that for this simple test case I dedicated a machine with 16 GB of RAM and a fast SSD, without any cluster configuration or other load. So it seems that the problem is not related to clustering. In a real deployment it can be even worse. I have 2 clustered servers in production with more than 100000 aliases on them, and for some requests the latency is up to 40 s. I really think that is not acceptable.
The Elasticsearch version is 2.1.1 with 100 shards. The problem affects only alias operations (create/delete); other operations such as indexing and searching work well. In fact, I have a web application with more than 500000 users, and I implement a routing policy by creating an alias for each user. For each user request I check whether the alias has been created for that user. If there is no alias for the user, I send an alias creation request. What is your suggestion?
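For reference, a per-user alias of the kind described here is typically created with an `add` action sent to the `_aliases` endpoint. The sketch below only builds the request body and sends nothing. The index name `users`, the field name `user_id`, and the `user_<id>` alias naming are assumptions for illustration; the thread does not state the real names.

```python
import json


def alias_add_action(index, user_id):
    """Build one `add` action for POST /_aliases (filter plus routing)."""
    return {
        "add": {
            "index": index,
            "alias": f"user_{user_id}",  # hypothetical naming scheme
            "filter": {"term": {"user_id": user_id}},  # assumed field name
            "routing": str(user_id),
        }
    }


body = {"actions": [alias_add_action("users", 12)]}
print(json.dumps(body, indent=2))
```

Each such request changes the cluster state, which is why its cost depends on how many aliases already exist.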

@jasontedor
Member

> The Elasticsearch version is 2.1.1 with 100 shards.

On two nodes, for a single index? If my reading of this situation is correct, that strikes me as another anti-pattern.

> What is your suggestion?

I think that you should use filters directly on each request. With the information that you've given so far, I doubt that the routing is necessary.

@malaki12003
Author

I have 1 index, 100 shards and a routing policy based on user ids like the following example:
https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-aliases.html#_examples_2
There are more than 500000 users in my system. So I absolutely need the routing parameter. According to this information, what is your suggestion?

@jasontedor
Member

> I have 1 index, 100 shards and a routing policy based on user ids like the following example: https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-aliases.html#_examples_2

That is way too many shards for a single index on one or two machines. You'll be fine if you drop this to something more reasonable like two shards. As a bonus, you can almost surely drop the custom routing.

I'm also sorry that the documentation has led you astray here. :(

> There are more than 500000 users in my system. So I absolutely need the routing parameter.

I don't think that you do, but I could be wrong. Why do you think that you do?

> According to this information, what is your suggestion?

I think that you should drop the number of shards to two, I think that you do not need to use custom routing, and I think that instead of using aliases you should use a filter.

@clintongormley

> There are more than 500000 users in my system.

@malaki12003 we have long spoken about faking an index-per-user using aliases, but this phrasing was ill chosen as it makes people believe that aliases are free and will scale infinitely.

Unfortunately the truth is more prosaic. As you have found, alias creation scales linearly. Frankly, I'm impressed that you got to 100k! In earlier versions we struggled to get over 10k.

This model works with small numbers of "users" (perhaps we should talk about index-per-tenant instead?) but at the scale you're talking about, you'll have to take a different approach. The problem with aliases is that they are held in memory all the time on every node. But you don't have all 500k users on your site at the same time. Instead, you could move this logic client side and use the user_id to add a routing value and filter to every request.
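A rough sketch of what the client-side approach could look like, in Python. It only builds the search path and body and sends nothing. The index name `users`, the field name `user_id`, and the helper `user_search` are assumptions for illustration, and the sketch presumes documents were indexed with the same routing value.

```python
import json
from urllib.parse import urlencode


def user_search(index, user_id, query):
    """Return (path, body) for a search scoped to a single user.

    The routing value limits the search to the shard holding that user's
    documents; the term filter excludes other users on the same shard.
    """
    path = f"/{index}/_search?" + urlencode({"routing": user_id})
    body = {
        "query": {
            "bool": {
                "must": query,
                "filter": {"term": {"user_id": user_id}},  # assumed field name
            }
        }
    }
    return path, body


path, body = user_search("users", 12, {"match": {"title": "invoice"}})
print(path)
print(json.dumps(body))
```

Nothing is stored in the cluster state per user here, so the cost of a request does not depend on the total number of users.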

Regarding the number of shards you have, i.e. 100: think about this carefully. You're saying that you plan to grow to a cluster with 100 nodes, just for the primary shards. Another 100-200 nodes for the replicas. Do you really plan on a cluster of 300 nodes? And you're sure that you will never reindex (e.g. to change your mappings) while growing from two nodes to 300?

My suggestion is to start with a more realistic number of shards like 5 or 10.
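The node counts above follow from simple arithmetic, assuming a fully scaled-out cluster with one shard copy per node. A small Python sketch, where `nodes_at_one_shard_per_node` is a hypothetical helper:

```python
def nodes_at_one_shard_per_node(primaries, replicas):
    """Nodes needed if every primary and every replica copy gets its own node."""
    return primaries * (1 + replicas)


for replicas in (0, 1, 2):
    print(replicas, nodes_at_one_shard_per_node(100, replicas))
```

With 100 primaries this gives 100 nodes with no replicas, 200 with one replica, and 300 with two, which is where the 300-node figure comes from.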
