Ability to decouple a document shard allocation with its ID #28206

antonpious · 2018-01-13T01:06:18Z

The current logic of allocating a document to a shard in a node by its Id should be enhanced to have another id for shard allocation and provide an api to allow a document to be directly allocated to a shard.

The reason being, elastic search provides both search as well as the ability to get by id.
Lets assume that we have two documents which are colors namely red and blue. The are 5 shards in the index and there are 5 nodes and each node has 1 shard. Now based on the id of these documents the document went to the shard no 2 which is present in node no2 got the document.

We are now having the searches and get by ids from elastic search. The get by id is trying to get the red and blue document at large scale along with the searches. The node 2 would now receive a high amount of request as the documents are present only in the node. The same would happen if the search always tries to get the same set of documents.

As the no. of documents that are being fetched by id on this node becomes higher and higher the node peaks in either memory or cpu leading to the node becoming stressed which in-turn slows the searches too as now the node doesn't have the resources to provide results for the search request too, leading to a gradual degradation of the cluster even if one node in the cluster peaks. If we had the ability to move certain documents to other nodes in the cluster having the other shards we could better utilize the resources of the entire cluster.

One node causing the cluster imbalance is described in the issue #27622

DaveCTurner · 2018-01-17T10:21:42Z

It sounds to me that you're trying to use the document id for something quite different from what it's designed for: by design, the id defines the shard to which each document belongs. You can add other identifiers to your documents and use these to find them, and this would allow Elasticsearch to spread the load more evenly across the nodes.

antonpious · 2018-01-19T05:50:45Z

@DaveCTurner it could be anything, even if we use another field like color within the document, the fact that this document say color=red, would always be in the same shard and the calls to obtain this document would go to the same node is the cause of the issue. If we now have X Documents identified by this field then the calls would always go to this node. So there should be another lever for us to move the document to other nodes for better balancing. Unless of course this can be incorporated in the product once the feature is in to automatically move it to a different shard for better allocation

On a separate note:

the id defines the shard to which each document belongs

I am not sure if this is how other software would see it, and Id is the id of the document, elastic chose to implement the shard logic which is unfortunate. As per the data store is concerned, it asked for the document to be inserted while inserting the document it asked for an id, so an id was provided, in the event an id is not provided an automatic identifier was given. This is the similar logic for a data table.

DaveCTurner · 2018-01-19T08:44:35Z

I am not sure if this is how other software would see it, and Id is the id of the document, elastic chose to implement the shard logic which is unfortunate.

This is reasonably standard, and a quick survey of other sharded datastores will turn up similar design decisions. If you have an external identifier that you don't want to have an impact on the allocation of documents then you should simply store it in a different field.

ywelsch · 2018-01-19T09:17:43Z

Maybe also have a look at the _routing field: https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-routing-field.html

antonpious · 2018-01-22T02:25:37Z

@ywelsch , This feature can only say at put time where the document should go, what is needed is the ability to move the document to a different shard after putting, in this case, we need to delete the document and then try to reindex it again by giving a different routing field value.

What is needed is a fundamental decoupling between documents and the shard allocation.
Shard placement is an internal feature of certain stores and should not be exposed to the clients, the clients are working with the data stores and to them they can identify a document by what they want to identify. The data stores should take this hit of where it needs to be placed for better resource management in its cluster.
Like @DaveCTurner said, asking the client to pass could be a first step but it need to move to the next generation to incorporate this as part of the product.

So the system should be intelligent to keep track of how many documents are going to shards and either inject the shard logic if most documents are going to the same shard or to move documents seamlessly between shards if the get loads on a particular node is heavy. This meta data state management should be part of the system and not pass on this complexity to the clients.

bleskes · 2018-03-20T11:10:05Z

Shard placement is an internal feature of certain stores and should not be exposed to the clients, the clients are working with the data stores and to them they can identify a document by what they want to identify

I agree with the sentiment. Shards are an internal detail and from a user perspective the documents are store in an in index. How to a document is assigned to a shard is internal to the system (although we expose some tooling to allow people to optimize some use cases). Elasticsearch choice of shard is designed as a function of the document id. Once placed, documents are never re-assigned a shard as that is a very costly operation (unlike a simple key value store).

Instead of discussing abstract system design, I would love to hear the specific data setup and how you feel it goes wrong when it comes to single document assignment. The issue you opened and linked to concerns how shards are allocated, which is a valid concern. I'm going to close this issue for now. We can open discussion on the forums if you want to discuss things further.

antonpious mentioned this issue Jan 13, 2018

Ability to have a memory and cpu water mark for shard movement similar to the disk water mark #27622

Closed

DaveCTurner added discuss :Allocation labels Jan 17, 2018

DaveCTurner removed the discuss label Jan 19, 2018

colings86 added the feedback_needed label Jan 19, 2018

lcawl added :Search/Search Search-related issues that do not fall into other categories and removed :Allocation labels Feb 13, 2018

clintongormley added :Distributed/Distributed A catch all label for anything in the Distributed Area. If you aren't sure, use this one. and removed :Search/Search Search-related issues that do not fall into other categories labels Feb 13, 2018

bleskes closed this as completed Mar 20, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Ability to decouple a document shard allocation with its ID #28206

Ability to decouple a document shard allocation with its ID #28206

antonpious commented Jan 13, 2018 •

edited

DaveCTurner commented Jan 17, 2018

antonpious commented Jan 19, 2018 •

edited

DaveCTurner commented Jan 19, 2018

ywelsch commented Jan 19, 2018

antonpious commented Jan 22, 2018

bleskes commented Mar 20, 2018

Ability to decouple a document shard allocation with its ID #28206

Ability to decouple a document shard allocation with its ID #28206

Comments

antonpious commented Jan 13, 2018 • edited

DaveCTurner commented Jan 17, 2018

antonpious commented Jan 19, 2018 • edited

DaveCTurner commented Jan 19, 2018

ywelsch commented Jan 19, 2018

antonpious commented Jan 22, 2018

bleskes commented Mar 20, 2018

antonpious commented Jan 13, 2018 •

edited

antonpious commented Jan 19, 2018 •

edited