Decide on Shard Replication Scheme #16

fhaynes · 2018-10-19T19:25:23Z

This relates to issue #14.

Elasticsearch Shards

This section describes the basics of Elasticsearch (ES) sharding.

Primary Shards

In ES, an index is created with a fixed number of Primary shards (5 is common). Primary shards are writable: when a document is written to an index, one Primary shard is chosen to receive it, by default by hashing the document's routing value (its ID unless overridden).

The number of Primary shards for an index cannot be changed after creation, because the shard that receives a document is derived from that number.
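To make the fixed-shard-count constraint concrete, here is a minimal sketch of hash-based routing. The function name `shard_for` and the use of MD5 are assumptions for illustration only; ES uses its own hash function, and MD5 stands in here simply because it is stable across processes.

```python
import hashlib

def shard_for(doc_id: str, num_primary_shards: int) -> int:
    """Map a document ID to a Primary shard number (hypothetical helper)."""
    # Python's built-in hash() is randomized per process, so use a stable digest.
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_primary_shards

# Route the same documents with 5 shards, then with 6.
doc_ids = [f"doc-{i}" for i in range(1000)]
before = {d: shard_for(d, 5) for d in doc_ids}
after = {d: shard_for(d, 6) for d in doc_ids}

# Count how many documents would now be looked up on a different shard.
moved = sum(1 for d in doc_ids if before[d] != after[d])
print(f"{moved} of {len(doc_ids)} documents map to a different shard")
```

With plain modulo routing, changing the shard count remaps most documents, which is why the count is fixed when the index is created.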

Replica Shards

Each Primary shard has 0 or more Replica shards. After a Primary shard writes data, the data is replicated to the Replica shards belonging to that Primary shard. Replica shards can handle read requests, but not writes. Ideally, each Replica shard is located on a different node/server from its Primary and from the other Replicas of that Primary.
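The placement rule can be sketched as follows. This is an illustrative round-robin scheme, not how ES or Toshi actually allocates shards; `place_shards` and the node names are hypothetical.

```python
def place_shards(num_primary, num_replicas, nodes):
    """Assign every copy of each shard to a distinct node (hypothetical helper)."""
    if num_replicas + 1 > len(nodes):
        raise ValueError("need at least num_replicas + 1 nodes")
    placement = {}
    for shard in range(num_primary):
        # Rotate the starting node per shard so Primaries are spread out.
        # The copies are distinct because num_replicas + 1 <= len(nodes).
        copies = [nodes[(shard + i) % len(nodes)] for i in range(num_replicas + 1)]
        placement[shard] = {"primary": copies[0], "replicas": copies[1:]}
    return placement

placement = place_shards(num_primary=5, num_replicas=2,
                         nodes=["node-a", "node-b", "node-c"])
for shard, copies in placement.items():
    print(shard, copies)
```

Losing any single node then removes at most one copy of each shard.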

Rebalancing

Rebalancing means redistributing shards among the cluster members. It is usually done in two situations.

Node Addition or Removal

If nodes are added to or removed from a cluster, shards should be rebalanced to distribute the load evenly. This can be done in many ways. With consistent hashing, the algorithm itself tells you what needs to be moved where.
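A minimal consistent-hashing sketch illustrates how the algorithm itself identifies what to move. The `HashRing` class, the virtual-node count, and the shard naming are all assumptions made for this example, and an empty ring is not handled.

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")

class HashRing:
    def __init__(self, nodes, vnodes=64):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (hash, node) points
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each node contributes several points to smooth the distribution.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def owner(self, shard_id: str) -> str:
        # First ring point clockwise from the shard's hash, wrapping around.
        idx = bisect.bisect_right(self._ring, (_hash(shard_id), "")) % len(self._ring)
        return self._ring[idx][1]

shards = [f"index-a/shard-{i}" for i in range(100)]
ring = HashRing(["node-a", "node-b", "node-c"])
before = {s: ring.owner(s) for s in shards}

ring.add("node-d")
after = {s: ring.owner(s) for s in shards}

# The diff between the two mappings is the rebalancing plan.
moves = {s: (before[s], after[s]) for s in shards if before[s] != after[s]}
print(f"{len(moves)} shards move, all of them to node-d")
```

Only shards whose owner changed need to move, and when a node is added they all move to that node; the rest of the cluster is untouched.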

With a leader architecture, a process will need to choose which shards to move where. This can be based on CPU, user-defined tags, memory usage or any other metadata we have about the state of the system.
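One way a leader process might pick a move from node metadata is sketched below. The `NodeStats` fields, the equal CPU/memory weighting in `score`, and the tag filter are assumptions for illustration, not a proposed design.

```python
from dataclasses import dataclass, field

@dataclass
class NodeStats:
    name: str
    cpu: float      # fraction of CPU in use, 0.0 to 1.0
    memory: float   # fraction of memory in use, 0.0 to 1.0
    tags: set = field(default_factory=set)     # user-defined tags
    shards: list = field(default_factory=list)

def score(node):
    """Lower is better; the equal weights are an arbitrary choice."""
    return 0.5 * node.cpu + 0.5 * node.memory

def plan_move(nodes, required_tag=None):
    """Return (shard, source, target), or None if no move is possible."""
    eligible = [n for n in nodes if required_tag is None or required_tag in n.tags]
    if len(eligible) < 2:
        return None
    # Busiest node that actually holds a shard, and the least busy node overall.
    source = max((n for n in eligible if n.shards), key=score, default=None)
    target = min(eligible, key=score)
    if source is None or source is target:
        return None
    return (source.shards[0], source.name, target.name)

nodes = [
    NodeStats("node-a", cpu=0.9, memory=0.8, tags={"ssd"}, shards=["s0", "s1"]),
    NodeStats("node-b", cpu=0.2, memory=0.3, tags={"ssd"}, shards=["s2"]),
    NodeStats("node-c", cpu=0.1, memory=0.1, tags={"hdd"}),
]
print(plan_move(nodes))
print(plan_move(nodes, required_tag="ssd"))
```

Unlike consistent hashing, the policy here is explicit, so it can take any available metadata into account at the cost of needing a single decision-maker.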

Hot Shard Problem

It is common for one node to become overloaded because it holds shards for popular indices. This may be a temporary situation or a longer-term issue. In this scenario, it is desirable to redistribute the hot shards to machines with less load.
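A rough sketch of hot-shard relief, assuming per-shard load (here, requests per second) is being measured somewhere. The function name, the threshold, and the sample numbers are all hypothetical.

```python
def relieve_hot_node(assignments, shard_load, threshold):
    """assignments: node -> list of shard ids; shard_load: shard id -> load.

    Returns (shard, source, target), or None if no move would help.
    """
    node_load = {n: sum(shard_load[s] for s in shards)
                 for n, shards in assignments.items()}
    hot = max(node_load, key=node_load.get)
    cold = min(node_load, key=node_load.get)
    if node_load[hot] <= threshold:
        return None
    # Try the hottest shards first, and only move one if the target ends up
    # less loaded than the source was; otherwise the hot spot just relocates.
    for shard in sorted(assignments[hot], key=shard_load.get, reverse=True):
        if node_load[cold] + shard_load[shard] < node_load[hot]:
            return (shard, hot, cold)
    return None

assignments = {
    "node-a": ["logs-0", "logs-1", "users-0"],
    "node-b": ["users-1"],
    "node-c": ["orders-0"],
}
shard_load = {"logs-0": 900, "logs-1": 700, "users-0": 50,
              "users-1": 60, "orders-0": 40}
print(relieve_hot_node(assignments, shard_load, threshold=1000))
```

A node whose load comes from a single very hot shard gets no move from this policy, since relocating that shard would only shift the problem; that case would need more replicas or a different approach.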

Toshi

I suggest using Consul both to track shard assignments and for leader election.
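As a rough illustration of what this could look like, the sketch below builds (but does not send) requests against Consul's KV HTTP endpoint, `/v1/kv/`. The `toshi/...` key layout, the JSON value format, and the agent address are assumptions, not an agreed design. Leader election here follows Consul's session-based pattern, where a KV write with `?acquire=<session>` succeeds for only one session at a time; the session ID would come from a prior call to `/v1/session/create`.

```python
import json

CONSUL = "http://127.0.0.1:8500"  # assumed local Consul agent address
PREFIX = "toshi"                  # hypothetical key prefix

def assignment_request(index, shard, primary, replicas):
    """Build the request that records where one shard's copies live."""
    key = f"{PREFIX}/indices/{index}/shards/{shard}"
    body = json.dumps({"primary": primary, "replicas": replicas}, sort_keys=True)
    return ("PUT", f"{CONSUL}/v1/kv/{key}", body)

def leader_request(session_id, node):
    """Build the request that tries to take the leader lock for `node`."""
    # Consul answers true to the session that gets the lock, false to others.
    return ("PUT", f"{CONSUL}/v1/kv/{PREFIX}/leader?acquire={session_id}", node)

print(assignment_request("logs", 0, "node-a", ["node-b", "node-c"]))
print(leader_request("example-session-id", "node-a"))
```

Keeping one key per shard means a rebalancing step touches only the keys for the shards that move.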

fhaynes mentioned this issue Oct 19, 2018

Cluster Tracking Issue #14

Closed

17 tasks

fhaynes closed this as completed Dec 8, 2018
