Cluster state delay can cause endless index request loop #12573

Closed
brwe opened this Issue Jul 31, 2015 · 3 comments


@brwe
Contributor
brwe commented Jul 31, 2015

When a primary is relocating from node_1 to node_2, there can be a short window in which the old primary has already been removed on node_1 (closed, not deleted) while the new primary on node_2 is still in POST_RECOVERY. In this state, indexing requests can be sent back and forth between node_1 and node_2 endlessly.

Course of events:

  1. primary ([index][0]) relocates from node_1 to node_2

  2. node_2 is done recovering, moves its shard to IndexShardState.POST_RECOVERY and sends a message to master that the shard is ShardRoutingState.STARTED

    Cluster state 1: 
    node_1: [index][0] RELOCATING (ShardRoutingState), (STARTED from IndexShardState perspective on node_1) 
    node_2: [index][0] INITIALIZING (ShardRoutingState), (at this point already POST_RECOVERY from IndexShardState perspective on node_2) 
    
  3. master receives shard started and updates cluster state to:

    Cluster state 2: 
    node_1: [index][0] no shard 
    node_2: [index][0] STARTED (ShardRoutingState), (at this point still in POST_RECOVERY from IndexShardState perspective on node_2) 
    

    master sends this to node_1 and node_2

  4. node_1 receives the new cluster state and removes its shard because it is not allocated on node_1 anymore

  5. a document is indexed; the request arrives at node_1

At this point node_1 is already on cluster state 2 and no longer has the shard, so it forwards the request to node_2. But node_2 is behind on cluster state processing: it is still on cluster state 1, so it has the shard in IndexShardState.POST_RECOVERY and believes node_1 holds the primary, and it sends the request back to node_1. This goes on until node_2 finally catches up and processes cluster state 2, or until both nodes run out of memory.
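
To illustrate the loop, here is a small, self-contained simulation. This is a hypothetical sketch, not Elasticsearch code: the Node class, its primaryOwner field, and the route method are made-up names standing in for the local cluster state lookup and request forwarding that TransportReplicationAction performs on each node.

    import java.util.HashMap;
    import java.util.Map;

    public class RelocationLoopSketch {

        enum IndexShardState { STARTED, POST_RECOVERY, CLOSED }

        static class Node {
            final String name;
            final String primaryOwner;        // who the local (possibly stale) cluster state says holds the primary
            final IndexShardState localShard; // state of the local shard copy, if any

            Node(String name, String primaryOwner, IndexShardState localShard) {
                this.name = name;
                this.primaryOwner = primaryOwner;
                this.localShard = localShard;
            }

            // Route an index request using only this node's local cluster state.
            String route(Map<String, Node> cluster, int hops) {
                if (hops > 5) {
                    return "gave up after " + hops + " hops (the real cluster has no such limit)";
                }
                if (name.equals(primaryOwner) && localShard == IndexShardState.STARTED) {
                    return name + " executes the request";
                }
                // Local copy missing or not usable: forward to whoever the local state names as primary.
                System.out.println(name + " forwards [index][0] request to " + primaryOwner);
                return cluster.get(primaryOwner).route(cluster, hops + 1);
            }
        }

        public static void main(String[] args) {
            // node_1 has applied cluster state 2: its shard copy is removed, primary is on node_2.
            Node node1 = new Node("node_1", "node_2", IndexShardState.CLOSED);
            // node_2 is still on cluster state 1: primary is on node_1, its own copy is in POST_RECOVERY.
            Node node2 = new Node("node_2", "node_1", IndexShardState.POST_RECOVERY);

            Map<String, Node> cluster = new HashMap<>();
            cluster.put("node_1", node1);
            cluster.put("node_2", node2);

            System.out.println(node1.route(cluster, 0)); // node_1 -> node_2 -> node_1 -> ...
        }
    }

Running this prints node_1 and node_2 forwarding the request to each other until the artificial hop limit is reached; the real cluster has no such limit, so the bouncing continues until node_2 applies cluster state 2 or memory is exhausted.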

I will open a pull request with a test shortly.

@brwe
Contributor
brwe commented Jul 31, 2015

Here is a test that reproduces this: #12574.

@clintongormley
Member

I think this will be closed by #15900

@ywelsch
Contributor
ywelsch commented Jan 27, 2016

I've opened #16274 to address this issue.

@ywelsch added a commit that closed this issue Feb 2, 2016
@ywelsch Prevent TransportReplicationAction to route request based on stale local routing table

Closes #16274
Closes #12573
Closes #12574
af1f637
@ywelsch closed this in af1f637 Feb 2, 2016
@bleskes added a commit that referenced this issue Apr 7, 2016
@bleskes Update resliency page
#14252, #7572, #15900, #12573, #14671, #15281 and #9126 have all been closed/merged and will be part of 5.0.0.
557a3d1
@bleskes added a commit that referenced this issue Apr 7, 2016
@bleskes Update resiliency page (#17586)
#14252, #7572, #15900, #12573, #14671, #15281 and #9126 have all been closed/merged and will be part of 5.0.0.
8eee28e
@ywelsch added a commit to ywelsch/elasticsearch that referenced this issue Jul 7, 2016
@ywelsch Prevent TransportReplicationAction to route request based on stale local routing table

Closes #16274
Closes #12573
Closes #12574
e517829
@ywelsch added a commit to ywelsch/elasticsearch that referenced this issue Jul 7, 2016
@ywelsch Prevent TransportReplicationAction to route request based on stale local routing table

Closes #16274
Closes #12573
Closes #12574
7f14f4b