Concurrent node failures can cause unneeded cluster state publishing #8933

bleskes · 2014-12-12T19:54:56Z

When a node fails (or closes), the master processes the network disconnect event and removes the node from the cluster state. If multiple nodes fail (or shut down) in rapid succession, the master processes the events and removes the nodes one by one. During this process, the intermediate cluster states may cause node fault detection to signal the failure of nodes that have not yet been removed from the cluster state. This is harmless in itself, but it currently triggers unneeded reroutes and cluster state publishing, which can be costly in big clusters.

Closes #8804
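
To make the scenario concrete, here is a minimal toy model in Python. It is not Elasticsearch code: `Master`, `handle_node_failure`, and the node names are hypothetical, and the "skip if the node is already gone" check is only one plausible way to avoid the redundant work described above. It shows a failure event arriving twice for the same node while several nodes are being removed, and only the events that actually change the node set producing a new published state.

```python
from dataclasses import dataclass, field


@dataclass
class Master:
    """Toy stand-in for a master node (hypothetical, for illustration only)."""
    nodes: set
    version: int = 0
    published: list = field(default_factory=list)

    def handle_node_failure(self, node_id: str) -> bool:
        # A duplicate or stale failure event: the node was already removed,
        # so there is nothing to reroute and nothing new to publish.
        if node_id not in self.nodes:
            return False
        # A real removal: build the next state and publish it once.
        self.nodes = self.nodes - {node_id}
        self.version += 1
        self.published.append((self.version, frozenset(self.nodes)))
        return True


master = Master(nodes={"n1", "n2", "n3"})
# n2 and n3 fail in rapid succession; n3 is reported a second time
# (for example by the disconnect event and again by fault detection).
events = ["n2", "n3", "n3"]
results = [master.handle_node_failure(e) for e in events]
```

In this sketch the third event is ignored, so two cluster states are published instead of three.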

s1monw · 2014-12-15T11:41:07Z

LGTM

bleskes added resiliency >enhancement review v2.0.0-beta1 v1.5.0 v1.4.2 labels Dec 12, 2014

bleskes closed this in d62bf5f Dec 15, 2014

bleskes changed the title ~~Discovery: concurrent node failures can cause unneeded cluster state pubishing~~ Discovery: concurrent node failures can cause unneeded cluster state publishing Dec 15, 2014

clintongormley changed the title ~~Discovery: concurrent node failures can cause unneeded cluster state publishing~~ Discovery: Concurrent node failures can cause unneeded cluster state publishing Dec 16, 2014

clintongormley added :Distributed/Discovery-Plugins and removed review labels Mar 19, 2015

clintongormley changed the title ~~Discovery: Concurrent node failures can cause unneeded cluster state publishing~~ Concurrent node failures can cause unneeded cluster state publishing Jun 6, 2015
