
Resend failed shard messages when receiving a cluster state still referring to the failed shards #6881

Closed

bleskes
Contributor

@bleskes bleskes commented Jul 15, 2014

In rare cases we may fail to send a shard failure event to the master, or there may be no known master at the time the shard fails (e.g., a couple of nodes leave the cluster, canceling recoveries and causing the master to step down at the same time). When that happens and a cluster state arrives from the (new) master that still assigns the failed shard to this node, we should resend the shard failure so that the master removes the shard from this node.
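To make the idea concrete, here is a minimal, language-neutral sketch in Python. It is not the code from this pull request, and every name in it (`Node`, `ClusterState`, `send_to_master`, `failed_shards`) is hypothetical. It assumes the node remembers each locally failed shard and, on every incoming cluster state, resends the failure for any shard the master still assigns to it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Set, Tuple

ShardId = Tuple[str, int]  # (index name, shard number); hypothetical representation


@dataclass
class ClusterState:
    master: Optional[str]                 # None when no master is known
    assignments: Dict[str, Set[ShardId]]  # node id -> shards the master assigns to it


@dataclass
class Node:
    node_id: str
    # Hypothetical transport hook: (master id, shard, reason) -> None
    send_to_master: Callable[[str, ShardId, str], None]
    # Shards that failed locally, with the failure reason
    failed_shards: Dict[ShardId, str] = field(default_factory=dict)

    def fail_shard(self, shard: ShardId, reason: str, state: ClusterState) -> None:
        """Record a local shard failure and try to report it right away."""
        self.failed_shards[shard] = reason
        self._try_send(shard, state)

    def _try_send(self, shard: ShardId, state: ClusterState) -> None:
        if state.master is None:
            return  # no known master; the next cluster state triggers a resend
        try:
            self.send_to_master(state.master, shard, self.failed_shards[shard])
        except ConnectionError:
            pass  # send failed; the next cluster state triggers a resend

    def on_cluster_state(self, state: ClusterState) -> None:
        """Resend failures for shards the master still assigns to this node."""
        assigned = state.assignments.get(self.node_id, set())
        for shard in list(self.failed_shards):
            if shard in assigned:
                self._try_send(shard, state)
            else:
                # The master no longer refers to the shard; stop tracking it.
                del self.failed_shards[shard]
```

In this sketch, a shard that fails while `state.master` is `None` is reported as soon as a cluster state with a master arrives, and the entry is dropped once a later state no longer assigns the shard to the node.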

…still referring to them

@bleskes bleskes added review and removed v1.4.0 labels Jul 15, 2014
@s1monw
Contributor

s1monw commented Jul 15, 2014

LGTM

@s1monw s1monw removed the review label Jul 15, 2014
@bleskes bleskes closed this in d869163 Jul 16, 2014
bleskes added a commit that referenced this pull request Jul 16, 2014
…that still refers to them


Closes #6881
bleskes added a commit that referenced this pull request Jul 16, 2014
…that still refers to them


Closes #6881
@bleskes bleskes deleted the resend_fail_shard branch July 16, 2014 08:08
@clintongormley clintongormley changed the title from "[Infra] re-send failed shard messages when receiving a cluster state still referring to them" to "Resiliency: Resend failed shard messages when receiving a cluster state still referring to the failed shards" Jul 16, 2014
@clintongormley clintongormley changed the title from "Resiliency: Resend failed shard messages when receiving a cluster state still referring to the failed shards" to "Resend failed shard messages when receiving a cluster state still referring to the failed shards" Jun 7, 2015
mute pushed a commit to mute/elasticsearch that referenced this pull request Jul 29, 2015
…that still refers to them


Closes elastic#6881
@clintongormley clintongormley added :Distributed/Distributed and removed :Cluster labels Feb 13, 2018
Labels
:Distributed/Distributed, >enhancement, resiliency, v1.3.0, v2.0.0-beta1
3 participants