Do not use a background thread to disconnect node which are removed from the ClusterState #7543

bleskes · 2014-09-02T21:29:01Z

After a node fails to respond to a ping correctly (master or node fault detection), it is removed from the cluster state through an UpdateTask. When a node is removed, a background task is scheduled on the generic threadpool to actually disconnect the node. However, in the case of a temporary node failure (for example), the node may have been re-added by the time the task gets executed, causing an untimely disconnect call. Disconnect is cheap and should be done during the UpdateTask.

…ing it (after a ping failure) After a node fails to respond to a ping correctly (master or node fault detection), it is removed from the cluster state through an UpdateTask. When a node is removed, a background task is scheduled on the generic threadpool to actually disconnect the node. However, in the case of a temporary node failure (for example), the node may have been re-added by the time the task gets executed. We should check for that.

kimchy · 2014-09-02T21:33:04Z

I think a simpler solution is just to remove the execution on a different thread; disconnect should be super cheap.

bleskes · 2014-09-02T21:38:39Z

@kimchy updated with another commit. I'll change the description before pushing (assuming no more feedback)

kimchy · 2014-09-02T21:49:58Z

just to be safe, I would wrap each disconnect in a try ... catch block, similar to the connect code before. Other than that, LGTM.

…e remove from the ClusterState After a node fails to respond to a ping correctly (master or node fault detection), it is removed from the cluster state through an UpdateTask. When a node is removed, a background task is scheduled on the generic threadpool to actually disconnect the node. However, in the case of a temporary node failure (for example), the node may have been re-added by the time the task gets executed, causing an untimely disconnect call. Disconnect is cheap and should be done during the UpdateTask. Closes #7543
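
For readers following the discussion, here is a minimal sketch of the approach the thread converged on: disconnect removed nodes on the thread that applies the cluster state update, with each disconnect wrapped in its own try/catch so one failure does not stop the rest. This is not the code from the commits. `DisconnectSketch`, the `Transport` interface, and the use of plain string node ids are all hypothetical stand-ins chosen to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;

final class DisconnectSketch {

    /** Hypothetical stand-in for the transport layer; not the Elasticsearch API. */
    interface Transport {
        void disconnectFromNode(String nodeId);
    }

    /**
     * Disconnects every removed node on the calling thread instead of
     * handing the work to a background pool. Because nothing is deferred,
     * a node that is re-added by a later update cannot be hit by a stale
     * disconnect. Returns the ids of nodes whose disconnect threw.
     */
    static List<String> disconnectRemovedNodes(Transport transport, List<String> removedNodeIds) {
        List<String> failed = new ArrayList<>();
        for (String nodeId : removedNodeIds) {
            try {
                transport.disconnectFromNode(nodeId);
            } catch (RuntimeException e) {
                // One failing disconnect must not prevent the others from running.
                failed.add(nodeId);
            }
        }
        return failed;
    }
}
```

The sketch collects failed node ids only so the behaviour is observable; a real implementation would more likely log the exception and continue.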

bleskes added v1.4.0 labels Sep 2, 2014

remove background execution altogether

a483a22

added try catch around disconnect

593b9d7

bleskes closed this in 1f8db67 Sep 3, 2014

bleskes changed the title ~~[Internal] verify node is no longer in ClusterState before disconnecting it (after a ping failure)~~ [Internal] Do not use a background thread to disconnect node which are removed from the ClusterState Sep 3, 2014

bleskes deleted the verify_before_disconnect_on_update_task branch September 3, 2014 06:53

clintongormley changed the title ~~[Internal] Do not use a background thread to disconnect node which are removed from the ClusterState~~ Internal: Do not use a background thread to disconnect node which are removed from the ClusterState Sep 8, 2014

clintongormley changed the title ~~Internal: Do not use a background thread to disconnect node which are removed from the ClusterState~~ Resiliency: Do not use a background thread to disconnect node which are removed from the ClusterState Sep 8, 2014

clintongormley added the resiliency label Sep 8, 2014

jpountz removed the review label Oct 21, 2014

clintongormley added the :Cluster label Jun 7, 2015

clintongormley changed the title ~~Resiliency: Do not use a background thread to disconnect node which are removed from the ClusterState~~ Do not use a background thread to disconnect node which are removed from the ClusterState Jun 7, 2015

clintongormley added the :Distributed Indexing/Distributed label and removed the :Cluster label Feb 13, 2018
