Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not use a background thread to disconnect node which are removed from the ClusterState #7543

Closed

Conversation

Projects
None yet
4 participants
@bleskes
Copy link
Member

commented Sep 2, 2014

After a node fails to respond to a ping correctly (master or node fault detection), they are removed from the cluster state through an UpdateTask. When a node is removed, a background task is scheduled using the generic threadpool to actually disconnect the node. However, in the case of temporary node failures (for example) it may be that the node was re-added by the time the task get executed, causing an untimely disconnect call. Disconnect is cheep and should be done during the UpdateTask.

[Internal] verify node is no longer in ClusterState before disconnect…
…ing it (after a ping failure)

After a node fails to respond to a ping correctly (master or node fault detection), they are removed from the cluster state through an UpdateTask. When a node is removed, a background task is scheduled using the generic threadpool to actually disconnect the node. However, in the case of temporary node failures (for example) it may be that the node was re-added by the time the task get executed. We should check for that.

@bleskes bleskes added v1.4.0 labels Sep 2, 2014

@kimchy

This comment has been minimized.

Copy link
Member

commented Sep 2, 2014

I think simpler solution is just to remove the execution on a different thread, disconnect should be super cheap

@bleskes

This comment has been minimized.

Copy link
Member Author

commented Sep 2, 2014

@kimchy updated with another commit. I'll change the description before pushing (assuming no more feedback)

@kimchy

This comment has been minimized.

Copy link
Member

commented Sep 2, 2014

just to be safe, I would wrap each disconnect in a try ... catch block, similar to the connect code before. Other than that, LGTM.

@bleskes bleskes closed this in 1f8db67 Sep 3, 2014

bleskes added a commit that referenced this pull request Sep 3, 2014

[Internal] Do not use a background thread to disconnect node which ar…
…e remove from the ClusterState

After a node fails to respond to a ping correctly (master or node fault detection), they are removed from the cluster state through an UpdateTask. When a node is removed, a background task is scheduled using the generic threadpool to actually disconnect the node. However, in the case of temporary node failures (for example) it may be that the node was re-added by the time the task get executed, causing an untimely disconnect call. Disconnect is cheep and should be done during the UpdateTask.

Closes #7543

@bleskes bleskes changed the title [Internal] verify node is no longer in ClusterState before disconnecting it (after a ping failure) [Internal] Do not use a background thread to disconnect node which are removed from the ClusterState Sep 3, 2014

@bleskes bleskes deleted the bleskes:verify_before_disconnect_on_update_task branch Sep 3, 2014

bleskes added a commit that referenced this pull request Sep 8, 2014

[Internal] Do not use a background thread to disconnect node which ar…
…e remove from the ClusterState

After a node fails to respond to a ping correctly (master or node fault detection), they are removed from the cluster state through an UpdateTask. When a node is removed, a background task is scheduled using the generic threadpool to actually disconnect the node. However, in the case of temporary node failures (for example) it may be that the node was re-added by the time the task get executed, causing an untimely disconnect call. Disconnect is cheep and should be done during the UpdateTask.

Closes #7543

@clintongormley clintongormley changed the title [Internal] Do not use a background thread to disconnect node which are removed from the ClusterState Internal: Do not use a background thread to disconnect node which are removed from the ClusterState Sep 8, 2014

@clintongormley clintongormley changed the title Internal: Do not use a background thread to disconnect node which are removed from the ClusterState Resiliency: Do not use a background thread to disconnect node which are removed from the ClusterState Sep 8, 2014

@jpountz jpountz removed the review label Oct 21, 2014

@clintongormley clintongormley changed the title Resiliency: Do not use a background thread to disconnect node which are removed from the ClusterState Do not use a background thread to disconnect node which are removed from the ClusterState Jun 7, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.