Abort restore_replica_count when node is removed from the cluster #8651
asias added a commit to asias/scylla that referenced this issue on May 14, 2021
Abort restore_replica_count when node is removed from the cluster

Consider the following procedure:

- n1, n2, n3
- n3 is down
- n1 runs `nodetool removenode uuid_of_n3` to remove n3 from the cluster
- n1 goes down in the middle of the removenode operation

Node n1 sets n3 to the removing gossip status during the removenode operation. Whenever existing nodes learn that a node is in the removing gossip status, they call restore_replica_count to stream data from other nodes for the ranges n3 loses once it is removed from the cluster. If the streaming fails, it sleeps and retries. The current maximum number of retry attempts is 5; the sleep interval starts at 60 seconds and grows 1.5x per sleep.

This can leave the cluster in a bad state. For example, nodes can run out of disk space if the streaming continues. We need a way to abort such streaming attempts.

To abort the removenode operation and forcibly remove the node, users can run `nodetool removenode force` on any existing node to move the node from the removing gossip status to the removed gossip status. However, restore_replica_count will not be aborted.

In this patch, a status checker is added to restore_replica_count, so that once the node is in the removed gossip status, restore_replica_count is aborted. This patch is for older releases that lack the new NODE_OPS_CMD infrastructure, under which such an abort happens automatically on error.

Fixes scylladb#8651
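The patch is easiest to picture as a guard inside the retry loop. Below is a minimal, standalone C++ sketch of that pattern, assuming the retry policy described in the message above; `gossip_status`, `node_status`, `try_stream_ranges`, and `restore_replica_count_sketch` are illustrative stand-ins, not Scylla's actual code:

```cpp
#include <atomic>
#include <chrono>
#include <cstdio>
#include <thread>

// Hypothetical stand-in for the removed node's status as seen via gossip.
enum class gossip_status { normal, removing, removed };

// `nodetool removenode force` would move this from removing to removed.
std::atomic<gossip_status> node_status{gossip_status::removing};

// One streaming attempt for the ranges the removed node loses.
// Always fails here so the retry/abort path is exercised.
bool try_stream_ranges() {
    return false;
}

// Retry loop in the shape the issue describes: up to 5 attempts, sleeping
// 60 s, 90 s, 135 s, ... between them, plus the status check the patch adds
// so the loop aborts once the node reaches the removed gossip status.
bool restore_replica_count_sketch() {
    auto sleep_time = std::chrono::seconds(60);
    for (int attempt = 1; attempt <= 5; ++attempt) {
        if (node_status.load() == gossip_status::removed) {
            std::puts("node already removed; aborting restore_replica_count");
            return false;
        }
        if (try_stream_ranges()) {
            return true;
        }
        std::printf("streaming attempt %d failed; sleeping %lld s\n",
                    attempt, static_cast<long long>(sleep_time.count()));
        std::this_thread::sleep_for(sleep_time);
        sleep_time = sleep_time * 3 / 2; // grow the interval 1.5x per sleep
    }
    return false;
}

int main() {
    // Simulate an operator running `nodetool removenode force` mid-retry.
    // Note: the demo uses the real 60 s intervals, so it runs for minutes.
    std::thread operator_action([] {
        std::this_thread::sleep_for(std::chrono::seconds(70));
        node_status.store(gossip_status::removed);
    });
    restore_replica_count_sketch();
    operator_action.join();
}
```

In the real implementation the status would come from gossip rather than a local atomic; the sketch only shows where such a check slots into the retry loop so that `nodetool removenode force` actually stops the retries.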
asias added a commit to asias/scylla that referenced this issue on May 18, 2021
Abort restore_replica_count when node is removed from the cluster (same commit message as the May 14 commit above)

Fixes scylladb#8651
asias added a commit to asias/scylla that referenced this issue on May 18, 2021
Abort restore_replica_count when node is removed from the cluster (same commit message as above)

Fixes scylladb#8651
tgrabiec pushed a commit that referenced this issue on May 18, 2021
Abort restore_replica_count when node is removed from the cluster (same commit message as above)

Fixes #8651
Closes #8655
@asias should this be backported? (is this a response to a problem in the field?)

Yes. I will send a PR for 2020.1 backport. Let's discuss there.
denesb pushed a commit to denesb/scylla that referenced this issue on Oct 20, 2021
Abort restore_replica_count when node is removed from the cluster (same commit message as above)

Fixes scylladb#8651
Closes scylladb#8655
(cherry picked from commit 0858619)
Backported for scylladb/scylla-enterprise#1745
Closes scylladb#1774
avikivity pushed a commit that referenced this issue on Nov 2, 2021
Abort restore_replica_count when node is removed from the cluster (same commit message as above)

Fixes #8651
Closes #8655
(cherry picked from commit 0858619)
avikivity pushed a commit that referenced this issue on Nov 2, 2021
Abort restore_replica_count when node is removed from the cluster (same commit message as above)

Fixes #8651
Closes #8655
(cherry picked from commit 0858619)
Backported to 4.4, 4.5.
Consider the following procedure:

- n1, n2, n3
- n3 is down
- n1 runs `nodetool removenode uuid_of_n3` to remove n3 from the cluster
- n1 goes down in the middle of the removenode operation
Node n1 sets n3 to the removing gossip status during the removenode operation. Whenever existing nodes learn that a node is in the removing gossip status, they call restore_replica_count to stream data from other nodes for the ranges n3 loses once it is removed from the cluster. If the streaming fails, it sleeps and retries. The current maximum number of retry attempts is 5, and the sleep interval starts at 60 seconds and grows 1.5x per sleep, as the sketch below illustrates.
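To put numbers on that policy, a small sketch of the sleep schedule it produces (assuming five sleeps of 60 seconds growing 1.5x each):

```cpp
#include <cstdio>

int main() {
    // The backoff described above: first sleep 60 s, growing 1.5x per sleep.
    double sleep_s = 60.0, total_s = 0.0;
    for (int i = 1; i <= 5; ++i) {
        std::printf("sleep %d: %6.1f s\n", i, sleep_s);
        total_s += sleep_s;
        sleep_s *= 1.5;
    }
    // ~791 s in total: the loop can keep retrying for roughly 13 minutes
    // (less if no sleep follows the final failed attempt).
    std::printf("total: %.1f s\n", total_s);
}
```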
This can leave the cluster in a bad state. For example, nodes can run out of disk space if the streaming continues. We need a way to abort such streaming attempts.
To abort the removenode operation and forcibly remove the node, users can run `nodetool removenode force` on any existing node to move the node from the removing gossip status to the removed gossip status. However, restore_replica_count will not be aborted.