storage_service: Abort restore_replica_count when node is removed from the cluster

Consider the following procedure:

- n1, n2, n3
- n3 is down
- n1 runs `nodetool removenode uuid_of_n3` to remove n3 from the
  cluster
- n1 goes down in the middle of the removenode operation

During the removenode operation, node n1 sets n3 to the removing gossip
status. Whenever existing nodes learn that a node is in the removing
gossip status, they call restore_replica_count to stream data from other
nodes for the ranges that n3 would lose if it were removed from the
cluster. If the streaming fails, it sleeps and retries. The current
maximum number of retry attempts is 5. The sleep interval starts at 60
seconds and is multiplied by 1.5 after each sleep.
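
To make the retry timing concrete, here is a minimal sketch of that
backoff schedule in plain C++. The function names and the `sleeps`
parameter are hypothetical and not taken from the patch; the commit
message gives the retry limit (5) but not the exact number of sleeps, so
the count is left to the caller.

```cpp
#include <cassert>
#include <chrono>
#include <vector>

using seconds_d = std::chrono::duration<double>;

// Returns the sleep interval used before each retry. The first interval is
// `initial`; every later one is the previous interval times `factor`.
// `sleeps` is an assumption of this sketch, not a value from the patch.
std::vector<seconds_d> retry_sleep_schedule(seconds_d initial, double factor, int sleeps) {
    assert(factor >= 1.0 && sleeps >= 0);
    std::vector<seconds_d> schedule;
    seconds_d interval = initial;
    for (int i = 0; i < sleeps; ++i) {
        schedule.push_back(interval);
        interval *= factor;
    }
    return schedule;
}

// Sums a schedule, in seconds.
double total_seconds(const std::vector<seconds_d>& schedule) {
    double total = 0;
    for (const auto& d : schedule) {
        total += d.count();
    }
    return total;
}
```

Assuming five sleeps, `retry_sleep_schedule(std::chrono::seconds(60), 1.5, 5)`
yields 60, 90, 135, 202.5 and 303.75 seconds, or 791.25 seconds of
sleeping in total, on top of the time spent in the streaming attempts
themselves.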

This can leave the cluster in a bad state. For example, nodes can run
out of disk space if the streaming continues. We need a way to abort
such streaming attempts.

To abort the removenode operation and forcibly remove the node, users
can run `nodetool removenode force` on any existing node to move the
node from the removing gossip status to the removed gossip status.
However, this does not abort restore_replica_count.

This patch adds a status checker to restore_replica_count, so that
restore_replica_count is aborted once the node is in the removed gossip
status.
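
The polling pattern behind the status checker can be sketched outside
Seastar with the standard library alone. Everything below is a
simplified stand-in, not the patch's code: `abort_flag`,
`run_status_checker`, the `get_status` callback and the "removed" string
are hypothetical names, and the real implementation runs in a
seastar::async context and uses the gossiper and sleep_abortable.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <string>

// Minimal stand-in for an abort source: a flag plus a wakeup for sleepers.
class abort_flag {
    std::mutex _m;
    std::condition_variable _cv;
    bool _aborted = false;
public:
    void request_abort() {
        {
            std::lock_guard<std::mutex> g(_m);
            _aborted = true;
        }
        _cv.notify_all();
    }
    bool abort_requested() {
        std::lock_guard<std::mutex> g(_m);
        return _aborted;
    }
    // Sleeps for up to `d`, returning early if an abort is requested.
    void sleep_abortable(std::chrono::milliseconds d) {
        std::unique_lock<std::mutex> l(_m);
        _cv.wait_for(l, d, [this] { return _aborted; });
    }
};

// Polls the node status until it reads "removed" or an abort is requested
// elsewhere. Returns true if the checker itself requested the abort.
bool run_status_checker(const std::function<std::string()>& get_status,
                        abort_flag& as,
                        std::chrono::milliseconds poll_interval) {
    assert(poll_interval.count() > 0);
    while (!as.abort_requested()) {
        if (get_status() == "removed") {
            // The node was removed (for example, forcibly): stop streaming.
            as.request_abort();
            return true;
        }
        as.sleep_abortable(poll_interval);
    }
    return false;
}
```

Because the sleep wakes up as soon as the flag is set, the checker also
exits promptly when streaming finishes and the caller requests an abort
to shut the checker down.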

This patch is for older releases that lack the new NODE_OPS_CMD
infrastructure, in which such an abort happens automatically on error.

Fixes scylladb#8651
asias committed May 14, 2021
1 parent 8480839 commit f0de3f2
Showing 1 changed file with 41 additions and 1 deletion.
42 changes: 41 additions & 1 deletion service/storage_service.cc
@@ -2870,7 +2870,13 @@ future<> storage_service::restore_replica_count(inet_address endpoint, inet_addr
     }
     return seastar::async([this, endpoint, notify_endpoint] {
         auto tmptr = get_token_metadata_ptr();
-        auto streamer = make_lw_shared<dht::range_streamer>(_db, tmptr, _abort_source, get_broadcast_address(), "Restore_replica_count", streaming::stream_reason::removenode);
+        abort_source as;
+        auto sub = _abort_source.subscribe([&as] {
+            if (!as.abort_requested()) {
+                as.request_abort();
+            }
+        });
+        auto streamer = make_lw_shared<dht::range_streamer>(_db, tmptr, as, get_broadcast_address(), "Restore_replica_count", streaming::stream_reason::removenode);
         auto my_address = get_broadcast_address();
         auto non_system_keyspaces = _db.local().get_non_system_keyspaces();
         for (const auto& keyspace_name : non_system_keyspaces) {
@@ -2888,6 +2894,40 @@ future<> storage_service::restore_replica_count(inet_address endpoint, inet_addr
             }
             streamer->add_rx_ranges(keyspace_name, std::move(ranges_per_endpoint));
         }
+        auto status_checker = seastar::async([this, endpoint, &as] {
+            slogger.info("restore_replica_count: Started status checker for removing node {}", endpoint);
+            while (!as.abort_requested()) {
+                auto status = _gossiper.get_gossip_status(endpoint);
+                // If the node to be removed is already in removed status, it has
+                // probably been removed forcibly with `nodetool removenode force`.
+                // Abort restore_replica_count in that case to avoid further
+                // streaming attempts, since the user has forcibly removed the node.
+                if (status == sstring(versioned_value::REMOVED_TOKEN)) {
+                    slogger.info("restore_replica_count: Detected node {} has left the cluster, status={}, abort restore_replica_count for removing node {}",
+                            endpoint, status, endpoint);
+                    if (!as.abort_requested()) {
+                        as.request_abort();
+                    }
+                    return;
+                }
+                slogger.debug("restore_replica_count: Sleep and detect removing node {}, status={}", endpoint, status);
+                sleep_abortable(std::chrono::seconds(10), as).get();
+            }
+        });
+        auto stop_status_checker = defer([endpoint, &status_checker, &as] () mutable {
+            try {
+                slogger.info("restore_replica_count: Started to stop status checker for removing node {}", endpoint);
+                if (!as.abort_requested()) {
+                    as.request_abort();
+                }
+                status_checker.get();
+            } catch (...) {
+                slogger.debug("restore_replica_count: Failed to stop status checker for removing node {}: {}",
+                        endpoint, std::current_exception());
+            }
+            slogger.info("restore_replica_count: Finished to stop status checker for removing node {}", endpoint);
+        });
+
         streamer->stream_async().then_wrapped([this, streamer, notify_endpoint] (auto&& f) {
             try {
                 f.get();
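
The first hunk replaces the service-wide _abort_source with a local
abort source that subscribes to it, so the status checker can abort this
one streaming operation without aborting the whole service. The one-way
chaining can be sketched as follows. `simple_abort_source` and `chain`
are hypothetical, single-threaded stand-ins and not Seastar's API: in
Seastar, subscribe() returns a subscription object whose lifetime
controls the callback, which this sketch omits.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Simplified, single-threaded stand-in for an abort source.
class simple_abort_source {
    bool _aborted = false;
    std::vector<std::function<void()>> _subscribers;
public:
    // Registers a callback to run once when an abort is requested.
    void subscribe(std::function<void()> cb) {
        assert(cb);
        _subscribers.push_back(std::move(cb));
    }
    // Idempotent: only the first call sets the flag and runs the callbacks.
    void request_abort() {
        if (_aborted) {
            return;
        }
        _aborted = true;
        for (auto& cb : _subscribers) {
            cb();
        }
    }
    bool abort_requested() const { return _aborted; }
};

// Links `child` to `parent`: aborting the parent aborts the child, but
// aborting the child leaves the parent untouched. `child` must outlive
// `parent`'s use of the callback.
void chain(simple_abort_source& parent, simple_abort_source& child) {
    parent.subscribe([&child] { child.request_abort(); });
}
```

With this arrangement, a local abort triggered by the status checker
stops only the streamer that was handed the local source, while a
service-wide abort still propagates to it.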