storage_service: raft topology: do not throw error from fence_previous_coordinator()

Throwing an error kills the topology coordinator monitor fiber. Instead, we retry the operation until it succeeds or the node loses its leadership. This is fine because the operation needs a quorum to succeed, and if the quorum is not available the node should relinquish its leadership.

Fixes #15728

(cherry picked from commit 65bf587)
scylladb · Oct 29, 2023 · 2aa2976
1 parent 24efacf
commit 2aa2976
Showing 1 changed file with 2 additions and 2 deletions.
diff --git a/service/storage_service.cc b/service/storage_service.cc
@@ -2399,7 +2399,7 @@ future<> topology_coordinator::fence_previous_coordinator() {
     // Topology state machine moves to state S while RPC R is still running.
     // If RPC is idempotent that should not be a problem since second one executed by B will do nothing,
     // but better to be safe and cut off previous write attempt
-    while (true) {
+    while (!_as.abort_requested()) {
         try {
             auto guard = co_await start_operation();
             topology_mutation_builder builder(guard.write_timestamp());
@@ -2410,8 +2410,8 @@ future<> topology_coordinator::fence_previous_coordinator() {
             continue;
         } catch (...) {
             slogger.error("raft topology: failed to fence previous coordinator {}", std::current_exception());
-            throw;
         }
+        co_await seastar::sleep_abortable(std::chrono::seconds(1), _as);
     }
 }
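
The retry-until-success-or-abort shape of the loop above can be sketched in plain standard C++, without Seastar. This is only an illustrative analogue, not the code from this commit: `abort_flag` and `retry_until_done_or_aborted` are hypothetical names, the abort flag stands in for `_as`, and the blocking `std::this_thread::sleep_for` stands in for `seastar::sleep_abortable` (unlike the real call, it is not interrupted when an abort is requested).

```cpp
#include <atomic>
#include <chrono>
#include <exception>
#include <functional>
#include <iostream>
#include <stdexcept>
#include <thread>

// Hypothetical stand-in for an abort source; assumed to be set by some other
// component when the node loses leadership.
struct abort_flag {
    std::atomic<bool> requested{false};
    bool abort_requested() const { return requested.load(); }
};

// Runs `op` repeatedly until it completes without throwing or until an abort
// is requested. Failures are logged and swallowed, so the caller is never
// torn down by an exception. Returns true on success, false if aborted.
bool retry_until_done_or_aborted(const std::function<void()>& op,
                                 const abort_flag& as,
                                 std::chrono::milliseconds backoff) {
    while (!as.abort_requested()) {
        try {
            op();
            return true;  // success: stop retrying
        } catch (const std::exception& e) {
            std::cerr << "operation failed, will retry: " << e.what() << '\n';
        } catch (...) {
            std::cerr << "operation failed, will retry: unknown error\n";
        }
        // Back off before the next attempt so a persistent failure
        // (for example, no quorum) does not turn into a busy loop.
        std::this_thread::sleep_for(backoff);
    }
    return false;  // abort requested before the operation succeeded
}
```

In this sketch, a caller would pass the fencing step as `op` and treat a `false` return as "leadership was lost, stop quietly". The design point mirrored from the commit is that the loop exits through the abort check rather than through a rethrown exception.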