Skip to content

Commit

Permalink
Merge 'raft_group0: trigger snapshot if existing snapshot index is 0'…
Browse files Browse the repository at this point in the history
… from Kamil Braun

The persisted snapshot index may be 0 if the snapshot was created in
older version of Scylla, which means snapshot transfer won't be
triggered to a bootstrapping node. Commands present in the log may not
cover all schema changes --- group 0 might have been created through the
upgrade upgrade procedure, on a cluster with existing schema. So a
deployment with index=0 snapshot is broken and we need to fix it. We can
use the new `raft::server::trigger_snapshot` API for that.

Also add a test.

Fixes scylladb#16683

Closes scylladb#17072

* github.com:scylladb/scylladb:
  test: add test for fixing a broken group 0 snapshot
  raft_group0: trigger snapshot if existing snapshot index is 0

(cherry picked from commit 181f68f)

Backport note: test_raft_fix_broken_snapshot had to be removed because
the "error injections enabled at startup" feature does not yet exist in
5.2.
  • Loading branch information
denesb authored and kbr-scylla committed Feb 1, 2024
1 parent ce0ed29 commit db58614
Showing 1 changed file with 19 additions and 1 deletion.
20 changes: 19 additions & 1 deletion service/raft/raft_group0.cc
Expand Up @@ -381,8 +381,26 @@ future<> raft_group0::start_server_for_group0(raft::group_id group0_id) {
// we ensure we haven't missed any IP update in the map.
load_initial_raft_address_map();
group0_log.info("Server {} is starting group 0 with id {}", my_id, group0_id);
co_await _raft_gr.start_server_for_group(create_server_for_group0(group0_id, my_id));
auto srv_for_group0 = create_server_for_group0(group0_id, my_id);
auto& persistence = srv_for_group0.persistence;
auto& server = *srv_for_group0.server;
co_await _raft_gr.start_server_for_group(std::move(srv_for_group0));
_group0.emplace<raft::group_id>(group0_id);

// Fix for scylladb/scylladb#16683:
// If the snapshot index is 0, trigger creation of a new snapshot
// so bootstrapping nodes will receive a snapshot transfer.
auto snap = co_await persistence.load_snapshot_descriptor();
if (snap.idx == raft::index_t{0}) {
group0_log.info("Detected snapshot with index=0, id={}, triggering new snapshot", snap.id);
bool created = co_await server.trigger_snapshot(&_abort_source);
if (created) {
snap = co_await persistence.load_snapshot_descriptor();
group0_log.info("New snapshot created, index={} id={}", snap.idx, snap.id);
} else {
group0_log.warn("Could not create new snapshot, there are no entries applied");
}
}
}

future<> raft_group0::join_group0(std::vector<gms::inet_address> seeds, bool as_voter) {
Expand Down

0 comments on commit db58614

Please sign in to comment.