box: fix replica hang after applying already set name
When the name is set manually on the master by a replace in the _cluster
space, calling box.cfg on the replica with the same name causes the replica
to hang. The problem is that a resubscribe is initiated and the replica
starts waiting for the APPLIER_REGISTERED status. Since the applier knows
that no registration is needed, that status is never reached.

Do not initiate registration when the instance name is already set.

Needed for tarantool#8978

NO_DOC=bugfix
NO_CHANGELOG=not released yet
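
The decision described above can be sketched as a small standalone function. This is only an illustration under stated assumptions, not the code from box.cc: `NameAction` and `decide_name_action` are hypothetical names, `cfg_name` stands in for `cfg_instance_name`, and `instance_name` stands in for `INSTANCE_NAME`.

```cpp
#include <cstring>

// Hypothetical outcome of applying a configured instance name.
enum class NameAction {
	Nothing,       // cfg already holds this name, nothing to do
	UpdateCfgOnly, // instance already has the name, only sync the saved cfg value
	Register,      // name differs, resubscribe and wait for registration
};

// Illustrative stand-in for the checks at the top of box_set_instance_name().
// All parameter names are assumptions made for this sketch.
NameAction
decide_name_action(const char *cfg_name, const char *instance_name,
		   const char *new_name)
{
	if (std::strcmp(cfg_name, new_name) == 0)
		return NameAction::Nothing;
	// The name was already applied, e.g. by a manual replace on the master.
	// Starting a registration here would wait for a status that never comes.
	if (std::strcmp(instance_name, new_name) == 0)
		return NameAction::UpdateCfgOnly;
	return NameAction::Register;
}
```

In the scenario from the commit message, the replica has an empty configured name but its instance name is already 'replica', so the sketch returns `NameAction::UpdateCfgOnly` and no resubscribe is started.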
Serpentian committed Oct 26, 2023
1 parent e578835 commit 6324daa
Showing 2 changed files with 55 additions and 1 deletion.
9 changes: 9 additions & 0 deletions src/box/box.cc
@@ -2349,6 +2349,15 @@ box_set_instance_name(void)
 		diag_raise();
 	if (strcmp(cfg_instance_name, name) == 0)
 		return;
+	/**
+	 * It's possible, that the name is set on master by the manual replace.
+	 * Don't make all appliers to resubscribe in such case. Just update
+	 * the saved cfg_instance_name.
+	 */
+	if (strcmp(INSTANCE_NAME, name) == 0) {
+		strlcpy(cfg_instance_name, name, NODE_NAME_SIZE_MAX);
+		return;
+	}
 	char old_cfg_name[NODE_NAME_SIZE_MAX];
 	strlcpy(old_cfg_name, cfg_instance_name, NODE_NAME_SIZE_MAX);
 	auto guard = make_scoped_guard([&]{
47 changes: 46 additions & 1 deletion test/replication-luatest/instance_name_test.lua
@@ -2,7 +2,7 @@ local fio = require('fio')
 local replica_set = require('luatest.replica_set')
 local server = require('luatest.server')
 local t = require('luatest')
-local g = t.group()
+local g = t.group('with-names')
 
 local function wait_for_death(instance)
     t.helpers.retrying({}, function()
@@ -552,3 +552,48 @@ g.test_instance_name_new_uuid = function(lg)
         box.cfg{replication = replication}
     end, {lg.master.box_cfg.replication})
 end
+
+local g_no_name = t.group('no-names')
+
+g_no_name.before_all = function(lg)
+    lg.replica_set = replica_set:new({})
+    local box_cfg = {
+        replication_timeout = 0.1,
+        replication = {
+            server.build_listen_uri('master', lg.replica_set.id),
+            server.build_listen_uri('replica', lg.replica_set.id),
+        },
+    }
+
+    lg.master = lg.replica_set:build_and_add_server({
+        alias = 'master',
+        box_cfg = box_cfg,
+    })
+
+    box_cfg.read_only = true
+    lg.replica = lg.replica_set:build_and_add_server({
+        alias = 'replica',
+        box_cfg = box_cfg,
+    })
+
+    lg.replica_set:start()
+    lg.replica_set:wait_for_fullmesh()
+end
+
+g_no_name.after_all = function(lg)
+    lg.replica_set:drop()
+end
+
+-- Test, that replica doesn't hang on name apply.
+g_no_name.test_replica_hang = function(lg)
+    lg.master:exec(function()
+        box.space._cluster:update(2, {{'=', 3, 'replica'}})
+    end)
+
+    -- Wait for INSTANCE_NAME to be set.
+    lg.replica:wait_for_vclock_of(lg.master)
+    lg.replica:exec(function()
+        -- Test, that no hang happens.
+        box.cfg{instance_name = 'replica'}
+    end)
+end
