
[docdb] Persist latest master config on TS #1542

Open
bmatican opened this issue Jun 13, 2019 · 2 comments

bmatican commented Jun 13, 2019

Jira Link: DB-1653
Right now, if master information changes, we need to update the config values across all TSs; otherwise, on restart, they will try whatever addresses they were initially configured with. This makes operations like moving masters require changing the config on every TS. We already have a mechanism to propagate master quorum changes through TS heartbeats, so TSs can be kept up to date dynamically.

On the master side, we persist the RaftConfig information, so that on startup we can read it and be able to:

  • effectively know who the other masters are
  • know we're part of this Raft group and initialize accordingly
  • ignore the master_addresses flag if it is invalid or does not match our persisted data (see the sketch after this list)
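
A minimal sketch of that startup ordering, with hypothetical helper names (LoadPersistedRaftConfig, ParseMasterAddressesFlag) standing in for the real persistence and gflag plumbing:

```cpp
// Hypothetical sketch, not the actual yb-master code: prefer the persisted
// Raft config over the --master_addresses flag when both are available.
#include <iostream>
#include <optional>
#include <string>
#include <vector>

struct RaftConfig {
  std::vector<std::string> peers;  // host:port of each master peer
};

// Stub: would read the persisted Raft config from local disk.
std::optional<RaftConfig> LoadPersistedRaftConfig() { return std::nullopt; }

// Stub: would parse the --master_addresses gflag.
std::vector<std::string> ParseMasterAddressesFlag() {
  return {"a:7100", "b:7100", "c:7100"};
}

std::vector<std::string> ResolveMasterPeers() {
  if (auto persisted = LoadPersistedRaftConfig()) {
    // Persisted data wins: we already know who the other masters are and
    // that we belong to this Raft group, so a stale flag can be ignored.
    return persisted->peers;
  }
  // First boot: no local state yet, so the flag is the only source.
  return ParseMasterAddressesFlag();
}

int main() {
  for (const auto& peer : ResolveMasterPeers()) std::cout << peer << "\n";
}
```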

On the TS side, however, we rely solely on the --tserver_master_addrs flag. If, for example, we specify it as empty, we get:

F20190613 16:48:18 ../../src/yb/tserver/tablet_server_options.cc:70] No masters were specified in the master addresses flag '', but a minimum of one is required.

One thing we should be able to do is:

  • start RF3
  • stop a TS
  • start that TS back up with an empty tserver_master_addrs flag and have it happily load the local data

The advanced option would be to support a full move:

  • start an RF3 on nodes ABC
  • add tservers DEF to the cluster, with ABC as master addresses
  • start masters in shell mode on DEF and dynamically move masters there (while removing ABC from the raft group)
  • restart any of the DEF tservers; they should load the locally persisted data pointing at the DEF masters, rather than using the config file list of ABC

We can discuss whether it makes sense to try the persisted data first and use the config values as a fallback; a sketch of that ordering is below.
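
A minimal sketch of that ordering on the TS side, assuming a hypothetical local state file and helper names; the real implementation would persist whatever master list arrives via heartbeat:

```cpp
// Hypothetical sketch, not actual yb-tserver code: persist the master list
// received via heartbeat, and on restart prefer it over the gflag.
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

const char kStatePath[] = "/tmp/ts_master_list";  // illustrative location

// Called when a heartbeat response carries an updated master quorum.
void PersistMasterList(const std::vector<std::string>& masters) {
  std::ofstream out(kStatePath, std::ios::trunc);
  for (const auto& m : masters) out << m << "\n";
}

// On startup: try the persisted list first, then --tserver_master_addrs.
std::vector<std::string> ResolveMasters(const std::string& flag_value) {
  std::vector<std::string> masters;
  std::ifstream in(kStatePath);
  for (std::string line; std::getline(in, line);) {
    if (!line.empty()) masters.push_back(line);
  }
  if (!masters.empty()) return masters;  // persisted data wins

  // Fallback: parse the comma-separated gflag value.
  std::stringstream ss(flag_value);
  for (std::string addr; std::getline(ss, addr, ',');) {
    if (!addr.empty()) masters.push_back(addr);
  }
  return masters;
}

int main() {
  PersistMasterList({"d:7100", "e:7100", "f:7100"});
  ResolveMasters("a:7100,b:7100,c:7100");  // returns DEF, not ABC
}
```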

cc @ajcaldera1 @mbautin @rahuldesirazu

deeps1991 commented

@bmatican
I'm thinking of working on this issue, but since I'm new to the codebase, I just had a few questions to understand the background:

  1. With this change, is the aim to make master changes/moves faster, i.e., avoiding config changes across all tservers for every master move, which could potentially speed up recovery from failures?
  2. I am assuming you mean that, with this change, every tserver would persist any changes to the master config that it receives via heartbeat. But it could happen that the tserver crashed before it could persist the changes, or that the master changes happened while the tserver was down. So we would always need a fallback option of changing the config across all tservers? This is the reasoning behind my asking (1).
  3. Since add-node goes through the master, I would have assumed that all master nodes always know the presence and status of all tserver nodes, and that the heartbeat is initiated by the master. Is this true, and if not, why not? And if true, why would the tserver need to store any config about the masters at all? Couldn't it just wait for the first heartbeat from the master? From what I understand, discovery seems to be the responsibility of the master, not the tservers. Am I missing something? :)


amitanandaiyer commented Sep 18, 2019

Bogdan is OOO, so I'll try to expand a little on your questions:

@bmatican
I'm thinking of working on this issue, but since I'm new to the codebase, I just had a few questions to understand the background:

  1. With this change, is the aim to make master changes/moves faster, i.e., avoiding config changes across all tservers for every master move, which could potentially speed up recovery from failures?

Yes. Currently, temporary master leader changes are handled by Raft and do not require any config changes. But if a node is being taken out of service and a new node is brought in, changing the membership of the masters' Raft group, we end up having to update the flag on each of the tservers.

  2. I am assuming you mean that, with this change, every tserver would persist any changes to the master config that it receives via heartbeat. But it could happen that the tserver crashed before it could persist the changes, or that the master changes happened while the tserver was down. So we would always need a fallback option of changing the config across all tservers? This is the reasoning behind my asking (1).

Yes. We may need to "fall back" on the gflags option if a tserver has been down for a sufficiently long time while the masters' config changed.

Note, though, that this fallback may not be necessary if there is an overlap between the old config (from before the tserver stopped) and the new config. That is, replacing one or two nodes of the masters' quorum may still be handled by contacting all the old masters and having a surviving master update the tserver with the correct master membership.

But in the pathological case, where there is no overlap between the old members of the masters' Raft group and the new membership (say ABC -> DEF), we'd likely have to fall back on the gflag option for the tservers that were down during that period; a sketch of the overlap check follows.
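
For illustration, the overlap condition amounts to a simple membership intersection check (names here are hypothetical, not from the codebase):

```cpp
// Illustrative helper: the persisted master list can still bootstrap
// rediscovery only if at least one old member survives in the new quorum
// (e.g. ABC -> BCD works; ABC -> DEF does not).
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

bool PersistedListStillUsable(const std::vector<std::string>& old_masters,
                              const std::vector<std::string>& new_masters) {
  return std::any_of(old_masters.begin(), old_masters.end(),
                     [&](const std::string& m) {
                       return std::find(new_masters.begin(),
                                        new_masters.end(), m) !=
                              new_masters.end();
                     });
}

int main() {
  assert(PersistedListStillUsable({"A", "B", "C"}, {"B", "C", "D"}));
  assert(!PersistedListStillUsable({"A", "B", "C"}, {"D", "E", "F"}));
}
```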

  3. Since add-node goes through the master, I would have assumed that all master nodes always know the presence and status of all tserver nodes, and that the heartbeat is initiated by the master.

Are you talking about 'add-node' as adding a new node to the masters' quorum, or adding a new tserver node? If it is the latter, not all the master nodes will know about the newly added tserver.

TS heartbeats are initiated by the tservers. They will try to locate the leader among the master nodes and connect to it. Only the "leader" master maintains the state for all the tservers and responds to the heartbeats (see catalog_manager.cc); a rough sketch follows.
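
A rough sketch of that tserver-driven flow, with IsLeader and SendHeartbeat as illustrative stand-ins for the real RPCs:

```cpp
// Hypothetical sketch of the tserver-driven heartbeat flow: probe the known
// masters, find the Raft leader, and heartbeat to it. Stubs stand in for
// the real RPCs handled by catalog_manager.cc on the leader.
#include <iostream>
#include <string>
#include <vector>

// Stub: would issue an RPC asking whether this master is the Raft leader.
bool IsLeader(const std::string& master_addr) {
  return master_addr == "b:7100";
}

// Stub: would send the actual heartbeat RPC; returns false on failure.
bool SendHeartbeat(const std::string& leader_addr) { return true; }

void HeartbeatOnce(const std::vector<std::string>& masters) {
  // Only the leader maintains tserver state and answers heartbeats, so the
  // tserver must locate it first.
  for (const auto& m : masters) {
    if (IsLeader(m) && SendHeartbeat(m)) return;
  }
  // No leader reachable: retry later (and eventually consider the gflag
  // list as a fallback).
  std::cout << "no master leader found; will retry\n";
}

int main() { HeartbeatOnce({"a:7100", "b:7100", "c:7100"}); }
```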
