
[docdb] Persist latest master config on TS #1542

Open
bmatican opened this issue Jun 13, 2019 · 2 comments

bmatican commented Jun 13, 2019

Jira Link: DB-1653
Right now, if master information changes, we need to update the config values across all TSs; otherwise, on restart, they will try whatever addresses they were initially configured with. This makes operations like moving masters require changing the config on every TS. We already have a mechanism to propagate master quorum changes through TS heartbeats, so TSs can be kept up to date dynamically.

On the master side, we persist the RaftConfig information, so that on startup we can read it and be able to:

  • effectively know who the other masters are
  • know we're part of this Raft group and initialize accordingly
  • ignore the master_addresses flag if it is invalid or does not match our persisted data (see the sketch after this list)
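
A minimal sketch of that startup ordering, with hypothetical helper names (LoadPersistedRaftConfig, ParseMasterAddressesFlag) standing in for the real persistence and gflag plumbing:

```cpp
// Hypothetical sketch, not the actual yb-master code: prefer the persisted
// Raft config over the --master_addresses flag when both are available.
#include <iostream>
#include <optional>
#include <string>
#include <vector>

struct RaftConfig {
  std::vector<std::string> peers;  // host:port of each master peer
};

// Stub: would read the persisted Raft config from local disk.
std::optional<RaftConfig> LoadPersistedRaftConfig() { return std::nullopt; }

// Stub: would parse the --master_addresses gflag.
std::vector<std::string> ParseMasterAddressesFlag() {
  return {"a:7100", "b:7100", "c:7100"};
}

std::vector<std::string> ResolveMasterPeers() {
  if (auto persisted = LoadPersistedRaftConfig()) {
    // Persisted data wins: we already know who the other masters are and
    // that we belong to this Raft group, so a stale flag can be ignored.
    return persisted->peers;
  }
  // First boot: no local state yet, so the flag is the only source.
  return ParseMasterAddressesFlag();
}

int main() {
  for (const auto& peer : ResolveMasterPeers()) std::cout << peer << "\n";
}
```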

On the TS side, however, we rely solely on the --tserver_master_addrs flag. If, for example, we specify it as empty, we get:

F20190613 16:48:18 ../../src/yb/tserver/tablet_server_options.cc:70] No masters were specified in the master addresses flag '', but a minimum of one is required.

One thing we should be able to do is:

  • start RF3
  • stop a TS
  • start that TS back up with an empty tserver_master_addrs flag and have it happily load the local data

The advanced option would be to support a full move:

  • start an RF3 on nodes ABC
  • add tservers DEF to the cluster, with ABC as master addresses
  • start masters in shell mode on DEF and dynamically move masters there (while removing ABC from the raft group)
  • restart any of the DEF tservers; they should load the locally persisted data pointing at the DEF masters, rather than using the config file list of ABC

We can discuss whether it makes sense to try the persisted data first and use the config values as a fallback; a sketch of that ordering is below.
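
A minimal sketch of that ordering on the TS side, assuming a hypothetical local state file and helper names; the real implementation would persist whatever master list arrives via heartbeat:

```cpp
// Hypothetical sketch, not actual yb-tserver code: persist the master list
// received via heartbeat, and on restart prefer it over the gflag.
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

const char kStatePath[] = "/tmp/ts_master_list";  // illustrative location

// Called when a heartbeat response carries an updated master quorum.
void PersistMasterList(const std::vector<std::string>& masters) {
  std::ofstream out(kStatePath, std::ios::trunc);
  for (const auto& m : masters) out << m << "\n";
}

// On startup: try the persisted list first, then --tserver_master_addrs.
std::vector<std::string> ResolveMasters(const std::string& flag_value) {
  std::vector<std::string> masters;
  std::ifstream in(kStatePath);
  for (std::string line; std::getline(in, line);) {
    if (!line.empty()) masters.push_back(line);
  }
  if (!masters.empty()) return masters;  // persisted data wins

  // Fallback: parse the comma-separated gflag value.
  std::stringstream ss(flag_value);
  for (std::string addr; std::getline(ss, addr, ',');) {
    if (!addr.empty()) masters.push_back(addr);
  }
  return masters;
}

int main() {
  PersistMasterList({"d:7100", "e:7100", "f:7100"});
  ResolveMasters("a:7100,b:7100,c:7100");  // returns DEF, not ABC
}
```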

cc @ajcaldera1 @mbautin @rahuldesirazu

deeps1991 commented

@bmatican
I'm thinking of working on this issue, but since I'm new to the codebase, I just had a few questions to understand the background:

  1. With this change, is the aim to make master changes/moves faster, i.e., avoiding config changes across all tservers for every master move, which could potentially speed up recovery from failures?
  2. I am assuming you mean that, with this change, every tserver would persist any changes to the master config that it receives via heartbeat. But it could happen that the tserver crashed before it could persist the changes, or that the master changes happened while the tserver was down. So we would always need a fallback option of changing the config across all tservers? This is the reasoning behind my asking (1).
  3. Since add-node goes through the master, I would have assumed that all master nodes always know the presence and status of all tserver nodes, and that the heartbeat is initiated by the master. Is this true, and if not, why not? And if true, why would the tserver need to store any config about the masters at all? Couldn't it just wait for the first heartbeat from the master? From what I understand, discovery seems to be the responsibility of the master, not the tservers. Am I missing something? :)


amitanandaiyer commented Sep 18, 2019

Bogdan is OOO, so I'll try to expand a little on your questions:

@bmatican
I'm thinking of working on this issue, but since I'm new to the codebase, I just had a few questions to understand the background:

  1. With this change, is the aim to make master changes/moves faster, i.e., avoiding config changes across all tservers for every master move, which could potentially speed up recovery from failures?

Yes. Currently, temporary master leader changes are handled by Raft and do not require any config changes. But if a node is being taken out of service and a new node is brought in, changing the membership of the masters' Raft group, we end up having to update the flag on each of the tservers.

  2. I am assuming you mean that, with this change, every tserver would persist any changes to the master config that it receives via heartbeat. But it could happen that the tserver crashed before it could persist the changes, or that the master changes happened while the tserver was down. So we would always need a fallback option of changing the config across all tservers? This is the reasoning behind my asking (1).

Yes. We may need to "fall back" on the gflags option if a tserver has been down for a sufficiently long time while the masters' config changed.

Note, though, that this fallback may not be necessary if there is an overlap between the old config (from before the tserver stopped) and the new config. That is, replacing one or two nodes of the masters' quorum may still be handled by contacting all the old masters and having a surviving master update the tserver with the correct master membership.

But in the pathological case, where there is no overlap between the old members of the masters' Raft group and the new membership (say ABC -> DEF), we'd likely have to fall back on the gflag option for the tservers that were down during that period; a sketch of the overlap check follows.
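
For illustration, the overlap condition amounts to a simple membership intersection check (names here are hypothetical, not from the codebase):

```cpp
// Illustrative helper: the persisted master list can still bootstrap
// rediscovery only if at least one old member survives in the new quorum
// (e.g. ABC -> BCD works; ABC -> DEF does not).
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

bool PersistedListStillUsable(const std::vector<std::string>& old_masters,
                              const std::vector<std::string>& new_masters) {
  return std::any_of(old_masters.begin(), old_masters.end(),
                     [&](const std::string& m) {
                       return std::find(new_masters.begin(),
                                        new_masters.end(), m) !=
                              new_masters.end();
                     });
}

int main() {
  assert(PersistedListStillUsable({"A", "B", "C"}, {"B", "C", "D"}));
  assert(!PersistedListStillUsable({"A", "B", "C"}, {"D", "E", "F"}));
}
```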

  3. Since add-node goes through the master, I would have assumed that all master nodes always know the presence and status of all tserver nodes, and that the heartbeat is initiated by the master.

Are you talking about 'add-node' as adding a new node to the masters' quorum, or adding a new tserver node? If it is the latter, not all the master nodes will know about the newly added tserver.

TS heartbeats are initiated by the tservers. They will try to locate the leader among the master nodes and connect to it. Only the "leader" master maintains the state for all the tservers and responds to the heartbeats (see catalog_manager.cc); a rough sketch follows.
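
A rough sketch of that tserver-driven flow, with IsLeader and SendHeartbeat as illustrative stand-ins for the real RPCs:

```cpp
// Hypothetical sketch of the tserver-driven heartbeat flow: probe the known
// masters, find the Raft leader, and heartbeat to it. Stubs stand in for
// the real RPCs handled by catalog_manager.cc on the leader.
#include <iostream>
#include <string>
#include <vector>

// Stub: would issue an RPC asking whether this master is the Raft leader.
bool IsLeader(const std::string& master_addr) {
  return master_addr == "b:7100";
}

// Stub: would send the actual heartbeat RPC; returns false on failure.
bool SendHeartbeat(const std::string& leader_addr) { return true; }

void HeartbeatOnce(const std::vector<std::string>& masters) {
  // Only the leader maintains tserver state and answers heartbeats, so the
  // tserver must locate it first.
  for (const auto& m : masters) {
    if (IsLeader(m) && SendHeartbeat(m)) return;
  }
  // No leader reachable: retry later (and eventually consider the gflag
  // list as a fallback).
  std::cout << "no master leader found; will retry\n";
}

int main() { HeartbeatOnce({"a:7100", "b:7100", "c:7100"}); }
```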
