[docdb] Persist latest master config on TS #1542
Comments
@bmatican
Bogdan is OOO, so I'll try to expand a little on your comments:
Yes. Currently, temporary master leader changes are handled by Raft and do not require any config changes. But if a node is taken out of service and a new node is brought in, changing the membership of the masters' Raft group, we end up having to update the flag on each of the tservers.
Yes. We may need to fall back on the gflag option if a tserver has been down for a sufficiently long time while the masters' config changed. Note, though, that this fallback may not be necessary if there is an overlap between the old config (before the tserver stopped) and the new config. That is, replacing 1 or 2 nodes in the masters' quorum may still be possible to handle by contacting all the old masters and having a surviving master update the tserver with the correct master membership. But in the pathological case, where there is no overlap between the old and new membership of the masters' Raft group (say ABC -> DEF), we'd likely have to fall back on the gflag option for the tservers that were down during that period.
Are you talking about 'add-node' as adding a new node to the masters' quorum, or adding a new tserver node? If it is the latter, not all the master nodes will know about the newly added tserver. TS heartbeats are initiated by the tservers: each tserver tries to locate the leader among the master nodes and connects to it. Only the leader master maintains the state for all the tservers and responds to the heartbeats (see catalog_manager.cc).
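To make the overlap argument concrete, here is a minimal Python sketch of the decision a stale tserver faces. It is not YugabyteDB code: the function name `recovery_path` and the string labels it returns are hypothetical, and the only logic it encodes is the set intersection between the old and new master membership described above.

```python
def recovery_path(known_masters, current_masters):
    """Decide how a tserver with a stale master list could catch up.

    known_masters:   addresses the tserver knew before it stopped (illustrative input).
    current_masters: addresses in the masters' current Raft config (illustrative input).
    Both the function and its return labels are hypothetical.
    """
    survivors = set(known_masters) & set(current_masters)
    if survivors:
        # At least one old master is still in the quorum, so it can tell the
        # tserver about the current membership.
        return ("contact_survivors", sorted(survivors))
    # No overlap (e.g. ABC -> DEF): nothing the tserver knows is still valid,
    # so the operator-supplied flag is the only remaining source.
    return ("fallback_to_gflag", [])
```

With this sketch, replacing two of three masters (ABC -> ADE) still leaves A as a survivor, while a full move (ABC -> DEF) yields the gflag fallback.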
Jira Link: DB-1653
Right now, if master information changes, we need to update the config values across all TS; otherwise, on restart, they will try whatever addresses they were initially configured with. This means that things like moving masters require changing the config on all TS. We already have a mechanism to propagate master quorum changes through TS heartbeats, so that TS can be kept up to date dynamically.
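As a rough illustration of what persisting the heartbeat-propagated master list could look like, here is a Python sketch. Everything in it is an assumption made for illustration: the JSON file layout, the `master_addrs` key, and the names `persist_master_addrs` and `on_heartbeat_response` are hypothetical and do not reflect how the tserver actually stores state.

```python
import json
import os
import tempfile


def persist_master_addrs(path, addrs):
    """Atomically write the latest master addresses to `path` (hypothetical format)."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as f:
            json.dump({"master_addrs": list(addrs)}, f)
            f.flush()
            os.fsync(f.fileno())  # make sure the bytes reach disk before the rename
        os.replace(tmp, path)  # atomic when tmp and path share a filesystem
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def on_heartbeat_response(path, cached, reported):
    """Return the list to cache; write to disk only when membership changed."""
    if set(reported) == set(cached):
        return cached
    persist_master_addrs(path, reported)
    return list(reported)
```

Writing to a temporary file and renaming it means a crash mid-write leaves the previous list intact, which matters because this file would be the only source of truth on restart.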
On the master side, we persist the RaftConfig information, so that on startup, we can read that and be able to:
On the TS side, however, we rely solely on the --tserver_master_addrs flag. If, for example, we specify that as empty, we get:
One thing we should be able to do is:
tserver_master_addrs flag and have it happily load the local data
The advanced option would be to be able to do a full move:
We can discuss whether it makes sense to try the persisted data first and use the config values as a fallback.
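One possible reading of "persisted first, flag as fallback" is sketched below in Python. This is only a sketch under stated assumptions: `resolve_master_addrs` is a hypothetical helper, the persisted list is assumed to be already loaded from local storage, and the flag value is assumed to be a comma-separated list of host:port entries.

```python
def resolve_master_addrs(persisted, flag_value):
    """Pick the master addresses to use at tserver startup (hypothetical helper).

    persisted:  list of addresses recovered from local storage, possibly empty.
    flag_value: raw string passed via the flag, assumed comma-separated.
    Returns (addresses, source), where source is "persisted" or "flag".
    """
    if persisted:
        # Persisted data reflects the last membership seen via heartbeats.
        return list(persisted), "persisted"
    from_flag = [a.strip() for a in flag_value.split(",") if a.strip()]
    if from_flag:
        return from_flag, "flag"
    raise ValueError("no master addresses: nothing persisted and flag is empty")
```

The opposite order (flag first) would also be defensible, since it lets an operator override stale persisted data after a full master move; which order wins is exactly the open question here.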
cc @ajcaldera1 @mbautin @rahuldesirazu