Auto-select node_id when node joins cluster #2793

Closed
jcsp opened this issue Oct 27, 2021 · 7 comments

@jcsp
Contributor

jcsp commented Oct 27, 2021

Currently, the node config (redpanda.yaml) must include a unique node_id for each node.

This is a pain point for environments where the node configurations would otherwise be identical everywhere.

We should introduce the option for a join_request message to specify no node_id, and have the members_manager select the next available ID, then feed it back to the joiner in the response. The joining node would then store its ID in /var/lib/redpanda (e.g. in kvstore) and then use it in all subsequent operations.

The config route for specifying node ID should also remain:

  • User might want to replace a decommissioned node with the same ID rather than having a discontinuous list of IDs.
  • User on bare metal probably has servers numbered host001, host002 etc, and may want to explicitly map hostnames to IDs in the same order to reduce cognitive load for SREs when mapping back and forth.
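
A minimal sketch of the selection step described above, not the actual members_manager code. The function name `select_node_id` and its arguments are hypothetical, and "next available ID" is interpreted here as one past the highest ID currently in use, which is an assumption. An explicitly configured ID is still honoured as long as it is free.

```python
from typing import Optional, Set


def select_node_id(in_use: Set[int], requested: Optional[int] = None) -> int:
    """Choose a node ID for a joining node (hypothetical helper).

    in_use:    IDs currently held by cluster members.
    requested: ID taken from the joiner's config, or None when the
               join request carries no node_id.
    """
    if requested is not None:
        # Config route: the operator asked for a specific ID.
        if requested < 0:
            raise ValueError("node_id must be non-negative")
        if requested in in_use:
            raise ValueError(f"node_id {requested} is already in use")
        return requested
    # Auto-select route (assumption): one past the highest ID in use,
    # so gaps left by removed nodes are never refilled implicitly.
    return max(in_use, default=-1) + 1
```

With members `{0, 2}`, a join request without a node_id would get 3, while an operator who wants to fill the gap can still ask for 1 explicitly.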
@jcsp
Contributor Author

jcsp commented May 24, 2022

I have thoughts about implementing this, #333, and a couple other things in one go, by adding a gossip discovery component:

  • Nodes gossip their UUID and cluster status (including whether they have joined a cluster and what their node_id is in the cluster)
  • Instead of a node with empty seed_servers acting as the initial cluster of 1, we can wait until the gossip state tells us about enough nodes to create an initial cluster of 3 (avoid creating clusters of 1 unless there really is just one node), with a little bit of careful design to minimize the chance of two clusters forming out of an initial population of nodes.
  • The cluster UUID would become a foundational thing, known to nodes before they ever start any raft groups. This would address the current awkwardness of clusters coming up without an ID and then choosing one moments later.
  • The gossip mechanism would stay active through the life of the system as a low-level health monitoring fabric that doesn't rely on any central controller leader.
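
The bootstrap rule in the list above could be sketched roughly as follows. Everything here is hypothetical: `PeerState`, `bootstrap_decision`, and the `min_cluster_size` parameter are illustrative names, and the tie-break (only the node with the smallest UUID founds the cluster) is one simple assumption for reducing the chance of two clusters forming. It does not eliminate that chance, because gossip views can differ between nodes.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class PeerState:
    """What one node gossips about itself (hypothetical shape)."""
    node_uuid: str
    cluster_uuid: Optional[str] = None  # None until the node has joined
    node_id: Optional[int] = None       # None until the node has joined


def bootstrap_decision(self_uuid: str,
                       peers: Dict[str, PeerState],
                       min_cluster_size: int = 3) -> str:
    """Return "join", "found", or "wait" for a node with no cluster yet."""
    # If any peer already belongs to a cluster, join it rather than
    # founding a second one.
    if any(p.cluster_uuid is not None for p in peers.values()):
        return "join"
    known = set(peers) | {self_uuid}
    # Not enough nodes visible yet: keep gossiping.
    if len(known) < min_cluster_size:
        return "wait"
    # Assumed tie-break: the smallest UUID founds the cluster and
    # everyone else waits to be told about it.
    return "found" if self_uuid == min(known) else "wait"
```

Setting `min_cluster_size=1` would recover the single-node case, which the proposal allows only when there really is just one node.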

@scallister

scallister commented May 24, 2022

> we can wait until the gossip state tells us about enough nodes to create an initial cluster of 3

I like this suggestion. May be good to have a config variable such as MIN_CLUSTER_SIZE.

Would node id be deprecated in favor of UUID?

@jcsp
Contributor Author

jcsp commented May 25, 2022

node_id would stay (integer node ID is part of the kafka protocol), but nodes wouldn't have one assigned until they have formally joined a cluster. In terms of configuration, it would become optional in redpanda.yaml & act as a hint during cluster formation: if a node advertises a "node id hint" that doesn't conflict with any other node, that's the ID it'll get.

In practice I would expect that for k8s replicaset type deployments, no node_id hint would be specified, but for bare metal deployments where people typically have numbered host names, there's going to be a natural human urge to make those node ids line up with the hostname orders.
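
A rough sketch of how node ID hints might be resolved during cluster formation. The function `assign_node_ids` is hypothetical. The thread only says that a non-conflicting hint is honoured; treating conflicting hints as if no hint had been given, and filling the remaining nodes with the lowest free IDs in UUID order, are assumptions made for illustration.

```python
from typing import Dict, Optional


def assign_node_ids(hints: Dict[str, Optional[int]]) -> Dict[str, int]:
    """Map node UUID -> node_id, honouring non-conflicting hints.

    hints maps each node UUID to its advertised node ID hint, or None
    when the node advertised no hint.
    """
    # Count how many nodes advertise each hint so conflicts are visible.
    counts: Dict[int, int] = {}
    for hint in hints.values():
        if hint is not None:
            counts[hint] = counts.get(hint, 0) + 1

    # A hint that exactly one node advertises is granted as-is.
    assigned = {uuid: hint for uuid, hint in hints.items()
                if hint is not None and counts[hint] == 1}
    taken = set(assigned.values())

    # Everyone else (no hint, or a conflicting hint) gets the lowest
    # free ID, in UUID order so the result is deterministic.
    next_id = 0
    for uuid in sorted(hints):
        if uuid in assigned:
            continue
        while next_id in taken:
            next_id += 1
        assigned[uuid] = next_id
        taken.add(next_id)
    return assigned
```

In a replicaset-style deployment every hint would be `None`, and the nodes would simply receive 0, 1, 2, and so on.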

@dotnwat
Member

dotnwat commented May 26, 2022

> members_manager select the next available ID, then feed it back to the joiner in the response.

i wonder if for k8s replica sets specifying a node-id in the config is more than a 'hint'--it's a hard requirement. iirc a gapless range of ids is what is expected. but really there must be some patterns for dealing with this; it's very inflexible.

> The cluster UUID would become a foundational thing, known to nodes before they ever start any raft groups.

i can see this working pretty well. effectively the cluster uuid proves cluster membership, and on the off chance that a node joins an unintended cluster during setup, there won't be any chance of that node causing disruption, other than perhaps forcing someone to manually remove it from the wrong cluster.

@travisdowns
Member

Yeah, if we have the mechanism to prevent cluster formation until 3 members are present, it might be nice to expose this as a configuration so that larger clusters can also wait until all their members have joined. This avoids a period during bootstrap where partitions might be created only on the small number of first-joining nodes, and also avoids overload from having too few cluster members.

@twmb
Copy link
Contributor

twmb commented Aug 12, 2022

@jcsp am I reading the thread above correctly: we need the code to exist in redpanda first, and once redpanda handles empty seed servers, we can change rpk to emit no seed servers? I'll move this to our own "awaiting other team" queue. cc @piyushredpanda

@jcsp
Contributor Author

jcsp commented Aug 15, 2022

I think the above question (about seed servers, whereas this ticket is about node_ids) was asked on #333, answered here: #333 (comment)

For both these things, the question for RPK in sandbox-ish environments is going to be whether to use the old style or new style setup. I suspect the old way is going to make more sense for this: we don't expect a sandbox system to do things like removing volumes from nodes, so it's okay for all the nodes to just have static node_ids.

In production type systems, rpk behavior doesn't matter for k8s because the operator will write out the config explicitly anyway. On bare metal, the question is whether "rpk cluster config init" should output a node id or not. I think leaving it out is probably the right move here: that rules out the class of user errors where they generate a config and then forget to modify node_id: by leaving it blank it'll just get auto-selected always.
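
To make the "leave it blank and it gets auto-selected" behaviour concrete, here is a sketch of how a node might resolve its ID at startup. The names `resolve_node_id`, `configured`, `stored`, and `request_assignment` are hypothetical, and the precedence shown (a previously stored ID wins, and a configured ID that disagrees with it is an error) is an assumption rather than something the thread settles.

```python
from typing import Callable, Optional


def resolve_node_id(configured: Optional[int],
                    stored: Optional[int],
                    request_assignment: Callable[[Optional[int]], int]) -> int:
    """Decide which node ID this node should run with.

    configured:         node_id from the config file, or None if blank.
    stored:             node_id persisted locally after an earlier join,
                        or None on first start.
    request_assignment: asks the cluster for an ID, passing the
                        configured value (possibly None) along.
    """
    if stored is not None:
        # The node has joined before: keep its identity stable.
        if configured is not None and configured != stored:
            raise ValueError(
                f"configured node_id {configured} conflicts with "
                f"stored node_id {stored}")
        return stored
    # First start: let the cluster pick, or confirm the configured ID.
    # The caller is expected to persist the result.
    return request_assignment(configured)
```

A generated config with no node_id then takes the `request_assignment(None)` path on first start and the `stored` path on every restart, so a forgotten edit to node_id cannot produce duplicate IDs.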
