Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cluster: allow seeds-driven bootstrap when empty_seed_starts_cluster=1 #7390

Merged
merged 2 commits into from
Nov 24, 2022

Conversation

andrwng
Copy link
Contributor

@andrwng andrwng commented Nov 19, 2022

Previously, empty_seed_starts_cluster=1 meant that the node was configured to use root-driven bootstrap and that it should expect some singular node in the cluster to have an empty seeds list. This is the default, and was mainly there for compatibility reasons. However, this default is currently explicitly incompatible with seeds-driven bootstrap, and thus is incompatible with having all node configurations in the cluster being identical.

To encourage usage of seeds-driven bootstrap, and to allow homogeneous node configs by default, this commit makes empty_seed_starts_cluster=1 compatible with seeds-driven bootstrap. An empty seed_servers list still indicates that the configured node should be the root node, but seeds-driven bootstrap can be used with empty_seed_starts_cluster=1 if:

  • there exists no root node among any of the seeds, and
  • all seeds meet the existing requirements to perform seeds-driven bootstrap.

Backports Required

  • none - not a bug fix
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v22.3.x
  • v22.2.x
  • v22.1.x

UX Changes

Release Notes

Improvements

  • The seed_servers node configuration may now be identical on each node to form a cluster, even when empty_seed_starts_cluster is true (the default).

@mmaslankaprv
Copy link
Member

Would it be possible to encourage users do disable empty_seed_starts_cluster in existing deployments ? I am thinking about the cases where the empty seed server node is wiped out.

@mmaslankaprv
Copy link
Member

it looks that the CI failure may be related:

Module: rptest.tests.rpk_start_test 
Class:  RpkRedpandaStartTest 
Method: test_simple_start_three_with_root

Previously, empty_seed_starts_cluster=1 meant that the node was
configured to use root-driven bootstrap and that it should expect some
singular node in the cluster to have an empty seeds list. This is the
default, and was mainly there for compatibility reasons. However, this
default is currently explicitly incompatible with seeds-driven
bootstrap, and thus is incompatible with having all node configurations
in the cluster being identical.

To encourage usage of seeds-driven bootstrap, and to allow homogeneous
node configs by default, this commit makes empty_seed_starts_cluster=1
compatible with seeds-driven bootstrap. An empty seed_servers list
still indicates that the configured node should be the root node, but
seeds-driven bootstrap can be used with empty_seed_starts_cluster=1 if:
- there exists no root node among any of the seeds, and
- all seeds meet the existing requirements to perform seeds-driven
  bootstrap.
@andrwng
Copy link
Contributor Author

andrwng commented Nov 21, 2022

Would it be possible to encourage users do disable empty_seed_starts_cluster in existing deployments ? I am thinking about the cases where the empty seed server node is wiped out.

Over time we may be able to move folks away from root-driven bootstrap with documentation, but the argument for keeping support is that existing customer installers will likely expect root-driven bootstrap to continue to work.

It's true this means that such root-driven deployments are as unsafe as they were pre-22.3, but this PR at least means that there's the option to configure all nodes with a non-empty seeds list, which does avoid the wipeout troubles.

@dlex
Copy link
Contributor

dlex commented Nov 21, 2022

Even in root-driven bootstrap, there is a possibility to configure all nodes with non-empty seed_servers, which does avoid troubles as well (a "rootless" config). So I'm not really sure what is the improvement of this change. Is it to let users configure new clusters in seed-driven mode without setting empty_seed_starts_cluster propery? If so, I can think of more confusion users will have with this property, those ones who did care about it.

@andrwng
Copy link
Contributor Author

andrwng commented Nov 21, 2022

Even in root-driven bootstrap, there is a possibility to configure all nodes with non-empty seed_servers, which does avoid troubles as well (a "rootless" config).

The problem is that currently, the default configuration doesn't allow you to start up a cluster with homogeneous node configs.

Is it to let users configure new clusters in seed-driven mode without setting empty_seed_starts_cluster propery?

Yes.

If so, I can think of more confusion users will have with this property, those ones who did care about it.

I'm not sure in what ways this will be the case. If a user has successfully tuned their configurations such that they use empty_seed_starts_cluster=True, that will continue to work moving forward. If they have configured empty_seed_starts_cluster=False, that will also continue to work moving forward. The semantics IMO are actually more intuitive in that the configuration only controls whether an empty node starts a cluster (as the name implies), rather than changing between two modes of bootstrap.

This adds a simple test to make sure a cluster doesn't successfully form
if empty_seed_starts_cluster is False but the seed_servers are
configured with a root node.
src/v/cluster/cluster_discovery.cc Show resolved Hide resolved
@andrwng andrwng merged commit 915c711 into redpanda-data:dev Nov 24, 2022
@piyushredpanda
Copy link
Contributor

@andrwng : please don't forget to backport

@andrwng
Copy link
Contributor Author

andrwng commented Nov 24, 2022

/backport v22.3.x

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants