Concern about node count for minimal HA control plane with external etcd #42691

tjanson · 2023-08-23T12:53:04Z

The section "External etcd topology" on the page "Options for Highly Available Topology" in the kubeadm cluster setup documentation states:

> A minimum of three hosts for control plane nodes and three hosts for etcd nodes are required for an HA cluster with this [external etcd] topology.

I may be mistaken, but wouldn't the minimum number of control plane nodes in this case be two? (Though perhaps not advisable, it is at least technically sufficient.) That gives us a redundant pair of each control plane component (apiserver, controller-manager, and scheduler), as well as the HA three-node etcd cluster.
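To make the counting concrete, here is a minimal Python sketch of the argument. It rests on a simplifying assumption that is not stated in the docs: the control plane counts as available as long as at least one control plane node is healthy and the external etcd cluster still has a majority of its members. The function names are hypothetical and exist only for illustration.

```python
def etcd_has_quorum(members: int, failed: int) -> bool:
    """etcd needs a strict majority of its members to be healthy."""
    return (members - failed) > members // 2


def control_plane_available(cp_nodes: int, cp_failed: int,
                            etcd_members: int, etcd_failed: int) -> bool:
    """Simplified model (an assumption): one healthy control plane node
    plus etcd quorum is enough to keep the control plane serving."""
    healthy_cp = cp_nodes - cp_failed
    return healthy_cp >= 1 and etcd_has_quorum(etcd_members, etcd_failed)


# Two control plane nodes and three external etcd members.
print(control_plane_available(2, 1, 3, 0))  # one control plane node down -> True
print(control_plane_available(2, 0, 3, 1))  # one etcd member down -> True
print(control_plane_available(2, 0, 3, 2))  # two etcd members down -> False
```

Under this model, any single host failure is survivable with two control plane nodes. Whether that is enough to call the setup "HA" is a separate question of recommendation.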

k8s-ci-robot · 2023-08-23T12:53:11Z

This issue is currently awaiting triage.

SIG Docs takes a lead on issue triage for this website, but any Kubernetes member can accept issues by applying the triage/accepted label.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

tjanson · 2023-08-23T12:57:04Z

I see now this is a duplicate of (stale, closed) #33033.

/language en
/kind bug
/sig architecture

neolit123 · 2023-08-23T13:30:05Z

/sig cluster-lifecycle

there was a blog post at k8s.io about HA written by Steve Wong, but i cannot find it.

HA is an opinionated area in computing. 2 is considered the minimum, where the 2nd server is the fallback/redundancy server. however the argument here is that 2 is not really redundancy. 2 provides the fallback, yet 3 is really what provides the redundancy - i.e. "you have the backup of the backup, which may be redundant".

upstream kubeadm is just one k8s distribution with its recommendations of 3 CP nodes. yet other distributions like openshift also run 3 as the minimum HA:

> At a minimum, an OpenShift cluster contains 2 worker nodes in addition to 3 control plane nodes.

https://access.redhat.com/solutions/5034771

personally, i would consider < 3 in k8s as non-HA, but users can make the choice.

/close

k8s-ci-robot · 2023-08-23T13:30:11Z

@neolit123: Closing this issue.

tjanson · 2023-08-23T14:22:34Z

Excuse me for being blunt, but I don't think you've given this issue the consideration it deserves and requires. The key point is the distinction between etcd cluster nodes and Kubernetes control plane nodes (and their effect on HA), which your comment does not address and which you do not seem to have considered.

> HA is an opinionated area in computing.

We're specifically discussing the HA requirements of the Kubernetes control plane. That is not a matter of opinion, but fact.

> 2 is considered the minimum, where the 2nd server is the fallback/redundancy server. however the argument here is that 2 is not really redundancy. 2 provides the fallback, yet 3 is really what provides the redundancy

Again excuse my bluntness, but that's an oversimplified, imprecise portrayal of HA in the context of Kubernetes. It is not sufficient to consider just these broad terms in a discussion of etcd and control plane components.

> upstream kubeadm is just one k8s distribution with its recommendations of 3 CP nodes. yet other distributions like openshift also run 3 as the minimum HA

Yes, they do so because of a stacked etcd topology. The docs section this issue refers to is about a different topology (external etcd). That exact distinction is the entire point of the issue.

> personally, i would consider < 3 in k8s as non-HA, but users can make the choice.

Again, this isn't about your (or anyone else's) personal opinion or recommendation, it is about the technical minimum of K8s control plane components/nodes.

I request that you reopen the issue.

neolit123 · 2023-08-23T15:02:34Z

> Excuse me for being blunt, but I don't think you've given this issue the consideration it deserves and requires. The key point is the distinction between etcd cluster nodes and Kubernetes control plane nodes (and their effect on HA), which your comment does not address and which you do not seem to have considered.

my comment is specifically about the external etcd topology. in short, the recommendation of the maintainers is to have 3 cp machines even if etcd is not run on them. if users do not agree with our ideas of HA they can run less or more cp machines.

tjanson · 2023-08-23T15:43:05Z

I request that you reopen this issue so that a second org member can give their opinion. E.g., @sftim, who was active in the other issue (I'm also fine with reopening the stale #33033 instead of this issue).

neolit123 · 2023-08-23T16:07:30Z

/reopen

k8s-ci-robot · 2023-08-23T16:07:36Z

@neolit123: Reopened this issue.

sftim · 2023-08-23T16:37:02Z

> A minimum of three hosts for control plane nodes and three hosts for etcd nodes are required for an HA cluster with this [external etcd] topology.

The minimum number of control plane nodes for Kubernetes to work is one. However, the minimum recommended number of etcd members is three, because:

- etcd recommends an odd number of cluster members (for the baseline healthy config)
- three is the smallest odd number greater than two

So far, so uncontroversial. How about the API server, kube-controller-manager, scheduler, etc.?

For the external etcd topology, maybe you can get away with two further nodes, relying on the etcd cluster to support leader election etc. I'm a lead for Docs, not API machinery, so I can't comment authoritatively. However, it sounds plausible.
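The odd-number recommendation for etcd follows from majority quorum arithmetic. As a rough sketch (the helper names are hypothetical, chosen only for this illustration), a cluster of n members needs floor(n/2) + 1 healthy members, so adding a member to reach an even count does not raise the number of failures the cluster can tolerate:

```python
def quorum(members: int) -> int:
    """Smallest strict majority of a cluster with `members` members."""
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can fail while a majority remains."""
    return members - quorum(members)


# Print members, quorum size, and tolerated failures for small clusters.
for n in range(1, 6):
    print(f"members={n} quorum={quorum(n)} tolerated_failures={fault_tolerance(n)}")
```

With this arithmetic, one and two members tolerate no failures, three and four tolerate one, and five tolerates two, which is why three is the smallest size that survives a member failure.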

sftim · 2023-08-23T16:53:09Z

/retitle Concern about node count for minimal HA control plane with external etcd

sftim · 2023-08-23T16:54:07Z

@tjanson would you be happy to see #33033 reopened and this closed as a duplicate?

sftim · 2023-08-23T16:54:27Z

Ah, I see you would.
/triage duplicate
/close not-planned

k8s-ci-robot · 2023-08-23T16:54:34Z

@sftim: Closing this issue, marking it as "Not Planned".

k8s-ci-robot added the needs-triage label Aug 23, 2023

k8s-ci-robot added the language/en, kind/bug, and sig/architecture labels Aug 23, 2023

k8s-ci-robot added the sig/cluster-lifecycle label Aug 23, 2023

k8s-ci-robot closed this as completed Aug 23, 2023

k8s-ci-robot reopened this Aug 23, 2023

k8s-ci-robot changed the title ~~Content error: Minimal HA control plane with external etcd~~ Concern about node count for minimal HA control plane with external etcd Aug 23, 2023

k8s-ci-robot added the triage/duplicate label Aug 23, 2023

k8s-ci-robot closed this as not planned Aug 23, 2023

sftim mentioned this issue Aug 23, 2023

Replica count wrong(?) in Options for Highly Available Topology page #33033

Open
