
Support multiple server addresses in --server flag #7325

Open
brandond opened this issue Apr 20, 2023 · 11 comments
Labels
kind/enhancement An improvement to existing functionality
Milestone

Comments

@brandond
Contributor

brandond commented Apr 20, 2023

K3s currently requires the user to configure an external load-balancer or DNS alias to provide a "fixed registration endpoint" for new nodes to join. In the case of a DNS alias, if the DNS lookup returns multiple entries, K3s (by way of the golang http client libs) only uses the first and will not retry other entries if that particular server isn't available.

Behind the scenes, the --server URI is used to seed the loadbalancer server list, so it should be possible to support multiple server addresses, either via the CLI, or by doing an explicit DNS lookup against the hostname when only a single server address is passed.

K3s should:

  • accept a comma-separated list of servers in the --server flag, or allow --server to be specified multiple times (as other StringSlice flags do)
  • if only a single server is provided, do a DNS lookup to expand the hostname into multiple URIs when the DNS lookup returns multiple address records (see the sketch below)
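For illustration only, a minimal Go sketch of the second bullet might look like the following. `expandServerURL` is a hypothetical name, not an existing K3s function, and the fallback-to-the-original-URL behavior is an assumption:

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// expandServerURL resolves the hostname in serverURL and returns one URL per
// address record returned by DNS, falling back to the original URL when the
// lookup fails or returns a single record.
func expandServerURL(serverURL string) ([]string, error) {
	u, err := url.Parse(serverURL)
	if err != nil {
		return nil, err
	}
	addrs, err := net.LookupHost(u.Hostname())
	if err != nil || len(addrs) < 2 {
		// Lookup failure or a single record: keep the URL as provided.
		return []string{serverURL}, nil
	}
	urls := make([]string, 0, len(addrs))
	for _, addr := range addrs {
		expanded := *u
		expanded.Host = net.JoinHostPort(addr, u.Port())
		urls = append(urls, expanded.String())
	}
	return urls, nil
}

func main() {
	// Hypothetical registration hostname with multiple A records.
	urls, _ := expandServerURL("https://k3s.example.com:6443")
	fmt.Println(urls)
}
```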

This was inspired by a discussion with @Oats87 where he was talking about some of the challenges of Rancher's v2prov, which currently always selects the first server for other nodes to join, as it is not reasonable to automate the creation of a fixed registration endpoint for rancher-managed clusters.

@brandond brandond added the kind/enhancement An improvement to existing functionality label Apr 20, 2023
@jhoelzel

Is this not a typical chicken-and-egg problem? In order to be HA you will need that load balancer anyway.

The whole premise of a node dying is that it can only come back with a new name anyway, right? It's cattle after all, and therefore it should only be added to the load balancer itself, which performs health checks and simply doesn't route to a node that dies?

Also, if you are not using an LB, who is making sure that the load is equally distributed?

In the case of a DNS server returning multiple IPs, the problem is not only retrieving the IPs but that you would actually need to implement a retry pattern for them?

@brandond
Contributor Author

You don't need a load-balancer, just a DNS entry that points at one or more servers, or a predefined list of server names if you know what they are or will be.

Rancher currently just picks the first available server and points other nodes at that. Having nodes seeded with more than a single server address would improve the reliability of node bootstrap when not using an external load-balancer in front of the servers.

@jhoelzel

Yes, you are right that I don't need a load balancer, but I am more interested in how you are going to handle failure.
As far as I understand, we have the HA setup so the API becomes highly available, and I do think that k3s will not retry failed API requests, let alone go and get the next DNS entry from the server.

Furthermore, I am not aware of any DNS server that actually checks liveness or readiness. I mean, there is BGP load balancing et al., but I don't think that is what you mean here, and I am still unsure whether most cloud providers are ever going to allow software-defined networking in the way this would require.

Also, in case of node failure, how is the --server argument going to get updated on the end nodes?

@brandond
Contributor Author

brandond commented Apr 26, 2023

The agents already have a very basic loadbalancer that just redials connections to a server until it finds an endpoint that it can establish a TCP connection to. You're correct that there is no health checking in the client itself, but the list of possible targets is kept in sync with the in-cluster kubernetes service endpoints once the client is connected to the cluster, and these endpoints are health-checked internally by Kubernetes.

The list is initially seeded with the server address provided on the CLI; all we need to do is add support for seeding that list with more than one address so that the loadbalancer has additional targets to try during initial bootstrap.
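As a rough illustration of that seeding and redial behavior (this is not the actual K3s client loadbalancer; the `loadBalancer` type and `dialAny` method are hypothetical names):

```go
package main

import (
	"fmt"
	"net"
	"time"
)

type loadBalancer struct {
	servers []string // seeded from --server, later kept in sync with cluster endpoints
}

// dialAny tries each seeded server in order and returns the first successful
// TCP connection, so bootstrap can survive any single server being down.
func (lb *loadBalancer) dialAny() (net.Conn, error) {
	var lastErr error
	for _, addr := range lb.servers {
		conn, err := net.DialTimeout("tcp", addr, 5*time.Second)
		if err == nil {
			return conn, nil
		}
		lastErr = err
	}
	return nil, fmt.Errorf("all servers unreachable, last error: %w", lastErr)
}

func main() {
	// Seed with more than one address, as proposed in this issue.
	lb := &loadBalancer{servers: []string{"10.0.0.10:6443", "10.0.0.11:6443"}}
	if conn, err := lb.dialAny(); err == nil {
		defer conn.Close()
		fmt.Println("connected to", conn.RemoteAddr())
	}
}
```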

@jhoelzel

I mean, I get the point for initial seeding, but would I not need my load-balancer anyway so that kubectl and API requests in general are spread out across all active control planes?

Also, is this not exactly the problem that kube-vip et al. have solved for us too? AFAIK Harvester uses this approach, where they just reassign the virtual IP for you.

Sorry if I am wasting your time; I have no horse in this race because I always deploy at least one HAProxy anyway. I was just browsing the backlog and this piqued my interest =)

@brandond
Contributor Author

brandond commented Apr 26, 2023

would I not need my load-balancer anyway so that kubectl and API requests in general are spread out across all active control planes?

No. Even if you do put a load-balancer in front of your servers, again: the server URL is only used when joining the cluster; after that is done, the agent reconnects directly to the Kubernetes service endpoints. This is all just for the k3s agent and the kubelet's connection to the apiserver; things running within the cluster use the in-cluster Kubernetes service endpoint via kube-proxy. Between the client load-balancer and kube-proxy, your load-balancer is unused (bypassed) 99.99% of the time, in favor of talking directly to one of the servers by IP.

The fact that an external load-balancer only ever serves a handful of requests when nodes are joining for the first time is why I generally recommend people just use DNS round-robin - or once this issue is resolved, pass a list of servers. It's overkill, and a waste of money if you're paying a cloud provider to host it for you.

@jhoelzel

So in short, the external load balancer is only needed for external requests from anything that uses a kubeconfig, and therefore should not be a requirement for an HA k3s setup.

By providing the possibility to supply multiple --server endpoints, or a DNS name that resolves to multiple IPs, in the node setup, you want to shift the dependency on a fixed registration endpoint onto those multiple endpoints, because external HA communication might not even be required for the cluster and implies extra cost.

Thank you for explaining it to me! I can see now how this can be useful for bare-metal or edge deployments where IPs are not expected to change and the primary control plane used in the config could be down due to updates or simply otherwise unavailable.

Formerly I was under the impression that the kube-proxy also uses the endpoint provided for apiserver communication, and hence was not only balancing externally but also internally. Thanks again for making it clear for me.

@brandond
Contributor Author

brandond commented May 5, 2023

the external load balancer is only needed for external requests from anything that uses a kubeconfig

Yes. If you're interacting with the apiserver from outside the cluster, anywhere other than on the server nodes, you may still want to set up an external load-balancer or at least a DNS alias.

@caroline-suse-rancher caroline-suse-rancher added this to the Backlog milestone Nov 27, 2023
@matthewadams

This would be valuable for us because we operate on fully air-gapped networks with no DNS whatsoever, where our cluster nodes only have static IPs.

@0xMALVEE
Contributor

0xMALVEE commented May 21, 2024

You don't need a load-balancer, just a DNS entry that points at one or more servers, or a predefined list of server names if you know what they are or will be.

@brandond

If we have hard-coded DNS entries, then that system can't do health checks; it will forward requests to a dead server as well. Isn't that a problem?

@brandond
Contributor Author

k3s internally health-checks the load-balancer addresses. If they can't be connected to, the load-balancer will not use them.

That said, this issue isn't a priority at the moment.
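For readers unfamiliar with what health-checking the load-balancer addresses means in practice, a trivial sketch of connectivity-based filtering might look like this. The `healthyServers` function is a hypothetical illustration, not K3s internals:

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// healthyServers returns only the addresses that accept a TCP connection
// within the timeout; unreachable addresses are excluded from dialing.
func healthyServers(addrs []string, timeout time.Duration) []string {
	healthy := make([]string, 0, len(addrs))
	for _, addr := range addrs {
		conn, err := net.DialTimeout("tcp", addr, timeout)
		if err != nil {
			continue // unreachable: skip it
		}
		conn.Close()
		healthy = append(healthy, addr)
	}
	return healthy
}

func main() {
	// Hypothetical seeded addresses; only reachable ones are kept.
	fmt.Println(healthyServers([]string{"10.0.0.10:6443", "10.0.0.11:6443"}, 2*time.Second))
}
```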
