
Use node name for service election and lease holder name instead of hostname #811

Merged
merged 12 commits into from
Apr 17, 2024

Conversation

@d-uzlov (Contributor) commented Apr 9, 2024

Addresses this issue:

After this change kube-vip should be able to work with any node name. I tried to remove all notions of hostnames from the code.

This PR adds a vip_nodename env and a nodeName command line argument.
kube-vip will fall back to the hostname if neither is provided.
By default, the node name is injected via the k8s downward API into all YAML deployments generated by kube-vip.
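The downward API injection described above would look roughly like this in a generated manifest (a sketch based on the standard Kubernetes downward API; the exact env name kube-vip emits is assumed to be vip_nodename as described in this PR):

```yaml
# Env entry so the pod learns the name of the node it is scheduled on,
# rather than relying on the container's hostname.
env:
  - name: vip_nodename
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName
```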

This PR also removes the logic for short-name and FQDN matching, because that is now handled by k8s. kube-vip now matches the exact node name value from the config against the exact node name value from the endpoint/endpointslice.
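The exact-match behavior described above can be sketched as follows (the type and function names are illustrative, not the actual kube-vip code):

```go
package main

import "fmt"

// endpointAddress mirrors the endpoint fields relevant here (illustrative).
type endpointAddress struct {
	NodeName string
	IP       string
}

// localEndpoints returns the IPs of endpoints whose node name exactly matches
// the configured node name -- no short-name or FQDN normalization.
func localEndpoints(configNodeName string, addrs []endpointAddress) []string {
	var ips []string
	for _, a := range addrs {
		if a.NodeName == configNodeName {
			ips = append(ips, a.IP)
		}
	}
	return ips
}

func main() {
	addrs := []endpointAddress{
		{NodeName: "node-1", IP: "10.0.0.1"},
		{NodeName: "node-2.example.com", IP: "10.0.0.2"},
	}
	fmt.Println(localEndpoints("node-1", addrs)) // exact match found
	fmt.Println(localEndpoints("node-2", addrs)) // short name no longer matches the FQDN
}
```

The point of the change is visible in the second call: "node-2" does not match "node-2.example.com", because both sides are now expected to carry the same value from the k8s node object.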

This is quite a simple change, but it affects many parts of the project.

I tested this change locally via this image: docker.io/daniluzlov/k8s-snippets:kube-vip-0.7.2-nodename2.
I tested both static pods for apiserver HA and the service load balancer, with and without service election.
However, I can only test ARP announcements; I don't have a setup to test BGP.

@d-uzlov d-uzlov requested a review from thebsdbox as a code owner April 9, 2024 11:35
Signed-off-by: Danil Uzlov <36223296+d-uzlov@users.noreply.github.com>
@@ -29,6 +29,7 @@ func findSelf(c *packngo.Client, projectID string) *packngo.Device {
// Go through devices
dev, _, _ := c.Devices.List(projectID, &packngo.ListOptions{})
for _, d := range dev {
// TODO do we need to replace os.Hostname with config.NodeName here?
me, _ := os.Hostname()
@d-uzlov (author):
Maybe here we also want to use node name as reported by k8s, but since I don't have BGP setup to test it, I left it as is, to avoid breaking stuff.
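The fallback the comment alludes to (and which the rest of the PR adopts) can be sketched like this; config.NodeName stands for the new configuration field this PR adds, and the function name is illustrative:

```go
package main

import (
	"fmt"
	"os"
)

// nodeName prefers an explicitly configured node name and falls back to the
// OS hostname when none is given, matching the behavior this PR introduces.
func nodeName(configured string) (string, error) {
	if configured != "" {
		return configured, nil
	}
	return os.Hostname()
}

func main() {
	name, err := nodeName("worker-1")
	if err != nil {
		panic(err)
	}
	fmt.Println(name) // prints "worker-1": the configured value wins over the hostname
}
```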

Comment on lines +98 to +101
if id == address.Hostname {
log.Debugf("[%s] found local endpoint - address: %s, hostname: %s", ep.label, address.IP, address.Hostname)
localEndpoints = append(localEndpoints, address.IP)
continue
@d-uzlov (author) commented Apr 9, 2024:

Here and in watch_endpointslices.go I removed short-hostname-to-node-name matching because k8s already provides the correct value.
I check address.NodeName first because the documentation states "This can be used to determine endpoints local to a node".
address.Hostname doesn't seem relevant to me, but I left it as a second check, in case it matters in some setups. Maybe it should also be removed.
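The NodeName-first, Hostname-second ordering described in the comment can be sketched as follows (the address type is a simplified stand-in for the k8s EndpointAddress, where NodeName is a pointer because it is optional):

```go
package main

import "fmt"

// address is a simplified stand-in for a k8s endpoint address (illustrative).
type address struct {
	NodeName *string // optional in the k8s API, hence a pointer
	Hostname string
	IP       string
}

// isLocal checks NodeName first (the documented way to find node-local
// endpoints) and keeps Hostname as a secondary check, as described above.
func isLocal(id string, a address) bool {
	if a.NodeName != nil && id == *a.NodeName {
		return true
	}
	return id == a.Hostname
}

func main() {
	n := "node-1"
	fmt.Println(isLocal("node-1", address{NodeName: &n, IP: "10.0.0.1"})) // matched via NodeName
	fmt.Println(isLocal("node-1", address{Hostname: "node-1"}))           // matched via Hostname fallback
}
```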

@thebsdbox (Collaborator):

Unfortunately this is going to break all existing kube-vip installations and upgrades moving forward. I think hostname detection should probably remain the default when a node name is not manually specified.

@Cellebyte (Collaborator):

I would suggest keeping the old behavior and skipping it only if your new environment variable is set. Additionally, when the code runs in control-plane mode the hostname is needed, as there is no node object yet.

pkg/manager/manager.go (review thread, outdated and resolved)
@thebsdbox (Collaborator):

This would be great, I think we'll just need to ensure that default/expected functionality remains :)

@d-uzlov (author) commented Apr 10, 2024:

Unfortunately this is going to break all existing kube-vip installations and upgrades moving forward. I think hostname detection should probably remain the default when a node name is not manually specified.

When making these changes I assumed kube-vip would be used to generate new deployment files, in which case there would be no issues during the upgrade.
But if bumping the image version in an existing pod manifest should be a supported option, I guess we need a fallback to the old behavior.

I'll push an update commit for this.

Additionally, when the code runs in control-plane mode the hostname is needed, as there is no node object yet.

What do you mean by this?
I believe the downward API in general, and the node name in particular, is available right from the start of the cluster, both for regular pods and for static pods.
I tested my image both in an existing cluster (upgrading control-plane static pods from v0.7.2 one by one) and when creating a new cluster with a kube-vip static pod bootstrap, and I didn't see any issues in either case.

@thebsdbox (Collaborator):

What do you mean by this?

When kube-vip is managing the HA VIP for the control plane (when being brought up by kubeadm), there is a particularly amusing/annoying scenario where kube-vip needs to come up "before" the API server is fully up. The reason is that kube-vip needs to have the HA address working in order for the kubeadm init process to complete successfully. Kube-vip will typically use a pod DNS spec to have 127.0.0.1 or the local address resolve to kubernetes so that the certificate SAN works. So the tl;dr is that the control plane side of things is a little bit complicated 😂
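The "pod DNS spec" trick mentioned above corresponds roughly to a hostAliases entry in the static pod manifest (a hedged sketch using the standard Kubernetes mechanism; the manifest kube-vip actually generates may differ):

```yaml
# Resolve "kubernetes" to loopback inside the static pod so the API-server
# certificate SAN check passes before the real API server is reachable.
hostAliases:
  - ip: "127.0.0.1"
    hostnames:
      - kubernetes
```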

@d-uzlov (author) commented Apr 10, 2024:

Well, kube-vip already uses a 127.0.0.1 kubernetes hosts entry; the node hostname is not involved here.
I also tested that the current setup works as is.
So I guess using the node name for leases and pod-placement checks doesn't affect this, and we don't need to handle it any differently.

@d-uzlov (author) commented Apr 10, 2024:

@thebsdbox
Should I also modify the unit/e2e tests to include the node name option, and add a description to the documentation, or would you prefer to do that yourself?

@thebsdbox (Collaborator):

@thebsdbox
Should I also modify the unit/e2e tests to include the node name option, and add a description to the documentation, or would you prefer to do that yourself?

Yes please, that way we can be sure that it's working as expected 😀

d-uzlov and others added 4 commits April 10, 2024 18:21
@d-uzlov (author) commented Apr 10, 2024:

@thebsdbox I made sure the currently existing tests run fine for me locally, and I tried to add tests for node names.

Ideally we would want to create nodes with modified names, but kind doesn't seem to allow it, so I tried to at least test something.

For the e2e tests I modified the main manifest and the etcd manifest to use the new node name env, and then added a new e2e test that uses a manifest without this env, to check that old deployments will still work.

For the service tests I figured out I can modify the cluster a bit before running kube-vip.
I change the hostnames so that they no longer match the node names; if you run kube-vip without the new node name env/flag, it won't find local pods and will fail to assign load balancer IPs to nodes when using ExternalTrafficPolicy: Local.
I then added the env to the daemonset config accordingly.

@d-uzlov (author) commented Apr 10, 2024:

I also wanted to update the documentation, but it turns out it's in a separate repo, so I guess that can happen later, after this PR is merged.
This feature shouldn't need user configuration anyway; it should just work passively. Documentation will just help users better understand what the new env/flag means.

@Cellebyte (Collaborator):

@thebsdbox we need to make sure that new features also get properly documented; I think we already have some drift between the code and the docs.

@thebsdbox (Collaborator):

@thebsdbox we need to make sure that new features also get properly documented; I think we already have some drift between the code and the docs.

100%. My plan to separate the repos to make things easier sadly made it worse. Perhaps updating the templates to prompt for a corresponding website PR would make sense.

@thebsdbox (Collaborator):

All tests are passing, I'll have one more look over but is this ready to merge?

@d-uzlov (author) commented Apr 15, 2024:

This PR is ready to merge from my side; I don't have anything more to change.

@thebsdbox thebsdbox merged commit bd3df6b into kube-vip:main Apr 17, 2024
9 checks passed