
Use node name for service election and lease holder name instead of hostname #811

Merged
merged 12 commits into from
Apr 17, 2024

Conversation

@d-uzlov (Contributor) commented Apr 9, 2024

Addresses this issue:

After this change kube-vip should be able to work with any node name. I tried to remove all notions of hostnames from the code.

This PR adds a vip_nodename env and a nodeName command line argument.
kube-vip will fall back to the hostname if neither is provided.
By default, the node name is injected via the k8s downward API into all YAML deployments generated by kube-vip.
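The downward API injection described above would look roughly like this in a generated manifest (a sketch based on the standard Kubernetes downward API; the exact env name kube-vip emits is assumed to be vip_nodename as described in this PR):

```yaml
# Env entry so the pod learns the name of the node it is scheduled on,
# rather than relying on the container's hostname.
env:
  - name: vip_nodename
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName
```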

This PR also removes the logic for short-name and FQDN matching, because that is now handled by k8s. kube-vip now matches the exact node name value from the config against the exact node name value from the endpoint/endpointslice.
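The exact-match behavior described above can be sketched as follows (the type and function names are illustrative, not the actual kube-vip code):

```go
package main

import "fmt"

// endpointAddress mirrors the endpoint fields relevant here (illustrative).
type endpointAddress struct {
	NodeName string
	IP       string
}

// localEndpoints returns the IPs of endpoints whose node name exactly matches
// the configured node name -- no short-name or FQDN normalization.
func localEndpoints(configNodeName string, addrs []endpointAddress) []string {
	var ips []string
	for _, a := range addrs {
		if a.NodeName == configNodeName {
			ips = append(ips, a.IP)
		}
	}
	return ips
}

func main() {
	addrs := []endpointAddress{
		{NodeName: "node-1", IP: "10.0.0.1"},
		{NodeName: "node-2.example.com", IP: "10.0.0.2"},
	}
	fmt.Println(localEndpoints("node-1", addrs)) // exact match found
	fmt.Println(localEndpoints("node-2", addrs)) // short name no longer matches the FQDN
}
```

The point of the change is visible in the second call: "node-2" does not match "node-2.example.com", because both sides are now expected to carry the same value from the k8s node object.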

This is quite a simple change, but it affects many parts of the project.

I tested this change locally via this image: docker.io/daniluzlov/k8s-snippets:kube-vip-0.7.2-nodename2.
I tested both static pods for apiserver HA and the service load balancer, with and without service election.
However, I can only test ARP announcements; I don't have a setup to test BGP.

@d-uzlov d-uzlov requested a review from thebsdbox as a code owner April 9, 2024 11:35
Signed-off-by: Danil Uzlov <36223296+d-uzlov@users.noreply.github.com>
@@ -29,6 +29,7 @@ func findSelf(c *packngo.Client, projectID string) *packngo.Device {
// Go through devices
dev, _, _ := c.Devices.List(projectID, &packngo.ListOptions{})
for _, d := range dev {
// TODO do we need to replace os.Hostname with config.NodeName here?
me, _ := os.Hostname()
@d-uzlov (author):
Maybe here we also want to use node name as reported by k8s, but since I don't have BGP setup to test it, I left it as is, to avoid breaking stuff.
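The fallback the comment alludes to (and which the rest of the PR adopts) can be sketched like this; config.NodeName stands for the new configuration field this PR adds, and the function name is illustrative:

```go
package main

import (
	"fmt"
	"os"
)

// nodeName prefers an explicitly configured node name and falls back to the
// OS hostname when none is given, matching the behavior this PR introduces.
func nodeName(configured string) (string, error) {
	if configured != "" {
		return configured, nil
	}
	return os.Hostname()
}

func main() {
	name, err := nodeName("worker-1")
	if err != nil {
		panic(err)
	}
	fmt.Println(name) // prints "worker-1": the configured value wins over the hostname
}
```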

Comment on lines +98 to +101
if id == address.Hostname {
log.Debugf("[%s] found local endpoint - address: %s, hostname: %s", ep.label, address.IP, address.Hostname)
localEndpoints = append(localEndpoints, address.IP)
continue
@d-uzlov (author) commented Apr 9, 2024:

Here and in watch_endpointslices.go I removed short-hostname-to-node-name matching because k8s already provides the correct value.
I check address.NodeName first because the documentation states "This can be used to determine endpoints local to a node".
address.Hostname doesn't seem relevant to me, but I left it as a second check, in case it matters in some setups. Maybe it should also be removed.
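The NodeName-first, Hostname-second ordering described in the comment can be sketched as follows (the address type is a simplified stand-in for the k8s EndpointAddress, where NodeName is a pointer because it is optional):

```go
package main

import "fmt"

// address is a simplified stand-in for a k8s endpoint address (illustrative).
type address struct {
	NodeName *string // optional in the k8s API, hence a pointer
	Hostname string
	IP       string
}

// isLocal checks NodeName first (the documented way to find node-local
// endpoints) and keeps Hostname as a secondary check, as described above.
func isLocal(id string, a address) bool {
	if a.NodeName != nil && id == *a.NodeName {
		return true
	}
	return id == a.Hostname
}

func main() {
	n := "node-1"
	fmt.Println(isLocal("node-1", address{NodeName: &n, IP: "10.0.0.1"})) // matched via NodeName
	fmt.Println(isLocal("node-1", address{Hostname: "node-1"}))           // matched via Hostname fallback
}
```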

@thebsdbox (Collaborator):

Unfortunately this is going to break all existing kube-vip installations and upgrades moving forward. I think hostname detection should probably remain the default when a node name is not manually specified.

@Cellebyte (Collaborator):

I would suggest keeping the old behavior and skipping it only if your new environment variable is set. Additionally, when the code runs in control-plane mode the hostname is needed, as there is no node object yet.

pkg/manager/manager.go (review thread, outdated and resolved)
@thebsdbox (Collaborator):

This would be great, I think we'll just need to ensure that default/expected functionality remains :)

@d-uzlov (author) commented Apr 10, 2024:

Unfortunately this is going to break all existing kube-vip installations and upgrades moving forward. I think hostname detection should probably remain the default when a node name is not manually specified.

When making these changes I assumed kube-vip would be used to generate new deployment files, in which case there would be no issues during the upgrade.
But if bumping the image version in an existing pod manifest should be a supported option, I guess we need a fallback to the old behavior.

I'll push an update commit for this.

Additionally, when the code runs in control-plane mode the hostname is needed, as there is no node object yet.

What do you mean by this?
I believe the downward API in general, and the node name in particular, is available right from the start of the cluster, both for regular pods and for static pods.
I tested my image both in an existing cluster (upgrading control-plane static pods from v0.7.2 one by one) and when creating a new cluster with a kube-vip static pod bootstrap, and I didn't see any issues in either case.

@thebsdbox (Collaborator):

What do you mean by this?

When kube-vip is managing the HA VIP for the control plane (when being brought up by kubeadm), there is a particularly amusing/annoying scenario where kube-vip needs to come up "before" the API server is fully up. The reason is that kube-vip needs to have the HA address working in order for the kubeadm init process to complete successfully. Kube-vip will typically use a pod DNS spec to have 127.0.0.1 or the local address resolve to kubernetes so that the certificate SAN works. So the tl;dr is that the control plane side of things is a little bit complicated 😂
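The "pod DNS spec" trick mentioned above corresponds roughly to a hostAliases entry in the static pod manifest (a hedged sketch using the standard Kubernetes mechanism; the manifest kube-vip actually generates may differ):

```yaml
# Resolve "kubernetes" to loopback inside the static pod so the API-server
# certificate SAN check passes before the real API server is reachable.
hostAliases:
  - ip: "127.0.0.1"
    hostnames:
      - kubernetes
```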

@d-uzlov (author) commented Apr 10, 2024:

Well, kube-vip already uses a 127.0.0.1 kubernetes hosts entry; the node hostname is not involved here.
I also tested that the current setup works as is.
So I guess using the node name for leases and pod-placement checks doesn't affect this, and we don't need to handle it any differently.

@d-uzlov (author) commented Apr 10, 2024:

@thebsdbox
Should I also modify the unit/e2e tests to include the node name option, and add a description to the documentation, or would you prefer to do that yourself?

@thebsdbox (Collaborator):

@thebsdbox
Should I also modify the unit/e2e tests to include the node name option, and add a description to the documentation, or would you prefer to do that yourself?

Yes please, that way we can be sure that it's working as expected 😀

d-uzlov and others added 4 commits April 10, 2024 18:21
@d-uzlov (author) commented Apr 10, 2024:

@thebsdbox I made sure the currently existing tests run fine for me locally, and I tried to add tests for node names.

Ideally we would want to create nodes with modified names, but kind doesn't seem to allow it, so I tried to at least test something.

For the e2e tests I modified the main manifest and the etcd manifest to use the new node name env, and then added a new e2e test that uses a manifest without this env, to check that old deployments will still work.

For the service tests I figured out I can modify the cluster a bit before running kube-vip.
I change the hostnames so that they no longer match the node names; if you run kube-vip without the new node name env/flag, it won't find local pods and will fail to assign load balancer IPs to nodes when using ExternalTrafficPolicy: Local.
I then added the env to the daemonset config accordingly.

@d-uzlov (author) commented Apr 10, 2024:

I also wanted to update the documentation, but it turns out it's in a separate repo, so I guess that can happen later, after this PR is merged.
This feature shouldn't need user configuration anyway; it should just work passively. Documentation will just help users better understand what the new env/flag means.

@Cellebyte (Collaborator):

@thebsdbox we need to make sure that new features also get properly documented; I think we already have some drift between the code and the docs.

@thebsdbox (Collaborator):

@thebsdbox we need to make sure that new features also get properly documented; I think we already have some drift between the code and the docs.

100%. My plan to separate the repos to make things easier sadly made it worse. Perhaps updating the templates to prompt for a corresponding website PR would make sense.

@thebsdbox (Collaborator):

All tests are passing, I'll have one more look over but is this ready to merge?

@d-uzlov (author) commented Apr 15, 2024:

This PR is ready to merge from my side; I don't have anything more to change.

@thebsdbox thebsdbox merged commit bd3df6b into kube-vip:main Apr 17, 2024
9 checks passed