Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kube-vip depends on nodename == hostname for service election, which easily breaks #810

Closed
d-uzlov opened this issue Apr 9, 2024 · 2 comments

Comments

@d-uzlov
Copy link
Contributor

d-uzlov commented Apr 9, 2024

Describe the bug

It seems like kube-vip compares sh -c hostname to k8s node name when determining on which node a pod is located.
This works because k8s automatically assigns node name to sh -c hostname.
This doesn't work at all in case you alter node name in any way.

Kube-vip shouldn't depend on underlying hostname. Node names should only be compared to other node names.

The solution whould be to use k8s downward API: https://kubernetes.io/docs/concepts/workloads/pods/downward-api/, like suggested in a similar issue: #520 (comment)

To Reproduce

Steps to reproduce the behavior:

  1. Create single node kubeadm cluster: sudo kubeadm init --node-name my-custom-node-name
  2. Remove control plane taint: kl taint node my-custom-node-name node-role.kubernetes.io/control-plane:NoSchedule-
  3. Deploy kube-vip with service election enabled
  4. Deploy test service:
---
apiVersion: v1
kind: Service
metadata:
  name: apiserver-lb
  namespace: kube-system
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  selector:
    component: kube-apiserver
  ports:
  - port: 6443
    targetPort: 6443
  1. Check service external IP: kubectl -n kube-system get svc apiserver-lb
  2. Check connection: curl https://10.3.7.3:6443 --insecure

Expected behavior

External IP is assugned.
Curl is able to connect to the service via the loadbalancer external IP.

Actual behavior

apiserver-lb gets the kube-vip.io/loadbalancerIPs annotation but its external IP is forever pending.
status.loadBalancer.ingress is never assigned.
curl fails with Connection timed out.

Environment (please complete the following information):

  • Kubernetes Version: 1.29.3
  • Kube-vip Version: 0.7.2

Kube-vip.yaml:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app.kubernetes.io/name: kube-vip-ds
    app.kubernetes.io/version: v0.7.2
  name: kube-vip-ds
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: kube-vip-ds
  template:
    metadata:
      labels:
        app.kubernetes.io/name: kube-vip-ds
        app.kubernetes.io/version: v0.7.2
    spec:
      containers:
      - args:
        - manager
        env:
        - name: vip_arp
          value: "true"
        - name: port
          value: "6443"
        - name: vip_cidr
          value: "32"
        - name: dns_mode
          value: first
        - name: svc_enable
          value: "true"
        - name: svc_leasename
          value: plndr-svcs-lock
        - name: svc_election
          value: "true"
        - name: vip_address
        - name: prometheus_server
          value: :2113
        image: ghcr.io/kube-vip/kube-vip:v0.7.2
        imagePullPolicy: Always
        name: kube-vip
        resources: {}
        securityContext:
          capabilities:
            add:
            - NET_ADMIN
            - NET_RAW
      hostNetwork: true
      serviceAccountName: kube-vip
  updateStrategy: {}
status:
  currentNumberScheduled: 0
  desiredNumberScheduled: 0
  numberMisscheduled: 0
  numberReady: 0

Additional context

There are a few issues that are likely related to this one:

@d-uzlov
Copy link
Contributor Author

d-uzlov commented Apr 9, 2024

Currently kube-vip attempts to handle FQDN hostname cases. For example, here:

// 1. Compare the hostname on the endpoint to the hostname
// 2. Compare the nodename on the endpoint to the hostname
// 3. Drop the FQDN to a shortname and compare to the nodename on the endpoint
// 1. Compare the Hostname first (should be FQDN)
log.Debugf("[%s] processing endpoint [%s]", ep.label, ep.endpoints.Subsets[subset].Addresses[address].IP)
if id == ep.endpoints.Subsets[subset].Addresses[address].Hostname {
log.Debugf("[%s] found local endpoint - address: %s, hostname: %s",
ep.label, ep.endpoints.Subsets[subset].Addresses[address].IP, ep.endpoints.Subsets[subset].Addresses[address].Hostname)
localEndpoints = append(localEndpoints, ep.endpoints.Subsets[subset].Addresses[address].IP)
} else {
// 2. Compare the Nodename (from testing could be FQDN or short)
if ep.endpoints.Subsets[subset].Addresses[address].NodeName != nil {
if id == *ep.endpoints.Subsets[subset].Addresses[address].NodeName {
log.Debugf("[%s] found local endpoint - address: %s, hostname: %s, node: %s",
ep.label, ep.endpoints.Subsets[subset].Addresses[address].IP, ep.endpoints.Subsets[subset].Addresses[address].Hostname,
*ep.endpoints.Subsets[subset].Addresses[address].NodeName)
localEndpoints = append(localEndpoints, ep.endpoints.Subsets[subset].Addresses[address].IP)
} else if shortnameErr == nil && shortname == *ep.endpoints.Subsets[subset].Addresses[address].NodeName {
log.Debugf("[%s] found local endpoint - address: %s, shortname: %s, node: %s",
ep.label, ep.endpoints.Subsets[subset].Addresses[address].IP, shortname, *ep.endpoints.Subsets[subset].Addresses[address].NodeName)
localEndpoints = append(localEndpoints, ep.endpoints.Subsets[subset].Addresses[address].IP)
}
}
}

// 1. Compare the hostname on the endpoint to the hostname
// 2. Compare the nodename on the endpoint to the hostname
// 3. Drop the FQDN to a shortname and compare to the nodename on the endpoint
// 1. Compare the Hostname first (should be FQDN)
log.Debugf("[%s] processing endpoint [%s]", ep.label, ep.endpoints.Endpoints[i].Addresses[j])
if ep.endpoints.Endpoints[i].Hostname != nil && id == *ep.endpoints.Endpoints[i].Hostname {
if *ep.endpoints.Endpoints[i].Conditions.Serving {
log.Debugf("[%s] found endpoint - address: %s, hostname: %s", ep.label, ep.endpoints.Endpoints[i].Addresses[j], *ep.endpoints.Endpoints[i].Hostname)
localEndpoints = append(localEndpoints, ep.endpoints.Endpoints[i].Addresses[j])
}
} else {
// 2. Compare the Nodename (from testing could be FQDN or short)
if ep.endpoints.Endpoints[i].NodeName != nil {
if id == *ep.endpoints.Endpoints[i].NodeName && *ep.endpoints.Endpoints[i].Conditions.Serving {
if ep.endpoints.Endpoints[i].Hostname != nil {
log.Debugf("[%s] found endpoint - address: %s, hostname: %s, node: %s", ep.label, ep.endpoints.Endpoints[i].Addresses[j], *ep.endpoints.Endpoints[i].Hostname, *ep.endpoints.Endpoints[i].NodeName)
} else {
log.Debugf("[%s] found endpoint - address: %s, node: %s", ep.label, ep.endpoints.Endpoints[i].Addresses[j], *ep.endpoints.Endpoints[i].NodeName)
}
localEndpoints = append(localEndpoints, ep.endpoints.Endpoints[i].Addresses[j])
// 3. Compare to shortname
} else if shortnameErr != nil && shortname == *ep.endpoints.Endpoints[i].NodeName && *ep.endpoints.Endpoints[i].Conditions.Serving {
log.Debugf("[%s] found endpoint - address: %s, shortname: %s, node: %s", ep.label, ep.endpoints.Endpoints[i].Addresses[j], shortname, *ep.endpoints.Endpoints[i].NodeName)
localEndpoints = append(localEndpoints, ep.endpoints.Endpoints[i].Addresses[j])
}
}
}

But it still doesn't work because in other places it doesn't:

  1. Here id goes straight into watchEndpoint, which could handle FQDN names but it doesn't because id = os.Hostname() is already a short name)

    id, err := os.Hostname()

    if err = sm.watchEndpoint(activeServiceLoadBalancer[string(svc.UID)], id, svc, &wg, provider); err != nil {

  2. Here hostname := os.Hostname() is a short name, but kubernetes.io/hostname can be set to either FQDN or an arbitrary custom string:

    hostname, err := os.Hostname()

    labelSelector := metav1.LabelSelector{MatchLabels: map[string]string{"kubernetes.io/hostname": hostname}}

@d-uzlov
Copy link
Contributor Author

d-uzlov commented May 3, 2024

Now that v0.8.0 is released I think we can close this.

@d-uzlov d-uzlov closed this as completed May 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant