
Hairpin rules are not added when using IPVS with a cloudprovider enabled #30363

Closed
dkeightley opened this issue Dec 6, 2020 · 7 comments
Labels
internal · kind/bug (Issues that are defects reported by users or that we know have reached a real release) · team/hostbusters (The team that is responsible for provisioning/managing downstream clusters + K8s version support)

Comments

@dkeightley
Contributor

dkeightley commented Dec 6, 2020

What kind of request is this (question/bug/enhancement/feature request): bug

Steps to reproduce (least amount of steps as possible):

  • Create a node driver cluster with a cloud provider enabled; EC2 with the AWS cloud provider was used in this example
  • Enable IPVS, e.g.:
  services:
    kubeproxy:
      extra_args:
        ipvs-scheduler: rr
        proxy-mode: ipvs
  • Create a workload with a ClusterIP service (1 replica is best for testing)
  • The workload cannot reach itself via the ClusterIP or service DNS name using the service port (hairpin connectivity)
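A minimal workload for the repro step above might look like the following; all names, the image, and the port are illustrative, not taken from the report:

```yaml
# Hypothetical one-replica workload plus ClusterIP service for testing
# hairpin connectivity (pod curling its own service).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hairpin-test
spec:
  replicas: 1
  selector:
    matchLabels: {app: hairpin-test}
  template:
    metadata:
      labels: {app: hairpin-test}
    spec:
      containers:
      - name: nginx
        image: nginx
        ports: [{containerPort: 80}]
---
apiVersion: v1
kind: Service
metadata:
  name: hairpin-test
spec:
  type: ClusterIP
  selector: {app: hairpin-test}
  ports: [{port: 80, targetPort: 80}]
```

With one replica, a curl from the pod to the `hairpin-test` service always lands back on the same pod, which exercises exactly the hairpin path.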

Result:

  • The hairpin iptables rule is not added to the KUBE-POSTROUTING chain.
# iptables -nvL -t nat | grep POSTROUTING -A5

Chain KUBE-POSTROUTING (1 references)
 pkts bytes target     prot opt in     out     source               destination
  187 18489 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0            mark match ! 0x4000/0x4000
   88  6857 MARK       all  --  *      *       0.0.0.0/0            0.0.0.0/0            MARK xor 0x4000
   88  6857 MASQUERADE  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kubernetes service traffic requiring SNAT */
  • The following hairpin rule is expected:
 pkts bytes target     prot opt in     out     source               destination
    0     0 MASQUERADE  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* Kubernetes endpoints dst ip:port, source ip for solving hairpin purpose */ match-set KUBE-LOOP-BACK dst,dst,src
[..]
  • The ipset KUBE-LOOP-BACK table is not populated:
# docker exec kube-proxy ipset list KUBE-LOOP-BACK
Name: KUBE-LOOP-BACK
Type: hash:ip,port,ip
Revision: 5
Header: family inet hashsize 1024 maxelem 65536
Size in memory: 136
References: 0
Members: 
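The expected rule matches via the KUBE-LOOP-BACK ipset, which is keyed `ip,port,ip` and matched as `dst,dst,src`. A small sketch of those semantics, with made-up addresses:

```shell
# The set holds endpoint-dst-ip,dst-port,src-ip tuples, so it matches
# exactly the hairpin case: a packet whose source and destination are the
# same endpoint, arriving via the service. Addresses below are examples.
entry="10.42.0.5,tcp:80,10.42.0.5"            # endpoint dst ip, dst port, src ip
pkt_dst="10.42.0.5"; pkt_dport="tcp:80"; pkt_src="10.42.0.5"

if [ "$entry" = "$pkt_dst,$pkt_dport,$pkt_src" ]; then
  echo "hairpin packet matched: MASQUERADE (SNAT) applied"
fi
```

With the set empty, as in the output above, no packet can match and hairpin traffic is never SNAT'ed.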

Other details that may be helpful:

This appears to relate to the IsLocal condition not being matched because the cloud provider metadata populates a different name for the node: the mismatch between nodeName and the kubernetes.io/hostname label prevents the hairpin rule and ipset entries from being added on the nodes, as no “local” pods are detected as endpoints.
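The mismatch can be sketched as follows; the two names are examples, not values from the cluster:

```shell
# kube-proxy treats an endpoint as local only when the endpoint's node
# name equals the hostname kube-proxy itself is running under.
cloud_node_name="ip-10-0-1-23.ec2.internal"   # registered by the AWS cloud provider
os_hostname="my-cluster-node-1"               # kube-proxy's default hostname

if [ "$cloud_node_name" = "$os_hostname" ]; then
  echo "endpoint is local: hairpin ipset entry added"
else
  echo "no local endpoints detected: KUBE-LOOP-BACK stays empty"
fi
```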

Environment information

  • Rancher version (rancher/rancher or rancher/server image tag, or shown bottom left in the UI): v2.4.5
  • Installation option (single install/HA): HA

Cluster information

  • Cluster type (Hosted/Infrastructure Provider/Custom/Imported): EC2 / infrastructure
  • Kubernetes version (use kubectl version): v1.16.15

gz#11904
JIRA: SURE-2373, SURE-3284

@dkeightley added the kind/bug and internal labels Dec 6, 2020
@kinarashah
Member

kinarashah commented Jan 19, 2021

Refer to workaround kubernetes/kubernetes#71851 (comment)
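The linked workaround amounts to passing kube-proxy the same hostname that the cloud provider registered for the node, so the IsLocal check matches. A hedged sketch; the node name is an example, and the exact flag wiring depends on how kube-proxy is launched:

```shell
# Build the kube-proxy args so that hostname-override matches the
# cloud-provider node name (example value below).
node_name="ip-10-0-1-23.ec2.internal"
extra_args="--proxy-mode=ipvs --hostname-override=${node_name}"
echo "kube-proxy ${extra_args}"
```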

@kinarashah
Member

kinarashah commented Feb 1, 2022

Root cause
kubelet uses the nodename set by the AWS cloud provider, but kube-proxy doesn't. kube-proxy expects the hostname to be the same as the nodename; otherwise it doesn't set the right iptables rules.

What was fixed, or what changes have occurred

Areas or cases that should be tested

  • rke1 cluster with aws cloud provider and ipvs enabled to confirm the original issue is fixed
  • rke1 cluster without cloud provider to confirm kubelet and kube-proxy continue to get the correct hostname-override

What areas could experience regressions?

Are the repro steps accurate/minimal?
yes

@rishabhmsra
Contributor

Re-opening the issue

  • Verified on rancher v2.6.3, KDM pointing to dev-2.6

Steps followed :

  • Provisioned a single-node AWS node driver cluster (k8s v1.22.5-rancher2-2). Selected the Amazon (In-Tree) cloud provider and added the below args:
  services:
    kubeproxy:
      extra_args:
        ipvs-scheduler: rr
        proxy-mode: ipvs
  • Created a nginx deployment with a ClusterIP service (port 80).
  • Exec'd into the nginx pod and ran curl to the ClusterIP; the connection timed out.
  • Curling the nginx ClusterIP from another pod returned the correct response.

iptables output, as in the original report above:

# iptables -nvL -t nat | grep POSTROUTING -A5
--
Chain KUBE-POSTROUTING (1 references)
 pkts bytes target     prot opt in     out     source               destination         
   15   900 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0            mark match ! 0x4000/0x4000
    7   420 MARK       all  --  *      *       0.0.0.0/0            0.0.0.0/0            MARK xor 0x4000
    7   420 MASQUERADE  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kubernetes service traffic requiring SNAT */ random-fully
docker exec kube-proxy ipset list KUBE-LOOP-BACK
Name: KUBE-LOOP-BACK
Type: hash:ip,port,ip
Revision: 6
Header: family inet hashsize 1024 maxelem 65536 bucketsize 12 initval 0x36182ee0
Size in memory: 208
References: 0
Number of entries: 0
Members:

@kinarashah
Member

The issue is that rke-tools depends on the RKE fix, but Rancher 2.6.3 doesn't have this fix vendored. Looking into whether the fix can live in rke-tools alone.

@kinarashah
Member

kinarashah commented Feb 2, 2022

The command args for kube-proxy don't pass the AWS cloud provider, so there isn't a way to override the hostname-override flag without the corresponding RKE fix. Need to move this issue to v2.6.4 because that's when the RKE fix will be vendored into Rancher. cc @sowmyav27 @snasovich

Note:

@kinarashah
Member

kinarashah commented Mar 4, 2022

Fix now available to test with the latest k8s versions (which have rke-tools v0.1.79):

  • v1.23.4-rancher1-1
  • v1.22.6-rancher1-2
  • v1.21.9-rancher1-2
  • v1.19.16-rancher1-4

Can be tested on v2.6-head (which vendors RKE v1.3.4-rc8 so has the RKE fix as well).

@rishabhmsra
Contributor

Verified this on Rancher v2.6-head (4df2214), Docker install.

Case 1 : AWS cloud provider enabled

Validation steps followed :

  • Created 4 downstream EC2 node driver clusters (single node), using k8s versions v1.23.4-rancher1-1, v1.22.6-rancher1-2, v1.21.9-rancher1-2, and v1.19.16-rancher1-4. Selected the Amazon (In-Tree) cloud provider and added the below args:
  services:
    kubeproxy:
      extra_args:
        ipvs-scheduler: rr
        proxy-mode: ipvs
  • Created an nginx deployment with a ClusterIP service (port 80) in the default ns.
  • Exec'd into the nginx pod, ran curl to the ClusterIP, and got the correct response.
  • SSH'ed into the control plane node; the hairpin rule is now present:
Chain KUBE-POSTROUTING (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 MASQUERADE  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* Kubernetes endpoints dst ip:port, source ip for solving hairpin purpose */ match-set KUBE-LOOP-BACK dst,dst,src
   30  1800 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0            mark match ! 0x4000/0x4000
   11   660 MARK       all  --  *      *       0.0.0.0/0            0.0.0.0/0            MARK xor 0x4000
   11   660 MASQUERADE  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kubernetes service traffic requiring SNAT */ random-fully

Case 2 : Verify kubelet and kube-proxy continue to get the correct hostname-override if cloud provider is not selected

Validation steps followed :

  • Created 4 downstream EC2 node driver clusters (single node), using k8s versions v1.23.4-rancher1-1, v1.22.6-rancher1-2, v1.21.9-rancher1-2, and v1.19.16-rancher1-4.
  • SSH'ed into the control plane node and verified the hostname-override arg value:
ps -ef | grep -i kubelet | grep -i over
root        8619    8599  2 06:59 ?        00:09:04 kubelet --resolv-conf=/etc/resolv.conf --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin --client-ca-file=/etc/kubernetes/ssl/kube-ca.pem --hostname-override=rishabh-cluster-1-23-none1 --kubeconfig=/etc/kubernetes/ssl/kubecfg-kube-node.yaml
...
ps -ef | grep -i kube-proxy | grep -i over
root        9262    9238  0 06:59 ?        00:00:03 kube-proxy --healthz-bind-address=127.0.0.1 --v=2 --hostname-override=rishabh-cluster-1-23-none1 --kubeconfig=/etc/kubernetes/ssl/kubecfg-kube-proxy.yaml
...
kubectl get node
NAME                         STATUS   ROLES                      AGE     VERSION
rishabh-cluster-1-23-none1   Ready    controlplane,etcd,worker   5h35m   v1.23.4
  • Did the same for the other k8s clusters as well.

Result :

  • Both scenarios passed, hence closing the issue.
