Switch kube-proxy from iptables mode to ipvs mode #321

Closed

dghubble (Member) commented Oct 18, 2018

Kubernetes v1.11 marked kube-proxy IPVS mode as GA. Evaluate its readiness.

dghubble (Member, Author) commented Oct 23, 2018

I've found a number of issues with kube-proxy IPVS mode. It is not yet ready for use (I will note IPVS itself has been ready since 2004).

I've reproduced these issues across several cloud providers and CNI providers (flannel and Calico), which suggests the cause may be the underlying iptables rules colliding in some way when kube-proxy runs in IPVS mode. Even in IPVS mode, kube-proxy still manages iptables rules.

Service IPs inaccessible

Cluster components cannot reach services via their service IP from some nodes (seemingly, those running a pod with a hostPort). The kube-apiserver service IP, 10.3.0.1, is just one example, but it is the first sign of trouble.

Service IP unavailable from pods on workers (e.g. flannel, nginx-ingress, calico).

# kube-state-metrics
F1023 01:03:05.022783       1 main.go:112] Failed to create client: ERROR communicating with apiserver: Get https://10.3.0.1:443/version?timeout=32s: dial tcp 10.3.0.1:443: connect: connection refused

Service IP unavailable from worker nodes.

curl https://10.3.0.1:443
curl: (7) Failed to connect to 10.3.0.1 port 443: Connection refused

Notably, the kube-apiserver pod (i.e. IPVS real server) is routable.

Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  10.3.0.1:443 rr
  -> 10.132.46.207:6443           Masq    1      1          0

curl -k https://10.132.46.207:6443  -> JSON 401 Unauthorized (expected)
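
The comparison above (service IP refused, real server reachable) can be scripted for repeated checks across nodes. A minimal sketch, assuming Python 3 is available on the node; `probe` is a hypothetical helper, not part of any Kubernetes tooling, and the addresses are the ones from the output above.

```python
import errno
import socket


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt a single TCP connect and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Matches curl's "(7) Failed to connect ... Connection refused".
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        # Any other failure, e.g. EHOSTUNREACH or ENETUNREACH.
        return f"error: {errno.errorcode.get(exc.errno, exc.errno)}"


def report(targets: dict) -> dict:
    """Probe each (host, port) pair and return a name -> state mapping."""
    return {name: probe(host, port) for name, (host, port) in targets.items()}


# Addresses taken from the output above; adjust for your cluster.
TARGETS = {
    "service IP": ("10.3.0.1", 443),
    "real server": ("10.132.46.207", 6443),
}
```

On an affected worker, `report(TARGETS)` would be expected to show "refused" for the service IP and "open" for the real server, matching the curl results above.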

Observations:

This seems to be related to pods with hostPort 80 and 443 running on the affected nodes (e.g. nginx-ingress). After removing the nginx-ingress deployment, those hosts (and the pods on them) can access the apiserver via the service IP.

Issues:

From Google Groups:

On further investigation, this turns out to be basically a bug in the
CNI HostPort plugin; it was assuming that "--dst-type LOCAL" would cause
it to only receive packets addressed for the node's "real" IP addresses,
but at least with some CNI plugins, it ends up matching service IPs as
well, and so the HostPort processing was intercepting packets that it
didn't want. If this bug wasn't there then it wouldn't matter what order
the kube-proxy and CNI rules were added in because they wouldn't ever
match the same packets.
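
To illustrate the overlap described in the quote, here is a toy model of a `--dst-type LOCAL` match. It assumes (this is not stated in the quote) that the service IP counts as local because kube-proxy IPVS mode binds service addresses to the kube-ipvs0 dummy interface. The function names and the node address 192.0.2.10 are hypothetical, and the real addrtype match consults the kernel routing table rather than a plain list.

```python
import ipaddress
from typing import Iterable


def dst_type_local(dst: str, local_addrs: Iterable[str]) -> bool:
    """Toy stand-in for `-m addrtype --dst-type LOCAL`: true when dst
    equals any address assigned to an interface on this node."""
    target = ipaddress.ip_address(dst)
    return any(target == ipaddress.ip_address(a) for a in local_addrs)


def hostport_rule_matches(dst: str, dport: int,
                          local_addrs: Iterable[str],
                          hostports: Iterable[int]) -> bool:
    """Model of a HostPort rule: dst-type LOCAL and dport is a hostPort."""
    return dport in set(hostports) and dst_type_local(dst, local_addrs)


# Hypothetical node: one "real" address plus the service IP on kube-ipvs0.
NODE_ADDRS = ["192.0.2.10"]
KUBE_IPVS0_ADDRS = ["10.3.0.1"]
HOSTPORTS = [80, 443]  # e.g. nginx-ingress

ALL_LOCAL = NODE_ADDRS + KUBE_IPVS0_ADDRS

# Intended match: traffic to the node's own address on a hostPort.
intended = hostport_rule_matches("192.0.2.10", 443, ALL_LOCAL, HOSTPORTS)
# Unintended match: traffic to the service IP on the same port.
unintended = hostport_rule_matches("10.3.0.1", 443, ALL_LOCAL, HOSTPORTS)
# Without the dummy-interface binding, the service IP is not local.
without_binding = hostport_rule_matches("10.3.0.1", 443, NODE_ADDRS, HOSTPORTS)
```

In this model both the intended and the unintended packet match, which is consistent with service IP traffic on ports 80 and 443 being intercepted on nodes that run nginx-ingress.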

NodePort no longer accessible from node localhost

kube-proxy IPVS mode continues to use iptables to set up NodePort Services. With kube-proxy in iptables mode, a NodePort service was accessible from the node itself via localhost, but it no longer is in IPVS mode.

ssh core@NODE
curl 127.0.0.1:30010  (no longer works with IPVS mode)
curl publicIP:30010    (still works)

Notably, kube-proxy logs show the NodePort is known.

# kube-proxy logs
I1022 02:22:31.581300       1 proxier.go:1657] Opened local port "nodePort for default/hello:http" (:30010/tcp)

Netstat shows the bound address on the host.

tcp6       0      0 :::30010                :::*                    LISTEN      -

It's a change in kube-proxy behavior, so upstream probably ought to be informed.

kube-proxy spurious logs removing IPv6 address

kube-proxy in IPVS mode tries to delete the IPv6 address on the interface it creates. The IP parsing and address deletion logic is suspect.

E1023 01:32:19.750072       1 proxier.go:1591] Failed to unbind service addr fe80::d468:98ff:fe01:627f from dummy interface kube-ipvs0: error unbind address: fe80::d468:98ff:fe01:627f from interface: kube-ipvs0, err: cannot assign requested address
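
One possible way to avoid this (a sketch only, and not necessarily how upstream will resolve it) is to skip link-local addresses when deciding what to unbind, since the kernel typically assigns those to an interface automatically. `addrs_to_unbind` and the stale address 10.3.0.99 are hypothetical; kube-proxy itself is written in Go, and Python is used here only for brevity.

```python
import ipaddress
from typing import Iterable, List


def addrs_to_unbind(bound: Iterable[str], wanted: Iterable[str]) -> List[str]:
    """Return addresses bound to the dummy interface that are no longer
    wanted service IPs, ignoring link-local addresses entirely."""
    wanted_set = {ipaddress.ip_address(a) for a in wanted}
    stale = []
    for raw in bound:
        addr = ipaddress.ip_address(raw)
        if addr.is_link_local:
            # e.g. fe80::/10: not a service address, so leave it alone.
            continue
        if addr not in wanted_set:
            stale.append(str(addr))
    return stale


# Addresses currently on kube-ipvs0 (10.3.0.99 is a made-up stale service IP).
bound_addrs = ["10.3.0.1", "10.3.0.99", "fe80::d468:98ff:fe01:627f"]
# Service IPs that should remain bound.
wanted_addrs = ["10.3.0.1"]

stale_addrs = addrs_to_unbind(bound_addrs, wanted_addrs)
```

With this filter, the fe80:: address from the log line above is never passed to the unbind call, so the error would not be logged.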

kube-ipvs0 interface always has MTU 1500

kube-proxy IPVS mode always creates a kube-ipvs0 interface with MTU 1500. I don't have clear evidence this causes problems, but the correct behavior is probably to detect the MTU of the default route's interface.
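
Detecting that MTU is straightforward on Linux. A minimal sketch, assuming an IPv4 default route listed in /proc/net/route and the interface MTU exposed under /sys/class/net; the function names are hypothetical and this is not kube-proxy code.

```python
from pathlib import Path
from typing import Optional


def default_route_iface(route_table: str) -> Optional[str]:
    """Pick the interface of the lowest-metric IPv4 default route from
    text in /proc/net/route format (first line is the column header)."""
    best = None
    for line in route_table.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 8:
            continue
        iface, dest, metric, mask = fields[0], fields[1], fields[6], fields[7]
        # A default route has destination 0.0.0.0 and netmask 0.0.0.0.
        if dest == "00000000" and mask == "00000000":
            if best is None or int(metric) < best[0]:
                best = (int(metric), iface)
    return best[1] if best else None


def iface_mtu(iface: str, sysfs: Path = Path("/sys/class/net")) -> int:
    """Read an interface's MTU from sysfs."""
    return int((sysfs / iface / "mtu").read_text().strip())


def detect_default_mtu(fallback: int = 1500) -> int:
    """MTU of the default route's interface, or the fallback if unknown."""
    try:
        iface = default_route_iface(Path("/proc/net/route").read_text())
        return iface_mtu(iface) if iface else fallback
    except (OSError, ValueError):
        return fallback
```

The value returned by `detect_default_mtu()` could then be applied to kube-ipvs0 instead of the fixed 1500.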

dghubble (Member, Author) commented Oct 23, 2018

Filed the "kube-proxy spurious logs removing IPv6 address" issue and found the source of the problem (though there are several possible approaches to a fix). kubernetes/kubernetes#70113

elemental-lf added a commit to elemental-lf/terraform-render-bootstrap that referenced this pull request Oct 25, 2018
@dghubble dghubble closed this Feb 16, 2019
dghubble added a commit to poseidon/terraform-render-bootstrap that referenced this pull request Oct 7, 2019
* Kubernetes v1.11 considered kube-proxy IPVS mode GA
* Many problems were found poseidon/typhoon#321
* Since then, major blockers seem to have been addressed
dghubble added a commit that referenced this pull request Oct 7, 2019
dghubble added a commit that referenced this pull request Oct 15, 2019
dghubble added a commit to poseidon/terraform-render-bootstrap that referenced this pull request Oct 16, 2019
dghubble added a commit that referenced this pull request Oct 16, 2019
@dghubble dghubble deleted the ipvs branch October 16, 2019 06:26
dghubble added a commit that referenced this pull request Oct 16, 2019
dghubble added a commit that referenced this pull request Oct 27, 2019
Snaipe pushed a commit to aristanetworks/monsoon that referenced this pull request Apr 13, 2023