Restarting iptables kube-proxier causes connections to fail #75360

Closed
PaulFurtado opened this issue Mar 14, 2019 · 11 comments · Fixed by #76109
Labels
kind/bug Categorizes issue or PR as related to a bug. sig/network Categorizes an issue or PR as relevant to SIG Network.

Comments

@PaulFurtado

What happened:
When kube-proxy in iptables mode is restarted, connections time out for several seconds because, on startup, it flushes the KUBE-SERVICES nat chain and several others.

What you expected to happen:
Restarting kube-proxy should not impact traffic in any way.

How to reproduce it (as minimally and precisely as possible):

  1. Run an HTTP server behind a ClusterIP service
  2. Run an http client in a loop making requests to the HTTP server
  3. kill -15 the kube-proxy process on the client node
  4. Connections time out for a few seconds until kube-proxy re-syncs the iptables rules (see the sketch below)
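
As a concrete sketch of steps 2–4 (the service name and port below are placeholders, not taken from this report):

CLUSTER_IP=$(kubectl get svc my-http-service -o jsonpath='{.spec.clusterIP}')  # placeholder service name
while true; do
  curl -sS -o /dev/null --max-time 2 "http://${CLUSTER_IP}:80/" || echo "$(date +%T) request failed"
  sleep 0.5
done
# In a second shell on the same node: kill -15 "$(pidof kube-proxy)"
# Failures should keep appearing until kube-proxy finishes its first iptables sync.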

Anything else we need to know?:
Running with -v=6 makes it pretty clear what's happening:

I0314 05:35:28.808656       7 server_others.go:174] Tearing down inactive rules.
I0314 05:35:28.808673       7 iptables.go:419] running iptables -C [OUTPUT -t nat -m comment --comment handle ClusterIPs; NOTE: this must be before the NodePort rules -j KUBE-PORTALS-HOST]
I0314 05:35:28.812934       7 iptables.go:419] running iptables -C [PREROUTING -t nat -m comment --comment handle ClusterIPs; NOTE: this must be before the NodePort rules -j KUBE-PORTALS-CONTAINER]
I0314 05:35:28.816683       7 iptables.go:419] running iptables -C [OUTPUT -t nat -m addrtype --dst-type LOCAL -m comment --comment handle service NodePorts; NOTE: this must be the last rule in the chain -j KUBE-NODEPORT-HOST]
I0314 05:35:28.821715       7 iptables.go:419] running iptables -C [PREROUTING -t nat -m addrtype --dst-type LOCAL -m comment --comment handle service NodePorts; NOTE: this must be the last rule in the chain -j KUBE-NODEPORT-CONTAINER]
I0314 05:35:28.825080       7 iptables.go:419] running iptables -C [INPUT -t filter -m comment --comment Ensure that non-local NodePort traffic can flow -j KUBE-NODEPORT-NON-LOCAL]
I0314 05:35:28.826834       7 iptables.go:419] running iptables -F [KUBE-PORTALS-CONTAINER -t nat]
I0314 05:35:28.830456       7 iptables.go:419] running iptables -F [KUBE-PORTALS-HOST -t nat]
I0314 05:35:28.833272       7 iptables.go:419] running iptables -F [KUBE-NODEPORT-HOST -t nat]
I0314 05:35:28.836361       7 iptables.go:419] running iptables -F [KUBE-NODEPORT-CONTAINER -t nat]
I0314 05:35:28.839093       7 iptables.go:419] running iptables -F [KUBE-NODEPORT-NON-LOCAL -t filter]
I0314 05:35:28.840088       7 iptables.go:419] running iptables -C [OUTPUT -t nat -m comment --comment kubernetes service portals -j KUBE-SERVICES]
I0314 05:35:28.843330       7 iptables.go:419] running iptables -D [OUTPUT -t nat -m comment --comment kubernetes service portals -j KUBE-SERVICES]
I0314 05:35:28.847852       7 iptables.go:419] running iptables -C [PREROUTING -t nat -m comment --comment kubernetes service portals -j KUBE-SERVICES]
I0314 05:35:28.850976       7 iptables.go:419] running iptables -D [PREROUTING -t nat -m comment --comment kubernetes service portals -j KUBE-SERVICES]
I0314 05:35:28.855081       7 iptables.go:419] running iptables -C [POSTROUTING -t nat -m comment --comment kubernetes postrouting rules -j KUBE-POSTROUTING]
I0314 05:35:28.857992       7 iptables.go:419] running iptables -D [POSTROUTING -t nat -m comment --comment kubernetes postrouting rules -j KUBE-POSTROUTING]
I0314 05:35:28.862013       7 iptables.go:419] running iptables -F [KUBE-SERVICES -t nat]
I0314 05:35:28.865648       7 iptables.go:419] running iptables -X [KUBE-SERVICES -t nat]
I0314 05:35:28.869143       7 iptables.go:419] running iptables -F [KUBE-POSTROUTING -t nat]
I0314 05:35:28.872606       7 iptables.go:419] running iptables -X [KUBE-POSTROUTING -t nat]
I0314 05:35:29.159612       7 server.go:444] Version: v1.10.11

It says it is tearing down inactive rules, but these chains are crucial to the iptables proxier. You can trivially reproduce the impact by running:

iptables -F KUBE-SERVICES -t nat
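
After the flush, the chain is empty and ClusterIP traffic from that node has no DNAT rule to match, which you can confirm along these lines (commands are illustrative, not from the original report):

iptables -t nat -S KUBE-SERVICES          # only the -N KUBE-SERVICES line remains after the flush
curl --max-time 2 http://<cluster-ip>/    # typically hangs/times out until kube-proxy re-syncs the chain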

Environment:

  • Kubernetes version (use kubectl version): 1.10.11
  • Cloud provider or hardware configuration: amazon
  • OS (e.g: cat /etc/os-release): custom
  • Kernel (e.g. uname -a): 4.14.77-hs623.el6.x86_64
  • Install tools: custom
  • Others: iptables 1.6.2

We're running kube-proxy 1.10.11, but I've confirmed this issue with kube-proxy 1.13.4 as well.

@PaulFurtado PaulFurtado added the kind/bug Categorizes issue or PR as related to a bug. label Mar 14, 2019
@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Mar 14, 2019
@PaulFurtado
Author

@kubernetes/sig-network-bugs

@k8s-ci-robot k8s-ci-robot added sig/network Categorizes an issue or PR as relevant to SIG Network. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Mar 14, 2019
@k8s-ci-robot
Contributor

@PaulFurtado: Reiterating the mentions to trigger a notification:
@kubernetes/sig-network-bugs

In response to this:

@kubernetes/sig-network-bugs

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@andrewsykim
Member

andrewsykim commented Mar 14, 2019

Possibly a bug in the proxy. It flushes iptables rules if the proxier detects that IPVS can be used, which seems like the wrong behavior to me if there are tables/chains shared between the IPVS and iptables proxiers. One workaround I can think of right now is to unload the ipvs kernel module so that check is skipped.
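
For anyone trying that workaround, a minimal sketch (the exact set of IPVS modules loaded will vary per system):

lsmod | grep '^ip_vs'                          # see which IPVS modules are currently loaded
modprobe -r ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs  # unload them; fails if something else still uses IPVS

As the follow-up below notes, this alone may not be enough, since kube-proxy can reload the module via modprobe.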

@andrewsykim
Member

/assign

@PaulFurtado
Author

@andrewsykim oh, that's an interesting workaround, I'll give that a shot, thanks!

@andrewsykim
Member

np, I'll try to work on the bug fix in the meantime!

@andrewsykim
Member

/triage unresolved

@k8s-ci-robot k8s-ci-robot added the triage/unresolved Indicates an issue that can not or will not be resolved. label Mar 14, 2019
@PaulFurtado
Author

PaulFurtado commented Mar 15, 2019

@andrewsykim note that unloading the ip_vs module is not actually enough of a workaround, because kube-proxy will load the kernel module itself if it sees that it is available. If I unload the ip_vs module and then rename ip_vs.ko to something else in /lib/modules, then it does the right thing.

Dug slightly further: kube-proxy probes for modules by actually running modprobe. So the simplest hack that lets us keep ip_vs loaded for other things on the system is to put a modprobe script on its PATH that always exits 1 for the ip_vs modules. (We can stop mounting the modules dir into the kube-proxy container in our Kubernetes clusters, but we also run kube-proxy on non-Kubernetes nodes via the init system, so the PATH hack works well enough there.) This should hold us over until we get to a version with your fix in it.
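
A sketch of such a shim (the path to the real modprobe and the matching logic are illustrative, not the exact script from this report):

#!/bin/sh
# Hypothetical modprobe shim placed ahead of the real modprobe on kube-proxy's PATH.
# It reports failure for any ip_vs-related module so kube-proxy falls back to the
# iptables proxier (and skips the IPVS cleanup); everything else is delegated.
case "$*" in
  *ip_vs*) exit 1 ;;
  *)       exec /sbin/modprobe "$@" ;;
esac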

@andrewsykim
Member

Good to know, thanks for sharing!

@thockin thockin removed the triage/unresolved Indicates an issue that can not or will not be resolved. label Mar 21, 2019
@andrewsykim
Member

/assign @vllry

@andrewsykim
Member

andrewsykim commented Mar 21, 2019

Quick update on this issue from today's SIG Network call: we're going to try to get rid of the automatic proxy cleanup altogether for v1.14.1, since this is considered a bug. @vllry is working on the KEP & implementation.
