iptables are spammed with entries in the KUBE-FIREWALL chain #82361

Closed
wjentner opened this issue Sep 5, 2019 · 31 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. sig/network Categorizes an issue or PR as relevant to SIG Network.

Comments

@wjentner

wjentner commented Sep 5, 2019

TL;DR:

iptables versions 1.8.1 and 1.8.2 have a bug causing rules of type

DROP       all  --  anywhere             anywhere             mark match 0x8000/0x8000 /* kubernetes firewall for dropping marked packets */

to be added endlessly to the KUBE-FIREWALL chain.
These iptables versions currently ship in Debian buster.

Workaround:

Change alternatives to use iptables-legacy instead of iptables-nft:

update-alternatives --set iptables /usr/sbin/iptables-legacy
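
If the node also uses IPv6, the ip6tables alternative is managed separately. A fuller sketch of the workaround (assuming Debian's standard alternative names; the ip6tables step and the kubelet restart are precautionary additions, not part of the original report):

update-alternatives --set iptables /usr/sbin/iptables-legacy
update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy
systemctl restart kubelet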

Also, be aware of: #71305

Solution:

Update the iptables package to >=1.8.3. At the time of writing, this version is not in the Debian stable package list but is available in buster-backports. No fix has been backported to 1.8.1 or 1.8.2, and a Debian bug report has already been filed.
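
For reference, pulling the fixed package from backports typically looks like the following sketch (the repository line and the -t flag follow standard Debian conventions; adjust to your own mirror and sources):

echo 'deb http://deb.debian.org/debian buster-backports main' > /etc/apt/sources.list.d/backports.list
apt update
apt install -t buster-backports iptables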

Thanks to @danwinship, the next Kubernetes version (1.17) will contain a fix for this particular KUBE-FIREWALL spamming bug. See #81517 for more details.

Original description:

What happened: Our master node is spammed with iptables entries in the KUBE-FIREWALL chain, which eventually increases the load.

Showing the first few entries:

Chain KUBE-FIREWALL (2 references)
target     prot opt source               destination
DROP       all  --  anywhere             anywhere             mark match 0x8000/0x8000 /* kubernetes firewall for dropping marked packets */
DROP       all  --  anywhere             anywhere             mark match 0x8000/0x8000 /* kubernetes firewall for dropping marked packets */
DROP       all  --  anywhere             anywhere             mark match 0x8000/0x8000 /* kubernetes firewall for dropping marked packets */
(…and so on; the same rule is repeated for every one of the thousands of entries in the chain)

Right now we have 8,576 entries; before removing them the first time we had more than 30,000. While observing iptables we saw that a new rule is added every 1-2 seconds.
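
A simple way to watch the growth rate (a convenience one-liner, not part of the original report):

watch -n 1 'iptables-save -t filter | grep -c "kubernetes firewall for dropping marked packets"'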

What you expected to happen:
No duplicate entries.

How to reproduce it (as minimally and precisely as possible): We use a single-node test environment set up with kubeadm, which has been upgraded several times.

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.1", GitCommit:"4485c6f18cee9a5d3c3b4e523bd27972b1b53892", GitTreeState:"clean", BuildDate:"2019-07-18T09:18:22Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.1", GitCommit:"4485c6f18cee9a5d3c3b4e523bd27972b1b53892", GitTreeState:"clean", BuildDate:"2019-07-18T09:09:21Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration: bare-metal
  • OS (e.g: cat /etc/os-release):
PRETTY_NAME="Debian GNU/Linux 10 (buster)"
NAME="Debian GNU/Linux"
VERSION_ID="10"
VERSION="10 (buster)"
VERSION_CODENAME=buster
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
  • Kernel (e.g. uname -a):
Linux 4.19.0-5-amd64 #1 SMP Debian 4.19.37-5+deb10u1 (2019-07-19) x86_64 GNU/Linux
  • Install tools: kubeadm
  • Network plugin and version (if this is a network-related bug):
    canal
calico/cni:v3.8.0
quay.io/coreos/flannel:v0.11.0
  • Others:
@wjentner added the kind/bug label Sep 5, 2019
@k8s-ci-robot added the needs-sig label Sep 5, 2019
@wjentner
Author

wjentner commented Sep 5, 2019

/sig network

@k8s-ci-robot added the sig/network label and removed the needs-sig label Sep 5, 2019
@cmluciano

@caseydavenport Any ideas on this one?

@athenabot

/triage unresolved

Comment /remove-triage unresolved when the issue is assessed and confirmed.

🤖 I am a bot run by vllry. 👩‍🔬

@k8s-ci-robot added the triage/unresolved label Sep 5, 2019
@vllry
Contributor

vllry commented Sep 5, 2019

/assign @danwinship

@danwinship
Contributor

What version of iptables do you have installed, and if 1.8.x, is it running in legacy or nft mode? (If it's iptables 1.8.1, try upgrading to 1.8.2 or 1.8.3.)

iptables.EnsureRule checks to see if a rule is present, and then adds it if it is not. It seems that on your system, it is always mistakenly deciding that the rule isn't actually present and so adding another copy.
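
In shell terms, EnsureRule behaves roughly like the following sketch (the real implementation is Go code in pkg/util/iptables; this is only an illustration of the check-then-add pattern):

# check whether the rule already exists; append it only if the check fails
iptables -C KUBE-FIREWALL -m mark --mark 0x8000/0x8000 -m comment --comment "kubernetes firewall for dropping marked packets" -j DROP ||
iptables -A KUBE-FIREWALL -m mark --mark 0x8000/0x8000 -m comment --comment "kubernetes firewall for dropping marked packets" -j DROP
# if -C wrongly reports "no such rule" every time, the -A runs on every sync loop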

@mnaser

mnaser commented Sep 8, 2019

I'm running into a similar issue in an environment running CentOS 7. I can provide supporting information from my side if needed. I am also using flannel, without Canal.

@vllry
Contributor

vllry commented Sep 8, 2019

@mnaser as Dan said above, the iptables version information would be helpful.

It would also help if you could check if other chains/rules are having this problem, or only the KUBE-FIREWALL chain.
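
One quick way to check that (a generic one-liner, not specific to Kubernetes) is to count repeated rules across all tables:

iptables-save | sort | uniq -c | sort -rn | head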

@bradfitz

bradfitz commented Sep 8, 2019

This is happening on 6 of my nodes, all Debian buster (3 are masters, 3 are workers), on a freshly kubeadm-bootstrapped v1.15.3 cluster with Cilium 1.6 and no kube-proxy.

# uname -a
Linux kc1b 4.19.0-5-amd64 #1 SMP Debian 4.19.37-5+deb10u2 (2019-08-08) x86_64 GNU/Linux

# lsb_release  -a
No LSB modules are available.
Distributor ID:	Debian
Description:	Debian GNU/Linux 10 (buster)
Release:	10
Codename:	buster

# dpkg -s iptables | grep ^Vers
Version: 1.8.2-4

# iptables --version
iptables v1.8.2 (nf_tables)

# iptables-save 2>&1 | uniq -c 
      1 # Generated by xtables-save v1.8.2 on Sun Sep  8 09:19:32 2019
      1 *nat
      1 :PREROUTING ACCEPT [0:0]
      1 :INPUT ACCEPT [0:0]
      1 :POSTROUTING ACCEPT [0:0]
      1 :OUTPUT ACCEPT [0:0]
      1 :KUBE-MARK-DROP - [0:0]
      1 :KUBE-MARK-MASQ - [0:0]
      1 :KUBE-POSTROUTING - [0:0]
      1 -A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
      1 -A KUBE-MARK-DROP -j MARK --set-xmark 0x8000/0x8000
      1 -A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000
   6405 -A KUBE-POSTROUTING -m mark --mark 0x4000/0x4000 -m comment --comment "kubernetes service traffic requiring SNAT" -j MASQUERADE
      1 COMMIT
      1 # Completed on Sun Sep  8 09:19:32 2019
      1 # Generated by xtables-save v1.8.2 on Sun Sep  8 09:19:32 2019
      1 *filter
      1 :INPUT ACCEPT [0:0]
      1 :FORWARD ACCEPT [0:0]
      1 :OUTPUT ACCEPT [0:0]
      1 :KUBE-FIREWALL - [0:0]
      1 -A INPUT -j KUBE-FIREWALL
      1 -A OUTPUT -j KUBE-FIREWALL
   6379 -A KUBE-FIREWALL -m mark --mark 0x8000/0x8000 -m comment --comment "kubernetes firewall for dropping marked packets" -j DROP
      1 -A KUBE-FIREWALL -m mar# Warning: iptables-legacy tables present, use iptables-legacy-save to see them
      1 k --mark 0x8000/0x8000 -m comment --comment "kubernetes firewall for dropping marked packets" -j DROP
     25 -A KUBE-FIREWALL -m mark --mark 0x8000/0x8000 -m comment --comment "kubernetes firewall for dropping marked packets" -j DROP
      1 COMMIT
      1 # Completed on Sun Sep  8 09:19:33 2019

# update-alternatives --query iptables
Name: iptables
Link: /usr/sbin/iptables
Slaves:
 iptables-restore /usr/sbin/iptables-restore
 iptables-save /usr/sbin/iptables-save
Status: auto
Best: /usr/sbin/iptables-nft
Value: /usr/sbin/iptables-nft

Alternative: /usr/sbin/iptables-legacy
Priority: 10
Slaves:
 iptables-restore /usr/sbin/iptables-legacy-restore
 iptables-save /usr/sbin/iptables-legacy-save

Alternative: /usr/sbin/iptables-nft
Priority: 20
Slaves:
 iptables-restore /usr/sbin/iptables-nft-restore
 iptables-save /usr/sbin/iptables-nft-save

Oh, and iptables-legacy-save is here: https://gist.github.com/bradfitz/cc3b1961eb03c37583b853509ba8f2df

@praseodym
Contributor

praseodym commented Sep 8, 2019

You’ll want to use iptables in legacy mode instead:

update-alternatives --set iptables /usr/sbin/iptables-legacy

#71305 has more details.

@wjentner
Author

wjentner commented Sep 9, 2019

We have the same versions as @bradfitz.

We will now try the workaround @praseodym suggested and observe further.

@danwinship
Contributor

I'm not sure how using the wrong mode would cause this, unless nft mode was implementing "iptables -C" incorrectly, which I'm pretty sure it does not in 1.8.2.

@bradfitz

bradfitz commented Sep 9, 2019

I'm not sure how using the wrong mode would cause this

@danwinship, see #71305 for the details. In a nutshell, AIUI: there is a mix of iptables binaries being run in different modes, of which your host iptables is only one.

@thockin
Member

thockin commented Sep 9, 2019

Can we close this as a dup of #71305?

@wjentner
Author

wjentner commented Sep 9, 2019

@thockin we have applied the workaround and are using iptables-legacy right now. The issue with duplicate rules has not occurred so far.

However, as far as I understand, iptables-nft should be supported in the future, and I'm not sure whether this issue is caused by using both backends (legacy & nft) at the same time or by a bug in iptables-nft.
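
For anyone else cleaning up after the switch: the duplicate rules created under the nft backend stay in the nft tables even after kubelet starts using the legacy binary. A cautious cleanup sketch (not from this thread; it assumes kubelet and kube-proxy recreate their single rules afterwards):

# flush the stale nft copies of the spammed chains, then let
# kubelet/kube-proxy repopulate their rules via the legacy backend
iptables-nft -t filter -F KUBE-FIREWALL
iptables-nft -t nat -F KUBE-POSTROUTING
systemctl restart kubelet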

@danwinship
Contributor

I am familiar with #71305. But the iptables rule in question here is only created by kubelet, so it wouldn't be affected by having kube-proxy use the wrong mode. Even if you were running containerized kubelet and kubelet was using the wrong mode, it still wouldn't fail in this way.

It looks like in iptables-nft 1.8.2, iptables -C is still broken for some rules, e.g.:

> iptables -A FOO -j MARK --set-xmark 0x8000
> iptables -C FOO -j MARK --set-xmark 0x8000 && echo exists
exists
> iptables -C FOO -j MARK --set-xmark 0x8001 && echo exists
exists

as compared with iptables-legacy or iptables-nft 1.8.3:

> iptables -C FOO -j MARK --set-xmark 0x8001 && echo exists
iptables: Bad rule (does a matching rule exist in that chain?).

Still, this is the opposite of the bug that would be needed to cause the infinite rule duplication problem... Given that switching to iptables-legacy fixed the problem, though, it does seem like this is an iptables-nft bug.

@wjentner
Author

wjentner commented Sep 9, 2019

What we observed was that all (?) cali* chains were held in iptables-legacy, whereas the KUBE-* chains and some others were in iptables-nft.

This also caused iptables-nft -L to print a warning that there are active rules in iptables-legacy.

@thockin
Member

thockin commented Sep 9, 2019

@caseydavenport @danwinship @bowei

Do we need to have a special sig-net call to strategize on this? It's suddenly feeling a LOT more urgent...

@danwinship
Contributor

danwinship commented Sep 9, 2019

OK, I was testing the wrong rule above: this does break with iptables-nft 1.8.2:

> iptables-nft -A KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -m mark -mark 0x8000 -j DROP
> iptables-nft -C KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -m mark -mark 0x8000 -j DROP && echo exists
iptables: Bad rule (does a matching rule exist in that chain?).

So if you are using iptables 1.8.1 or 1.8.2 in nft mode, you will get infinite firewall drop rules. (It is fixed in 1.8.3.) This is not a mixing-nft-and-legacy bug, it's just "iptables 1.8.1-1.8.2 has bugs that make it not work with kubernetes".
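
A quick check for whether a node is in the affected configuration (just reading the version banner):

iptables --version
# affected: prints v1.8.1 or v1.8.2 together with "(nf_tables)"
# unaffected: "(legacy)" mode, or 1.8.3 and later in either mode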

@bradfitz

bradfitz commented Sep 9, 2019

I assume you meant --mark with two hyphens?

Unfortunately Debian buster doesn't yet seem to have updated to 1.8.3 or backported the fix to 1.8.2:

# lsb_release  -a
No LSB modules are available.
Distributor ID:	Debian
Description:	Debian GNU/Linux 10 (buster)
Release:	10
Codename:	buster

# dpkg -s iptables | grep ^Version
Version: 1.8.2-4

# iptables-nft -N FOO
# iptables-nft -A FOO -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000 -j DROP
# iptables-nft -C FOO -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000 -j DROP && echo exists
iptables: Bad rule (does a matching rule exist in that chain?).

# iptables-legacy -N BAR
# iptables-legacy -A BAR -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000 -j DROP
# iptables-legacy -C BAR -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000 -j DROP && echo exists
exists

@thockin
Member

thockin commented Sep 9, 2019

So the TL;DR here is that Debian buster is simply broken for Kubernetes without switching to the iptables-legacy alternative.

@johscheuer
Contributor

@wjentner for Calico to use the nft backend you need to explicitly activate it with FELIX_IPTABLESBACKEND=NFT (see projectcalico/calico#2322 (comment)).

@athenabot

@danwinship
If this issue has been triaged, please comment /remove-triage unresolved.

If you aren't able to handle this issue, consider unassigning yourself and/or adding the help-wanted label.

🤖 I am a bot run by vllry. 👩‍🔬

@danwinship
Contributor

/remove-triage unresolved

@k8s-ci-robot removed the triage/unresolved label Sep 17, 2019
@danwinship
Contributor

@wjentner I see you have filed a bug with Debian. I don't really know the Debian process very well... do you know what's likely to happen next? Is there some reason 1.8.3 is only in "buster-backports"? Is that a temporary testing stage from which the fixed package will eventually make its way into buster?

@wjentner
Author

@danwinship I don't know either; so far I have not received any reply. I also don't know why 1.8.3 is not in the stable release yet, but to quote backports.debian.org:

Backports are packages taken from the next Debian release (called "testing"), adjusted and recompiled for usage on Debian stable. Because the package is also present in the next Debian release, you can easily upgrade your stable+backports system once the next Debian release comes out. (In a few cases, usually for security updates, backports are also created from the Debian unstable distribution.)

Backports cannot be tested as extensively as Debian stable, and backports are provided on an as-is basis, with risk of incompatibilities with other components in Debian stable. Use with care!

It is therefore recommended to only select single backported packages that fit your needs, and not use all available backports.

Source: https://backports.debian.org/

Right now we are using the iptables-legacy workaround, and so far we have not experienced any further problems.

@danwinship
Contributor

Oh, also: while iptables 1.8.2 continues to be problematic, the specific KUBE-FIREWALL spamming bug ought to be fixed in git master by #81517, which changes the way kubelet and kube-proxy monitor iptables. I was not planning to backport that PR, though, since it's not just a bugfix.

@wjentner
Author

@danwinship excellent, thank you!
Do you already know in which version this will be released?

@danwinship
Contributor

1.17

@aojea
Member

aojea commented Nov 17, 2019

/close

This should be fixed now that @danwinship's PR #82966 has been merged.

@k8s-ci-robot
Contributor

@aojea: Closing this issue.

In response to this:

/close

This should be fixed now that @danwinship's PR #82966 has been merged.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@danwinship
Contributor

This should be fixed now that @danwinship's PR #82966 has been merged.

Actually, no. #71305 was fixed by #82966. But this bug was fixed shortly pre-1.17 by #81517, so it probably should have been closed anyway. It's not fixed in 1.16 and earlier, and #81517 is large enough to be dubious as a backport. I had originally wanted to at least commit a warning to older branches, except that it's tricky to do because you can't just check the iptables version number or you'll get a false positive on RHEL 8. 🙁. I guess if we get more reports of people hitting this on older releases we can worry about doing something there.

wmfgerrit pushed a commit to wikimedia/operations-puppet that referenced this issue Jul 27, 2021
Due to [1] we need to deploy iptables from backports on Buster,
to avoid extremely long and repetitive iptables chains/rules
that affect performance.

[1] kubernetes/kubernetes#82361

Bug: T287238
Change-Id: I7c321c6988fe4a2009d50bc51bf38f1dac53137b