option to bump iptables to ≥1.8.8 #8403

Closed
defo89 opened this issue Jan 11, 2024 · 11 comments · Fixed by #8416

Comments

@defo89

defo89 commented Jan 11, 2024

Current Behavior

Due to a custom rule, calico/node fails to execute the iptables-nft-save command and fails its readiness probe:

2024-01-11T12:49:55.774618460Z 2024-01-11 12:49:55.774 [ERROR][8003] felix/table.go 881: iptables-save failed because there are incompatible nft rules in the table.  Remove the nft rules to continue. ipVersion=0x4 table="raw"
2024-01-11T12:49:55.774628361Z 2024-01-11 12:49:55.774 [WARNING][8003] felix/table.go 830: Killing iptables-nft-save process after a failure error=iptables-save failed because there are incompatible nft rules in the table
2024-01-11T12:49:55.774737690Z 2024-01-11 12:49:55.774 [WARNING][8003] felix/table.go 840: iptables save failed error=signal: killed
2024-01-11T12:49:55.774752061Z 2024-01-11 12:49:55.774 [WARNING][8003] felix/table.go 778: iptables-nft-save command failed error=iptables-save failed because there are incompatible nft rules in the table ipVersion=0x4 stderr="" table="raw"

My assumption is that this is due to the output of the iptables-nft-save -t raw command in the calico/node pod:

# iptables-nft-save -t raw
# Table `raw' is incompatible, use 'nft' tool.
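
As a rough illustration of the failure mode, the marker line above could be detected in the save output along these lines. This is only a sketch, not Felix's actual logic in table.go; the function name `incompatible_tables` is made up for the example, and the marker text is copied from the output shown here.

```python
import re

# Marker printed by the older iptables-nft-save when a table holds rules it
# cannot translate. The wording is copied from the output quoted above.
INCOMPATIBLE_RE = re.compile(
    r"^# Table `(?P<table>[^']+)' is incompatible, use 'nft' tool\.$",
    re.MULTILINE,
)


def incompatible_tables(save_output):
    """Return the names of tables that the save output reports as incompatible."""
    return [m.group("table") for m in INCOMPATIBLE_RE.finditer(save_output)]


sample = "# Table `raw' is incompatible, use 'nft' tool.\n"
print(incompatible_tables(sample))
```

An empty list means the dump contains no such marker, which is what iptables 1.8.8 produces for the same ruleset in this report.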

Possible Solution

Bump iptables to ≥1.8.8. With these versions I am able to run iptables-nft-save -t raw in a host-network pod and get the same rules output as on the host.

Steps to Reproduce (for bugs)

calico/node Pod:

# iptables-nft --version
iptables v1.8.4 (nf_tables)

OS:

# iptables-nft --version
iptables v1.8.8 (nf_tables)
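
To make the version comparison explicit, here is a small sketch that parses the banner printed by `iptables --version` and checks it against 1.8.8. The helper names and the `MIN_VERSION` constant are hypothetical; only the two banner strings come from this report.

```python
import re

# Matches banners such as "iptables v1.8.4 (nf_tables)".
VERSION_RE = re.compile(r"v(\d+)\.(\d+)\.(\d+)(?:\s+\((\w+)\))?")

# Threshold proposed in this issue.
MIN_VERSION = (1, 8, 8)


def parse_iptables_version(banner):
    """Return ((major, minor, patch), backend); backend is None if not printed."""
    m = VERSION_RE.search(banner)
    if m is None:
        raise ValueError("unrecognised iptables banner: %r" % banner)
    major, minor, patch, backend = m.groups()
    return (int(major), int(minor), int(patch)), backend


def meets_minimum(banner):
    """Compare numerically as a tuple, so 1.8.10 sorts after 1.8.8."""
    version, _backend = parse_iptables_version(banner)
    return version >= MIN_VERSION


print(meets_minimum("iptables v1.8.4 (nf_tables)"))  # calico/node pod
print(meets_minimum("iptables v1.8.8 (nf_tables)"))  # host OS
```

Comparing integer tuples rather than strings avoids misordering versions once a component reaches two digits.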

Context

The rule in question is added by another daemon (for path MTU discovery):

iptables-nft -t raw -I PREROUTING -i <uplink> -p icmp -m icmp --icmp-type 3/4 -j NFLOG --nflog-group 33
# iptables-nft-save -t raw
# Generated by iptables-nft-save v1.8.8 (nf_tables) on Thu Jan 11 13:04:51 2024
*raw
:PREROUTING ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A PREROUTING -i <uplink> -p icmp -m icmp --icmp-type 3/4 -j NFLOG --nflog-group 33
COMMIT
# Completed on Thu Jan 11 13:04:51 2024

Your Environment

  • Calico version: v3.27.0
  • Orchestrator version (e.g. kubernetes, mesos, rkt): k8s v1.28.4
  • Operating System and version (the lab runs Alpha, but Flatcar Beta/Stable have the same iptables version):
# cat /etc/os-release
NAME="Flatcar Container Linux by Kinvolk"
ID=flatcar
ID_LIKE=coreos
VERSION=3815.0.0
VERSION_ID=3815.0.0
BUILD_ID=2023-12-12-0332
SYSEXT_LEVEL=1.0
PRETTY_NAME="Flatcar Container Linux by Kinvolk 3815.0.0 (Oklo)"
@cyclinder
Contributor

cyclinder commented Jan 12, 2024

The latest release of kube-proxy uses iptables v1.8.9, so I believe bumping iptables to v1.8.9 makes sense.

root@10-20-1-20:~# kubectl  exec -it  -n kube-system kube-proxy-hxkrd bash
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
error: Internal error occurred: error executing command in container: failed to exec in container: failed to start exec "981dab023a784a4694230dcd3becc8596f77b285843b220fdab80c0636fd4d55": OCI runtime exec failed: exec failed: unable to start container process: exec: "bash": executable file not found in $PATH: unknown
root@10-20-1-20:~# kubectl  exec -it  -n kube-system kube-proxy-hxkrd sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
# iptables -V
iptables v1.8.9 (nf_tables)
# exit
root@10-20-1-20:~# kubectl version
Client Version: v1.28.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.0

I would like to work on this.

/cc @caseydavenport

@defo89
Author

defo89 commented Jan 16, 2024

I have experimented a bit with node/Dockerfile.amd64 and was able to confirm that the error does not occur with iptables ≥1.8.8. This is of course not a full fix, since the legacy case is not covered.

diff --git a/node/Dockerfile.amd64 b/node/Dockerfile.amd64
index b3f47e7b7..9941ba9f1 100644
--- a/node/Dockerfile.amd64
+++ b/node/Dockerfile.amd64
@@ -13,7 +13,7 @@
 # limitations under the License.
 ARG ARCH=x86_64
 ARG GIT_VERSION=unknown
-ARG IPTABLES_VER=1.8.4-17
+ARG IPTABLES_VER=1.8.8-6
 ARG LIBNFTNL_VER=1.1.5-4
 ARG IPSET_VER=7.11-6
 ARG RUNIT_VER=2.1.2
@@ -35,7 +35,7 @@ ARG IPSET_VER
 ARG RUNIT_VER
 ARG CENTOS_MIRROR_BASE_URL=http://linuxsoft.cern.ch/centos-vault/8.4.2105
 ARG LIBNFTNL_SOURCERPM_URL=${CENTOS_MIRROR_BASE_URL}/BaseOS/Source/SPackages/libnftnl-${LIBNFTNL_VER}.el8.src.rpm
-ARG IPTABLES_SOURCERPM_URL=${CENTOS_MIRROR_BASE_URL}/BaseOS/Source/SPackages/iptables-${IPTABLES_VER}.el8.src.rpm
+ARG IPTABLES_SOURCERPM_URL=https://iad.mirror.rackspace.com/centos-stream/9-stream/BaseOS/source/tree/Packages/iptables-${IPTABLES_VER}.el9.src.rpm
 ARG STREAM9_MIRROR_BASE_URL=https://iad.mirror.rackspace.com/centos-stream/9-stream
 ARG IPSET_SOURCERPM_URL=${STREAM9_MIRROR_BASE_URL}/BaseOS/source/tree/Packages/ipset-${IPSET_VER}.el9.src.rpm

@@ -165,7 +165,7 @@ RUN rm /etc/yum.repos.d/ubi.repo && \
     rpm --force -i /tmp/rpms/iptables-libs-${IPTABLES_VER}.el8.${ARCH}.rpm && \
     # Install compatible libnftnl version with selected iptables version
     rpm --force -i /tmp/rpms/libnftnl-${LIBNFTNL_VER}.el8.${ARCH}.rpm && \
-    rpm -i /tmp/rpms/iptables-${IPTABLES_VER}.el8.${ARCH}.rpm && \
+    rpm -i /tmp/rpms/iptables-nft-${IPTABLES_VER}.el8.${ARCH}.rpm && \
     # Install ipset version
     rpm --force -i /tmp/rpms/ipset-libs-${IPSET_VER}.el8.x86_64.rpm && \
     rpm -i /tmp/rpms/ipset-${IPSET_VER}.el8.x86_64.rpm && \
@@ -221,8 +221,8 @@ RUN chmod u+s /bin/mountns

 # Clean out as many files as we can from the filesystem.  We no longer need dnf or the platform python install
 # or any of its dependencies.
-ADD clean-up-filesystem.sh /
-RUN /clean-up-filesystem.sh

Log:

2024-01-16T10:43:52.162002831Z 2024-01-16 10:43:52.161 [INFO][80] felix/feature_detect_linux.go 170: Updating detected iptables features features=environment.Features{SNATFullyRandom:true, MASQFullyRandom:true, RestoreSupportsLock:true, ChecksumOffloadBroken:true, IPIPDeviceIsL3:true, KernelSideRouteFiltering:true} iptablesVersion=1.8.8 kernelVersion=6.1.66
2024-01-16T10:43:52.162141257Z 2024-01-16 10:43:52.161 [INFO][80] felix/table.go 344: Calculated old-insert detection regex. pattern="(?:-j|--jump) cali-|(?:-j|--jump) califw-|(?:-j|--jump) calitw-|(?:-j|--jump) califh-|(?:-j|--jump) calith-|(?:-j|--jump) calipi-|(?:-j|--jump) calipo-|(?:-j|--jump) felix-"
2024-01-16T10:43:52.162284671Z 2024-01-16 10:43:52.162 [INFO][80] felix/table.go 462: Enabling iptables-in-nftables-mode workarounds.
2024-01-16T10:43:52.162292968Z 2024-01-16 10:43:52.162 [INFO][80] felix/feature_detect_linux.go 410: Looked up iptables command backendMode="nft" candidates=[]string{"iptables-nft-restore", "iptables-restore"} command="iptables-nft-restore" ipVersion=0x4 saveOrRestore="restore"
2024-01-16T10:43:52.162432708Z 2024-01-16 10:43:52.162 [INFO][80] felix/feature_detect_linux.go 410: Looked up iptables command backendMode="nft" candidates=[]string{"iptables-nft-save", "iptables-save"} command="iptables-nft-save" ipVersion=0x4 saveOrRestore="save"

Pod (my PREROUTING rule is also there):

› k exec -it calico-node-mhn7f bash
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
Defaulted container "calico-node" out of: calico-node, upgrade-ipam (init), install-cni (init), mount-bpffs (init)
[root@node /]# iptables -V
iptables v1.8.8 (nf_tables)

# iptables-nft-save -t raw
# Generated by iptables-nft-save v1.8.8 (nf_tables) on Tue Jan 16 10:48:00 2024
*raw
:PREROUTING ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:cali-OUTPUT - [0:0]
:cali-PREROUTING - [0:0]
:cali-from-host-endpoint - [0:0]
:cali-rpf-skip - [0:0]
:cali-to-host-endpoint - [0:0]
-A PREROUTING -m comment --comment "cali:6gwbT8clXdHdC1b1" -j cali-PREROUTING
-A PREROUTING -i <uplink> -p icmp -m icmp --icmp-type 3/4 -j NFLOG --nflog-group 33
-A OUTPUT -m comment --comment "cali:tVnHkvAo15HuiPy0" -j cali-OUTPUT
-A cali-OUTPUT -m comment --comment "cali:njdnLwYeGqBJyMxW" -j MARK --set-xmark 0x0/0xf0000
-A cali-OUTPUT -m comment --comment "cali:rz86uTUcEZAfFsh7" -j cali-to-host-endpoint
-A cali-OUTPUT -m comment --comment "cali:pN0F5zD0b8yf9W1Z" -m mark --mark 0x10000/0x10000 -j ACCEPT
-A cali-PREROUTING -m comment --comment "cali:XFX5xbM8B9qR10JG" -j MARK --set-xmark 0x0/0xf0000
-A cali-PREROUTING -i cali+ -m comment --comment "cali:EWMPb0zVROM-woQp" -j MARK --set-xmark 0x40000/0x40000
-A cali-PREROUTING -m comment --comment "cali:PWuxTAIaFCtsg5Qa" -m mark --mark 0x40000/0x40000 -j cali-rpf-skip
-A cali-PREROUTING -m comment --comment "cali:fSSbGND7dgyemWU7" -m mark --mark 0x40000/0x40000 -m rpfilter --validmark --invert -j DROP
-A cali-PREROUTING -m comment --comment "cali:ImU0-4Rl2WoOI9Ou" -m mark --mark 0x0/0x40000 -j cali-from-host-endpoint
-A cali-PREROUTING -m comment --comment "cali:lV4V2MPoMBf0hl9T" -m mark --mark 0x10000/0x10000 -j ACCEPT
COMMIT
# Completed on Tue Jan 16 10:48:00 2024
[root@node /]#
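
For anyone reading the dump above, this sketch shows one way to tell Calico's own jumps in the built-in chains apart from rules added by other daemons. The regex is copied verbatim from the "old-insert detection regex" log line above; skipping chains whose name starts with "cali" is my own simplification and not necessarily how Felix does it, and the function name is hypothetical.

```python
import re

# Pattern copied from the felix/table.go log line above.
OLD_INSERT_RE = re.compile(
    r"(?:-j|--jump) cali-|(?:-j|--jump) califw-|(?:-j|--jump) calitw-|"
    r"(?:-j|--jump) califh-|(?:-j|--jump) calith-|(?:-j|--jump) calipi-|"
    r"(?:-j|--jump) calipo-|(?:-j|--jump) felix-"
)


def classify_builtin_chain_rules(save_output):
    """Split rules in non-Calico chains into (calico_inserts, foreign_rules)."""
    calico, foreign = [], []
    for line in save_output.splitlines():
        if not line.startswith("-A "):
            continue  # skip comments, table headers, chain declarations, COMMIT
        chain = line.split()[1]
        if chain.startswith("cali"):
            continue  # assumption: chains named cali* belong entirely to Calico
        (calico if OLD_INSERT_RE.search(line) else foreign).append(line)
    return calico, foreign


sample = """*raw
:PREROUTING ACCEPT [0:0]
-A PREROUTING -m comment --comment "cali:6gwbT8clXdHdC1b1" -j cali-PREROUTING
-A PREROUTING -i <uplink> -p icmp -m icmp --icmp-type 3/4 -j NFLOG --nflog-group 33
-A OUTPUT -m comment --comment "cali:tVnHkvAo15HuiPy0" -j cali-OUTPUT
-A cali-OUTPUT -m comment --comment "cali:njdnLwYeGqBJyMxW" -j MARK --set-xmark 0x0/0xf0000
COMMIT
"""
calico_inserts, foreign_rules = classify_builtin_chain_rules(sample)
print(len(calico_inserts), "Calico jump(s);", len(foreign_rules), "foreign rule(s)")
```

On the excerpt above, the NFLOG rule is the only one classified as foreign, which matches the rule added by the other daemon.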

@tomastigera
Contributor

This may also relate to #8025.

@matthewdupre
Member

> I would like to work on this.
I'd be happy to take a PR for this @cyclinder - please give me a shout if you hit any unexpected problems (e.g. base image issues or similar). @mazdakn looked at a similar bump for ipset recently and would also be a good point of contact.

@mazdakn
Member

mazdakn commented Jan 16, 2024

For the ipset issue (#8372), we ended up changing the logic for parsing the ipset output instead of bumping the ipset version, to avoid dealing with the same issue in the future. I am not sure whether this is doable here, but if it is, it would be the better option.

@cyclinder
Contributor

cyclinder commented Jan 17, 2024

> please give me a shout if you hit any unexpected problems

Thanks @matthewdupre @mazdakn, I'll start looking into this. As far as I know, kube-proxy upgrades its iptables version by updating its base image; more details are at https://github.com/kubernetes/release/tree/master/images/build/distroless-iptables. The latest release of kube-proxy uses iptables v1.8.9, so it would be good for us to bump iptables to the latest version.

@cyclinder
Contributor

I'm trying to build calico-node for amd64 on my local machine, but it fails in clean-up-filesystem.sh. Any advice? Thanks!

#0 50.82 warning: file /usr/share/locale/en_GB/LC_MESSAGES/json-glib-1.0.mo: remove failed: No such file or directory
#0 51.08 Failed to disable unit, unit systemd-readahead-replay.service does not exist.
#0 51.08 Failed to disable unit, unit systemd-readahead-collect.service does not exist.
#0 51.14 warning: file /etc/rc.local: remove failed: No such file or directory
#0 51.38 install-info: No such file or directory for /usr/share/info/nettle.info
#0 51.73 install-info: No such file or directory for /usr/share/info/history.info
#0 51.73 install-info: No such file or directory for /usr/share/info/rluserman.info
#0 51.79 warning: file /usr/share/locale/en_GB/LC_MESSAGES/p11-kit.mo: remove failed: No such file or directory
#0 52.70 Binary is missing after RPM cleanup: /usr/sbin/ip6tables
------
Dockerfile.amd64:224
--------------------
 222 |     # or any of its dependencies.
 223 |     ADD clean-up-filesystem.sh /
 224 | >>> RUN /clean-up-filesystem.sh
 225 |
 226 |     # Copy everything into a fresh scratch image so that naive CVE scanners don't pick up binaries and libraries
--------------------
ERROR: failed to solve: process "/bin/sh -c /clean-up-filesystem.sh" did not complete successfully: exit code: 1
make: *** [Makefile:297: .calico_node.created-amd64] Error 1

@matthewdupre
Member

matthewdupre commented Jan 17, 2024

What are you trying to run? make image? Most of the builds are intended to be run by the go-build container, so it should hopefully be straightforward if you start from the right make targets. make ut should also be useful (and should have enough dependencies to rebuild everything).

@mazdakn
Member

mazdakn commented Jan 17, 2024

@cyclinder clean-up-filesystem.sh is executed to clean up the file system of the Calico node image. That script contains a list of expected binaries, which should match the binaries installed in the Dockerfile. From this output, it seems to be complaining that ip6tables is not present after the script runs. Do you have a PR to share so we can check your changes?
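
To illustrate the kind of check being described, here is a minimal sketch of verifying that a list of expected binaries still exists after cleanup. It is not the actual clean-up-filesystem.sh (which is a shell script); the `EXPECTED` list is hypothetical apart from /usr/sbin/ip6tables, which is the path named in the build error above.

```python
import os
import tempfile

# Hypothetical list; only /usr/sbin/ip6tables is taken from the build log.
EXPECTED = ["/usr/sbin/iptables", "/usr/sbin/ip6tables"]


def missing_binaries(root, expected):
    """Return the expected paths that do not exist under the image root."""
    return [
        path
        for path in expected
        if not os.path.exists(os.path.join(root, path.lstrip("/")))
    ]


# Simulate an image root that contains iptables but not ip6tables.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "usr/sbin"))
    open(os.path.join(root, "usr/sbin/iptables"), "w").close()
    for path in missing_binaries(root, EXPECTED):
        print("Binary is missing after RPM cleanup:", path)
```

If the new iptables package installs its binaries under different names or paths, a check like this would fail until the expected list is updated to match.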

@cyclinder
Contributor

@matthewdupre Yes, I updated the Dockerfile.amd64 and ran make node.

@mazdakn Thanks for the details, I opened a draft PR, please see #8416

@matthewdupre
Member

CC @hjiawei
