Kilo images bundle an outdated version of iptables causing reconciliation errors #388

TPXP · 2024-06-27T15:18:16Z

Hello,

We are running kilo on a cluster with two different node pools, one of which has the following system details:

  Kernel Version:             6.2.0-36-generic
  OS Image:                   Ubuntu 22.04.3 LTS 269b0b8ce7
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  containerd://1.7.13
  Kubelet Version:            v1.29.1
  Kube-Proxy Version:         v1.29.1

Our Grafana dashboard shows that these nodes consistently hit reconciliation errors.

Kilo logs clearly point to an error in an iptables call.

{"caller":"mesh.go:262","component":"kilo","error":"failed to reconcile rules: failed to check if rule exists: failed to populate chains for table \"filter\": running [/sbin/iptables -t filter -S --wait]: exit status 1: iptables v1.8.4 (nf_tables): table `filter' is incompatible, use 'nft' tool.\n\n","level":"error","ts":"2024-06-27T14:54:04.579160946Z"}
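For anyone triaging similar logs, the relevant details (which command ran, its exit status, and the version and backend that the iptables binary reported) can be pulled out of the log line programmatically. The following is only a rough sketch: the helper name `summarize_iptables_error` and the regular expressions are assumptions made for illustration and are not part of Kilo.

```python
import json
import re

# Log line copied verbatim from the report; the raw string keeps the JSON escapes intact.
LOG_LINE = r'''{"caller":"mesh.go:262","component":"kilo","error":"failed to reconcile rules: failed to check if rule exists: failed to populate chains for table \"filter\": running [/sbin/iptables -t filter -S --wait]: exit status 1: iptables v1.8.4 (nf_tables): table `filter' is incompatible, use 'nft' tool.\n\n","level":"error","ts":"2024-06-27T14:54:04.579160946Z"}'''


def summarize_iptables_error(line: str) -> dict:
    """Hypothetical helper: extract iptables details from one Kilo JSON log line."""
    message = json.loads(line)["error"]

    # The wrapped command appears as "running [<argv>]".
    command = re.search(r"running \[([^\]]+)\]", message)
    # The exit status may be negative, as in the second log line of the report.
    status = re.search(r"exit status (-?\d+)", message)
    # The table is taken from the "-t <table>" argument of the command.
    table = re.search(r"-t (\S+)", message)
    # The iptables banner looks like "iptables v1.8.4 (nf_tables)"; it can be absent.
    banner = re.search(r"iptables v(\d+(?:\.\d+)*) \((\w+)\)", message)

    return {
        "command": command.group(1) if command else None,
        "exit_status": int(status.group(1)) if status else None,
        "table": table.group(1) if table else None,
        "version": banner.group(1) if banner else None,
        "backend": banner.group(2) if banner else None,
    }


if __name__ == "__main__":
    for key, value in summarize_iptables_error(LOG_LINE).items():
        print(f"{key}: {value}")
```

Lines where iptables printed no banner (for example, when the process died before writing anything) simply yield `None` for `version` and `backend`.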

We also see that our other nodes frequently have segmentation faults in the iptables binary, which correlate with times when kilo calls iptables:

dmesg

[26878597.899323] iptables[2537594]: segfault at 7fba32878dd0 ip 00007fba328a00ad sp 00007ffe13a7f4e8 error 4 in libnftnl.so.11.3.0[7fba3289c000+16000]
[26878597.899334] Code: 83 ec 18 0f b7 f6 b9 01 00 00 00 88 54 24 0c 48 8d 54 24 0c e8 b4 c5 ff ff 48 83 c4 18 c3 49 89 f8 48 83 c9 ff 48 89 d7 31 c0 <f2> ae 0f b7 f6 4c 89 c7 f7 d1 e9 94 c5 ff ff 0f b7 f6 31 c9 e9 8a
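As a side note, the segfault line carries enough information to locate the crash inside the library: the instruction pointer minus the mapping base gives the offset within libnftnl.so.11.3.0. The sketch below works this out from the dmesg line above. The regular expression and the helper name `parse_segfault` are assumptions for illustration, and the error-code decoding follows the usual x86 page-fault bit layout (bit 0: page was present, bit 1: write access, bit 2: user mode).

```python
import re

# dmesg line copied verbatim from the report.
DMESG_LINE = (
    "[26878597.899323] iptables[2537594]: segfault at 7fba32878dd0 "
    "ip 00007fba328a00ad sp 00007ffe13a7f4e8 error 4 "
    "in libnftnl.so.11.3.0[7fba3289c000+16000]"
)

# Assumed pattern for the x86-64 "segfault at ..." kernel message format.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>\d+) "
    r"in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line: str) -> dict:
    """Hypothetical helper: decode one kernel segfault message."""
    match = SEGFAULT_RE.search(line)
    if match is None:
        raise ValueError("not a recognised segfault line")

    ip = int(match["ip"], 16)
    base = int(match["base"], 16)
    size = int(match["size"], 16)
    err = int(match["err"])

    return {
        "process": match["comm"],
        "pid": int(match["pid"]),
        "object": match["obj"],
        # Offset of the faulting instruction inside the mapped object.
        "offset": ip - base,
        "ip_inside_mapping": base <= ip < base + size,
        # x86 page-fault error code bits.
        "page_present": bool(err & 1),
        "write_access": bool(err & 2),
        "user_mode": bool(err & 4),
    }


if __name__ == "__main__":
    info = parse_segfault(DMESG_LINE)
    print(f"{info['process']} crashed in {info['object']} at offset {info['offset']:#x}")
```

If a build of the library with debug symbols is available, the resulting offset could be passed to a tool such as `addr2line -e` to find the corresponding source location; whether symbols exist for the library shipped in the image is not something this report establishes.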

kilo

{"caller":"mesh.go:262","component":"kilo","error":"failed to reconcile rules: failed to check if rule exists: failed to populate chains for table \"nat\": running [/sbin/iptables -t nat -S --wait]: exit status -1: ","level":"error","ts":"2024-06-27T14:17:50.343571655Z"}

Node details

  Kernel Version:             5.15.0-79-generic
  OS Image:                   Ubuntu 22.04.4 LTS
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  containerd://1.6.33
  Kubelet Version:            v1.29.1
  Kube-Proxy Version:         v1.29.1

The libnftnl.so.11.3.0 file is not present on the host; it only exists in containers (find located it under /var/lib/containerd/ or /run/containerd). In fact, we found this file in the kilo container at /usr/lib/libnftnl.so.11.3.0.

Curiously, running /sbin/iptables -t nat -S --wait from a shell inside the kilo container (docker.io/squat/kilo:0.6.0) works without causing a segfault or an error 🤔

The Kilo container image ships with iptables 1.8.4, which is a little old. iptables has seen some recent updates addressing the "use 'nft' tool" error, and our other containers touching networking (mostly kube-router) use iptables v1.8.9. Under the hood, kube-router containers run ipset v7.17 while kilo provides ipset v7.6. Since everything seems to run smoothly with kube-router, I think upgrading the kilo image to ship these versions would help. Is there any reason to keep these old versions?

I see the kilo image relies on Alpine. Maybe bumping it to Alpine v3.18 would help? (The latest versions seem to have a few annoying bugs; see cloudnativelabs/kube-router#1678.)

squat · 2024-06-27T15:40:07Z

Thank you for the extremely detailed report. There is no reason for the old iptables packages; there simply haven't been any error reports about them. The package versions are old because we are using old Alpine base images. This should be easily addressed with a base image update.

TPXP · 2024-06-27T15:46:07Z

Thanks a lot for the very prompt reply. I agree that upgrading the base image should fix it 👍

squat · 2024-06-27T16:10:48Z

@TPXP can you try the newest Kilo tag to see if that fixes your issue? The newest tag will be 0122dec8f16a61518dd02899501a8e8756387b76. Should be built soon!

TPXP changed the title ~~Kilo containers bundle an outdated version of iptables causing reconciliation errors~~ Kilo images bundle an outdated version of iptables causing reconciliation errors Jun 27, 2024

squat mentioned this issue Jun 27, 2024

Docker: update iptables packages #389

Merged

squat closed this as completed in bd36158 Jun 27, 2024

squat closed this as completed in #389 Jun 27, 2024
