Networking/NAT not working on older iptables versions with v1.25.7+k3s1 #7096

Closed
noselasd opened this issue Mar 15, 2023 · 16 comments
Labels: kind/bug (Something isn't working), kind/upstream-issue (This issue appears to be caused by an upstream bug)

@noselasd

Environmental Info:
K3s Version:
k3s version v1.25.7+k3s1 (f7c20e2)
go version go1.19.6

Node(s) CPU architecture, OS, and Version:

Linux dev01-spa 3.10.0-1160.83.1.el7.x86_64 #1 SMP Wed Jan 25 16:41:43 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Running CentOS Linux release 7.9.2009

Cluster Configuration:
Single node.

Describe the bug:
Pods cannot access external networks. This works fine when going back to v1.25.6+k3s1.
It turns out outgoing traffic from the host is no longer NATed.

For example, running tcpdump on the ens32 interface, the NIC that owns the host's IP 192.168.170.101 and is attached to the local subnet:

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens32, link-type EN10MB (Ethernet), capture size 262144 bytes
09:47:47.668554 ARP, Request who-has 192.168.170.101 (00:50:56:bd:82:75) tell 192.168.170.172, length 46
09:47:47.668566 ARP, Reply 192.168.170.101 is-at 00:50:56:bd:82:75, length 28
09:47:48.018543 IP 192.168.170.101.41440 > 192.168.170.226.9092: Flags [.], ack 17, win 15708, options [nop,nop,TS val 568150586 ecr 3186318162], length 0
09:47:48.069123 IP 10.42.0.18 > 8.8.8.8: ICMP echo request, id 7, seq 3, length 64
09:47:48.145339 STP 802.1w, Rapid STP, Flags [Proposal, Learn, Forward], bridge-id 8000.ec:02:73:2e:91:40.8017, length 43
09:47:49.069190 IP 10.42.0.18 > 8.8.8.8: ICMP echo request, id 7, seq 4, length 64

The lines above with `10.42.0.18 > 8.8.8.8` are the result of a pod manually running ping 8.8.8.8.
The pod network is apparently not being NATed on the way out; the 10.42.0.18 address should not appear on our internal network.
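
A quick way to confirm the missing masquerade is to list the nat POSTROUTING rules on the node (a minimal check; FLANNEL-POSTRTG is flannel's masquerade chain, as seen in the rule dumps below):

  iptables -t nat -vnL POSTROUTING
  iptables -t nat -vnL FLANNEL-POSTRTG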

Steps To Reproduce:
Installed k3s with curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--disable traefik" sh -
on CentOS Linux release 7.9.2009

Ran

  kubectl run -it --rm --restart=Never --image busybox tempbusybox -- sh

And manually ran ping 8.8.8.8 from this pod, which never gets any replies (see the tcpdump output above).

Expected behavior:
With v1.25.6+k3s1 everything works ok on this node.

Installed with curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.25.6+k3s1 INSTALL_K3S_EXEC="--disable traefik" sh -
Ran

  kubectl run -it --rm --restart=Never --image busybox tempbusybox -- sh

And manually running ping 8.8.8.8 from the busybox pod succeeds.

[root@dev01-spa ~]# tcpdump not port 22  -nnn -i ens32
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens32, link-type EN10MB (Ethernet), capture size 262144 bytes
10:01:03.070946 IP 192.168.170.164.5353 > 224.0.0.251.5353: 0 [4q] [1au] PTR (QU)? _hap._tcp.local. PTR (QU)? _hap._udp.local. PTR (QU)? _companion-link._tcp.local. PTR (QU)? _sleep-proxy._udp.local. (119)
10:01:03.070994 IP6 fe80::1474:5c2b:f047:36a2.5353 > ff02::fb.5353: 0 [4q] [1au] PTR (QU)? _hap._tcp.local. PTR (QU)? _hap._udp.local. PTR (QU)? _companion-link._tcp.local. PTR (QU)? _sleep-proxy._udp.local. (119)
10:01:03.109346 IP 192.168.170.101 > 8.8.8.8: ICMP echo request, id 7, seq 0, length 64
10:01:03.122974 IP 8.8.8.8 > 192.168.170.101: ICMP echo reply, id 7, seq 0, length 64

Now the external traffic gets NATed, as seen with 192.168.170.101 > 8.8.8.8: above; the 10.42.0.0/24 network is not leaking out of the host.

Additional context / logs:

@brandond
Contributor

Any ideas @rbrtbnfgl ?

Have you rebooted the node since upgrading? I wonder if this is a case of rule order changing between releases.

@rbrtbnfgl
Contributor

Could you check the nat iptables rules?
iptables -t nat -vnL

@noselasd
Author

Any ideas @rbrtbnfgl ?

Have you rebooted the node since upgrading? I wonder if this is a case of rule order changing between releases.

No, the node has not been rebooted. I've done a /usr/local/bin/k3s-uninstall.sh and checked that the nat table was empty before installing.

Could you check the nat iptables rules?
iptables -t nat -vnL

I've installed k3s v1.25.6+k3s1 again on the node.
Here's the output, and networking is working:

Chain PREROUTING (policy ACCEPT 17 packets, 3011 bytes)
 pkts bytes target     prot opt in     out     source               destination
   33  4921 KUBE-SERVICES  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kubernetes service portals */

Chain INPUT (policy ACCEPT 16 packets, 2927 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain OUTPUT (policy ACCEPT 79 packets, 4796 bytes)
 pkts bytes target     prot opt in     out     source               destination
  272 17846 KUBE-SERVICES  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kubernetes service portals */

Chain POSTROUTING (policy ACCEPT 79 packets, 4796 bytes)
 pkts bytes target     prot opt in     out     source               destination
  253 16790 FLANNEL-POSTRTG  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* flanneld masq */
  295 19222 KUBE-POSTROUTING  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kubernetes postrouting rules */

Chain FLANNEL-POSTRTG (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0            mark match 0x4000/0x4000 /* flanneld masq */
   71  4260 RETURN     all  --  *      *       10.42.0.0/16         10.42.0.0/16         /* flanneld masq */
    3   264 MASQUERADE  all  --  *      *       10.42.0.0/16        !224.0.0.0/4          /* flanneld masq */
    0     0 RETURN     all  --  *      *      !10.42.0.0/16         10.42.0.0/24         /* flanneld masq */
    0     0 MASQUERADE  all  --  *      *      !10.42.0.0/16         10.42.0.0/16         /* flanneld masq */

Chain KUBE-KUBELET-CANARY (0 references)
 pkts bytes target     prot opt in     out     source               destination

Chain KUBE-MARK-DROP (0 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 MARK       all  --  *      *       0.0.0.0/0            0.0.0.0/0            MARK or 0x8000

Chain KUBE-MARK-MASQ (10 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 MARK       all  --  *      *       0.0.0.0/0            0.0.0.0/0            MARK or 0x4000

Chain KUBE-NODEPORTS (1 references)
 pkts bytes target     prot opt in     out     source               destination

Chain KUBE-POSTROUTING (1 references)
 pkts bytes target     prot opt in     out     source               destination
   79  4796 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0            mark match ! 0x4000/0x4000
    0     0 MARK       all  --  *      *       0.0.0.0/0            0.0.0.0/0            MARK xor 0x4000
    0     0 MASQUERADE  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kubernetes service traffic requiring SNAT */

Chain KUBE-PROXY-CANARY (0 references)
 pkts bytes target     prot opt in     out     source               destination

Chain KUBE-SEP-3YLRQZ6UJQXS6T5Q (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 KUBE-MARK-MASQ  all  --  *      *       10.42.0.3            0.0.0.0/0            /* kube-system/kube-dns:metrics */
    0     0 DNAT       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kube-system/kube-dns:metrics */ tcp to:10.42.0.3:9153

Chain KUBE-SEP-ES43PXGNPFYDKWOB (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 KUBE-MARK-MASQ  all  --  *      *       192.168.170.101      0.0.0.0/0            /* default/kubernetes:https */
    0     0 DNAT       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* default/kubernetes:https */ tcp to:192.168.170.101:6443

Chain KUBE-SEP-GQFJJFHKHZPIAODM (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 KUBE-MARK-MASQ  all  --  *      *       10.42.0.3            0.0.0.0/0            /* kube-system/kube-dns:dns-tcp */
    0     0 DNAT       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kube-system/kube-dns:dns-tcp */ tcp to:10.42.0.3:53

Chain KUBE-SEP-HOT2XMKPNFS7SQ2N (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 KUBE-MARK-MASQ  all  --  *      *       10.42.0.3            0.0.0.0/0            /* kube-system/kube-dns:dns */
    0     0 DNAT       udp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kube-system/kube-dns:dns */ udp to:10.42.0.3:53

Chain KUBE-SEP-U5J636EHMZWW53MJ (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 KUBE-MARK-MASQ  all  --  *      *       10.42.0.4            0.0.0.0/0            /* kube-system/metrics-server:https */
    0     0 DNAT       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kube-system/metrics-server:https */ tcp to:10.42.0.4:10250

Chain KUBE-SERVICES (2 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 KUBE-SVC-JD5MR3NA4I4DYORP  tcp  --  *      *       0.0.0.0/0            10.43.0.10           /* kube-system/kube-dns:metrics cluster IP */ tcp dpt:9153
    0     0 KUBE-SVC-Z4ANX4WAEWEBLCTM  tcp  --  *      *       0.0.0.0/0            10.43.92.148         /* kube-system/metrics-server:https cluster IP */ tcp dpt:443
    0     0 KUBE-SVC-NPX46M4PTMTKRN6Y  tcp  --  *      *       0.0.0.0/0            10.43.0.1            /* default/kubernetes:https cluster IP */ tcp dpt:443
    0     0 KUBE-SVC-TCOU7JCQXEZGVUNU  udp  --  *      *       0.0.0.0/0            10.43.0.10           /* kube-system/kube-dns:dns cluster IP */ udp dpt:53
    0     0 KUBE-SVC-ERIFXISQEP7F7OF4  tcp  --  *      *       0.0.0.0/0            10.43.0.10           /* kube-system/kube-dns:dns-tcp cluster IP */ tcp dpt:53
   27  1620 KUBE-NODEPORTS  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kubernetes service nodeports; NOTE: this must be the last rule in this chain */ ADDRTYPE match dst-type LOCAL

Chain KUBE-SVC-ERIFXISQEP7F7OF4 (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 KUBE-MARK-MASQ  tcp  --  *      *      !10.42.0.0/16         10.43.0.10           /* kube-system/kube-dns:dns-tcp cluster IP */ tcp dpt:53
    0     0 KUBE-SEP-GQFJJFHKHZPIAODM  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kube-system/kube-dns:dns-tcp -> 10.42.0.3:53 */

Chain KUBE-SVC-JD5MR3NA4I4DYORP (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 KUBE-MARK-MASQ  tcp  --  *      *      !10.42.0.0/16         10.43.0.10           /* kube-system/kube-dns:metrics cluster IP */ tcp dpt:9153
    0     0 KUBE-SEP-3YLRQZ6UJQXS6T5Q  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kube-system/kube-dns:metrics -> 10.42.0.3:9153 */

Chain KUBE-SVC-NPX46M4PTMTKRN6Y (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 KUBE-MARK-MASQ  tcp  --  *      *      !10.42.0.0/16         10.43.0.1            /* default/kubernetes:https cluster IP */ tcp dpt:443
    0     0 KUBE-SEP-ES43PXGNPFYDKWOB  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* default/kubernetes:https -> 192.168.170.101:6443 */

Chain KUBE-SVC-TCOU7JCQXEZGVUNU (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 KUBE-MARK-MASQ  udp  --  *      *      !10.42.0.0/16         10.43.0.10           /* kube-system/kube-dns:dns cluster IP */ udp dpt:53
    0     0 KUBE-SEP-HOT2XMKPNFS7SQ2N  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kube-system/kube-dns:dns -> 10.42.0.3:53 */

Chain KUBE-SVC-Z4ANX4WAEWEBLCTM (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 KUBE-MARK-MASQ  tcp  --  *      *      !10.42.0.0/16         10.43.92.148         /* kube-system/metrics-server:https cluster IP */ tcp dpt:443
    0     0 KUBE-SEP-U5J636EHMZWW53MJ  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kube-system/metrics-server:https -> 10.42.0.4:10250 */

Then I ran /usr/local/bin/k3s-uninstall.sh, after which the rules look like this:

# iptables -t nat -vnL
Chain PREROUTING (policy ACCEPT 88 packets, 12391 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain INPUT (policy ACCEPT 88 packets, 12391 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain OUTPUT (policy ACCEPT 165 packets, 9900 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain POSTROUTING (policy ACCEPT 165 packets, 9900 bytes)
 pkts bytes target     prot opt in     out     source               destination

Then I installed the new version: curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--disable traefik" sh -
With k3s v1.25.7+k3s1 (networking not working), the rules look like this:

Chain PREROUTING (policy ACCEPT 32 packets, 2091 bytes)
 pkts bytes target     prot opt in     out     source               destination
   44  3984 KUBE-SERVICES  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kubernetes service portals */

Chain INPUT (policy ACCEPT 5 packets, 342 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain OUTPUT (policy ACCEPT 73 packets, 4380 bytes)
 pkts bytes target     prot opt in     out     source               destination
  249 16760 KUBE-SERVICES  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kubernetes service portals */

Chain POSTROUTING (policy ACCEPT 101 packets, 6189 bytes)
 pkts bytes target     prot opt in     out     source               destination
  295 19681 KUBE-POSTROUTING  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kubernetes postrouting rules */

Chain FLANNEL-POSTRTG (0 references)
 pkts bytes target     prot opt in     out     source               destination

Chain KUBE-KUBELET-CANARY (0 references)
 pkts bytes target     prot opt in     out     source               destination

Chain KUBE-MARK-DROP (0 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 MARK       all  --  *      *       0.0.0.0/0            0.0.0.0/0            MARK or 0x8000

Chain KUBE-MARK-MASQ (8 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 MARK       all  --  *      *       0.0.0.0/0            0.0.0.0/0            MARK or 0x4000

Chain KUBE-NODEPORTS (1 references)
 pkts bytes target     prot opt in     out     source               destination

Chain KUBE-POSTROUTING (1 references)
 pkts bytes target     prot opt in     out     source               destination
  101  6189 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0            mark match ! 0x4000/0x4000
    0     0 MARK       all  --  *      *       0.0.0.0/0            0.0.0.0/0            MARK xor 0x4000
    0     0 MASQUERADE  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kubernetes service traffic requiring SNAT */

Chain KUBE-PROXY-CANARY (0 references)
 pkts bytes target     prot opt in     out     source               destination

Chain KUBE-SEP-ES43PXGNPFYDKWOB (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 KUBE-MARK-MASQ  all  --  *      *       192.168.170.101      0.0.0.0/0            /* default/kubernetes:https */
    1    60 DNAT       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* default/kubernetes:https */ tcp to:192.168.170.101:6443

Chain KUBE-SEP-J5U7JDFDTOWHXPVR (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 KUBE-MARK-MASQ  all  --  *      *       10.42.0.4            0.0.0.0/0            /* kube-system/kube-dns:dns */
    0     0 DNAT       udp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kube-system/kube-dns:dns */ udp to:10.42.0.4:53

Chain KUBE-SEP-K5XXJLQU5KR42YSX (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 KUBE-MARK-MASQ  all  --  *      *       10.42.0.4            0.0.0.0/0            /* kube-system/kube-dns:dns-tcp */
    0     0 DNAT       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kube-system/kube-dns:dns-tcp */ tcp to:10.42.0.4:53

Chain KUBE-SEP-OWVYFKC6LI5UONET (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 KUBE-MARK-MASQ  all  --  *      *       10.42.0.4            0.0.0.0/0            /* kube-system/kube-dns:metrics */
    0     0 DNAT       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kube-system/kube-dns:metrics */ tcp to:10.42.0.4:9153

Chain KUBE-SERVICES (2 references)
 pkts bytes target     prot opt in     out     source               destination
    1    60 KUBE-SVC-NPX46M4PTMTKRN6Y  tcp  --  *      *       0.0.0.0/0            10.43.0.1            /* default/kubernetes:https cluster IP */ tcp dpt:443
    0     0 KUBE-SVC-TCOU7JCQXEZGVUNU  udp  --  *      *       0.0.0.0/0            10.43.0.10           /* kube-system/kube-dns:dns cluster IP */ udp dpt:53
    0     0 KUBE-SVC-ERIFXISQEP7F7OF4  tcp  --  *      *       0.0.0.0/0            10.43.0.10           /* kube-system/kube-dns:dns-tcp cluster IP */ tcp dpt:53
    0     0 KUBE-SVC-JD5MR3NA4I4DYORP  tcp  --  *      *       0.0.0.0/0            10.43.0.10           /* kube-system/kube-dns:metrics cluster IP */ tcp dpt:9153
   58  3480 KUBE-NODEPORTS  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kubernetes service nodeports; NOTE: this must be the last rule in this chain */ ADDRTYPE match dst-type LOCAL

Chain KUBE-SVC-ERIFXISQEP7F7OF4 (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 KUBE-MARK-MASQ  tcp  --  *      *      !10.42.0.0/16         10.43.0.10           /* kube-system/kube-dns:dns-tcp cluster IP */ tcp dpt:53
    0     0 KUBE-SEP-K5XXJLQU5KR42YSX  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kube-system/kube-dns:dns-tcp -> 10.42.0.4:53 */

Chain KUBE-SVC-JD5MR3NA4I4DYORP (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 KUBE-MARK-MASQ  tcp  --  *      *      !10.42.0.0/16         10.43.0.10           /* kube-system/kube-dns:metrics cluster IP */ tcp dpt:9153
    0     0 KUBE-SEP-OWVYFKC6LI5UONET  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kube-system/kube-dns:metrics -> 10.42.0.4:9153 */

Chain KUBE-SVC-NPX46M4PTMTKRN6Y (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 KUBE-MARK-MASQ  tcp  --  *      *      !10.42.0.0/16         10.43.0.1            /* default/kubernetes:https cluster IP */ tcp dpt:443
    1    60 KUBE-SEP-ES43PXGNPFYDKWOB  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* default/kubernetes:https -> 192.168.170.101:6443 */

Chain KUBE-SVC-TCOU7JCQXEZGVUNU (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 KUBE-MARK-MASQ  udp  --  *      *      !10.42.0.0/16         10.43.0.10           /* kube-system/kube-dns:dns cluster IP */ udp dpt:53
    0     0 KUBE-SEP-J5U7JDFDTOWHXPVR  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kube-system/kube-dns:dns -> 10.42.0.4:53 */

@rbrtbnfgl
Contributor

With those rules is the NAT working?

@noselasd
Author

With those rules is the NAT working?
Sorry if it wasn't clear.
The first set of rules is from k3s v1.25.6+k3s1, where everything works.

The second set of rules is from k3s v1.25.7+k3s1, where NAT/networking does not work. I can switch back and forth between these two versions, uninstalling k3s in between, with consistent results: v1.25.6+k3s1 works, v1.25.7+k3s1 does not.

@rbrtbnfgl
Contributor

In the second set of rules the flannel rules are missing. Could you check the K3s logs for any errors?

@noselasd
Author

Yes, you're right: iptables fails to apply the flanneld rules. Below are the last few lines of journalctl -u k3s.

CentOS 7.9.2009 comes with iptables v1.4.21; there's no newer version available for CentOS/RHEL 7 as far as I know.

Mar 15 17:57:45 dev01-spa k3s[20998]: I0315 17:57:45.354832   20998 topology_manager.go:205] "Topology Admit Handler"
Mar 15 17:57:45 dev01-spa k3s[20998]: I0315 17:57:45.388815   20998 reconciler.go:357] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kube-api-access-tkhpx\" (UniqueName: \"kubernetes.io/projected/daf3a01f-0110-4ac0-8c5f-2602d39b7f8f-kube-api-access-tkhpx\") pod \"metrics-server-5f9f776df5-6jrm5\" (UID: \"daf3a01f-0110-4ac0-8c5f-2602d39b7f8f\") " pod="kube-system/metrics-server-5f9f776df5-6jrm5"
Mar 15 17:57:45 dev01-spa k3s[20998]: I0315 17:57:45.388849   20998 reconciler.go:357] "operationExecutor.VerifyControllerAttachedVolume started for volume \"tmp-dir\" (UniqueName: \"kubernetes.io/empty-dir/daf3a01f-0110-4ac0-8c5f-2602d39b7f8f-tmp-dir\") pod \"metrics-server-5f9f776df5-6jrm5\" (UID: \"daf3a01f-0110-4ac0-8c5f-2602d39b7f8f\") " pod="kube-system/metrics-server-5f9f776df5-6jrm5"
Mar 15 17:57:45 dev01-spa k3s[20998]: I0315 17:57:45.420710   20998 topology_manager.go:205] "Topology Admit Handler"
Mar 15 17:57:45 dev01-spa k3s[20998]: I0315 17:57:45.420797   20998 topology_manager.go:205] "Topology Admit Handler"
Mar 15 17:57:45 dev01-spa k3s[20998]: I0315 17:57:45.495475   20998 reconciler.go:357] "operationExecutor.VerifyControllerAttachedVolume started for volume \"config-volume\" (UniqueName: \"kubernetes.io/configmap/f0586ad5-f8ce-4555-aaf6-6a3f184733ec-config-volume\") pod \"local-path-provisioner-79f67d76f8-2kwm8\" (UID: \"f0586ad5-f8ce-4555-aaf6-6a3f184733ec\") " pod="kube-system/local-path-provisioner-79f67d76f8-2kwm8"
Mar 15 17:57:45 dev01-spa k3s[20998]: I0315 17:57:45.495509   20998 reconciler.go:357] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kube-api-access-r7bxc\" (UniqueName: \"kubernetes.io/projected/f0586ad5-f8ce-4555-aaf6-6a3f184733ec-kube-api-access-r7bxc\") pod \"local-path-provisioner-79f67d76f8-2kwm8\" (UID: \"f0586ad5-f8ce-4555-aaf6-6a3f184733ec\") " pod="kube-system/local-path-provisioner-79f67d76f8-2kwm8"
Mar 15 17:57:45 dev01-spa k3s[20998]: I0315 17:57:45.587202   20998 kube.go:151] Node controller sync successful
Mar 15 17:57:45 dev01-spa k3s[20998]: I0315 17:57:45.587267   20998 vxlan.go:138] VXLAN config: VNI=1 Port=0 GBP=false Learning=false DirectRouting=false
Mar 15 17:57:45 dev01-spa k3s[20998]: I0315 17:57:45.592957   20998 kube.go:503] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.42.0.0/24]
Mar 15 17:57:45 dev01-spa k3s[20998]: I0315 17:57:45.596029   20998 reconciler.go:357] "operationExecutor.VerifyControllerAttachedVolume started for volume \"config-volume\" (UniqueName: \"kubernetes.io/configmap/7758ebac-003b-4e0b-a43b-56638b4f7769-config-volume\") pod \"coredns-597584b69b-b4q92\" (UID: \"7758ebac-003b-4e0b-a43b-56638b4f7769\") " pod="kube-system/coredns-597584b69b-b4q92"
Mar 15 17:57:45 dev01-spa k3s[20998]: I0315 17:57:45.596054   20998 reconciler.go:357] "operationExecutor.VerifyControllerAttachedVolume started for volume \"custom-config-volume\" (UniqueName: \"kubernetes.io/configmap/7758ebac-003b-4e0b-a43b-56638b4f7769-custom-config-volume\") pod \"coredns-597584b69b-b4q92\" (UID: \"7758ebac-003b-4e0b-a43b-56638b4f7769\") " pod="kube-system/coredns-597584b69b-b4q92"
Mar 15 17:57:45 dev01-spa k3s[20998]: I0315 17:57:45.596073   20998 reconciler.go:357] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kube-api-access-n2hz6\" (UniqueName: \"kubernetes.io/projected/7758ebac-003b-4e0b-a43b-56638b4f7769-kube-api-access-n2hz6\") pod \"coredns-597584b69b-b4q92\" (UID: \"7758ebac-003b-4e0b-a43b-56638b4f7769\") " pod="kube-system/coredns-597584b69b-b4q92"
Mar 15 17:57:45 dev01-spa k3s[20998]: time="2023-03-15T17:57:45+01:00" level=info msg="Wrote flannel subnet file to /run/flannel/subnet.env"
Mar 15 17:57:45 dev01-spa k3s[20998]: time="2023-03-15T17:57:45+01:00" level=info msg="Running flannel backend."
Mar 15 17:57:45 dev01-spa k3s[20998]: I0315 17:57:45.623555   20998 vxlan_network.go:62] watching for new subnet leases
Mar 15 17:57:45 dev01-spa k3s[20998]: I0315 17:57:45.623670   20998 iptables.go:274] generated 3 rules
Mar 15 17:57:45 dev01-spa k3s[20998]: I0315 17:57:45.651075   20998 iptables.go:274] generated 7 rules
Mar 15 17:57:45 dev01-spa k3s[20998]: I0315 17:57:45.731563   20998 iptables.go:267] bootstrap done
Mar 15 17:57:45 dev01-spa k3s[20998]: E0315 17:57:45.880874   20998 iptables.go:291] Failed to bootstrap IPTables: failed to setup iptables-restore payload: failed to check rule existence: running [/usr/sbin/iptables -t nat -C FLANNEL-POSTRTG -s 10.42.0.0/16 ! -d 224.0.0.0/4 -m comment --comment flanneld masq -j MASQUERADE  --wait]: exit status 2: Bad argument `'
Mar 15 17:57:45 dev01-spa k3s[20998]: Try `iptables -h' or 'iptables --help' for more information.
Mar 15 17:57:45 dev01-spa k3s[20998]: I0315 17:57:45.888770   20998 iptables.go:405] Some iptables rules are missing; deleting and recreating rules
Mar 15 17:57:45 dev01-spa k3s[20998]: E0315 17:57:45.903965   20998 iptables.go:304] Failed to ensure iptables rules: error setting up rules: failed to setup iptables-restore payload: failed to check rule existence: running [/usr/sbin/iptables -t nat -C FLANNEL-POSTRTG -s 10.42.0.0/16 ! -d 224.0.0.0/4 -m comment --comment flanneld masq -j MASQUERADE  --wait]: exit status 2: Bad argument `'
Mar 15 17:57:45 dev01-spa k3s[20998]: Try `iptables -h' or 'iptables --help' for more information.
Mar 15 17:57:46 dev01-spa k3s[20998]: W0315 17:57:46.035769   20998 handler_proxy.go:105] no RequestInfo found in the context
Mar 15 17:57:46 dev01-spa k3s[20998]: W0315 17:57:46.035787   20998 handler_proxy.go:105] no RequestInfo found in the context
Mar 15 17:57:46 dev01-spa k3s[20998]: E0315 17:57:46.035797   20998 controller.go:113] loading OpenAPI spec for "v1beta1.metrics.k8s.io" failed with: Error, could not get list of group versions for APIService
Mar 15 17:57:46 dev01-spa k3s[20998]: I0315 17:57:46.035805   20998 controller.go:126] OpenAPI AggregationController: action for item v1beta1.metrics.k8s.io: Rate Limited Requeue.
Mar 15 17:57:46 dev01-spa k3s[20998]: E0315 17:57:46.035842   20998 controller.go:116] loading OpenAPI spec for "v1beta1.metrics.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
Mar 15 17:57:46 dev01-spa k3s[20998]: , Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]
Mar 15 17:57:46 dev01-spa k3s[20998]: I0315 17:57:46.037000   20998 controller.go:129] OpenAPI AggregationController: action for item v1beta1.metrics.k8s.io: Rate Limited Requeue.
Mar 15 17:58:31 dev01-spa k3s[20998]: E0315 17:58:31.210003   20998 remote_runtime.go:625] "ContainerStatus from runtime service failed" err="rpc error: code = NotFound desc = an error occurred when try to find container \"52e97f443d18586dc27ede348fa9d2da2dd283c0195642597e25f9e517fad16d\": not found" containerID="52e97f443d18586dc27ede348fa9d2da2dd283c0195642597e25f9e517fad16d"
Mar 15 17:58:31 dev01-spa k3s[20998]: I0315 17:58:31.210034   20998 kuberuntime_gc.go:361] "Error getting ContainerStatus for containerID" containerID="52e97f443d18586dc27ede348fa9d2da2dd283c0195642597e25f9e517fad16d" err="rpc error: code = NotFound desc = an error occurred when try to find container \"52e97f443d18586dc27ede348fa9d2da2dd283c0195642597e25f9e517fad16d\": not found"
Mar 15 17:58:31 dev01-spa k3s[20998]: E0315 17:58:31.210319   20998 remote_runtime.go:625] "ContainerStatus from runtime service failed" err="rpc error: code = NotFound desc = an error occurred when try to find container \"5850f8d339efe5e64a0a1496d30c90a63890602c222d90ca37841f7bdb3c1b76\": not found" containerID="5850f8d339efe5e64a0a1496d30c90a63890602c222d90ca37841f7bdb3c1b76"
Mar 15 17:58:31 dev01-spa k3s[20998]: I0315 17:58:31.210345   20998 kuberuntime_gc.go:361] "Error getting ContainerStatus for containerID" containerID="5850f8d339efe5e64a0a1496d30c90a63890602c222d90ca37841f7bdb3c1b76" err="rpc error: code = NotFound desc = an error occurred when try to find container \"5850f8d339efe5e64a0a1496d30c90a63890602c222d90ca37841f7bdb3c1b76\": not found"
Mar 15 17:58:31 dev01-spa k3s[20998]: E0315 17:58:31.210605   20998 remote_runtime.go:625] "ContainerStatus from runtime service failed" err="rpc error: code = NotFound desc = an error occurred when try to find container \"c6b283df8fc55cc1f1e5523fade6f3bd981408288ffe41c865f4deb0f91f4b20\": not found" containerID="c6b283df8fc55cc1f1e5523fade6f3bd981408288ffe41c865f4deb0f91f4b20"
Mar 15 17:58:31 dev01-spa k3s[20998]: I0315 17:58:31.210627   20998 kuberuntime_gc.go:361] "Error getting ContainerStatus for containerID" containerID="c6b283df8fc55cc1f1e5523fade6f3bd981408288ffe41c865f4deb0f91f4b20" err="rpc error: code = NotFound desc = an error occurred when try to find container \"c6b283df8fc55cc1f1e5523fade6f3bd981408288ffe41c865f4deb0f91f4b20\": not found"
Mar 15 17:58:31 dev01-spa k3s[20998]: E0315 17:58:31.210863   20998 remote_runtime.go:625] "ContainerStatus from runtime service failed" err="rpc error: code = NotFound desc = an error occurred when try to find container \"c4266974dfd283761ca638c1fb8754e9d3b30236e29c5d500bd85eef136cb7c0\": not found" containerID="c4266974dfd283761ca638c1fb8754e9d3b30236e29c5d500bd85eef136cb7c0"
Mar 15 17:58:31 dev01-spa k3s[20998]: I0315 17:58:31.210887   20998 kuberuntime_gc.go:361] "Error getting ContainerStatus for containerID" containerID="c4266974dfd283761ca638c1fb8754e9d3b30236e29c5d500bd85eef136cb7c0" err="rpc error: code = NotFound desc = an error occurred when try to find container \"c4266974dfd283761ca638c1fb8754e9d3b30236e29c5d500bd85eef136cb7c0\": not found"
Mar 15 17:58:45 dev01-spa k3s[20998]: I0315 17:58:45.906688   20998 iptables.go:405] Some iptables rules are missing; deleting and recreating rules
Mar 15 17:58:45 dev01-spa k3s[20998]: E0315 17:58:45.915493   20998 iptables.go:304] Failed to ensure iptables rules: error setting up rules: failed to setup iptables-restore payload: failed to check rule existence: running [/usr/sbin/iptables -t nat -C FLANNEL-POSTRTG -s 10.42.0.0/16 ! -d 224.0.0.0/4 -m comment --comment flanneld masq -j MASQUERADE  --wait]: exit status 2: Bad argument `'
Mar 15 17:58:45 dev01-spa k3s[20998]: Try `iptables -h' or 'iptables --help' for more information.
Mar 15 17:59:45 dev01-spa k3s[20998]: I0315 17:59:45.918023   20998 iptables.go:405] Some iptables rules are missing; deleting and recreating rules
Mar 15 17:59:45 dev01-spa k3s[20998]: E0315 17:59:45.926240   20998 iptables.go:304] Failed to ensure iptables rules: error setting up rules: failed to setup iptables-restore payload: failed to check rule existence: running [/usr/sbin/iptables -t nat -C FLANNEL-POSTRTG -s 10.42.0.0/16 ! -d 224.0.0.0/4 -m comment --comment flanneld masq -j MASQUERADE  --wait]: exit status 2: Bad argument `'
Mar 15 17:59:45 dev01-spa k3s[20998]: Try `iptables -h' or 'iptables --help' for more information.
Mar 15 18:00:45 dev01-spa k3s[20998]: I0315 18:00:45.928998   20998 iptables.go:405] Some iptables rules are missing; deleting and recreating rules
Mar 15 18:00:45 dev01-spa k3s[20998]: E0315 18:00:45.970500   20998 iptables.go:304] Failed to ensure iptables rules: error setting up rules: failed to setup iptables-restore payload: failed to check rule existence: running [/usr/sbin/iptables -t nat -C FLANNEL-POSTRTG -s 10.42.0.0/16 ! -d 224.0.0.0/4 -m comment --comment flanneld masq -j MASQUERADE  --wait]: exit status 2: Bad argument `'
Mar 15 18:00:45 dev01-spa k3s[20998]: Try `iptables -h' or 'iptables --help' for more information.
Mar 15 18:01:45 dev01-spa k3s[20998]: I0315 18:01:45.986308   20998 iptables.go:405] Some iptables rules are missing; deleting and recreating rules
Mar 15 18:01:46 dev01-spa k3s[20998]: E0315 18:01:46.071078   20998 iptables.go:304] Failed to ensure iptables rules: error setting up rules: failed to setup iptables-restore payload: failed to check rule existence: running [/usr/sbin/iptables -t nat -C FLANNEL-POSTRTG -s 10.42.0.0/16 ! -d 224.0.0.0/4 -m comment --comment flanneld masq -j MASQUERADE  --wait]: exit status 2: Bad argument `'
Mar 15 18:01:46 dev01-spa k3s[20998]: Try `iptables -h' or 'iptables --help' for more information.
Mar 15 18:02:25 dev01-spa k3s[20998]: time="2023-03-15T18:02:25+01:00" level=info msg="COMPACT revision 0 has already been compacted"
Mar 15 18:02:46 dev01-spa k3s[20998]: I0315 18:02:46.073619   20998 iptables.go:405] Some iptables rules are missing; deleting and recreating rules
Mar 15 18:02:46 dev01-spa k3s[20998]: E0315 18:02:46.082009   20998 iptables.go:304] Failed to ensure iptables rules: error setting up rules: failed to setup iptables-restore payload: failed to check rule existence: running [/usr/sbin/iptables -t nat -C FLANNEL-POSTRTG -s 10.42.0.0/16 ! -d 224.0.0.0/4 -m comment --comment flanneld masq -j MASQUERADE  --wait]: exit status 2: Bad argument `'
Mar 15 18:02:46 dev01-spa k3s[20998]: Try `iptables -h' or 'iptables --help' for more information.

If the log is verbatim, this command is failing:

/usr/sbin/iptables -t nat -C FLANNEL-POSTRTG -s 10.42.0.0/16 ! -d 224.0.0.0/4 -m comment --comment flanneld masq -j MASQUERADE  --wait

The comment should be quoted, like:

/usr/sbin/iptables -t nat -C FLANNEL-POSTRTG -s 10.42.0.0/16 ! -d 224.0.0.0/4 -m comment --comment 'flanneld masq' -j MASQUERADE  --wait
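
To illustrate from an interactive shell (a minimal sketch, not flannel's actual invocation; flannel execs iptables directly rather than through a shell, and the empty token in Bad argument `' may point to an empty argv element rather than a split comment):

  # Unquoted from a shell: --comment consumes only "flanneld";
  # "masq" is left over and iptables v1.4.21 rejects it:
  /usr/sbin/iptables -t nat -C FLANNEL-POSTRTG -s 10.42.0.0/16 \
      ! -d 224.0.0.0/4 -m comment --comment flanneld masq -j MASQUERADE --wait
  # iptables: Bad argument `masq'

  # Quoted: the comment is a single argument and the check parses cleanly:
  /usr/sbin/iptables -t nat -C FLANNEL-POSTRTG -s 10.42.0.0/16 \
      ! -d 224.0.0.0/4 -m comment --comment 'flanneld masq' -j MASQUERADE --wait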

@rbrtbnfgl
Contributor

OK. It's probably related to the different iptables version on CentOS. I'll check how the command is parsed in flannel.

@brandond
Contributor

Can you try starting k3s with --prefer-bundled-bin? I'm curious if the bundled version of iptables handles that differently.

@noselasd
Author

Can you try starting k3s with --prefer-bundled-bin? I'm curious if the bundled version of iptables handles that differently.

Thank you, that works!
I did a few tests with both:

 curl -sfL https://get.k3s.io |   INSTALL_K3S_EXEC="--disable traefik --prefer-bundled-bin" sh -
 curl -sfL https://get.k3s.io |   INSTALL_K3S_EXEC="--disable traefik" sh -

And --prefer-bundled-bin makes networking work, as far as I can tell.

@brandond
Contributor

OK. That suggests that some recent changes to Flannel have broken compatibility with older versions of iptables. Can you confirm what version of iptables your host has?

@brandond brandond added this to the v1.26.4+k3s1 milestone Mar 15, 2023
@noselasd
Author

Can you confirm what version of iptables your host has?

This is on the host:

# iptables --version
iptables v1.4.21

@brandond
Contributor


Wow, that is an older version than I have seen in a while; it is something like 9 years old. For some reason I thought there was a newer version available for EL7, but perhaps not without adding additional repos.

@programmerq

I too ran into this, but on Ubuntu 18.04.6 LTS.

It has iptables v1.6.1.

I found that setting prefer-bundled-bin: true in my /etc/rancher/k3s/config.yaml file also did the trick. Probably time to upgrade from 18.04 too.
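
For reference, that setting in the config file looks like this (a minimal sketch; any other keys already in an existing config.yaml stay as they are):

  # /etc/rancher/k3s/config.yaml
  prefer-bundled-bin: true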

@brandond brandond added priority/critical-urgent kind/bug Something isn't working kind/upstream-issue This issue appears to be caused by an upstream bug labels Mar 20, 2023
@brandond brandond changed the title Networking/NAT not working on CentOS 7 with v1.25.7+k3s1 Networking/NAT not working on older iptables versions with v1.25.7+k3s1 Mar 21, 2023
@rbrtbnfgl rbrtbnfgl modified the milestones: v1.26.4+k3s1, v1.26.3+k3s1 Mar 22, 2023
@VestigeJ VestigeJ self-assigned this Mar 24, 2023
@VestigeJ

Reproduced using VERSION=v1.26.2+k3s1
Validated using VERSION=v1.26.3-rc2+k3s1

Reproduction on CentOS 7.9: bare single-node install, no configs

$ sudo INSTALL_K3S_VERSION=v1.26.2+k3s1 INSTALL_K3S_EXEC=server ./install-k3s.sh

$ k3s -v

k3s version v1.26.2+k3s1 (ea094d1d)
go version go1.19.6

$ iptables --version

iptables v1.4.21

$ sudo iptables -t nat -vnL | grep -i flannel

  969 58796 FLANNEL-POSTRTG  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* flanneld masq */
Chain FLANNEL-POSTRTG (1 references)

Validation on CentOS 7.9: single node upgraded in place, checking for the previously missing routing rules

$ sudo INSTALL_K3S_VERSION=v1.26.3-rc2+k3s1 INSTALL_K3S_EXEC=server ./install-k3s.sh

$ k3s -v

k3s version v1.26.3-rc2+k3s1 (01ea3ff2)
go version go1.19.7

centos@node:~$ iptables --version

iptables v1.4.21

$ sudo iptables -t nat -vnL | grep -i flannel

  106  6444 FLANNEL-POSTRTG  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* flanneld masq */
Chain FLANNEL-POSTRTG (1 references)
    0     0 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0            mark match 0x4000/0x4000 /* flanneld masq */
   84  5040 RETURN     all  --  *      *       10.42.0.0/24         10.42.0.0/16         /* flanneld masq */
    0     0 RETURN     all  --  *      *       10.42.0.0/16         10.42.0.0/24         /* flanneld masq */
    0     0 RETURN     all  --  *      *      !10.42.0.0/16         10.42.0.0/24         /* flanneld masq */
    1    96 MASQUERADE  all  --  *      *       10.42.0.0/16        !224.0.0.0/4          /* flanneld masq */
    0     0 MASQUERADE  all  --  *      *      !10.42.0.0/16         10.42.0.0/16         /* flanneld masq */

Reproduction on Ubuntu 18.04: bare single-node install, no configs

$ sudo INSTALL_K3S_VERSION=v1.26.2+k3s1 INSTALL_K3S_EXEC=server ./install-k3s.sh

$ k3s -v

k3s version v1.26.2+k3s1 (ea094d1d)
go version go1.19.6

$ iptables --version

iptables v1.6.1

$ sudo iptables -t nat -vnL | grep -i flannel

  891 53800 FLANNEL-POSTRTG  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* flanneld masq */
Chain FLANNEL-POSTRTG (1 references)

Validation on Ubuntu 18.04: existing node upgraded to the latest rc2

$ sudo INSTALL_K3S_VERSION=v1.26.3-rc2+k3s1 INSTALL_K3S_EXEC=server ./install-k3s.sh

$ k3s -v

k3s version v1.26.3-rc2+k3s1 (01ea3ff2)
go version go1.19.7

ubuntu@node-two:~$ iptables --version

iptables v1.6.1

ubuntu@ip-172-31-17-58:~$ sudo iptables -t nat -vnL | grep -i flannel

  114  7089 FLANNEL-POSTRTG  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* flanneld masq */
Chain FLANNEL-POSTRTG (1 references)
    0     0 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0            mark match 0x4000/0x4000 /* flanneld masq */
   83  4980 RETURN     all  --  *      *       10.42.0.0/24         10.42.0.0/16         /* flanneld masq */
    0     0 RETURN     all  --  *      *       10.42.0.0/16         10.42.0.0/24         /* flanneld masq */
    0     0 RETURN     all  --  *      *      !10.42.0.0/16         10.42.0.0/24         /* flanneld masq */
    1    96 MASQUERADE  all  --  *      *       10.42.0.0/16        !224.0.0.0/4          /* flanneld masq */
    0     0 MASQUERADE  all  --  *      *      !10.42.0.0/16         10.42.0.0/16         /* flanneld masq */

