Iptables not properly set when using dual stack with ipv6 #7211
Comments
Please take a look at the thread over at #7203 (comment) - can you confirm whether or not you have a default-drop or default-deny rule at the end of your INPUT chain? |
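For anyone checking this on their own node, a quick way to inspect the INPUT chain (a sketch assuming root and the default filter table):

```sh
# List the INPUT chain with rule numbers; a default-deny setup shows either a
# final "-j DROP"/"-j REJECT" rule or a chain policy of DROP.
iptables -L INPUT -n -v --line-numbers
iptables -S INPUT     # same rules in iptables-save syntax; "-P INPUT DROP" means a DROP policy
ip6tables -S INPUT    # on a dual-stack node, check the IPv6 chain as well
```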
cc @rbrtbnfgl since I believe this is related to the kube-router change we're discussing in that other thread. |
Yes, but I don't know whether we should switch back to ACCEPT only for the INPUT chain, for the case where a node has a firewall that drops all traffic on it. That was upstream's reason for keeping ACCEPT. |
It does feel like the kube-router default ACCEPT rule has been covering up problems for a lot of folks that SHOULD have opened up their firewall rules for K3s. Now that we just RETURN they are running into problems because they didn't properly configure their iptables rules for K3s. On the one hand I want to say it's working as designed (users need to properly configure their host iptables rules if they are blocking traffic), on the other it is a breaking change for users who are upgrading and suddenly configurations that previously worked now do not. I wonder if there is a way to fix the timeout issue in #6691 without also breaking clusters when users upgrade on a node that doesn't have properly configured user-managed iptables rules. |
I lost more than a day trying to fix this. What kind of "preparation" of the iptables rules is needed? I can add them if I know which ones :) |
It's just bad luck that we started with this IPv6 setup at the same time 1.25.8 replaced 1.25.7, because indeed, now that I have installed v1.25.7 I have zero issues. It's curious that this happens with a patch release; I never expected it. This is really a very breaking change as far as I can see :( |
If you are using a firewall on the node, the required preparation is documented in the docs: https://docs.k3s.io/advanced#additional-os-preparations |
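For reference, the preparation on that page amounts to opening the K3s ports and trusting the pod/service networks; with firewalld it looks roughly like this (the CIDRs are the K3s defaults, adjust if you changed them):

```sh
firewall-cmd --permanent --add-port=6443/tcp                       # apiserver
firewall-cmd --permanent --zone=trusted --add-source=10.42.0.0/16  # pod CIDR
firewall-cmd --permanent --zone=trusted --add-source=10.43.0.0/16  # service CIDR
firewall-cmd --reload
```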
While I appreciate that the current behavior is probably correct from a security perspective, I am very concerned that it is also a breaking change for many users who were relying on the old behavior for proper functioning of their cluster. @rbrtbnfgl would it be possible to put the accept/return behavior behind a CLI flag that defaults to the old ACCEPT behavior? |
The issue is related to the iptables rules added by kube-router at the beginning of the chain. Packets that should be accepted are correctly marked, but the ACCEPT rule is evaluated before the other rules in the chain. If the ACCEPT rule that matches the mark is instead appended at the end of the chain, the packets are still accepted by the kube-router rule, but only after they have been checked against every other rule in that chain. |
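As an illustration of why the position matters (the 0x10000 mark value here is a placeholder, not necessarily the mark kube-router actually uses):

```sh
# Inserted at position 1, the ACCEPT short-circuits everything below it,
# including a user-managed default-drop rule at the end of the chain:
iptables -I INPUT 1 -m mark --mark 0x10000/0x10000 -j ACCEPT

# Appended at the end, the same rule is only reached after every earlier rule
# has been evaluated, so a broad DROP/REJECT placed above it wins:
iptables -A INPUT -m mark --mark 0x10000/0x10000 -j ACCEPT
```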
To be honest we were very happy with the old implementation; it made things very easy and it always worked. I don't see much advantage in changing this, since we need to open the ports anyway. Why not rely on k3s to do this for us? |
I can easily change kube-router to add the ACCEPT rules at the end of the chain, to maintain the previous behaviour while keeping the fix for #6691 |
@rbrtbnfgl do you think you can get that in for the next release? If so I believe that would probably save us a lot of additional issues. |
Wow, I can't wait to test the fix. Is there an easy way to do this, or will it be released soon? PS: Do we really need to uninstall k3s completely if we want to enable IPv6 on an already installed k3s? That seems like a lot of hassle, but it appears to be noted in the documentation? |
it's enough to run |
See #7203 (comment)
If you want to have a dual-stack cluster, yes you should configure the dual-stack CIDRs when starting the server for the first time. |
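For reference, a dual-stack first start looks something like this (the IPv6 prefixes are example values; substitute your own ranges):

```sh
# Dual-stack cluster and service CIDRs have to be passed on the very first
# server start; they are not meant to be changed on an existing cluster.
curl -sfL https://get.k3s.io | sh -s - server \
  --cluster-cidr=10.42.0.0/16,2001:cafe:42::/56 \
  --service-cidr=10.43.0.0/16,2001:cafe:43::/112
```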
OK, that's confusing @brandond. Can we get by with just the killall script? What about single-node installs? The fix doesn't seem to work for our use case, but I commented in the other thread |
Hi @brandond, we have a bunch of servers to upgrade with IPv6 support. Is the killall script usable or not? :) Is there any way/command to show the current k3s install options? That could come in handy when running upgrades. For the rest, I tested the rc1 of 1.25 and it seems resolved so far! |
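As far as I know there is no single command that prints the install options, but with the default install script the flags end up in the systemd unit and its env file, so something like this shows them (paths assume a standard get.k3s.io install):

```sh
grep -A5 ExecStart /etc/systemd/system/k3s.service   # flags passed at install time
cat /etc/systemd/system/k3s.service.env              # K3S_* environment variables
cat /etc/rancher/k3s/config.yaml 2>/dev/null         # only present if a config file is used
```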
Closing as duplicate of #7203 |
@rbrtbnfgl I stumbled across this issue after my (and my colleagues') local k3s-backed dev environments stopped working after upgrading to 1.26.4. My scenario is that we use k3s as a local development environment and route all our local domains to 127.0.0.1. This has worked for years, but all of a sudden ports 80 and 443 running through the svclb stopped responding on 127.0.0.1 while still working on LAN addresses like 192.168.... After pinning the problem down to the k3s version used, I could confirm it still worked on k3s v1.25.6, but no longer on 1.25.9, 1.26.4 and 1.27.1, which led me to this very ticket and the PR mentioned below. To be honest, I don't understand how PR #7218 could produce that behavior; on the other hand, my iptables and containerd routing knowledge is very limited. So I'm wondering: is this just a red herring, or what did I misconfigure here? Thanks so much in advance - Max |
Hi @proligde could you open an issue with your setup config? |
Environmental Info:
K3s Version: v1.25.8+k3s1
Node(s) CPU architecture, OS, and Version: 5.10.0-21-amd64 #1 SMP Debian 5.10.162-1 (2023-01-21) x86_64 GNU/Linux
Cluster Configuration: 1 server
Describe the bug:
When using ConfigServer Firewall (csf) with predefined iptables rules, we never had any issues; K3s properly creates all of its firewall rules (using IPv4 only). Now with IPv6 activated we get timeouts.
For example: [WARNING] plugin/kubernetes: Kubernetes API connection failure: Get "https://10.43.0.1:443/version": dial tcp 10.43.0.1:443: i/o timeout
But this never recovers...
Steps To Reproduce:
Install k3s with these arguments:
And you get the non-working situation when csf (ConfigServer Firewall) is enabled.
Install like this, and there are 0 issues in the same environment:
Another solution:
Also leads to a working installation ...
Expected behavior:
A properly working k3s, also with ipv6
Actual behavior:
Not working: coredns and the metrics service keep crashing and go into CrashLoopBackOff. There also seems to be a chain that is normally empty which is now being filled:
Additional context / logs:
[WARNING] plugin/kubernetes: Kubernetes API connection failure: Get "https://10.43.0.1:443/version": dial tcp 10.43.0.1:443: i/o timeout
time="2023-04-04T20:16:21Z" level=fatal msg="Error starting daemon: Cannot start Provisioner: failed to get Kubernetes server version: Get "https://10.43.0.1:443/version?timeout=32s\": dial tcp 10.43.0.1:443: i/o timeout"
[WARNING] plugin/kubernetes: Kubernetes API connection failure: Get "https://10.43.0.1:443/version": dial tcp 10.43.0.1:443: i/o timeout
Warning Unhealthy 3m14s (x2 over 3m17s) kubelet Readiness probe failed: Get "https://10.42.0.3:10250/readyz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Warning Unhealthy 2m49s (x16 over 3m16s) kubelet Readiness probe failed: Get "https://10.42.0.3:10250/readyz": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Warning Unhealthy 2m48s kubelet Readiness probe failed: Get "https://10.42.0.3:10250/readyz": dial tcp 10.42.0.3:10250: connect: connection refused
Some logs:
daemon.log
not-working-kube.txt
working-kube.txt