Iptables not properly set when using dual stack with ipv6 #7211
Comments
Please take a look at the thread over at #7203 (comment) - can you confirm whether or not you have a default-drop or default-deny rule at the end of your INPUT chain? |
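For anyone checking this on their own node, a quick way to inspect the INPUT chain (a sketch assuming root and the default filter table):

```sh
# List the INPUT chain with rule numbers; a default-deny setup shows either a
# final "-j DROP"/"-j REJECT" rule or a chain policy of DROP.
iptables -L INPUT -n -v --line-numbers
iptables -S INPUT     # same rules in iptables-save syntax; "-P INPUT DROP" means a DROP policy
ip6tables -S INPUT    # on a dual-stack node, check the IPv6 chain as well
```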
cc @rbrtbnfgl since I believe this is related to the kube-router change we're discussing in that other thread. |
Yes, but I don't know whether we should switch back to ACCEPT only for the INPUT chain, for the case where a node has a firewall that drops all traffic on it. That was upstream's reason for keeping ACCEPT. |
It does feel like the kube-router default ACCEPT rule has been covering up problems for a lot of folks that SHOULD have opened up their firewall rules for K3s. Now that we just RETURN they are running into problems because they didn't properly configure their iptables rules for K3s. On the one hand I want to say it's working as designed (users need to properly configure their host iptables rules if they are blocking traffic), on the other it is a breaking change for users who are upgrading and suddenly configurations that previously worked now do not. I wonder if there is a way to fix the timeout issue in #6691 without also breaking clusters when users upgrade on a node that doesn't have properly configured user-managed iptables rules. |
I lost more than a day trying to fix this. What kind of "preparation" of the iptables rules is needed? I can add them if I know which ones :) |
It's just bad luck that we started with this IPv6 setup at the same time 1.25.8 replaced 1.25.7, because indeed, now that I have installed v1.25.7 I have zero issues. It's curious that this happens with a patch release; I never expected it. This is really a very breaking change as far as I can see :( |
If you are using a firewall on the node, the required preparation is documented in the docs: https://docs.k3s.io/advanced#additional-os-preparations |
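For reference, the preparation on that page amounts to opening the K3s ports and trusting the pod/service networks; with firewalld it looks roughly like this (the CIDRs are the K3s defaults, adjust if you changed them):

```sh
firewall-cmd --permanent --add-port=6443/tcp                       # apiserver
firewall-cmd --permanent --zone=trusted --add-source=10.42.0.0/16  # pod CIDR
firewall-cmd --permanent --zone=trusted --add-source=10.43.0.0/16  # service CIDR
firewall-cmd --reload
```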
While I appreciate that the current behavior is probably correct from a security perspective, I am very concerned that it is also a breaking change for many users who were relying on the old behavior for proper functioning of their cluster. @rbrtbnfgl would it be possible to put the accept/return behavior behind a CLI flag that defaults to the old ACCEPT behavior? |
The issue is related to the iptables rules added by kube-router at the beginning of the chain. Packets that should be accepted are correctly marked, but the ACCEPT rule is evaluated before the other rules in the chain. If the ACCEPT rule that matches the mark is instead appended at the end of the chain, the packets are still accepted by the kube-router rule, but only after they have been checked against every other rule in that chain. |
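As an illustration of why the position matters (the 0x10000 mark value here is a placeholder, not necessarily the mark kube-router actually uses):

```sh
# Inserted at position 1, the ACCEPT short-circuits everything below it,
# including a user-managed default-drop rule at the end of the chain:
iptables -I INPUT 1 -m mark --mark 0x10000/0x10000 -j ACCEPT

# Appended at the end, the same rule is only reached after every earlier rule
# has been evaluated, so a broad DROP/REJECT placed above it wins:
iptables -A INPUT -m mark --mark 0x10000/0x10000 -j ACCEPT
```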
To be honest we were very happy with the old implementation; it made things very easy and it always worked. I don't see much advantage in changing this, since we need to open the ports anyway. Why not rely on k3s to do this for us? |
I can easily change kube-router to add the ACCEPT rules at the end of the chain, to maintain the previous behaviour while keeping the fix for #6691 |
@rbrtbnfgl do you think you can get that in for the next release? If so I believe that would probably save us a lot of additional issues. |
Wow, I can't wait to test the fix. Is there an easy way to do this, or will it be released soon? PS: Do we really need to uninstall k3s completely if we want to enable IPv6 on an already installed k3s? That seems like a lot of hassle, but it appears to be noted in the documentation? |
it's enough to run |
See #7203 (comment)
If you want to have a dual-stack cluster, yes you should configure the dual-stack CIDRs when starting the server for the first time. |
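For reference, a dual-stack first start looks something like this (the IPv6 prefixes are example values; substitute your own ranges):

```sh
# Dual-stack cluster and service CIDRs have to be passed on the very first
# server start; they are not meant to be changed on an existing cluster.
curl -sfL https://get.k3s.io | sh -s - server \
  --cluster-cidr=10.42.0.0/16,2001:cafe:42::/56 \
  --service-cidr=10.43.0.0/16,2001:cafe:43::/112
```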
OK, that's confusing @brandond. Can we get by with just the killall script? What about single-node installs? The fix doesn't seem to work for our use case, but I commented in the other thread |
Hi @brandond, we have a bunch of servers to upgrade with IPv6 support. Is the killall script usable or not? :) Is there any way/command to show the current k3s install options? That could come in handy when running upgrades. For the rest, I tested the rc1 of 1.25 and it seems resolved so far! |
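As far as I know there is no single command that prints the install options, but with the default install script the flags end up in the systemd unit and its env file, so something like this shows them (paths assume a standard get.k3s.io install):

```sh
grep -A5 ExecStart /etc/systemd/system/k3s.service   # flags passed at install time
cat /etc/systemd/system/k3s.service.env              # K3S_* environment variables
cat /etc/rancher/k3s/config.yaml 2>/dev/null         # only present if a config file is used
```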
Closing as duplicate of #7203 |
@rbrtbnfgl I stumbled across this issue after my (and my colleagues') local k3s-backed dev environments stopped working after upgrading to 1.26.4. My scenario is that we use k3s as a local development environment and route all our local domains to 127.0.0.1. This has worked for years, but all of a sudden ports 80 and 443 running through the svclb stopped responding on 127.0.0.1 while still working on LAN addresses like 192.168.... After pinning the problem down to the k3s version used, I could confirm it still worked on k3s v1.25.6, but no longer on 1.25.9, 1.26.4 and 1.27.1, which led me to this very ticket and the PR mentioned below. To be honest, I don't understand how PR #7218 could produce that behavior; on the other hand, my iptables and containerd routing knowledge is very limited. So I'm wondering: is this just a red herring, or what did I misconfigure here? Thanks so much in advance - Max |
Hi @proligde could you open an issue with your setup config? |
Environmental Info:
K3s Version: v1.25.8+k3s1
Node(s) CPU architecture, OS, and Version: 5.10.0-21-amd64 #1 SMP Debian 5.10.162-1 (2023-01-21) x86_64 GNU/Linux
Cluster Configuration: 1 server
Describe the bug:
When using ConfigServer Firewall (csf) with predefined iptables rules, we never had any issues; K3s properly creates all of its firewall rules (using IPv4 only). Now with IPv6 activated we get timeouts.
For example: [WARNING] plugin/kubernetes: Kubernetes API connection failure: Get "https://10.43.0.1:443/version": dial tcp 10.43.0.1:443: i/o timeout
But this never recovers...
Steps To Reproduce:
Install k3s with these arguments:
And you get the non-working situation when csf (ConfigServer Firewall) is enabled.
Install like this, and there are 0 issues in the same environment:
Another solution:
Also leads to a working installation ...
Expected behavior:
A properly working k3s, also with ipv6
Actual behavior:
Not working: coredns and the metrics service keep crashing and go into CrashLoopBackOff. There also seems to be a chain that is normally empty which is now being filled:
Additional context / logs:
[WARNING] plugin/kubernetes: Kubernetes API connection failure: Get "https://10.43.0.1:443/version": dial tcp 10.43.0.1:443: i/o timeout
time="2023-04-04T20:16:21Z" level=fatal msg="Error starting daemon: Cannot start Provisioner: failed to get Kubernetes server version: Get "https://10.43.0.1:443/version?timeout=32s\": dial tcp 10.43.0.1:443: i/o timeout"
[WARNING] plugin/kubernetes: Kubernetes API connection failure: Get "https://10.43.0.1:443/version": dial tcp 10.43.0.1:443: i/o timeout
Warning Unhealthy 3m14s (x2 over 3m17s) kubelet Readiness probe failed: Get "https://10.42.0.3:10250/readyz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Warning Unhealthy 2m49s (x16 over 3m16s) kubelet Readiness probe failed: Get "https://10.42.0.3:10250/readyz": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Warning Unhealthy 2m48s kubelet Readiness probe failed: Get "https://10.42.0.3:10250/readyz": dial tcp 10.42.0.3:10250: connect: connection refused
Some logs:
daemon.log
not-working-kube.txt
working-kube.txt