[fedora firewall issue] Fail to create an IPv6 multinode cluster - it hangs when it is Joining the workers. #1283
Comments
Have you read the known issues? I'm surprised IPv4 is working; btrfs is known to cause issues.
/assign
@ricardo-rod please paste or upload the output of the command used to create the cluster, run with increased verbosity.
Good news: the problem seems to be with the OS. I ran the same kind cluster on Debian and it works like a charm. I also tested on a non-btrfs filesystem using Fedora 30 and 31 with SELinux disabled in its config file, and got the same problem. Then I thought the problem might be firewalld, because the correct commands are not being issued when kind runs IPv6 on Fedora, Red Hat, or SUSE; and boom, it was the firewall. I re-enabled SELinux and disabled firewalld, and it worked. I know I must not disable the firewall, but there it is: the bug is with firewalld when using IPv6. @aojea here is the output file from bash, good luck.
This sounds like a bug between docker and firewalld on your hosts when using IPv6. We're not doing anything terribly interesting on the host networking-wise, just normal containers with IPs and a port forward.
This may also have caused problems: https://fedoraproject.org/wiki/Changes/firewalld_default_to_nftables. KIND does not execute any firewall commands; docker does, however.
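If the nftables default is the suspect, firewalld's backend can be switched back to iptables for comparison. A minimal sketch, assuming the stock /etc/firewalld/firewalld.conf key names; it operates on a sample copy so nothing on the host is touched:

```shell
# Sample of the relevant firewalld.conf key (Fedora 31 defaults to nftables).
conf=firewalld.conf.sample
printf 'DefaultZone=public\nFirewallBackend=nftables\n' > "$conf"

# Show which backend is configured.
grep '^FirewallBackend' "$conf"

# Switch to the iptables backend that docker's rules were written against.
# On a real host, edit /etc/firewalld/firewalld.conf instead and then run:
#   sudo systemctl restart firewalld
sed -i 's/^FirewallBackend=.*/FirewallBackend=iptables/' "$conf"
grep '^FirewallBackend' "$conf"
```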
As Ben says, KIND doesn't touch the iptables rules on the hosts; docker does.
The output of the non-working environment on Fedora 30 and the output of the working environment on Debian are attached. I'm still trying to see if the problem is the IPv6 NAT; I will try tomorrow or late at night. I'm hoping that will lead to a workaround.
At first sight I don't see any docker rule allowing FORWARD traffic between pods in the Fedora dump.
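A quick way to compare the two dumps is to grep the FORWARD section for the DOCKER chains docker normally installs. This is a sketch against a saved dump file; the sample lines below illustrate a healthy IPv4 table and are not taken from the attached dumps:

```shell
# Lines a healthy iptables-save dump contains for docker forwarding.
cat > dump.sample <<'EOF'
-A FORWARD -j DOCKER-USER
-A FORWARD -j DOCKER-ISOLATION-STAGE-1
-A FORWARD -o docker0 -j DOCKER
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
EOF

# If this prints nothing for a given dump, docker's forwarding rules are
# missing there, which is what the Fedora ip6tables output suggests.
grep -E '^-A FORWARD .*DOCKER' dump.sample
```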
Here is the output of docker while the process is running. I now know that it is opening the socket: when I issued netstat -atunp I could see the port there, but the forwarding is missing. https://raw.githubusercontent.com/ricardo-rod/files/master/non-working-fedora-docker

As we all knew, this is a problem related to NAT6: docker does not do real IPv6 NAT. Then I remembered a container I used in the past, when I was learning docker, to bypass the non-global IPv6 address limitation: https://github.com/robbertkl/docker-ipv6nat. Then, boom, it worked without disabling the firewalld daemon; now the port forward is made, look at the logs.

It seems that the port-forwarding option that kind passes needs to be reported to the firewalld team, or maybe to the docker team, to work with the RPM distributions, or better, with the new firewalld/UFW nftables backend.

Note: btrfs is working like a charm on Fedora 30 and 31.

Workaround 1:
Step 1: run the docker-ipv6nat container.
Step 2: run your IPv6 multinode cluster YAML file.

Workaround 2 (you should not do this unless you know how to set up manual ip6tables and iptables rules for all the connections and port-forwarding yourself):
Step 1: disable and stop the firewalld daemon.
Step 2: run your IPv6 multinode cluster YAML file.

@BenTheElder & @aojea, could you please take a look at the config when it's working on Fedora with the IPv6 docker NAT and compare it to the working one using Debian? Thanks in advance. How should I proceed with the bug?
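For Workaround 1's first step, the docker-ipv6nat container referenced above runs as a privileged host-network container. This sketch follows the invocation documented in that project's README (the image name and flags come from there, not from this thread) and needs a running docker daemon:

```shell
# Run docker-ipv6nat so published IPv6 ports get NATed like IPv4 ones do.
docker run -d --name ipv6nat \
  --privileged \
  --network host \
  --restart unless-stopped \
  -v /var/run/docker.sock:/var/run/docker.sock:ro \
  -v /lib/modules:/lib/modules:ro \
  robbertkl/ipv6nat

# Then create the IPv6 cluster as usual from the config file.
```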
@ricardo-rod please paste the output of that command.
The ip6tables rules don't have the DOCKER rules allowing container communication. I checked with @saschagrunert that this is configured in docker: the rules are added using libnetwork, which seems to work only with the IPv4 iptables version; there is no ip6tables equivalent. I don't think docker is going to implement ip6tables support, per comments in recently reported issues. @ricardo-rod, do you mind testing this and reporting whether it works?
Bonus if it works and you send a PR documenting it in the Known Issues in our docs 😄
All set, no problems. I made a test with 2 new VMs and issued the firewalld commands you wrote. Nothing more to report.
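The exact firewall-cmd invocations from the earlier comment are not preserved in this transcript; a commonly documented equivalent (an assumption, not a quote from the thread) is to put docker's bridge interface in firewalld's trusted zone:

```shell
# Trust traffic on docker's default bridge so firewalld stops dropping
# inter-container and forwarded traffic. (Hypothetical reconstruction of
# the elided commands; docker0 is docker's default bridge interface.)
sudo firewall-cmd --permanent --zone=trusted --add-interface=docker0
sudo firewall-cmd --reload
```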
@ricardo-rod can we close this, then?
I must report that when I tried to set up the Kubernetes dashboard, it hung until I stopped the firewalld daemon again; here is the output. Anyway, docker is confused and is NATing IPv6 to an IPv4 address, and that is not going to happen unless NAT64 is being used, which I don't think is the case.

● firewalld.service - firewalld - dynamic firewall daemon
Jan 31 10:33:16 ryzen firewalld[466921]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -D FORWARD -i docker0 -o docker0 -j DROP' failed: iptables: Bad rule (does a matching rule exist in that chain?).
@aojea as you can see in the last part of it, the inspect shows that the IPv6 options are not being used; docker is not issuing ip6tables commands at all, and it always tries to bind from an IPv6 address and masquerade to an IPv4 address, as the firewalld output shows, and that is never going to work. Happy and sad :) - :(
@ricardo-rod let's go step by step. First question: with the commands I provided, does the cluster work? If the cluster works, can you create it and paste the output? Once we know the cluster works, we can check what to do with the applications running inside the cluster.
@aojea yes, the cluster works and the pods and containers are running. When I shut down firewalld I get the same effect, everything works; but when the firewalld daemon is started after the creation of the pods, services and the dashboard do not get through.

Was this rule added by the firewall-cmd command: '/usr/sbin/iptables -w10 -t nat -A DOCKER -p tcp -d ::1 --dport 32770 -j DNAT --to-destination 172.17.0.2:6443 ! -i docker0'?

If the cluster works, docker network inspect bridge should show the containers attached to the bridge, and I can't see any.
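The docker network inspect bridge check can be narrowed with the docker CLI's standard --format Go templates; for example:

```shell
# Is IPv6 enabled on the default bridge network at all?
docker network inspect bridge --format '{{.EnableIPv6}}'

# Which containers are attached, and what IPv6 addresses do they hold?
docker network inspect bridge \
  --format '{{range .Containers}}{{.Name}} {{.IPv6Address}}{{println}}{{end}}'
```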
Ok, cool, we are good then. It seems like a bug in firewalld, then? We need to understand who adds that rule.
With the cluster deployed and running, give me the iptables-save output again please 😅
Here is the output of the WordPress example from Kubernetes:

kubectl get secrets; kubectl get pvc; kubectl get pods; kubectl get services wordpress

NAME TYPE DATA AGE

Now here is the output of the port-forwarding:
Forwarding from [::]:8080 -> 80

And here is the file containing the iptables-save and ip6tables-save output. I will have to take a deeper look at the firewalld logs to see what is going on: whether it is docker that sends the commands, or firewalld.
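The "Forwarding from [::]:8080 -> 80" line is consistent with kubectl port-forward bound to the IPv6 wildcard address. A sketch of the likely invocation (the service name follows the WordPress example above; --address is a standard kubectl flag):

```shell
# Forward local IPv6 port 8080 to port 80 of the wordpress service.
kubectl port-forward --address '::' service/wordpress 8080:80
# Prints: Forwarding from [::]:8080 -> 80
```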
@ricardo-rod firewalld seems to have several issues reported with docker.
Here is the output. I can confirm that docker is issuing the commands, not firewalld. Here is the debugged version of the docker and firewalld logs. After removing every rule for and from docker from the firewalld daemon, I started everything again; the result was that docker is sending the wrong commands to firewalld, causing bad port-forwarding (IPv6-to-IPv4 NAT, which invalidates IPv6 connectivity). Where should I file the bug, firewalld or docker? Maybe the firewalld team will act quicker than docker, knowing that the docker network team has not fixed IPv6 at all.
@ricardo-rod I'd go with firewalld; just explain the scenario as you did here: you have docker with IPv6 and firewalld, and it's installing a wrong rule, an IPv6 address in an IPv4 rule.
Feel free to tag me on the firewalld issue.
Is there anything else for this project to do here? This seems to be a downstream bug.
As far as I can tell this is a bug with docker / firewalld; let's track it in those projects. I see an issue has been opened against each.
What happened:
When I try to run a multinode or multinode HA cluster with IPv6, it just hangs there for so many minutes that I have stopped counting.
What you expected to happen:
The nodes should join the cluster in less than 2 minutes; I have waited 45 minutes and nothing happened.
How to reproduce it (as minimally and precisely as possible):
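The original reproduction config is not preserved in this transcript; a minimal IPv6 multinode config of the kind described (assuming the v1alpha3 config API that kind v0.7.0 used, with hypothetical file and cluster names) would be:

```shell
# Write a minimal IPv6 multinode config: one control-plane, two workers.
cat > kind-ipv6.yaml <<'EOF'
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha3
networking:
  ipFamily: ipv6
nodes:
- role: control-plane
- role: worker
- role: worker
EOF

# Then create the cluster (joining the workers is where the hang occurs):
#   kind create cluster --name ipv6-test --config kind-ipv6.yaml
```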
Anything else we need to know?:
I tried the global IPv6 routing space and IPv6 unique local addresses and got the same results. When I ran the same YAML for IPv4 only, there were no problems.
Environment:
kind version: (use kind version): kind v0.7.0 go1.13.5 linux/amd64
Kubernetes version: (use kubectl version): Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.2", GitCommit:"59603c6e503c87169aea6106f57b9f242f64df89", GitTreeState:"clean", BuildDate:"2020-01-18T23:30:10Z", GoVersion:"go1.13.5", Compiler:"gc", Platform:"linux/amd64"}
Docker version: (use docker info):
Client:
Debug Mode: false
Server:
Containers: 3
Running: 0
Paused: 0
Stopped: 3
Images: 18
Server Version: 19.03.5
Storage Driver: btrfs
Build Version: Btrfs v5.2.1
Library Version: 102
Logging Driver: json-file
Cgroup Driver: cgroupfs
Client: Docker Engine - Community
Version: 19.03.5
API version: 1.40
Go version: go1.12.12
Git commit: 633a0ea838
Built: Wed Nov 13 07:26:43 2019
OS/Arch: linux/amd64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 19.03.5
API version: 1.40 (minimum version 1.12)
Go version: go1.12.12
Git commit: 633a0ea838
Built: Wed Nov 13 07:24:37 2019
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.2.10
GitCommit: b34a5c8af56e510852c35414db4c1f4fa6172339
runc:
Version: 1.0.0-rc8+dev
GitCommit: 3e425f80a8c931f88e6d94a8c831b9d5aa481657
docker-init:
Version: 0.18.0
GitCommit: fec3683
OS: (use /etc/os-release):
NAME=Fedora
VERSION="31 (Workstation Edition)"
ID=fedora
VERSION_ID=31
VERSION_CODENAME=""
PLATFORM_ID="platform:f31"
PRETTY_NAME="Fedora 31 (Workstation Edition)"
ANSI_COLOR="0;34"
LOGO=fedora-logo-icon
CPE_NAME="cpe:/o:fedoraproject:fedora:31"
HOME_URL="https://fedoraproject.org/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora/f31/system-administrators-guide/"
SUPPORT_URL="https://fedoraproject.org/wiki/Communicating_and_getting_help"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=31
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=31
PRIVACY_POLICY_URL="https://fedoraproject.org/wiki/Legal:PrivacyPolicy"
VARIANT="Workstation Edition"
VARIANT_ID=workstation