iptables rules not deleted when using hostPort #3412
Comments
Thank you for reporting this.
How can I get additional debug logs?
I faced the same issue with Is it required to close this ticket?
@sergey-safarov this is definitely not about 1.18; I have 1.17.4 with the same issue. It seems that hostPort is totally broken in Kubernetes :( I don't think this ticket should be closed: as end users we don't install the hostPort plugin directly, we choose Calico as a networking solution, and hostPort support appears to be broken with it.
@tmjd I just wanted to state that this is a pretty severe issue. I experienced a totally unexpected half-hour downtime because of it, just by re-deploying a minor configuration update to my traefik ingress controller, which uses hostPort (on a bare-metal cluster). The only way to deal with this is to restart the affected nodes or to identify and clean up the iptables rules manually. I tried to navigate and search through the provided link to the plugin, and it seems that nobody is going to fix the issue. I'm quite new to k8s and don't deeply understand how the networking works. Maybe the Calico maintainers could consider pushing for this to be fixed, or switching to a better-maintained plugin (if there is one), or at least documenting a big red warning that the hostPort feature is totally broken with Calico? Thank you.
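For the manual cleanup mentioned above, here is a minimal Python sketch. It assumes the name of the leftover chain is already known and that the text passed in comes from `iptables-save -t nat`. The function name `cleanup_commands` is hypothetical, and it only builds command strings for review; it does not run anything. The idea is to delete every jump rule that targets the stale chain first (by turning its `-A` line into `-D`), then flush and delete the chain itself.

```python
from typing import List


def cleanup_commands(save_output: str, stale_chain: str) -> List[str]:
    """Build (but do not run) iptables commands that remove one leftover chain.

    Assumption: save_output is `iptables-save -t nat` text and stale_chain
    is a CNI-DN-* chain that no running pod uses any more.
    """
    commands = []
    for line in save_output.splitlines():
        line = line.strip()
        # A chain cannot be deleted while another rule still jumps to it,
        # so every "-A <chain> ... -j <stale_chain>" rule is removed first.
        if line.startswith("-A ") and line.endswith("-j " + stale_chain):
            commands.append("iptables -t nat -D " + line[len("-A "):])
    commands.append("iptables -t nat -F " + stale_chain)  # flush its rules
    commands.append("iptables -t nat -X " + stale_chain)  # delete the chain
    return commands
```

The commands should be reviewed before being run on a node, since removing a chain that still belongs to a running pod would break that pod's hostPort.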
Has anyone submitted an upstream issue against the hostPort plugin? Calico doesn't play a role in the implementation of host ports, so I'm not sure there's much we could do here. The other thing to look into is whether or not the CNI plugin is even getting called to tear down the pod. It could be an issue with the runtime not calling the CNI plugin, or it could be an issue with the host port plugin itself.
I am also experiencing this, also on CentOS 8. Could it be that this issue is NFT-related, as CentOS 8 uses NFT (nftables) as the configured backend? Otherwise I'd expect more people to have complained in the meantime. I honestly don't know where to start debugging; any suggestions are welcome.
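To check the NFT theory, one starting point is to see which backend the iptables binary reports. iptables 1.8 and later append `(nf_tables)` or `(legacy)` to the `iptables --version` output, while older releases print no backend at all. A small Python sketch, where `iptables_backend` is a hypothetical helper that only parses that version string:

```python
import re
from typing import Optional


def iptables_backend(version_output: str) -> Optional[str]:
    """Return "nf_tables" or "legacy" from `iptables --version` output.

    Returns None when no backend is printed (iptables older than 1.8).
    """
    match = re.search(r"\((nf_tables|legacy)\)", version_output)
    return match.group(1) if match else None
```

Comparing the result on the host with the result inside the containers that write iptables rules would show whether the two sides disagree about the backend.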
Okay, I did a round of debugging that I wanted to document here. Using the following manifest:

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: nginx
  name: nginx
  namespace: troubleshoot
spec:
  containers:
  - image: nginx
    name: nginx
    resources: {}
    ports:
    - containerPort: 80
      hostPort: 10081
      name: http
    - containerPort: 443
      hostPort: 8443
      name: https
  dnsPolicy: ClusterFirst
  restartPolicy: Never

Then executing the following:

$ k apply -f nginx.yml; sleep 15 ;k delete -f nginx.yml
$ # checking iptables
$ iptables -t nat --line-numbers -L CNI-HOSTPORT-DNAT
Chain CNI-HOSTPORT-DNAT (2 references)
num target prot opt source destination
1 CNI-DN-bcfdb00a9541b4df781a0 tcp -- anywhere anywhere /* dnat name: "cni0" id: "e05c0b7dc624c20da56c8db60492744381eb27b7449e28e87f8efa6da8210a9e" */ multiport dports kamanda,pcsync-https
$ # next round
$ k apply -f nginx.yml; sleep 15 ;k delete -f nginx.yml
$ iptables -t nat --line-numbers -L CNI-HOSTPORT-DNAT
Chain CNI-HOSTPORT-DNAT (2 references)
num target prot opt source destination
1 CNI-DN-bcfdb00a9541b4df781a0 tcp -- anywhere anywhere /* dnat name: "cni0" id: "e05c0b7dc624c20da56c8db60492744381eb27b7449e28e87f8efa6da8210a9e" */ multiport dports kamanda,pcsync-https
2 CNI-DN-2a1d84389734e54983abb tcp -- anywhere anywhere /* dnat name: "cni0" id: "24b63cb1cbda96c6c89e5fb89700141460323010decfb98160b6434700e3a9a4" */ multiport dports kamanda,pcsync-https

This means the target=CNI-DN-bcfdb00a9541b4df781a0 rule from the first round is still present after the pod was deleted. Inspecting that chain:

$ iptables -t nat --line-numbers -L CNI-DN-bcfdb00a9541b4df781a0
Chain CNI-DN-bcfdb00a9541b4df781a0 (1 references)
num target prot opt source destination
1 CNI-HOSTPORT-SETMARK tcp -- 10.233.124.50 anywhere tcp dpt:kamanda
2 CNI-HOSTPORT-SETMARK tcp -- localhost6 anywhere tcp dpt:kamanda
3 DNAT tcp -- anywhere anywhere tcp dpt:kamanda to:10.233.124.50:80
4 CNI-HOSTPORT-SETMARK tcp -- 10.233.124.50 anywhere tcp dpt:pcsync-https
5 CNI-HOSTPORT-SETMARK tcp -- localhost6 anywhere tcp dpt:pcsync-https
6 DNAT tcp -- anywhere anywhere tcp dpt:pcsync-https to:10.233.124.50:443
$ kubectl get po -A -o wide | grep 10.233.124.50
$ # => No pod with this IP address exists.

I verified this: when the pod exists, an entry is shown. Next step: setting Felix to log level debug and using stern to show the logs:

$ stern calico-node- | grep -i CNI-DN-bcfdb00a9541b4df781a0
calico-node-vx7p8 calico-node 2020-06-28 15:12:07.416 [DEBUG][61] table.go 729: Parsing line ipVersion=0x4 line=":CNI-DN-bcfdb00a9541b4df781a0 - [0:0]" table="nat"
calico-node-vx7p8 calico-node 2020-06-28 15:12:07.416 [DEBUG][61] table.go 736: Found forward-reference chainName="CNI-DN-bcfdb00a9541b4df781a0" ipVersion=0x4 line=":CNI-DN-bcfdb00a9541b4df781a0 - [0:0]" table="nat"
calico-node-vx7p8 calico-node 2020-06-28 15:12:07.420 [DEBUG][61] table.go 729: Parsing line ipVersion=0x4 line="-A CNI-HOSTPORT-DNAT -p tcp -m comment --comment \"dnat name: \\\"cni0\\\" id: \\\"e05c0b7dc624c20da56c8db60492744381eb27b7449e28e87f8efa6da8210a9e\\\"\" -m multiport --dports 10081,8443 -j CNI-DN-bcfdb00a9541b4df781a0" table="nat"
calico-node-vx7p8 calico-node 2020-06-28 15:12:07.423 [DEBUG][61] table.go 729: Parsing line ipVersion=0x4 line="-A CNI-DN-bcfdb00a9541b4df781a0 -s 10.233.124.50/32 -p tcp -m tcp --dport 10081 -j CNI-HOSTPORT-SETMARK" table="nat"
calico-node-vx7p8 calico-node 2020-06-28 15:12:07.423 [DEBUG][61] table.go 729: Parsing line ipVersion=0x4 line="-A CNI-DN-bcfdb00a9541b4df781a0 -s 127.0.0.1/32 -p tcp -m tcp --dport 10081 -j CNI-HOSTPORT-SETMARK" table="nat"
calico-node-vx7p8 calico-node 2020-06-28 15:12:07.423 [DEBUG][61] table.go 729: Parsing line ipVersion=0x4 line="-A CNI-DN-bcfdb00a9541b4df781a0 -p tcp -m tcp --dport 10081 -j DNAT --to-destination 10.233.124.50:80" table="nat"
calico-node-vx7p8 calico-node 2020-06-28 15:12:07.423 [DEBUG][61] table.go 729: Parsing line ipVersion=0x4 line="-A CNI-DN-bcfdb00a9541b4df781a0 -s 10.233.124.50/32 -p tcp -m tcp --dport 8443 -j CNI-HOSTPORT-SETMARK" table="nat"
calico-node-vx7p8 calico-node 2020-06-28 15:12:07.423 [DEBUG][61] table.go 729: Parsing line ipVersion=0x4 line="-A CNI-DN-bcfdb00a9541b4df781a0 -s 127.0.0.1/32 -p tcp -m tcp --dport 8443 -j CNI-HOSTPORT-SETMARK" table="nat"
calico-node-vx7p8 calico-node 2020-06-28 15:12:07.423 [DEBUG][61] table.go 729: Parsing line ipVersion=0x4 line="-A CNI-DN-bcfdb00a9541b4df781a0 -p tcp -m tcp --dport 8443 -j DNAT --to-destination 10.233.124.50:443" table="nat"
calico-node-vx7p8 calico-node 2020-06-28 15:12:07.424 [DEBUG][61] table.go 799: Read hashes from dataplane: map[string][]string{"CNI-DN-2a1d84389734e54983abb":[]string{"", "", "", "", "", ""}, "CNI-DN-365d9a185cff3e5ea9379":[]string{"", ""}, "CNI-DN-442e7af5e08e6bf46621a":[]string{"", "", ""}, "CNI-DN-90b43263af9d462e04f9b":[]string{"", "", "", "", "", ""}, "CNI-DN-9a7663babfc285ea01c15":[]string{"", "", "", "", "", ""}, "CNI-DN-bcfdb00a9541b4df781a0":[]string{"", "", "", "", "", ""}, "CNI-DN-e4582bd2b3ff7ed196a28":[]string{"", ""}, "CNI-DN-ea59c359e9d718f7116dc":[]string{}, "CNI-DN-f0f13904d22032106b1e7":[]string{"", "", ""}, "CNI-DN-f54bc33afc414d6e64133":[]string{}, "CNI-DN-fc387103aa44e3b189d09":[]string{"", "", "", "", "", ""}, "CNI-DN-fc95996fce81df369bc47":[]string{}, "CNI-DN-ff88843368fae4d5cdfae":[]string{}, "CNI-HOSTPORT-DNAT":[]string{"", ""}, "CNI-HOSTPORT-MASQ":[]string{""}, "CNI-HOSTPORT-SETMARK":[]string{""}, "INPUT":[]string{}, "KUBE-KUBELET-CANARY":[]string{}, "KUBE-MARK-DROP":[]string{""}, "KUBE-MARK-MASQ":[]string{""}, "KUBE-POSTROUTING":[]string{""}, "OUTPUT":[]string{"tVnHkvAo15HuiPy0", ""}, "POSTROUTING":[]string{"O3lYWMrLQYEMJtB5", "", ""}, "PREROUTING":[]string{"6gwbT8clXdHdC1b1", ""}, "cali-OUTPUT":[]string{"GBTAv2p5CwevEyJm"}, "cali-POSTROUTING":[]string{"Z-c7XtVd2Bq7s_hA", "nYKhEzDlr11Jccal", "SXWvdsbh4Mw7wOln"}, "cali-PREROUTING":[]string{"r6XmIziWUJsdOK6Z"}, "cali-fip-dnat":[]string{}, "cali-fip-snat":[]string{}, "cali-nat-outgoing":[]string{"flqWnvo8yq4ULQLa"}} ipVersion=0x4 table="nat"
calico-node-vx7p8 calico-node 2020-06-28 15:12:07.425 [DEBUG][61] table.go 568: Skipping expected chain chainName="CNI-DN-bcfdb00a9541b4df781a0" ipVersion=0x4 table="nat"

For me the notable sentence is "Skipping expected chain"; the reference is to this line of code in Felix (table.go 568 in the log). As the cause of the problem seems to be in Felix, I will open an issue there.
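The manual check in this comment (DNAT destination IP versus live pod IPs) could be automated roughly as follows. This is a sketch only: `stale_hostport_chains` is a hypothetical helper, the input is assumed to be `iptables-save -t nat` text in the same `-A ...` form that appears in the Felix log, and the live pod IPs are assumed to be collected separately (for example from `kubectl get po -A -o wide`). It handles IPv4 only and would miss a stale chain whose IP has since been reused by another pod.

```python
import re
from typing import Dict, Iterable, Set

# Matches e.g.
# -A CNI-DN-bcfd... -p tcp -m tcp --dport 10081 -j DNAT --to-destination 10.233.124.50:80
DNAT_RULE = re.compile(
    r"^-A (CNI-DN-\S+) .*-j DNAT --to-destination (\d+\.\d+\.\d+\.\d+):\d+"
)


def stale_hostport_chains(
    save_output: str, live_pod_ips: Iterable[str]
) -> Dict[str, Set[str]]:
    """Map each CNI-DN-* chain to the DNAT target IPs that no live pod owns."""
    live = set(live_pod_ips)
    stale: Dict[str, Set[str]] = {}
    for line in save_output.splitlines():
        match = DNAT_RULE.match(line.strip())
        if match and match.group(2) not in live:
            stale.setdefault(match.group(1), set()).add(match.group(2))
    return stale
```

With the rules shown above and no pod owning 10.233.124.50, the chain CNI-DN-bcfdb00a9541b4df781a0 would be reported with that IP.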
Does anyone know if this has been fixed in upstream portmap or here?
@caseydavenport can you please explain: is this issue closed because it's fixed in some version (which?) or just because it's inactive?
I saw this issue happening in several environments (3 different customers + internal labs) in the last 12 months, including one customer with Calico Enterprise with Tigera support.
I ran into this, too. iptables shows duplicate rules for hostPort. Even after deleting the container hostPorts, the redundant rule still exists.
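Duplicate hostPort rules like these can be spotted by grouping the jump rules in CNI-HOSTPORT-DNAT by port. A minimal Python sketch, assuming `iptables-save -t nat` text as input and the rule layout shown in the Felix log earlier in this thread (`-m multiport --dports ... -j CNI-DN-...`). The helper name `duplicated_host_ports` is hypothetical, and the protocol (tcp/udp) is ignored for brevity.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Matches e.g.
# -A CNI-HOSTPORT-DNAT -p tcp ... -m multiport --dports 10081,8443 -j CNI-DN-bcfd...
JUMP_RULE = re.compile(
    r"^-A CNI-HOSTPORT-DNAT .*--dports (\S+) -j (CNI-DN-\S+)$"
)


def duplicated_host_ports(save_output: str) -> Dict[int, List[str]]:
    """Return host ports that more than one CNI-DN-* chain claims."""
    chains_by_port = defaultdict(list)
    for line in save_output.splitlines():
        match = JUMP_RULE.match(line.strip())
        if not match:
            continue
        for port in match.group(1).split(","):
            if port.isdigit():  # skip port ranges such as 1000:2000
                chains_by_port[int(port)].append(match.group(2))
    return {
        port: chains
        for port, chains in chains_by_port.items()
        if len(chains) > 1
    }
```

Any port in the result has at least one jump rule too many; which chain is the stale one still has to be decided by checking the DNAT target against running pods.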
I use a nodePort definition for an nginx deployment. When the deployment is deleted, the related iptables rules are not cleared.
If I create the deployment again, iptables contains two related rules and traffic forwarding is broken.
Expected Behavior
iptables rules are cleared when a deployment with a hostPort definition is deleted.
Current Behavior
iptables rules are still present after the deployment is deleted.
Possible Solution
I do not know.
Steps to Reproduce (for bugs)
Context
I am not able to publish nginx on my dev server using ports 80 and 443.
Your Environment
[root@safarov-server wordpress]# /opt/cni/bin/calico -v
v3.13.2