[WIP][TEST] debug kube-proxy iptables lock #92811
Conversation
@aojea: Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/test pull-kubernetes-e2e-kind-ipv6
ok, one interesting failure here
The problem is that kube-proxy does not install the nodeport rules; instead it is rejecting the traffic to the nodeport.
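As a hedged sketch of how one can check this on a node: kube-proxy programs nodeports in the KUBE-NODEPORTS chain of the nat table, and adds REJECT rules for services without endpoints in the filter table (KUBE-SERVICES / KUBE-EXTERNAL-SERVICES, depending on the version); the node port 30080 below is illustrative:
# Look for the DNAT rule for the nodeport and for any REJECT rule hitting it.
iptables -t nat -S KUBE-NODEPORTS | grep 30080
iptables -t filter -S KUBE-EXTERNAL-SERVICES | grep -i reject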
@BenTheElder FYI, I think that in kind we are being too aggressive with iptables and this is causing contention when there is a large number of changes; that may explain the iptables lock issue. Diff between the kube-proxy GCE config
and the KIND config
/retest
/test pull-kubernetes-e2e-kind-ipv6
/test pull-kubernetes-e2e-kind-ipv6
Currently kube-proxy defaults the min-sync-period for iptables to 0. However, as explained by Dan Winship, "With minSyncPeriod: 0, you run iptables-restore 100 times. With minSyncPeriod: 1s, you run iptables-restore once. With minSyncPeriod: 10s, you also run iptables-restore once, but you might have to wait 10 seconds first".
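A hedged sketch of what changing that looks like on a kubeadm-based cluster such as kind; the ConfigMap name, the k8s-app=kube-proxy label, and the 1s value are assumptions, not something prescribed in this thread:
# Edit the KubeProxyConfiguration stored in the kube-proxy ConfigMap and set,
# under the iptables section:
#   iptables:
#     minSyncPeriod: 1s
kubectl -n kube-system edit configmap kube-proxy
# Restart the kube-proxy pods so they pick up the new value.
kubectl -n kube-system delete pod -l k8s-app=kube-proxy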
The iptables monitor was using iptables -L to list the chains, without the -n option, so it was trying to do reverse DNS lookups. A side effect is that it was holding the lock, so other components could not use it. We can use -S instead of -L -n to avoid this, since we only want to check that the chain exists. Also, avoid holding the lock in this operation, because we can fail fast and retry later.
root@kind-control-plane:/# time iptables -t nat -L > /dev/null
real	0m5.154s
user	0m0.004s
sys	0m0.005s
root@kind-control-plane:/# time iptables -t nat -L -n > /dev/null
real	0m0.002s
user	0m0.000s
sys	0m0.002s
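A minimal sketch of the lock-friendlier check described above; the canary chain name is an assumption used only for illustration:
# -S prints rules in iptables-save syntax and never resolves addresses, so it is
# enough to verify that a chain still exists. Without -w the command fails fast
# when another process holds the xtables lock, and the monitor can retry on its
# next tick instead of blocking everyone else.
if iptables -t nat -S KUBE-PROXY-CANARY >/dev/null 2>&1; then
    echo "canary chain present"
else
    echo "chain missing or lock busy, retry later"
fi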
/test pull-kubernetes-e2e-kind-ipv6
/test pull-kubernetes-e2e-kind
/test pull-kubernetes-e2e-kind
/test pull-kubernetes-e2e-kind-ipv6
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: aojea The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
@aojea: The following tests failed, say /retest to rerun all failed tests:
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
@BenTheElder and the winner holding the lock is .... the portmap plugin
@aojea: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
More findings about this: it seems that when a pod is deleted, the portmap plugin is executed 3 times, despite the pod not having any port mappings configured. If we add to that the fact that it executes both iptables and ip6tables, it means we hold the lock 6 times. How to reproduce:
docker exec -it kind-control-plane cat /opt/cni/bin/portmap
#!/bin/sh
# Wrapper around the real portmap binary (renamed to portmap.bin) that logs
# every invocation: the time, the CNI command and identifiers passed in the
# environment, and the JSON configuration received on stdin.
echo $(date +"%T") >> /tmp/debug_portmap
echo "CNI: $CNI_COMMAND $CNI_IFNAME $CNI_NETNS $CNI_CONTAINERID" >> /tmp/debug_portmap
# Copy stdin to the debug file while still feeding it to the real plugin.
tee -a /tmp/debug_portmap | /opt/cni/bin/portmap.bin
echo "\n---------------" >> /tmp/debug_portmap
If we configure a hostPort, we can observe that it is executed one more time, with a delay (a minimal example pod is sketched below).
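A hedged sketch of such a pod; the image and port values are assumptions, not taken from this thread:
# A pod with a hostPort, so the portmap plugin has a real mapping to program.
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: hostport-test
spec:
  containers:
  - name: agnhost
    image: k8s.gcr.io/e2e-test-images/agnhost:2.21
    ports:
    - containerPort: 8080
      hostPort: 8080
EOF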
so, it seems containerd tears down the network multiple times
full log here if we log the number of iptables calls issued using a tool like
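For illustration only (this is not the tool referenced above), one crude way to count iptables invocations on a node is to wrap the binary, much like the portmap wrapper; the paths are assumptions:
#!/bin/sh
# Hypothetical wrapper placed at /usr/sbin/iptables after moving the real
# binary to /usr/sbin/iptables.real: log one line per call, then pass through.
echo "$(date +%T) iptables $*" >> /tmp/iptables_calls
exec /usr/sbin/iptables.real "$@"
Counting the calls is then just wc -l /tmp/iptables_calls.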
Great piece of detective work, Antonio! We've now merged your PR into the portmap plugin. Can I ask: are there any issues filed to reduce the number of times Kubelet (or containerd) calls DEL on the same pod?
Thanks, we rolled out the change in the CI with KIND yesterday and the random network failures stopped, so it seems it is working ok
I opened one some time ago: containerd/cri#1352
That CRI issue seems focused on whether the calls should be concurrent, whereas my point is that they shouldn't need more than one, for a given container.
yeah, sorry,
oops, sorry, none that I'm aware of. I attached some kubelet logs here https://pastebin.com/9d83jMvj that show that
it was very useful 😄