Flannel / iptables: resource temporarily unavailable #88148

Open
ofrivera opened this issue Feb 14, 2020 · 2 comments

Comments

@ofrivera ofrivera commented Feb 14, 2020

Networking between (some) pods stops working: I can ping some containers from one of the hosts, but not others. Restarting flanneld on the subset of nodes that are not responding sometimes helps. As we looked closer at this issue we started noticing iptables errors affecting flanneld, kube-proxy and kubelet.

Sometimes the issue corrects itself on its own or after a few hours, but in the worst case nodes can suddenly die and stop responding, causing all kinds of problems in the cluster; side effects include dropped connections and increased latency.

What happened:
flanneld, kube-proxy and kubelet log errors such as:

iptables: resource temporarily unavailable
ipset: resource temporarily unavailable
runc: resource temporarily unavailable: : unknown
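
These "resource temporarily unavailable" messages are the EAGAIN error returned by fork/clone, which usually means a limit on the number of processes or threads was hit on the node (kernel.pid_max, kernel.threads-max, the user's RLIMIT_NPROC, or a systemd TasksMax / cgroup pids limit on the affected service), rather than a problem inside iptables itself. A minimal way to check the relevant limits on an affected node while the errors are occurring (a diagnostic sketch only; the unit names are assumptions for this setup):

$ sysctl kernel.pid_max kernel.threads-max
$ ulimit -u                           # RLIMIT_NPROC for the current user
$ ps -eLf | wc -l                     # total threads currently running on the node
$ systemctl show kubelet -p TasksMax  # per-unit task limit (only on systemd versions that support TasksMax)
$ systemctl show docker -p TasksMax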

flanneld:

Feb 13 03:44:30 k8-wn-01-lf.net flanneld[2278]: E0213 03:44:30.343178    2278 iptables.go:115] Failed to ensure iptables rules: Error checking rule existence: failed to check rule existence: fork/exec /usr/sbin/iptables: resource temporarily unavailable
Feb 13 03:44:31 k8-wn-01-lf.net flanneld[2278]: E0213 03:44:31.052376    2278 iptables.go:115] Failed to ensure iptables rules: Error checking rule existence: failed to check rule existence: fork/exec /usr/sbin/iptables: resource temporarily unavailable
Feb 13 04:10:22 k8-wn-01-lf.net flanneld[2278]: E0213 04:10:22.521416    2278 iptables.go:115] Failed to ensure iptables rules: Error checking rule existence: failed to check rule existence: fork/exec /usr/sbin/iptables: resource temporarily unavailable
Feb 13 04:10:23 k8-wn-01-lf.net flanneld[2278]: E0213 04:10:23.093853    2278 iptables.go:115] Failed to ensure iptables rules: Error checking rule existence: failed to check rule existence: fork/exec /usr/sbin/iptables: resource temporarily unavailable
Feb 13 04:10:27 k8-wn-01-lf.net flanneld[2278]: E0213 04:10:27.521844    2278 iptables.go:115] Failed to ensure iptables rules: Error checking rule existence: failed to check rule existence: fork/exec /usr/sbin/iptables: resource temporarily unavailable

kube-proxy:

Feb 13 04:10:25 k8-wn-01-lf.net kube-proxy[1911]: E0213 04:10:25.748082    1911 proxier.go:1493] Failed to execute iptables-save, syncing all rules: fork/exec /usr/sbin/iptables-save: resource temporarily unavailable
Feb 13 04:10:25 k8-wn-01-lf.net kube-proxy[1911]: E0213 04:10:25.748267    1911 proxier.go:1493] Failed to execute iptables-save, syncing all rules: fork/exec /usr/sbin/iptables-save: resource temporarily unavailable
Feb 13 04:10:25 k8-wn-01-lf.net kube-proxy[1911]: E0213 04:10:25.748456    1911 proxier.go:1442] Failed to ensure that nat chain KUBE-SERVICES exists: error creating chain "KUBE-SERVICES": fork/exec /usr/sbin/iptables: resource temporarily unavailable:
Feb 13 04:10:25 k8-wn-01-lf.net kube-proxy[1911]: E0213 04:10:25.748712    1911 ipset.go:162] Failed to make sure ip set: &{{KUBE-LOAD-BALANCER-SOURCE-CIDR hash:ip,port,net inet 1024 65536 0-65535 Kubernetes service load balancer ip + port + source cidr for packet filter purpose} map[] 0xc0001e3f50} exist, error: error creating ipset KUBE-LOAD-BALANCER-SOURCE-CIDR, error: fork/exec /usr/sbin/ipset: resource temporarily unavailable
Feb 13 04:10:35 k8-wn-01-lf.net kube-proxy[1911]: E0213 04:10:35.959750    1911 proxier.go:1493] Failed to execute iptables-save, syncing all rules: fork/exec /usr/sbin/iptables-save: resource temporarily unavailable
Feb 13 04:10:35 k8-wn-01-lf.net kube-proxy[1911]: E0213 04:10:35.960059    1911 proxier.go:1493] Failed to execute iptables-save, syncing all rules: fork/exec /usr/sbin/iptables-save: resource temporarily unavailable
Feb 13 04:10:35 k8-wn-01-lf.net kube-proxy[1911]: E0213 04:10:35.960365    1911 proxier.go:1442] Failed to ensure that nat chain KUBE-SERVICES exists: error creating chain "KUBE-SERVICES": fork/exec /usr/sbin/iptables: resource temporarily unavailable:
Feb 13 04:11:14 k8-wn-01-lf.net kube-proxy[1911]: E0213 04:11:14.028881    1911 ipset.go:162] Failed to make sure ip set: &{{KUBE-LOOP-BACK hash:ip,port,ip inet 1024 65536 0-65535 Kubernetes endpoints dst ip:port, source ip for solving hairpin purpose} map[] 0xc0001e3f50} exist, error: error creating ipset KUBE-LOOP-BACK, error: fork/exec /usr/sbin/ipset: resource temporarily unavailable
Feb 13 04:14:18 k8-wn-01-lf.net kube-proxy[1911]: E0213 04:14:18.732105    1911 proxier.go:1493] Failed to execute iptables-save, syncing all rules: fork/exec /usr/sbin/iptables-save: resource temporarily unavailable
Feb 13 04:14:18 k8-wn-01-lf.net kube-proxy[1911]: E0213 04:14:18.732287    1911 proxier.go:1493] Failed to execute iptables-save, syncing all rules: fork/exec /usr/sbin/iptables-save: resource temporarily unavailable
Feb 13 04:14:18 k8-wn-01-lf.net kube-proxy[1911]: E0213 04:14:18.732441    1911 proxier.go:1442] Failed to ensure that nat chain KUBE-SERVICES exists: error creating chain "KUBE-SERVICES": fork/exec /usr/sbin/iptables: resource temporarily unavailable:
Feb 13 04:21:36 k8-wn-01-lf.net kube-proxy[1911]: E0213 04:21:36.688985    1911 ipset.go:162] Failed to make sure ip set: &{{KUBE-CLUSTER-IP hash:ip,port inet 1024 65536 0-65535 Kubernetes service cluster ip + port for masquerade purpose} map[10.3.0.1,tcp:443:{} 10.3.0.10,tcp:53:{} 10.3.0.10,tcp:9153:{} 10.3.0.10,udp:53:{} 10.3.1.29,tcp:8080:{} 10.3.1.84,tcp:5015:{} 10.3.10.129,tcp:9091:{} 10.3.100.203,tcp:5001:{} 10.3.101.211,tcp:5015:{} 10.3.102.115,tcp:8080:{} 10.3.102.51,tcp:8080:{} 10.3.104.18,tcp:80:{} 10.3.104.185,tcp:8080:{} 10.3.104.252,tcp:8080:{} 10.3.105.99,tcp:8888:{} 10.3.106.18,tcp:5015:{} 10.3.107.170,tcp:80:{} 10.3.11.65,tcp:8888:{} 10.3.111.163,tcp:80:{} 10.3.112.154,tcp:80:{} 10.3.119.156,tcp:5901:{} 10.3.12.93,tcp:10902:{} 10.3.12.93,tcp:9090:{} 10.3.120.21,tcp:6379:{} 10.3.121.200,tcp:5015:{} 10.3.122.178,tcp:8080:{} 10.3.122.221,tcp:80:{} 10.3.124.14,tcp:5010:{} 10.3.126.145,tcp:9090:{} 10.3.127.172,tcp:5009:{} 10.3.127.193,tcp:80:{} 10.3.128.51,tcp:8080:{} 10.3.129.158,tcp:8080:{} 10.3.130.16,tcp:26379:{} 10.3.132.142,tcp:26379:{} 10.3.132.186,tcp:8080:{} 10.3.132.63,tcp:8082:{} 10.3.133.236,tcp:8080:{} 10.3.134.201,tcp:80:{} 10.3.136.83,tcp:5001:{} 10.3.137.194,tcp:8080:{} 10.3.138.29,tcp:5015:{} 10.3.141.150,tcp:80:{} 10.3.145.250,tcp:80:{} 10.3.149.140,tcp:80:{} 10.3.149.166,tcp:8080:{} 10.3.149.206,tcp:8081:{} 10.3.149.206,tcp:8083:{} 10.3.149.206,tcp:8084:{} 10.3.149.68,tcp:8080:{} 10.3.151.210,tcp:8080:{} 10.3.152.237,tcp:80:{} 10.3.152.238,tcp:5015:{} 10.3.154.174,tcp:5901:{} 10.3.156.221,tcp:5901:{} 10.3.157.226,tcp:80:{} 10.3.158.124,tcp:80:{} 10.3.158.190,tcp:8080:{} 10.3.158.38,tcp:10902:{} 10.3.158.38,tcp:9090:{} 10.3.159.48,tcp:8080:{} 10.3.163.117,tcp:8080:{} 10.3.163.142,tcp:6379:{} 10.3.163.63,tcp:8080:{} 10.3.164.34,tcp:8080:{} 10.3.164.99,tcp:80:{} 10.3.166.46,tcp:8080:{} 10.3.167.13,tcp:8080:{} 10.3.171.228,tcp:5015:{} 10.3.171.33,tcp:80:{} 10.3.172.127,tcp:5901:{} 10.3.172.146,tcp:443:{} 10.3.175.241,tcp:5901:{} 10.3.177.209,tcp:3000:{} 10.3.177.220,tcp:5901:{} 10.3.179.104,tcp:8080:{} 10.3.181.7,tcp:5006:{} 10.3.183.1,tcp:8080:{} 10.3.185.61,tcp:8081:{} 10.3.185.70,tcp:8080:{} 10.3.186.34,tcp:5901:{} 10.3.187.6,tcp:8080:{} 10.3.188.192,tcp:5015:{} 10.3.188.204,tcp:5015:{} 10.3.19.17,tcp:5015:{} 10.3.19.68,tcp:5015:{} 10.3.192.145,tcp:8081:{} 10.3.194.123,tcp:5901:{} 10.3.197.168,tcp:5015:{} 10.3.198.244,tcp:443:{} 10.3.198.244,tcp:80:{} 10.3.199.82,tcp:8888:{} 10.3.2.12,tcp:5901:{} 10.3.2.183,tcp:80:{} 10.3.2.55,tcp:443:{} 10.3.2.55,tcp:80:{} 10.3.200.254,tcp:8000:{} 10.3.201.38,tcp:5001:{} 10.3.203.15,tcp:8080:{} 10.3.204.82,tcp:5015:{} 10.3.205.170,tcp:5006:{} 10.3.206.122,tcp:5902:{} 10.3.207.136,tcp:5015:{} 10.3.207.205,tcp:80:{} 10.3.207.238,tcp:8080:{} 10.3.209.122,tcp:5003:{} 10.3.209.222,tcp:5010:{} 10.3.209.64,tcp:8080:{} 10.3.21.211,tcp:5016:{} 10.3.210.12,tcp:80:{} 10.3.215.241,tcp:5901:{} 10.3.217.180,tcp:80:{} 10.3.219.126,tcp:10902:{} 10.3.219.126,tcp:9090:{} 10.3.219.161,tcp:8080:{} 10.3.220.58,tcp:5902:{} 10.3.223.8,tcp:5902:{} 10.3.224.83,tcp:8080:{} 10.3.23.164,tcp:8080:{} 10.3.23.207,tcp:80:{} 10.3.230.178,tcp:5001:{} 10.3.236.247,tcp:8080:{} 10.3.237.68,tcp:8080:{} 10.3.24.71,tcp:443:{} 10.3.242.152,tcp:80:{} 10.3.246.223,tcp:8080:{} 10.3.248.43,tcp:8080:{} 10.3.248.58,tcp:80:{} 10.3.252.209,tcp:5003:{} 10.3.253.37,tcp:80:{} 10.3.26.87,tcp:26379:{} 10.3.27.240,tcp:80:{} 10.3.3.207,tcp:8080:{} 10.3.30.246,tcp:8080:{} 10.3.30.40,tcp:80:{} 10.3.32.105,tcp:5901:{} 10.3.32.209,tcp:5000:{} 10.3.34.168,tcp:8080:{} 
10.3.35.173,tcp:5016:{} 10.3.35.48,tcp:5901:{} 10.3.39.100,tcp:80:{} 10.3.4.132,tcp:80:{} 10.3.4.208,tcp:80:{} 10.3.41.241,tcp:8082:{} 10.3.41.77,tcp:8080:{} 10.3.45.237,tcp:8080:{} 10.3.46.120,tcp:8080:{} 10.3.48.220,tcp:80:{} 10.3.5.159,tcp:5015:{} 10.3.5.222,tcp:5901:{} 10.3.5.92,tcp:5001:{} 10.3.51.242,tcp:8080:{} 10.3.51.58,tcp:80:{} 10.3.52.184,tcp:8080:{} 10.3.53.74,tcp:80:{} 10.3.55.210,tcp:80:{} 10.3.55.27,tcp:5015:{} 10.3.55.77,tcp:80:{} 10.3.57.184,tcp:8080:{} 10.3.58.204,tcp:5902:{} 10.3.6.144,tcp:6379:{} 10.3.60.193,tcp:8080:{} 10.3.61.152,tcp:5015:{} 10.3.61.162,tcp:5015:{} 10.3.61.4,tcp:5901:{} 10.3.62.135,tcp:8082:{} 10.3.62.173,tcp:9090:{} 10.3.62.39,tcp:5000:{} 10.3.62.7,tcp:5004:{} 10.3.66.143,tcp:8080:{} 10.3.66.194,tcp:80:{} 10.3.69.249,tcp:5015:{} 10.3.7.179,tcp:443:{} 10.3.70.188,tcp:8080:{} 10.3.72.175,tcp:80:{} 10.3.72.246,tcp:5901:{} 10.3.75.171,tcp:8080:{} 10.3.78.64,tcp:80:{} 10.3.79.162,tcp:8080:{} 10.3.79.8,tcp:8080:{} 10.3.82.192,tcp:80:{} 10.3.82.32,tcp:5015:{} 10.3.88.67,tcp:80:{} 10.3.9.162,tcp:80:{} 10.3.92.176,tcp:80:{} 10.3.93.239,tcp:8082:{} 10.3.96.94,tcp:8080:{} 10.3.97.137,tcp:8080:{} 10.3.99.156,tcp:8080:{} 10.3.99.7,tcp:80:{}] 0xc0001e3f50} exist, error: error creating ipset KUBE-CLUSTER-IP, error: fork/exec /usr/sbin/ipset: resource temporarily unavailable
Feb 13 04:23:53 k8-wn-01-lf.net kube-proxy[1911]: E0213 04:23:53.666233    1911 ipset.go:128] Failed to list ip set entries, error: error listing set: KUBE-NODE-PORT-SCTP, error: fork/exec /usr/sbin/ipset: resource temporarily unavailable
Feb 13 04:23:53 k8-wn-01-lf.net kube-proxy[1911]: E0213 04:23:53.666307    1911 ipset.go:128] Failed to list ip set entries, error: error listing set: KUBE-LOAD-BALANCER-SOURCE-CIDR, error: fork/exec /usr/sbin/ipset: resource temporarily unavailable
Feb 13 04:23:53 k8-wn-01-lf.net kube-proxy[1911]: E0213 04:23:53.666377    1911 ipset.go:128] Failed to list ip set entries, error: error listing set: KUBE-LOAD-BALANCER, error: fork/exec /usr/sbin/ipset: resource temporarily unavailable
Feb 13 04:23:53 k8-wn-01-lf.net kube-proxy[1911]: E0213 04:23:53.666448    1911 ipset.go:128] Failed to list ip set entries, error: error listing set: KUBE-LOAD-BALANCER-FW, error: fork/exec /usr/sbin/ipset: resource temporarily unavailable
Feb 13 04:23:53 k8-wn-01-lf.net kube-proxy[1911]: E0213 04:23:53.666530    1911 ipset.go:128] Failed to list ip set entries, error: error listing set: KUBE-LOAD-BALANCER-LOCAL, error: fork/exec /usr/sbin/ipset: resource temporarily unavailable
Feb 13 04:23:53 k8-wn-01-lf.net kube-proxy[1911]: E0213 04:23:53.666601    1911 ipset.go:128] Failed to list ip set entries, error: error listing set: KUBE-LOAD-BALANCER-SOURCE-IP, error: fork/exec /usr/sbin/ipset: resource temporarily unavailable
Feb 13 04:23:53 k8-wn-01-lf.net kube-proxy[1911]: E0213 04:23:53.666755    1911 ipset.go:128] Failed to list ip set entries, error: error listing set: KUBE-CLUSTER-IP, error: fork/exec /usr/sbin/ipset: resource temporarily unavailable
Feb 13 04:23:53 k8-wn-01-lf.net kube-proxy[1911]: E0213 04:23:53.666827    1911 ipset.go:128] Failed to list ip set entries, error: error listing set: KUBE-NODE-PORT-TCP, error: fork/exec /usr/sbin/ipset: resource temporarily unavailable
Feb 13 04:23:53 k8-wn-01-lf.net kube-proxy[1911]: E0213 04:23:53.666897    1911 ipset.go:128] Failed to list ip set entries, error: error listing set: KUBE-NODE-PORT-LOCAL-TCP, error: fork/exec /usr/sbin/ipset: resource temporarily unavailable
Feb 13 04:23:53 k8-wn-01-lf.net kube-proxy[1911]: E0213 04:23:53.666969    1911 ipset.go:128] Failed to list ip set entries, error: error listing set: KUBE-NODE-PORT-LOCAL-SCTP, error: fork/exec /usr/sbin/ipset: resource temporarily unavailable
Feb 13 04:23:53 k8-wn-01-lf.net kube-proxy[1911]: E0213 04:23:53.667041    1911 ipset.go:128] Failed to list ip set entries, error: error listing set: KUBE-EXTERNAL-IP, error: fork/exec /usr/sbin/ipset: resource temporarily unavailable
Feb 13 04:23:53 k8-wn-01-lf.net kube-proxy[1911]: E0213 04:23:53.667122    1911 ipset.go:128] Failed to list ip set entries, error: error listing set: KUBE-NODE-PORT-LOCAL-UDP, error: fork/exec /usr/sbin/ipset: resource temporarily unavailable
Feb 13 04:23:53 k8-wn-01-lf.net kube-proxy[1911]: E0213 04:23:53.667194    1911 ipset.go:128] Failed to list ip set entries, error: error listing set: KUBE-LOOP-BACK, error: fork/exec /usr/sbin/ipset: resource temporarily unavailable

kubelet:

Feb 13 04:21:17 k8-wn-01-lf.net kubelet[3906]: E0213 04:21:17.347533    3906 kubelet_network_linux.go:53] Failed to ensure that nat chain KUBE-MARK-DROP exists: error creating chain "KUBE-MARK-DROP": fork/exec /usr/sbin/iptables: resource temporarily unavailable:
Feb 13 04:24:17 k8-wn-01-lf.net kubelet[3906]: E0213 04:24:17.389055    3906 kubelet_network_linux.go:53] Failed to ensure that nat chain KUBE-MARK-DROP exists: error creating chain "KUBE-MARK-DROP": fork/exec /usr/sbin/iptables: resource temporarily unavailable:
Feb 13 04:25:17 k8-wn-01-lf.net kubelet[3906]: E0213 04:25:17.389471    3906 kubelet_network_linux.go:53] Failed to ensure that nat chain KUBE-MARK-DROP exists: error creating chain "KUBE-MARK-DROP": fork/exec /usr/sbin/iptables: resource temporarily unavailable:
Feb 13 04:26:02 k8-wn-01-lf.net kubelet[3906]: E0213 04:26:02.576829    3906 mount_linux.go:143] Mount failed: fork/exec /usr/bin/systemd-run: resource temporarily unavailable
Feb 13 04:26:02 k8-wn-01-lf.net kubelet[3906]: E0213 04:26:02.576937    3906 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/secret/ff702226-4e5b-11ea-83a0-42e50a0a0350-default-token-gg2hf\" (\"ff702226-4e5b-11ea-83a0-42e50a0a0350\")" failed. No retries permitted until 2020-02-13 04:26:03.076892337 -0800 PST m=+542776.196923036 (durationBeforeRetry 500ms). Error: "MountVolume.SetUp failed for volume \"default-token-gg2hf\" (UniqueName: \"kubernetes.io/secret/ff702226-4e5b-11ea-83a0-42e50a0a0350-default-token-gg2hf\") pod \"collections-5cc4dcccd5-nf92g\" (UID: \"ff702226-4e5b-11ea-83a0-42e50a0a0350\") : mount failed: fork/exec /usr/bin/systemd-run: resource temporarily unavailable\nMounting command: systemd-run\nMounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/ff702226-4e5b-11ea-83a0-42e50a0a0350/volumes/kubernetes.io~secret/default-token-gg2hf --scope -- mount -t tmpfs tmpfs /var/lib/kubelet/pods/ff702226-4e5b-11ea-83a0-42e50a0a0350/volumes/kubernetes.io~secret/default-token-gg2hf\nOutput: \n"
Feb 13 04:26:03 k8-wn-01-lf.net kubelet[3906]: E0213 04:26:03.078274    3906 mount_linux.go:143] Mount failed: fork/exec /usr/bin/systemd-run: resource temporarily unavailable
Feb 13 04:26:03 k8-wn-01-lf.net kubelet[3906]: E0213 04:26:03.078380    3906 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/secret/ff702226-4e5b-11ea-83a0-42e50a0a0350-default-token-gg2hf\" (\"ff702226-4e5b-11ea-83a0-42e50a0a0350\")" failed. No retries permitted until 2020-02-13 04:26:04.078336563 -0800 PST m=+542777.198367277 (durationBeforeRetry 1s). Error: "MountVolume.SetUp failed for volume \"default-token-gg2hf\" (UniqueName: \"kubernetes.io/secret/ff702226-4e5b-11ea-83a0-42e50a0a0350-default-token-gg2hf\") pod \"collections-5cc4dcccd5-nf92g\" (UID: \"ff702226-4e5b-11ea-83a0-42e50a0a0350\") : mount failed: fork/exec /usr/bin/systemd-run: resource temporarily unavailable\nMounting command: systemd-run\nMounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/ff702226-4e5b-11ea-83a0-42e50a0a0350/volumes/kubernetes.io~secret/default-token-gg2hf --scope -- mount -t tmpfs tmpfs /var/lib/kubelet/pods/ff702226-4e5b-11ea-83a0-42e50a0a0350/volumes/kubernetes.io~secret/default-token-gg2hf\nOutput: \n"
Feb 13 04:26:04 k8-wn-01-lf.net kubelet[3906]: E0213 04:26:04.083060    3906 mount_linux.go:143] Mount failed: fork/exec /usr/bin/systemd-run: resource temporarily unavailable
Feb 13 04:26:32 k8-wn-01-lf.net kubelet[3906]: E0213 04:26:32.136914    3906 kuberuntime_container.go:585] Container "docker://501f95674beb081dcf38d506fa3dc8f323dd576c26cdac7ac1f7f9f465545f74" termination failed with gracePeriod 30: rpc error: code = Unknown desc = Error response from daemon: cannot stop container: 501f95674beb081dcf38d506fa3dc8f323dd576c26cdac7ac1f7f9f465545f74: Cannot kill container 501f95674beb081dcf38d506fa3dc8f323dd576c26cdac7ac1f7f9f465545f74: unknown error after kill: fork/exec /usr/sbin/runc: resource temporarily unavailable: : unknown
Feb 13 04:26:32 k8-wn-01-lf.net kubelet[3906]: E0213 04:26:32.137030    3906 remote_runtime.go:250] StopContainer "ce6ad1fb3bbcf80383ed5a75b147a1b560b02b22231fb26dd4a40e062d464e48" from runtime service failed: rpc error: code = Unknown desc = Error response from daemon: cannot stop container: ce6ad1fb3bbcf80383ed5a75b147a1b560b02b22231fb26dd4a40e062d464e48: Cannot kill container ce6ad1fb3bbcf80383ed5a75b147a1b560b02b22231fb26dd4a40e062d464e48: unknown error after kill: fork/exec /usr/sbin/runc: resource temporarily unavailable: : unknown
Feb 13 04:26:32 k8-wn-01-lf.net kubelet[3906]: E0213 04:26:32.137049    3906 kuberuntime_container.go:585] Container "docker://ce6ad1fb3bbcf80383ed5a75b147a1b560b02b22231fb26dd4a40e062d464e48" termination failed with gracePeriod 30: rpc error: code = Unknown desc = Error response from daemon: cannot stop container: ce6ad1fb3bbcf80383ed5a75b147a1b560b02b22231fb26dd4a40e062d464e48: Cannot kill container ce6ad1fb3bbcf80383ed5a75b147a1b560b02b22231fb26dd4a40e062d464e48: unknown error after kill: fork/exec /usr/sbin/runc: resource temporarily unavailable: : unknown

kernel/docker:

Feb 13 04:26:29 k8-wn-01-lf.net kernel: docker0: port 23(vethac0f51c) entered forwarding state
Feb 13 04:26:29 k8-wn-01-lf.net kernel: docker0: port 23(vethac0f51c) entered forwarding state
Feb 13 04:26:30 k8-wn-01-lf.net kernel: docker0: port 23(veth3b535d6) entered forwarding state
Feb 13 04:26:30 k8-wn-01-lf.net kernel: docker0: port 23(veth3b535d6) entered forwarding state
Feb 13 04:26:31 k8-wn-01-lf.net kernel: docker0: port 23(veth011ff61) entered forwarding state
Feb 13 04:26:31 k8-wn-01-lf.net kernel: docker0: port 23(veth011ff61) entered forwarding state
Feb 13 04:26:32 k8-wn-01-lf.net kernel: docker0: port 23(veth14301c0) entered forwarding state
Feb 13 04:26:32 k8-wn-01-lf.net kernel: docker0: port 23(veth14301c0) entered forwarding state
Feb 13 04:26:33 k8-wn-01-lf.net kernel: docker0: port 23(veth645472a) entered forwarding state
Feb 13 04:26:33 k8-wn-01-lf.net kernel: docker0: port 23(veth645472a) entered forwarding state
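
Since the failures are intermittent, it may also help to sample the node's process and thread counts over time and correlate any spikes with the journal timestamps above; a rough sketch (the log path and interval are arbitrary choices):

$ while true; do echo "$(date -Is) threads=$(ps -eLf | wc -l) procs=$(ps -e | wc -l)"; sleep 10; done >> /var/tmp/proc-count.log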

What you expected to happen:
Flannel manages to (re)apply missing iptables rules without errors.
iptables is always available, or it is clear which Kubernetes component is causing it to be unavailable.

How to reproduce it (as minimally and precisely as possible):

  1. Install a Kubernetes cluster, version 1.14.6
  2. Configuration:
$ cat /etc/sysctl.conf
# sysctl settings are defined through files in
# /usr/lib/sysctl.d/, /run/sysctl.d/, and /etc/sysctl.d/.
#
# Vendors settings live in /usr/lib/sysctl.d/.
# To override a whole file, create a new file with the same name in
# /etc/sysctl.d/ and put new settings there. To override
# only specific settings, add a file with a lexically later
# name in /etc/sysctl.d/ and put new settings there.
#
# For more information, see sysctl.conf(5) and sysctl.d(5).
# Force gc to clean-up quickly
net.ipv4.neigh.default.gc_interval = 3600

# Set ARP cache entry timeout
net.ipv4.neigh.default.gc_stale_time = 3600

# Setup DNS threshold for arp 
net.ipv4.neigh.default.gc_thresh3 = 4096
net.ipv4.neigh.default.gc_thresh2 = 2048
net.ipv4.neigh.default.gc_thresh1 = 1024

#Other
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
vm.swappiness = 0
vm.overcommit_memory = 1
fs.inotify.max_user_instances = 8192
fs.inotify.max_user_watches = 1048576
fs.file-max = 52706963
fs.nr_open = 52706963
net.netfilter.nf_conntrack_max = 2310720
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv4.tcp_tw_recycle = 0
$ sudo cat /etc/kubernetes/kube-proxy-config.yaml
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
clientConnection:
  kubeconfig: "/etc/kubernetes/kube-proxy.kubeconfig"
mode: "ipvs"
iptables:
  masqueradeAll: true
ipvs:
  scheduler: "lc"
clusterCIDR: "10.200.0.0/16"
  3. Put some workload in the cluster
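
To double-check that a node in this configuration is really running kube-proxy in IPVS mode and that ipset/iptables respond normally when invoked outside of kube-proxy, the following can be run manually (assuming ipvsadm and ipset are installed on the node):

$ sudo ipvsadm -Ln | head
$ sudo ipset list | head
$ sudo iptables-save | wc -l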

Anything else we need to know?:
We have seen this behavior on different clusters (10+ nodes, 100+ nodes, 200+ nodes), all with the same configuration.
Sometimes it starts on its own, sometimes after we add new nodes to the cluster.
We have seen this problem on 1.13.4 as well.
It happens on all our nodes occasionally.

Environment:

  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.4", GitCommit:"c27b913fddd1a6c480c229191a087698aa92f0b1", GitTreeState:"clean", BuildDate:"2019-02-28T13:37:52Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.6", GitCommit:"96fac5cd13a5dc064f7d9f4f23030a6aeface6cc", GitTreeState:"clean", BuildDate:"2019-08-19T11:05:16Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration:
    On-premises
  • OS (e.g: cat /etc/os-release):
$ cat /etc/redhat-release
CentOS Linux release 7.7.1908 (Core)
  • Kernel (e.g. uname -a):
Linux k8-wn-01-lf.net 4.4.183-1.el7.elrepo.x86_64 #1 SMP Sat Jun 22 07:45:20 EDT 2019 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools: Ansible/the hard way
  • Network plugin and version (if this is a network-related bug):
flanneld --version
v0.11.0
  • Others:
iptables v1.4.21

Docker version 18.09.6, build 481bc77156

etcd Version: 3.3.7
Git SHA: 56536de55
Go Version: go1.9.6
Go OS/Arch: linux/amd64
@ofrivera ofrivera (author) commented Feb 14, 2020

/sig network

@k8s-ci-robot k8s-ci-robot added sig/network and removed needs-sig labels Feb 14, 2020
@athenabot athenabot commented Feb 14, 2020

/triage unresolved

Comment /remove-triage unresolved when the issue is assessed and confirmed.

🤖 I am a bot run by vllry. 👩‍🔬
