
kube-proxy CrashLoopBackOff #81353

Open
zjj2wry opened this issue Aug 13, 2019 · 4 comments

Comments

@zjj2wry (Member)

commented Aug 13, 2019

What happened:
kube-proxy-worker-d2wd6 0/1 CrashLoopBackOff 1 6h36m

I0813 04:51:18.992525       1 server_others.go:216] Tearing down inactive rules.
I0813 04:51:18.992594       1 graceful_termination.go:160] Trying to delete rs: 172.16.132.159:30073/TCP/10.81.130.93:443
I0813 04:51:18.992763       1 graceful_termination.go:171] Not deleting, RS 172.16.132.159:30073/TCP/10.81.130.93:443: 0 ActiveConn, 3 InactiveConn
I0813 04:51:18.992789       1 graceful_termination.go:160] Trying to delete rs: 172.16.132.159:30073/TCP/10.81.130.239:443
I0813 04:51:18.992824       1 graceful_termination.go:171] Not deleting, RS 172.16.132.159:30073/TCP/10.81.130.239:443: 0 ActiveConn, 3 InactiveConn
I0813 04:51:18.992836       1 graceful_termination.go:160] Trying to delete rs: 172.16.132.159:30073/TCP/10.81.129.39:443
I0813 04:51:18.992876       1 graceful_termination.go:171] Not deleting, RS 172.16.132.159:30073/TCP/10.81.129.39:443: 0 ActiveConn, 2 InactiveConn
I0813 04:51:18.992890       1 graceful_termination.go:160] Trying to delete rs: 172.16.132.159:30509/TCP/10.81.129.39:80
I0813 04:51:18.992950       1 graceful_termination.go:171] Not deleting, RS 172.16.132.159:30509/TCP/10.81.129.39:80: 0 ActiveConn, 3 InactiveConn
I0813 04:51:18.992964       1 graceful_termination.go:160] Trying to delete rs: 172.16.132.159:30634/TCP/10.81.130.93:80
I0813 04:51:18.992996       1 graceful_termination.go:171] Not deleting, RS 172.16.132.159:30634/TCP/10.81.130.93:80: 0 ActiveConn, 3 InactiveConn
I0813 04:51:18.993004       1 graceful_termination.go:160] Trying to delete rs: 172.16.132.159:30634/TCP/10.81.129.39:80
I0813 04:51:18.993062       1 graceful_termination.go:171] Not deleting, RS 172.16.132.159:30634/TCP/10.81.129.39:80: 0 ActiveConn, 3 InactiveConn
I0813 04:51:18.993076       1 graceful_termination.go:160] Trying to delete rs: 172.16.132.159:30509/TCP/10.81.130.93:80
I0813 04:51:18.993111       1 graceful_termination.go:171] Not deleting, RS 172.16.132.159:30509/TCP/10.81.130.93:80: 0 ActiveConn, 3 InactiveConn
I0813 04:51:18.993126       1 graceful_termination.go:160] Trying to delete rs: 172.16.132.159:30509/TCP/10.81.130.239:80
I0813 04:51:18.993159       1 graceful_termination.go:171] Not deleting, RS 172.16.132.159:30509/TCP/10.81.130.239:80: 0 ActiveConn, 3 InactiveConn
I0813 04:51:18.993167       1 graceful_termination.go:160] Trying to delete rs: 172.16.132.159:32029/TCP/10.81.130.93:443
I0813 04:51:18.993214       1 graceful_termination.go:171] Not deleting, RS 172.16.132.159:32029/TCP/10.81.130.93:443: 0 ActiveConn, 3 InactiveConn
I0813 04:51:18.993228       1 graceful_termination.go:160] Trying to delete rs: 172.16.132.159:32029/TCP/10.81.130.239:443
I0813 04:51:18.993258       1 graceful_termination.go:171] Not deleting, RS 172.16.132.159:32029/TCP/10.81.130.239:443: 0 ActiveConn, 3 InactiveConn
I0813 04:51:18.993267       1 graceful_termination.go:160] Trying to delete rs: 172.16.132.159:32029/TCP/10.81.129.39:443
I0813 04:51:18.993311       1 graceful_termination.go:171] Not deleting, RS 172.16.132.159:32029/TCP/10.81.129.39:443: 0 ActiveConn, 3 InactiveConn
I0813 04:51:18.993325       1 graceful_termination.go:160] Trying to delete rs: 172.16.132.159:30634/TCP/10.81.130.239:80
I0813 04:51:18.993410       1 graceful_termination.go:171] Not deleting, RS 172.16.132.159:30634/TCP/10.81.130.239:80: 0 ActiveConn, 3 InactiveConn
E0813 04:51:19.028430       1 proxier.go:430] Failed to execute iptables-restore for nat: exit status 1 (iptables-restore: line 7 failed
)
I0813 04:51:19.032155       1 server.go:466] Version: v1.12.6-aliyun.1
I0813 04:51:19.047182       1 conntrack.go:98] Set sysctl 'net/netfilter/nf_conntrack_max' to 2097152
I0813 04:51:19.047235       1 conntrack.go:52] Setting nf_conntrack_max to 2097152
I0813 04:51:19.047282       1 conntrack.go:98] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400
I0813 04:51:19.047319       1 conntrack.go:98] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600
I0813 04:51:19.047472       1 config.go:202] Starting service config controller
I0813 04:51:19.047492       1 controller_utils.go:1027] Waiting for caches to sync for service config controller
I0813 04:51:19.047491       1 config.go:102] Starting endpoints config controller
I0813 04:51:19.047509       1 controller_utils.go:1027] Waiting for caches to sync for endpoints config controller
I0813 04:51:19.147677       1 controller_utils.go:1034] Caches are synced for service config controller
I0813 04:51:19.147715       1 controller_utils.go:1034] Caches are synced for endpoints config controller
[root@iZ2zehkmdztgqr9gfnfrttZ lcsuper]#

kube-proxy returned to normal after restarting the pod (i.e., deleting it so it was recreated).
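The error line in the log above names the rule that iptables-restore rejected ("line 7" of the generated nat payload). A minimal sketch of pulling that reference out of the log, with the log text inlined so it is self-contained (the exact line number in a real incident will of course vary):

```shell
# The failing kube-proxy log line, inlined from the report above.
log='E0813 04:51:19.028430       1 proxier.go:430] Failed to execute iptables-restore for nat: exit status 1 (iptables-restore: line 7 failed'

# Extract "line N" so you know which rule in the generated
# iptables-save payload was rejected by the kernel.
failed_line=$(echo "$log" | grep -o 'line [0-9]*')
echo "iptables-restore rejected: $failed_line"
```

Knowing the failing line lets you compare it against the output of iptables-save -t nat on the node to see which rule kube-proxy could not install.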

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version):
    Client Version: version.Info{Major:"1", Minor:"12+", GitVersion:"v1.12.6-aliyun.1", GitCommit:"4304b26", GitTreeState:"", BuildDate:"2019-04-08T08:50:29Z", GoVersion:"go1.10.8", Compiler:"gc", Platform:"linux/amd64"}
    Server Version: version.Info{Major:"1", Minor:"12+", GitVersion:"v1.12.6-aliyun.1", GitCommit:"8cb561c", GitTreeState:"", BuildDate:"2019-04-22T11:34:20Z", GoVersion:"go1.10.8", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration:
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Network plugin and version (if this is a network-related bug):
  • Others:
@zjj2wry (Member, Author)
commented Aug 13, 2019

/sig Network

@k8s-ci-robot k8s-ci-robot added sig/network and removed needs-sig labels Aug 13, 2019

@gongguan (Contributor)
commented Aug 14, 2019

Does running iptables -t nat -L fail?
Are there many hanging iptables processes (check with ps -ef | grep iptables)?
If so, the iptables lock may be held by another process.
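A sketch of the checks suggested above. The live commands must run on the affected node; here a canned ps listing stands in for real output so the filtering step is reproducible (the process entries are illustrative, not from the actual node):

```shell
# On the affected node you would run:
#   iptables -t nat -L              # does listing the nat table fail?
#   ps -ef | grep '[i]ptables'      # any stuck iptables processes?
# The [i] bracket trick stops grep from matching its own command line.
sample_ps='root  1123     1  0 04:51 ?     00:00:00 iptables-restore --noflush
root  2048  1999  0 04:52 pts/0 00:00:00 grep iptables'

# Count iptables-restore entries in the listing; a long-lived one
# suggests a process is sitting on the iptables lock.
stuck=$(echo "$sample_ps" | grep -c '[i]ptables-restore')
echo "stuck iptables-restore processes: $stuck"
```

If such a process exists, it may be holding the xtables lock (typically /run/xtables.lock), which would make kube-proxy's own iptables-restore calls fail until the stuck process exits or is killed.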

@zjj2wry (Member, Author)
commented Aug 15, 2019

Does running iptables -t nat -L fail?

No error.

Are there many hanging iptables processes (check with ps -ef | grep iptables)?

No.

It now looks like kube-proxy does not recover automatically after a brief node failure.

@athenabot
commented Aug 15, 2019

/triage unresolved

Comment /remove-triage unresolved when the issue is assessed and confirmed.

🤖 I am a bot run by vllry. 👩‍🔬
