Crippling xtables lock contention on kubernetes #933
Comments
Thanks for the great issue report. A command-line arg would be the right way to configure this. Allowing very large values would also provide a mechanism for disabling it. It would be great to add a note to troubleshooting.md about this too. I think it's worth getting this fix into the next release, but longer term we'll probably want to do something better.
Going to leave this open for now to help track a longer term solution.
Switch to nftables :-) ?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Expected Behavior
kube-proxy should be able to get a word in edgewise and update our iptables rules.
Current Behavior
We currently have three Kubernetes 1.7.6 clusters in prod: one fairly large one (about 50 m4.10xl nodes in AWS, 818 services) and two smaller clusters that run about 200 services apiece but are otherwise identical.
As part of our move to Kubernetes 1.9 by way of 1.8, we performed an upgrade from flannel 0.8.0 to 0.9.1. Things went smoothly on the two smaller clusters; however, on the large cluster, as soon as a flannel pod was upgraded on a node, the kube-proxy on that node started reporting:
kube-proxy-w8hxt:kube-proxy E0126 20:16:49.022132 1 proxier.go:1601] Failed to execute iptables-restore: failed to acquire old iptables lock: timed out waiting for the condition
As part of our troubleshooting we also moved to flannel 0.10.0; same issue.
I don't have any in-depth knowledge of how the iptables xtables lock works, but on an upgraded box we were seeing upwards of 4 iptables processes pending at all times (/usr/local/bin/iptables commands with --wait flags). They don't seem to have any FIFO-type queuing arrangement; I imagine whichever process happens to check and find no lock held is the one that runs.
Digging through the changelog we found this pull request: #808
We suspected this was the root cause and that a 5s check for these rules was causing excessive contention on our nodes.
Possible Solution
I've built a replacement container against 0.10.0 in which I quickly changed the 5s check to a 5m check and deployed it on one of the nodes that was previously affected; so far so good.
I think hardcoding this value is a mistake, and a flag or other configuration option should be provided to adjust this sync timer. I'm perfectly happy to work up the pull request for this feature, but would like some guidance on how you'd like the new parameter provided (I suspect a command-line flag that defaults to 5 seconds).
Steps to Reproduce (for bugs)
Context
I think I covered the context above
Your Environment