Be more aggressive acquiring the iptables lock #85771
What type of PR is this?
What this PR does / why we need it:
If kube-proxy cannot get the iptables lock, it incurs a penalty of up to 35 seconds: 5 seconds waiting for the lock plus 30 seconds before it retries.
Currently, Kubernetes uses the iptables -w 5 option, waiting up to 5 seconds to acquire the lock.
We can be more aggressive in trying to acquire the lock by using a smaller retry interval.
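To make the trade-off concrete, here is a rough model of the polling behaviour, not iptables' actual implementation. It assumes attempts happen at t = 0 and then once per interval until the wait expires; the function names are hypothetical and exist only for illustration.

```python
def simulate_acquire(lock_held_ms, wait_ms=5000, interval_ms=1000):
    """Return the time (ms) at which a polling client gets the lock, or None on timeout.

    Simplified model (an assumption): one attempt at t=0, then one per interval,
    for as long as t is still below the overall wait.
    """
    t = 0
    while t < wait_ms:
        if t >= lock_held_ms:      # the other holder has released the lock
            return t
        t += interval_ms
    return None


def max_attempts(wait_ms, interval_ms):
    """Number of attempts that must fail before the client gives up."""
    return wait_ms // interval_ms


def worst_case_penalty_s(wait_s=5, retry_s=30):
    """Time lost when the lock is never acquired: wait plus kube-proxy retry delay."""
    return wait_s + retry_s
```

In this model, a lock that is held for 300 ms is acquired after 1000 ms with the default 1 second interval, but after 300 ms with a 100 ms interval. A lock held for longer than the 5 second wait still costs the full 35 seconds in both cases.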
We can reproduce this situation using flock to hold the lock
acquire the lock in iptables
observe iptables behaviour with -W = 100000
observe iptables behaviour with -W = 10000
remove the lock
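The same sequence can be approximated without root or iptables. The sketch below is an illustration only: it holds an exclusive flock on a temporary file (a hypothetical stand-in for the xtables lock file), polls for it with a short interval from a second file descriptor, and releases the lock after 0.3 seconds. The helper names, the temporary file, and the timings are all assumptions. It relies on the Linux fcntl module.

```python
import fcntl
import os
import tempfile
import threading
import time


def try_acquire(path, wait_s, interval_s):
    """Poll for an exclusive flock on path; return the open fd, or None on timeout."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    deadline = time.monotonic() + wait_s
    while True:
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return fd
        except BlockingIOError:
            # Lock is held elsewhere: give up if the next sleep would pass the deadline.
            if time.monotonic() + interval_s > deadline:
                os.close(fd)
                return None
            time.sleep(interval_s)


def demo(hold_s=0.3, wait_s=2.0, interval_s=0.05):
    """Hold the lock for hold_s seconds while a second descriptor polls for it."""
    with tempfile.NamedTemporaryFile() as lock_file:
        holder = os.open(lock_file.name, os.O_RDWR)
        fcntl.flock(holder, fcntl.LOCK_EX)  # acquire the lock
        # Remove the lock after hold_s seconds, from a timer thread.
        release = threading.Timer(hold_s, lambda: fcntl.flock(holder, fcntl.LOCK_UN))
        release.start()
        start = time.monotonic()
        fd = try_acquire(lock_file.name, wait_s, interval_s)
        elapsed = time.monotonic() - start
        release.join()
        os.close(holder)
        if fd is not None:
            os.close(fd)
        return fd is not None, elapsed
```

Calling demo() returns whether the poller got the lock and how long it waited. With a shorter interval_s, the measured wait tracks the release time more closely.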
Which issue(s) this PR fixes:
Special notes for your reviewer:
Seems this was the previous behaviour based on
and the one implemented for flushing the chains
Does this PR introduce a user-facing change?:
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:
in the original PR you noted:
So what changed?
As I said before, I don't think the kind problem is actually a timeout getting the lock, particularly given that kubelet reliably always acquires the lock successfully and kube-proxy reliably hits the timeout. Something else is going on there (e.g. as with #82587, where pkg/util/iptables was grabbing the lock itself and then calling the iptables binary, which also tried to grab the lock but was guaranteed to fail since we had already grabbed it).
I investigated the kind issue #85727 further and, as you say, it has to be something different ... However, I observed that the values used in this PR are similar to the values kube-proxy used previously, when it didn't have the
[APPROVALNOTIFIER] This PR is APPROVED
iptables has two options that control how it tries to acquire the lock:

--wait -w [seconds]    maximum wait to acquire xtables lock before give up
--wait-interval -W [usecs]    wait time to try to acquire xtables lock (interval to wait for xtables lock); default is 1 second

Kubernetes uses -w 5, which means iptables waits up to 5 seconds to acquire the lock. If it cannot acquire the lock, kube-proxy fails and retries in 30 seconds, which is a significant penalty for sensitive applications.

We can be a bit more aggressive and try to acquire the lock every 100 msec, which means 50 attempts have to fail before we give up.
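Since -w takes seconds and -W takes microseconds, the 100 msec interval has to be converted before it is passed on the command line. A minimal sketch of that conversion follows; wait_flags is a hypothetical helper for illustration, not the function Kubernetes uses.

```python
def wait_flags(wait_seconds, interval_ms):
    """Build the lock-related iptables arguments.

    -w expects seconds and -W expects microseconds, so the interval is converted.
    """
    interval_usec = interval_ms * 1000
    return ["-w", str(wait_seconds), "-W", str(interval_usec)]


print(wait_flags(5, 100))  # ['-w', '5', '-W', '100000']
```

The resulting values match the -W = 100000 case used in the reproduction steps.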