[CNI][Fix] Make iptable calls idempotent for swift podsubnet scenario #1795
iptables.GetInsertIptableRuleCmd(iptables.V4, iptables.Nat, iptables.Swift, azureIMDSMatch, snatHostIPJump),

var iptableCmds []iptables.IPTableEntry
if !iptables.ChainExists(iptables.V4, iptables.Nat, iptables.Swift) {
I'm assuming the *Exists functions don't take a lock.
They don't, but what would be the problem even if they did? That call might fail, but retries ensure it eventually succeeds, right?
The exists check happens only at startup.
So if someone put a retry around handle common options and the first attempt failed at the third step, the retry won't re-run the exists check, and you will keep failing until your retry expires.
An exists check closer to the create/append/insert would be more reliable, but that's hypothetical if we're not really retrying.
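The suggestion above, checking right next to each create/insert instead of once at startup, can be sketched roughly as follows. This is only an illustration: `fakeTable`, `ensureChain`, `ensureRule`, and the chain and rule names are hypothetical stand-ins, not the repository's iptables package, and the in-memory table only mimics iptables rejecting a duplicate chain.

```go
package main

import "fmt"

// fakeTable is a hypothetical in-memory stand-in for one iptables table,
// mapping each chain name to its ordered rules.
type fakeTable struct {
	chains map[string][]string
}

func newFakeTable() *fakeTable {
	return &fakeTable{chains: make(map[string][]string)}
}

// createChain fails when the chain already exists, like `iptables -N`.
func (t *fakeTable) createChain(chain string) error {
	if _, ok := t.chains[chain]; ok {
		return fmt.Errorf("chain %q already exists", chain)
	}
	t.chains[chain] = nil
	return nil
}

// insertRule puts the rule at the head of the chain without checking
// for duplicates.
func (t *fakeTable) insertRule(chain, rule string) error {
	rules, ok := t.chains[chain]
	if !ok {
		return fmt.Errorf("chain %q does not exist", chain)
	}
	t.chains[chain] = append([]string{rule}, rules...)
	return nil
}

func (t *fakeTable) hasRule(chain, rule string) bool {
	for _, r := range t.chains[chain] {
		if r == rule {
			return true
		}
	}
	return false
}

// ensureChain checks immediately before creating, so a repeated call is a no-op.
func ensureChain(t *fakeTable, chain string) error {
	if _, ok := t.chains[chain]; ok {
		return nil
	}
	return t.createChain(chain)
}

// ensureRule checks immediately before inserting, so a repeated call is a no-op.
func ensureRule(t *fakeTable, chain, rule string) error {
	if t.hasRule(chain, rule) {
		return nil
	}
	return t.insertRule(chain, rule)
}

func main() {
	tbl := newFakeTable()
	// Run the same sequence twice, as a retry would.
	for attempt := 1; attempt <= 2; attempt++ {
		if err := ensureChain(tbl, "EXAMPLE-CHAIN"); err != nil {
			fmt.Println("attempt", attempt, "failed:", err)
			return
		}
		if err := ensureRule(tbl, "EXAMPLE-CHAIN", "example-rule"); err != nil {
			fmt.Println("attempt", attempt, "failed:", err)
			return
		}
	}
	fmt.Println(tbl.chains["EXAMPLE-CHAIN"])
}
```

Because the check sits inside each `ensure*` helper, a caller that retries the whole sequence re-evaluates existence on every attempt. The check and the mutation are still two separate steps, so this does not make them atomic.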
We are not retrying, so we should be good. You're right, but achieving that would take a considerable code change, and this iptables part is going to move to CNS in the long run anyway. Even if it fails after the exists check, it succeeds on the next attempt; it won't end up stuck in that error indefinitely the way it does now.
paulgmiller left a comment
This seems like a safe mitigation. It would be nice if all commands were atomic or did the check themselves, but that is likely a bigger change.
    iptableCmds = append(iptableCmds, iptables.GetInsertIptableRuleCmd(iptables.V4, iptables.Nat, iptables.Swift, azureIMDSMatch, snatHostIPJump))
}

options[network.IPTablesKey] = iptableCmds
This is done at invocation of the Azure CNI process.
So if we ever tried to retry within the CNI we might still hit the same thing, but that would only block one pod, I guess.
Another way to do this is to have all commands run the exists check first. I assume there is no way to do them all atomically with one lock.
Today CNI executes sequentially: we acquire a lock at the beginning, so only one CNI process can execute these iptables commands at a time. But we plan to parallelize CNI. With parallel execution there may be races when adding iptables rules, but eventually one will succeed. If CNI returns an error, the containerd runtime will retry the operation.
That's right. I initially thought of redesigning it, but that's a bigger change. Anyway, in the future this will be part of CNS.
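The claim that a re-invocation converges once the calls are guarded by exists checks, whereas unguarded calls stay stuck, can be illustrated with a small simulation. Everything here is hypothetical: `state`, `setup`, and `retry` are illustrative names, the injected failure is artificial, and the loop merely stands in for the runtime invoking CNI again.

```go
package main

import (
	"errors"
	"fmt"
)

// state is a hypothetical in-memory stand-in for the node's iptables state.
type state struct {
	chain    bool     // whether the chain has been created
	rules    []string // rules currently in the chain
	failOnce bool     // inject one transient failure after chain creation
}

var errChainExists = errors.New("chain already exists")

// createChain fails on a duplicate, like creating an existing iptables chain.
func (s *state) createChain() error {
	if s.chain {
		return errChainExists
	}
	s.chain = true
	return nil
}

func (s *state) hasRule(rule string) bool {
	for _, r := range s.rules {
		if r == rule {
			return true
		}
	}
	return false
}

// setup programs the chain and one rule. With checkFirst=false it issues
// every command unconditionally; with checkFirst=true it skips whatever
// already exists.
func setup(s *state, checkFirst bool) error {
	if !checkFirst || !s.chain {
		if err := s.createChain(); err != nil {
			return err
		}
	}
	if s.failOnce {
		s.failOnce = false
		return errors.New("transient failure")
	}
	if !checkFirst || !s.hasRule("example-rule") {
		s.rules = append(s.rules, "example-rule")
	}
	return nil
}

// retry calls setup up to attempts times and returns the last error.
func retry(s *state, checkFirst bool, attempts int) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = setup(s, checkFirst); err == nil {
			return nil
		}
	}
	return err
}

func main() {
	unguarded := &state{failOnce: true}
	fmt.Println("without checks:", retry(unguarded, false, 3))

	guarded := &state{failOnce: true}
	fmt.Println("with checks:", retry(guarded, true, 3), guarded.rules)
}
```

In the unguarded run, the first attempt creates the chain and then fails, so every later attempt trips over the existing chain. In the guarded run, the second attempt skips the chain and adds the missing rule.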
…#1795) * cns cilium * make swift iptable calls idempotent
Reason for Change:
CNI executes iptables commands for the Swift podsubnet case. If an iptables chain or rule already exists, the command returns an error and subsequent CNI calls fail; recovering CNI requires a reboot or manual removal of that rule/chain. This fix makes the iptables calls idempotent by not adding a rule or chain that already exists.
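For readers unfamiliar with the pattern, here is a rough sketch of check-before-add at the iptables command-line level. It assumes the standard iptables flags `-L` (list), `-N` (new chain), `-C` (check rule), `-I` (insert), and `-w` (wait for the xtables lock). The helper names, the chain name, and the rule spec are placeholders, and the `ChainExists` check used in this PR may well be implemented differently.

```go
package main

import "fmt"

// ensureChainCmd returns a shell line that creates the chain only when
// listing it fails, which is taken to mean the chain does not exist yet.
func ensureChainCmd(table, chain string) string {
	return fmt.Sprintf(
		"iptables -w -t %s -L %s -n >/dev/null 2>&1 || iptables -w -t %s -N %s",
		table, chain, table, chain)
}

// ensureRuleCmd returns a shell line that inserts the rule only when
// `iptables -C` reports that it is not present.
func ensureRuleCmd(table, chain, rulespec string) string {
	return fmt.Sprintf(
		"iptables -w -t %s -C %s %s 2>/dev/null || iptables -w -t %s -I %s %s",
		table, chain, rulespec, table, chain, rulespec)
}

func main() {
	// Placeholder chain and rule spec, for illustration only.
	fmt.Println(ensureChainCmd("nat", "EXAMPLE-CHAIN"))
	fmt.Println(ensureRuleCmd("nat", "EXAMPLE-CHAIN", "-p tcp --dport 80 -j ACCEPT"))
}
```

Each generated line is safe to run repeatedly, but the check and the add remain two separate iptables invocations, so two processes running them concurrently can still race.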
Issue Fixed:
Requirements:
Notes: