-
Notifications
You must be signed in to change notification settings - Fork 260
[NPM] RETURN early on MARK in DROP chains #881
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
JungukCho
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please add a detailed explanation to describe what was an issue and how "early on Mark" resolves the issue with one example yaml? Since I only understood what code does (e.g., control order of rules (e.g., append or insert) based on type of chain), I cannot quite map what it solved. It will help me to understand this PR. Since this PR is linked with #870 , it would be nice to use the example. Will be very appreciate to show iptables changes as well.
Also are there valid unit test for this?
|
/azp run |
|
Azure Pipelines successfully started running 2 pipeline(s). |
JungukCho
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to point out exact codes, but not sure where they are.
However, after playing with this PR, in high-level, this PR removes "AZURE-NPM-EGRESS(or INGRESS)-DROPS" chains from "AZURE-NPM-EGRESS (or INGRESS)-PORT" chain to resolve packet drop issues.
One thing that I am not sure is to put "RETURN" in each "AZURE-NPM-EGRESS (or INGRESS)-DROPS" chains is necessary.
IPtables output from example yaml which enables reproducing packet drop issues. Italic parts seems redundant.
Or are there something which I missed?
Chain AZURE-NPM-EGRESS (1 references)
pkts bytes target prot opt in out source destination
0 0 AZURE-NPM-EGRESS-PORT all -- * * 0.0.0.0/0 0.0.0.0/0
0 0 RETURN all -- * * 0.0.0.0/0 0.0.0.0/0 mark match 0x3000 / RETURN-on-EGRESS-and-INGRESS-mark-0x3000 /
0 0 RETURN all -- * * 0.0.0.0/0 0.0.0.0/0 mark match 0x1000 / RETURN-on-EGRESS-mark-0x1000 */*
0 0 AZURE-NPM-EGRESS-DROPS all -- * * 0.0.0.0/0 0.0.0.0/0
Chain AZURE-NPM-EGRESS-DROPS (1 references)
pkts bytes target prot opt in out source destination
0 0 RETURN all -- * * 0.0.0.0/0 0.0.0.0/0 mark match 0x3000 / _RETURN-on-EGRESS-and-INGRESS-mark-0x3000 /
0 0 RETURN all -- * * 0.0.0.0/0 0.0.0.0/0 mark match 0x1000 / RETURN-on-EGRESS-mark-0x1000 /
0 0 DROP all -- * * 0.0.0.0/0 0.0.0.0/0 match-set azure-npm-784554818 src /* DROP-ALL-FROM-ns-default /_
0 0 DROP all -- * * 0.0.0.0/0 0.0.0.0/0 match-set azure-npm-784554818 src / DROP-ALL-FROM-ns-default */
npm/iptm/iptm.go
Outdated
| } | ||
|
|
||
| func isDropsChain(chainName string) bool { | ||
| for _, chain := range IptablesAzureDropsChainList { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer to using "if condition" instead of "for loop with IptableAzureDropsChainList" since there are only two elements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok let me fix it
| getAzureNPMAcceptChainRules, | ||
| getAzureNPMIngressChainRules, | ||
| getAzureNPMIngressPortChainRules, | ||
| getAzureNPMIngressFromChainRules, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We still need these chains no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah we will need the chains, this is just removing the default RETURN rules added in these chains
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we change the function name then? GetAllChainsAndRules is misleading
The redundant RETURN rules in DROP chains are a form contingency for future chain ordering when we call DROP chains directly unlike today. These rules ensure any future changes will insure us from dropping relevant MARK'ed packets (whcih has caused this issue in the first place) |
npm/iptm/iptm.go
Outdated
| } | ||
|
|
||
| // (TODO) need a review of all chains to make sure we are not adding duplicate rules. | ||
| // NPM should not add multiple TO/FROM rules along with DROPs in PORT chains, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still need these comments? Is it resolved in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes we will not need this comment as we are taking care of these duplicate rules in this PR.
JungukCho
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One concern in this PR is I am not 100% sure that whether it does allow invalid packets or not allow valid packets when various multiple network policies are applied in AZURE-NPM-INGRESS(or EGRESS)-DROPS chain while I think this PR does not change those behavior from previous code base and getDefaultDropEntries() guarantees that one.
Do you have any thoughts?
npm/iptm/iptm.go
Outdated
| if entry.IsJumpEntry { | ||
| // Since there is a RETURN statement added to each DROP chain, we need to make sure | ||
| // any new DROP rule added to ingress or egress DROPS chain is added at the BOTTOM | ||
| if entry.IsJumpEntry || isDropsChain(entry.Chain) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NPM does not need entry.IsJumpEntry anymore with this PR.
So, please confirm in NPM code. If we confirm this, we can remove this if condition.
Also, review and update translatePolicy_test.go with this change accordingly. If other codes also impacted, please check their UTs as well.
As I put comment about this before, RETURNs in "AZURE-NPM-EGRESS (or INGRESS)-DROPS" chains are redundant.
If we confirm this, it seems we only need below block in iptm.go with this PR.
iptMgr.OperationFlag = util.IptablesInsertionFlag.Let me know if I miss something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, we do not need to append when the jumpentry is true, since we are now removing all jump entries and are relying on the default rules to help with that jump transitions to next chains.
As far as the redundant RETURN rule in DROP chains are concerned, i feel strongly to keep those rules in place as a contingency for any later changes. With these redundant rules, we are ensuring that no matter what we will NEVER DROP a MARK'ed packet and this assurance helps are from hitting any other bugs in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One concern in this PR is I am not 100% sure that whether it does allow invalid packets or not allow valid packets when various multiple network policies are applied in AZURE-NPM-INGRESS(or EGRESS)-DROPS chain while I think this PR does not change those behavior from previous code base and getDefaultDropEntries() guarantees that one.
Do you have any thoughts?
I think overall this is a better design because, keeping the translation logic aside, while evaluating rule we are making sure that a MARKED packet will never be dropped and a unmarked packet will be dropped on certain conditions. This simplifies the logic from before where we have multiple avenue to DROPs chain before we evaluated all MARK rules in a given direction (which was wrong).
|
Thank you for applying comments! One more suggestion is to remove "IsJumpEntry" member from "IptEntry" struct (https://github.com/Azure/azure-container-networking/blob/master/npm/iptm/iptm.go#L50) with this PR since the "IsJumpEntry" member is not needed anymore. Except for above suggestion, it looks good to me. |
JungukCho
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
I think this PR has functional improvement, but I assume it also has data plane latency improvement. So, while it is not urgent now, it would be good to have some measurements to see data plane improvements with this PR later.
|
/azp run |
|
Azure Pipelines successfully started running 2 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 2 pipeline(s). |
If there are multiple policies applied on a cluster with TO rules in one and PORT rules in another selecting the same namespace and pod selector, then some TO traffic is being dropped. The reason for that is in PORT chains we are adding TO/FROM chain rules and DROP rules, event though TO/FORM chain MARKs the packet and returns it, since there are no RETURNs in PORT after the TO/FROM is called, we are sending the packet to DROP chains and dropping the traffic.
This fix adds a RETURN rule at the top of both DROP chains to return, in future we should check if we can reduce the TO/FROM rules and DROP rules in PORT chain
Along with that, with this PR, we will now be only adding a single JUMP to either TO/FROM chain in PORT chain for all traffic. We will not be adding any DROP rules in PORT or TO/FROM chains ! This will reduce the number jump rules drastically, keeps a simple logic on where the MARK rules are hit and where the DROP rules are hit.