fix: [NPM] reposition iptables jump to AZURE-NPM chain #1086
|
/azp run |
|
Azure Pipelines successfully started running 2 pipeline(s). |
… tests subtests for check and add forward chain
|
placeAzureChainFirst == true test results: Cyclonus: https://github.com/Azure/azure-container-networking/runs/4193283318?check_suite_focus=true Let's switch the flag and test with false to make sure there are no regressions. |
vakalapa
left a comment
Lgtm
|
/azp run |
|
Azure Pipelines successfully started running 2 pipeline(s). |
Fix #1088
Problem
In the iptables nat table, under certain conditions, KUBE-SERVICES will mark a packet sent to an LB service before it is DNATed to a pod's IP. Then, in the iptables filter table, the KUBE-FORWARD chain will accept packets with this mark before the packet reaches the AZURE-NPM chain.
As a result, we might not get a chance to drop packets that we should.
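The ordering problem can be illustrated with a small first-match simulation. This is only a sketch: the pod IP, the NPM policy, and the default ACCEPT policy of FORWARD are assumptions made for illustration, not values taken from a real cluster.

```python
MASQ_MARK = 0x4000  # masquerade mark referred to in this description


def kube_forward(packet):
    """Accept packets carrying the masquerade mark; otherwise fall through."""
    if packet["mark"] & MASQ_MARK:
        return "ACCEPT"
    return None


def azure_npm(packet):
    """Hypothetical NPM policy: drop all traffic to one protected pod."""
    if packet["dst"] == "10.240.0.10":  # illustrative pod IP
        return "DROP"
    return None


def traverse(chains, packet):
    """Walk the jumps in order; the first chain that returns a verdict wins."""
    for name, chain in chains:
        verdict = chain(packet)
        if verdict is not None:
            return verdict, name
    return "ACCEPT", "policy"  # assumed default policy of FORWARD


# Current order: KUBE-FORWARD is evaluated before AZURE-NPM.
current_order = [("KUBE-FORWARD", kube_forward), ("AZURE-NPM", azure_npm)]
packet = {"dst": "10.240.0.10", "mark": MASQ_MARK}
print(traverse(current_order, packet))
```

With the mark set, the packet is accepted in KUBE-FORWARD and the NPM rule that would have dropped it is never evaluated.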
iptables chains
filter table
FORWARD chain (as it is now)
As it is now, NPM comes after KUBE-FORWARD.
KUBE-FORWARD chain
Notice the accept on masquerade mark (0x4000).
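For reference, the iptables mark match compares the packet mark to a value after applying a mask. The sketch below assumes the accept rule matches value 0x4000 with mask 0x4000; the helper names are hypothetical.

```python
MASQ_MARK = 0x4000


def mark_matches(packet_mark, value, mask):
    """Semantics of the iptables mark match: (mark AND mask) == value."""
    return (packet_mark & mask) == value


def or_mark(packet_mark, bits):
    """Setting the masquerade bit while leaving other mark bits untouched."""
    return packet_mark | bits


unmarked = 0x0001                      # some unrelated bit already set
marked = or_mark(unmarked, MASQ_MARK)  # after mark-for-masquerade

print(mark_matches(unmarked, MASQ_MARK, MASQ_MARK))
print(mark_matches(marked, MASQ_MARK, MASQ_MARK))
```

Because the match is masked, other bits in the packet mark do not affect whether the accept rule fires.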
nat table (example with an ILB service named elf/nginx-svc)
For this example, elf/nginx-svc delegates traffic to two pods for incoming traffic on tcp port 80 or nodeport 30525.
KUBE-SERVICES is Kubernetes' nat chain. It is referenced in the OUTPUT and PREROUTING chains.
Depending on the src, we may mark for masquerading if port 80 is used. Depending on the IP/port used for the svc, we either jump to the svc chain, fwd chain, or nodeports chain.
Here's the mark-for-masquerade chain:
Here's the LB's svc chain, which DNATs to one of the pods randomly. The packet is marked for masquerade (0x4000) if the src IP is the same as the dst pod's IP.
If the traffic was sent to the LB's external IP, it goes to the fwd chain, where it is marked for masquerade before being sent to the svc chain above for DNAT.
If the nodeport was used, the packet is marked for masquerade and sent to the svc chain.
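The marking behavior described for this example can be summarized as a small decision function. This is a simplified model, not kube-proxy's logic: the entry names are hypothetical labels, and the source-dependent mark on port 80 in KUBE-SERVICES is deliberately not modeled.

```python
def marked_for_masquerade(entry, src_ip, chosen_pod_ip):
    """Return True if the packet would carry mark 0x4000 after the nat table.

    entry is a hypothetical label for how the service was reached:
    "lb_ip" (LB external IP), "nodeport", or "cluster_ip".
    """
    if entry in ("lb_ip", "nodeport"):
        # Marked before the jump to the svc chain.
        return True
    if entry == "cluster_ip":
        # The svc chain marks only when the source is the chosen backend pod.
        return src_ip == chosen_pod_ip
    raise ValueError(f"unknown entry: {entry}")


print(marked_for_masquerade("lb_ip", "10.240.0.20", "10.240.0.10"))
print(marked_for_masquerade("cluster_ip", "10.240.0.20", "10.240.0.10"))
```

The cases that return True are the ones that KUBE-FORWARD accepts in the filter table before NPM sees them.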
Solution
When a toggle is turned on (default off), move the jump from FORWARD to AZURE-NPM chain above the jump to KUBE-FORWARD.
As a result, packets DNATed to a pod from the ILB will pass through NPM instead of being accepted beforehand.
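A minimal sketch of the repositioning, treating the FORWARD chain as an ordered list of jump targets. The function name is hypothetical, and placing the jump at the very top when the toggle is on is an assumption based on the flag name placeAzureChainFirst.

```python
AZURE = "AZURE-NPM"


def reposition_azure_jump(forward_targets, place_azure_chain_first):
    """Return the FORWARD jump order for the given toggle value.

    Toggle off: the order is left unchanged.
    Toggle on: the AZURE-NPM jump is moved to the first position.
    """
    if not place_azure_chain_first:
        return list(forward_targets)
    others = [t for t in forward_targets if t != AZURE]
    return [AZURE] + others


before = ["KUBE-FORWARD", "KUBE-SERVICES", "AZURE-NPM"]  # illustrative order
after = reposition_azure_jump(before, place_azure_chain_first=True)
print(after)
```

After the move, the index of AZURE-NPM is lower than that of KUBE-FORWARD, so marked packets reach NPM before KUBE-FORWARD can accept them.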
new FORWARD chain in filter table
We add a ctstate NEW requirement for the jump to the Azure chain, and position the jump depending on the toggle. This follows KUBE-FORWARD's practice and is necessary so that, for example, we don't deny the HTTP response to an HTTP request that we allow.
When the toggle is set to place the Azure chain first:
Keep the rest the same, including the redundant check for state RELATED/ESTABLISHED in the final ACCEPT in Azure chain.
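The effect of the ctstate NEW requirement can be sketched as follows. The policy, IPs, and ports are hypothetical, and the model assumes that packets which skip or pass NPM are accepted later in FORWARD.

```python
def npm_allows(packet):
    """Hypothetical policy: only traffic to the server pod on port 80 is allowed."""
    return packet["dst"] == "10.240.0.10" and packet["dport"] == 80


def forward_verdict(packet):
    """Jump to the Azure chain only for connections in state NEW."""
    if packet["ctstate"] == "NEW" and not npm_allows(packet):
        return "DROP"
    return "ACCEPT"  # assumed: accepted later in FORWARD


request = {"dst": "10.240.0.10", "dport": 80, "ctstate": "NEW"}
response = {"dst": "10.240.0.20", "dport": 43512, "ctstate": "ESTABLISHED"}

print(forward_verdict(request))
print(forward_verdict(response))
# Without the state check, the same response would be judged as a new flow:
print(forward_verdict({**response, "ctstate": "NEW"}))
```

The last call shows why the state check matters: evaluated as a new connection, the response packet would not match the allow rule and would be dropped.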