-
Notifications
You must be signed in to change notification settings - Fork 2.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpf: egressgw: sync logic to determine if destination is outside cluster #18246
Conversation
c9dcd86
to
b2e8f29
Compare
/test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Gilberto, LGTM.
As we discussed offline, I think we should describe what IPV4_SNAT_EXCLUSION_DST_CIDR
and, subsequently, --ipv4-native-routing-cidr
mean. My understanding was that the above CIDR included node IPs and only those, which at least in the EKS environment you tried out was not the case. IMO, this can be done in a followup PR but we should do it soon before it gets forgotten since there was a lot of time wasted on this based on different understandings of what above macro meant.
Based on our offline discussion IPV4_SNAT_EXCLUSION_DST_CIDR
seems to be: "a CIDR that does not require SNAT (unless traffic is going through egress gw)". We should add an appropriate line in @pchaigno's list here: #15379 (comment).
to make it more explicit their purpose Signed-off-by: Gilberto Bertin <gilberto@isovalent.com>
b2e8f29
to
99b4c9a
Compare
Yep, I'll fill a PR with these changes now 👍 |
Marking as ready to merge. No need to rerun full CI as I just added an additional comment and amended a commit message (and before CI was green) |
In the context of egress gateway, when traffic is leaving the cluster we need to check twice if it is a match for an egress NAT policy: * first time in handle_ipv4_from_lxc(), on the node where the client pod is running (to determine if it should be forwarded to a gateway node) * second time in snat_v4_needed(), on the actual gateway node (to determine if it should be SNATed) Currently the 2 checks are slightly diverging wrt how traffic destined to outside the cluster is identified: * in the first case we use is_cluster_destination(), which uses the information stored on the ipcache and EP maps * in the second case we just rely on the IPV4_SNAT_EXCLUSION_DST_CIDR The issue with the IPV4_SNAT_EXCLUSION_DST_CIDR logic is that we may incorrectly exclude from egress gw SNAT traffic that is supposed to be SNATed: case in point an EKS environment where the primary VPC is shared between the cluster and some other EC2 nodes that don't belong to the cluster. To fix this, this commit changes the snat_v4_needed() logic to match the one we use in handle_ipv4_from_lxc() and executes it before the IPV4_SNAT_EXCLUSION_DST_CIDR check. Signed-off-by: Gilberto Bertin <gilberto@isovalent.com>
In snat_v4_neeeded(), one of the conditions to determine if a packet should be SNATed with an egress IP is: !local_ep || is_cluster_destination() The intent of the first check (!local_ep) was to that tells us that traffic was redirected by an egress gateway policy to a different node to be masqueraded, but in practice it's not needed: as long as the packet is destined to outside the cluster, is not reply traffic and it's matched by an egress NAT policy, it should be SNATed with the egress gw IP (moreover we should not assume that there's no local EP since it's possible that the node where the client pod is running is the same node that will act as egress gateway). Signed-off-by: Gilberto Bertin <gilberto@isovalent.com>
99b4c9a
to
6fc9f33
Compare
(taking care of v1.10 backport as there may be a few conflicts here) |
See individual commits