egressgw: Redirect from bpf_overlay to egress gw SNAT netdev #29379

Merged
3 commits merged on Nov 27, 2023

Conversation

ysksuzuki
Member

@ysksuzuki ysksuzuki commented Nov 27, 2023

This PR fixes an issue where an iptables SNAT rule in the host netns interferes with packets destined for the egress gateway. The fix redirects those packets from bpf_overlay directly to the egress gateway's SNAT netdev.

Fixes: #17848
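
As a rough illustration of the idea (not the code changed in this PR), the decision at from-overlay can be modeled in plain userspace C. Everything named here is hypothetical: `struct egw_policy`, `from_overlay()`, and the verdict values are made up for the sketch, and the policy lookup is a simple linear scan rather than a BPF map. The point is only that a packet matching an egress gateway policy is sent straight to the egress netdev instead of being handed to the host stack, where an iptables SNAT rule could rewrite it first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical verdicts for this model; not Cilium's actual return codes. */
enum verdict {
    VERDICT_TO_STACK, /* hand to the host netns, where iptables rules may run */
    VERDICT_REDIRECT  /* send straight to a netdev, skipping the host netns */
};

/* Hypothetical policy entry: traffic from src_ip must leave via egress_ifindex. */
struct egw_policy {
    uint32_t src_ip;
    int egress_ifindex;
};

struct decision {
    enum verdict verdict;
    int ifindex; /* only meaningful for VERDICT_REDIRECT */
};

/* Model of the from-overlay decision on the gateway node. */
static struct decision from_overlay(uint32_t src_ip,
                                    const struct egw_policy *policies,
                                    size_t n)
{
    assert(policies != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (policies[i].src_ip == src_ip) {
            /* Egress gateway traffic: bypass the host stack entirely. */
            struct decision d = { VERDICT_REDIRECT, policies[i].egress_ifindex };
            return d;
        }
    }

    /* Everything else keeps taking the regular path through the stack. */
    struct decision d = { VERDICT_TO_STACK, 0 };
    return d;
}
```

In this model, only traffic that matches a policy entry skips the host netns; unrelated overlay traffic is unaffected.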

egressgw: Fix the issue that an iptables SNAT rule in the host netns interferes with packets to egress gw, causing them to bypass the egress GW policy

This commit fixes the problem that an iptables SNAT rule in the host netns
interferes with packets to egress gw. It does so by redirecting them from
bpf_overlay.

Fixes: cilium#17848

Signed-off-by: Yusuke Suzuki <yusuke.suzuki@isovalent.com>
…hook

So that we can remove the duplicate check in nat.h and bpf_overlay.h

Signed-off-by: Yusuke Suzuki <yusuke.suzuki@isovalent.com>
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Nov 27, 2023
Signed-off-by: Yusuke Suzuki <yusuke.suzuki@isovalent.com>
@ysksuzuki ysksuzuki force-pushed the egress-gw-redirect-from-overlay branch from 87d0bcd to 15a3b6c Compare November 27, 2023 08:20
@ysksuzuki
Member Author

/test

@ysksuzuki ysksuzuki marked this pull request as ready for review November 27, 2023 10:13
@ysksuzuki ysksuzuki requested review from a team as code owners November 27, 2023 10:13
@julianwiedmann julianwiedmann added kind/bug This is a bug in the Cilium logic. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. feature/egress-gateway Impacts the egress IP gateway feature. release-note/bug This PR fixes an issue in a previous release of Cilium. labels Nov 27, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Nov 27, 2023
Member

@julianwiedmann julianwiedmann left a comment

Thank you, looks good!

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Nov 27, 2023
@julianwiedmann julianwiedmann added this pull request to the merge queue Nov 27, 2023
Merged via the queue into cilium:main with commit 20449f2 Nov 27, 2023
61 checks passed
julianwiedmann added a commit to julianwiedmann/cilium that referenced this pull request May 8, 2024
To let EGW traffic exit the gateway through the correct interface,
we've introduced FIB lookup-driven redirects in the to-netdev path
(cilium#26215). This is needed for cases
where the traffic first hits one interface via the default route, but then
needs to bounce to some other interface that matches the actual egressIP.
In this approach we masquerade the packet on its first pass through
to-netdev, set the SNAT_DONE mark, and then redirect to the actual egress
interface. Due to the SNAT_DONE mark we then skip the SNAT logic in the
second pass through to-netdev.

cilium#29379 then improved the situation for
any EGW traffic that enters the gateway from the overlay network (==
anything that's not by a pod on the gateway). We now redirect in
from-overlay, straight to the actual egress interface and masquerade the
packet there.

Now also harmonize the approach for local pods, and defer the masquerade
until the packet hits the actual egress interface. This simplifies the
overall picture, and also allows us to raise TO_NETWORK datapath trace
events that are enriched with the packet's original source IP.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
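
The two-pass flow described in the first paragraph of this commit message can be sketched as a small userspace model. This is an assumption-laden illustration, not the datapath code: `struct pkt`, `to_netdev_pass()`, the `MARK_SNAT_DONE` bit value, and the `snat_count` counter are all hypothetical, and the FIB lookup is reduced to a comparison against a known egress ifindex.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MARK_SNAT_DONE 0x1u /* hypothetical bit value for this model */

struct pkt {
    uint32_t src_ip;
    uint32_t mark;
    int ifindex;    /* interface the packet is currently leaving through */
    int snat_count; /* how many times the model masqueraded the packet */
};

/*
 * One pass through a modeled to-netdev hook.
 * Returns true if the packet was redirected and will hit the hook again
 * on another interface, false if it leaves through the current one.
 */
static bool to_netdev_pass(struct pkt *p, uint32_t egress_ip, int egress_ifindex)
{
    assert(p != NULL);

    /* Masquerade only if an earlier pass has not already done so. */
    if (!(p->mark & MARK_SNAT_DONE)) {
        p->src_ip = egress_ip;
        p->mark |= MARK_SNAT_DONE;
        p->snat_count++;
    }

    /* Stand-in for the FIB lookup: bounce to the interface owning egress_ip. */
    if (p->ifindex != egress_ifindex) {
        p->ifindex = egress_ifindex;
        return true;
    }
    return false;
}
```

Calling `to_netdev_pass()` twice on a packet that starts on the wrong interface masquerades it on the first call and skips SNAT on the second, because the mark is already set.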
julianwiedmann added a commit to julianwiedmann/cilium that referenced this pull request May 14, 2024
To let EGW traffic exit the gateway through the correct interface,
we've introduced FIB lookup-driven redirects in the to-netdev path
(cilium#26215). This is needed for cases
where the traffic first hits one interface via the default route, but then
needs to bounce to some other interface that matches the actual egressIP.
In this approach we masquerade the packet on its first pass through
to-netdev, set the SNAT_DONE mark, and then redirect to the actual egress
interface. Due to the SNAT_DONE mark we then skip the SNAT logic in the
second pass through to-netdev.

cilium#29379 then improved the situation for
any EGW traffic that enters the gateway from the overlay network (==
anything that's not by a pod on the gateway). We now redirect in
from-overlay, straight to the actual egress interface and masquerade the
packet there.

Now also harmonize the approach for local pods, and defer the masquerade
until the packet hits the actual egress interface. This simplifies the
overall picture. But it also allows us to raise TO_NETWORK datapath trace
events that are enriched with the packet's original source IP - this event
is raised on the *second* pass through to-netdev, so we need the SNAT to
happen at the same time.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
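
To make the trace-event argument concrete, here is a minimal userspace sketch of the deferred variant. It is a model under stated assumptions, not the actual change: `struct pkt`, `struct trace_event`, and `to_netdev_deferred()` are hypothetical names, and the redirect is again reduced to an ifindex comparison. The first pass only redirects; the masquerade and the trace event both happen on the pass through the real egress interface, so the event can still record the pre-SNAT source IP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct pkt {
    uint32_t src_ip;
    int ifindex; /* interface the packet is currently leaving through */
};

/* Hypothetical stand-in for a TO_NETWORK trace event. */
struct trace_event {
    bool raised;
    uint32_t orig_src_ip; /* source IP before masquerading */
    uint32_t new_src_ip;  /* source IP after masquerading */
};

/*
 * One pass through a modeled to-netdev hook with deferred masquerade.
 * Returns true if the packet was redirected to another interface.
 */
static bool to_netdev_deferred(struct pkt *p, uint32_t egress_ip,
                               int egress_ifindex, struct trace_event *ev)
{
    assert(p != NULL && ev != NULL);

    /* Wrong interface: redirect only, leave the source IP untouched. */
    if (p->ifindex != egress_ifindex) {
        p->ifindex = egress_ifindex;
        return true;
    }

    /* Actual egress interface: SNAT and raise the event in the same pass. */
    ev->orig_src_ip = p->src_ip;
    p->src_ip = egress_ip;
    ev->new_src_ip = p->src_ip;
    ev->raised = true;
    return false;
}
```

In this model, no event is raised on the redirecting pass, and the event raised on the final pass carries both the original and the masqueraded address.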
julianwiedmann added a commit to julianwiedmann/cilium that referenced this pull request May 14, 2024
To let EGW traffic exit the gateway through the correct interface,
we've introduced FIB lookup-driven redirects in the to-netdev path
(cilium#26215). This is needed for cases
where the traffic first hits one interface via the default route, but then
needs to bounce to some other interface that matches the actual egressIP.
In this approach we masquerade the packet on its first pass through
to-netdev, set the SNAT_DONE mark, and then redirect to the actual egress
interface. Due to the SNAT_DONE mark we then skip the SNAT logic in the
second pass through to-netdev.

cilium#29379 then improved the situation for
any EGW traffic that enters the gateway from the overlay network (==
anything that's not by a pod on the gateway). We now redirect in
from-overlay, straight to the actual egress interface and masquerade the
packet there.

Now also harmonize the approach for local pods, and defer the masquerade
until the packet hits the actual egress interface. This simplifies the
overall picture. But it also allows us to raise TO_NETWORK datapath trace
events that are enriched with the packet's original source IP - this event
is raised on the *second* pass through to-netdev, so we need the SNAT to
happen at the same time.

Also add a comment to clarify the check to skip HostFW for SNATed traffic.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
github-merge-queue bot pushed a commit that referenced this pull request May 23, 2024
To let EGW traffic exit the gateway through the correct interface,
we've introduced FIB lookup-driven redirects in the to-netdev path
(#26215). This is needed for cases
where the traffic first hits one interface via the default route, but then
needs to bounce to some other interface that matches the actual egressIP.
In this approach we masquerade the packet on its first pass through
to-netdev, set the SNAT_DONE mark, and then redirect to the actual egress
interface. Due to the SNAT_DONE mark we then skip the SNAT logic in the
second pass through to-netdev.

#29379 then improved the situation for
any EGW traffic that enters the gateway from the overlay network (==
anything that's not by a pod on the gateway). We now redirect in
from-overlay, straight to the actual egress interface and masquerade the
packet there.

Now also harmonize the approach for local pods, and defer the masquerade
until the packet hits the actual egress interface. This simplifies the
overall picture. But it also allows us to raise TO_NETWORK datapath trace
events that are enriched with the packet's original source IP - this event
is raised on the *second* pass through to-netdev, so we need the SNAT to
happen at the same time.

Also add a comment to clarify the check to skip HostFW for SNATed traffic.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
joamaki pushed a commit that referenced this pull request May 30, 2024
[ upstream commit cf6b203 ]

To let EGW traffic exit the gateway through the correct interface,
we've introduced FIB lookup-driven redirects in the to-netdev path
(#26215). This is needed for cases
where the traffic first hits one interface via the default route, but then
needs to bounce to some other interface that matches the actual egressIP.
In this approach we masquerade the packet on its first pass through
to-netdev, set the SNAT_DONE mark, and then redirect to the actual egress
interface. Due to the SNAT_DONE mark we then skip the SNAT logic in the
second pass through to-netdev.

#29379 then improved the situation for
any EGW traffic that enters the gateway from the overlay network (==
anything that's not by a pod on the gateway). We now redirect in
from-overlay, straight to the actual egress interface and masquerade the
packet there.

Now also harmonize the approach for local pods, and defer the masquerade
until the packet hits the actual egress interface. This simplifies the
overall picture. But it also allows us to raise TO_NETWORK datapath trace
events that are enriched with the packet's original source IP - this event
is raised on the *second* pass through to-netdev, so we need the SNAT to
happen at the same time.

Also add a comment to clarify the check to skip HostFW for SNATed traffic.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
joamaki pushed a commit that referenced this pull request May 31, 2024
[ upstream commit cf6b203 ]

To let EGW traffic exit the gateway through the correct interface,
we've introduced FIB lookup-driven redirects in the to-netdev path
(#26215). This is needed for cases
where the traffic first hits one interface via the default route, but then
needs to bounce to some other interface that matches the actual egressIP.
In this approach we masquerade the packet on its first pass through
to-netdev, set the SNAT_DONE mark, and then redirect to the actual egress
interface. Due to the SNAT_DONE mark we then skip the SNAT logic in the
second pass through to-netdev.

#29379 then improved the situation for
any EGW traffic that enters the gateway from the overlay network (==
anything that's not by a pod on the gateway). We now redirect in
from-overlay, straight to the actual egress interface and masquerade the
packet there.

Now also harmonize the approach for local pods, and defer the masquerade
until the packet hits the actual egress interface. This simplifies the
overall picture. But it also allows us to raise TO_NETWORK datapath trace
events that are enriched with the packet's original source IP - this event
is raised on the *second* pass through to-netdev, so we need the SNAT to
happen at the same time.

Also add a comment to clarify the check to skip HostFW for SNATed traffic.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Labels
feature/egress-gateway Impacts the egress IP gateway feature. kind/bug This is a bug in the Cilium logic. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

datapath: Redirect from bpf_overlay to egress gw SNAT netdev
2 participants