Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpf: fix iptables masquerading for node -> remote pod traffic #16136

Merged
merged 2 commits into from Jun 22, 2021

Conversation

jibi
Copy link
Member

@jibi jibi commented May 13, 2021

When Cilium runs with KPR, host-firewall or bandwidth manager, it will
try to auto-derive one or more devices to which the bpf_host program is
attached.

This program will, among other things, redirect ingress traffic destined
to a pod to the pod's lxc device using bpf_redirect().

This causes the traffic to bypass the nf_conntrack table, leading to a
situation where traffic leaving the pod after the connection's been
established will be (incorrectly) masqueraded in case Iptables
masquerading is enabled, since the connection is not tracked by
netfilter.

This commit fixes this by skipping bpf_redirect() when we detect this
case (i.e. traffic is flowing through bpf_host attached to a physical
device and Cilium has installed Iptables rules which require conntrack).

Fixes: #14859.
Fixes: #12205.

@jibi jibi added sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. release-note/bug This PR fixes an issue in a previous release of Cilium. needs-backport/1.10 labels May 13, 2021
@jibi jibi requested a review from brb May 13, 2021 08:29
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from master in 1.10.0-rc2 May 13, 2021
@jibi jibi force-pushed the pr/jibi/fix-dev-masq branch 3 times, most recently from 489c425 to 170258f Compare May 14, 2021 09:00
@jibi
Copy link
Member Author

jibi commented May 14, 2021

test-me-please

@jibi jibi marked this pull request as ready for review May 14, 2021 14:06
@jibi jibi requested a review from a team May 14, 2021 14:06
@jibi jibi requested a review from a team as a code owner May 14, 2021 14:06
@jibi jibi requested a review from nebril May 14, 2021 14:06
@jibi jibi assigned pchaigno and unassigned pchaigno May 14, 2021
@jibi jibi requested a review from pchaigno May 14, 2021 14:07
Copy link
Member

@pchaigno pchaigno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the same fix required for the IPv6 path?

bpf/bpf_host.c Outdated Show resolved Hide resolved
@jibi
Copy link
Member Author

jibi commented May 14, 2021

Isn't the same fix required for the IPv6 path?

Right, I forgot about that since in direct routing mode pods can only have v4 IPs assigned to (as native-routing-cidr is a v4 CIDR), but yeah, let's fix that path as well since we are on it

@brb
Copy link
Member

brb commented May 16, 2021

since in direct routing mode pods can only have v4 IPs assigned to (as native-routing-cidr is a v4 CIDR)

In the IPv6 case the following used for the MASQ exclusion - https://github.com/cilium/cilium/blob/master/pkg/datapath/config.go#L141.

it will try to auto-derive a device to which the bpf_host program is attached.

Nit: It can derive more than one devices.

Anyway, it's getting more difficult to wrap my head around NO_REDIRECT usage and implications. I've created the issue for that #16166. I guess we need to fix your particular case anyway before doing my suggested change.

@aanm aanm added this to Needs backport from master in 1.10.0-rc3 May 17, 2021
@aanm aanm removed this from Needs backport from master in 1.10.0-rc2 May 17, 2021
@jibi
Copy link
Member Author

jibi commented May 17, 2021

since in direct routing mode pods can only have v4 IPs assigned to (as native-routing-cidr is a v4 CIDR)

In the IPv6 case the following used for the MASQ exclusion - https://github.com/cilium/cilium/blob/master/pkg/datapath/config.go#L141.

it will try to auto-derive a device to which the bpf_host program is attached.

Nit: It can derive more than one devices.

Anyway, it's getting more difficult to wrap my head around NO_REDIRECT usage and implications. I've created the issue for that #16166. I guess we need to fix your particular case anyway before doing my suggested change.

I think ideally we should move most of this conditional logic to userspace, and have NO_REDIRECT being a single source of truth for when we actually need to skip redirect (without the need to check for encap vs direct routing or nodeport being enabled in the datapath)

@jibi jibi marked this pull request as draft May 17, 2021 08:27
@jibi
Copy link
Member Author

jibi commented May 17, 2021

test-me-please

@brb
Copy link
Member

brb commented May 17, 2021

I think ideally we should move most of this conditional logic to userspace, and have NO_REDIRECT being a single source of truth for when we actually need to skip redirect (without the need to check for encap vs direct routing or nodeport being enabled in the datapath)

@jibi 👍 👍 👍

@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from master to Backport pending to v1.9 in 1.9.9 Jul 5, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.9 to Needs backport from master in 1.9.9 Jul 5, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.9 to Needs backport from master in 1.9.9 Jul 5, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from master to Backport pending to v1.9 in 1.9.9 Jul 5, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from master to Backport pending to v1.9 in 1.9.9 Jul 5, 2021
qmonnet added a commit to qmonnet/cilium that referenced this pull request Jul 15, 2021
This reverts commit be4e93e.

Issue cilium#12205 has been fixed via cilium#16136, and the Host Firewall can be
used again in the tests.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
qmonnet added a commit to qmonnet/cilium that referenced this pull request Jul 15, 2021
The test was disabled because of issue cilium#12205: When bpf_host was loading
on the native device, the source identity of packet on the destination
node was resolved to WORLD and policy enforcement would fail.

This has now been fixed via cilium#16136, and we can run the test again.

Also adjust the conditions for the test, to reflect the changes to
surrounding IPSec tests from f1209d0 ("test: Enable IPSec tests on
4.19").

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
aanm pushed a commit that referenced this pull request Jul 15, 2021
This reverts commit be4e93e.

Issue #12205 has been fixed via #16136, and the Host Firewall can be
used again in the tests.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
aanm pushed a commit that referenced this pull request Jul 15, 2021
The test was disabled because of issue #12205: When bpf_host was loading
on the native device, the source identity of packet on the destination
node was resolved to WORLD and policy enforcement would fail.

This has now been fixed via #16136, and we can run the test again.

Also adjust the conditions for the test, to reflect the changes to
surrounding IPSec tests from f1209d0 ("test: Enable IPSec tests on
4.19").

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
@joestringer joestringer added this to Backport pending to v1.9 in 1.9.10 Jul 19, 2021
@joestringer joestringer removed this from Backport pending to v1.9 in 1.9.9 Jul 19, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Backport done to v1.9 in 1.9.9 Jul 19, 2021
@joestringer joestringer removed this from Backport pending to v1.9 in 1.9.10 Jul 19, 2021
krishgobinath pushed a commit to krishgobinath/cilium that referenced this pull request Oct 20, 2021
This reverts commit be4e93e.

Issue cilium#12205 has been fixed via cilium#16136, and the Host Firewall can be
used again in the tests.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
krishgobinath pushed a commit to krishgobinath/cilium that referenced this pull request Oct 20, 2021
The test was disabled because of issue cilium#12205: When bpf_host was loading
on the native device, the source identity of packet on the destination
node was resolved to WORLD and policy enforcement would fail.

This has now been fixed via cilium#16136, and we can run the test again.

Also adjust the conditions for the test, to reflect the changes to
surrounding IPSec tests from f1209d0 ("test: Enable IPSec tests on
4.19").

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
Projects
No open projects
1.10.2
Backport done to v1.10
1.9.9
Backport done to v1.9
8 participants