Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix 5.10+ complexity issue with kubeProxyReplacement=disabled #16084

Merged
merged 3 commits into from May 12, 2021

Conversation

pchaigno
Copy link
Member

@pchaigno pchaigno commented May 11, 2021

This pull request fixes #14726, our complexity issue with kubeProxyReplacement=disabled when running Linux 5.10+. The fix is less general than originally hoped and only applies to 5.10+; other complexity issues are therefore not fixed by this change. With a bit more work, we may able to extend it to 5.1+ in the future.

See commits for details. As a summary:

  1. Implements a couple changes to allow our BPF datapath to pass the verifier when compiled with mattr=+alu32.
  2. Compile our BPF datapath with mcpu=v3 (implies mattr=+alu32) on kernels 5.10+.
  3. Disable a workaround for Complexity issue on 5.10+ with kubeProxyReplacement=disabled #14726 in tests, to make sure the complexity issue is fixed.

In addition to the end-to-end tests, these changes were also tested on kernels 5.10 and 5.11 with datapath configurations KERNEL=419 and KERNEL=netnext.

@pchaigno pchaigno added kind/bug This is a bug in the Cilium logic. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. release-note/bug This PR fixes an issue in a previous release of Cilium. release-blocker/1.10 labels May 11, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from master in 1.10.0-rc2 May 11, 2021
@pchaigno pchaigno marked this pull request as ready for review May 11, 2021 19:45
@pchaigno pchaigno requested a review from a team May 11, 2021 19:45
@pchaigno pchaigno requested review from a team as code owners May 11, 2021 19:45
mattr=+alu32, supported since LLVM 7.0 and implied by mcpu=v3, enables
the use of 32-bit registers in BPF bytecode. Enabling this compiler
option can however result in loading issues as illustrated below.

    12: (61) r1 = *(u32 *)(r0 +80) // ctx->data_end
    13: (61) r6 = *(u32 *)(r0 +76) // ctx->data
    14: (bc) w7 = w6 // <- verifier looses track of inferred pkt type here.
    [...]
    38: (71) r1 = *(u8 *)(r7 +20)
    R7 invalid mem access 'inv'

These errors typically happen because the data and data_end pointers are
actually 32-bit registers. Depending on how these pointers are used,
LLVM sometimes makes use of that assumption (e.g., 32-bit assignment on
instruction 14 above). The verifier is however not able to follow and
reject such programs.

We can usually work around those by ensuring these pointers are only
used via 64-bit types. This commit implements this wherever needed to
pass the verifier.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Paul Chaignon <paul@cilium.io>
Set mcpu=v3 in the compiler on kernels 5.10+ to use all available eBPF
instructions and 32-bit registers. This change fixes the complexity
issue we're hitting on v5.10+ when socket-level load balancing is disabled
(via enable-host-services=false or kube-proxy-replacement=disabled).

Using the third eBPF instruction set doesn't reduce complexity for all
BPF programs but it leads to more standard numbers, with less variations
in complexities. A big part of this improvement is due to the implicit
use of mattr=+alu32 to enable 32-bit eBPF registers.

In addition to the end-to-end test on bpf-next, this change was tested
on kernels 5.10 and 5.11 with the existing verifier-test.sh, compiling
the datapath with both KERNEL=netnext and KERNEL=419.

Signed-off-by: Paul Chaignon <paul@cilium.io>
On master and with kernels 5.10+, we have a complexity issue when
ENABLE_HOST_SERVICES_FULL is undefined (i.e., socket-level load balancing
is disabled and additional code compiled in bpf_lxc as a replacement).
Our verifier test included a workaround for that issue, by always
defining ENABLE_HOST_SERVICES_FULL on newer kernels.

This commit removes that workaround since the previous commit fixed the
complexity issue.

Signed-off-by: Paul Chaignon <paul@cilium.io>
@pchaigno
Copy link
Member Author

test-me-please

Copy link
Member

@borkmann borkmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from master to Backport pending to v1.10 in 1.10.0-rc2 May 14, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.10 to Needs backport from master in 1.10.0-rc2 May 14, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.10 to Needs backport from master in 1.10.0-rc2 May 14, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from master to Backport pending to v1.10 in 1.10.0-rc2 May 14, 2021
@brb
Copy link
Member

brb commented May 16, 2021

@pchaigno Don't we want to backport this to v1.9? Asking, as #14960 was reporting the complexity issue.

@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.10 to Backport done to v1.10 in 1.10.0-rc2 May 17, 2021
@pchaigno
Copy link
Member Author

pchaigno commented May 17, 2021

@pchaigno Don't we want to backport this to v1.9? Asking, as #14960 was reporting the complexity issue.

@brb It's on my radar to backport to both v1.9 and v1.8. I want to finish testing with the latest BPF security fixes first as those will be backported to all LTS kernels. Looks good so far, but I didn't run the full e2e tests yet.

@pchaigno
Copy link
Member Author

Marking for backport to v1.9.

@diversario
Copy link
Contributor

@pchaigno Don't we want to backport this to v1.9? Asking, as #14960 was reporting the complexity issue.

@brb It's on my radar to backport to both v1.9 and v1.8. I want to finish testing with the latest BPF security fixes first as those will be backported to all LTS kernels. Looks good so far, but I didn't run the full e2e tests yet.

Is this not going to be backported to 1.8 after all?

@pchaigno
Copy link
Member Author

Is this not going to be backported to 1.8 after all?

Probably not. It's a fair bit of effort to backport to stable branches because several other PRs need to be backported as well. v1.8 is also not supported since the release of v1.11.

@pchaigno pchaigno added the kind/complexity-issue Relates to BPF complexity or program size issues label Mar 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug This is a bug in the Cilium logic. kind/complexity-issue Relates to BPF complexity or program size issues ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
Projects
No open projects
1.10.0-rc2
Backport done to v1.10
Development

Successfully merging this pull request may close these issues.

Complexity issue on 5.10+ with kubeProxyReplacement=disabled
10 participants