datapath: Improved BPF testing framework #20017
It looked like others focused on the details of the test progs themselves, so I focused more on the build systems and general bits that were different from the way that other go tests work in the tree.
I explicitly didn't review the actual test content, because if you delete the full files and then add full files, it's much more difficult to review the diffs. If you're able to combine these in the same commit with minimal diff, that would make review easier for those pieces.
@@ -0,0 +1,443 @@
// SPDX-License-Identifier: Apache-2.0
Similar feedback here, we don't need yet another directory (`test/bpf`, `bpf/tests`, `test/bpf_tests`, `bpf/tests/bpf_tests`) to confuse people even more about where the code should live :-)
IIRC the rationale behind `test/bpf/` vs. `bpf/tests` is the language distinction of Go vs. C, but if we have a reasonable way to satisfy whatever tooling is analyzing these directories then my preference would actually be to just put them all in the same directory. If the tooling doesn't like them being in the same directory (e.g. Go tooling complaining about the C files) then we can keep two separate dirs.
Currently the distinction seems to be that `/bpf` is for C code and the rest for Go. You can't put Go and C files in the same directory since Go will automatically attempt to compile .c files as CGO, so any Go code would have to live in a separate directory anyway. Having 1 Go file in among all the C code felt strange, but I agree that it would be better to co-locate code that works together, perhaps not just for the tests but other parts of cilium as well.
Ah I see. Maybe we should move the unit-test.c (and elf-demo.c?) into bpf/ ..?
`elf-demo.c` I'm less sure about, IIRC that's actually used for the `pkg/elf` Go testing. Could potentially even move that one under something like `test/elf`.
Well anyways this is not such a big deal, we can always move the files later.
test/bpf_tests/bpf_test.go
// Give the global log buf some time to empty
time.Sleep(50 * time.Millisecond)
🤔 I'm always suspicious about sleeps, is there a way we can flush the log or something instead?
cilium/ebpf currently doesn't provide this, though I agree it should, and I believe it is possible, I will see if I can fix that. Would it be acceptable to leave it like this and fix it as soon as a flush feature has been upstreamed in cilium/ebpf? (if that takes longer than this PR)
Maybe a question first, what are the consequences if 50 milliseconds is not enough time to flush the log? Can we easily detect that problem and react to it?
I guess the condition we are looking for is:
- Did the test complete
- Are we blocking
But we can't simply use a `select` to check if we are blocked, since it uses `epoll`. It would be great if the library had a "check without blocking" function, but it doesn't. So to really fix it we need to update cilium/ebpf. Worst case, if the reader does take longer than 50ms, we lose some debug information while in verbose mode, so I personally feel like this doesn't have to be a blocking issue as long as we improve this in a follow-up.
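The "check without blocking" idea from this thread can be illustrated at the system-call level. This is a minimal C sketch and not cilium/ebpf code: `readable_now` and `drain_now` are hypothetical helpers that use `poll(2)` with a zero timeout to ask whether a file descriptor has data, and to read only what is already there, without ever blocking. Whether the reader in cilium/ebpf can expose something equivalent is exactly the open question above.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of cilium/ebpf.
 * Returns 1 if fd has data ready to read, 0 if it does not, -1 on error.
 * A timeout of 0 makes poll() return immediately instead of blocking.
 */
int readable_now(int fd)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int n;

	assert(fd >= 0);

	n = poll(&pfd, 1, 0);
	if (n < 0)
		return -1;

	return (n > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}

/*
 * Hypothetical helper: read whatever is available right now into buf.
 * Stops as soon as the descriptor has nothing more to offer, so the
 * caller never waits. Returns the number of bytes read.
 */
ssize_t drain_now(int fd, char *buf, size_t cap)
{
	size_t total = 0;

	while (total < cap && readable_now(fd) == 1) {
		ssize_t n = read(fd, buf + total, cap - total);

		if (n <= 0)
			break;
		total += (size_t)n;
	}

	return (ssize_t)total;
}
```

With a primitive like this, the condition "test completed and nothing left to read" can be checked directly, which is what the fixed 50 ms sleep only approximates.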
This commit adds a new BPF testing framework for datapath code. This framework will allow us to write unit tests in C which will be executed as eBPF programs in the kernel. This allows us to test our code against the actual kernel facilities and helper calls as well as confirming that code will pass the verifier.
This framework differs from other related works in the sense that we do the test setup, execution and verification all in C/eBPF code instead of having to do half the work in userspace and the other half in eBPF.
A lightweight loader program is used to automatically detect test programs in ELF files in a given directory, load them, and execute them using the `BPF_PROG_TEST_RUN` feature. The test results are passed back to the loader, which converts them into sub-tests within the Go testing framework to allow for easy integration into existing tooling.
Doing setup in C allows us to easily mock out code with `#define` preprocessor tricks, replace tailcalls with stubs/code, and leverage conditional testing depending on which flags are set / features enabled. It also decouples datapath testing from the agent, allowing us to verify that the eBPF programs work as expected separately from the agent / userspace.
Each ELF file can contain multiple `CHECK` programs, each of which becomes a separate program that can fail (`test_fail`/`test_fail_now()`) or pass (the default if not failed). Tests can log messages with `test_log("msg")` for debugging purposes or as a failure message (`test_fatal("msg")` logs and then calls `test_fail_now()`). These messages can include parameters analogous to `bpf_trace_printk`, for example: `test_log("Expected 123, found: %llu", some_var)`. For compact checks, the `assert(some_var == 123)` style can be used, which reports the file and line of the failed assert and stops the test.
These check programs can also define sub-tests with a `TEST("name", {` block, which can pass or fail independently of the parent test. The appeal is that such sub-tests run one after another in the exact same context and can, for example, test different aspects of the same test setup (ctx, map state).
Both `CHECK` and `TEST` can be skipped by calling `test_skip()`/`test_skip_now()`, for example if the test applies to a feature which is disabled or depends on features which are not available in the current kernel version. Skipped tests are explicitly marked as such instead of being silently ignored. Tests can use `#ifdef` preprocessor statements to determine whether to skip.
Tests that do not require tailcalls can do setup, execution and pass/fail checking all in the `CHECK` program. Tests that need to test across multiple tailcalls need to use a `SETUP` program in addition, in which case the setup and execution happen in the `SETUP` program, which exits after the last tailcall is done. The loader runs any `SETUP` program before the `CHECK` program of the same name. The `CHECK` program is invoked with the result context of the `SETUP` program, with the return value of the `SETUP` program prepended to the context (4 bytes). This allows the `CHECK` program to verify the return value, context and map state and determine a pass/fail result.
Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
The `memcpy` and `memcmp` builtin replacements did not work for uneven (odd) sizes other than 1. This commit updates both builtin replacements so they do. This is quite important since it isn't uncommon to copy strings with an uneven number of chars in the new test framework. Additionally, this commit extends the max size of values passed into `memcmp` from 32 bytes to 72 bytes, which is required for some unit tests to compare packets which are larger than 32 bytes. Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
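To make the odd-size problem concrete, here is a minimal plain-C sketch. It is not the actual builtin replacement code from the tree: `sketch_memcpy`, `sketch_memcmp` and `SKETCH_MEMCMP_MAX` are hypothetical names, and the two-bytes-per-step bulk loop is only an assumed stand-in for whatever chunking the real builtins use. The point it shows is that a chunked routine needs an explicit tail step, otherwise the last byte of an odd-sized buffer is silently skipped.

```c
#include <assert.h>
#include <stddef.h>

/* Upper bound taken from the commit message (72 bytes); the name is made up. */
#define SKETCH_MEMCMP_MAX 72

/* Copy len bytes: two bytes per step, then the odd trailing byte if any. */
void sketch_memcpy(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	size_t i = 0;

	/* Bulk part: handles the largest even prefix of the buffer. */
	for (; i + 2 <= len; i += 2) {
		d[i] = s[i];
		d[i + 1] = s[i + 1];
	}

	/* Tail: the single byte left over when len is odd. */
	if (i < len)
		d[i] = s[i];
}

/* Compare len bytes; returns 0 if equal and 1 otherwise. */
int sketch_memcmp(const void *a, const void *b, size_t len)
{
	const unsigned char *x = a;
	const unsigned char *y = b;
	size_t i = 0;

	assert(len <= SKETCH_MEMCMP_MAX);

	for (; i + 2 <= len; i += 2) {
		if (x[i] != y[i] || x[i + 1] != y[i + 1])
			return 1;
	}

	/* Without this tail check, "abc" and "abd" would compare equal. */
	if (i < len && x[i] != y[i])
		return 1;

	return 0;
}
```

Copying a 5-character string such as `"hello"` exercises the tail branch, which is the case the commit message describes as common in the new tests.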
This commit ports most existing unit tests in their different forms to the new eBPF unit test framework. This removes the need for a mocking framework in userspace and replaces the existing BPF_PROG_TEST_RUN based framework in the CI. Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
The const Coccinelle script attempts to detect all function arguments that should be marked as const. To that end, for each function argument, we skip all cases that could lead to that argument being modified. One corner case we need to detect is when we modify an array field in a struct (`func(..., x->field, ...)`). In that case, we can't easily distinguish between a pointer field and an array field. In the former case, the struct is not modified; in the latter case, it is. Since we only have a few cases with array fields in the codebase, we list those manually. Until recently, that was only the addr field in some struct (holding an IPv6). We now need to add the fields for MAC addresses. Signed-off-by: Paul Chaignon <paul@cilium.io>
I have gotten the CI as far as I can take it. The only remaining complaint from checkpatch is about the macros used:
But I can't really "fix" these without altering/breaking the behavior of the macros.
This commit adds a new page to the "For developers/Testing" section. The page covers how to run and create BPF tests within the framework that was added in cilium#20017. Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
This PR adds a new BPF testing framework for datapath code. This framework will allow us to write tests in C which will be executed as eBPF programs in the kernel. This allows us to test our code against the actual kernel facilities and helper calls as well as confirming that code will pass the verifier.
This framework differs from other related works in the sense that we do the test setup, execution and verification all in C/eBPF code instead of having to do half the work in userspace and the other half in eBPF.
A lightweight loader program is used to automatically detect test programs in ELF files in a given directory, load them, and execute them using the `BPF_PROG_TEST_RUN` feature. The test results are passed back to the loader, which converts them into sub-tests within the Go testing framework to allow for easy integration into existing tooling.
Doing setup in C allows us to easily mock out code with `#define` preprocessor tricks, replace tailcalls with stubs/code, and leverage conditional testing depending on which flags are set / features enabled. It also decouples datapath testing from the agent, allowing us to verify that the eBPF programs work as expected separately from the agent / userspace.
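The `#define` mocking mentioned above can be sketched in plain C. Everything in this example is hypothetical and chosen only for illustration: `policy_lookup`, `mock_policy_lookup`, `handle_packet` and the verdict values are not taken from the Cilium datapath. The technique is standard preprocessor behavior: define a macro that renames the dependency before the code under test is compiled, so its calls land in the stub.

```c
#include <assert.h>

/* Hypothetical verdict codes, used only in this sketch. */
enum { VERDICT_PASS = 0, VERDICT_DROP = 2 };

/* Stand-in for a helper that would normally consult external state. */
int policy_lookup(int id)
{
	(void)id;
	return -1;
}

/* Stub used only by the test: allows id 42 and denies everything else. */
int mock_policy_lookup(int id)
{
	return id == 42;
}

/* From here on, calls to policy_lookup() are rewritten to the stub. */
#define policy_lookup mock_policy_lookup

/*
 * Code under test. In a real test file this would typically be pulled
 * in with an #include placed after the #define above.
 */
int handle_packet(int id)
{
	return policy_lookup(id) ? VERDICT_PASS : VERDICT_DROP;
}

#undef policy_lookup

/* Plain-C stand-in for a check: verifies that the stub is in effect. */
void check_handle_packet(void)
{
	assert(handle_packet(42) == VERDICT_PASS);
	assert(handle_packet(7) == VERDICT_DROP);
}
```

Because the unmocked `policy_lookup` returns a non-zero value for every id, the second assertion only holds when the stub has replaced it, which makes the effect of the macro visible.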
Each ELF file can contain multiple `CHECK` programs, each of which becomes a separate program that can fail (`test_fail`/`test_fail_now()`) or pass (the default if not failed). Tests can log messages with `test_log("msg")` for debugging purposes or as a failure message (`test_fatal("msg")` logs and then calls `test_fail_now()`). These messages can include parameters analogous to `bpf_trace_printk`, for example: `test_log("Expected 123, found: %llu", some_var)`. For compact checks, the `assert(some_var == 123)` style can be used, which reports the file and line of the failed assert and stops the test.
These check programs can also define sub-tests with a `TEST("name", {` block, which can pass or fail independently of the parent test. The appeal is that such sub-tests run one after another in the exact same context and can, for example, test different aspects of the same test setup (ctx, map state).
Both `CHECK` and `TEST` can be skipped by calling `test_skip()`/`test_skip_now()`, for example if the test applies to a feature which is disabled or depends on features which are not available in the current kernel version. Skipped tests are explicitly marked as such instead of being silently ignored. Tests can use `#ifdef` preprocessor statements to determine whether to skip.
Tests that do not require tailcalls can do setup, execution and pass/fail checking all in the `CHECK` program. Tests that need to test across multiple tailcalls need to use a `SETUP` program in addition, in which case the setup and execution happen in the `SETUP` program, which exits after the last tailcall is done. The loader runs any `SETUP` program before the `CHECK` program of the same name. The `CHECK` program is invoked with the result context of the `SETUP` program, with the return value of the `SETUP` program prepended to the context (4 bytes). This allows the `CHECK` program to verify the return value, context and map state and determine a pass/fail result.
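The hand-off from `SETUP` to `CHECK` described above (a 4-byte return value placed in front of the result context) can be sketched as a buffer layout. This is a plain-C illustration under stated assumptions, not the loader's implementation: the function names are hypothetical, and host byte order for the 4-byte prefix is an assumption that the description does not confirm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The 4 bytes mentioned in the description; the macro name is made up. */
#define RET_PREFIX_LEN 4

/*
 * Build the input for the check step:
 *   [ 4-byte setup return value | setup result context ]
 * Returns the total length written, or 0 if out is too small.
 * Host byte order is assumed for the prefix.
 */
size_t build_check_ctx(uint32_t setup_ret, const uint8_t *ctx, size_t ctx_len,
		       uint8_t *out, size_t out_cap)
{
	if (out_cap < RET_PREFIX_LEN || ctx_len > out_cap - RET_PREFIX_LEN)
		return 0;

	memcpy(out, &setup_ret, RET_PREFIX_LEN);
	memcpy(out + RET_PREFIX_LEN, ctx, ctx_len);

	return RET_PREFIX_LEN + ctx_len;
}

/* What the check side would do first: recover the setup return value. */
uint32_t read_setup_ret(const uint8_t *buf, size_t len)
{
	uint32_t ret;

	assert(len >= RET_PREFIX_LEN);
	memcpy(&ret, buf, RET_PREFIX_LEN);

	return ret;
}

/* The original context starts right after the 4-byte prefix. */
const uint8_t *ctx_payload(const uint8_t *buf)
{
	return buf + RET_PREFIX_LEN;
}
```

A check written against this layout would first compare `read_setup_ret()` with the expected return value and then inspect the bytes at `ctx_payload()`, mirroring the "return value, context and map state" order given in the description.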
Fixes: #18480