-
Notifications
You must be signed in to change notification settings - Fork 682
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
collection: Add code to parse kernel config files #960
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the long delay. I think the approach is fine, we should just copy in Timo's code and improve on it. collection.go
is already pretty large, how about you put the new code into a package internal/kconfig
or something like that?
We can probably get away with two (exported) functions:
func ParseKconfig(file string) (map[string]string, error)
func PutKconfigValue([]byte, btf.Type, value string) error
Have you thought about how you're going to test this? We need coverage of the parser and of programs referencing the various kinds of .kconfig (libbpf_tristate, regular bool, int, etc.).
collection.go
Outdated
|
||
path := fmt.Sprintf("/boot/config-%s", kernelRelease) | ||
|
||
ret := kconfig.New() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The kconfig.Kconfig
abstraction seems overkill for us, I'd just port https://github.com/ti-mo/kconfig/blob/0708bf82969f7e34fdc1b0edc60a523a249280fe/parser.go#L11-L49 into internal.ParseKconfig(file string) (map[string]string, error)
or similar. Some things to keep in mind:
- The code in Timo's repo is unlicensed, so we need to get his OK to relicense to MIT. cc @ti-mo
- I think we'll need to avoid
regexp
, since it's a fairly heavy dependency. You can hand write a parser using strings.Cut. Please make sure to add good test coverage here. I'm a bit worried that Kconfig is a fickle format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Already gave my permission offline, but for the record: go ahead! 🙂
collection.go
Outdated
triModule triState = 2 | ||
) | ||
|
||
func addKconfigValueTri(data []byte, typ btf.Type, value string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not take triState
instead of string
?
func addKconfigValueTri(data []byte, typ btf.Type, value string) error { | |
func addKconfigValueTri(data []byte, typ btf.Type, value triState) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the add*()
takes a string
as last argument, I think this is better to keep with this signature for the moment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only understood what you meant after I read through the code: a kconfig can be a tristate, a bool, an integer or a string. It's therefore easiest to represent it as a string most of the time.
collection.go
Outdated
data[0] = byt | ||
case *btf.Enum: | ||
enum := typ.(*btf.Enum) | ||
if enum.Name != "libbpf_tristate" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have a link to where this is defined / how it ends up in the C?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, it is defined here:
https://github.com/libbpf/libbpf/blob/fbd60dbff51c870f5e80a17c4f2fd639eb80af90/src/bpf_helpers.h#L169
I translated the value to some const
as there is no enum
in golang
.
c7e92e9
to
16934f3
Compare
I addressed the majority of your comments, I will begin to write the tests for it by taking inspiration from I am nonetheless wondering about how we should deal with this: Should we add some initial support to |
0ce8c28
to
a1988c0
Compare
As far as I remember we refuse all weak symbols so far (please double check), so just implementing strong sym support for now should be enough? |
941934e
to
ac1991e
Compare
OK, for the moment let's return an error if something is unknown. Regarding testing, so far I added unit tests but I would like to add integration ones (i.e. tests that creates eBPF program). I think I can find a way to add, but I doubt it will be a proper way. |
6fd050f
to
2dd68e2
Compare
I mark this PR as ready for review as I think we are not so far from the truth. I nonetheless would like to get your opinion on several point:
I am also wondering if rather than creating the big |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the important bits are here! Some thoughts:
- Maybe I've mentioned this before, but I'm not a big fan of integration tests. They end up being cumbersome to maintain and usually don't cover a lot of the edge cases. For example, we currently don't test for min/max values of integers, all possible bools, tristates, etc. Doing that through the ELF reader is a big bore. Why not split the problem into two? First, add tests for the various
putKconfigValue...
in packagekconfig
. Make sure we handle out of bounds values correctly. Second, extend the existing kconfig ELF reader test to pull in a very common, stable kconfig likeCONFIG_HZ
. That gets rid of all of theAdditionalKconfig
mess and also allows us to dropPatchKconfig
. - Can you add a small benchmark to measure how expensive parsing kconfig is? We currently parse it even if the BPF doesn't end up requiring it. I think that's the correct choice as long as parsing is not super slow.
testdata/kconfig_config.c
Outdated
// Highly inspired from libbpf test: | ||
// https://github.com/torvalds/linux/blob/6e98b09da931a00bf4e0477d0fa52748bf28fcce/tools/testing/selftests/bpf/progs/test_core_extern.c | ||
// The modifications are the following: | ||
// 1. Removed __weak as cilium/ebpf does not support it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to make sure we don't introduce behaviour that is incompatbile with adding __weak later: what does libbpf do when it finds a symbol that is marked weak but doesn't exist?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's because Lines 655 to 662 in d8d898a
diff --git a/btf/types.go b/btf/types.go
index 3fb5499..38cc76a 100644
--- a/btf/types.go
+++ b/btf/types.go
@@ -657,6 +657,8 @@ func alignof(typ Type) (int, error) {
n = int(t.size())
case *Int:
n = int(t.Size)
+ case *Array:
+ return alignof(t.Type)
default:
return 0, fmt.Errorf("can't calculate alignment of %T", t)
} |
9e6cde8
to
ae82cb2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your review!
I think we are almost ready even though I am wondering about reducing the memory occupancy by only having a map
containing the CONFIG_*
used in the eBPF code rather than all the kernel config.
@lmb Unless it significantly increases code complexity, a user shouldn't need to pay for a feature they don't use. Is there a reason we can't defer parsing kconfig until a lookup is needed?
Sure, but then we might as well not cache the result of the parser at all; you're basically describing populating .ksyms and throwing away the result. I'd say leave this out of scope of this PR. We can experiment with replacing the map with a minimal trie implementation that only supports inserts and lookups. (can be done in 30-40 lines of Go) This may cause more total allocs, but should result in a lower steady-state of consumed memory. Benchmarking is the only way to really know. :) I'd make sure we don't parse kconfig if the caller doesn't need it in any case. (Note: haven't read the patches yet, will give this a pass next week) Not sure if the same has been done here, but in conntracct, I stripped the CONFIG_ prefix of all keys on store and lookup to cut down on memory usage. 4 years later and there are still exclusively entries starting with CONFIG_ in my config.gz, so we could do the same here if we haven't already. |
This makes sense, I just wanted to mimic
This makes sense too, I will check how I can modify the code to add this! |
ae82cb2
to
d4ba2ab
Compare
c39dae9
to
6e9c256
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I polished everything, thank you for the commits!
Nonetheless, the CI does not pass because the virtme-run
kernel does not find a corresponding config under /boot
and was not compiled with CONFIG_IKCONFIG*
.
Should we enable the above CONFIG_IKCONFIG*
for the tested kernels or just skip this test in the CI?
2243bb3
to
8f6522c
Compare
db899a8
to
637a477
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, especially the tests are really thorough! Please take a look at my suggestion to only parse when necessary and let me know what you think.
I've rebuilt all of our CI kernels so that /proc/config.gz is now a thing.
collection.go
Outdated
return fmt.Errorf("cannot find a kconfig file: %w", err) | ||
} | ||
|
||
config, err := kconfig.Parse(scanner) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Timo is right, it would be nice if we only parsed the kconfig when required.
I've updated the benchmark you added to use a bigger file and looked at the results very briefly: it takes around 1ms on my machine.
What if you did the following?
- in the default switch case collect the necessary config items into
map[string][]byte
or similar - after we've iterated the items once, parse kconfig
if len(map[string][]byte) > 0
and fill in the values
That should be doable without too many code changes.
Coming back to the benchmark, most of the time is spent constructing the full map[string]string
. That's fine for now, since it's nice and simple. But later on we could experiment using the map[string][]byte
to only parse the lines we're actually interested in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would be nice if we only parsed the kconfig when required.
This is what is done libbpf
and they even only really deal with value of interest (i.e. CONFIG_*
which are both in eBPF code and kconfig
):
https://github.com/torvalds/linux/blob/0d85b27b0cc6b5cf54567c5ad913a247a71583ce/tools/lib/bpf/libbpf.c#L1906-L1908
We should do the same with something like this: Parse(source io.ReaderAt, configsOfInterest []string)
where the second argument is used a filter, so the return map key are value of configsOfInterest
.
In this way, we avoid having a big map for only few CONFIG_*
in eBPF code.
Note that, I suggested this previously in this PR but my comment should have pass below the radar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, sorry if I didn't see that. I think we might also be able to cut down memory usage to a point where we can afford to keep a cached map around.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem at all, should we implement the Parse(source io.ReaderAt, configsOfInterest []string)
in this PR or in a follow up one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I implemented the filtering in the last two commits of this PR.
If you think this a good addition, I will squash them with the first two and polish the messages.
40d308a
to
b7d2444
Compare
The CI is passing, thank you for adding |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some nits. Can you please update the commit messages? The first commit message describes the old API and the second one doesn't really exist :) So far we've also not used a .
at the end of commit titles, I think it makes sense to stick to that. Can you please remove the .
? You can squash your (first) two commits if you'd like, then there is only one message to write. Maybe you could include that we don't support weak symbols at the moment, or other things which we don't support yet?
1d41aa1
to
2d917f7
Compare
I modified the commit messages so they describe better the actual code. |
This new package exports several functions: 1. Find() which tries to find a kconfig file on the host. 2. Parse(source io.ReaderAt) which parses the kconfig source given as parameter and returns a map[string]string of its content. 3. PutValue(data []byte, typ btf.Type, value string) which translates the value given as parameter depending of the typ and writes the corresponding to data. The whole code is mainly adapted from libbpf C code and from Timo Beckers golang package [1]. Signed-off-by: Francis Laniel <flaniel@linux.microsoft.com> [1]: https://github.com/ti-mo/kconfig
This permits supporting CONFIG_ in eBPF code. For the moment, weak CONFIG_ symbols are not supported. Signed-off-by: Francis Laniel <flaniel@linux.microsoft.com>
test.kconfig is really small and therefore won't give results that match the real world. Use a recent fedora kconfig instead. goos: linux goarch: amd64 pkg: github.com/cilium/ebpf/internal/kconfig cpu: 12th Gen Intel(R) Core(TM) i7-1260P │ base.txt │ │ sec/op │ Parse-16 1.964m ± 6% │ base.txt │ │ B/op │ Parse-16 1.542Mi ± 0% │ base.txt │ │ allocs/op │ Parse-16 9.717k ± 0% Signed-off-by: Lorenz Bauer <lmb@isovalent.com> Signed-off-by: Francis Laniel <flaniel@linux.microsoft.com>
f88f04f
to
9d9dc7b
Compare
Thanks for tidying up! I've removed the two commits since I don't have the time to review them right now, sorry. Let's get this merged and then we can iterate on the performance. |
Thank you for the review and the merge! |
This code uses existing golang package and writes the corresponding value to .kconfig section [1].
It is mainly a translation from C of libbpf code.
Signed-off-by: Francis Laniel flaniel@linux.microsoft.com
[1] https://github.com/ti-mo/kconfig