Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

program: relocation of program targeting a module fails if CONFIG_DEBUG_INFO_BTF_MODULES is disabled #1436

Closed
orishuss opened this issue Apr 16, 2024 · 3 comments · Fixed by #1440
Labels
bug Something isn't working help wanted Extra attention is needed

Comments

@orishuss
Copy link
Contributor

orishuss commented Apr 16, 2024

Describe the bug

Support for btf parsing of kernel modules, recently introduced, was long awaited, but I believe it breaks some bpf programs. The programs affected are ones that are attached to loaded kernel module functions, on certain systems according to kernel configuration.

CONFIG_DEBUG_INFO_BTF_MODULES is a kernel flag, introduced in kernel 5.11, that makes the btf files for kernel modules appear in /sys/kernel/btf. The machine on which I encountered the error is kernel 5.10 which explains why this bug happens, but the same should happen on newer kernels if the flag is not set.

A practical solution could be to make the dependence on the module's specific btf file optional (so that, for example, if /sys/kernel/btf/nf_conntrack_netlink doesn't exist, it just skips it). This way, the module could still use CORE relocations for normal vmlinux structs.

Notice that changing the section name (for example SEC("kprobe/ctnetlink_fill_info") to SEC("kprobe/ctnetlink_fill_info_")) "fixes" the problem because it makes the library only check vmlinux btf, as it can't find what module loaded the function.

Commands outputs

kernel version 5.10.213

package version github.com/cilium/ebpf v0.14.0

missing btf file for module:

~ ls -larth /sys/kernel/btf/
total 0      
drwxr-xr-x   14 root     root           0 Apr 15 10:42 ..
-r--r--r--    1 root     root        3.8M Apr 16 09:48 vmlinux
drwxr-xr-x    2 root     root           0 Apr 16 09:48 .

kernel module is indeed loaded:

~ lsmod | grep nf_conntrack_netlink
nf_conntrack_netlink    61440  0 
nfnetlink              20480  2 nf_tables,nf_conntrack_netlink
nf_conntrack          180224  8 xt_REDIRECT,xt_state,xt_connmark,nf_conntrack_netlink,xt_nat,xt_MASQUERADE,nf_nat,xt_conntrack

kallsyms (where the module name of the function comes from):

~ cat /proc/kallsyms | grep ctnetlink_fill_info
ffffffffxxxxxxxx t ctnetlink_fill_info  [nf_conntrack_netlink]

Output difference between version 0.12.3 and 0.14.0 (the version of the package is the only thing that was changed)

~ ./kmerror-0.12.3
it worked!
~ ./kmerror-0.14.0
2024/04/16 09:46:00 loading objects: field CtnetlinkFillInfo: program ctnetlink_fill_info: apply CO-RE relocations: load kernel module spec: open /sys/kernel/btf/nf_conntrack_netlink: no such file or directory

How to reproduce

use make and run ./kmerror on a machine that is misconfigured

kmerror.zip

@orishuss orishuss added the bug Something isn't working label Apr 16, 2024
@orishuss orishuss changed the title kernel module btf support breaks programs when kernel misconfigured kernel module btf support breaks programs when kernel is misconfigured Apr 16, 2024
@lmb
Copy link
Collaborator

lmb commented Apr 16, 2024

Thanks for the detailed report. @brycekahle maybe this is as easy as not generating an error if module BTF doesn't exist?

@brycekahle
Copy link
Contributor

Yeah, that would probably work. If the program needs the module BTF then the CO-RE relocations will fail and poison the instructions. That error is a little harder to debug, but will still fail regardless.

@lmb
Copy link
Collaborator

lmb commented Apr 17, 2024

The fix is going to be here somewhere:

ebpf/linker.go

Lines 149 to 155 in 7719d2f

if kmodName != "" {
kmodTarget, err := btf.LoadKernelModuleSpec(kmodName)
if err != nil {
return fmt.Errorf("load kernel module spec: %w", err)
}
targets = append(targets, kmodTarget)
}

  • LoadKernelModuleSpec should return a discernible error when the load fails due to the file not existing. Something like errors.Is(err, os.ErrNotFound).
  • The code above should ignore the error if it is due to not being found.

@orishuss are you up for contributing a fix? We can probably test this by using a bogus kmodName and asserting that we don't get an error.

@lmb lmb changed the title kernel module btf support breaks programs when kernel is misconfigured program: relocation of program targeting a module fails if CONFIG_DEBUG_INFO_BTF_MODULES is disabled Apr 17, 2024
@lmb lmb added the help wanted Extra attention is needed label Apr 17, 2024
lmb pushed a commit to orishuss/ebpf that referenced this issue Apr 22, 2024
In kernels where the flag CONFIG_DEBUG_INFO_BTF_MODULES is not set,
including kernels 5.10 and below (because the flag was introduced in 5.11),
loading a program that attaches to kernel module functions and relied on
CORE failed because the module's BTF is not available.

This fix allows the program to run, obviously only as long as it only
relies on the kernel's BTF and not on the specific module's BTF.

Fixes: cilium#1436
Signed-off-by: Ori Shussman <orishuss@gmail.com>
lmb pushed a commit to orishuss/ebpf that referenced this issue Apr 22, 2024
In kernels where the flag CONFIG_DEBUG_INFO_BTF_MODULES is not set,
including kernels 5.10 and below (because the flag was introduced in 5.11),
loading a program that attaches to kernel module functions and relied on
CORE failed because the module's BTF is not available.

This fix allows the program to run, obviously only as long as it only
relies on the kernel's BTF and not on the specific module's BTF.

Fixes: cilium#1436
Signed-off-by: Ori Shussman <orishuss@gmail.com>
@lmb lmb closed this as completed in #1440 Apr 22, 2024
lmb pushed a commit that referenced this issue Apr 22, 2024
In kernels where the flag CONFIG_DEBUG_INFO_BTF_MODULES is not set,
including kernels 5.10 and below (because the flag was introduced in 5.11),
loading a program that attaches to kernel module functions and relied on
CORE failed because the module's BTF is not available.

This fix allows the program to run, obviously only as long as it only
relies on the kernel's BTF and not on the specific module's BTF.

Fixes: #1436
Signed-off-by: Ori Shussman <orishuss@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working help wanted Extra attention is needed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants