Support for kernel module BTF #705
Comments
Thanks for the writeup, I think I have a better understanding of the full picture. I'm still not sure why my suggestion without an explicit Merge() is bad. LoadSplitSpecFromReader should be implemented onto Spec itself and should add those kmod types to the spec.
I don't see how having split Specs is useful, sorry. Is there a strong need for this? Note on older kernels and e.g. btfhub: split BTF isn't a thing there, it's just one blob. Need to check if kmods are even included there, not sure. Re: reinflating, that could also be solved by always merging kmods into Spec and indexing them by their BTF IDs, or even by Handle? That gives us both a BTF id (so we can reiterate and avoid re-parsing) as well as an fd and typeid for use in specifying attach targets and we wouldn't need Handle.Spec() either. Not sure if kmod BTFs get new IDs when they get reloaded, though. I like your idea of iterating them only when vmlinux lookup fails. With this caching in place, that's something we can perform for every prog without too much impact. |
For CO-RE we need to query the superset of vmlinux + all kmods, but we don't care whether the type is from a module or vmlinux proper. Your preferred solution works well here. For attach points (fentry/fexit), we need to query vmlinux and all kmods individually, since we care where a function resides. Only allowing merged Specs falls flat here.
This is of limited use actually. For CO-RE we can get away with not providing kmod type IDs at all (hopefully). For fentry / fexit we do need type IDs, but we also need to know which kmod they came from! See below for your idea re indexing by ID.
Turns out BTF IDs can be reused if no handle is open to them and their program / module is unloaded. It's unlikely (as in, you need to go through close to 2^32 objects), but not worth adding an edge case for.
I take this to mean: "a Spec that contains only the types contained in the raw BTF". It's a natural way to represent the result of
Including vmlinux types in kmod types would cause us to do more work for each kmod we have to search. Furthermore, such a split spec composes well with
Gut feeling: the ID will change.
Yes, otherwise I would have changed my proposal accordingly :P I started out favouring your idea, but changed my mind as I started writing this document. |
Load the bpf_testmod into the kernel when booting the test VM. This allows getting rid of the workaround to load vmlinux at BTF ID 1, since loading a module also loads vmlinux BTF. Later on we can use this to test split BTF / kmod support. Updates cilium#705
I am not a big fan of this method, as this might give a wrong impression to users that BTFs are composable which is not the case - you can "merge" at most two BTFs (base and split). Merging multiple split BTFs will mess up type IDs. |
That's why the function has the following "caveat":
So, if you need typeIDs you will have to deal with un-merged Spec. If you don't care about type IDs you can merge away. It's awkward, but solves the issue that we need a way for users to merge "external" kernel BTF to pass in Does this make more sense? |
Are we aware of any real life use cases for merging more than two BTFs? |
We'd use it in |
Here are a few extra constraints I had in mind that I don't find mentioned explicitly, although these might be described by your pseudocode above:
From my perspective, allowing Specs to be merged already adds 90% of this complexity. I propose going just one step further and indexing 'incoming' kmod Specs by some type of identifier so we can also retrieve type IDs of kmod types after they've been merged. Having tags just like progs would be really nice here, perhaps we can calculate something similar? That could be a solution if we can't index by btf ID, since those can and do change. Perhaps a checksum over the raw blob? This should be doable since Specs are always instantiated from raw bytes. |
Can't we just extend the |
Agreed with those. The last point is possible by querying a split spec obtained via
That is only useful if we can go from the identifier to a BTF ID somehow. Keep in mind that we can't just serialise the module types and load that into the kernel to get an ID, we explicitly have to go via NewHandleFromID.
The problem lies in what to key the modules by, see above. |
We currently decode BTF when calling NewHandleFromID. This can be very expensive for large BTF blobs like vmlinux, even though not all users will want to inspect decoded BTF in the first place. Move the decoding into Handle.Spec, which is a breaking change. Updates cilium#705
Kernel modules only expose split BTF when queried via BPF_OBJ_GET_INFO_BY_FD. NewHandleFromID therefore doesn't work when trying to obtain a handle for a kernel module BTF. At the same time, always parsing BTF when creating a handle from an ID is slow and wasteful. Change Handle.Spec to accept a base Spec to support split BTF from modules. This is a breaking change. Updates cilium#705
Kernel modules only expose split BTF when queried via BPF_OBJ_GET_INFO_BY_FD. This causes NewHandleFromID to fail for kmod BTF, since the function attempts to inflate types. Avoid inflating types in NewHandleFromID and change Handle.Spec to accept a base Spec to allow decoding types from modules. This is a breaking change. Updates cilium#705
Allow tracing functions in kernel modules via fentry, fexit, fmod_ret and tp_btf programs. The behaviour follows libbpf and is transparent to the user: if we can't find a target in vmlinux we attempt to find it in any loaded kernel module. Refactor TestProgramTypeLSM to test attaching via BTF. This removes LSM tests for BPF_F_SLEEPABLE, since they don't exercise library behaviour beyond passing ProgramSpec.Flags to the kernel. Updates cilium#705
I think I've come round to the fact that |
Does this issue mean that a function exposed by an out-of-tree kernel module, for example bpf_testmod_test_mod_kfunc in this upstream kernel module example, can't be used by an eBPF program loaded by cilium/ebpf? Or am I missing some details, and this is not possible at the kernel/kernel-module/bpf level at all? |
@bonifaido kfuncs are a separate feature, which the library currently doesn't support. Timo Beckers is looking into it at the moment, however. I'm not sure whether it's possible to call kfuncs from modules. Your link seems broken? |
Sorry for the broken link, just fixed it. After experimenting with a plain kernel module and eBPF I can tell that it is possible to invoke a kernel-module function exposed with |
Something to figure out: does opening a Handle to a kernel module, then unloading and loading the module, result in a new BTF ID being allocated for the kmod BTF? |
btfhub currently doesn't include kmod BTF, but we are looking into adding it. It seems like a single blob is easier on the consumption side, but requires us to generate + merge + dedup BTF for all the modules. We may be able to take the split generated BTF and concatenate it while adjusting ID numbers up to prevent conflicts. I'm looking into the specifics of how to best accomplish that now. |
Looks like split generation already does this part, although it seems the BTF type IDs of different kernel modules can conflict with each other. |
Yes, if the vmlinux base has type IDs 1-100, all modules will start their IDs at 101, so they all overlap. They can't really be merged into a single blob, unless BTFhub would roll its own, e.g. gzip them, but then users would need to take care to unarchive them and pass them to the lib individually. |
We were talking about adjusting the type IDs to prevent conflict, so they could be merged into a single blob. I'm not sure the simpler loading story is worth all the extra preprocessing work though, especially since we would have to build some of these tools. |
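The ID overlap and the "adjust IDs upwards" idea from the comments above can be illustrated with a toy model. This is only a sketch: toyType and renumber are hypothetical names, each type references at most one other type, and nothing here reads or writes real BTF.

```go
package main

import "fmt"

// toyType is a hypothetical stand-in for a BTF type: it has an ID and may
// reference one other type by ID (0 means "no reference").
type toyType struct {
	ID  uint32
	Ref uint32
}

// renumber shifts every ID above baseLen by offset. IDs in 1..baseLen point
// into the shared vmlinux base and are left untouched.
func renumber(split []toyType, baseLen, offset uint32) []toyType {
	shift := func(id uint32) uint32 {
		if id > baseLen {
			return id + offset
		}
		return id
	}
	out := make([]toyType, len(split))
	for i, t := range split {
		out[i] = toyType{ID: shift(t.ID), Ref: shift(t.Ref)}
	}
	return out
}

func main() {
	const baseLen = 100 // assumption: vmlinux owns type IDs 1..100

	modA := []toyType{{ID: 101, Ref: 7}, {ID: 102, Ref: 101}}
	modB := []toyType{{ID: 101, Ref: 42}}

	// Both modules start numbering at 101, so their IDs collide as-is.
	fmt.Println("collision:", modA[0].ID == modB[0].ID)

	// Shift modB past modA's types so that every ID is unique.
	modB = renumber(modB, baseLen, uint32(len(modA)))
	fmt.Println("modB after renumbering:", modB[0].ID, modB[0].Ref)
}
```

Note that the renumbered IDs no longer match what a running kernel would report for the module, which is one reason type IDs from such a combined blob could not be used for attach targets.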
Hmm, sounds like a use case for |
Builder doesn't allow writing out split BTF (yet! patches welcome) but it'll do the renumbering, etc. just fine. Another missing piece is deduplication. |
Newer kernels ship with separate BTF for kernel modules (kmod):
The library is currently oblivious to their existence. This means that kmod BTF is not taken into account for CO-RE relocations and it's not possible to attach fentry / fexit (others?) programs to kernel modules. Both should be supported without the user needing to take additional steps in the happy case.
Proposed API
CO-RE
CO-RE is currently done entirely in the library, by matching types used by the BPF program to types found in the kernel.
To extend CO-RE to kmods, we should merge the set of vmlinux and all kmod types and search this superset. Even though a kprobe is attached to a particular kmod's function, it might still want to pull types from other modules, for example to cast void* to appropriate types. I believe this is what libbpf does.
Creating the superset of types will be done by Spec.Merge(). Merged types won't have a canonical type ID, so Spec.TypeID() will return a well known error.
Problem: bpf_core_type_id_kernel is supposed to return the ID of a type in the kernel. According to the documentation we could probably return 0 and be "compatible":
The most interesting use case of the target type ID is probably bpf_snprintf_btf.
Builtin vs. external BTF
Newer upstream kernels ship BTF for vmlinux and kmods in /sys/kernel/btf. On older kernels and when distros disable BTF the user needs to provide this data out of band.
To use external BTF, users load a *btf.Spec and pass it via ProgramOptions.KernelTypes.
To support kmods for external BTF, LoadSplitSpecFromReader allows reading split BTF from disk, which is followed by repeated calls to Spec.Merge.
To support kmods for builtin BTF, changes to NewHandleFromID and Handle.Spec are necessary. To avoid parsing vmlinux on each call to NewHandleFromID we move inflating into Handle.Spec. Additionally Handle.Spec will allow specifying a base Spec.
fentry, fexit, etc.
fentry, etc. programs require BTF information in the syscall boundary:
To support kmods, we need to set attach_btf_obj_fd and attach_btf_id. The obj_fd has to come from a call to NewHandleFromID, it can't come from a file on disk. The library currently has a code path that allows using external BTF via ProgramOptions.KernelTypes to find attach_btf_id for vmlinux targets, but that is most likely broken, since a kernel without builtin BTF will refuse attaching fentry, etc.
Going forward, ProgramOptions.KernelTypes is ignored for the purpose of finding an attach target. Instead we will first look for the attach target in /sys/kernel/btf/vmlinux. If successful, we use attach_btf_obj_fd == 0. Otherwise we will iterate BTF IDs using BPF_BTF_GET_NEXT_ID to find kmod BTF. Via NewHandleFromID and Handle.Spec we then search the kmod. If the target is found we use the handle and retrieve the (aliased) type ID via Spec.TypeID.
Problem: this requires re-inflating kmods for each program, which is very expensive. Need to cache this via handleCache somehow?
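The lookup order described above, together with one possible answer to the caching question, can be sketched with a toy model. Everything here is hypothetical: toyBTF, finder and target are illustrative names, the cache is keyed by BTF object ID as an assumption, and "decoding" is simulated by a counter instead of real BTF parsing.

```go
package main

import "fmt"

// toyBTF is a hypothetical stand-in for a loaded BTF object (vmlinux or kmod).
type toyBTF struct {
	id    uint32            // BTF object ID, as found via BPF_BTF_GET_NEXT_ID
	funcs map[string]uint32 // function name -> type ID
}

// target describes where an attach point was found.
type target struct {
	objID  uint32 // 0 means vmlinux, mirroring attach_btf_obj_fd == 0
	typeID uint32
}

// finder searches vmlinux first and falls back to kmods, caching each
// kmod's decoded function table so later lookups skip the expensive step.
type finder struct {
	vmlinux toyBTF
	kmods   []toyBTF
	cache   map[uint32]map[string]uint32
	decodes int // counts simulated "inflate" operations
}

func newFinder(vmlinux toyBTF, kmods ...toyBTF) *finder {
	return &finder{vmlinux: vmlinux, kmods: kmods, cache: make(map[uint32]map[string]uint32)}
}

// decode simulates inflating a kmod's BTF, at most once per object ID.
func (f *finder) decode(k toyBTF) map[string]uint32 {
	if funcs, ok := f.cache[k.id]; ok {
		return funcs
	}
	f.decodes++
	f.cache[k.id] = k.funcs
	return k.funcs
}

// find returns the attach target for name, preferring vmlinux.
func (f *finder) find(name string) (target, bool) {
	if id, ok := f.vmlinux.funcs[name]; ok {
		return target{objID: 0, typeID: id}, true
	}
	for _, k := range f.kmods {
		if id, ok := f.decode(k)[name]; ok {
			return target{objID: k.id, typeID: id}, true
		}
	}
	return target{}, false
}

func main() {
	f := newFinder(
		toyBTF{id: 1, funcs: map[string]uint32{"vfs_read": 10}},
		toyBTF{id: 2, funcs: map[string]uint32{"nf_hook": 101}},
		toyBTF{id: 3, funcs: map[string]uint32{"bpf_testmod_test_read": 101}},
	)

	fmt.Println(f.find("vfs_read"))
	fmt.Println(f.find("bpf_testmod_test_read"))
	fmt.Println(f.find("bpf_testmod_test_read"))
	fmt.Println("decodes:", f.decodes)
}
```

Keying the cache by BTF object ID inherits the ID reuse caveat raised in the comments, so a real implementation would need to decide how to invalidate entries when a module is unloaded.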