Support for kernel module BTF #705

Closed
4 tasks done
lmb opened this issue Jun 10, 2022 · 20 comments
Labels
enhancement New feature or request

Comments

@lmb
Collaborator

lmb commented Jun 10, 2022

Newer kernels ship with separate BTF for kernel modules (kmod):

$ ls /sys/kernel/btf/ | head -n5
aes_ce_blk
aes_ce_cipher
aes_neon_blk
aes_neon_bs
async_memcpy

The library is currently oblivious to their existence. This means that kmod BTF is not taken into account for CO-RE relocations and it's not possible to attach fentry / fexit (others?) programs to kernel modules. Both should be supported without the user needing to take additional steps in the happy case.

  • LoadSplitSpecFromReader
  • CO-RE / Merge BTF
  • Load kmod BTF
  • fentry / fexit

Proposed API

// LoadSplitSpecFromReader loads split BTF from a reader.
//
// Types from base are used to resolve references in the split BTF.
// The returned Spec only contains types from the split BTF, not from the base.
func LoadSplitSpecFromReader(r io.ReaderAt, base *Spec) (*Spec, error)

// Merge the types from other into the current Spec.
//
// Merged types don't have a canonical type ID, and therefore TypeID() will
// return an error for them.
func (*Spec) Merge(other *Spec) error

// Spec retrieves the types associated with the Handle from the kernel.
//
// base is used to decode split BTF and may be nil.
func (h *Handle) Spec(base *Spec) (*Spec, error) // BREAKING CHANGE

CO-RE

CO-RE is currently done entirely in the library, by matching types used by the BPF program to types found in the kernel.

To extend CO-RE to kmods, we should merge the set of vmlinux and all kmod types and search this superset. Even though a kprobe is attached to a particular kmod's function, it might still want to pull types from other modules, for example to cast void* to appropriate types. I believe this is what libbpf does.

Creating the superset of types will be done by Spec.Merge(). Merged types won't
have a canonical type ID, so Spec.TypeID() will return a well known error.

Problem: bpf_core_type_id_kernel is supposed to return the ID of a type in the kernel.
According to the documentation we could probably return 0 and be "compatible":

/*
 * Convenience macro to get BTF type ID of a target kernel's type that matches
 * specified local type.
 * Returns:
 *    - valid 32-bit unsigned type ID in kernel BTF;
 *    - 0, if no matching type was found in a target kernel BTF.
 */
#define bpf_core_type_id_kernel(type)					    \
	__builtin_btf_type_id(*(typeof(type) *)0, BPF_TYPE_ID_TARGET)

The most interesting use case of the target type ID is probably bpf_snprintf_btf.

Builtin vs. external BTF

Newer upstream kernels ship BTF for vmlinux and kmods in /sys/kernel/btf.
On older kernels and when distros disable BTF the user needs to provide this data out of band.
To use external BTF, users load a *btf.Spec and pass it via ProgramOptions.KernelTypes.

To support kmods for external BTF, LoadSplitSpecFromReader allows reading split BTF from
disk, which is followed by repeated calls to Spec.Merge.

To support kmods for builtin BTF, changes to NewHandleFromID and Handle.Spec are necessary.
To avoid parsing vmlinux on each call to NewHandleFromID we move inflating into
Handle.Spec.
Additionally Handle.Spec will allow specifying a base Spec.

fentry, fexit, etc.

fentry, etc. programs require BTF information at the syscall boundary:

	struct { /* anonymous struct used by BPF_PROG_LOAD command */
...
		__u32		attach_btf_id;	/* in-kernel BTF type id to attach to */
		union {
...
			/* or valid module BTF object fd or 0 to attach to vmlinux */
			__u32		attach_btf_obj_fd;
		};
...
	};

To support kmods, we need to set attach_btf_obj_fd and attach_btf_id. The
obj_fd has to come from a call to NewHandleFromID, it can't come from
a file on disk. The library currently has a code path that allows using external
BTF via ProgramOptions.KernelTypes to find attach_btf_id for vmlinux targets,
but that is most likely broken, since a kernel without builtin BTF will refuse attaching fentry, etc.

Going forward, ProgramOptions.KernelTypes is ignored for the purpose of finding an attach target.
Instead we will first look for the attach target in /sys/kernel/btf/vmlinux.
If successful, we use attach_btf_obj_fd == 0.

Otherwise we will iterate BTF IDs using BPF_BTF_GET_NEXT_ID to find kmod BTF.
Via NewHandleFromID and Handle.Spec we then search the kmod.
If the target is found we use the handle and retrieve the (aliased) type ID via
Spec.TypeID.

Problem: this requires re-inflating kmods for each program, which
is very expensive. Need to cache this via handleCache somehow?
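
One possible shape for such a cache, as a rough sketch only: memoise the decoded result per BTF ID behind a mutex. The names specCache and newSpecCache and the load callback are hypothetical, the Spec type is a placeholder, and the sketch assumes an ID keeps referring to the same BTF object for as long as the entry is cached.

```go
package main

import (
	"fmt"
	"sync"
)

// BTFID identifies a BTF object loaded in the kernel.
type BTFID uint32

// Spec is a placeholder for a decoded (inflated) kmod BTF.
type Spec struct {
	Funcs map[string]uint32 // function name -> type ID
}

// specCache memoises the expensive decode step per BTF ID.
type specCache struct {
	mu    sync.Mutex
	load  func(BTFID) (*Spec, error) // hypothetical: fetch and inflate one kmod
	specs map[BTFID]*Spec
}

func newSpecCache(load func(BTFID) (*Spec, error)) *specCache {
	return &specCache{load: load, specs: make(map[BTFID]*Spec)}
}

// Get returns the cached Spec for id, decoding it on first use only.
func (c *specCache) Get(id BTFID) (*Spec, error) {
	c.mu.Lock()
	defer c.mu.Unlock()

	if s, ok := c.specs[id]; ok {
		return s, nil
	}
	s, err := c.load(id)
	if err != nil {
		return nil, err // failures are not cached, so a later call retries
	}
	c.specs[id] = s
	return s, nil
}

func main() {
	decodes := 0
	cache := newSpecCache(func(id BTFID) (*Spec, error) {
		decodes++ // stands in for the costly inflate step
		return &Spec{Funcs: map[string]uint32{}}, nil
	})

	// Loading three programs that all search the same kmod decodes it once.
	for i := 0; i < 3; i++ {
		if _, err := cache.Get(42); err != nil {
			fmt.Println("error:", err)
		}
	}
	fmt.Println("decodes:", decodes)
}
```

Holding the lock while decoding keeps the sketch simple at the cost of serialising concurrent loads of different kmods.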

@lmb lmb added the enhancement New feature or request label Jun 10, 2022
@ti-mo
Collaborator

ti-mo commented Jun 10, 2022

Thanks for the writeup, I think I have a better understanding of the full picture.

I'm still not sure why my suggestion without an explicit Merge() is bad. LoadSplitSpecFromReader should be implemented onto Spec itself and should add those kmod types to the spec.

The existing types field (of type types) would not be appended to; instead, loading a kmod would append an entry to kmods []types. This would allow us to determine type IDs for kmod types as well.

I don't see how having split Specs is useful, sorry. Is there a strong need for this?

Note on older kernels and e.g. btfhub: split BTF isn't a thing there, it's just one blob. Need to check if kmods are even included there, not sure.

Re: reinflating, that could also be solved by always merging kmods into Spec and indexing them by their BTF IDs, or even by Handle? That gives us both a BTF id (so we can reiterate and avoid re-parsing) as well as an fd and typeid for use in specifying attach targets and we wouldn't need Handle.Spec() either.

Not sure if kmod BTFs get new IDs when they get reloaded, though. I like your idea of iterating them only when vmlinux lookup fails. With this caching in place, that's something we can perform for every prog without too much impact.

@lmb
Collaborator Author

lmb commented Jun 11, 2022

I'm still not sure why my suggestion without an explicit Merge() is bad.

For CO-RE we need to query the superset of vmlinux + all kmods, but we don't care whether the type is from a module or vmlinux proper. Your preferred solution works well here.

For attach points (fentry/fexit), we need to query vmlinux and all kmods individually, since we care where a function resides. Only allowing merged Specs falls flat here.
Your fix is to add "module awareness" to Spec, but that actually works out to be more complicated than my proposal, see below.

This would allow us to determine type IDs for kmod types as well.

This is of limited use actually. For CO-RE we can get away with not providing kmod type IDs at all (hopefully). For fentry / fexit we do need type IDs, but we also need to know which kmod they came from! See below for your idea re indexing by ID.

Re: reinflating, that could also be solved by always merging kmods into Spec and indexing them by their BTF IDs, or even by Handle? That gives us both a BTF id (so we can reiterate and avoid re-parsing) as well as an fd and typeid for use in specifying attach targets and we wouldn't need Handle.Spec() either.

Turns out BTF IDs can be reused if no handle is open to them and their program / module is unloaded. It's unlikely (as in, you need to go through close to 2^32 objects), but not worth adding an edge case for.
Adding a handle (which wraps an FD) to Spec is going to cause a world of pain. All of a sudden we need Spec.Close and that is a big no go for me.

I don't see how having split Specs is useful, sorry.

I take this to mean: "a Spec that contains only the types contained in the raw BTF".

It's a natural way to represent the result of NewHandleFromID(kmod ID), that fits well with what we need for attach points. That algo will go something like this:

if type in vmlinux:
  return 0, vmlinux type id

for each BTF ID of a kmod:
  get fd to kmod BTF via BPF_BTF_GET_FD_BY_ID
  if type in kmod:
    return fd, kmod type id

Including vmlinux types in kmod types would cause us to do more work for each kmod we have to search.

Furthermore, such a split spec composes well with Merge.

Not sure if kmod BTFs get new IDs when they get reloaded, though.

Gut feeling: the ID will change.

Is there a strong need for this?

Yes, otherwise I would have changed my proposal accordingly :P I started out favouring your idea, but changed my mind as I started writing this document.

lmb added a commit to lmb/ebpf that referenced this issue Jun 12, 2022
Load the bpf_testmod into the kernel when booting the test VM. This
allows getting rid of the workaround to load vmlinux at BTF ID 1,
since loading a module also loads vmlinux BTF.

Later on we can use this to test split BTF / kmod support.

Updates cilium#705
lmb added a commit that referenced this issue Jun 13, 2022
Load the bpf_testmod into the kernel when booting the test VM. This
allows getting rid of the workaround to load vmlinux at BTF ID 1,
since loading a module also loads vmlinux BTF.

Later on we can use this to test split BTF / kmod support.

Updates #705
@brb
Member

brb commented Jun 17, 2022

func (*Spec) Merge(other *Spec) error

I am not a big fan of this method, as it might give users the wrong impression that BTFs are composable, which is not the case: you can "merge" at most two BTFs (base and split). Merging multiple split BTFs will mess up type IDs.

@lmb
Collaborator Author

lmb commented Jun 24, 2022

That's why the function has the following "caveat":

// Merged types don't have a canonical type ID, and therefore TypeID() will
// return an error for them.

So, if you need typeIDs you will have to deal with un-merged Spec. If you don't care about type IDs you can merge away. It's awkward, but solves the issue that we need a way for users to merge "external" kernel BTF to pass in ProgramOptions.KernelTypes.

Does this make more sense?

@brb
Member

brb commented Jun 27, 2022

that we need a way for users to merge "external" kernel BTF to pass in ProgramOptions.KernelTypes

Are we aware of any real life use cases for merging more than two BTFs?

@lmb
Collaborator Author

lmb commented Jun 28, 2022

We'd use it in btf.LoadKernelSpec to return a Spec that includes types from all kmods. Otherwise CO-RE of a kprobe in a kernel module doesn't work out of the box.

@ti-mo
Collaborator

ti-mo commented Jun 28, 2022

Here are a few extra constraints I had in mind that I don't find mentioned explicitly, although these might be described by your pseudocode above:

  • All *Type(s)By* methods should be usable to retrieve types that appear in kmods, so the need to use Spec to represent kmods is a given unless we want to duplicate API. (just in case we'd consider modeling something like Module{Spec})
  • Finding a Type in vmlinux + kmods should be possible with a single scan (so, ideally, using a single method call)
  • Finding a Type in a specific kmod (or vmlinux) should also be possible without having to scan the world (e.g. when you're sure the type is declared somewhere in netfilter). I forgot to take this into account before, this could be useful for performance, but the cost of having to inflate vmlinux will always remain, hence my skepticism.

Your fix is to add "module awareness" to Spec, but that actually works out to be more complicated

From my perspective, allowing Specs to be merged already adds 90% of this complexity. I propose going just one step further and indexing 'incoming' kmod Specs by some type of identifier so we can also retrieve type IDs of kmod types after they've been merged. Having tags just like progs would be really nice here, perhaps we can calculate something similar? That could be a solution if we can't index by btf ID, since those can and do change.

Perhaps a checksum over the raw blob? This should be doable since Specs are always instantiated from raw bytes.
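
As a rough illustration of the checksum idea, the sketch below derives a short tag from the raw bytes and uses it as an index key. The choice of SHA-256 truncated to 8 bytes is an assumption made for the example, and blobTag and specIndex are hypothetical names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// blobTag derives a short, stable identifier from a raw BTF blob.
func blobTag(raw []byte) string {
	sum := sha256.Sum256(raw)
	return hex.EncodeToString(sum[:8]) // 8 bytes -> 16 hex characters
}

// specIndex keys raw kmod blobs by their tag.
type specIndex map[string][]byte

// add stores raw under its tag and returns the tag.
func (idx specIndex) add(raw []byte) string {
	tag := blobTag(raw)
	idx[tag] = raw
	return tag
}

func main() {
	idx := specIndex{}
	// The byte slices below are placeholders, not valid BTF.
	a := idx.add([]byte("raw BTF of module a"))
	b := idx.add([]byte("raw BTF of module b"))
	again := idx.add([]byte("raw BTF of module a"))

	fmt.Println(a == again, a != b, len(idx))
}
```

Such a tag identifies a blob's contents only; it does not by itself give a kernel BTF ID or file descriptor.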

@brb
Member

brb commented Jun 30, 2022

We'd use it in btf.LoadKernelSpec to return a Spec that includes types from all kmods

Can't we just extend the Spec with baseSpec *spec and splitSpec map[string] *spec? That would allow us to keep the TypeIDs. The new spec would contain all relevant low level details.

@lmb
Collaborator Author

lmb commented Jul 5, 2022

Here are a few extra constraints I had in mind that I don't find mentioned explicitly [...]

Agreed with those. The last point is possible by querying a split spec obtained via NewHandleFromID().Spec().

I propose going just one step further and indexing 'incoming' kmod Specs by some type of identifier so we can also retrieve type IDs of kmod types after they've been merged.

That is only useful if we can go from the identifier to a BTF ID somehow. Keep in mind that we can't just serialise the module types and load that into the kernel to get an ID, we explicitly have to go via NewHandleFromID.

Can't we just extend the Spec with baseSpec *spec and splitSpec map[string] *spec?

The problem lies in what to key the modules by, see above.

lmb added a commit to lmb/ebpf that referenced this issue Jul 5, 2022
We currently decode BTF when calling NewHandleFromID. This can be very
expensive for large BTF blobs like vmlinux, even though not all users
will want to inspect decoded BTF in the first place.

Move the decoding into Handle.Spec, which is a breaking change.

Updates cilium#705
lmb added a commit to lmb/ebpf that referenced this issue Jul 5, 2022
Kernel modules only expose split BTF when queried via BPF_OBJ_GET_INFO_BY_FD.
NewHandleFromID therefore doesn't work when trying to obtain a handle for a kernel
module BTF. At the same time, always parsing BTF when creating a handle from an ID
is slow and wasteful.

Change Handle.Spec to accept a base Spec to support split BTF from modules.
This is a breaking change.

Updates cilium#705
lmb added a commit to lmb/ebpf that referenced this issue Jul 6, 2022
Kernel modules only expose split BTF when queried via BPF_OBJ_GET_INFO_BY_FD.
NewHandleFromID therefore doesn't work when trying to obtain a handle for a kernel
module BTF. At the same time, always parsing BTF when creating a handle from an ID
is slow and wasteful.

Change Handle.Spec to accept a base Spec to support split BTF from modules.
This is a breaking change.

Updates cilium#705
lmb added a commit to lmb/ebpf that referenced this issue Jul 7, 2022
Kernel modules only expose split BTF when queried via BPF_OBJ_GET_INFO_BY_FD.
NewHandleFromID therefore doesn't work when trying to obtain a handle for a kernel
module BTF. At the same time, always parsing BTF when creating a handle from an ID
is slow and wasteful.

Change Handle.Spec to accept a base Spec to support split BTF from modules.
This is a breaking change.

Updates cilium#705
lmb added a commit to lmb/ebpf that referenced this issue Jul 8, 2022
Kernel modules only expose split BTF when queried via BPF_OBJ_GET_INFO_BY_FD. This
causes NewHandleFromID to fail for kmod BTF, since the function attempts to inflate
types.

Avoid inflating types in NewHandleFromID and change Handle.Spec to accept a base
Spec to allow decoding types from modules.

This is a breaking change.

Updates cilium#705
lmb added a commit that referenced this issue Jul 14, 2022
Kernel modules only expose split BTF when queried via BPF_OBJ_GET_INFO_BY_FD. This
causes NewHandleFromID to fail for kmod BTF, since the function attempts to inflate
types.

Avoid inflating types in NewHandleFromID and change Handle.Spec to accept a base
Spec to allow decoding types from modules.

This is a breaking change.

Updates #705
lmb added a commit to lmb/ebpf that referenced this issue Jul 18, 2022
Allow tracing functions in kernel modules via fentry, fexit, fmod_ret
and tp_btf programs. The behaviour follows libbpf and is transparent
to the user: if we can't find a target in vmlinux we attempt to find
it in any loaded kernel module.

Refactor TestProgramTypeLSM to test attaching via BTF. This removes
LSM tests for BPF_F_SLEEPABLE, since they don't exercise library
behaviour beyond passing ProgramSpec.Flags to the kernel.

Updates cilium#705
lmb added a commit to lmb/ebpf that referenced this issue Jul 19, 2022
Allow tracing functions in kernel modules via fentry, fexit, fmod_ret
and tp_btf programs. The behaviour follows libbpf and is transparent
to the user: if we can't find a target in vmlinux we attempt to find
it in any loaded kernel module.

Refactor TestProgramTypeLSM to test attaching via BTF. This removes
LSM tests for BPF_F_SLEEPABLE, since they don't exercise library
behaviour beyond passing ProgramSpec.Flags to the kernel.

Updates cilium#705
lmb added a commit that referenced this issue Jul 20, 2022
Allow tracing functions in kernel modules via fentry, fexit, fmod_ret
and tp_btf programs. The behaviour follows libbpf and is transparent
to the user: if we can't find a target in vmlinux we attempt to find
it in any loaded kernel module.

Refactor TestProgramTypeLSM to test attaching via BTF. This removes
LSM tests for BPF_F_SLEEPABLE, since they don't exercise library
behaviour beyond passing ProgramSpec.Flags to the kernel.

Updates #705
@lmb lmb self-assigned this Jul 20, 2022
@lmb
Collaborator Author

lmb commented Dec 6, 2022

I think I've come round to the fact that Merge() is probably a bad idea. It creates all sorts of inconsistencies around split BTF. Probably the better solution would be to make ProgramOptions.KernelTypes a slice of Specs instead.
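
A slice of Specs would let each type keep the ID it has in its own blob. The sketch below shows what a search over such a slice might look like; it is a toy model in which Spec, candidate and findCandidates are hypothetical and do not reflect the library's types.

```go
package main

import "fmt"

// Spec is a toy stand-in for one BTF blob: vmlinux or a single kmod.
type Spec struct {
	Name  string
	Types map[string]uint32 // type name -> ID within this blob
}

// candidate records where a matching type was found.
type candidate struct {
	Spec *Spec
	ID   uint32
}

// findCandidates collects every type called name across all Specs, in order.
func findCandidates(kernelTypes []*Spec, name string) []candidate {
	var out []candidate
	for _, spec := range kernelTypes {
		if id, ok := spec.Types[name]; ok {
			out = append(out, candidate{Spec: spec, ID: id})
		}
	}
	return out
}

func main() {
	kernelTypes := []*Spec{
		{Name: "vmlinux", Types: map[string]uint32{"sk_buff": 12}},
		{Name: "nf_conntrack", Types: map[string]uint32{"nf_conn": 101}},
	}

	for _, c := range findCandidates(kernelTypes, "nf_conn") {
		fmt.Printf("%s: type ID %d\n", c.Spec.Name, c.ID)
	}
}
```

Each candidate still points at the Spec it came from, so nothing has to be renumbered or flattened.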

@bonifaido

bonifaido commented Feb 6, 2023

Does this issue mean that a function exposed by an out-of-tree kernel module, for example bpf_testmod_test_mod_kfunc in this upstream kernel module example, can't be used by an eBPF program loaded by cilium/ebpf?

Or am I missing some details, and this is not possible at all at the kernel/kernel-module/bpf level?

@lmb
Collaborator Author

lmb commented Feb 15, 2023

@bonifaido kfuncs are a separate feature, which the library currently doesn't support. Timo Beckers is looking into it at the moment, however. I'm not sure whether it's possible to call kfuncs from modules; your link seems broken?

@bonifaido

Sorry for the broken link, just fixed it.

After experimenting with a plain kernel module and eBPF, I can tell that it is possible to invoke a kernel-module function exposed with register_btf_kfunc_id_set from eBPF, provided the program type passed to register_btf_kfunc_id_set matches that of the eBPF program itself.

@lmb lmb removed their assignment Mar 2, 2023
@ti-mo
Collaborator

ti-mo commented Apr 24, 2023

Something to figure out: does opening a Handle to a kernel module, then unloading and loading the module, result in a new BTF ID being allocated for the kmod BTF?

@brycekahle
Contributor

Note on older kernels and e.g. btfhub: split BTF isn't a thing there, it's just one blob. Need to check if kmods are even included there, not sure.

btfhub currently doesn't include kmod BTF, but we are looking into adding it. It seems like a single blob is easier on the consumption side, but requires us to generate + merge + dedup BTF for all the modules. We may be able to take the split generated BTF and concatenate it while adjusting ID numbers up to prevent conflicts. I'm looking into the specifics of how to best accomplish that now.

@brycekahle
Contributor

brycekahle commented Apr 26, 2023

while adjusting ID numbers up to prevent conflicts

Looks like split generation already does this part, although it seems each kernel module's BTF type IDs can conflict with those of other modules.

@ti-mo
Collaborator

ti-mo commented Jun 13, 2023

Looks like split generation already does this part, although it seems each kernel module BTF type ids can conflict with each other.

Yes, if the vmlinux base has type IDs 1-100, all modules will start their IDs at 101, so they all overlap. They can't really be merged into a single blob, unless BTFhub would roll its own, e.g. gzip them, but then users would need to take care to unarchive them and pass them to the lib individually.

@brycekahle
Contributor

but then users would need to take care to unarchive them and pass them to the lib individually.

We were talking about adjusting the type IDs to prevent conflict, so they could be merged into a single blob. I'm not sure the simpler loading story is worth all the extra preprocessing work though, especially since we would have to build some of these tools.
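
To illustrate the renumbering being discussed, here is a sketch of the arithmetic only. It assumes every module's split IDs start right after the base's last ID, as described above; module, offsets and remap are hypothetical names, and a real tool would also have to rewrite the type references inside each blob and deduplicate types.

```go
package main

import "fmt"

// module describes one split BTF blob by name and number of types.
type module struct {
	name     string
	numTypes uint32
}

// offsets returns, per module, the amount to add to its split type IDs
// so that the modules occupy consecutive, non-overlapping ID ranges.
func offsets(mods []module) []uint32 {
	out := make([]uint32, len(mods))
	var next uint32
	for i, m := range mods {
		out[i] = next
		next += m.numTypes
	}
	return out
}

// remap translates an ID from a split blob into the combined ID space.
// IDs up to baseLast refer to vmlinux types and stay unchanged.
func remap(id, baseLast, offset uint32) uint32 {
	if id <= baseLast {
		return id
	}
	return id + offset
}

func main() {
	const baseLast = 100 // vmlinux uses IDs 1-100 in this example
	mods := []module{{"mod_a", 20}, {"mod_b", 30}}

	for i, off := range offsets(mods) {
		first := remap(baseLast+1, baseLast, off)
		last := remap(baseLast+mods[i].numTypes, baseLast, off)
		fmt.Printf("%s: %d-%d\n", mods[i].name, first, last)
	}
	// Prints:
	// mod_a: 101-120
	// mod_b: 121-150
}
```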

@ti-mo
Collaborator

ti-mo commented Jul 18, 2023

Hmm, sounds like a use case for btf.Builder, WDYT @lmb?

@lmb
Collaborator Author

lmb commented Jul 18, 2023

Builder doesn't allow writing out split BTF (yet! patches welcome) but it'll do the renumbering, etc. just fine. Another missing piece is deduplication.
