@kernel-patches-daemon-bpf

Pull request for series with
subject: support inline tracing with BTF
version: 1
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=1009483

Add BTF_KIND_LOC_PARAM, BTF_KIND_LOC_PROTO and BTF_KIND_LOCSEC
to help represent location information for functions.

BTF_KIND_LOC_PARAM represents how we retrieve data at a
location: via a register, a register plus offset, or a
constant value.

BTF_KIND_LOC_PROTO represents a single location as a set of
BTF_KIND_LOC_PARAMs.

And finally BTF_KIND_LOCSEC is a set of location sites, each
of which has

- a name (function name)
- a function prototype specifying which types are associated
  with parameters
- a location prototype specifying where to find those parameters
- an address offset

This can be used to represent

- a fully-inlined function
- a partially-inlined function where some _LOC_PROTOs represent
  inlined sites as above and others have normal _FUNC representations
- a function with optimized parameters; again the FUNC_PROTO
  represents the original function, with LOC info telling us
  where to obtain each parameter (or 0 if the parameter is
  unobtainable)
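
As a rough illustration of the description above, a location site and the "0 means unobtainable" convention could be modelled as follows. The field names name_off, func_proto and loc_proto match those used by the libbpf validation code quoted later in this conversation; the `offset` field name, the field widths, the struct name `loc_site` and the helper `count_obtainable()` are assumptions made for this sketch, not the layout defined by the series.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one location site inside a BTF_KIND_LOCSEC. */
struct loc_site {
	uint32_t name_off;   /* string offset of the function name */
	uint32_t func_proto; /* type id of the FUNC_PROTO (parameter types) */
	uint32_t loc_proto;  /* type id of the LOC_PROTO (parameter locations) */
	uint32_t offset;     /* address offset of the site (field name assumed) */
};

/* Count the parameters whose location is known. An id of 0 marks a
 * parameter that cannot be obtained at this site.
 */
static size_t count_obtainable(const uint32_t *loc_param_ids, size_t n)
{
	size_t i, found = 0;

	for (i = 0; i < n; i++)
		if (loc_param_ids[i] != 0)
			found++;
	return found;
}
```

For a site whose LOC_PROTO lists the ids {5, 0, 7}, the helper reports two obtainable parameters.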

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Add support for new kinds to libbpf.  BTF_KIND_LOC_PARAM and
BTF_KIND_LOC_PROTO are dedup-able so add support for their
deduplication, whereas since BTF_KIND_LOCSEC contains a unique
offset it is not.

Other considerations: because BTF_KIND_LOCSEC has multiple
member type ids we need to increase the number of member elements
to 2 in the field iterator.

Add APIs to add location param, location prototypes and location
sections.

For BTF relocation we add location info to split BTF.

One small thing noticed during testing; the test for adding_to_base
relies on the split BTF start id being > 1; however it is possible
to have empty distilled base BTF, so this test should be generalized
to check for the base BTF pointer (it will be non-NULL for split
BTF even if the base BTF is empty).

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
When creating split BTF for the .BTF.extra section to record location
information, we need to add function prototypes that refer to base BTF
(vmlinux) types.  However, because .BTF.extra is split BTF, we have a
problem: the parameter type ids were collected before the base vmlinux
BTF was deduplicated, so they are now stale.  It is therefore valuable
to be able to access the old->new type id map that is constructed as
part of deduplication.  This allows us to update the out-of-date type
ids in the FUNC_PROTOs.
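
The remapping step described above can be sketched as below. This is a stand-alone illustration under stated assumptions: the map is modelled as a plain array indexed by old type id, and the function name `remap_type_ids()` is hypothetical; it is not the API this patch adds to libbpf.

```c
#include <stddef.h>
#include <stdint.h>

/* Rewrite stale type ids in place using an old->new map indexed by the
 * old id. Returns 0 on success, -1 if any id falls outside the map.
 */
static int remap_type_ids(uint32_t *ids, size_t n_ids,
			  const uint32_t *id_map, size_t map_len)
{
	size_t i;

	/* Validate first so that a failure leaves ids untouched. */
	for (i = 0; i < n_ids; i++)
		if (ids[i] >= map_len)
			return -1;

	for (i = 0; i < n_ids; i++)
		ids[i] = id_map[ids[i]];
	return 0;
}
```

A sketch like this only works if the map has an entry for every old id, which is why the following paragraph calls for all mappings to be filled out.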

In order to pass the map back, we need to fill out all of the hypot
map mappings; as an optimization normal dedup only computes type id
mappings needed in existing BTF type id references.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
When creating multi-split BTF we correctly set the start string offset
to be the size of the base string section plus the base BTF start
string offset; the latter is needed for multi-split BTF since the
offset is non-zero there.
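
The offset arithmetic can be illustrated with a small model. The struct and function names here (`btf_model`, `split_start_str_off()`) are made up for this sketch and only mirror the quantities named in the paragraph above; they are not libbpf's internal definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the string bookkeeping of a BTF object. */
struct btf_model {
	const struct btf_model *base; /* NULL for base BTF */
	uint32_t start_str_off;       /* first string offset owned by this BTF */
	uint32_t str_len;             /* size of this BTF's own string section */
};

/* Start string offset for split BTF layered on top of 'base'. */
static uint32_t split_start_str_off(const struct btf_model *base)
{
	if (!base)
		return 0;
	/* Size of the base string section plus the base's own start
	 * offset; the latter is non-zero when the base is itself split.
	 */
	return base->start_str_off + base->str_len;
}
```

With a 100-byte vmlinux string section and a 40-byte module string section, the module starts at offset 100 and BTF layered on the module starts at offset 140. Using only the module's 40-byte size would give the wrong answer in the multi-split case.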

Unfortunately the BTF parsing case needed that logic and it was
missed.

Fixes: 4e29128 ("libbpf/btf: Fix string handling to support multi-split BTF")
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
In raw mode ensure we can dump new BTF kinds in normal/json format.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
For bpftool to be able to dump .BTF.extra data in /sys/kernel/btf_extra
for modules, it needs to support multi-split BTF because the
parent-child relationship of BTF extra data for modules is

vmlinux BTF data
	module BTF data
		module BTF extra data

So for example to dump BTF extra info for xfs we would run

$ bpftool btf dump -B /sys/kernel/btf/vmlinux -B /sys/kernel/btf/xfs file /sys/kernel/btf_extra/xfs

Multiple bases are specified with the vmlinux base BTF first (parent),
followed by the xfs BTF (child), and finally the xfs BTF extra.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Add support to dump, encode and validate new location-related kinds.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
BTF_KIND_LOC[_PARAM|_PROTO|SEC] need to work with field iteration, so extend
the selftest to cover these and ensure iteration over all types
and names succeeds.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Ensure that location params/protos are deduplicated and location
sections are not, and that references to deduplicated locations within
location prototypes and sections are updated after deduplication.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
…split BTF

When creating distilled BTF, BTF_KIND_LOC_PARAM and BTF_KIND_LOC_PROTO
should be added to split BTF.  This means potentially some duplication
of location information, but only for out-of-tree modules that use
distilled base/split BTF.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
information

.BTF.extra sections will be used to add additional BTF information
for inlines etc.  .BTF.extra sections are split BTF relative to
kernel/module BTF and are enabled via CONFIG_DEBUG_INFO_BTF_EXTRA.
It is bool for now but will become tristate in a later patch when
support for a separate module is added (vmlinux .BTF.extra is
9MB so 'y' is not a good option for most cases since it will
bloat vmlinux size).

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Allow module-based delivery of potentially large vmlinux .BTF.extra
section; also support visibility of BTF data in kernel, modules in
/sys/kernel/btf_extra.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Add btf__load_btf_extra() function to load extra BTF relative to
base passed in.  Base can be vmlinux or module BTF.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Add support for BTF-based location attachment via multiple kprobes
attaching to each instance of an inline site. Note this is not kprobe
multi attach, since that requires fprobe on entry and sites are within
functions. The implementation is similar to the USDT manager: we use BTF
to create a location manager and populate expected arg values with
metadata based upon BTF_KIND_LOC_PARAM/LOC_PROTOs.

Add new auto-attach SEC("kloc/module:name") where the module is
vmlinux/kernel module and the name is the name of the associated
location; all sites associated with that name will be attached via
kprobes for tracing.
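
To make the section name format concrete, here is a minimal sketch of splitting a "kloc/module:name" string into its module and name parts. The function `parse_kloc_sec()` and the example name `some_inline_func` are hypothetical; how libbpf actually parses the section name in this series may differ.

```c
#include <stddef.h>
#include <string.h>

/* Split "kloc/<module>:<name>" into its two parts.
 * Returns 0 on success, -1 on a malformed name or a too-small buffer.
 */
static int parse_kloc_sec(const char *sec, char *mod, size_t mod_sz,
			  char *name, size_t name_sz)
{
	static const char prefix[] = "kloc/";
	const char *spec, *colon;
	size_t mod_len, name_len;

	if (strncmp(sec, prefix, strlen(prefix)) != 0)
		return -1;
	spec = sec + strlen(prefix);

	colon = strchr(spec, ':');
	if (!colon)
		return -1;

	mod_len = (size_t)(colon - spec);
	name_len = strlen(colon + 1);
	if (mod_len == 0 || name_len == 0 ||
	    mod_len >= mod_sz || name_len >= name_sz)
		return -1;

	memcpy(mod, spec, mod_len);
	mod[mod_len] = '\0';
	memcpy(name, colon + 1, name_len + 1); /* includes the terminator */
	return 0;
}
```

Given "kloc/vmlinux:some_inline_func", this yields the module "vmlinux" and the location name "some_inline_func"; a name without the colon separator is rejected.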

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Add a test tracing a vmlinux inline function called from
__sys_bpf() and ensure one of its arguments - if available -
is as expected.

A simple test as a starting point but it does demonstrate the
viability of the approach.

Ideally we would add a bunch of inlines to bpf_testmod, but
that requires BTF distillation/relocation to work for .BTF.extra
sections, which is not yet implemented.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: 56b4d16
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1009483
version: 1

@kernel-patches-review-bot

AI review job summary: https://github.com/kernel-patches/bpf/actions/runs/18353369106

Inline review:

commit d670431ff21de24675988cf327641ce14e396aa9
Author: Alan Maguire <alan.maguire@oracle.com>

libbpf: Add support for BTF kinds LOC_PARAM, LOC_PROTO and LOCSEC

This patch adds support for three new BTF kinds to libbpf. It includes
support for deduplication of LOC_PARAM and LOC_PROTO types, new APIs for
adding these types, and validation logic.

> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index 18907f0fc..0abd7831d 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c

[ ... ]

> @@ -588,6 +621,34 @@ static int btf_validate_type(const struct btf *btf, const struct btf_type *t, __
>  		}
>  		break;
>  	}
> +	case BTF_KIND_LOC_PARAM:
> +		break;
> +	case BTF_KIND_LOC_PROTO: {
> +		__u32 *p = btf_loc_proto_params(t);
> +
> +		n = btf_vlen(t);
> +		for (i = 0; i < n; i++, p++) {
> +			err = btf_validate_id(btf, *p, id);
> +			if (err)
> +				return err;
> +		}
> +		break;
> +	}
> +	case BTF_KIND_LOCSEC: {
> +		const struct btf_loc *l = btf_locsec_locs(t);
> +
> +		n = btf_vlen(t);
> +		for (i = 0; i < n; i++, l++) {
> +			err = btf_validate_str(btf, l->name_off, "loc name", id);
> +			if (!err)
> +				err = btf_validate_id(btf, l->func_proto, id);
> +			if (!err)
> +				btf_validate_id(btf, l->loc_proto, id);
                                        ^^^^

Does this validation correctly capture errors? The third btf_validate_id()
call is missing the "err = " assignment. If l->loc_proto contains an
invalid type ID, the error return value gets discarded. Compare with
similar validation patterns in BTF_KIND_FUNC_PROTO and BTF_KIND_LOC_PROTO
cases above which consistently assign to err.

Should this be:

    err = btf_validate_id(btf, l->loc_proto, id);
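
The effect of the missing assignment can be reproduced outside libbpf with a small model. Everything in this sketch (`loc_model`, `check_below()`, `validate_loc()`) is a stand-in invented for illustration; only the chaining pattern is taken from the quoted hunk.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of the three fields validated per location. */
struct loc_model {
	uint32_t name_off;
	uint32_t func_proto;
	uint32_t loc_proto;
};

/* Stand-in check: a value is "valid" when it is below the limit. */
static int check_below(uint32_t v, uint32_t limit)
{
	return v < limit ? 0 : -EINVAL;
}

static int validate_loc(const struct loc_model *l, uint32_t str_len,
			uint32_t nr_types)
{
	int err;

	err = check_below(l->name_off, str_len);
	if (!err)
		err = check_below(l->func_proto, nr_types);
	if (!err)
		/* The result must be assigned, otherwise a bad
		 * loc_proto id is silently accepted.
		 */
		err = check_below(l->loc_proto, nr_types);
	return err;
}
```

With the assignment in place, a location whose loc_proto is out of range fails with -EINVAL; dropping the assignment on the last call makes the same input return 0.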

> +			if (err)
> +				return err;
> +		}
> +		break;
> +	}
>  	default:
>  		pr_warn("btf: type [%u]: unrecognized kind %u\n", id, kind);
>  		return -EINVAL;


@kernel-patches-review-bot

AI review job summary: https://github.com/kernel-patches/bpf/actions/runs/18353369106

Inline review:

commit 743a220d451dade3bd061bb95665d84061ba8fd1
Author: Alan Maguire <alan.maguire@oracle.com>

libbpf: add support for BTF location attachment

This patch adds support for BTF-based location attachment via multiple
kprobes attaching to each instance of an inline site. The implementation
is similar to USDT manager and includes new auto-attach SEC("kloc").

> diff --git a/tools/lib/bpf/loc.c b/tools/lib/bpf/loc.c
> new file mode 100644
> index 000000000..345b248bb
> --- /dev/null
> +++ b/tools/lib/bpf/loc.c

[ ... ]

> @@ -0,0 +1,653 @@
> +// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
> +/* Copyright (c) 2025, Oracle and/or its affiliates. */

[ ... ]

> +#elif defined(__aarch64__)
> +
> +static int calc_pt_regs_off(int num)
> +{
> +	if (num >= 0 && num < 31)
> +		return offsetof(struct user_pt_regs, regs[reg_num]);
                                                         ^^^^^^^
> +	else if (num == 33)
> +		return offsetof(struct user_pt_regs, sp);
> +	pr_warn("loc: unsupported register '%d'\n", num);
> +	return -ENOENT;
> +}

Does this code compile on aarch64? The variable name is 'reg_num' but
the function parameter is 'num'. Should this be regs[num] instead?
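
A host-independent sketch of the corrected lookup is shown below. The struct `user_pt_regs_model` is a stand-in that mirrors the arm64 `struct user_pt_regs` layout so the example builds on any architecture, and the names `calc_pt_regs_off_model()` and `PATCH_SP_REGNUM` are hypothetical. The register number mapped to sp is copied from the quoted patch as-is and has not been checked here.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in with the same field layout as arm64's struct user_pt_regs. */
struct user_pt_regs_model {
	uint64_t regs[31];
	uint64_t sp;
	uint64_t pc;
	uint64_t pstate;
};

/* Value taken verbatim from the quoted patch; not verified here. */
#define PATCH_SP_REGNUM 33

static int calc_pt_regs_off_model(int num)
{
	if (num >= 0 && num < 31)
		/* Index with the parameter 'num'. Spelled out as
		 * base + index * size so it does not rely on offsetof()
		 * accepting a runtime array index.
		 */
		return (int)(offsetof(struct user_pt_regs_model, regs) +
			     (size_t)num * sizeof(uint64_t));
	if (num == PATCH_SP_REGNUM)
		return (int)offsetof(struct user_pt_regs_model, sp);
	return -ENOENT;
}
```

In this model register 0 maps to offset 0, register 30 to offset 240, and the sp entry to offset 248; any other number returns -ENOENT.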


@kernel-patches-review-bot

AI review job summary: https://github.com/kernel-patches/bpf/actions/runs/18353369106

Inline review:

commit bdefad3f32c7a90f675ac685e6a375cb0eb4cbbc
Author: Alan Maguire <alan.maguire@oracle.com>

bpftool: Handle multi-split BTF by supporting multiple base BTFs

This patch changes bpftool to support multi-split BTF by allowing
multiple -B options to chain base BTF files together for dumping
module BTF extra data.

> diff --git a/tools/bpf/bpftool/main.c b/tools/bpf/bpftool/main.c
> index a829a6a49..aa16560b4 100644
> --- a/tools/bpf/bpftool/main.c
> +++ b/tools/bpf/bpftool/main.c
> @@ -514,7 +514,8 @@ int main(int argc, char **argv)
>  			verifier_logs = true;
>  			break;
>  		case 'B':
> -			base_btf = btf__parse(optarg, NULL);
> +			/* handle multi-split BTF */
> +			base_btf = btf__parse_split(optarg, base_btf);
>  			if (!base_btf) {
>  				p_err("failed to parse base BTF at '%s': %d\n",
>  				      optarg, -errno);

Does this leak the BTF objects when -B is used multiple times?

Looking at the pattern here: when -B is specified twice (as shown in the
commit message with "bpftool btf dump -B vmlinux -B xfs file xfs_extra"),
the first call creates BTF1 and stores it in base_btf. The second call
creates BTF2 with BTF1 as its base_btf, then overwrites the global
base_btf variable with BTF2. This loses the direct reference to BTF1.

At cleanup (main.c line 570), only btf__free(base_btf) is called, which
frees BTF2. However, in libbpf/btf.c btf__free() only frees btf->base_btf
if btf->owns_base is true. Looking at btf_new() in btf.c, owns_base is
never set when a base_btf is passed as a parameter - it's only set to true
for dist_base_btf (embedded base BTF from ELF sections).

This means BTF1 is never freed. The caller is expected to maintain
ownership of the base_btf pointer passed to btf__parse_split(), but the
reassignment pattern "base_btf = btf__parse_split(path, base_btf)" loses
that pointer.

Should this either: (1) track all intermediate BTF objects in an array and
free them at cleanup, or (2) set owns_base=true for externally-provided
base_btf so the chain is freed automatically?
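
Option (1) could look roughly like the sketch below. To stay self-contained it uses stand-in types and functions (`btf_model`, `model_parse_split()`, `model_free()`, `base_chain`) rather than the real libbpf calls, and the fixed limit `MAX_BASES` is an arbitrary assumption; in bpftool the stand-ins would correspond to btf__parse_split() and btf__free().

```c
#include <stdlib.h>

#define MAX_BASES 8 /* arbitrary limit chosen for this sketch */

/* Stand-in for struct btf: the base pointer is borrowed, not owned. */
struct btf_model {
	struct btf_model *base;
};

static int nr_freed; /* counts frees so the sketch can be checked */

static struct btf_model *model_parse_split(struct btf_model *base)
{
	struct btf_model *b = calloc(1, sizeof(*b));

	if (b)
		b->base = base;
	return b;
}

static void model_free(struct btf_model *b)
{
	if (!b)
		return;
	free(b);
	nr_freed++;
}

/* Every base parsed for a -B option is remembered here. */
struct base_chain {
	struct btf_model *btfs[MAX_BASES];
	int cnt;
};

/* Parse one more base on top of the most recent one. */
static struct btf_model *chain_push(struct base_chain *c)
{
	struct btf_model *parent = c->cnt ? c->btfs[c->cnt - 1] : NULL;
	struct btf_model *b;

	if (c->cnt == MAX_BASES)
		return NULL;
	b = model_parse_split(parent);
	if (b)
		c->btfs[c->cnt++] = b;
	return b;
}

/* Free children before parents, since children borrow their base. */
static void chain_free(struct base_chain *c)
{
	while (c->cnt > 0)
		model_free(c->btfs[--c->cnt]);
}
```

Freeing in reverse order keeps each base alive until everything layered on top of it has been released, and no object depends on an owns_base-style flag.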


@kernel-patches-review-bot

AI review job summary: https://github.com/kernel-patches/bpf/actions/runs/18353369106

Inline review:

commit d5500c62587c0ca3cd71be4a8f2e0be62bb6afec
Author: Alan Maguire <alan.maguire@oracle.com>

kbuild, module, bpf: Support CONFIG_DEBUG_INFO_BTF_EXTRA=m

Allow module-based delivery of potentially large vmlinux .BTF.extra
section; also support visibility of BTF data in kernel, modules in
/sys/kernel/btf_extra.

> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index 29cec549f..749e04c67 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c

[ ... ]

> @@ -8384,32 +8417,45 @@ static int btf_module_notify(struct notifier_block *nb, unsigned long op,
>  		list_add(&btf_mod->list, &btf_modules);
>  		mutex_unlock(&btf_module_mutex);
>
> -		if (IS_ENABLED(CONFIG_SYSFS)) {
> -			struct bin_attribute *attr;
> -
> -			attr = kzalloc(sizeof(*attr), GFP_KERNEL);
> -			if (!attr)
> -				goto out;
> -
> -			sysfs_bin_attr_init(attr);
> -			attr->attr.name = btf->name;
> -			attr->attr.mode = 0444;
> -			attr->size = btf->data_size;
> -			attr->private = btf->data;
> -			attr->read = sysfs_bin_attr_simple_read;
> -
> -			err = sysfs_create_bin_file(btf_kobj, attr);
> -			if (err) {
> -				pr_warn("failed to register module [%s] BTF in sysfs: %d\n",
> -					mod->name, err);
> -				kfree(attr);
> -				err = 0;
> +		if (IS_ENABLED(CONFIG_SYSFS) && btf) {
> +			attr = sysfs_btf_add(btf_kobj, btf->name, btf->data, btf->data_size);
> +			if (IS_ERR(attr)) {
> +				err = PTR_ERR(attr);
>  				goto out;
>  			}
> -
>  			btf_mod->sysfs_attr = attr;
>  		}
> +#if IS_ENABLED(CONFIG_DEBUG_INFO_BTF_EXTRA)
> +		if (mod->btf_extra_data_size > 0) {
> +			const char *name = mod->name;
> +			void *data;
> +
> +			/* vmlinux .BTF.extra is SHF_ALLOC; other modules
> +			 * are not, so for them we need to kvmemdup() the data.
> +			 */
> +			if (strcmp(mod->name, "btf_extra") == 0) {
> +				name = "vmlinux";
> +				data = mod->btf_extra_data;
> +			} else {
> +				data = kvmemdup(mod->btf_extra_data, mod->btf_extra_data_size,
> +						GFP_KERNEL | __GFP_NOWARN);
> +				if (!data) {
> +					err = -ENOMEM;
> +					goto out;
> +				}
> +				btf_mod->btf_extra_data = data;
> +			}
> +			attr = sysfs_btf_add(btf_extra_kobj, name, data,
> +					     mod->btf_extra_data_size);
> +			if (IS_ERR(attr)) {
> +				err = PTR_ERR(attr);
> +				kfree(btf_mod->sysfs_attr);
> +				kvfree(btf_mod->btf_extra_data);
> +				goto out;
> +			}
> +			btf_mod->sysfs_extra_attr = attr;
> +		}
> +#endif
>  		break;

Can the error path at the second sysfs_btf_add() failure cause resource
management issues? At this point btf_mod is on the btf_modules list (added
earlier), and when the function returns an error, MODULE_STATE_GOING will
be called via blocking_notifier_call_chain_robust().

The error handling frees btf_mod->sysfs_attr and btf_mod->btf_extra_data,
but these pointers aren't set to NULL. When MODULE_STATE_GOING runs:

  btf_module_notify()->MODULE_STATE_GOING:
    if (btf_mod->sysfs_attr)
      sysfs_remove_bin_file(btf_kobj, btf_mod->sysfs_attr);

This accesses freed memory (use-after-free).

    if (btf_mod->btf_extra_data)
      kvfree(btf_mod->btf_extra_data);

This frees it again (double-free).

    if (btf_mod->btf) {
      purge_cand_cache(btf_mod->btf);
      btf_put(btf_mod->btf);
      kfree(btf_mod->sysfs_attr);
    }

This also frees sysfs_attr again (double-free).

Similar issue at the first error path (kvmemdup failure): btf_mod is on
the list with sysfs_attr and btf set, but the goto out doesn't clean them
up before MODULE_STATE_GOING tries to.
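
One common way to make such an error path safe against a later teardown is to clear each pointer as it is released. The sketch below models that idea with plain malloc()/free(); the names `btf_mod_model`, `release()`, `setup_error_path()` and `teardown()` are hypothetical, and a real kernel fix might instead restructure the ordering or remove the sysfs file before freeing.

```c
#include <stdlib.h>

/* Stand-in for the two btf_module fields discussed above. */
struct btf_mod_model {
	void *sysfs_attr;
	void *btf_extra_data;
};

static int nr_frees; /* counts real frees so the sketch can be checked */

/* Free *p once and clear it so later callers see NULL. */
static void release(void **p)
{
	if (*p) {
		free(*p);
		nr_frees++;
	}
	*p = NULL;
}

/* Error path taken when setup fails part-way through. */
static void setup_error_path(struct btf_mod_model *m)
{
	release(&m->sysfs_attr);
	release(&m->btf_extra_data);
}

/* Later teardown, the analogue of the MODULE_STATE_GOING handling. */
static void teardown(struct btf_mod_model *m)
{
	release(&m->sysfs_attr);     /* no-op if already released */
	release(&m->btf_extra_data); /* no-op if already released */
}
```

Because the error path leaves both pointers NULL, running the teardown afterwards frees nothing a second time.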

> diff --git a/kernel/bpf/sysfs_btf.c b/kernel/bpf/sysfs_btf.c
> index 9cbe15ce3..0298a0936 100644
> --- a/kernel/bpf/sysfs_btf.c
> +++ b/kernel/bpf/sysfs_btf.c
> @@ -49,7 +49,15 @@ static struct bin_attribute bin_attr_btf_vmlinux __ro_after_init = {
>  	.mmap = btf_sysfs_vmlinux_mmap,
>  };
>
> -struct kobject *btf_kobj;
> +struct kobject *btf_kobj, *btf_extra_kobj;
> +
> +#if IS_BUILTIN(CONFIG_DEBUG_INFO_BTF_EXTRA)
> +/* See scripts/link-vmlinux.sh, gen_btf() func for details */
> +extern char __start_BTF_extra[];
> +extern char __stop_BTF_extra[];
> +
> +struct bin_attribute *extra_attr;
> +#endif
>
>  static int __init btf_vmlinux_init(void)
>  {
> @@ -62,6 +70,17 @@ static int __init btf_vmlinux_init(void)
>  	btf_kobj = kobject_create_and_add("btf", kernel_kobj);
>  	if (!btf_kobj)
>  		return -ENOMEM;
> +	if (IS_ENABLED(CONFIG_DEBUG_INFO_BTF_EXTRA)) {
> +		btf_extra_kobj = kobject_create_and_add("btf_extra", kernel_kobj);
> +		if (!btf_extra_kobj)
> +			return -ENOMEM;

Does this leak btf_kobj? If kobject_create_and_add() fails for
btf_extra_kobj, the function returns without cleaning up the btf_kobj
that was successfully created above.

> +#if IS_BUILTIN(CONFIG_DEBUG_INFO_BTF_EXTRA)
> +		extra_attr = sysfs_btf_add(btf_extra_kobj, "vmlinux", __start_BTF_extra,
> +					   __stop_BTF_extra - __start_BTF_extra);
> +		if (IS_ERR(extra_attr))
> +			return PTR_ERR(extra_attr);

Same question here - does this leak both btf_kobj and btf_extra_kobj if
sysfs_btf_add() fails?
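
If cleanup is wanted here, the usual kernel idiom is goto-based unwinding in reverse order of acquisition. The sketch below shows the shape of that idiom with stand-in objects; `obj_create()`, `obj_put()` and `init_model()` are invented for illustration and do not correspond to the kobject API, and whether an initcall should unwind at all is left to the maintainers.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

static int live_objs; /* stand-in objects currently allocated */
static void *kobj_a, *kobj_b;

static void *obj_create(bool fail)
{
	void *p = fail ? NULL : malloc(1);

	if (p)
		live_objs++;
	return p;
}

static void obj_put(void *p)
{
	if (p) {
		free(p);
		live_objs--;
	}
}

/* Acquire two objects, then perform a step that may fail. */
static int init_model(bool fail_b, bool fail_attr)
{
	int err;

	kobj_a = obj_create(false);
	if (!kobj_a)
		return -ENOMEM;

	kobj_b = obj_create(fail_b);
	if (!kobj_b) {
		err = -ENOMEM;
		goto err_put_a;
	}

	if (fail_attr) {
		err = -EIO;
		goto err_put_b;
	}
	return 0;

err_put_b:
	obj_put(kobj_b);
	kobj_b = NULL;
err_put_a:
	obj_put(kobj_a);
	kobj_a = NULL;
	return err;
}
```

Each failure label releases exactly what was acquired before it, so neither of the two failure cases raised in this review would leave an object behind in this model.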

> +#endif
> +	}
>
>  	return sysfs_create_bin_file(btf_kobj, &bin_attr_btf_vmlinux);
>  }


@kernel-patches-daemon-bpf
Author

At least one diff in series https://patchwork.kernel.org/project/netdevbpf/list/?series=1009483 expired. Closing PR.

@kernel-patches-daemon-bpf bot deleted the series/1009483=>bpf-next branch October 11, 2025 01:43