Assemble a dummy BTF blob for probing StructOps maps #344

Closed
ti-mo opened this issue Jul 14, 2021 · 10 comments
Labels
enhancement New feature or request

Comments

Collaborator

ti-mo commented Jul 14, 2021

With #321, an API for probing available map types in the kernel was added. However, creating a StructOps map only succeeds if a valid BTF blob is specified.

@qmonnet was able to get this (partially?) working in bpftool: #321 (comment).

This issue is for implementing the equivalent using a pre-baked (or assembled at runtime) BTF blob to be able to probe this map type successfully.

cc @rgo3

Member

qmonnet commented Jul 14, 2021

Works completely for bpftool, but I haven't submitted a patch upstream yet.

Full working patch:
diff --git a/tools/lib/bpf/libbpf_probes.c b/tools/lib/bpf/libbpf_probes.c
index ecaae2927ab8..629d39c98f10 100644
--- a/tools/lib/bpf/libbpf_probes.c
+++ b/tools/lib/bpf/libbpf_probes.c
@@ -203,6 +203,22 @@ bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex)
 	__u32 btf_key_type_id = 0, btf_value_type_id = 0;
 	struct bpf_create_map_attr attr = {};
 	int fd = -1, btf_fd = -1, fd_inner;
+	int btf_vmlinux_value_type_id = 0;
+	struct btf *btf_vmlinux;
+
+	/* [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED */
+	__u8 const btf_data[] = {
+		0x9f, 0xeb, 0x01, 0x00, 0x18, 0x00, 0x00, 0x00,
+		0x00, 0x00, 0x00, 0x00, 0x30, 0x00, 0x00, 0x00,
+		0x30, 0x00, 0x00, 0x00, 0x09, 0x00, 0x00, 0x00,
+		0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01,
+		0x04, 0x00, 0x00, 0x00, 0x20, 0x00, 0x00, 0x01,
+		0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x0d,
+		0x00, 0x00, 0x00, 0x00, 0x07, 0x00, 0x00, 0x00,
+		0x01, 0x00, 0x00, 0x00, 0x05, 0x00, 0x00, 0x00,
+		0x00, 0x00, 0x00, 0x0c, 0x02, 0x00, 0x00, 0x00,
+		0x00, 0x69, 0x6e, 0x74, 0x00, 0x78, 0x00, 0x61,
+		0x00 };
 
 	key_size	= sizeof(__u32);
 	value_size	= sizeof(__u32);
@@ -245,6 +261,17 @@ bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex)
 		value_size = 0;
 		max_entries = 4096;
 		break;
+	case BPF_MAP_TYPE_STRUCT_OPS:
+		btf_fd = bpf_load_btf(btf_data, sizeof(btf_data), NULL, 0, false);
+		if (btf_fd < 0)
+			return false;
+		value_size = 256;
+		btf_vmlinux = libbpf_find_kernel_btf();
+		if (libbpf_get_error(btf_vmlinux))
+			return false;
+		btf_vmlinux_value_type_id = btf__find_by_name_kind(btf_vmlinux,
+		     "bpf_struct_ops_tcp_congestion_ops", BTF_KIND_STRUCT);
+		break;
 	case BPF_MAP_TYPE_UNSPEC:
 	case BPF_MAP_TYPE_HASH:
 	case BPF_MAP_TYPE_ARRAY:
@@ -264,7 +291,6 @@ bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex)
 	case BPF_MAP_TYPE_XSKMAP:
 	case BPF_MAP_TYPE_SOCKHASH:
 	case BPF_MAP_TYPE_REUSEPORT_SOCKARRAY:
-	case BPF_MAP_TYPE_STRUCT_OPS:
 	default:
 		break;
 	}
@@ -292,6 +318,7 @@ bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex)
 		attr.max_entries = max_entries;
 		attr.map_flags = map_flags;
 		attr.map_ifindex = ifindex;
+		attr.btf_vmlinux_value_type_id = btf_vmlinux_value_type_id;
 		if (btf_fd >= 0) {
 			attr.btf_fd = btf_fd;
 			attr.btf_key_type_id = btf_key_type_id;

(I got the BTF blob by using strace to examine the load of a BTF object created from int foo = 0; or something like that. Maybe I can find a nicer way to present it in the code before submitting, but that's a detail.)
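
For anyone who wants to see what those 81 bytes contain, here is a minimal, self-contained sketch that decodes the blob by hand. It uses no libbpf; the struct and function names (btf_header_view, parse_btf_header, walk_btf_types) are made up for illustration, and the layout follows struct btf_header and struct btf_type from the kernel's linux/btf.h as I understand them. Only the fixed-size kinds up to BTF_KIND_FLOAT are handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the 24-byte struct btf_header (linux/btf.h). */
struct btf_header_view {
	uint16_t magic;    /* expected to be 0xeb9f */
	uint8_t  version;
	uint8_t  flags;
	uint32_t hdr_len;
	uint32_t type_off; /* relative to the end of the header */
	uint32_t type_len;
	uint32_t str_off;  /* relative to the end of the header */
	uint32_t str_len;
};

/* The blob from the patch above, copied byte for byte. */
static const uint8_t btf_data[] = {
	0x9f, 0xeb, 0x01, 0x00, 0x18, 0x00, 0x00, 0x00,
	0x00, 0x00, 0x00, 0x00, 0x30, 0x00, 0x00, 0x00,
	0x30, 0x00, 0x00, 0x00, 0x09, 0x00, 0x00, 0x00,
	0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01,
	0x04, 0x00, 0x00, 0x00, 0x20, 0x00, 0x00, 0x01,
	0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x0d,
	0x00, 0x00, 0x00, 0x00, 0x07, 0x00, 0x00, 0x00,
	0x01, 0x00, 0x00, 0x00, 0x05, 0x00, 0x00, 0x00,
	0x00, 0x00, 0x00, 0x0c, 0x02, 0x00, 0x00, 0x00,
	0x00, 0x69, 0x6e, 0x74, 0x00, 0x78, 0x00, 0x61,
	0x00 };

/* Read a little-endian u32 byte by byte (the blob is little-endian). */
static uint32_t le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode and sanity-check the header. Returns 0 on success, -1 on error. */
static int parse_btf_header(const uint8_t *data, size_t len,
			    struct btf_header_view *h)
{
	assert(data && h);
	if (len < 24)
		return -1;
	h->magic    = (uint16_t)(data[0] | (data[1] << 8));
	h->version  = data[2];
	h->flags    = data[3];
	h->hdr_len  = le32(data + 4);
	h->type_off = le32(data + 8);
	h->type_len = le32(data + 12);
	h->str_off  = le32(data + 16);
	h->str_len  = le32(data + 20);
	if (h->magic != 0xeb9f || h->hdr_len < 24)
		return -1;
	if ((uint64_t)h->hdr_len + h->type_off + h->type_len > len ||
	    (uint64_t)h->hdr_len + h->str_off + h->str_len > len)
		return -1;
	return 0;
}

/* Walk the type section, storing each record's kind in kinds[].
 * Returns the number of types, or -1 on malformed or unhandled input. */
static int walk_btf_types(const uint8_t *data, const struct btf_header_view *h,
			  uint8_t *kinds, int max)
{
	const uint8_t *types = data + h->hdr_len + h->type_off;
	uint32_t pos = 0;
	int n = 0;

	while (pos < h->type_len) {
		if (h->type_len - pos < 12 || n >= max)
			return -1;
		uint32_t info = le32(types + pos + 4);
		uint32_t vlen = info & 0xffff;
		uint8_t kind = (info >> 24) & 0x1f;
		uint32_t extra;

		switch (kind) {
		case 1: case 14:          extra = 4; break;         /* INT, VAR */
		case 3:                   extra = 12; break;        /* ARRAY */
		case 4: case 5: case 15:  extra = 12 * vlen; break; /* STRUCT, UNION, DATASEC */
		case 6: case 13:          extra = 8 * vlen; break;  /* ENUM, FUNC_PROTO */
		case 2: case 7: case 8: case 9: case 10:
		case 11: case 12: case 16: extra = 0; break;        /* PTR .. FUNC, FLOAT */
		default: return -1;                                 /* not handled here */
		}
		if (h->type_len - pos - 12 < extra)
			return -1;
		kinds[n++] = kind;
		pos += 12 + extra;
	}
	return n;
}
```

Calling parse_btf_header(btf_data, sizeof(btf_data), &h) followed by walk_btf_types() on this blob yields three type records rather than one: an INT named "int", a FUNC_PROTO taking one int parameter named "a", and a FUNC named "x" that refers to that prototype. So the [1] INT comment in the patch only describes the first type in the blob.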

Collaborator Author

ti-mo commented Jul 15, 2021

@qmonnet Thank you for the example! Looks like this does indeed rely on being able to obtain the vmlinux BTF, which requires sysfs to be mounted and accessible at /sys. Be aware (also for bpftool) that this is not guaranteed when running containerized apps, though Docker seems to mount sysfs by default. Things might be different on other container runtimes and schedulers.

Just echoing here the discussion(s) we had in this PR:

For now, it seems unlikely we'll be able to create a (mock) StructOps map, so best to conclude that StructOps maps are not supported from the perspective of the current process if the process can't obtain a copy of the vmlinux BTF blob.
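
To make that conclusion concrete, here is a rough sketch of the pre-check in C. It assumes the vmlinux BTF is exposed at /sys/kernel/btf/vmlinux and ignores any other locations a loader might try as a fallback. The names struct_ops_verdict and struct_ops_precheck are hypothetical and not part of any library.

```c
#include <assert.h>
#include <unistd.h>

/* Standard sysfs location of the vmlinux BTF blob. */
#define VMLINUX_BTF_PATH "/sys/kernel/btf/vmlinux"

/* Hypothetical verdicts for the pre-check described above. */
enum struct_ops_verdict {
	STRUCT_OPS_UNSUPPORTED, /* no vmlinux BTF visible to this process */
	STRUCT_OPS_MAY_PROBE,   /* BTF is readable, the real probe can run */
};

/* Report StructOps as unsupported from this process's point of view
 * when the vmlinux BTF file cannot be read, for example because sysfs
 * is not mounted inside the container. */
static enum struct_ops_verdict struct_ops_precheck(const char *btf_path)
{
	if (access(btf_path, R_OK) != 0)
		return STRUCT_OPS_UNSUPPORTED;
	return STRUCT_OPS_MAY_PROBE;
}
```

A caller would pass VMLINUX_BTF_PATH and only attempt the map creation when the result is STRUCT_OPS_MAY_PROBE. The path is a parameter so that the check can be pointed at a different location in tests.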

Collaborator Author

ti-mo commented Jul 16, 2021

(continuation of #321 (comment))

> So yes, the vmlinux BTF is loaded into the kernel somehow?

Yes, (all?) kernel BTF seems to be preloaded as far as I can see:

~ strace bpftool btf dump id 1
...
bpf(BPF_BTF_GET_FD_BY_ID, {btf_id=1}, 120)
...
[1] INT 'long unsigned int' size=8 bits_offset=0 nr_bits=64 encoding=(none)
... etc.

The next fd's seem to be BTFs for various other subsystems.

This makes me doubt whether sysfs is really needed to obtain the vmlinux BTF blob, or maybe BPF_BTF_GET_FD_BY_ID is a more recent addition.

> It still requires a way to find the id for kernel BTF, though 🤔.

vmlinux seems to be fixed at 1, but we should make sure there's a const that can be depended on.

> Then once we have this fd we could assign it to attr->btf_fd, this won't be valid and the map won't be created,

I don't see why that would be invalid. 😅 If we can obtain the vmlinux BTF reliably using a syscall and parse its graph for bpf_struct_ops_tcp_congestion_ops, we have our probe.

Member

qmonnet commented Jul 20, 2021

> I don't see why that would be invalid. 😅

Because the kernel explicitly checks that this BTF object is not kernel BTF in the case of struct_ops maps, see map_create() in kernel/bpf/syscall.c:

	if ( [...] || attr->btf_vmlinux_value_type_id) {
		struct btf *btf;

		btf = btf_get_by_fd(attr->btf_fd);
		[...]
		if (btf_is_kernel(btf)) {
			btf_put(btf);
			err = -EACCES;
			goto free_map;
		}

Agreed on the other points.

Collaborator

lmb commented Jul 20, 2021

A random thing to keep in mind:

> bpf(BPF_BTF_GET_FD_BY_ID, {btf_id=1}, 120)

That syscall requires CAP_SYS_ADMIN, so it won't work for feature probes I'd say.

Member

qmonnet commented Jul 20, 2021

But then most of the probes will require some level of privilege anyway?
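
One way a probe could account for the privilege question is to keep "not allowed to ask" separate from "not supported". The sketch below is only an illustration: the probe_result enum and classify_probe_errno function are hypothetical, and the mapping (EPERM means inconclusive, any other error means unsupported) is an assumption, not what any existing probe does.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical three-way outcome of a feature probe. */
enum probe_result {
	PROBE_SUPPORTED,
	PROBE_UNSUPPORTED,
	PROBE_INCONCLUSIVE, /* caller lacks the privilege to find out */
};

/* Classify the errno value left behind by a probing bpf() call.
 * err == 0 means the call succeeded. */
static enum probe_result classify_probe_errno(int err)
{
	switch (err) {
	case 0:
		return PROBE_SUPPORTED;
	case EPERM:
		/* Assumption: a permission failure says nothing about
		 * whether the kernel has the feature. */
		return PROBE_INCONCLUSIVE;
	default:
		return PROBE_UNSUPPORTED;
	}
}
```

With this split, an unprivileged process would get PROBE_INCONCLUSIVE instead of a misleading "unsupported" when a call such as BPF_BTF_GET_FD_BY_ID is refused for lack of CAP_SYS_ADMIN.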

@ti-mo ti-mo added the enhancement New feature or request label Jan 21, 2022
Collaborator Author

ti-mo commented Apr 28, 2022

This will become possible to implement after #641 has been merged.

Collaborator Author

ti-mo commented Jul 22, 2022

Closing this as we no longer really need this for probing (see #746) and #641 will be pushed over the line at some point.

@ti-mo ti-mo closed this as completed Jul 22, 2022
Collaborator Author

ti-mo commented Jul 29, 2022

Reopened as we'll still need to gain the ability to craft a valid StructOps program at some point for probing helper type availability.

@ti-mo ti-mo reopened this Jul 29, 2022
Collaborator Author

ti-mo commented Mar 3, 2023

There's currently no strong driver for probing helpers in tracing, struct_ops, ext and lsm programs. Implementing these is not trivial, as programs need to be loaded with certain attach targets, etc.

To revisit later if there is a need.

@ti-mo ti-mo closed this as completed Mar 3, 2023