Add support for weak kfuncs #1364

dylandreimerink · 2024-03-02T14:27:08Z

This PR allows users to mark their kfunc definitions as __weak. A weak kfunc does not cause an error during loading if the kfunc in question can't be found. Instead, a poison value is written, which causes the verifier to bail out if the instruction is evaluated.
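As a rough illustration of the rule described above, the decision can be modelled in plain user-space C. This is only a sketch under stated assumptions, not the library's Go implementation: `sketch_resolve_kfunc`, the `sketch_kfunc` table, and the `SKETCH_KFUNC_POISON` value are all hypothetical names and values invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical poison marker. The value the loader really writes is not
 * stated in this PR description; this one is for illustration only. */
#define SKETCH_KFUNC_POISON 0xDEADBEEFu

/* Hypothetical table entry: a kfunc name and its BTF ID. */
struct sketch_kfunc {
	const char *name;
	uint32_t btf_id;
};

enum sketch_status {
	SKETCH_OK,            /* kfunc found, BTF ID written */
	SKETCH_POISONED,      /* weak kfunc missing, poison written */
	SKETCH_ERR_NOT_FOUND, /* non-weak kfunc missing, loading fails */
};

/* Resolve `name` against the kfuncs the (pretend) kernel offers and store
 * the value destined for the instruction's immediate in *imm_out. */
enum sketch_status sketch_resolve_kfunc(const struct sketch_kfunc *known,
					size_t n, const char *name,
					int is_weak, uint32_t *imm_out)
{
	assert(name != NULL && imm_out != NULL);

	for (size_t i = 0; i < n; i++) {
		if (strcmp(known[i].name, name) == 0) {
			*imm_out = known[i].btf_id;
			return SKETCH_OK;
		}
	}

	if (is_weak) {
		/* Missing but weak: do not fail, poison the instruction. */
		*imm_out = SKETCH_KFUNC_POISON;
		return SKETCH_POISONED;
	}

	/* Missing and not weak: this is still a load error. */
	return SKETCH_ERR_NOT_FOUND;
}
```

The point of the three-way result is that only a missing non-weak kfunc aborts loading; a missing weak kfunc defers the failure to the verifier, and only if the poisoned instruction is reachable.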

In addition, this PR relocates LDIMM64 instructions that have kfunc relocation entries. When such an instruction is relocated, the BTF ID of the kfunc is written to the instruction's immediate field.
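For readers unfamiliar with the encoding: a 64-bit immediate load occupies two consecutive 8-byte instruction slots, with the low 32 bits of the value in the first slot's immediate and the high 32 bits in the second. The sketch below shows that split in user-space C. `sketch_insn` is a simplified stand-in for the kernel's `struct bpf_insn`, and the helper names are hypothetical. It also assumes, without confirmation from this PR, that an absent weak kfunc leaves the immediate at zero so that an existence check evaluates to false.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for struct bpf_insn; the two 4-bit register
 * fields are folded into one byte because they are not used here. */
struct sketch_insn {
	uint8_t code;
	uint8_t regs;
	int16_t off;
	int32_t imm;
};

/* Opcode of a 64-bit immediate load: BPF_LD | BPF_IMM | BPF_DW. */
#define SKETCH_LD_IMM64 0x18

/* Write a 64-bit value across the two slots of an LDIMM64 pair. */
void sketch_set_imm64(struct sketch_insn pair[2], uint64_t value)
{
	assert(pair[0].code == SKETCH_LD_IMM64);
	pair[0].imm = (int32_t)(uint32_t)(value & 0xffffffffu); /* low half */
	pair[1].imm = (int32_t)(uint32_t)(value >> 32);         /* high half */
}

/* Read the 64-bit value back from the pair. */
uint64_t sketch_get_imm64(const struct sketch_insn pair[2])
{
	return ((uint64_t)(uint32_t)pair[1].imm << 32) |
	       (uint64_t)(uint32_t)pair[0].imm;
}
```

A BTF ID fits in 32 bits, so in this model a relocated kfunc reference ends up entirely in the first slot and the second slot's immediate stays zero.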

Together, these two changes allow users to write eBPF programs that gracefully handle the absence of a kfunc. To do so, use the bpf_ksym_exists macro from bpf_helpers.h to check whether a kfunc is present.

```c
void invalid_kfunc(void) __ksym __weak;

__section("tp_btf/task_newtask") int weak_kfunc_missing(void *ctx) {
	if (bpf_ksym_exists(invalid_kfunc)) {
		invalid_kfunc();
		return 0;
	}

	return 1;
}
```

In this example the kfunc is always invalid, yet the program can still load since the branch calling the kfunc is unreachable.

This effectively provides CO-RE capabilities for kfuncs, which is especially important because kfuncs are less stable than helpers: their availability depends on the kernel version, kernel compilation flags, and whether the relevant kernel modules are loaded.

Fixes: #1355

lmb

Nice work! Just some nits.

testdata/kfunc.c

linker.go

elf_reader.go

The "test_log_fixup" selftest is broken on purpose, with the goal of testing error messages. Due to the changes in cilium#1364, we now emit a poison instruction in `bad_relo_subprog` on an instruction which also has a reference to a BPF-to-BPF function. This causes the symbol/reference resolution step to fail. The error is flaky: it only happens when `bad_relo_subprog` is loaded first. If any other program in the collection is loaded first, which depends on random hash map ordering, the test is skipped instead, because it checks for verifier errors. Since this test is broken on purpose, we previously always hit this skip condition. This commit explicitly disables the selftest, which resolves the CI flakiness.

Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>

dylandreimerink force-pushed the feature/weak-kfuncs branch from 07aa2bb to 41fcb79 Compare March 2, 2024 14:28

dylandreimerink marked this pull request as ready for review March 5, 2024 14:22

dylandreimerink requested a review from a team as a code owner March 5, 2024 14:22

lmb requested changes Mar 15, 2024

dylandreimerink force-pushed the feature/weak-kfuncs branch 3 times, most recently from e95384b to fb4c071 Compare March 15, 2024 12:26

dylandreimerink requested a review from lmb March 15, 2024 12:32

lmb force-pushed the feature/weak-kfuncs branch from fb4c071 to 034e302 Compare March 15, 2024 13:18

lmb approved these changes Mar 15, 2024

lmb merged commit 42cbe8f into cilium:main Mar 15, 2024
15 checks passed

dylandreimerink mentioned this pull request Mar 21, 2024

elf_reader_test: Fix CI by disabling the "test_log_fixup" selftest #1385

Merged
