Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpf_tests: Bumped CoverBee to v0.3.2 blank state value fix #24180

Merged

Conversation

dylandreimerink
Copy link
Member

There was a bug in CoverBee which caused instrumentation to fail on some programs if the verifier log contained frame pointer dumps at the end of a log line without a value.

This fixes a bug reported by @jspaleta and @jschwinger233

Bumped CoverBee version to v0.3.2

@dylandreimerink dylandreimerink added the release-note/misc This PR makes changes that have no direct user impact. label Mar 6, 2023
@dylandreimerink dylandreimerink requested a review from a team as a code owner March 6, 2023 10:51
@dylandreimerink dylandreimerink force-pushed the feature/coverbee-blank-state-value-fix branch from 07d79c5 to 19e7844 Compare March 6, 2023 10:52
@jschwinger233
Copy link
Member

jschwinger233 commented Mar 6, 2023

Hi @dylandreimerink, thanks for your PR, and it does fix the issue of local verifier error.

However I just hit another issue related to ipv6. Let me elaborate the steps.

First I pulled this pr to local: git fetch upstream pull/24180/head:pr-24180 && git checkout pr-24180

Then I created a new bpf test file in bpf/tests/tc_ipv6_test.c with these code:

// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
/* Copyright Authors of Cilium */

#include "common.h"

#include <bpf/ctx/skb.h>
#include <bpf/helpers_skb.h>
#include "pktgen.h"

#define ENABLE_IPV6

#define ETH_HLEN		0
#define ENABLE_HOST_ROUTING	1
#define ENABLE_NODEPORT
#define DISABLE_LOOPBACK_LB

#define TEST_IP_LOCAL		v4_pod_one
#define TEST_IP_REMOTE		v4_pod_two
#define TEST_IPV6_LOCAL		v6_pod_one
#define TEST_IPV6_REMOTE	v6_pod_two
#define TEST_LXC_ID_LOCAL	233

#define SECCTX_FROM_IPCACHE 1

#include "bpf_host.c"

PKTGEN("tc", "ipv6_l3_to_l2_fast_redirect")
int ipv6_l3_to_l2_fast_redirect_pktgen(struct __ctx_buff *ctx)
{
	struct pktgen builder;

	/* Init packet builder */
	pktgen__init(&builder, ctx);

	pktgen__finish(&builder);

	return 0;
}

SETUP("tc", "ipv6_l3_to_l2_fast_redirect")
int ipv6_l3_to_l2_fast_redirect_setup(struct __ctx_buff __maybe_unused *ctx)
{
	return TEST_PASS;
}

CHECK("tc", "ipv6_l3_to_l2_fast_redirect")
int ipv6_l3_to_l2_fast_redirect_check(__maybe_unused const struct __ctx_buff __maybe_unused *ctx)
{
	test_init();
	test_finish();
}

After that, I ran make -C bpf/tests clean && docker run -u $(id -u):$(id -g) -it --rm -v ~:/go quay.io/cilium/cilium-builder:f4da8a7ffb1e124ded9004fdaf58a42edc16387a make -C bpf/tests && make -C test run_bpf_tests RUN=TestBPF/tc_ipv6_test.o COVER=1, and verifier raised errors.

An interesting fact is, the verifier will pass the check once I delete the macro #define ENABLE_IPV6, so I presume this issue could have something to do with ipv6.

The verifier error logs are

--- FAIL: TestBPF (0.12s)
    --- FAIL: TestBPF/tc_ipv6_test.o (0.12s)
        bpf_test.go:150: verifier error: load program: permission denied:
                func#0 @0
                0: R1=ctx(off=0,imm=0) R10=fp0
                ; int tail_icmp6_handle_ns(struct __ctx_buff *ctx)
                0: (bf) r6 = r1                       ; R1=ctx(off=0,imm=0) R6_w=ctx(off=0,imm=0)
                ; return ctx->cb[off];
                1: (61) r8 = *(u32 *)(r6 +48)         ; R6_w=ctx(off=0,imm=0) R8_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
                2: (b7) r7 = 0                        ; R7_w=0
                ; ctx->cb[off] = data;
                3: (63) *(u32 *)(r6 +48) = r7         ; R6_w=ctx(off=0,imm=0) R7_w=0
                ; return ctx->cb[off];
                4: (61) r1 = *(u32 *)(r6 +52)         ; R1_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R6_w=ctx(off=0,imm=0)
                ; if (ctx_load_bytes(ctx, nh_off + ICMP6_ND_TARGET_OFFSET, target.addr,
                5: (7b) *(u64 *)(r10 -144) = r1       ; R1_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0 fp-144_w=
                ; if (ctx_load_bytes(ctx, nh_off + ICMP6_ND_TARGET_OFFSET, target.addr,
                6: (bf) r2 = r8                       ; R2_w=scalar(id=1,umax=4294967295,var_off=(0x0; 0xffffffff)) R8_w=scalar(id=1,umax=4294967295,var_off=(0x0; 0xffffffff))
                7: (07) r2 += 48                      ; R2_w=scalar(umin=48,umax=4294967343,var_off=(0x0; 0x1ffffffff))
                8: (bf) r3 = r10                      ; R3_w=fp0 R10=fp0
                ;
                9: (07) r3 += -136                    ; R3_w=fp-136
                ; if (ctx_load_bytes(ctx, nh_off + ICMP6_ND_TARGET_OFFSET, target.addr,
                10: (bf) r1 = r6                      ; R1_w=ctx(off=0,imm=0) R6_w=ctx(off=0,imm=0)
                11: (b7) r4 = 16                      ; R4_w=16
                12: (85) call bpf_skb_load_bytes#26
                last_idx 12 first_idx 0
                regs=10 stack=0 before 11: (b7) r4 = 16
                13: R0_w=scalar() fp-128=mmmmmmmm fp-136=mmmmmmmm
                13: (18) r9 = 0xffffff7a              ; R9_w=4294967162
                15: (67) r0 <<= 32                    ; R0_w=scalar(smax=9223372032559808512,umax=18446744069414584320,var_off=(0x0; 0xffffffff00000000),s32_min=0,s32_max=0,u32_max=0)
                16: (c7) r0 s>>= 32                   ; R0_w=scalar(smin=-2147483648,smax=2147483647)
                ; if (ctx_load_bytes(ctx, nh_off + ICMP6_ND_TARGET_OFFSET, target.addr,
                17: (c5) if r0 s< 0x0 goto pc+609     ; R0_w=scalar(umax=2147483647,var_off=(0x0; 0x7fffffff))
                ; cilium_dbg(ctx, DBG_ICMP6_NS, target.p3, target.p4);
                18: (79) r1 = *(u64 *)(r10 -128)      ; R1_w=scalar() R10=fp0
                19: (b7) r2 = 278007042               ; R2_w=278007042
                ; struct debug_msg msg = {
                20: (63) *(u32 *)(r10 -24) = r2       ; R2_w=278007042 R10=fp0 fp-24_w=278007042
                ; __notify_common_hdr(CILIUM_NOTIFY_DBG_MSG, type),
                21: (61) r2 = *(u32 *)(r6 +68)        ; R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R6_w=ctx(off=0,imm=0)
                ; struct debug_msg msg = {
                22: (63) *(u32 *)(r10 -8) = r7
                last_idx 22 first_idx 0
                regs=80 stack=0 before 21: (61) r2 = *(u32 *)(r6 +68)
                regs=80 stack=0 before 20: (63) *(u32 *)(r10 -24) = r2
                regs=80 stack=0 before 19: (b7) r2 = 278007042
                regs=80 stack=0 before 18: (79) r1 = *(u64 *)(r10 -128)
                regs=80 stack=0 before 17: (c5) if r0 s< 0x0 goto pc+609
                regs=80 stack=0 before 16: (c7) r0 s>>= 32
                regs=80 stack=0 before 15: (67) r0 <<= 32
                regs=80 stack=0 before 13: (18) r9 = 0xffffff7a
                regs=80 stack=0 before 12: (85) call bpf_skb_load_bytes#26
                regs=80 stack=0 before 11: (b7) r4 = 16
                regs=80 stack=0 before 10: (bf) r1 = r6
                regs=80 stack=0 before 9: (07) r3 += -136
                regs=80 stack=0 before 8: (bf) r3 = r10
                regs=80 stack=0 before 7: (07) r2 += 48
                regs=80 stack=0 before 6: (bf) r2 = r8
                regs=80 stack=0 before 5: (7b) *(u64 *)(r10 -144) = r1
                regs=80 stack=0 before 4: (61) r1 = *(u32 *)(r6 +52)
                regs=80 stack=0 before 3: (63) *(u32 *)(r6 +48) = r7
                regs=80 stack=0 before 2: (b7) r7 = 0
                23: R7_w=P0 R10=fp0 fp-8=????0000
                23: (7b) *(u64 *)(r10 -16) = r1       ; R1_w=scalar() R10=fp0 fp-16_w=mmmmmmmm
                24: (63) *(u32 *)(r10 -20) = r2       ; R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0 fp-24_w=mmmmmmmm
                25: (bf) r4 = r10                     ; R4_w=fp0 R10=fp0
                ; cilium_dbg(ctx, DBG_ICMP6_NS, target.p3, target.p4);
                26: (07) r4 += -24                    ; R4_w=fp-24
                ; ctx_event_output(ctx, &EVENTS_MAP, BPF_F_CURRENT_CPU,
                27: (bf) r1 = r6                      ; R1_w=ctx(off=0,imm=0) R6_w=ctx(off=0,imm=0)
                28: (18) r2 = 0xffff88a914a7c000      ; R2_w=map_ptr(off=0,ks=4,vs=4,imm=0)
                30: (18) r3 = 0xffffffff              ; R3_w=4294967295
                32: (b7) r5 = 20                      ; R5_w=20
                33: (85) call bpf_perf_event_output#25
                last_idx 33 first_idx 0
                regs=20 stack=0 before 32: (b7) r5 = 20
                34: R0=scalar()
                ; if (!tmp)
                34: (18) r1 = 0xffff88a914a7fb1c      ; R1_w=map_value(off=12,ks=4,vs=76,imm=0)
                36: (67) r1 <<= 32
                R1 pointer arithmetic with <<= operator prohibited
                processed 33 insns (limit 1000000) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 0

My host is Linux gray-Latitude-5530 5.19.0-35-generic #36~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Fri Feb 17 15:17:25 UTC 2 x86_64 x86_64 x86_64 GNU/Linux.

@dylandreimerink
Copy link
Member Author

@jschwinger233 the R1 pointer arithmetic with <<= operator prohibited verifier error tends to happen due to the usage of "static data".

Our codebase has to run on kernel versions which do no yet support "global data" from maps, so we have a mechanism called "static data" that patches values in the program just before loading instead of using global data from maps. See bpf/lib/static_data.h for the macros used to define these patched variables.

In a lot of locations we have surrounded the definitions of these static data variables with #ifndef statements so we can overwrite them in the tests. There is also the bpf/tests/config_replacement.h file which defines a lot of these already. If including that early doesn't fix it, you may need to hunt through the IPv6 code to see where variables/contants are used which are declared with DEFINE_U* or fetch_u* and replace those definitions during the test.

@jschwinger233
Copy link
Member

@dylandreimerink Of course, thanks for sharing. It turns out it's tail_icmp6_handle_ns which breaks the verifier, and I can easily skip this by #define SKIP_ICMPV6_NS_HANDLING.

Can't wait to see this PR merged.

There was a bug in CoverBee which caused instrumentation to fail on
some programs if the verifier log contained frame pointer dumps
at the end of a log line without a value.

Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
@dylandreimerink dylandreimerink force-pushed the feature/coverbee-blank-state-value-fix branch from 19e7844 to 151b4a0 Compare March 9, 2023 11:05
@dylandreimerink
Copy link
Member Author

Alright, PR is accepted by all code owners. All tests are green, except for ConformanceGatewayAPI which is hitting #24217. CoverBee is only used by the BPF test CI which passes, so no need for e2e tests. Marking as ready-to-merge.

@dylandreimerink dylandreimerink added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Mar 9, 2023
@borkmann borkmann merged commit 69b0e65 into cilium:master Mar 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/misc This PR makes changes that have no direct user impact.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants