Skip to content

Commit

Permalink
bpf: Fix test verif_scale_strobemeta_subprogs failure due to llvm19
Browse files Browse the repository at this point in the history
With latest llvm19, I hit the following selftest failures with

  $ ./test_progs -j
  libbpf: prog 'on_event': BPF program load failed: Permission denied
  libbpf: prog 'on_event': -- BEGIN PROG LOAD LOG --
  combined stack size of 4 calls is 544. Too large
  verification time 1344153 usec
  stack depth 24+440+0+32
  processed 51008 insns (limit 1000000) max_states_per_insn 19 total_states 1467 peak_states 303 mark_read 146
  -- END PROG LOAD LOG --
  libbpf: prog 'on_event': failed to load: -13
  libbpf: failed to load object 'strobemeta_subprogs.bpf.o'
  scale_test:FAIL:expect_success unexpected error: -13 (errno 13)
  #498     verif_scale_strobemeta_subprogs:FAIL

The verifier complains too big of the combined stack size (544 bytes) which
exceeds the maximum stack limit 512. This is a regression from llvm19 ([1]).

In the above error log, the original stack depth is 24+440+0+32.
To satisfy interpreter's need, in verifier the stack depth is adjusted to
32+448+32+32=544 which exceeds 512, hence the error. The same adjusted
stack size is also used for jit case.

But the jitted codes could use smaller stack size.

  $ egrep -r stack_depth | grep round_up
  arm64/net/bpf_jit_comp.c:       ctx->stack_size = round_up(prog->aux->stack_depth, 16);
  loongarch/net/bpf_jit.c:        bpf_stack_adjust = round_up(ctx->prog->aux->stack_depth, 16);
  powerpc/net/bpf_jit_comp.c:     cgctx.stack_size = round_up(fp->aux->stack_depth, 16);
  riscv/net/bpf_jit_comp32.c:             round_up(ctx->prog->aux->stack_depth, STACK_ALIGN);
  riscv/net/bpf_jit_comp64.c:     bpf_stack_adjust = round_up(ctx->prog->aux->stack_depth, 16);
  s390/net/bpf_jit_comp.c:        u32 stack_depth = round_up(fp->aux->stack_depth, 8);
  sparc/net/bpf_jit_comp_64.c:            stack_needed += round_up(stack_depth, 16);
  x86/net/bpf_jit_comp.c:         EMIT3_off32(0x48, 0x81, 0xEC, round_up(stack_depth, 8));
  x86/net/bpf_jit_comp.c: int tcc_off = -4 - round_up(stack_depth, 8);
  x86/net/bpf_jit_comp.c:                     round_up(stack_depth, 8));
  x86/net/bpf_jit_comp.c: int tcc_off = -4 - round_up(stack_depth, 8);
  x86/net/bpf_jit_comp.c:         EMIT3_off32(0x48, 0x81, 0xC4, round_up(stack_depth, 8));

In the above, STACK_ALIGN in riscv/net/bpf_jit_comp32.c is defined as 16.
So stack is aligned in either 8 or 16, x86/s390 having 8-byte stack alignment and
the rest having 16-byte alignment.

This patch calculates total stack depth based on 16-byte alignment if jit is requested.
For the above failing case, the new stack size will be 32+448+0+32=512 and no verification
failure. llvm19 regression will be discussed separately in llvm upstream.

The verifier change caused three test failures as these tests compared messages
with stack size. More specifically,
  - test_global_funcs/global_func1: fail with interpreter mode and success with jit mode.
    Adjusted stack sizes so both jit and interpreter modes will fail.
  - async_stack_depth/{pseudo_call_check, async_call_root_check}: since jit and interpreter
    will calculate different stack sizes, the failure msg is adjusted to omit those
    specific stack size numbers.

  [1] https://lore.kernel.org/bpf/32bde0f0-1881-46c9-931a-673be566c61d@linux.dev/

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
  • Loading branch information
Yonghong Song authored and Kernel Patches Daemon committed Feb 15, 2024
1 parent 678eed0 commit 1b44265
Show file tree
Hide file tree
Showing 3 changed files with 21 additions and 9 deletions.
18 changes: 13 additions & 5 deletions kernel/bpf/verifier.c
Original file line number Diff line number Diff line change
Expand Up @@ -5812,6 +5812,17 @@ static int check_ptr_alignment(struct bpf_verifier_env *env,
strict);
}

static int round_up_stack_depth(struct bpf_verifier_env *env, int stack_depth)
{
if (env->prog->jit_requested)
return round_up(stack_depth, 16);

/* round up to 32-bytes, since this is granularity
* of interpreter stack size
*/
return round_up(max_t(u32, stack_depth, 1), 32);
}

/* starting from main bpf function walk all instructions of the function
* and recursively walk all callees that given function can call.
* Ignore jump and exit insns.
Expand Down Expand Up @@ -5855,10 +5866,7 @@ static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx)
depth);
return -EACCES;
}
/* round up to 32-bytes, since this is granularity
* of interpreter stack size
*/
depth += round_up(max_t(u32, subprog[idx].stack_depth, 1), 32);
depth += round_up_stack_depth(env, subprog[idx].stack_depth);
if (depth > MAX_BPF_STACK) {
verbose(env, "combined stack size of %d calls is %d. Too large\n",
frame + 1, depth);
Expand Down Expand Up @@ -5952,7 +5960,7 @@ static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx)
*/
if (frame == 0)
return 0;
depth -= round_up(max_t(u32, subprog[idx].stack_depth, 1), 32);
depth -= round_up_stack_depth(env, subprog[idx].stack_depth);
frame--;
i = ret_insn[frame];
idx = ret_prog[frame];
Expand Down
4 changes: 2 additions & 2 deletions tools/testing/selftests/bpf/progs/async_stack_depth.c
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,7 @@ static int bad_timer_cb(void *map, int *key, struct bpf_timer *timer)
}

SEC("tc")
__failure __msg("combined stack size of 2 calls is 576. Too large")
__failure __msg("combined stack size of 2 calls is")
int pseudo_call_check(struct __sk_buff *ctx)
{
struct hmap_elem *elem;
Expand All @@ -45,7 +45,7 @@ int pseudo_call_check(struct __sk_buff *ctx)
}

SEC("tc")
__failure __msg("combined stack size of 2 calls is 608. Too large")
__failure __msg("combined stack size of 2 calls is")
int async_call_root_check(struct __sk_buff *ctx)
{
struct hmap_elem *elem;
Expand Down
8 changes: 6 additions & 2 deletions tools/testing/selftests/bpf/progs/test_global_func1.c
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@
#include <bpf/bpf_helpers.h>
#include "bpf_misc.h"

#define MAX_STACK (512 - 3 * 32 + 8)
#define MAX_STACK 260

static __attribute__ ((noinline))
int f0(int var, struct __sk_buff *skb)
Expand All @@ -30,6 +30,10 @@ int f3(int, struct __sk_buff *skb, int);
__attribute__ ((noinline))
int f2(int val, struct __sk_buff *skb)
{
volatile char buf[MAX_STACK] = {};

__sink(buf[MAX_STACK - 1]);

return f1(skb) + f3(val, skb, 1);
}

Expand All @@ -44,7 +48,7 @@ int f3(int val, struct __sk_buff *skb, int var)
}

SEC("tc")
__failure __msg("combined stack size of 4 calls is 544")
__failure __msg("combined stack size of 3 calls is")
int global_func1(struct __sk_buff *skb)
{
return f0(1, skb) + f1(skb) + f2(2, skb) + f3(3, skb, 4);
Expand Down

0 comments on commit 1b44265

Please sign in to comment.