Join broken by LLVM 14 #2222

Closed
jeromemarchand opened this issue May 17, 2022 · 7 comments · Fixed by #2296
Labels
bug Something isn't working

Comments

@jeromemarchand
Contributor

It looks like join() is broken with LLVM 14.

What reproduces the bug?

$ bpftrace -e 'tracepoint:syscalls:sys_enter_execve { join(args->argv); }' 
Attaching 1 probe...
ERROR: Error loading program: tracepoint:syscalls:sys_enter_execve (try -v)

Now with -v option:

$ bpftrace -v -e 'tracepoint:syscalls:sys_enter_execve { join(args->argv); }'
INFO: node count: 8
Attaching 1 probe...

Error log: 
0: R1=ctx(off=0,imm=0) R10=fp0
0: (bf) r6 = r1                       ; R1=ctx(off=0,imm=0) R6_w=ctx(off=0,imm=0)
1: (79) r8 = *(u64 *)(r6 +24)         ; R6_w=ctx(off=0,imm=0) R8_w=scalar()
2: (b7) r9 = 0                        ; R9_w=0
3: (63) *(u32 *)(r10 -12) = r9        ; R9_w=P0 R10=fp0 fp-16=0000????
4: (18) r1 = 0xffff9814c8d09200       ; R1_w=map_ptr(off=0,ks=4,vs=16400,imm=0)
6: (bf) r2 = r10                      ; R2_w=fp0 R10=fp0
7: (07) r2 += -12                     ; R2_w=fp-12
8: (85) call bpf_map_lookup_elem#1    ; R0_w=map_value_or_null(id=1,off=0,ks=4,vs=16400,imm=0)
9: (bf) r7 = r0                       ; R0_w=map_value_or_null(id=1,off=0,ks=4,vs=16400,imm=0) R7_w=map_value_or_null(id=1,off=0,ks=4,vs=16400,imm=0)
10: (55) if r7 != 0x0 goto pc+2 13: R0_w=map_value(off=0,ks=4,vs=16400,imm=0) R6_w=ctx(off=0,imm=0) R7_w=map_value(off=0,ks=4,vs=16400,imm=0) R8_w=scalar() R9_w=P0 R10=fp0 fp-16=mmmm????
13: (7b) *(u64 *)(r7 +8) = r9         ; R7_w=map_value(off=0,ks=4,vs=16400,imm=0) R9_w=P0
14: (b7) r1 = 30005                   ; R1_w=30005
15: (7b) *(u64 *)(r7 +0) = r1         ; R1_w=30005 R7_w=map_value(off=0,ks=4,vs=16400,imm=0)
16: (bf) r1 = r10                     ; R1_w=fp0 R10=fp0
17: (07) r1 += -8                     ; R1_w=fp-8
18: (b7) r2 = 8                       ; R2_w=8
19: (bf) r3 = r8                      ; R3_w=scalar(id=2) R8_w=scalar(id=2)
20: (85) call bpf_probe_read_user#112         ; R0=scalar() fp-8=mmmmmmmm
21: (79) r3 = *(u64 *)(r10 -8)        ; R3_w=scalar() R10=fp0
22: (b7) r2 = 1024                    ; R2_w=1024
23: (85) call bpf_probe_read_user_str#114
R1 !read_ok
processed 23 insns (limit 1000000) max_states_per_insn 0 total_states 2 peak_states 2 mark_read 1

ERROR: Error loading program: tracepoint:syscalls:sys_enter_execve
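
A note on reading the log above: the verifier prints "R1 !read_ok" when an instruction reads a register that holds no initialized value. After a helper call, R1 through R5 are treated as clobbered, and in this log only r3 and r2 are set again (instructions 21 and 22) before the call at instruction 23. The toy scan below is a rough sketch of that one rule, not the kernel's algorithm. The function name calls_with_unset_r1 is made up for illustration, and the scan assumes a straight-line log excerpt with no branches.

```python
import re
from typing import List

# "<index>: (<opcode>) <instruction text>", as printed in the verifier log
INSN = re.compile(r"^(\d+): \([0-9a-f]{2}\) (.*)$")
# instruction text that assigns to a register, e.g. "r1 = r10" or "r1 += -8"
WRITE = re.compile(r"^r(\d+) \S*= ")

LOG = """\
16: (bf) r1 = r10                     ; R1_w=fp0 R10=fp0
17: (07) r1 += -8                     ; R1_w=fp-8
18: (b7) r2 = 8                       ; R2_w=8
19: (bf) r3 = r8                      ; R3_w=scalar(id=2) R8_w=scalar(id=2)
20: (85) call bpf_probe_read_user#112         ; R0=scalar() fp-8=mmmmmmmm
21: (79) r3 = *(u64 *)(r10 -8)        ; R3_w=scalar() R10=fp0
22: (b7) r2 = 1024                    ; R2_w=1024
23: (85) call bpf_probe_read_user_str#114
"""


def calls_with_unset_r1(log: str) -> List[int]:
    """Return indices of call instructions reached while r1 is unset."""
    readable = {1, 10}  # at entry R1 holds the context, R10 the frame pointer
    flagged = []
    for line in log.splitlines():
        m = INSN.match(line.strip())
        if not m:
            continue
        index, text = int(m.group(1)), m.group(2)
        if text.startswith("call "):
            if 1 not in readable:
                flagged.append(index)
            readable -= {1, 2, 3, 4, 5}  # argument registers are clobbered
            readable.add(0)              # the return value lands in R0
            continue
        w = WRITE.match(text)
        if w:
            readable.add(int(w.group(1)))
    return flagged


print(calls_with_unset_r1(LOG))  # prints [23]
```

Only r1 is checked because the scan does not know how many arguments each helper takes.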

Access to args->argv seems to work fine with str(). E.g.:

$ bpftrace -e 'tracepoint:syscalls:sys_enter_execve { printf("%s\n", str(args->argv[0])); }'
Attaching 1 probe...
ls

It used to work fine when compiled with LLVM 13.

bpftrace --info

System
OS: Linux 5.18.0-0.rc2.23.fc37.x86_64 #1 SMP PREEMPT_DYNAMIC Mon Apr 11 14:21:25 UTC 2022
Arch: x86_64

Build
version: v0.14.0-119-g1c29
LLVM: 14.0.0
ORC: v2
foreach_sym: yes
unsafe uprobe: no
bfd: yes
bpf_attach_kfunc: yes
bcc_usdt_addsem: yes
bcc bpf_attach_uprobe refcount: yes
bcc library path resolution: yes
libbpf: yes
libbpf btf dump: yes
libbpf btf dump type decl: yes
libdw (DWARF support): yes

Kernel helpers
probe_read: yes
probe_read_str: yes
probe_read_user: yes
probe_read_user_str: yes
probe_read_kernel: yes
probe_read_kernel_str: yes
get_current_cgroup_id: yes
send_signal: yes
override_return: no
get_boot_ns: yes
dpath: yes

Kernel features
Instruction limit: 1000000
Loop support: yes
btf (depends on Build:libbpf): yes
map batch (depends on Build:libbpf): yes
uprobe refcount (depends on Build:bcc bpf_attach_uprobe refcount): yes

Map types
hash: yes
percpu hash: yes
array: yes
percpu array: yes
stack_trace: yes
perf_event_array: yes

Probe types
kprobe: yes
tracepoint: yes
perf_event: yes
kfunc: yes
iter:task: yes
iter:task_file: yes

@jeromemarchand jeromemarchand added the bug Something isn't working label May 17, 2022
@viktormalik
Contributor

I did some analysis and this seems to be a problem with LLVM optimization.

The relevant pieces of LLVM IR after optimization (they are identical before optimization):
LLVM 13

entry:
  [...]
  %pseudo = tail call i64 @llvm.bpf.pseudo(i64 1, i64 0)
  %lookup_elem = call i8* inttoptr (i64 1 to i8* (i64, i32*)*)(i64 %pseudo, i32* nonnull %key)
  [...]

joinnotzero:                                      ; preds = %entry
  store i64 30005, i8* %lookup_elem, align 8
  %7 = getelementptr i8, i8* %lookup_elem, i64 8
  store i64 0, i8* %7, align 8
  [...]
  %10 = add i8* %lookup_elem, i64 16
  %probe_read_user_str = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %10, i32 1024, i64 %9)
  [...]

vs LLVM 14

entry:
  [...]
  %pseudo = tail call i64 @llvm.bpf.pseudo(i64 1, i64 0)
  %lookup_elem = call i8* inttoptr (i64 1 to i8* (i64, i32*)*)(i64 %pseudo, i32* nonnull %key)
  [...]

joinnotzero:                                      ; preds = %entry
  store i64 30005, i8* %lookup_elem, align 8
  %7 = getelementptr i8, i8* %lookup_elem, i64 8
  store i64 0, i8* %7, align 8
  [...]
  %probe_read_user_str = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* undef, i32 1024, i64 %9)
  [...]

For some reason, LLVM 14 optimized out %10 = add i8* %lookup_elem, i64 16, resulting in undef being passed as the destination argument of the probe_read_user_str call. I'm not sure if this is an LLVM bug or we're doing something wrong.

I'd appreciate some help here, I can provide full LLVM IRs if necessary.
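
The undef destination is presumably what the verifier reports as "R1 !read_ok": nothing needs to be loaded into r1 for an undef argument. As a quick way to spot this in larger dumps, here is a minimal sketch that lists helper calls with an undef argument. The name undef_call_args is hypothetical, and the text matching assumes one call per line with no nested parentheses in the argument list, which holds for the IR quoted in this thread but not for LLVM IR in general.

```python
import re

LLVM13 = ("  %probe_read_user_str = call i64 inttoptr (i64 114 to "
          "i64 (i8*, i32, i64)*)(i8* %10, i32 1024, i64 %9)")
LLVM14 = ("  %probe_read_user_str = call i64 inttoptr (i64 114 to "
          "i64 (i8*, i32, i64)*)(i8* undef, i32 1024, i64 %9)")


def undef_call_args(ir: str):
    """Return (result name, argument position) for each undef call argument."""
    hits = []
    for line in ir.splitlines():
        line = line.strip()
        if not re.search(r"\bcall\b", line) or not line.endswith(")"):
            continue
        name = line.split(" = ", 1)[0] if " = " in line else "<void call>"
        # The argument list is the last parenthesized group on the line.
        args = line[line.rfind("(") + 1:-1].split(", ")
        for pos, arg in enumerate(args):
            if arg.split()[-1:] == ["undef"]:
                hits.append((name, pos))
    return hits


print(undef_call_args(LLVM13))  # prints []
print(undef_call_args(LLVM14))  # prints [('%probe_read_user_str', 0)]
```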

@fbs
Contributor

fbs commented May 18, 2022

If you can paste the original and optimized llvm output for 13 and 14 I can take a look.

@viktormalik
Contributor

Ok, it'll be long, but I guess that's fine.

Before optimization (same for 13 and 14):

; ModuleID = 'bpftrace'
source_filename = "bpftrace"
target datalayout = "e-m:e-p:64:64-i64:64-i128:128-n32:64-S128"
target triple = "bpf-pc-linux"

; Function Attrs: nounwind
declare i64 @llvm.bpf.pseudo(i64 %0, i64 %1) #0

define i64 @"tracepoint:syscalls:sys_enter_execve"(i8* %0) section "s_tracepoint:syscalls:sys_enter_execve_1" {
entry:
  %join_r0 = alloca i64, align 8
  %key = alloca i32, align 4
  %join_second = alloca i64, align 8
  %join_first = alloca i64, align 8
  %1 = ptrtoint i8* %0 to i64
  %2 = add i64 %1, 24
  %3 = inttoptr i64 %2 to i64*
  %4 = load volatile i64, i64* %3, align 8
  %5 = bitcast i64* %join_first to i8*
  call void @llvm.lifetime.start.p0i8(i64 -1, i8* %5)
  %6 = bitcast i64* %join_second to i8*
  call void @llvm.lifetime.start.p0i8(i64 -1, i8* %6)
  %7 = bitcast i32* %key to i8*
  call void @llvm.lifetime.start.p0i8(i64 -1, i8* %7)
  store i32 0, i32* %key, align 4
  %pseudo = call i64 @llvm.bpf.pseudo(i64 1, i64 0)
  %lookup_elem = call i8* inttoptr (i64 1 to i8* (i64, i32*)*)(i64 %pseudo, i32* %key)
  %joinzerocond = icmp ne i8* %lookup_elem, null
  br i1 %joinzerocond, label %joinnotzero, label %joinzero

joinzero:                                         ; preds = %joinnotzero, %entry
  ret i64 1

joinnotzero:                                      ; preds = %entry
  store i64 30005, i8* %lookup_elem, align 8
  %8 = getelementptr i8, i8* %lookup_elem, i64 8
  store i64 0, i8* %8, align 8
  %9 = bitcast i64* %join_r0 to i8*
  call void @llvm.lifetime.start.p0i8(i64 -1, i8* %9)
  %probe_read_user = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* %join_r0, i32 8, i64 %4)
  %10 = load i64, i64* %join_r0, align 8
  %11 = add i8* %lookup_elem, i64 16
  %probe_read_user_str = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %11, i32 1024, i64 %10)
  %12 = add i64 %4, 8
  store i64 %12, i64* %join_first, align 8
  %13 = load i64, i64* %join_first, align 8
  %probe_read_user1 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* %join_second, i32 8, i64 %13)
  %14 = load i64, i64* %join_second, align 8
  %15 = add i8* %lookup_elem, i64 1040
  %probe_read_user_str2 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %15, i32 1024, i64 %14)
  %16 = add i64 %4, 16
  store i64 %16, i64* %join_first, align 8
  %17 = load i64, i64* %join_first, align 8
  %probe_read_user3 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* %join_second, i32 8, i64 %17)
  %18 = load i64, i64* %join_second, align 8
  %19 = add i8* %lookup_elem, i64 2064
  %probe_read_user_str4 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %19, i32 1024, i64 %18)
  %20 = add i64 %4, 24
  store i64 %20, i64* %join_first, align 8
  %21 = load i64, i64* %join_first, align 8
  %probe_read_user5 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* %join_second, i32 8, i64 %21)
  %22 = load i64, i64* %join_second, align 8
  %23 = add i8* %lookup_elem, i64 3088
  %probe_read_user_str6 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %23, i32 1024, i64 %22)
  %24 = add i64 %4, 32
  store i64 %24, i64* %join_first, align 8
  %25 = load i64, i64* %join_first, align 8
  %probe_read_user7 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* %join_second, i32 8, i64 %25)
  %26 = load i64, i64* %join_second, align 8
  %27 = add i8* %lookup_elem, i64 4112
  %probe_read_user_str8 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %27, i32 1024, i64 %26)
  %28 = add i64 %4, 40
  store i64 %28, i64* %join_first, align 8
  %29 = load i64, i64* %join_first, align 8
  %probe_read_user9 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* %join_second, i32 8, i64 %29)
  %30 = load i64, i64* %join_second, align 8
  %31 = add i8* %lookup_elem, i64 5136
  %probe_read_user_str10 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %31, i32 1024, i64 %30)
  %32 = add i64 %4, 48
  store i64 %32, i64* %join_first, align 8
  %33 = load i64, i64* %join_first, align 8
  %probe_read_user11 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* %join_second, i32 8, i64 %33)
  %34 = load i64, i64* %join_second, align 8
  %35 = add i8* %lookup_elem, i64 6160
  %probe_read_user_str12 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %35, i32 1024, i64 %34)
  %36 = add i64 %4, 56
  store i64 %36, i64* %join_first, align 8
  %37 = load i64, i64* %join_first, align 8
  %probe_read_user13 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* %join_second, i32 8, i64 %37)
  %38 = load i64, i64* %join_second, align 8
  %39 = add i8* %lookup_elem, i64 7184
  %probe_read_user_str14 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %39, i32 1024, i64 %38)
  %40 = add i64 %4, 64
  store i64 %40, i64* %join_first, align 8
  %41 = load i64, i64* %join_first, align 8
  %probe_read_user15 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* %join_second, i32 8, i64 %41)
  %42 = load i64, i64* %join_second, align 8
  %43 = add i8* %lookup_elem, i64 8208
  %probe_read_user_str16 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %43, i32 1024, i64 %42)
  %44 = add i64 %4, 72
  store i64 %44, i64* %join_first, align 8
  %45 = load i64, i64* %join_first, align 8
  %probe_read_user17 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* %join_second, i32 8, i64 %45)
  %46 = load i64, i64* %join_second, align 8
  %47 = add i8* %lookup_elem, i64 9232
  %probe_read_user_str18 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %47, i32 1024, i64 %46)
  %48 = add i64 %4, 80
  store i64 %48, i64* %join_first, align 8
  %49 = load i64, i64* %join_first, align 8
  %probe_read_user19 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* %join_second, i32 8, i64 %49)
  %50 = load i64, i64* %join_second, align 8
  %51 = add i8* %lookup_elem, i64 10256
  %probe_read_user_str20 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %51, i32 1024, i64 %50)
  %52 = add i64 %4, 88
  store i64 %52, i64* %join_first, align 8
  %53 = load i64, i64* %join_first, align 8
  %probe_read_user21 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* %join_second, i32 8, i64 %53)
  %54 = load i64, i64* %join_second, align 8
  %55 = add i8* %lookup_elem, i64 11280
  %probe_read_user_str22 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %55, i32 1024, i64 %54)
  %56 = add i64 %4, 96
  store i64 %56, i64* %join_first, align 8
  %57 = load i64, i64* %join_first, align 8
  %probe_read_user23 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* %join_second, i32 8, i64 %57)
  %58 = load i64, i64* %join_second, align 8
  %59 = add i8* %lookup_elem, i64 12304
  %probe_read_user_str24 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %59, i32 1024, i64 %58)
  %60 = add i64 %4, 104
  store i64 %60, i64* %join_first, align 8
  %61 = load i64, i64* %join_first, align 8
  %probe_read_user25 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* %join_second, i32 8, i64 %61)
  %62 = load i64, i64* %join_second, align 8
  %63 = add i8* %lookup_elem, i64 13328
  %probe_read_user_str26 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %63, i32 1024, i64 %62)
  %64 = add i64 %4, 112
  store i64 %64, i64* %join_first, align 8
  %65 = load i64, i64* %join_first, align 8
  %probe_read_user27 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* %join_second, i32 8, i64 %65)
  %66 = load i64, i64* %join_second, align 8
  %67 = add i8* %lookup_elem, i64 14352
  %probe_read_user_str28 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %67, i32 1024, i64 %66)
  %68 = add i64 %4, 120
  store i64 %68, i64* %join_first, align 8
  %69 = load i64, i64* %join_first, align 8
  %probe_read_user29 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* %join_second, i32 8, i64 %69)
  %70 = load i64, i64* %join_second, align 8
  %71 = add i8* %lookup_elem, i64 15376
  %probe_read_user_str30 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %71, i32 1024, i64 %70)
  %pseudo31 = call i64 @llvm.bpf.pseudo(i64 1, i64 1)
  %perf_event_output = call i64 inttoptr (i64 25 to i64 (i8*, i64, i64, i8*, i64)*)(i8* %0, i64 %pseudo31, i64 4294967295, i8* %lookup_elem, i64 16400)
  br label %joinzero
}

; Function Attrs: argmemonly nofree nosync nounwind willreturn
declare void @llvm.lifetime.start.p0i8(i64 immarg %0, i8* nocapture %1) #1

attributes #0 = { nounwind }
attributes #1 = { argmemonly nofree nosync nounwind willreturn }
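
For orientation, the offsets in the dump above follow a regular layout, which can be checked with a little arithmetic. This is only a reading of the IR as posted, not a statement about bpftrace's internal struct definitions: two 8-byte stores at offsets 0 and 8, then sixteen 1024-byte string slots, one per probe_read_user_str call. The constant names are made up for illustration.

```python
HEADER_BYTES = 16   # two i64 stores, at offsets 0 and 8 of %lookup_elem
SLOT_BYTES = 1024   # size argument of every probe_read_user_str call
SLOTS = 16          # number of probe_read_user_str calls in the dump

# Destination offset of each string slot inside the map value
offsets = [HEADER_BYTES + i * SLOT_BYTES for i in range(SLOTS)]
total = HEADER_BYTES + SLOTS * SLOT_BYTES

print(offsets[:3], offsets[-1])  # prints [16, 1040, 2064] 15376
print(total)                     # prints 16400
```

The total matches both the vs=16400 map value size in the verifier log and the size passed to perf_event_output.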

After optimization (13):

; ModuleID = 'bpftrace'
source_filename = "bpftrace"
target datalayout = "e-m:e-p:64:64-i64:64-i128:128-n32:64-S128"
target triple = "bpf-pc-linux"

; Function Attrs: nounwind
declare i64 @llvm.bpf.pseudo(i64 %0, i64 %1) #0

define i64 @"tracepoint:syscalls:sys_enter_execve"(i8* %0) local_unnamed_addr section "s_tracepoint:syscalls:sys_enter_execve_1" {
entry:
  %join_r0 = alloca i64, align 8
  %key = alloca i32, align 4
  %join_second = alloca i64, align 8
  %1 = ptrtoint i8* %0 to i64
  %2 = add i64 %1, 24
  %3 = inttoptr i64 %2 to i64*
  %4 = load volatile i64, i64* %3, align 8
  %5 = bitcast i64* %join_second to i8*
  call void @llvm.lifetime.start.p0i8(i64 -1, i8* nonnull %5)
  %6 = bitcast i32* %key to i8*
  call void @llvm.lifetime.start.p0i8(i64 -1, i8* nonnull %6)
  store i32 0, i32* %key, align 4
  %pseudo = tail call i64 @llvm.bpf.pseudo(i64 1, i64 0)
  %lookup_elem = call i8* inttoptr (i64 1 to i8* (i64, i32*)*)(i64 %pseudo, i32* nonnull %key)
  %joinzerocond.not = icmp eq i8* %lookup_elem, null
  br i1 %joinzerocond.not, label %joinzero, label %joinnotzero

joinzero:                                         ; preds = %joinnotzero, %entry
  ret i64 1

joinnotzero:                                      ; preds = %entry
  store i64 30005, i8* %lookup_elem, align 8
  %7 = getelementptr i8, i8* %lookup_elem, i64 8
  store i64 0, i8* %7, align 8
  %8 = bitcast i64* %join_r0 to i8*
  call void @llvm.lifetime.start.p0i8(i64 -1, i8* nonnull %8)
  %probe_read_user = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* nonnull %join_r0, i32 8, i64 %4)
  %9 = load i64, i64* %join_r0, align 8
  %10 = add i8* %lookup_elem, i64 16
  %probe_read_user_str = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %10, i32 1024, i64 %9)
  %11 = add i64 %4, 8
  %probe_read_user1 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* nonnull %join_second, i32 8, i64 %11)
  %12 = load i64, i64* %join_second, align 8
  %13 = add i8* %lookup_elem, i64 1040
  %probe_read_user_str2 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %13, i32 1024, i64 %12)
  %14 = add i64 %4, 16
  %probe_read_user3 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* nonnull %join_second, i32 8, i64 %14)
  %15 = load i64, i64* %join_second, align 8
  %16 = add i8* %lookup_elem, i64 2064
  %probe_read_user_str4 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %16, i32 1024, i64 %15)
  %17 = add i64 %4, 24
  %probe_read_user5 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* nonnull %join_second, i32 8, i64 %17)
  %18 = load i64, i64* %join_second, align 8
  %19 = add i8* %lookup_elem, i64 3088
  %probe_read_user_str6 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %19, i32 1024, i64 %18)
  %20 = add i64 %4, 32
  %probe_read_user7 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* nonnull %join_second, i32 8, i64 %20)
  %21 = load i64, i64* %join_second, align 8
  %22 = add i8* %lookup_elem, i64 4112
  %probe_read_user_str8 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %22, i32 1024, i64 %21)
  %23 = add i64 %4, 40
  %probe_read_user9 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* nonnull %join_second, i32 8, i64 %23)
  %24 = load i64, i64* %join_second, align 8
  %25 = add i8* %lookup_elem, i64 5136
  %probe_read_user_str10 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %25, i32 1024, i64 %24)
  %26 = add i64 %4, 48
  %probe_read_user11 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* nonnull %join_second, i32 8, i64 %26)
  %27 = load i64, i64* %join_second, align 8
  %28 = add i8* %lookup_elem, i64 6160
  %probe_read_user_str12 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %28, i32 1024, i64 %27)
  %29 = add i64 %4, 56
  %probe_read_user13 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* nonnull %join_second, i32 8, i64 %29)
  %30 = load i64, i64* %join_second, align 8
  %31 = add i8* %lookup_elem, i64 7184
  %probe_read_user_str14 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %31, i32 1024, i64 %30)
  %32 = add i64 %4, 64
  %probe_read_user15 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* nonnull %join_second, i32 8, i64 %32)
  %33 = load i64, i64* %join_second, align 8
  %34 = add i8* %lookup_elem, i64 8208
  %probe_read_user_str16 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %34, i32 1024, i64 %33)
  %35 = add i64 %4, 72
  %probe_read_user17 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* nonnull %join_second, i32 8, i64 %35)
  %36 = load i64, i64* %join_second, align 8
  %37 = add i8* %lookup_elem, i64 9232
  %probe_read_user_str18 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %37, i32 1024, i64 %36)
  %38 = add i64 %4, 80
  %probe_read_user19 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* nonnull %join_second, i32 8, i64 %38)
  %39 = load i64, i64* %join_second, align 8
  %40 = add i8* %lookup_elem, i64 10256
  %probe_read_user_str20 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %40, i32 1024, i64 %39)
  %41 = add i64 %4, 88
  %probe_read_user21 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* nonnull %join_second, i32 8, i64 %41)
  %42 = load i64, i64* %join_second, align 8
  %43 = add i8* %lookup_elem, i64 11280
  %probe_read_user_str22 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %43, i32 1024, i64 %42)
  %44 = add i64 %4, 96
  %probe_read_user23 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* nonnull %join_second, i32 8, i64 %44)
  %45 = load i64, i64* %join_second, align 8
  %46 = add i8* %lookup_elem, i64 12304
  %probe_read_user_str24 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %46, i32 1024, i64 %45)
  %47 = add i64 %4, 104
  %probe_read_user25 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* nonnull %join_second, i32 8, i64 %47)
  %48 = load i64, i64* %join_second, align 8
  %49 = add i8* %lookup_elem, i64 13328
  %probe_read_user_str26 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %49, i32 1024, i64 %48)
  %50 = add i64 %4, 112
  %probe_read_user27 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* nonnull %join_second, i32 8, i64 %50)
  %51 = load i64, i64* %join_second, align 8
  %52 = add i8* %lookup_elem, i64 14352
  %probe_read_user_str28 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %52, i32 1024, i64 %51)
  %53 = add i64 %4, 120
  %probe_read_user29 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* nonnull %join_second, i32 8, i64 %53)
  %54 = load i64, i64* %join_second, align 8
  %55 = add i8* %lookup_elem, i64 15376
  %probe_read_user_str30 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* %55, i32 1024, i64 %54)
  %pseudo31 = call i64 @llvm.bpf.pseudo(i64 1, i64 1)
  %perf_event_output = call i64 inttoptr (i64 25 to i64 (i8*, i64, i64, i8*, i64)*)(i8* %0, i64 %pseudo31, i64 4294967295, i8* nonnull %lookup_elem, i64 16400)
  br label %joinzero
}

; Function Attrs: argmemonly nofree nosync nounwind willreturn
declare void @llvm.lifetime.start.p0i8(i64 immarg %0, i8* nocapture %1) #1

attributes #0 = { nounwind }
attributes #1 = { argmemonly nofree nosync nounwind willreturn }

After optimization (14):

; ModuleID = 'bpftrace'
source_filename = "bpftrace"
target datalayout = "e-m:e-p:64:64-i64:64-i128:128-n32:64-S128"
target triple = "bpf-pc-linux"

; Function Attrs: nounwind
declare i64 @llvm.bpf.pseudo(i64 %0, i64 %1) #0

define i64 @"tracepoint:syscalls:sys_enter_execve"(i8* %0) local_unnamed_addr section "s_tracepoint:syscalls:sys_enter_execve_1" {
entry:
  %join_r0 = alloca i64, align 8
  %key = alloca i32, align 4
  %join_second = alloca i64, align 8
  %1 = ptrtoint i8* %0 to i64
  %2 = add i64 %1, 24
  %3 = inttoptr i64 %2 to i64*
  %4 = load volatile i64, i64* %3, align 8
  %5 = bitcast i64* %join_second to i8*
  call void @llvm.lifetime.start.p0i8(i64 -1, i8* nonnull %5)
  %6 = bitcast i32* %key to i8*
  call void @llvm.lifetime.start.p0i8(i64 -1, i8* nonnull %6)
  store i32 0, i32* %key, align 4
  %pseudo = tail call i64 @llvm.bpf.pseudo(i64 1, i64 0)
  %lookup_elem = call i8* inttoptr (i64 1 to i8* (i64, i32*)*)(i64 %pseudo, i32* nonnull %key)
  %joinzerocond.not = icmp eq i8* %lookup_elem, null
  br i1 %joinzerocond.not, label %joinzero, label %joinnotzero

joinzero:                                         ; preds = %joinnotzero, %entry
  ret i64 1

joinnotzero:                                      ; preds = %entry
  store i64 30005, i8* %lookup_elem, align 8
  %7 = getelementptr i8, i8* %lookup_elem, i64 8
  store i64 0, i8* %7, align 8
  %8 = bitcast i64* %join_r0 to i8*
  call void @llvm.lifetime.start.p0i8(i64 -1, i8* nonnull %8)
  %probe_read_user = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* nonnull %join_r0, i32 8, i64 %4)
  %9 = load i64, i64* %join_r0, align 8
  %probe_read_user_str = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* undef, i32 1024, i64 %9)
  %10 = add i64 %4, 8
  %probe_read_user1 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* nonnull %join_second, i32 8, i64 %10)
  %11 = load i64, i64* %join_second, align 8
  %probe_read_user_str2 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* undef, i32 1024, i64 %11)
  %12 = add i64 %4, 16
  %probe_read_user3 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* nonnull %join_second, i32 8, i64 %12)
  %13 = load i64, i64* %join_second, align 8
  %probe_read_user_str4 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* undef, i32 1024, i64 %13)
  %14 = add i64 %4, 24
  %probe_read_user5 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* nonnull %join_second, i32 8, i64 %14)
  %15 = load i64, i64* %join_second, align 8
  %probe_read_user_str6 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* undef, i32 1024, i64 %15)
  %16 = add i64 %4, 32
  %probe_read_user7 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* nonnull %join_second, i32 8, i64 %16)
  %17 = load i64, i64* %join_second, align 8
  %probe_read_user_str8 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* undef, i32 1024, i64 %17)
  %18 = add i64 %4, 40
  %probe_read_user9 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* nonnull %join_second, i32 8, i64 %18)
  %19 = load i64, i64* %join_second, align 8
  %probe_read_user_str10 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* undef, i32 1024, i64 %19)
  %20 = add i64 %4, 48
  %probe_read_user11 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* nonnull %join_second, i32 8, i64 %20)
  %21 = load i64, i64* %join_second, align 8
  %probe_read_user_str12 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* undef, i32 1024, i64 %21)
  %22 = add i64 %4, 56
  %probe_read_user13 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* nonnull %join_second, i32 8, i64 %22)
  %23 = load i64, i64* %join_second, align 8
  %probe_read_user_str14 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* undef, i32 1024, i64 %23)
  %24 = add i64 %4, 64
  %probe_read_user15 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* nonnull %join_second, i32 8, i64 %24)
  %25 = load i64, i64* %join_second, align 8
  %probe_read_user_str16 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* undef, i32 1024, i64 %25)
  %26 = add i64 %4, 72
  %probe_read_user17 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* nonnull %join_second, i32 8, i64 %26)
  %27 = load i64, i64* %join_second, align 8
  %probe_read_user_str18 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* undef, i32 1024, i64 %27)
  %28 = add i64 %4, 80
  %probe_read_user19 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* nonnull %join_second, i32 8, i64 %28)
  %29 = load i64, i64* %join_second, align 8
  %probe_read_user_str20 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* undef, i32 1024, i64 %29)
  %30 = add i64 %4, 88
  %probe_read_user21 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* nonnull %join_second, i32 8, i64 %30)
  %31 = load i64, i64* %join_second, align 8
  %probe_read_user_str22 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* undef, i32 1024, i64 %31)
  %32 = add i64 %4, 96
  %probe_read_user23 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* nonnull %join_second, i32 8, i64 %32)
  %33 = load i64, i64* %join_second, align 8
  %probe_read_user_str24 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* undef, i32 1024, i64 %33)
  %34 = add i64 %4, 104
  %probe_read_user25 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* nonnull %join_second, i32 8, i64 %34)
  %35 = load i64, i64* %join_second, align 8
  %probe_read_user_str26 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* undef, i32 1024, i64 %35)
  %36 = add i64 %4, 112
  %probe_read_user27 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* nonnull %join_second, i32 8, i64 %36)
  %37 = load i64, i64* %join_second, align 8
  %probe_read_user_str28 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* undef, i32 1024, i64 %37)
  %38 = add i64 %4, 120
  %probe_read_user29 = call i64 inttoptr (i64 112 to i64 (i64*, i32, i64)*)(i64* nonnull %join_second, i32 8, i64 %38)
  %39 = load i64, i64* %join_second, align 8
  %probe_read_user_str30 = call i64 inttoptr (i64 114 to i64 (i8*, i32, i64)*)(i8* undef, i32 1024, i64 %39)
  %pseudo31 = call i64 @llvm.bpf.pseudo(i64 1, i64 1)
  %perf_event_output = call i64 inttoptr (i64 25 to i64 (i8*, i64, i64, i8*, i64)*)(i8* %0, i64 %pseudo31, i64 4294967295, i8* nonnull %lookup_elem, i64 16400)
  br label %joinzero
}

; Function Attrs: argmemonly mustprogress nofree nosync nounwind willreturn
declare void @llvm.lifetime.start.p0i8(i64 immarg %0, i8* nocapture %1) #1

attributes #0 = { nounwind }
attributes #1 = { argmemonly mustprogress nofree nosync nounwind willreturn }

@lenticularis39
Contributor

Looking through the generated code for join, there are quite a lot of invalid LLVM instructions in it, including the one that gets optimized out, as detected by opt (using LLVM 12; the code is the same as in 13 and 14):

$ opt < 12-unopt.ll
opt: <stdin>:35:9: error: stored value and pointer type do not match
  store i64 30005, i8* %lookup_elem, align 8
        ^

See LLVM Language Reference:

The type of the <pointer> operand must be a pointer to the first class type of the <value> operand.

After adding a cast to fix that, the instruction that gets optimized out in LLVM 14 appears in an error:

$ opt < 12f-unopt.ll 
opt: <stdin>:44:31: error: expected value token
  %13 = add i8* %lookup_elem, i64 16
                              ^

Again, according to the LLVM Language Reference:

The two arguments to the ‘add’ instruction must be integer or vector of integer values. Both arguments must have identical types.

I'll try testing join() on LLVM 14 again after fixing the invalid LLVM IR code; then the broken pass in LLVM 14 can be bisected, if there is one (I believe the invalid instructions are the cause).
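
To make the two rules quoted above concrete, here is a rough text-level lint for them. It is a sketch only and no substitute for opt or llvm-as: it assumes typed-pointer IR (as in LLVM 12 to 14), simple scalar types without spaces, and one instruction per line. The name lint_ir is hypothetical. The GOOD sample shows one type-correct spelling, a bitcast before the store and getelementptr for the pointer offset.

```python
import re

# "add" whose first operand type ends in "*", i.e. a pointer
PTR_ADD = re.compile(r"=\s*add\s+(?:nuw\s+|nsw\s+)*\S*\*\s")
# store <value type> <value>, <pointer type> <pointer>
STORE = re.compile(r"^\s*store\s+(?:volatile\s+)?(\S+)\s+\S+,\s*(\S+)\s+\S+")

BAD = """\
  store i64 30005, i8* %lookup_elem, align 8
  %11 = add i8* %lookup_elem, i64 16
"""

GOOD = """\
  %p = bitcast i8* %lookup_elem to i64*
  store i64 30005, i64* %p, align 8
  %q = getelementptr i8, i8* %lookup_elem, i64 16
"""


def lint_ir(ir: str):
    """Return (line number, message) for each suspicious instruction."""
    problems = []
    for number, line in enumerate(ir.splitlines(), start=1):
        if PTR_ADD.search(line):
            problems.append((number, "add with a pointer operand"))
        m = STORE.match(line)
        # The pointer type must be "pointer to the value type".
        if m and m.group(2) != m.group(1) + "*":
            problems.append((number, "store value/pointer type mismatch"))
    return problems


for problem in lint_ir(BAD):
    print(problem)
print(lint_ir(GOOD))  # prints []
```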

@lenticularis39
Contributor

Yeah, that was the cause: the program referenced in this issue now works with LLVM 14. I opened a PR for the fix.

lenticularis39 added a commit to lenticularis39/bpftrace that referenced this issue Jul 12, 2022
Add pointer casts to store instruction where required and use GEP
instructions to do pointer arithmetic instead of incorrect use of
add instruction.

Fixes bpftrace#2222
@fbs
Contributor

fbs commented Jul 12, 2022

woops lost track of this one, thanks @lenticularis39 for the work :)

The CI runs llvm-as on the codegen tests to find issues like these; it behaves the same as the opt run @lenticularis39 did:

$ llvm-as /tmp/file.llvm
/opt/homebrew/Cellar/llvm/13.0.1_1/bin/llvm-as: /tmp/file.llvm:35:9: error: stored value and pointer type do not match
  store i64 30005, i8* %lookup_elem, align 8

I wonder if we can somehow extend this and make it part of bpftrace itself: a debug mode that validates the IR it generates. That way we can include it in all the tests we have.
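
Outside of bpftrace itself, the same llvm-as check can be scripted over any dumped IR file. The wrapper below is a minimal sketch, assuming llvm-as is on PATH; the function name assemble_check is hypothetical, and this is not how the bpftrace CI is actually wired up.

```python
import os
import shutil
import subprocess
from typing import Optional


def assemble_check(path: str, tool: str = "llvm-as") -> Optional[str]:
    """Return None if the IR file assembles, else the tool's error text.

    Raises FileNotFoundError when the tool cannot be found on PATH.
    """
    exe = shutil.which(tool)
    if exe is None:
        raise FileNotFoundError(f"{tool} not found on PATH")
    # Discard the bitcode; only the exit status and diagnostics matter here.
    proc = subprocess.run(
        [exe, path, "-o", os.devnull],
        capture_output=True,
        text=True,
    )
    return None if proc.returncode == 0 else proc.stderr.strip()
```

A caller could run this on each codegen output and fail the test whenever the result is not None.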

@lenticularis39
Contributor

Wonder if we can somehow extend this and make it part of bpftrace itself, a debug mode that validates the IR it generates. That way we can include it on all test we have.

This could be what llvm::VerifierPass does.

viktormalik pushed a commit that referenced this issue Jul 12, 2022
Add pointer casts to store instruction where required and use GEP
instructions to do pointer arithmetic instead of incorrect use of
add instruction.

Fixes #2222