Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Miscalulation of offset with consecutive bitfields #4890

Closed
shunghsiyu opened this issue Jan 30, 2024 · 9 comments
Closed

Miscalulation of offset with consecutive bitfields #4890

shunghsiyu opened this issue Jan 30, 2024 · 9 comments

Comments

@shunghsiyu
Copy link
Contributor

shunghsiyu commented Jan 30, 2024

With latest bcc v0.29.1 and LLVM 17.0.6. bcc will produce a BPF program with incorrect offset for struct my_struct.ptr in the following (incomplete) snipped:

struct my_struct
{
    long: 64;
    long: 64;
    long: 64;
    void* ptr;
};

int kprobe__netif_rx(struct pt_regs *ctx)
{
    struct event_t event = {};
    void* p1 = (void*)ctx->di;
    void* p2 = NULL;

    event.offset1 = ((long)&((struct my_struct*)p1)->ptr) - (long)p1;
    event.offset2 = ((long)&((struct my_struct*)p2)->ptr) - (long)p2;
    events.perf_submit(ctx, &event, sizeof(event));
    return 0;
}

where the produced BPF program is

Disassembly of function kprobe__netif_rx
; { // Line  34
   0:   b7 02 00 00 18 00 00 00 r2 = 24
; event.offset2 = (long)&((struct my_struct*)p2)->ptr - (long)p2; // Line  41
   1:   7b 2a f8 ff 00 00 00 00 *(u64 *)(r10 - 8) = r2
   2:   b7 02 00 00 20 00 00 00 r2 = 32
; event.offset1 = (long)&((struct my_struct*)p1)->ptr - (long)p1; // Line  40
   3:   7b 2a f0 ff 00 00 00 00 *(u64 *)(r10 - 16) = r2
; bpf_perf_event_output(ctx, (void *)bpf_pseudo_fd(1, -1), CUR_CPU_IDENTIFIER, &event, sizeof(event)); // Line  42
   ...

In the above we can see offset1 == 32 while offset2 == 24 despite both are calculating the offset of ptr.

Bisect landed me on f35dae07 (merged in #2863) which was meant to address #2827 by introducing i128 type to the BPF layout spec (supported since LLVM 11); but I don't know what's the proper fix here.

To my best knowledge this has to do with the consecutive bitfields merging together and become a singe i192 type, which will use the alignment of the largest defined type (i64:64/8 bytes before and i128:128/16 bytes after f35dae07).

%struct.my_struct = type { i192, ptr }

Note: LLVM release 18.1.0-rc1 has llvm/llvm-project@a21abc7 that made the issue go away for me. However, I cannot judge whether it is the proper fix in this case.

(CC @yonghong-song)

@shunghsiyu
Copy link
Contributor Author

the full reproducer can be found here, and below is the LLVM IR of the program for completeness. As far as I can tell the IR looks correct.

define dso_local i32 @kprobe__netif_rx(ptr noundef %0) local_unnamed_addr #0 section ".bpf.fn.kprobe__netif_rx" !dbg !67 {
  %2 = alloca %struct.event_t, align 8, !DIAssignID !104
  %3 = getelementptr inbounds %struct.pt_regs, ptr %0, i64 0, i32 14, !dbg !108
  %4 = load i64, ptr %3, align 8, !dbg !108, !tbaa !109
  %5 = inttoptr i64 %4 to ptr, !dbg !114
  %6 = getelementptr inbounds %struct.my_struct, ptr %5, i64 0, i32 1, !dbg !115
  %7 = ptrtoint ptr %6 to i64, !dbg !116
  %8 = sub nsw i64 %7, %4, !dbg !117
  store i64 %8, ptr %2, align 8, !dbg !118, !tbaa !119, !DIAssignID !121
  %9 = getelementptr inbounds %struct.event_t, ptr %2, i64 0, i32 1, !dbg !122
  store i64 24, ptr %9, align 8, !dbg !123, !tbaa !124, !DIAssignID !125
  ...
}

@yonghong-song
Copy link
Collaborator

@shunghsiyu thanks for reporting. This is not a compiler bug but rather it is an unfortunate situation with the interaction between x86 IR and BPF backend. Note that currently, currently, the original bpf program is compiled first with x86 arch. The reason is we need x86 headers in compiling bpf programs. Once IR is generated, the x86 IR will feed into bpf backend. In certain situations, this may cause the problem. The following is a simple example to demonstrate:

$ cat t.c
struct my_struct {
        long: 64;
        long: 64;
        long: 64;
        void *ptr;
};

struct event_t {
        long offset0;
        long offset1;
        long offset2;
};
void bar(void *);

int foo(void *p1) {
        struct event_t event = {};
        void *p2 = 0;

        event.offset1 = ((long)&((struct my_struct*)p1)->ptr) - (long)p1;
        event.offset2 = ((long)&((struct my_struct*)p2)->ptr) - (long)p2;
        bar(&event);
        return 0;
}

The compilation steps:

  clang -O2 -S -emit-llvm t.c -Xclang -disable-llvm-passes -o t.ll
  llc -march=bpf -O2 t.ll

With llvm17, the llvm with x86 target has %struct.my_struct = type { i192, ptr }.
With llvm18, we have %struct.my_struct = type { [24 x i8], ptr }.

Since BPF does not have i192, it rounds up to alignment 16 and this caused a problem.

The fix is rather simple, do not use bitfield and proper alignments will be the same for both x86 and bpf.

@shunghsiyu
Copy link
Contributor Author

shunghsiyu commented Jan 31, 2024

Thanks for the explanation. So if I understand correctly the reason that offset1 and offset2 differs is because they came from different sources?

offset2 == 24 came from LLVM x86 IR directly, because the (long)&((struct my_struct *)0)->ptr can be easily inferred to be offset calculation by the x86 IR and transformed into a constant. The using the x86 alignment is used here and thus i192 is aligned to 8.

On the other hand offset1 == 32 came from LLVM BPF backend, because x86 IR cannot infer that ((long)&((struct my_struct*)p1)->ptr) - (long)p1 is offset calculation (because -disable-llvm-passes maybe?), therefore compiles it literally into a sequence of instructions involving sub. The BPF backend was then able to transform such sequence into a constant, but there it uses BPF alignment where i192 is aligned to 16 (since f35dae07).

Is the above correct?

@yonghong-song
Copy link
Collaborator

Your interpretation largely correct. I didn't dig out in llvm why for i192 type BPF backend put an alignment of 16 though. This probably related to how to handle i<> -> alignment in the arch-specific string.

@shunghsiyu
Copy link
Contributor Author

shunghsiyu commented Feb 1, 2024

I didn't dig out in llvm why for i192 type BPF backend put an alignment of 16 though

I think I read on the LLVM's Phabricator archive that "the alignment of the largest defined type (i128:128) will be chosen as the alignment" for types that are not explicitly defined in data layout.

@shunghsiyu
Copy link
Contributor Author

I'm still trying to figure out from the original bug report why consecutive bitfields are used in the first place, and whether it is strictly necessary.

Let's close this issue for now, and I'll reopen if needed. Thanks!

@shunghsiyu
Copy link
Contributor Author

shunghsiyu commented Mar 8, 2024

I'm still trying to figure out from the original bug report why consecutive bitfields are used in the first place, and whether it is strictly necessary.

The consecutive bitfields came from vmlinux.h generated by bpftool, which had more long :64 padding than usual because the original BTF has been processed by gen min_core_btf first.

From the existence of libbpf-tools I would guess that bcc is not meant to be used with vmlinux.h, rather it should stick to the kernel header that are produced as part of the kernel build process.

But so far I haven't found reference that confirm or deny the above guess. @yonghong-song do yon know if this is the case? It'd be nice to get an explicit statement on this subject. Thanks!

@yonghong-song
Copy link
Collaborator

Yes, bcc won't use vmlinux.h and libbpf-tools is using vmlinux.h.

shunghsiyu added a commit to shunghsiyu/bcc that referenced this issue May 3, 2024
With commit f35dae0 "adjust layout string in JIT with llvm11 128bit
spec support", the alignment of i128 type in LLVM's BPF backend is
explicitly defined, which made the alignment different from the those
used in LLVM's x86 IR. Such alignment difference can result in incorrect
offset, for example:

  struct my_struct
  {
      long: 64;
      long: 64;
      long: 64;
      void* ptr;
  };

  int kprobe__netif_rx(struct pt_regs *ctx)
  {
      struct event_t event = {};
      void* p1 = (void*)ctx->di;
      void* p2 = NULL;

      event.offset1 = ((long)&((struct my_struct*)p1)->ptr) - (long)p1;
      event.offset2 = ((long)&((struct my_struct*)p2)->ptr) - (long)p2;
      events.perf_submit(ctx, &event, sizeof(event));
      return 0;
  }

Will produce a BPF program seen below, where the struct my_struct.ptr
field is calculated to be 32 for p1, but 24 for p2, which is absurd
because both are suppose to be the offset of the same field.

  Disassembly of function kprobe__netif_rx
  ; { // Line  34
     0:   b7 02 00 00 18 00 00 00 r2 = 24
  ; event.offset2 = (long)&((struct my_struct*)p2)->ptr - (long)p2; // Line  41
     1:   7b 2a f8 ff 00 00 00 00 *(u64 *)(r10 - 8) = r2
     2:   b7 02 00 00 20 00 00 00 r2 = 32
  ; event.offset1 = (long)&((struct my_struct*)p1)->ptr - (long)p1; // Line  40
     3:   7b 2a f0 ff 00 00 00 00 *(u64 *)(r10 - 16) = r2
  ; bpf_perf_event_output(ctx, (void *)bpf_pseudo_fd(1, -1), CUR_CPU_IDENTIFIER, &event, sizeof(event)); // Line  42
     ...

This happens because the three consecutive `long :64` anonymous
bitfields were merged into a single i192 type, which uses the alignment
of the largest defined datatype, thus i128 for BPF and i64 for x86,
which were 16 and 8, respectively. Since this mismatch in alignment is
mainly an issue only where there's a large, undefined data type, we can
simply pass the -ffine-grained-bitfield-access to Clang during the x86
IR pass to prevent consecutive bitfield from being merged, and thus
workaround the issue.

Link: iovisor#4890
Link: https://bugzilla.suse.com/show_bug.cgi?id=1219096
shunghsiyu added a commit to shunghsiyu/bcc that referenced this issue May 3, 2024
With commit f35dae0 "adjust layout string in JIT with llvm11 128bit
spec support", the alignment of i128 type in LLVM's BPF backend is
explicitly defined, which made the alignment different from the those
used in LLVM's x86 IR. Such alignment difference can result in incorrect
offset, for example:

  struct my_struct
  {
      long: 64;
      long: 64;
      long: 64;
      void* ptr;
  };

  int kprobe__netif_rx(struct pt_regs *ctx)
  {
      struct event_t event = {};
      void* p1 = (void*)ctx->di;
      void* p2 = NULL;

      event.offset1 = ((long)&((struct my_struct*)p1)->ptr) - (long)p1;
      event.offset2 = ((long)&((struct my_struct*)p2)->ptr) - (long)p2;
      events.perf_submit(ctx, &event, sizeof(event));
      return 0;
  }

Will produce a BPF program seen below, where the struct my_struct.ptr
field is calculated to be 32 for p1, but 24 for p2, which is absurd
because both are suppose to be the offset of the same field.

  Disassembly of function kprobe__netif_rx
  ; { // Line  34
     0:   b7 02 00 00 18 00 00 00 r2 = 24
  ; event.offset2 = (long)&((struct my_struct*)p2)->ptr - (long)p2; // Line  41
     1:   7b 2a f8 ff 00 00 00 00 *(u64 *)(r10 - 8) = r2
     2:   b7 02 00 00 20 00 00 00 r2 = 32
  ; event.offset1 = (long)&((struct my_struct*)p1)->ptr - (long)p1; // Line  40
     3:   7b 2a f0 ff 00 00 00 00 *(u64 *)(r10 - 16) = r2
  ; bpf_perf_event_output(ctx, (void *)bpf_pseudo_fd(1, -1), CUR_CPU_IDENTIFIER, &event, sizeof(event)); // Line  42
     ...

This happens because the three consecutive `long :64` anonymous
bitfields were merged into a single i192 type, which uses the alignment
of the largest defined datatype, thus i128 for BPF and i64 for x86,
which were 16 and 8, respectively. Since this mismatch in alignment is
mainly an issue only where there's a large, undefined data type, we can
simply pass the -ffine-grained-bitfield-access to Clang during the x86
IR pass to prevent consecutive bitfield from being merged, and thus
workaround the issue.

Link: iovisor#4890
Link: https://bugzilla.suse.com/show_bug.cgi?id=1219096
@shunghsiyu
Copy link
Contributor Author

shunghsiyu commented May 3, 2024

FWIW we end up working around this by passing the -ffine-grained-bitfield-accesses flag to Clang

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants