Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BPF] Handle unreachable with a kfunc call #131731

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

yonghong-song
Copy link
Contributor

@yonghong-song yonghong-song commented Mar 18, 2025

Currently, middle-end generates 'unreachable' insn if the compiler
feels the code is indeed unreachable or the code becomes invalid
due to some optimizaiton (e.g. code optimization with uninitialized
variables).

Right now BPF backend ignores 'unreachable' insn during selectiondag
lowering. For cases where 'unreachable' is due to invalid code
transformation, such a signal will be missed. Later on, users needs
some effort to debug it which impacts developer productivity.

This patch enabled selectiondag lowering for 'unreachable' insn.

Previous attempt ([1]) tries to have a backend IR pass to filter
out 'unreachable' insns in a number of cases. But such pattern
matching may misalign with future middle-end optimization with
'unreachable' insns.

This patch takes a different approach. The 'unreachable' insn is
lowered with special encoding in bpf object file and verifier
will do proper verification for the bpf prog. More specifically,
the 'unreachable' insn is replaced by a 'bpf_unreachable'
function. This function will be a kfunc (in ".ksyms" section)
with a weak attribute, but does not have definition. The function
is also present in prog btf. This way, the extern
'bpf_unreachable' can be handled properly in libbpf func
poison_kfunc_call().

I tested this patch on bpf selftests and all tests are passed.
I also tried original example in [2] and the error is properly
detected by verifier:

      func#0 @0
      last insn is not an exit or jmp

In another internal sched-ext bpf prog, with the patch we have bpf code:

      Disassembly of section .text:
      0000000000000000 <scx_storage_init_single>:
      ; {
           0:       bc 13 00 00 00 00 00 00 w3 = w1
           1:       b4 01 00 00 00 00 00 00 w1 = 0x0
      ;       const u32 zero = 0;
      ...
      0000000000003a80 <create_dom>:
      ; {
        1872:       bc 16 00 00 00 00 00 00 w6 = w1
      ;       bpf_printk("dom_id %d", dom_id);
        1873:       18 01 00 00 3f 00 00 00 00 00 00 00 00 00 00 00 r1 = 0x3f ll
                    0000000000003a88:  R_BPF_64_64  .rodata
        1875:       b4 02 00 00 0a 00 00 00 w2 = 0xa
        1876:       bc 63 00 00 00 00 00 00 w3 = w6
        1877:       85 00 00 00 06 00 00 00 call 0x6
      ;       ret = scx_bpf_create_dsq(dom_id, 0);
        1878:       bc 61 00 00 00 00 00 00 w1 = w6
        1879:       b4 02 00 00 00 00 00 00 w2 = 0x0
        1880:       85 10 00 00 ff ff ff ff call -0x1
                    0000000000003ac0:  R_BPF_64_32  scx_bpf_create_dsq
      ;       domc->node_cpumask = node_data[node_id];
        1881:       85 10 00 00 ff ff ff ff call -0x1
                    0000000000003ac8:  R_BPF_64_32  bpf_unreachable
        <END>

The verifier can easily report the error too.

A bpf flag -bpf-disable-trap-unreachable is introduced to disable
trapping unreachable.

[1] #126858
[2] https://github.com/msune/clang_bpf/blob/main/Makefile#L3

@yonghong-song
Copy link
Contributor Author

cc @anakryiko @jemarch

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The general approach here looks reasonable to me.

How does the resulting verifier error look like?

MachineFunction &MF = DAG.getMachineFunction();
Function &F = MF.getFunction();
if (F.hasFnAttribute(Attribute::Naked))
return Op;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably this should be part of the generic handling for TrapUnreachable. IMHO we should never insert extra code into a naked function, for other targets as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me. Let me check more and may have a patch for this later.

@@ -1622,6 +1627,25 @@ void BTFDebug::endModule() {
// Collect global types/variables except MapDef globals.
processGlobals(false);

// Create a BTF entry for func __unreachable_helper.
const Module *M = MMI->getModule();
Function *F = M->getFunction("__unreachable_helper");
Copy link

@anakryiko anakryiko Mar 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the approach of Clang emitting kfunc call, putting it in .ksyms section (which is just a libbpf convention, not some sort of gold standard or anything), and so on. I think the appropriate way to do this with would be a BPF instruction.
But if we have to go with kfunc approach, let's at least call it bpf_unreachable(). We can even implement this in the kernel (it would just WARN/panic), so libbpf successfully resolves it without unnecessary warnings on modern kernels. And BPF verifier would provide a meaningful error message if it detects reachable bpf_unreachable() call.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From llvm perspective, adding a new insn is easy, likes below:

diff --git a/llvm/lib/Target/BPF/BPFInstrInfo.td b/llvm/lib/Target/BPF/BPFInstrInfo.td
index 2dcf1eae086b..59bad209ea54 100644
--- a/llvm/lib/Target/BPF/BPFInstrInfo.td
+++ b/llvm/lib/Target/BPF/BPFInstrInfo.td
@@ -790,11 +790,25 @@ class RET<string OpcodeStr>
   let BPFClass = BPF_JMP;
 }
 
+class TRAP<string OpcodeStr>
+    : TYPE_ALU_JMP<BPF_EXIT.Value, BPF_K.Value,
+                   (outs),
+                   (ins),
+                   !strconcat(OpcodeStr, ""),
+                   []> {
+  let Inst{31-0} = 1;
+  let BPFClass = BPF_JMP;
+}
+
 let isReturn = 1, isTerminator = 1, hasDelaySlot=0, isBarrier = 1,
     isNotDuplicable = 1 in {
   def RET : RET<"exit">;
 }
 
+let isTrap = 1 in {
+  def UNREACHABLE : TRAP<"unreachable">;
+} // isTerminator = 1
+
 // ADJCALLSTACKDOWN/UP pseudo insns
 let Defs = [R11], Uses = [R11], isCodeGenOnly = 1 in {
 def ADJCALLSTACKDOWN : Pseudo<(outs), (ins i64imm:$amt1, i64imm:$amt2),

Different encoding is possible. I am chosing current kfunc approach since it minimizes the kernel change, spares one new insn, and can work on old kernel.

But I am open to the new insn approach or real bpf_unreachable() kfunc approach if we can reach a consensus.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While .ksyms is just a convention, I think that emitting kfunc calls from clang might be useful. E.g. currently memcpy, memset, memmove, memcmp intrinsics are unrolled, but there is no point in doing so.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While .ksyms is just a convention, I think that emitting kfunc calls from clang might be useful. E.g. currently memcpy, memset, memmove, memcmp intrinsics are unrolled, but there is no point in doing so.

This convention is pretty much a de-facto standard now. Several libraries support it.
It's not unusual for a compiler to use a particular section name like ".text" or ".rodata" for specific needs.
The name is target dependent. ".ksyms" is another special name.
So I think it's appropriate for a compiler to emit "call blah_unreachable" and "call blah_memcpy" with ".ksyms" section name. It's cleaner and more flexible than "call magic_const" or inventing a new trap-like insn.

@yonghong-song
Copy link
Contributor Author

yonghong-song commented Mar 18, 2025

How does the resulting verifier error look like?

The current verifier error looks like

  func#0 @0
  last insn is not an exit or jmp

This is due to that there is a logic in verifier to ensure the last insn in the func must be an 'exit' or 'jmp' insn. If this approach sounds reasonable, if __unreachable_helper is the last insn, the verifier log can be changed to e.g.,

    last insn is marked as unreachable, maybe due to uninitialized variable?

Such more explicit message will prompt users to check their auto variables.

Without this patch, the user will simply get error message

   last insn is not an exit or jmp

and users needs to dig into asm codes to find why the code is generated that way.

@eddyz87
Copy link
Contributor

eddyz87 commented Mar 18, 2025

Maybe consider adding a test case?
E.g. as below:

; RUN: llc -mtriple=bpfel -mcpu=v3 -filetype=obj -o %t1 %s
; RUN: llvm-objcopy --dump-section='.BTF'=%t2 %t1
; RUN: %python %p/print_btf.py %t2 | FileCheck -check-prefixes=CHECK-BTF %s
; RUN: llc -mtriple=bpfel -mcpu=v3 < %s | FileCheck -check-prefixes=CHECK %s

define void @foo() {
entry:
  tail call void @bar()
  unreachable
}

; CHECK:      foo:
; CHECK-NEXT: .Lfunc_begin0:
; CHECK-NEXT:   .cfi_startproc
; CHECK-NEXT: # %bb.0:
; CHECK-NEXT:    call bar
; CHECK-NEXT:    call __unreachable_helper
; CHECK-NEXT: .Lfunc_end0:

define void @buz() #0 {
entry:
  tail call void asm sideeffect "r0 = r1; exit;", ""()
  unreachable
}

; CHECK: buz:
; CHECK-NEXT: .Lfunc_begin1:
; CHECK-NEXT:   .cfi_startproc
; CHECK-NEXT: # %bb.0:
; CHECK-NEXT:   #APP
; CHECK-NEXT:   r0 = r1
; CHECK-NEXT:   exit
; CHECK-EMPTY:
; CHECK-NEXT:   #NO_APP
; CHECK-NEXT: .Lfunc_end1:

; CHECK-BTF: [1] FUNC_PROTO '(anon)' ret_type_id=0 vlen=0
; CHECK-BTF: [2] FUNC '__unreachable_helper' type_id=1 linkage=extern
; CHECK-BTF: [3] DATASEC '.ksyms' size=0 vlen=1
; CHECK-BTF:  type_id=2 offset=0 size=0

declare dso_local void @bar()

attributes #0 = { naked }

!llvm.dbg.cu = !{!0}
!llvm.module.flags = !{!2, !3}

!0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, emissionKind: FullDebug)
!1 = !DIFile(filename: "test.c", directory: "/some/dir")
!2 = !{i32 7, !"Dwarf Version", i32 5}
!3 = !{i32 2, !"Debug Info Version", i32 3}

Copy link
Contributor

@eddyz87 eddyz87 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall lgtm, modulo absence of tests.

Function *NewF = Function::Create(FT, GlobalValue::ExternalWeakLinkage,
"__unreachable_helper", M);
NewF->setDSOLocal(true);
NewF->setCallingConv(CallingConv::C);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we do not intend to execute this function, or any code after this function, maybe consider CallingConv::PreserveAll? Like with fastcall.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed from CallingConv::C to CallingConv::PreserveAll and the generated asm code is the same. So looks like either is fine. Since The Function NewF has 0 argument and return void, I probably will keep CallingConv::C.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difference might be visible only under heavy register pressure, as far as I understand.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me try some experiments on this with heavy register pressure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did some experiment with 10 local variables around the code, something like

<some other codes>
int a, b, c, d, e, f, g, h;
a = ip6->priority; b = ip6->version; c = ip6->payload_len; d = ip6->nexhdr;
e = ip6->hop_limit; f = ip6->flow_lbl[0]; g = ip6->flow_lbl[1]; h = ip6->lbl[2];
<some code which eventually becomes bpf_reachable>
bpf_printk("...", a, b, c, d, e, f, g, h);

During IR optimizations, at some point, it becomes

<some other codes>
int a, b, c, d, e, f, g, h;
a = ip6->priority; b = ip6->version; c = ip6->payload_len; d = ip6->nexhdr;
e = ip6->hop_limit; f = ip6->flow_lbl[0]; g = ip6->flow_lbl[1]; h = ip6->lbl[2];
unreachable

And further optimization it becomes

<some other codes>
unreachable

So in practice, we won't have heavy register pressure around 'unreachable' insn. So I think calling convention C should be okay.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be okay here. 5 stack slots are necessary due to if w7 != 0 goto LBB0_2 going to LBB0_2. I tested with both calling convention C and PreserveAll. The result is the same.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, my example was broken because of calls to bar() that needed stack spill, sorry about that.
If changed to something like this, there are no spills generated. But this makes me wonder how does it work:

$ llc -debug-only=isel -mtriple=bpfel -mcpu=v3 < test6.ll
...
*** MachineFunction at end of ISel ***
# Machine code for function foo: IsSSA, TracksLiveness
Function Live Ins: $w1 in %9, $r3 in %11

bb.0.entry:
  successors: %bb.2(0x80000000), %bb.1(0x00000000); %bb.2(100.00%), %bb.1(0.00%)
  liveins: $w1, $r3
  %11:gpr = COPY $r3
  %9:gpr32 = COPY $w1
  %0:gpr32 = LDW32 %11:gpr, 0 :: (volatile load (s32) from %ir.ap1)
  %1:gpr32 = LDW32 %11:gpr, 4 :: (volatile load (s32) from %ir.bp)
  %2:gpr32 = LDW32 %11:gpr, 8 :: (volatile load (s32) from %ir.cp)
  %3:gpr32 = LDW32 %11:gpr, 12 :: (volatile load (s32) from %ir.dp)
  %4:gpr32 = LDW32 %11:gpr, 16 :: (volatile load (s32) from %ir.ep)
  %5:gpr32 = LDW32 %11:gpr, 20 :: (volatile load (s32) from %ir.fp)
  %6:gpr32 = LDW32 %11:gpr, 24 :: (volatile load (s32) from %ir.gp)
  %7:gpr32 = LDW32 %11:gpr, 28 :: (volatile load (s32) from %ir.hp)
  %8:gpr32 = LDW32 %11:gpr, 32 :: (volatile load (s32) from %ir.kp)
  %12:gpr32 = AND_ri_32 %9:gpr32(tied-def 0), 1
  JNE_ri_32 killed %12:gpr32, 0, %bb.2
  JMP %bb.1

bb.1.flabel:
; predecessors: %bb.0

  ADJCALLSTACKDOWN 0, 0, implicit-def dead $r11, implicit $r11
  JAL &__unreachable_helper, <regmask $r6 $r7 $r8 $r9 $r10 $w6 $w7 $w8 $w9 $w10>, implicit $r11, implicit-def $r11
  ADJCALLSTACKUP 0, 0, implicit-def dead $r11, implicit $r11

bb.2.tlabel:
; predecessors: %bb.0

  %13:gpr32 = ADD_rr_32 %0:gpr32(tied-def 0), %1:gpr32
  %14:gpr32 = ADD_rr_32 %13:gpr32(tied-def 0), %2:gpr32
  %15:gpr32 = ADD_rr_32 %14:gpr32(tied-def 0), %3:gpr32
  %16:gpr32 = ADD_rr_32 %15:gpr32(tied-def 0), %4:gpr32
  %17:gpr32 = ADD_rr_32 %16:gpr32(tied-def 0), %5:gpr32
  %18:gpr32 = ADD_rr_32 %17:gpr32(tied-def 0), %6:gpr32
  %19:gpr32 = ADD_rr_32 %18:gpr32(tied-def 0), %7:gpr32
  %20:gpr32 = ADD_rr_32 %19:gpr32(tied-def 0), %8:gpr32
  ADJCALLSTACKDOWN 0, 0, implicit-def dead $r11, implicit $r11
  $w1 = COPY %20:gpr32
  JAL @buz, <regmask $r6 $r7 $r8 $r9 $r10 $w6 $w7 $w8 $w9 $w10>, implicit $r11, implicit $w1, implicit-def $r11
  ADJCALLSTACKUP 0, 0, implicit-def dead $r11, implicit $r11
  RET
...

Note that basic block bb.1.flabel does not have terminator after the __unreachable_helper call. Is it ok?
Is it necessary to mark to add CLI.DoesNotReturn = true in LowerTRAP?

Copy link
Contributor

@eddyz87 eddyz87 Mar 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

basic block bb.1.flabel does not have terminator after the __unreachable_helper call

Basing on the comments for TargetInstrInfo::analyzeBranch absence of terminators means fallthrough.
But if that is so, how come call to __unreachable_helper does not clobber r1-r5 in this example?
From llvm point of view there is a path bb.0.entry -> bb.1.flabel -> bb.2.tlabel, but register allocator somehow concludes that r1-r5 are intact after bb.1.flabel.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the real reason is that def-use chain will not be able to cross 'unreachable' insn, even after lowering. The optimization is really depending on data flow, so I think it is fine that we do not have CLI.DoesNotReturn = true for __unreachable_helper.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking again, maybe adding CLI.DoesNotReturn = true is a good idea, just in case.

fail(CLI.DL, DAG,
Twine("A call to built-in function '" + StringRef(E->getSymbol()) +
"' is not supported."));
if (strcmp(E->getSymbol(), "__unreachable_helper") != 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: maybe declare "__unreachable_helper" as a constant?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean a constant variable? This probably not a good idea as the func later is added to BTF. Or you mean a constant function? What does it mean for a constant function?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean something like static const char *UNREACHABLE_HELPER = "__unreachable_helper"; at the file level, or a #define ....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Yes, it is a good idea. Will do this.

@@ -1622,6 +1627,25 @@ void BTFDebug::endModule() {
// Collect global types/variables except MapDef globals.
processGlobals(false);

// Create a BTF entry for func __unreachable_helper.
const Module *M = MMI->getModule();
Function *F = M->getFunction("__unreachable_helper");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While .ksyms is just a convention, I think that emitting kfunc calls from clang might be useful. E.g. currently memcpy, memset, memmove, memcmp intrinsics are unrolled, but there is no point in doing so.

@yonghong-song
Copy link
Contributor Author

I will add some tests in the next revision.

Maybe consider adding a test case? E.g. as below:

; RUN: llc -mtriple=bpfel -mcpu=v3 -filetype=obj -o %t1 %s
; RUN: llvm-objcopy --dump-section='.BTF'=%t2 %t1
; RUN: %python %p/print_btf.py %t2 | FileCheck -check-prefixes=CHECK-BTF %s
; RUN: llc -mtriple=bpfel -mcpu=v3 < %s | FileCheck -check-prefixes=CHECK %s

define void @foo() {
entry:
  tail call void @bar()
  unreachable
}

; CHECK:      foo:
; CHECK-NEXT: .Lfunc_begin0:
; CHECK-NEXT:   .cfi_startproc
; CHECK-NEXT: # %bb.0:
; CHECK-NEXT:    call bar
; CHECK-NEXT:    call __unreachable_helper
; CHECK-NEXT: .Lfunc_end0:

define void @buz() #0 {
entry:
  tail call void asm sideeffect "r0 = r1; exit;", ""()
  unreachable
}

As suggested by @nikic, I will later have a patch to prevent 'unreachable' for naked functions for all architectures. So this test will not be included.

; CHECK: buz:
; CHECK-NEXT: .Lfunc_begin1:
; CHECK-NEXT: .cfi_startproc
; CHECK-NEXT: # %bb.0:
; CHECK-NEXT: #APP
; CHECK-NEXT: r0 = r1
; CHECK-NEXT: exit
; CHECK-EMPTY:
; CHECK-NEXT: #NO_APP
; CHECK-NEXT: .Lfunc_end1:

; CHECK-BTF: [1] FUNC_PROTO '(anon)' ret_type_id=0 vlen=0
; CHECK-BTF: [2] FUNC '__unreachable_helper' type_id=1 linkage=extern
; CHECK-BTF: [3] DATASEC '.ksyms' size=0 vlen=1
; CHECK-BTF: type_id=2 offset=0 size=0

declare dso_local void @bar()

attributes #0 = { naked }

!llvm.dbg.cu = !{!0}
!llvm.module.flags = !{!2, !3}

!0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, emissionKind: FullDebug)
!1 = !DIFile(filename: "test.c", directory: "/some/dir")
!2 = !{i32 7, !"Dwarf Version", i32 5}
!3 = !{i32 2, !"Debug Info Version", i32 3}

@eddyz87
Copy link
Contributor

eddyz87 commented Mar 19, 2025

define void @buz() #0 {
entry:
tail call void asm sideeffect "r0 = r1; exit;", ""()
unreachable
}

As suggested by @nikic, I will later have a patch to prevent 'unreachable' for naked functions for all architectures. So this test will not be included.

As far as I understand:

  • that would be a separate PR merged at some later point;
  • the code change in BPFDAGToDAGISel::Select is only necessary to handle naked functions.

Thus, I'd keep the test for now, just to have all code paths covered.

@yonghong-song
Copy link
Contributor Author

define void @buz() #0 {
entry:
tail call void asm sideeffect "r0 = r1; exit;", ""()
unreachable
}

As suggested by @nikic, I will later have a patch to prevent 'unreachable' for naked functions for all architectures. So this test will not be included.

As far as I understand:

* that would be a separate PR merged at some later point;

Yes.

* the code change in `BPFDAGToDAGISel::Select` is only necessary to handle naked functions.

Yes.

Thus, I'd keep the test for now, just to have all code paths covered.
Okay, will keep the test for now at least.

yonghong-song pushed a commit to yonghong-song/llvm-project that referenced this pull request Mar 20, 2025
In [1], Nikita Popov suggested that during lowering 'unreachable' insn
should not generate extra code for naked functions, and this applies
to all architectures. Note that for naked functions, 'unreachable'
insn is necessary in IR since the basic block needs a terminator to end.

This patch checked whether a function is naked function or not.
If it is a naked function, 'unreachable' insn will not generate
ISD::TRAP.

I put the RFC in the subject as I am not sure how to find a test
which can validate the change in FastISel.cpp. Any advice is welcome.

  [1] llvm#131731
yonghong-song pushed a commit to yonghong-song/llvm-project that referenced this pull request Mar 20, 2025
In [1], Nikita Popov suggested that during lowering 'unreachable' insn
should not generate extra code for naked functions, and this applies
to all architectures. Note that for naked functions, 'unreachable'
insn is necessary in IR since the basic block needs a terminator to end.

This patch checked whether a function is naked function or not.
If it is a naked function, 'unreachable' insn will not generate
ISD::TRAP.

I put the RFC in the subject as I am not sure how to find a test
which can validate the change in FastISel.cpp. Any advice is welcome.

  [1] llvm#131731
yonghong-song pushed a commit to yonghong-song/llvm-project that referenced this pull request Mar 20, 2025
In [1], Nikita Popov suggested that during lowering 'unreachable' insn
should not generate extra code for naked functions, and this applies
to all architectures. Note that for naked functions, 'unreachable'
insn is necessary in IR since the basic block needs a terminator to end.

This patch checked whether a function is naked function or not.
If it is a naked function, 'unreachable' insn will not generate
ISD::TRAP.

  [1] llvm#131731
yonghong-song pushed a commit to yonghong-song/llvm-project that referenced this pull request Mar 20, 2025
In [1], Nikita Popov suggested that during lowering 'unreachable' insn
should not generate extra code for naked functions, and this applies
to all architectures. Note that for naked functions, 'unreachable'
insn is necessary in IR since the basic block needs a terminator to end.

This patch checked whether a function is naked function or not.
If it is a naked function, 'unreachable' insn will not generate
ISD::TRAP.

  [1] llvm#131731
yonghong-song added a commit that referenced this pull request Mar 21, 2025
In [1], Nikita Popov suggested that during lowering 'unreachable' insn
should not generate extra code for naked functions, and this applies to
all architectures. Note that for naked functions, 'unreachable' insn is
necessary in IR since the basic block needs a terminator to end.

This patch checked whether a function is naked function or not. If it is
a naked function, 'unreachable' insn will not generate ISD::TRAP.

  [1] #131731

Co-authored-by: Yonghong Song <yonghong.song@linux.dev>
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Mar 21, 2025
…147)

In [1], Nikita Popov suggested that during lowering 'unreachable' insn
should not generate extra code for naked functions, and this applies to
all architectures. Note that for naked functions, 'unreachable' insn is
necessary in IR since the basic block needs a terminator to end.

This patch checked whether a function is naked function or not. If it is
a naked function, 'unreachable' insn will not generate ISD::TRAP.

  [1] llvm/llvm-project#131731

Co-authored-by: Yonghong Song <yonghong.song@linux.dev>
@yonghong-song yonghong-song force-pushed the handle-undef-unreachable-call branch 2 times, most recently from a882fcf to da34dcc Compare March 21, 2025 15:44
@yonghong-song yonghong-song changed the title [BPF] Handle unreachable with a unimplemented kfunc call [BPF] Handle unreachable with a kfunc call Mar 21, 2025
Copy link

github-actions bot commented Mar 21, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@yonghong-song yonghong-song force-pushed the handle-undef-unreachable-call branch from da34dcc to efb1d55 Compare March 21, 2025 15:55
Currently, middle-end generates 'unreachable' insn if the compiler
feels the code is indeed unreachable or the code becomes invalid
due to some optimizaiton (e.g. code optimization with uninitialized
variables).

Right now BPF backend ignores 'unreachable' insn during selectiondag
lowering. For cases where 'unreachable' is due to invalid code
transformation, such a signal will be missed. Later on, users needs
some effort to debug it which impacts developer productivity.

This patch enabled selectiondag lowering for 'unreachable' insn.

Previous attempt ([1]) tries to have a backend IR pass to filter
out 'unreachable' insns in a number of cases. But such pattern
matching may misalign with future middle-end optimization with
'unreachable' insns.

This patch takes a different approach. The 'unreachable' insn is
lowered with special encoding in bpf object file and verifier
will do proper verification for the bpf prog. More specifically,
the 'unreachable' insn is replaced by a 'bpf_unreachable'
function. This function will be a kfunc (in ".ksyms" section)
with a weak attribute, but does not have definition. The function
is also present in prog btf. This way, the extern
'bpf_unreachable' can be handled properly in libbpf func
poison_kfunc_call().

I tested this patch on bpf selftests and all tests are passed.
I also tried original example in [2] and the error is properly
detected by verifier:
  func#0 @0
  last insn is not an exit or jmp

In another internal sched-ext bpf prog, with the patch we have bpf code:
  Disassembly of section .text:
  0000000000000000 <scx_storage_init_single>:
  ; {
       0:       bc 13 00 00 00 00 00 00 w3 = w1
       1:       b4 01 00 00 00 00 00 00 w1 = 0x0
  ;       const u32 zero = 0;
  ...
  0000000000003a80 <create_dom>:
  ; {
    1872:       bc 16 00 00 00 00 00 00 w6 = w1
  ;       bpf_printk("dom_id %d", dom_id);
    1873:       18 01 00 00 3f 00 00 00 00 00 00 00 00 00 00 00 r1 = 0x3f ll
                0000000000003a88:  R_BPF_64_64  .rodata
    1875:       b4 02 00 00 0a 00 00 00 w2 = 0xa
    1876:       bc 63 00 00 00 00 00 00 w3 = w6
    1877:       85 00 00 00 06 00 00 00 call 0x6
  ;       ret = scx_bpf_create_dsq(dom_id, 0);
    1878:       bc 61 00 00 00 00 00 00 w1 = w6
    1879:       b4 02 00 00 00 00 00 00 w2 = 0x0
    1880:       85 10 00 00 ff ff ff ff call -0x1
                0000000000003ac0:  R_BPF_64_32  scx_bpf_create_dsq
  ;       domc->node_cpumask = node_data[node_id];
    1881:       85 10 00 00 ff ff ff ff call -0x1
                0000000000003ac8:  R_BPF_64_32  bpf_unreachable
    <END>
The verifier can easily report the error too.

A bpf flag `-bpf-disable-trap-unreachable` is introduced to disable
trapping unreachable.

  [1] llvm#126858
  [2] https://github.com/msune/clang_bpf/blob/main/Makefile#L3
@yonghong-song yonghong-song force-pushed the handle-undef-unreachable-call branch from efb1d55 to 7afaebe Compare March 21, 2025 17:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants