Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AArch64][BTI] Prevent Machine Scheduler from moving branch targets #68313

Merged

Conversation

atrosinenko
Copy link
Contributor

Moving instructions that are recognized as branch targets by BTI can result in runtime crash.

In outliner tests, replaced "BRK 1" with "HINT 0" (a.k.a. NOP) as a generic outlinable instruction.

Moving instructions that are recognized as branch targets by BTI can
result in runtime crash.

In outliner tests, replaced "BRK 1" with "HINT 0" (a.k.a. NOP) as a
generic outlinable instruction.
@llvmbot
Copy link
Collaborator

llvmbot commented Oct 5, 2023

@llvm/pr-subscribers-backend-aarch64

Changes

Moving instructions that are recognized as branch targets by BTI can result in runtime crash.

In outliner tests, replaced "BRK 1" with "HINT 0" (a.k.a. NOP) as a generic outlinable instruction.


Full diff: https://github.com/llvm/llvm-project/pull/68313.diff

5 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.cpp (+33-5)
  • (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.h (+3)
  • (modified) llvm/test/CodeGen/AArch64/machine-outliner-noreturn-no-stack.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/machine-outliner-noreturn-save-lr.mir (+4-4)
  • (added) llvm/test/CodeGen/AArch64/misched-branch-targets.mir (+195)
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
index 830bf77b4e5d05c..5a234bceb25ed0a 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
@@ -1133,6 +1133,11 @@ bool AArch64InstrInfo::isSchedulingBoundary(const MachineInstr &MI,
                                             const MachineFunction &MF) const {
   if (TargetInstrInfo::isSchedulingBoundary(MI, MBB, MF))
     return true;
+
+  // Do not move an instruction that can be recognized as a branch target.
+  if (hasBTISemantics(MI))
+    return true;
+
   switch (MI.getOpcode()) {
   case AArch64::HINT:
     // CSDB hints are scheduling barriers.
@@ -4081,6 +4086,32 @@ bool AArch64InstrInfo::isQForm(const MachineInstr &MI) {
   return llvm::any_of(MI.operands(), IsQFPR);
 }
 
+bool AArch64InstrInfo::hasBTISemantics(const MachineInstr &MI) {
+  switch (MI.getOpcode()) {
+  case AArch64::BRK:
+  case AArch64::HLT:
+  case AArch64::PACIASP:
+  case AArch64::PACIBSP:
+    // Implicit BTI behavior.
+    return true;
+  case AArch64::PAUTH_PROLOGUE:
+    // PAUTH_PROLOGUE expands to PACI(A|B)SP.
+    return true;
+  case AArch64::HINT: {
+    unsigned Imm = MI.getOperand(0).getImm();
+    // Explicit BTI instruction.
+    if (Imm == 32 || Imm == 34 || Imm == 36 || Imm == 38)
+      return true;
+    // PACI(A|B)SP instructions.
+    if (Imm == 25 || Imm == 27)
+      return true;
+    return false;
+  }
+  default:
+    return false;
+  }
+}
+
 bool AArch64InstrInfo::isFpOrNEON(const MachineInstr &MI) {
   auto IsFPR = [&](const MachineOperand &Op) {
     if (!Op.isReg())
@@ -8829,11 +8860,8 @@ AArch64InstrInfo::getOutliningTypeImpl(MachineBasicBlock::iterator &MIT,
 
   // Don't outline BTI instructions, because that will prevent the outlining
   // site from being indirectly callable.
-  if (MI.getOpcode() == AArch64::HINT) {
-    int64_t Imm = MI.getOperand(0).getImm();
-    if (Imm == 32 || Imm == 34 || Imm == 36 || Imm == 38)
-      return outliner::InstrType::Illegal;
-  }
+  if (hasBTISemantics(MI))
+    return outliner::InstrType::Illegal;
 
   return outliner::InstrType::Legal;
 }
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.h b/llvm/lib/Target/AArch64/AArch64InstrInfo.h
index f1a4928939bcd9e..f5874d7856f8d24 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.h
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.h
@@ -120,6 +120,9 @@ class AArch64InstrInfo final : public AArch64GenInstrInfo {
   /// Returns whether the instruction is in Q form (128 bit operands)
   static bool isQForm(const MachineInstr &MI);
 
+  /// Returns whether the instruction can be compatible with non-zero BTYPE.
+  static bool hasBTISemantics(const MachineInstr &MI);
+
   /// Returns the index for the immediate for a given instruction.
   static unsigned getLoadStoreImmIdx(unsigned Opc);
 
diff --git a/llvm/test/CodeGen/AArch64/machine-outliner-noreturn-no-stack.mir b/llvm/test/CodeGen/AArch64/machine-outliner-noreturn-no-stack.mir
index b13887a02ebd232..f926040f3932d71 100644
--- a/llvm/test/CodeGen/AArch64/machine-outliner-noreturn-no-stack.mir
+++ b/llvm/test/CodeGen/AArch64/machine-outliner-noreturn-no-stack.mir
@@ -34,7 +34,7 @@ body:             |
     $x0 = ORRXrs $xzr, $x1, 0
     $x1 = ORRXrs $xzr, $x2, 0
     BL @baz, implicit-def dead $lr, implicit $sp, implicit $x8, implicit $x0, implicit $x1, implicit $x3, implicit $x4, implicit-def $sp, implicit-def $x5, implicit-def $x6, implicit-def $x7, implicit-def $x8,  implicit-def $x9, implicit-def $x10,  implicit-def $x11, implicit-def $x12, implicit-def $x13,  implicit-def $x14, implicit-def $x15, implicit-def $x18
-    BRK 1
+    HINT 0
 ...
 ---
 name:            stack_2
@@ -56,7 +56,7 @@ body:             |
     $x0 = ORRXrs $xzr, $x1, 0
     $x1 = ORRXrs $xzr, $x2, 0
     BL @baz, implicit-def dead $lr, implicit $sp, implicit $x8, implicit $x0, implicit $x1, implicit $x3, implicit $x4, implicit-def $sp, implicit-def $x5, implicit-def $x6, implicit-def $x7, implicit-def $x8,  implicit-def $x9, implicit-def $x10,  implicit-def $x11, implicit-def $x12, implicit-def $x13,  implicit-def $x14, implicit-def $x15, implicit-def $x18
-    BRK 1
+    HINT 0
 ...
 ---
 name:            baz
diff --git a/llvm/test/CodeGen/AArch64/machine-outliner-noreturn-save-lr.mir b/llvm/test/CodeGen/AArch64/machine-outliner-noreturn-save-lr.mir
index 157cbbb51b5b782..c688f4370b0e9dd 100644
--- a/llvm/test/CodeGen/AArch64/machine-outliner-noreturn-save-lr.mir
+++ b/llvm/test/CodeGen/AArch64/machine-outliner-noreturn-save-lr.mir
@@ -31,7 +31,7 @@ body:             |
     ; CHECK: $lr = ORRXrs $xzr, $x0, 0
     $w3 = ORRWri $wzr, 1
     $w4 = ORRWri $wzr, 1
-    BRK 1
+    HINT 0
     $w5 = ORRWri $wzr, 1
     $w6 = ORRWri $wzr, 1
 ...
@@ -53,7 +53,7 @@ body:             |
     ; CHECK: $lr = ORRXrs $xzr, $x0, 0
     $w3 = ORRWri $wzr, 1
     $w4 = ORRWri $wzr, 1
-    BRK 1
+    HINT 0
     $w5 = ORRWri $wzr, 1
     $w6 = ORRWri $wzr, 1
 ...
@@ -75,7 +75,7 @@ body:             |
     ; CHECK: $lr = ORRXrs $xzr, $x0, 0
     $w3 = ORRWri $wzr, 1
     $w4 = ORRWri $wzr, 1
-    BRK 1
+    HINT 0
     $w5 = ORRWri $wzr, 1
     $w6 = ORRWri $wzr, 1
 ...
@@ -97,7 +97,7 @@ body:             |
     ; CHECK: $lr = ORRXrs $xzr, $x0, 0
     $w3 = ORRWri $wzr, 1
     $w4 = ORRWri $wzr, 1
-    BRK 1
+    HINT 0
     $w5 = ORRWri $wzr, 1
     $w6 = ORRWri $wzr, 1
 ...
diff --git a/llvm/test/CodeGen/AArch64/misched-branch-targets.mir b/llvm/test/CodeGen/AArch64/misched-branch-targets.mir
new file mode 100644
index 000000000000000..f32c1e964f97356
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/misched-branch-targets.mir
@@ -0,0 +1,195 @@
+# RUN: llc -o - -run-pass=machine-scheduler -misched=shuffle %s | FileCheck %s
+# RUN: llc -o - -run-pass=postmisched %s | FileCheck %s
+
+# Check that instructions that are recognized as branch targets by BTI
+# are not reordered by machine instruction schedulers.
+
+--- |
+  target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
+  target triple = "aarch64-unknown-linux-gnu"
+
+  define i32 @f_pac_pseudo(i32 %a, i32 %b, i32 %c) #0 "sign-return-address"="all" {
+  entry:
+    ret i32 0
+  }
+
+  define i32 @f_pac(i32 %a, i32 %b, i32 %c) #0 "sign-return-address"="all" {
+  entry:
+    ret i32 0
+  }
+
+  define i32 @f_bti(i32 %a, i32 %b, i32 %c) #0 {
+  entry:
+    ret i32 0
+  }
+
+  define i32 @f_brk(i32 %a, i32 %b, i32 %c) #0 {
+  entry:
+    ret i32 0
+  }
+
+  define i32 @f_hlt(i32 %a, i32 %b, i32 %c) #0 {
+  entry:
+    ret i32 0
+  }
+
+  define i32 @f_nop(i32 %a, i32 %b, i32 %c) #0 {
+  entry:
+    ret i32 0
+  }
+
+  attributes #0 = { nounwind memory(none) "target-features"="+v8.2a" }
+
+...
+---
+name:            f_pac_pseudo
+alignment:       4
+tracksRegLiveness: true
+body:             |
+  bb.0.entry:
+    liveins: $w0, $w1, $w2, $lr
+
+    frame-setup PAUTH_PROLOGUE implicit-def $lr, implicit $lr, implicit $sp
+    $w8 = ADDWrs $w0, $w1, 0
+    $w0 = MADDWrrr $w8, $w2, $wzr
+    RET undef $lr, implicit $w0
+
+# PAUTH_EPILOGUE instruction is omitted for simplicity as it is technically possible
+# to move it, so it may end up at a less obvious position in a basic block.
+
+# CHECK-LABEL: name: f_pac_pseudo
+# CHECK:       body:             |
+# CHECK-NEXT:    bb.0.entry:
+# CHECK-NEXT:      liveins: $w0, $w1, $w2, $lr
+#
+# CHECK:           frame-setup PAUTH_PROLOGUE implicit-def $lr, implicit {{.*}}$lr, implicit $sp
+# CHECK-NEXT:      $w8 = ADDWrs {{.*}}$w0, {{.*}}$w1, 0
+# CHECK-NEXT:      $w0 = MADDWrrr {{.*}}$w8, {{.*}}$w2, $wzr
+# CHECK-NEXT:      RET undef $lr, implicit {{.*}}$w0
+
+...
+---
+name:            f_pac
+alignment:       4
+tracksRegLiveness: true
+body:             |
+  bb.0.entry:
+    liveins: $w0, $w1, $w2, $lr
+
+    frame-setup PACIASP implicit-def $lr, implicit $lr, implicit $sp
+    $w8 = ADDWrs $w0, $w1, 0
+    $w0 = MADDWrrr $w8, $w2, $wzr
+    RET undef $lr, implicit $w0
+
+# AUTIASP is omitted, see above.
+
+# CHECK-LABEL: name: f_pac
+# CHECK:       body:             |
+# CHECK-NEXT:    bb.0.entry:
+# CHECK-NEXT:      liveins: $w0, $w1, $w2, $lr
+#
+# CHECK:           frame-setup PACIASP implicit-def $lr, implicit {{.*}}$lr, implicit $sp
+# CHECK-NEXT:      $w8 = ADDWrs {{.*}}$w0, {{.*}}$w1, 0
+# CHECK-NEXT:      $w0 = MADDWrrr {{.*}}$w8, {{.*}}$w2, $wzr
+# CHECK-NEXT:      RET undef $lr, implicit {{.*}}$w0
+
+...
+---
+name:            f_bti
+alignment:       4
+tracksRegLiveness: true
+body:             |
+  bb.0.entry:
+    liveins: $w0, $w1, $w2, $lr
+
+    HINT 34
+    $w8 = ADDWrs $w0, $w1, 0
+    $w0 = MADDWrrr $w8, $w2, $wzr
+    RET undef $lr, implicit $w0
+
+# CHECK-LABEL: name: f_bti
+# CHECK:       body:             |
+# CHECK-NEXT:    bb.0.entry:
+# CHECK-NEXT:      liveins: $w0, $w1, $w2, $lr
+#
+# CHECK:           HINT 34
+# CHECK-NEXT:      $w8 = ADDWrs {{.*}}$w0, {{.*}}$w1, 0
+# CHECK-NEXT:      $w0 = MADDWrrr {{.*}}$w8, {{.*}}$w2, $wzr
+# CHECK-NEXT:      RET undef $lr, implicit {{.*}}$w0
+
+...
+---
+name:            f_brk
+alignment:       4
+tracksRegLiveness: true
+body:             |
+  bb.0.entry:
+    liveins: $w0, $w1, $w2, $lr
+
+    BRK 1
+    $w8 = ADDWrs $w0, $w1, 0
+    $w0 = MADDWrrr $w8, $w2, $wzr
+    RET undef $lr, implicit $w0
+
+# CHECK-LABEL: name: f_brk
+# CHECK:       body:             |
+# CHECK-NEXT:    bb.0.entry:
+# CHECK-NEXT:      liveins: $w0, $w1, $w2, $lr
+#
+# CHECK:           BRK 1
+# CHECK-NEXT:      $w8 = ADDWrs {{.*}}$w0, {{.*}}$w1, 0
+# CHECK-NEXT:      $w0 = MADDWrrr {{.*}}$w8, {{.*}}$w2, $wzr
+# CHECK-NEXT:      RET undef $lr, implicit {{.*}}$w0
+
+...
+---
+name:            f_hlt
+alignment:       4
+tracksRegLiveness: true
+body:             |
+  bb.0.entry:
+    liveins: $w0, $w1, $w2, $lr
+
+    HLT 1
+    $w8 = ADDWrs $w0, $w1, 0
+    $w0 = MADDWrrr $w8, $w2, $wzr
+    RET undef $lr, implicit $w0
+
+# CHECK-LABEL: name: f_hlt
+# CHECK:       body:             |
+# CHECK-NEXT:    bb.0.entry:
+# CHECK-NEXT:      liveins: $w0, $w1, $w2, $lr
+#
+# CHECK:           HLT 1
+# CHECK-NEXT:      $w8 = ADDWrs {{.*}}$w0, {{.*}}$w1, 0
+# CHECK-NEXT:      $w0 = MADDWrrr {{.*}}$w8, {{.*}}$w2, $wzr
+# CHECK-NEXT:      RET undef $lr, implicit {{.*}}$w0
+
+...
+---
+name:            f_nop
+alignment:       4
+tracksRegLiveness: true
+body:             |
+  bb.0.entry:
+    liveins: $w0, $w1, $w2, $lr
+
+    HINT 0
+    $w8 = ADDWrs $w0, $w1, 0
+    $w0 = MADDWrrr $w8, $w2, $wzr
+    RET undef $lr, implicit $w0
+
+# Check that BTI-related instructions are left intact not because *anything*
+# is left intact.
+
+# CHECK-LABEL: name: f_nop
+# CHECK:       body:             |
+# CHECK-NEXT:    bb.0.entry:
+# CHECK-NEXT:      liveins: $w0, $w1, $w2, $lr
+#
+# CHECK:           $w8 = ADDWrs {{.*}}$w0, {{.*}}$w1, 0
+# CHECK-DAG:       $w0 = MADDWrrr {{.*}}$w8, {{.*}}$w2, $wzr
+# CHECK-DAG:       HINT 0
+# CHECK-NEXT:      RET undef $lr, implicit {{.*}}$w0
+
+...

Copy link
Collaborator

@momchil-velikov momchil-velikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@atrosinenko atrosinenko merged commit f1b2dd2 into llvm:main Oct 6, 2023
4 checks passed
@atrosinenko
Copy link
Contributor Author

CI failed for non-assertion build: https://lab.llvm.org/buildbot/#/builders/67/builds/12886 because -misched=shuffle requires assertions enabled. Fix implemented in #68424.

@atrosinenko atrosinenko deleted the aarch64-dont-move-branch-targets branch October 6, 2023 16:00
vhscampos added a commit to vhscampos/llvm-project that referenced this pull request Jan 23, 2024
…ndaries

Following llvm#68313 this patch
extends the idea to M-profile PACBTI.

The Machine Scheduler can reorder instructions within a scheduling
region depending on the scheduling policy set. If a BTI-clearing
instruction happens to partake in one such region, it might be moved
around, therefore ending up where it shouldn't.

The solution is to mark all BTI-clearing instructions as scheduling
region boundaries. This essentially means that they must not be part of
any scheduling region, and as consequence never get moved:

 - PAC
 - PACBTI
 - BTI
 - SG
 - CALL_BTI (pseudo-instruction for setjmp + bti)

Note that PAC isn't BTI-clearing, but it's replaced by PACBTI late in
the compilation pipeline.

As far as I know, currently it isn't possible to organically obtain code
that's susceptible to the bug:

 - Instructions that write to SP are region boundaries. PAC seems to
   always be followed by the pushing of r12 to the stack, so essentially
   PAC is always by itself in a scheduling region.
 - CALL_BTI is expanded into a machine instruction bundle. Bundles are
   unpacked only after the last machine scheduler run. Thus setjmp and
   BTI can be separated only if someone deliberately run the scheduler
   once more.
 - The BTI insertion pass is run late in the pipeline, only after the
   last machine scheduling has run. So once again it can be reordered
   only if someone deliberately runs the scheduler again.

Nevertheless, one can reasonably argue that we should prevent the bug in
spite of the compiler not being able to produce the required conditions
for it. If things change, the compiler will be robust against this
issue.

The tests written for this are contrived: bogus MIR instructions have
been added adjacent to the BTI-clearing instructions in order to have
them inside non-trivial scheduling regions.
vhscampos added a commit to vhscampos/llvm-project that referenced this pull request Jan 25, 2024
Following llvm#68313 this patch
extends the idea to M-profile PACBTI.

The Machine Scheduler can reorder instructions within a scheduling
region depending on the scheduling policy set. If a BTI-clearing
instruction happens to partake in one such region, it might be moved
around, therefore ending up where it shouldn't.

The solution is to mark all BTI-clearing instructions as scheduling
region boundaries. This essentially means that they must not be part of
any scheduling region, and as consequence never get moved:

 - PAC
 - PACBTI
 - BTI
 - SG
 - CALL_BTI (pseudo-instruction for setjmp + bti)

Note that PAC isn't BTI-clearing, but it's replaced by PACBTI late in
the compilation pipeline.

As far as I know, currently it isn't possible to organically obtain code
that's susceptible to the bug:

 - Instructions that write to SP are region boundaries. PAC seems to
   always be followed by the pushing of r12 to the stack, so essentially
   PAC is always by itself in a scheduling region.
 - CALL_BTI is expanded into a machine instruction bundle. Bundles are
   unpacked only after the last machine scheduler run. Thus setjmp and
   BTI can be separated only if someone deliberately runs the scheduler
   once more.
 - The BTI insertion pass is run late in the pipeline, only after the
   last machine scheduling has run. So once again it can be reordered
   only if someone deliberately runs the scheduler again.

Nevertheless, one can reasonably argue that we should prevent the bug in
spite of the compiler not being able to produce the required conditions
for it. If things change, the compiler will be robust against this
issue.

The tests written for this are contrived: bogus MIR instructions have
been added adjacent to the BTI-clearing instructions in order to have
them inside non-trivial scheduling regions.
vhscampos added a commit that referenced this pull request Apr 4, 2024
…ndaries (#79173)

Following #68313 this patch
extends the idea to M-profile PACBTI.

The Machine Scheduler can reorder instructions within a scheduling
region depending on the scheduling policy set. If a BTI-clearing
instruction happens to partake in one such region, it might be moved
around, therefore ending up where it shouldn't.

The solution is to mark all BTI-clearing instructions as scheduling
region boundaries. This essentially means that they must not be part of
any scheduling region, and as consequence never get moved:

 - PAC
 - PACBTI
 - BTI
 - SG

Note that PAC isn't BTI-clearing, but it's replaced by PACBTI late in
the compilation pipeline.

As far as I know, currently it isn't possible to organically obtain code
that's susceptible to the bug:

- Instructions that write to SP are region boundaries. PAC seems to
always be followed by the pushing of r12 to the stack, so essentially
PAC is always by itself in a scheduling region.
- CALL_BTI is expanded into a machine instruction bundle. Bundles are
unpacked only after the last machine scheduler run. Thus setjmp and BTI
can be separated only if someone deliberately run the scheduler once
more.
- The BTI insertion pass is run late in the pipeline, only after the
last machine scheduling has run. So once again it can be reordered only
if someone deliberately runs the scheduler again.

Nevertheless, one can reasonably argue that we should prevent the bug in
spite of the compiler not being able to produce the required conditions
for it. If things change, the compiler will be robust against this
issue.

The tests written for this are contrived: bogus MIR instructions have
been added adjacent to the BTI-clearing instructions in order to have
them inside non-trivial scheduling regions.
vhscampos added a commit to vhscampos/llvm-project that referenced this pull request Apr 8, 2024
…ndaries

Following llvm#68313 this patch
extends the idea to M-profile PACBTI.

The Machine Scheduler can reorder instructions within a scheduling
region depending on the scheduling policy set. If a BTI-clearing
instruction happens to partake in one such region, it might be moved
around, therefore ending up where it shouldn't.

The solution is to mark all BTI-clearing instructions as scheduling
region boundaries. This essentially means that they must not be part of
any scheduling region, and as consequence never get moved:

 - PAC
 - PACBTI
 - BTI
 - SG

Note that PAC isn't BTI-clearing, but it's replaced by PACBTI late in
the compilation pipeline.

As far as I know, currently it isn't possible to organically obtain code
that's susceptible to the bug:

- Instructions that write to SP are region boundaries. PAC seems to
always be followed by the pushing of r12 to the stack, so essentially
PAC is always by itself in a scheduling region.
- CALL_BTI is expanded into a machine instruction bundle. Bundles are
unpacked only after the last machine scheduler run. Thus setjmp and BTI
can be separated only if someone deliberately run the scheduler once
more.
- The BTI insertion pass is run late in the pipeline, only after the
last machine scheduling has run. So once again it can be reordered only
if someone deliberately runs the scheduler again.

Nevertheless, one can reasonably argue that we should prevent the bug in
spite of the compiler not being able to produce the required conditions
for it. If things change, the compiler will be robust against this
issue.

The tests written for this are contrived: bogus MIR instructions have
been added adjacent to the BTI-clearing instructions in order to have
them inside non-trivial scheduling regions.
vhscampos added a commit that referenced this pull request Apr 15, 2024
…ndaries (#87982)

Following #68313 this patch
extends the idea to M-profile PACBTI.

The Machine Scheduler can reorder instructions within a scheduling
region depending on the scheduling policy set. If a BTI-clearing
instruction happens to partake in one such region, it might be moved
around, therefore ending up where it shouldn't.

The solution is to mark all BTI-clearing instructions as scheduling
region boundaries. This essentially means that they must not be part of
any scheduling region, and as consequence never get moved:

 - PAC
 - PACBTI
 - BTI
 - SG

Note that PAC isn't BTI-clearing, but it's replaced by PACBTI late in
the compilation pipeline.

As far as I know, currently it isn't possible to organically obtain code
that's susceptible to the bug:

- Instructions that write to SP are region boundaries. PAC seems to
always be followed by the pushing of r12 to the stack, so essentially
PAC is always by itself in a scheduling region.
- CALL_BTI is expanded into a machine instruction bundle. Bundles are
unpacked only after the last machine scheduler run. Thus setjmp and BTI
can be separated only if someone deliberately run the scheduler once
more.
- The BTI insertion pass is run late in the pipeline, only after the
last machine scheduling has run. So once again it can be reordered only
if someone deliberately runs the scheduler again.

Nevertheless, one can reasonably argue that we should prevent the bug in
spite of the compiler not being able to produce the required conditions
for it. If things change, the compiler will be robust against this
issue.

The tests written for this are contrived: bogus MIR instructions have
been added adjacent to the BTI-clearing instructions in order to have
them inside non-trivial scheduling regions.
bazuzi pushed a commit to bazuzi/llvm-project that referenced this pull request Apr 15, 2024
…ndaries (llvm#87982)

Following llvm#68313 this patch
extends the idea to M-profile PACBTI.

The Machine Scheduler can reorder instructions within a scheduling
region depending on the scheduling policy set. If a BTI-clearing
instruction happens to partake in one such region, it might be moved
around, therefore ending up where it shouldn't.

The solution is to mark all BTI-clearing instructions as scheduling
region boundaries. This essentially means that they must not be part of
any scheduling region, and as consequence never get moved:

 - PAC
 - PACBTI
 - BTI
 - SG

Note that PAC isn't BTI-clearing, but it's replaced by PACBTI late in
the compilation pipeline.

As far as I know, currently it isn't possible to organically obtain code
that's susceptible to the bug:

- Instructions that write to SP are region boundaries. PAC seems to
always be followed by the pushing of r12 to the stack, so essentially
PAC is always by itself in a scheduling region.
- CALL_BTI is expanded into a machine instruction bundle. Bundles are
unpacked only after the last machine scheduler run. Thus setjmp and BTI
can be separated only if someone deliberately run the scheduler once
more.
- The BTI insertion pass is run late in the pipeline, only after the
last machine scheduling has run. So once again it can be reordered only
if someone deliberately runs the scheduler again.

Nevertheless, one can reasonably argue that we should prevent the bug in
spite of the compiler not being able to produce the required conditions
for it. If things change, the compiler will be robust against this
issue.

The tests written for this are contrived: bogus MIR instructions have
been added adjacent to the BTI-clearing instructions in order to have
them inside non-trivial scheduling regions.
aniplcc pushed a commit to aniplcc/llvm-project that referenced this pull request Apr 15, 2024
…ndaries (llvm#87982)

Following llvm#68313 this patch
extends the idea to M-profile PACBTI.

The Machine Scheduler can reorder instructions within a scheduling
region depending on the scheduling policy set. If a BTI-clearing
instruction happens to partake in one such region, it might be moved
around, therefore ending up where it shouldn't.

The solution is to mark all BTI-clearing instructions as scheduling
region boundaries. This essentially means that they must not be part of
any scheduling region, and as consequence never get moved:

 - PAC
 - PACBTI
 - BTI
 - SG

Note that PAC isn't BTI-clearing, but it's replaced by PACBTI late in
the compilation pipeline.

As far as I know, currently it isn't possible to organically obtain code
that's susceptible to the bug:

- Instructions that write to SP are region boundaries. PAC seems to
always be followed by the pushing of r12 to the stack, so essentially
PAC is always by itself in a scheduling region.
- CALL_BTI is expanded into a machine instruction bundle. Bundles are
unpacked only after the last machine scheduler run. Thus setjmp and BTI
can be separated only if someone deliberately run the scheduler once
more.
- The BTI insertion pass is run late in the pipeline, only after the
last machine scheduling has run. So once again it can be reordered only
if someone deliberately runs the scheduler again.

Nevertheless, one can reasonably argue that we should prevent the bug in
spite of the compiler not being able to produce the required conditions
for it. If things change, the compiler will be robust against this
issue.

The tests written for this are contrived: bogus MIR instructions have
been added adjacent to the BTI-clearing instructions in order to have
them inside non-trivial scheduling regions.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants