
[hwasan] Optimize outlined memaccess for fixed shadow on Aarch64 #88544

Merged: thurstond merged 4 commits into llvm:main from the memaccess_fixedshadow branch on Apr 24, 2024

Conversation

thurstond (Contributor):

The HWASan transform currently always uses x20 to pass the shadow base to hwasan_check_memaccess_shortgranules, even if the shadow base is a constant known at compile time (via -hwasan-mapping-offset). This patch introduces a fixed shadow variant of the hwasan_check_memaccess_shortgranules intrinsic, allowing the shadow base to be materialized inside the memaccess callee.

We currently only support this optimization for AArch64. It is a no-op on other platforms, or if -hwasan-mapping-offset is not specified.

Note: when -hwasan-mapping-offset is specified, it is necessary to specify HWASAN_OPTIONS=fixed_shadow_base=... (see ea991a1) to ensure that the runtime will map the shadow appropriately.
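For concreteness, here is one way the optimization could be exercised end to end. The flag names are taken from this PR and ea991a1; the shadow base value, the file name, and the test program itself are hypothetical, a sketch rather than a supported recipe:

  // overflow.cpp -- hypothetical example. 4398046511104 is 1024 << 32 (4TB),
  // which fits the intrinsic's 16-bit-shifted-by-32 encoding.
  //
  // Build (shadow base fixed at compile time):
  //   clang++ -fsanitize=hwaddress -mllvm -hwasan-mapping-offset=4398046511104 overflow.cpp
  // Run (the runtime must map the shadow at the same base):
  //   HWASAN_OPTIONS=fixed_shadow_base=4398046511104 ./a.out
  #include <cstdlib>

  int main() {
    char *p = static_cast<char *>(malloc(16));
    p[16] = 42; // out-of-bounds write that the outlined check should report
    free(p);
    return 0;
  }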


llvmbot commented Apr 12, 2024

@llvm/pr-subscribers-compiler-rt-sanitizer
@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-backend-aarch64

Author: Thurston Dang (thurstond)


Full diff: https://github.com/llvm/llvm-project/pull/88544.diff

4 Files Affected:

  • (modified) llvm/include/llvm/IR/Intrinsics.td (+5)
  • (modified) llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp (+40-9)
  • (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.td (+7)
  • (modified) llvm/lib/Transforms/Instrumentation/HWAddressSanitizer.cpp (+23-5)
diff --git a/llvm/include/llvm/IR/Intrinsics.td b/llvm/include/llvm/IR/Intrinsics.td
index f0723a633f0fc5..dee12d15c69381 100644
--- a/llvm/include/llvm/IR/Intrinsics.td
+++ b/llvm/include/llvm/IR/Intrinsics.td
@@ -2367,6 +2367,10 @@ def int_hwasan_check_memaccess :
 def int_hwasan_check_memaccess_shortgranules :
   Intrinsic<[], [llvm_ptr_ty, llvm_ptr_ty, llvm_i32_ty],
             [ImmArg<ArgIndex<2>>]>;
+def int_hwasan_check_memaccess_fixedshadow_shortgranules :
+  Intrinsic<[], [llvm_ptr_ty, llvm_i32_ty, llvm_i64_ty],
+            [ImmArg<ArgIndex<1>>, ImmArg<ArgIndex<2>>]>;
+
 
 // Xray intrinsics
 //===----------------------------------------------------------------------===//
@@ -2666,4 +2670,5 @@ include "llvm/IR/IntrinsicsVE.td"
 include "llvm/IR/IntrinsicsDirectX.td"
 include "llvm/IR/IntrinsicsLoongArch.td"
 
+
 #endif // TEST_INTRINSICS_SUPPRESS_DEFS
diff --git a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
index f6ccd0ecfdc893..5ab54d37f30686 100644
--- a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
+++ b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
@@ -117,6 +117,7 @@ class AArch64AsmPrinter : public AsmPrinter {
   void LowerPATCHABLE_EVENT_CALL(const MachineInstr &MI, bool Typed);
 
   typedef std::tuple<unsigned, bool, uint32_t> HwasanMemaccessTuple;
+  unsigned long long HwasanFixedShadowBase = (unsigned long long)-1;
   std::map<HwasanMemaccessTuple, MCSymbol *> HwasanMemaccessSymbols;
   void LowerKCFI_CHECK(const MachineInstr &MI);
   void LowerHWASAN_CHECK_MEMACCESS(const MachineInstr &MI);
@@ -551,8 +552,17 @@ void AArch64AsmPrinter::LowerKCFI_CHECK(const MachineInstr &MI) {
 void AArch64AsmPrinter::LowerHWASAN_CHECK_MEMACCESS(const MachineInstr &MI) {
   Register Reg = MI.getOperand(0).getReg();
   bool IsShort =
-      MI.getOpcode() == AArch64::HWASAN_CHECK_MEMACCESS_SHORTGRANULES;
+      ((MI.getOpcode() == AArch64::HWASAN_CHECK_MEMACCESS_SHORTGRANULES) ||
+       (MI.getOpcode() == AArch64::HWASAN_CHECK_MEMACCESS_FIXEDSHADOW_SHORTGRANULES));
   uint32_t AccessInfo = MI.getOperand(1).getImm();
+
+  if (MI.getOpcode() == AArch64::HWASAN_CHECK_MEMACCESS_FIXEDSHADOW_SHORTGRANULES) {
+      if (HwasanFixedShadowBase != (unsigned long long)-1)
+        assert(HwasanFixedShadowBase == (unsigned long long) MI.getOperand(2).getImm());
+
+      HwasanFixedShadowBase = MI.getOperand(2).getImm();
+  }
+
   MCSymbol *&Sym =
       HwasanMemaccessSymbols[HwasanMemaccessTuple(Reg, IsShort, AccessInfo)];
   if (!Sym) {
@@ -625,14 +635,34 @@ void AArch64AsmPrinter::emitHwasanMemaccessSymbols(Module &M) {
                                      .addImm(4)
                                      .addImm(55),
                                  *STI);
-    OutStreamer->emitInstruction(
-        MCInstBuilder(AArch64::LDRBBroX)
-            .addReg(AArch64::W16)
-            .addReg(IsShort ? AArch64::X20 : AArch64::X9)
-            .addReg(AArch64::X16)
-            .addImm(0)
-            .addImm(0),
-        *STI);
+
+    if (HwasanFixedShadowBase != (unsigned long long)-1) {
+      assert(IsShort);
+      OutStreamer->emitInstruction(
+          MCInstBuilder(AArch64::MOVZXi)
+              .addReg(AArch64::X17)
+              .addImm(HwasanFixedShadowBase >> 32)
+              .addImm(32),
+          *STI);
+      OutStreamer->emitInstruction(
+          MCInstBuilder(AArch64::LDRBBroX)
+              .addReg(AArch64::W16)
+              .addReg(AArch64::X17)
+              .addReg(AArch64::X16)
+              .addImm(0)
+              .addImm(0),
+          *STI);
+    } else {
+      OutStreamer->emitInstruction(
+          MCInstBuilder(AArch64::LDRBBroX)
+              .addReg(AArch64::W16)
+              .addReg(IsShort ? AArch64::X20 : AArch64::X9)
+              .addReg(AArch64::X16)
+              .addImm(0)
+              .addImm(0),
+          *STI);
+    }
+
     OutStreamer->emitInstruction(
         MCInstBuilder(AArch64::SUBSXrs)
             .addReg(AArch64::XZR)
@@ -1765,6 +1795,7 @@ void AArch64AsmPrinter::emitInstruction(const MachineInstr *MI) {
 
   case AArch64::HWASAN_CHECK_MEMACCESS:
   case AArch64::HWASAN_CHECK_MEMACCESS_SHORTGRANULES:
+  case AArch64::HWASAN_CHECK_MEMACCESS_FIXEDSHADOW_SHORTGRANULES:
     LowerHWASAN_CHECK_MEMACCESS(*MI);
     return;
 
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.td b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
index e1624f70185e1e..20f1a40263f824 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
@@ -1818,6 +1818,13 @@ def HWASAN_CHECK_MEMACCESS_SHORTGRANULES : Pseudo<
   Sched<[]>;
 }
 
+let Defs = [ X16, X17, LR, NZCV ] in {
+def HWASAN_CHECK_MEMACCESS_FIXEDSHADOW_SHORTGRANULES : Pseudo<
+  (outs), (ins GPR64noip:$ptr, i32imm:$accessinfo, i64imm:$shadow_base),
+  [(int_hwasan_check_memaccess_fixedshadow_shortgranules GPR64noip:$ptr, (i32 timm:$accessinfo), (i64 timm:$shadow_base))]>,
+  Sched<[]>;
+}
+
 // The virtual cycle counter register is CNTVCT_EL0.
 def : Pat<(readcyclecounter), (MRS 0xdf02)>;
 
diff --git a/llvm/lib/Transforms/Instrumentation/HWAddressSanitizer.cpp b/llvm/lib/Transforms/Instrumentation/HWAddressSanitizer.cpp
index 82f60f4feed454..41c492a0000f17 100644
--- a/llvm/lib/Transforms/Instrumentation/HWAddressSanitizer.cpp
+++ b/llvm/lib/Transforms/Instrumentation/HWAddressSanitizer.cpp
@@ -929,11 +929,29 @@ void HWAddressSanitizer::instrumentMemAccessOutline(Value *Ptr, bool IsWrite,
 
   IRBuilder<> IRB(InsertBefore);
   Module *M = IRB.GetInsertBlock()->getParent()->getParent();
-  IRB.CreateCall(Intrinsic::getDeclaration(
-                     M, UseShortGranules
-                            ? Intrinsic::hwasan_check_memaccess_shortgranules
-                            : Intrinsic::hwasan_check_memaccess),
-                 {ShadowBase, Ptr, ConstantInt::get(Int32Ty, AccessInfo)});
+
+  // Aarch64 makes it difficult to embed large constants (such as the shadow
+  // offset) in the code. Our intrinsic supports a 16-bit constant (to be left
+  // shifted by 32 bits) - this is not an onerous constraint, since
+  // 1) kShadowBaseAlignment == 32 2) an offset of 4TB (1024 << 32) is
+  // representable, and ought to be good enough for anybody.
+  bool useFixedShadowIntrinsic = true;
+  if (TargetTriple.isAArch64() && ClMappingOffset.getNumOccurrences() > 0 && UseShortGranules) {
+    uint16_t offset_shifted = Mapping.Offset >> 32;
+    if ((uint64_t)offset_shifted << 32 != Mapping.Offset)
+        useFixedShadowIntrinsic = false;
+  } else
+    useFixedShadowIntrinsic = false;
+
+  if (useFixedShadowIntrinsic)
+    IRB.CreateCall(Intrinsic::getDeclaration(
+                       M, Intrinsic::hwasan_check_memaccess_fixedshadow_shortgranules),
+                   {Ptr, ConstantInt::get(Int32Ty, AccessInfo), ConstantInt::get(Int64Ty, Mapping.Offset)});
+  else
+    IRB.CreateCall(Intrinsic::getDeclaration(
+                       M, UseShortGranules ? Intrinsic::hwasan_check_memaccess_shortgranules
+                                           : Intrinsic::hwasan_check_memaccess),
+                   {ShadowBase, Ptr, ConstantInt::get(Int32Ty, AccessInfo)});
 }
 
 void HWAddressSanitizer::instrumentMemAccessInline(Value *Ptr, bool IsWrite,
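The encoding assumption behind the MOVZXi emitted above can be checked in isolation. Below is a standalone sketch (not part of the patch; the movz helper is a model of the instruction, not an LLVM API) of what MOVZ Xd, #imm16, LSL #32 computes, and why one instruction suffices for any shadow base of the form imm16 << 32:

  // movz_check.cpp -- standalone sketch, not part of the patch.
  #include <cassert>
  #include <cstdint>

  // MOVZ Xd, #imm16, LSL #shift zeroes Xd and inserts imm16 at the shift.
  uint64_t movz(uint16_t Imm16, unsigned Shift) {
    return static_cast<uint64_t>(Imm16) << Shift;
  }

  int main() {
    const uint64_t ShadowBase = 1024ULL << 32; // 4TB, per the patch comment
    // The AsmPrinter materializes the base as MOVZXi(X17, Base >> 32, 32):
    assert(movz(ShadowBase >> 32, 32) == ShadowBase);
    // A base with nonzero low 32 bits would not survive the encoding, which
    // is exactly what the HWASan pass rules out before using the intrinsic.
    const uint64_t Unencodable = (1024ULL << 32) | 1;
    assert(movz(Unencodable >> 32, 32) != Unencodable);
    return 0;
  }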


llvmbot commented Apr 12, 2024

@llvm/pr-subscribers-llvm-transforms



github-actions bot commented Apr 12, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@@ -117,6 +117,7 @@ class AArch64AsmPrinter : public AsmPrinter {
void LowerPATCHABLE_EVENT_CALL(const MachineInstr &MI, bool Typed);

typedef std::tuple<unsigned, bool, uint32_t> HwasanMemaccessTuple;
unsigned long long HwasanFixedShadowBase = (unsigned long long)-1;
Contributor:

I would just use an optional

Do I understand correctly that this is just to make sure the offsets of all the HWASAN_CHECK_MEMACCESS_FIXEDSHADOW_SHORTGRANULES checks are the same?

thurstond (Contributor Author) commented Apr 12, 2024:

> I would just use an optional

Will do that

> Do I understand correctly that this is just to make sure the offsets of all the HWASAN_CHECK_MEMACCESS_FIXEDSHADOW_SHORTGRANULES checks are the same?

Yes.
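A minimal sketch of the std::optional version, given that (as confirmed above) the field exists only to check that all fixed-shadow checks agree on one base. The names mirror the patch, but the enclosing AsmPrinter class is elided and recordFixedShadowBase is a hypothetical helper:

  #include <cassert>
  #include <cstdint>
  #include <optional>

  // Empty means no fixed shadow base has been seen yet (replaces the
  // (unsigned long long)-1 sentinel).
  std::optional<uint64_t> HwasanFixedShadowBase;

  void recordFixedShadowBase(uint64_t Imm) {
    // Every HWASAN_CHECK_MEMACCESS_FIXEDSHADOW_SHORTGRANULES in the module
    // must use the same base, since one outlined symbol serves them all.
    if (HwasanFixedShadowBase)
      assert(*HwasanFixedShadowBase == Imm);
    HwasanFixedShadowBase = Imm;
  }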

// shifted by 32 bits) - this is not an onerous constraint, since
// 1) kShadowBaseAlignment == 32 2) an offset of 4TB (1024 << 32) is
// representable, and ought to be good enough for anybody.
bool useFixedShadowIntrinsic = true;
Contributor:

= false and drop the else branch (it's cleaner dot gif)

also can make it useFixedShadowIntrinsic = (uint64_t)offset_shifted << 32 == Mapping.Offset
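Putting both suggestions together, the selection logic might look like this sketch; the function wrapper and its parameters are stand-ins for the pass's TargetTriple, ClMappingOffset, UseShortGranules, and Mapping.Offset:

  #include <cstdint>

  // Sketch of the suggested cleanup: initialize to false, drop the else
  // branch, and assign the round-trip comparison directly.
  bool shouldUseFixedShadowIntrinsic(bool IsAArch64, bool HasMappingOffset,
                                     bool UseShortGranules, uint64_t Offset) {
    bool UseFixedShadowIntrinsic = false;
    if (IsAArch64 && HasMappingOffset && UseShortGranules) {
      uint16_t OffsetShifted = Offset >> 32;
      UseFixedShadowIntrinsic =
          static_cast<uint64_t>(OffsetShifted) << 32 == Offset;
    }
    return UseFixedShadowIntrinsic;
  }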

thurstond (Contributor Author):

Will do that

*STI);

if (HwasanFixedShadowBase != (unsigned long long)-1) {
assert(IsShort);
Contributor:

This is just because we didn't bother to implement this for non-short at this point, right?

thurstond (Contributor Author):

Yep. Non-short is only used by Android 10 and earlier; implementing fixed shadow optimization for that is left as an exercise for the reader.

  // Older versions of Android do not have the required runtime support for
  // short granules, global or personality function instrumentation. On other
  // platforms we currently require using the latest version of the runtime.
  bool NewRuntime =
      !TargetTriple.isAndroid() || !TargetTriple.isAndroidVersionLT(30);

  UseShortGranules = optOr(ClUseShortGranules, NewRuntime);

(llvm-project/llvm/lib/Transforms/Instrumentation/HWAddressSanitizer.cpp)

Contributor:

(Implicit request: add this as a comment :))

thurstond (Contributor Author):

Added comment (sans Polestar sponsored message)

Collaborator:

At this stage it's better to keep this simple, without assumptions about Android. We have the intrinsic, so just lower it; it's up to the HWASan pass to generate the right code.

Contributor:

The point of this comment is to explain why we don't support non-short.

Collaborator:

We should support it, instead of explaining why we don't.

And we may want non-short granules for SelHWASAN.

vitalybuka (Collaborator) left a comment:

we need 2 tests:

  1. llvm/lib/Transforms/Instrumentation/HWAddressSanitizer.cpp
  2. llvm/lib/Target/AArch64/

If it's not an update to an existing test, please precommit the current behavior in a separate patch.

I'd recommend moving the HWAddressSanitizer.cpp change and its test into a second patch; we can introduce the intrinsic and lowering in the first patch.

I don't insist on splitting, though.

thurstond (Contributor Author):

> we need 2 tests:
>
>   1. llvm/lib/Transforms/Instrumentation/HWAddressSanitizer.cpp
>   2. llvm/lib/Target/AArch64/
>
> If it's not an update to an existing test, please precommit the current behavior in a separate patch.
>
> I'd recommend moving the HWAddressSanitizer.cpp change and its test into a second patch; we can introduce the intrinsic and lowering in the first patch.
>
> I don't insist on splitting, though.

Coming soon!

thurstond marked this pull request as draft on April 12, 2024 at 21:05
thurstond added a commit to thurstond/llvm-project that referenced this pull request Apr 18, 2024
This separates out the first half of "Optimize outlined memaccess for fixed shadow on Aarch64" (llvm#88544).
This patch does not meaningfully affect the behavior of HWASan, since the second half of that patch has the changes
that will make HWASan use these intrinsics.

This patch introduces HWASan memaccess intrinsics that assume a fixed shadow
(with the offset provided by --hwasan-mapping-offset=...), with and without
short granule support.

We currently only support lowering the LLVM IR intrinsic to AArch64.

The test case is adapted from hwasan-check-memaccess.ll.
thurstond added a commit that referenced this pull request Apr 22, 2024
This patch introduces HWASan memaccess intrinsics that assume a fixed
shadow (with the offset provided by --hwasan-mapping-offset=...), with
and without short granule support.

The behavior of HWASan is not meaningfully changed by this patch;
future work ("Optimize outlined memaccess for
fixed shadow on Aarch64": #88544) will make HWASan use these intrinsics.

We currently only support lowering the LLVM IR intrinsic to AArch64.

The test case is adapted from hwasan-check-memaccess.ll.
thurstond force-pushed the memaccess_fixedshadow branch 3 times, most recently from 3c8ab87 to ea9ae97 on April 23, 2024 at 00:44
thurstond (Contributor Author):

Rebased on top of the memaccess intrinsics patch

The test case for this CL already exists in llvm-project/compiler-rt/test/hwasan/TestCases/Linux/fixed-shadow.c

// representable.
// In particular, an offset of 4TB (1024 << 32) is representable, and
// ought to be good enough for anybody.
if (TargetTriple.isAArch64() && ClMappingOffset.getNumOccurrences() > 0) {
Collaborator:

Can we add && Mapping.Offset != kDynamicShadowSentinel, so that Mapping becomes the source of truth after ShadowMapping::init?

thurstond (Contributor Author):

Updated in cdbecc6
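For reference, a sketch of the suggested guard; kDynamicShadowSentinel is the pass's existing marker for a non-fixed offset (assumed here to be the all-ones value), and the free-function wrapper is hypothetical:

  #include <cstdint>
  #include <limits>

  // After ShadowMapping::init, Mapping.Offset is the source of truth: any
  // value other than the sentinel means the shadow base is fixed.
  constexpr uint64_t kDynamicShadowSentinel =
      std::numeric_limits<uint64_t>::max();

  bool hasFixedShadowBase(bool IsAArch64, uint64_t MappingOffset) {
    return IsAArch64 && MappingOffset != kDynamicShadowSentinel;
  }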

thurstond added a commit to thurstond/llvm-project that referenced this pull request Apr 23, 2024
This test records the current behavior of HWASan, which doesn't
utilize the fixed shadow intrinsics of llvm@365bddf

It is intended to be updated in future work ("Optimize outlined memaccess for fixed shadow on Aarch64"; llvm#88544)
thurstond added a commit that referenced this pull request Apr 23, 2024
This test records the current behavior of HWASan, which doesn't utilize
the fixed shadow intrinsics of
365bddf

It is intended to be updated in future work ("Optimize outlined
memaccess for fixed shadow on Aarch64";
#88544)
thurstond marked this pull request as ready for review on April 23, 2024 at 20:37
The HWASan transform currently always uses x20 to pass the shadow base to hwasan_check_memaccess_shortgranules, even
if the shadow base is a constant known at compile time (via -hwasan-mapping-offset). This patch uses the fixed
shadow variant of the hwasan_check_memaccess_shortgranules intrinsic (introduced in
llvm@365bddf), allowing the shadow base to
be materialized inside the memaccess callee.

We currently only support this optimization for AArch64. It is a no-op on other platforms, or if -hwasan-mapping-offset is not specified.

Note: when -hwasan-mapping-offset is specified, it is necessary to specify HWASAN_OPTIONS=fixed_shadow_base=... (see ea991a1) to ensure that the runtime will map the shadow appropriately.
thurstond merged commit 744d469 into llvm:main on Apr 24, 2024
3 of 4 checks passed