
[msan] Unpoison indirect outputs for userspace using memset for large operands #79924

Conversation

@MaskRay (Member) commented Jan 30, 2024

Modify #77393 to clear shadow memory using llvm.memset.* when the size
is large, similar to shouldUseBZeroPlusStoresToInitialize in clang for
-ftrivial-auto-var-init=. The intrinsic, if lowered to libcall, will use the msan interceptor.

The instruction selector lowers a StoreInst to multiple stores, not
utilizing memset. When the size is large (e.g.
store { [100 x i32] } zeroinitializer, ptr %12, align 1), the
generated code will be long (and CodeGenPrepare::optimizeInst will
even crash for a huge size).

// Test stack size
#include <array>

template <class T>
void DoNotOptimize(const T& var) { // deprecated by https://github.com/google/benchmark/pull/1493
  asm volatile("" : "+m"(const_cast<T&>(var)));
}

int main() {
  using LargeArray = std::array<int, 1000000>;
  auto large_stack = []() { DoNotOptimize(LargeArray()); };
  // CodeGenPrepare::optimizeInst triggers an assertion failure when creating
  // an integer type with a bit width > 2**23
  large_stack();
}

Created using spr 1.3.4
@llvmbot (Collaborator) commented Jan 30, 2024

@llvm/pr-subscribers-compiler-rt-sanitizer

@llvm/pr-subscribers-llvm-transforms

Author: Fangrui Song (MaskRay)

Changes

Modify #77393 to clear shadow memory using llvm.memset.* instead of
StoreInst. poisonAllocaUserspace uses llvm.memset.* as well.

The instruction selector lowers the StoreInst to many stores, not
utilizing memset. When the size is large (e.g.
store { [100 x i32] } zeroinitializer, ptr %12, align 1), the
generated code will be long, and SelectionDAG will even crash if the
object is too large. llvm.memset.* does not have the problem.

// Test stack size
#include <array>

template <class T>
void DoNotOptimize(const T& var) { // deprecated by https://github.com/google/benchmark/pull/1493
  asm volatile("" : "+m"(const_cast<T&>(var)));
}

int main() {
  using LargeArray = std::array<int, 1000000>;
  auto large_stack = []() { DoNotOptimize(LargeArray()); };
  // SelectionDAG crash if StoreInst is used
  large_stack();
}

Full diff: https://github.com/llvm/llvm-project/pull/79924.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp (+4-1)
  • (modified) llvm/test/Instrumentation/MemorySanitizer/msan_asm_conservative.ll (+8-8)
diff --git a/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp b/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
index 2b697557d8a92c..0806d7a5b14527 100644
--- a/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
+++ b/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
@@ -4559,9 +4559,12 @@ struct MemorySanitizerVisitor : public InstVisitor<MemorySanitizerVisitor> {
     } else {
       // ElemTy, derived from elementtype(), does not encode the alignment of
       // the pointer. Conservatively assume that the shadow memory is unaligned.
+      // Avoid StoreInst as SizeVal may be large, expanding to many
+      // instructions.
       auto [ShadowPtr, _] =
           getShadowOriginPtrUserspace(Operand, IRB, IRB.getInt8Ty(), Align(1));
-      IRB.CreateAlignedStore(getCleanShadow(ElemTy), ShadowPtr, Align(1));
+      IRB.CreateMemSet(ShadowPtr, ConstantInt::getNullValue(IRB.getInt8Ty()),
+                       SizeVal, Align(1));
     }
   }
 
diff --git a/llvm/test/Instrumentation/MemorySanitizer/msan_asm_conservative.ll b/llvm/test/Instrumentation/MemorySanitizer/msan_asm_conservative.ll
index 894f76b9b8d32a..86ca697ed9a4c6 100644
--- a/llvm/test/Instrumentation/MemorySanitizer/msan_asm_conservative.ll
+++ b/llvm/test/Instrumentation/MemorySanitizer/msan_asm_conservative.ll
@@ -177,8 +177,8 @@ entry:
 }
 
 ; CHECK-LABEL: @f_2i_2o_mem
-; USER-CONS:  store i32 0, ptr inttoptr (i64 xor (i64 ptrtoint (ptr @id1 to i64), i64 87960930222080) to ptr), align 1
-; USER-CONS:  store i32 0, ptr inttoptr (i64 xor (i64 ptrtoint (ptr @id2 to i64), i64 87960930222080) to ptr), align 1
+; USER-CONS:  call void @llvm.memset.p0.i64(ptr align 1 inttoptr (i64 xor (i64 ptrtoint (ptr @id1 to i64), i64 87960930222080) to ptr), i8 0, i64 4, i1 false)
+; USER-CONS:  call void @llvm.memset.p0.i64(ptr align 1 inttoptr (i64 xor (i64 ptrtoint (ptr @id2 to i64), i64 87960930222080) to ptr), i8 0, i64 4, i1 false)
 ; CHECK-CONS: call void @__msan_instrument_asm_store({{.*}}@id1{{.*}}, i64 4)
 ; CHECK-CONS: call void @__msan_instrument_asm_store({{.*}}@id2{{.*}}, i64 4)
 ; CHECK: call void asm "", "=*m,=*m,*m,*m,~{dirflag},~{fpsr},~{flags}"(ptr elementtype(i32) @id1, ptr elementtype(i32) @id2, ptr elementtype(i32) @is1, ptr elementtype(i32) @is2)
@@ -196,7 +196,7 @@ entry:
 
 ; CHECK-LABEL: @f_1i_1o_memreg
 ; CHECK: [[IS1_F7:%.*]] = load i32, ptr @is1, align 4
-; USER-CONS:  store i32 0, ptr inttoptr (i64 xor (i64 ptrtoint (ptr @id1 to i64), i64 87960930222080) to ptr), align 1
+; USER-CONS:  call void @llvm.memset.p0.i64(ptr align 1 inttoptr (i64 xor (i64 ptrtoint (ptr @id1 to i64), i64 87960930222080) to ptr), i8 0, i64 4, i1 false)
 ; CHECK-CONS: call void @__msan_instrument_asm_store({{.*}}@id1{{.*}}, i64 4)
 ; CHECK: call void @__msan_warning
 ; CHECK: call i32 asm "", "=r,=*m,r,*m,~{dirflag},~{fpsr},~{flags}"(ptr elementtype(i32) @id1, i32 [[IS1_F7]], ptr elementtype(i32) @is1)
@@ -215,7 +215,7 @@ entry:
 }
 
 ; CHECK-LABEL: @f_3o_reg_mem_reg
-; USER-CONS:  store i32 0, ptr inttoptr (i64 xor (i64 ptrtoint (ptr @id2 to i64), i64 87960930222080) to ptr), align 1
+; USER-CONS:  call void @llvm.memset.p0.i64(ptr align 1 inttoptr (i64 xor (i64 ptrtoint (ptr @id2 to i64), i64 87960930222080) to ptr), i8 0, i64 4, i1 false)
 ; CHECK-CONS: call void @__msan_instrument_asm_store(ptr @id2, i64 4)
 ; CHECK: call { i32, i32 } asm "", "=r,=*m,=r,~{dirflag},~{fpsr},~{flags}"(ptr elementtype(i32) @id2)
 
@@ -240,7 +240,7 @@ entry:
 ; CHECK: [[PAIR1_F9:%.*]] = load {{.*}} @pair1
 ; CHECK: [[C1_F9:%.*]] = load {{.*}} @c1
 ; CHECK: [[MEMCPY_S1_F9:%.*]] = load {{.*}} @memcpy_s1
-; USER-CONS:  store { i32, i32 } zeroinitializer, ptr inttoptr (i64 xor (i64 ptrtoint (ptr @pair2 to i64), i64 87960930222080) to ptr), align 1
+; USER-CONS:  call void @llvm.memset.p0.i64(ptr align 1 inttoptr (i64 xor (i64 ptrtoint (ptr @pair2 to i64), i64 87960930222080) to ptr), i8 0, i64 8, i1 false)
 ; CHECK-CONS: call void @__msan_instrument_asm_store({{.*}}@pair2{{.*}}, i64 8)
 ; CHECK: call void @__msan_warning
 ; KMSAN: call void @__msan_warning
@@ -257,9 +257,9 @@ entry:
 }
 
 ; CHECK-LABEL: @f_3i_3o_complex_mem
-; USER-CONS:       store { i32, i32 } zeroinitializer, ptr inttoptr (i64 xor (i64 ptrtoint (ptr @pair2 to i64), i64 87960930222080) to ptr), align 1
-; USER-CONS-NEXT:  store i8 0, ptr inttoptr (i64 xor (i64 ptrtoint (ptr @c2 to i64), i64 87960930222080) to ptr), align 1
-; USER-CONS-NEXT:  store i64 0, ptr inttoptr (i64 xor (i64 ptrtoint (ptr @memcpy_d1 to i64), i64 87960930222080) to ptr), align 1
+; USER-CONS:       call void @llvm.memset.p0.i64(ptr align 1 inttoptr (i64 xor (i64 ptrtoint (ptr @pair2 to i64), i64 87960930222080) to ptr), i8 0, i64 8, i1 false)
+; USER-CONS-NEXT:  call void @llvm.memset.p0.i64(ptr align 1 inttoptr (i64 xor (i64 ptrtoint (ptr @c2 to i64), i64 87960930222080) to ptr), i8 0, i64 1, i1 false)
+; USER-CONS-NEXT:  call void @llvm.memset.p0.i64(ptr align 1 inttoptr (i64 xor (i64 ptrtoint (ptr @memcpy_d1 to i64), i64 87960930222080) to ptr), i8 0, i64 8, i1 false)
 ; CHECK-CONS:      call void @__msan_instrument_asm_store({{.*}}@pair2{{.*}}, i64 8)
 ; CHECK-CONS:      call void @__msan_instrument_asm_store({{.*}}@c2{{.*}}, i64 1)
 ; CHECK-CONS:      call void @__msan_instrument_asm_store({{.*}}@memcpy_d1{{.*}}, i64 8)

@ramosian-glider (Contributor) commented:

LGTM in general.

But the downside here is that the compiler won't be able to optimize away repeated stores if they are transformed to memset calls.
Maybe introduce some threshold and only memset() sizes greater than, say, 64? (IIRC that's what -ftrivial-auto-var-init does)

@MaskRay (Member, Author) commented Jan 30, 2024

LGTM in general.

But the downside here is that the compiler won't be able to optimize away repeated stores if they are transformed to memset calls. Maybe introduce some threshold and only memset() sizes greater than, say, 64? (IIRC that's what -ftrivial-auto-var-init does)

TIL. Changed to keep using StoreInst when the size <= 32, similar to shouldUseBZeroPlusStoresToInitialize (for -ftrivial-auto-var-init).

Created using spr 1.3.4
      if (Size <= 32)
        IRB.CreateAlignedStore(getCleanShadow(ElemTy), ShadowPtr, Align(1));
      else
        IRB.CreateMemSet(ShadowPtr, ConstantInt::getNullValue(IRB.getInt8Ty()),
                         SizeVal, Align(1));
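Outside of LLVM, the size-based dispatch above can be sketched as a plain C++ function. This is a hypothetical illustration: `kStoreThreshold` and `clearShadow` are made-up names, the loop merely stands in for the single wide store the real code emits, and the `std::memset` stands in for the emitted `llvm.memset.*` (which, under msan, lowers to the `__msan_memset` interceptor when it becomes a libcall):

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Mirrors the <= 32 cutoff chosen in the patch, itself modeled on
// shouldUseBZeroPlusStoresToInitialize for -ftrivial-auto-var-init.
constexpr std::size_t kStoreThreshold = 32;

void clearShadow(std::uint8_t *Shadow, std::size_t Size) {
  if (Size <= kStoreThreshold) {
    // Stands in for IRB.CreateAlignedStore of a clean (zero) shadow value;
    // small stores remain visible to later optimization passes.
    for (std::size_t I = 0; I != Size; ++I)
      Shadow[I] = 0;
  } else {
    // Stands in for IRB.CreateMemSet; avoids expanding a huge store into
    // many instructions during instruction selection.
    std::memset(Shadow, 0, Size);
  }
}
```

The point of the threshold is the trade-off discussed in this thread: direct stores can be merged or eliminated by later passes, while a memset call is opaque but scales to arbitrarily large operands.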
Collaborator:
it will hit interceptor

Collaborator:
Never mind, we do this all the time here.

@MaskRay MaskRay changed the title [msan] Unpoison indirect outputs for userspace using llvm.memset.* [msan] Unpoison indirect outputs for userspace using memset for large operands Jan 30, 2024
@MaskRay MaskRay merged commit 9b91c54 into main Jan 30, 2024
3 of 4 checks passed
@MaskRay MaskRay deleted the users/MaskRay/spr/msan-unpoison-indirect-outputs-for-userspace-using-llvmmemset branch January 30, 2024 21:45
      auto [ShadowPtr, _] =
          getShadowOriginPtrUserspace(Operand, IRB, IRB.getInt8Ty(), Align(1));
      IRB.CreateAlignedStore(getCleanShadow(ElemTy), ShadowPtr, Align(1));
      if (Size <= 32)
Contributor:
Maybe some comment wouldn't hurt here to clarify this constant.

Member Author:
This is a bit arbitrary. Perhaps

// The size threshold matches shouldUseBZeroPlusStoresToInitialize for -ftrivial-auto-var-init=zero?

Collaborator:
Memset will go into the interceptor and do a lot of checks (see __msan_memset), so it's more expensive than a regular memset. I can't tell what the optimal constant here would be, but I would expect something larger. That said, I doubt it will make a meaningful difference.

@MaskRay (Member, Author) commented Feb 3, 2024:

Thanks for your previous comment about the interceptor. The committed patch does contain this description:
"The intrinsic, if lowered to libcall, will use the msan interceptor."

Inline asm isn't commonly used :) This patch is for =m in extended asm, which I believe is even less common, and when it is used, it's usually with small objects. I guess 32 vs 64 won't make a difference.
