Skip to content

[msan] Add optional flag to improve instrumentation of disjoint OR #145990

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jun 27, 2025

Conversation

thurstond
Copy link
Contributor

@thurstond thurstond commented Jun 26, 2025

The disjoint OR (#72583) of two '1's is poison, hence the corresponding shadow memory needs to be uninitialized (rather than initialized [i.e. false negative], as per the existing instrumentation which ignores disjointedness). This patch adds a flag, -msan-precise-disjoint-or, which defaults to false (the legacy behavior). A future patch will default this flag to true.

Updates the test from #145982

The disjoint OR (llvm#72583) of two
'1's is poison, hence the corresponding shadow memory needs to be
uninitialized (rather than initialized, as per the existing
instrumentation which ignores disjointedness).

Updates the test from llvm#145982
@llvmbot
Copy link
Member

llvmbot commented Jun 26, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Thurston Dang (thurstond)

Changes

The disjoint OR (#72583) of two '1's is poison, hence the corresponding shadow memory needs to be uninitialized (rather than initialized, as per the existing instrumentation which ignores disjointedness).

Updates the test from #145982


Full diff: https://github.com/llvm/llvm-project/pull/145990.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp (+19-6)
  • (modified) llvm/test/Instrumentation/MemorySanitizer/or.ll (+3-1)
diff --git a/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp b/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
index 1a76898bd61c6..f06600a55ef27 100644
--- a/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
+++ b/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
@@ -2497,11 +2497,16 @@ struct MemorySanitizerVisitor : public InstVisitor<MemorySanitizerVisitor> {
 
   void visitOr(BinaryOperator &I) {
     IRBuilder<> IRB(&I);
-    //  "Or" of 1 and a poisoned value results in unpoisoned value.
-    //  1|1 => 1;     0|1 => 1;     p|1 => 1;
-    //  1|0 => 1;     0|0 => 0;     p|0 => p;
-    //  1|p => 1;     0|p => p;     p|p => p;
-    //  S = (S1 & S2) | (~V1 & S2) | (S1 & ~V2)
+    //  "Or" of 1 and a poisoned value results in unpoisoned value:
+    //    1|1 => 1;     0|1 => 1;     p|1 => 1;
+    //    1|0 => 1;     0|0 => 0;     p|0 => p;
+    //    1|p => 1;     0|p => p;     p|p => p;
+    //
+    //    S = (S1 & S2) | (~V1 & S2) | (S1 & ~V2)
+    //
+    //  Addendum if the "Or" is "disjoint":
+    //    1|1 => p;
+    //    S = S | (V1 & V2)
     Value *S1 = getShadow(&I, 0);
     Value *S2 = getShadow(&I, 1);
     Value *V1 = IRB.CreateNot(I.getOperand(0));
@@ -2513,7 +2518,15 @@ struct MemorySanitizerVisitor : public InstVisitor<MemorySanitizerVisitor> {
     Value *S1S2 = IRB.CreateAnd(S1, S2);
     Value *V1S2 = IRB.CreateAnd(V1, S2);
     Value *S1V2 = IRB.CreateAnd(S1, V2);
-    setShadow(&I, IRB.CreateOr({S1S2, V1S2, S1V2}));
+
+    Value *S = IRB.CreateOr({S1S2, V1S2, S1V2});
+    auto* MaybeDisjoint = cast<PossiblyDisjointInst>(&I);
+    if (MaybeDisjoint->isDisjoint()) {
+      Value *V1V2 = IRB.CreateAnd(V1, V2);
+      S = IRB.CreateOr({S, V1V2});
+    }
+
+    setShadow(&I, S);
     setOriginForNaryOp(I);
   }
 
diff --git a/llvm/test/Instrumentation/MemorySanitizer/or.ll b/llvm/test/Instrumentation/MemorySanitizer/or.ll
index 6570b6d9d91ca..598de90135320 100644
--- a/llvm/test/Instrumentation/MemorySanitizer/or.ll
+++ b/llvm/test/Instrumentation/MemorySanitizer/or.ll
@@ -41,8 +41,10 @@ define i8 @test_disjoint_or(i8 %a, i8 %b) sanitize_memory {
 ; CHECK-NEXT:    [[TMP7:%.*]] = and i8 [[TMP1]], [[TMP4]]
 ; CHECK-NEXT:    [[TMP8:%.*]] = or i8 [[TMP5]], [[TMP6]]
 ; CHECK-NEXT:    [[TMP11:%.*]] = or i8 [[TMP8]], [[TMP7]]
+; CHECK-NEXT:    [[TMP10:%.*]] = and i8 [[TMP3]], [[TMP4]]
+; CHECK-NEXT:    [[TMP12:%.*]] = or i8 [[TMP11]], [[TMP10]]
 ; CHECK-NEXT:    [[C:%.*]] = or disjoint i8 [[A]], [[B]]
-; CHECK-NEXT:    store i8 [[TMP11]], ptr @__msan_retval_tls, align 8
+; CHECK-NEXT:    store i8 [[TMP12]], ptr @__msan_retval_tls, align 8
 ; CHECK-NEXT:    ret i8 [[C]]
 ;
   %c = or disjoint i8 %a, %b

@llvmbot
Copy link
Member

llvmbot commented Jun 26, 2025

@llvm/pr-subscribers-compiler-rt-sanitizer

Author: Thurston Dang (thurstond)

Changes

The disjoint OR (#72583) of two '1's is poison, hence the corresponding shadow memory needs to be uninitialized (rather than initialized, as per the existing instrumentation which ignores disjointedness).

Updates the test from #145982


Full diff: https://github.com/llvm/llvm-project/pull/145990.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp (+19-6)
  • (modified) llvm/test/Instrumentation/MemorySanitizer/or.ll (+3-1)
diff --git a/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp b/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
index 1a76898bd61c6..f06600a55ef27 100644
--- a/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
+++ b/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
@@ -2497,11 +2497,16 @@ struct MemorySanitizerVisitor : public InstVisitor<MemorySanitizerVisitor> {
 
   void visitOr(BinaryOperator &I) {
     IRBuilder<> IRB(&I);
-    //  "Or" of 1 and a poisoned value results in unpoisoned value.
-    //  1|1 => 1;     0|1 => 1;     p|1 => 1;
-    //  1|0 => 1;     0|0 => 0;     p|0 => p;
-    //  1|p => 1;     0|p => p;     p|p => p;
-    //  S = (S1 & S2) | (~V1 & S2) | (S1 & ~V2)
+    //  "Or" of 1 and a poisoned value results in unpoisoned value:
+    //    1|1 => 1;     0|1 => 1;     p|1 => 1;
+    //    1|0 => 1;     0|0 => 0;     p|0 => p;
+    //    1|p => 1;     0|p => p;     p|p => p;
+    //
+    //    S = (S1 & S2) | (~V1 & S2) | (S1 & ~V2)
+    //
+    //  Addendum if the "Or" is "disjoint":
+    //    1|1 => p;
+    //    S = S | (V1 & V2)
     Value *S1 = getShadow(&I, 0);
     Value *S2 = getShadow(&I, 1);
     Value *V1 = IRB.CreateNot(I.getOperand(0));
@@ -2513,7 +2518,15 @@ struct MemorySanitizerVisitor : public InstVisitor<MemorySanitizerVisitor> {
     Value *S1S2 = IRB.CreateAnd(S1, S2);
     Value *V1S2 = IRB.CreateAnd(V1, S2);
     Value *S1V2 = IRB.CreateAnd(S1, V2);
-    setShadow(&I, IRB.CreateOr({S1S2, V1S2, S1V2}));
+
+    Value *S = IRB.CreateOr({S1S2, V1S2, S1V2});
+    auto* MaybeDisjoint = cast<PossiblyDisjointInst>(&I);
+    if (MaybeDisjoint->isDisjoint()) {
+      Value *V1V2 = IRB.CreateAnd(V1, V2);
+      S = IRB.CreateOr({S, V1V2});
+    }
+
+    setShadow(&I, S);
     setOriginForNaryOp(I);
   }
 
diff --git a/llvm/test/Instrumentation/MemorySanitizer/or.ll b/llvm/test/Instrumentation/MemorySanitizer/or.ll
index 6570b6d9d91ca..598de90135320 100644
--- a/llvm/test/Instrumentation/MemorySanitizer/or.ll
+++ b/llvm/test/Instrumentation/MemorySanitizer/or.ll
@@ -41,8 +41,10 @@ define i8 @test_disjoint_or(i8 %a, i8 %b) sanitize_memory {
 ; CHECK-NEXT:    [[TMP7:%.*]] = and i8 [[TMP1]], [[TMP4]]
 ; CHECK-NEXT:    [[TMP8:%.*]] = or i8 [[TMP5]], [[TMP6]]
 ; CHECK-NEXT:    [[TMP11:%.*]] = or i8 [[TMP8]], [[TMP7]]
+; CHECK-NEXT:    [[TMP10:%.*]] = and i8 [[TMP3]], [[TMP4]]
+; CHECK-NEXT:    [[TMP12:%.*]] = or i8 [[TMP11]], [[TMP10]]
 ; CHECK-NEXT:    [[C:%.*]] = or disjoint i8 [[A]], [[B]]
-; CHECK-NEXT:    store i8 [[TMP11]], ptr @__msan_retval_tls, align 8
+; CHECK-NEXT:    store i8 [[TMP12]], ptr @__msan_retval_tls, align 8
 ; CHECK-NEXT:    ret i8 [[C]]
 ;
   %c = or disjoint i8 %a, %b

Copy link

github-actions bot commented Jun 26, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@thurstond thurstond changed the title [msan] Improve instrumentation of disjoint OR [msan] Add optional flag to improve instrumentation of disjoint OR Jun 26, 2025
setShadow(&I, IRB.CreateOr({S1S2, V1S2, S1V2}));

Value *S = IRB.CreateOr({S1S2, V1S2, S1V2});
if (ClPreciseDisjointOr) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe

if (ClPreciseDisjointOr && cast<PossiblyDisjointInst>(&I)->isDisjoint())

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@thurstond thurstond merged commit afe6af1 into llvm:main Jun 27, 2025
7 checks passed
@@ -282,6 +282,12 @@ static cl::opt<bool> ClPoisonUndefVectors(
"unaffected by this flag (see -msan-poison-undef)."),
cl::Hidden, cl::init(false));

static cl::opt<bool> ClPreciseDisjointOr(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need a flag?

Copy link
Contributor Author

@thurstond thurstond Jun 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It fixes a false negative, hence some existing tests that currently pass will start failing when the fix is enabled.

Having the flag defaulted to off allows users to:

  • turn the flag on, to preemptively identify and fix such issues while the flag is still defaulted off
  • have an escape hatch for when the flag is eventually defaulted on in a future patch

rlavaee pushed a commit to rlavaee/llvm-project that referenced this pull request Jul 1, 2025
…lvm#145990)

The disjoint OR (llvm#72583) of two '1's is poison, hence the MSan ought to consider the result uninitialized (rather than initialized - i.e. a false negative - as per the existing instrumentation which ignores disjointedness). This patch adds a flag, `-msan-precise-disjoint-or`, which defaults to false (the legacy behavior). A future patch will default this flag to true.

Updates the test from llvm#145982
thurstond added a commit to thurstond/llvm-project that referenced this pull request Jul 2, 2025
…OR (llvm#145990)"

The "V1" and "V2" values were already NOT'ed, hence the calculation of
disjoint OR was incorrect. This patch fixes the issue by using the
instruction operands directly.
thurstond added a commit to thurstond/llvm-project that referenced this pull request Jul 2, 2025
…OR (llvm#145990)"

The "V1" and "V2" values were already NOT'ed, hence the calculation of
disjoint OR was incorrect. This patch fixes the issue by using the
instruction operands directly.
thurstond added a commit that referenced this pull request Jul 3, 2025
…OR (#145990)" (#146799)

The "V1" and "V2" values were already NOT'ed, hence the calculation of disjoint OR in #145990 was incorrect. This patch fixes the issue, with some refactoring and renaming of variables.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants