Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CaptureTracking] Do not capture compares of same object #74228

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

caojoshua
Copy link
Contributor

Compares of the same object do not leak any bits.

This patch introduces getUnderlyingObjectLookThrough. It looks at the
output of getUnderlyingObject. If it is a PHI, it looks at all the
incoming underlying objects. If all those objects are the same, or the
original PHI, we determine that there is a new underlying object. This
is similar to getUnderlyingObjects, but provides a more efficient way to
find a single underlying object.

This is an attempt at solving huge compile time regressions in
https://reviews.llvm.org/D152082. First, we only look through a single
PHI, not nested PHIs. Second, we only use one callsite. There are likely
other callsites that could take advantage of this over the vanilla
getUnderlyingObjects. We need to be careful about compile times. Adding
this to BasicAA::aliasCheck increases compile times by 3% on local
builds.

This was inspired by the issues in
rust-lang/rust#111603. rustc used to generate
pointers comparisons that this patch can identify as non capturing. The
code emission was changed in rustc to avoid this behavior, but this
patch can help with other cases of pointer inductions.

Compares of the same object do not leak any bits.

This patch introduces getUnderlyingObjectLookThrough. It looks at the
output of getUnderlyingObject. If it is a PHI, it looks at all the
incoming underlying objects. If all those objects are the same, or the
original PHI, we determine that there is a new underlying object. This
is similar to getUnderlyingObjects, but provides a more efficient way to
find a single underlying object.

This is an attempt at solving huge compile time regressions in
https://reviews.llvm.org/D152082. First, we only look through a single
PHI, not nested PHIs. Second, we only use one callsite. There are likely
other callsites that could take advantage of this over the vanilla
getUnderlyingObjects. We need to be careful about compile times. Adding
this to BasicAA::aliasCheck increases compile times by 3% on local
builds.

This was inspired by the issues in
rust-lang/rust#111603. rustc used to generate
pointers comparisons that this patch can identify as non capturing. The
code emission was changed in rustc to avoid this behavior, but this
patch can help with other cases of pointer inductions.
@llvmbot
Copy link
Collaborator

llvmbot commented Dec 3, 2023

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-llvm-analysis

Author: Joshua Cao (caojoshua)

Changes

Compares of the same object do not leak any bits.

This patch introduces getUnderlyingObjectLookThrough. It looks at the
output of getUnderlyingObject. If it is a PHI, it looks at all the
incoming underlying objects. If all those objects are the same, or the
original PHI, we determine that there is a new underlying object. This
is similar to getUnderlyingObjects, but provides a more efficient way to
find a single underlying object.

This is an attempt at solving huge compile time regressions in
https://reviews.llvm.org/D152082. First, we only look through a single
PHI, not nested PHIs. Second, we only use one callsite. There are likely
other callsites that could take advantage of this over the vanilla
getUnderlyingObjects. We need to be careful about compile times. Adding
this to BasicAA::aliasCheck increases compile times by 3% on local
builds.

This was inspired by the issues in
rust-lang/rust#111603. rustc used to generate
pointers comparisons that this patch can identify as non capturing. The
code emission was changed in rustc to avoid this behavior, but this
patch can help with other cases of pointer inductions.


Full diff: https://github.com/llvm/llvm-project/pull/74228.diff

5 Files Affected:

  • (modified) llvm/include/llvm/Analysis/ValueTracking.h (+7)
  • (modified) llvm/lib/Analysis/CaptureTracking.cpp (+5-1)
  • (modified) llvm/lib/Analysis/ValueTracking.cpp (+25)
  • (modified) llvm/test/Transforms/FunctionAttrs/nocapture.ll (+65)
  • (modified) llvm/unittests/Analysis/CaptureTrackingTest.cpp (+1-8)
diff --git a/llvm/include/llvm/Analysis/ValueTracking.h b/llvm/include/llvm/Analysis/ValueTracking.h
index 82c87edd6297c..d9f8a76d3cbc2 100644
--- a/llvm/include/llvm/Analysis/ValueTracking.h
+++ b/llvm/include/llvm/Analysis/ValueTracking.h
@@ -693,6 +693,13 @@ inline Value *getArgumentAliasingToReturnedPointer(CallBase *Call,
 bool isIntrinsicReturningPointerAliasingArgumentWithoutCapturing(
     const CallBase *Call, bool MustPreserveNullness);
 
+/// This method is a wrapper around getUnderlyingObject to look through PHI
+/// nodes. This method will attempt to build a new underlying object based on
+/// the incoming values. This method can have high compile time implications and
+/// cannot be used as a direct replacement for getUnderlyingObject.
+const Value *getUnderlyingObjectLookThrough(const Value *V,
+                                            unsigned MaxLookup = 6);
+
 /// This method strips off any GEP address adjustments and pointer casts from
 /// the specified value, returning the original object being addressed. Note
 /// that the returned value has pointer type if the specified value does. If
diff --git a/llvm/lib/Analysis/CaptureTracking.cpp b/llvm/lib/Analysis/CaptureTracking.cpp
index 0d6d30923bb54..785a9dd22b75b 100644
--- a/llvm/lib/Analysis/CaptureTracking.cpp
+++ b/llvm/lib/Analysis/CaptureTracking.cpp
@@ -387,7 +387,11 @@ UseCaptureKind llvm::DetermineUseCaptureKind(
         if (IsDereferenceableOrNull && IsDereferenceableOrNull(O, DL))
           return UseCaptureKind::NO_CAPTURE;
       }
-    }
+    } else if (cast<ICmpInst>(I)->isEquality() &&
+               getUnderlyingObjectLookThrough(I->getOperand(Idx)) ==
+                   getUnderlyingObjectLookThrough(I->getOperand(OtherIdx)))
+      // Equality comparisons against the same pointer do not capture.
+      return UseCaptureKind::NO_CAPTURE;
 
     // Otherwise, be conservative. There are crazy ways to capture pointers
     // using comparisons.
diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index 8c29c242215d6..3dd035ac49a33 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -5968,6 +5968,31 @@ static bool isSameUnderlyingObjectInLoop(const PHINode *PN,
   return true;
 }
 
+const Value *llvm::getUnderlyingObjectLookThrough(const Value *V, unsigned MaxLookup) {
+  V = getUnderlyingObject(V, MaxLookup);
+
+  const PHINode *PN = dyn_cast<PHINode>(V);
+  if (!PN)
+    return V;
+
+  // We can look through PHIs if each underlying value has the same underlying
+  // object, or is the phi itself.
+  const Value *NewUnderlying = PN;
+  for (const Value *Incoming : PN->incoming_values()) {
+    const Value *IncomingUnderlying = getUnderlyingObject(Incoming, MaxLookup);
+    if (IncomingUnderlying == PN || IncomingUnderlying == NewUnderlying)
+      continue;
+    if (NewUnderlying == PN)
+      // Found a new possible underlying object.
+      NewUnderlying = IncomingUnderlying;
+    else
+      // There are >= 2 possible underlying objects. We cannot determine a new
+      // underlying object.
+      return V;
+  }
+  return NewUnderlying;
+}
+
 const Value *llvm::getUnderlyingObject(const Value *V, unsigned MaxLookup) {
   if (!V->getType()->isPointerTy())
     return V;
diff --git a/llvm/test/Transforms/FunctionAttrs/nocapture.ll b/llvm/test/Transforms/FunctionAttrs/nocapture.ll
index a70d71e62c305..093d9445e39d7 100644
--- a/llvm/test/Transforms/FunctionAttrs/nocapture.ll
+++ b/llvm/test/Transforms/FunctionAttrs/nocapture.ll
@@ -900,6 +900,71 @@ define void @readnone_indirec(ptr %f, ptr %p) {
   ret void
 }
 
+; FNATTR: define i1 @identity_icmp(ptr nocapture readnone %p)
+define i1 @identity_icmp(ptr %p) {
+  %r = icmp eq ptr %p, %p
+  ret i1 %r
+}
+
+; FNATTR: define i1 @compare_against_offset(ptr nocapture readnone %p)
+define i1 @compare_against_offset(ptr %p) {
+  %offset = getelementptr inbounds i32, ptr %p, i64 1
+  %r = icmp eq ptr %p, %offset
+  ret i1 %r
+}
+
+; FNATTR: define i1 @compare_offsets(ptr nocapture readnone %p)
+define i1 @compare_offsets(ptr %p) {
+  %offset1 = getelementptr inbounds i32, ptr %p, i64 1
+  %offset2 = getelementptr inbounds i32, ptr %p, i64 2
+  %r = icmp eq ptr %offset1, %offset2
+  ret i1 %r
+}
+
+; FNATTR: define void @phi_induction(ptr nocapture writeonly %p, i64 %n, i32 %x)
+define void @phi_induction(ptr %p, i64 %n, i32 %x) {
+start:
+  %end = getelementptr inbounds i32, ptr %p, i64 %n
+  br label %repeat_loop_body
+
+repeat_loop_body:                                 ; preds = %start, %repeat_loop_body
+  %induct = phi ptr [ %p, %start ], [ %induct.next, %repeat_loop_body ]
+  store i32 %x, ptr %induct, align 4
+  %induct.next = getelementptr inbounds i32, ptr %induct, i64 1
+  %.not = icmp eq ptr %induct.next, %end
+  br i1 %.not, label %repeat_loop_next, label %repeat_loop_body
+
+repeat_loop_next:
+  ret void
+}
+
+; FNATTR: define i1 @compare_against_offset_non_equality(ptr readnone %p)
+define i1 @compare_against_offset_non_equality(ptr %p) {
+  ; Cannot capture non-equality comparisons. An overflowed GEP can leak bits.
+  %offset = getelementptr inbounds i32, ptr %p, i64 1
+  %r = icmp ult ptr %p, %offset
+  ret i1 %r
+}
+
+; FNATTR: define i1 @compare_against_capture(ptr %p)
+define i1 @compare_against_capture(ptr %p) {
+  %offset1 = getelementptr inbounds i32, ptr %p, i64 1
+  %offset2 = getelementptr inbounds i32, ptr %p, i64 2
+  store ptr %offset2, ptr @g
+  %r = icmp eq ptr %offset1, %offset2
+  ret i1 %r
+}
+
+declare ptr @llvm.ptrmask.p0.i64(ptr ,i64)
+
+; FNATTR: define i1 @compare_against_ptrmask(ptr readnone %p, i64 %mask)
+define i1 @compare_against_ptrmask(ptr %p, i64 %mask) {
+  %offset1 = getelementptr inbounds i32, ptr %p, i64 1
+  %offset2 = getelementptr inbounds i32, ptr %p, i64 2
+  %masked_ptr = call ptr @llvm.ptrmask.p0.i64(ptr %offset2, i64 %mask)
+  %r = icmp eq ptr %offset1, %masked_ptr
+  ret i1 %r
+}
 
 declare ptr @llvm.launder.invariant.group.p0(ptr)
 declare ptr @llvm.strip.invariant.group.p0(ptr)
diff --git a/llvm/unittests/Analysis/CaptureTrackingTest.cpp b/llvm/unittests/Analysis/CaptureTrackingTest.cpp
index 73dd82fb921f7..8fcbbbaae6d62 100644
--- a/llvm/unittests/Analysis/CaptureTrackingTest.cpp
+++ b/llvm/unittests/Analysis/CaptureTrackingTest.cpp
@@ -105,11 +105,10 @@ TEST(CaptureTracking, MultipleUsesInSameInstruction) {
   BasicBlock *BB = &F->getEntryBlock();
   Instruction *Call = &*BB->begin();
   Instruction *CmpXChg = Call->getNextNode();
-  Instruction *ICmp = CmpXChg->getNextNode();
 
   CollectingCaptureTracker CT;
   PointerMayBeCaptured(Arg, &CT);
-  EXPECT_EQ(7u, CT.Captures.size());
+  EXPECT_EQ(5u, CT.Captures.size());
   // Call arg 1
   EXPECT_EQ(Call, CT.Captures[0]->getUser());
   EXPECT_EQ(0u, CT.Captures[0]->getOperandNo());
@@ -125,10 +124,4 @@ TEST(CaptureTracking, MultipleUsesInSameInstruction) {
   // Cmpxchg new value operand
   EXPECT_EQ(CmpXChg, CT.Captures[4]->getUser());
   EXPECT_EQ(2u, CT.Captures[4]->getOperandNo());
-  // ICmp first operand
-  EXPECT_EQ(ICmp, CT.Captures[5]->getUser());
-  EXPECT_EQ(0u, CT.Captures[5]->getOperandNo());
-  // ICmp second operand
-  EXPECT_EQ(ICmp, CT.Captures[6]->getUser());
-  EXPECT_EQ(1u, CT.Captures[6]->getOperandNo());
 }

@caojoshua
Copy link
Contributor Author

Original review from https://reviews.llvm.org/D152241

Copy link

github-actions bot commented Dec 3, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

@caojoshua
Copy link
Contributor Author

This patch is incompatible with this code block, which simplifies non-capturing pointer comparisons to false.

@nikic
Copy link
Contributor

nikic commented Jan 4, 2024

This patch is incompatible with this code block, which simplifies non-capturing pointer comparisons to false.

Yes, this fold is known to be buggy. The right way to do it would be to extend

bool InstCombinerImpl::foldAllocaCmp(AllocaInst *Alloca) {
to handle allocator calls. Pretty sure I had a patch for this ... somewhere on phabricator.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants