Skip to content

Conversation

nikic
Copy link
Contributor

@nikic nikic commented Oct 9, 2025

This folds icmp (ptrtoaddr x, ptrtoaddr y) to icmp (x, y), matching the existing ptrtoint fold. Notably, this fold is not generally correct for the case where the address size and the pointer size differ, as the ptrtoaddr variant will only compare the address bits, while the icmp will compare all bits.

Both of the examples here can still be folded, but not through the logic we currently use. We'd have to add dedicated folds for that. For now add fixmes.

Related discussion: https://discourse.llvm.org/t/clarifiying-the-semantics-of-ptrtoint/83987/60?u=nikic

This folds `icmp (ptrtoaddr x, ptrtoaddr y)` to `icmp (x, y)`,
matching the existing ptrtoint fold. Notably, this fold is not
generally correct for the case where the address size and the
pointer size differ, as the ptrtoaddr variant will only compare
the address bits, while the icmp will compare all bits.

Both of the examples here *can* still be folded, but not through
the logic we currently use. We'd have to add dedicated folds for
that. For now add fixmes.

Related discussion:
https://discourse.llvm.org/t/clarifiying-the-semantics-of-ptrtoint/83987/60?u=nikic
@llvmbot llvmbot added llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms labels Oct 9, 2025
@llvmbot
Copy link
Member

llvmbot commented Oct 9, 2025

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-transforms

Author: Nikita Popov (nikic)

Changes

This folds icmp (ptrtoaddr x, ptrtoaddr y) to icmp (x, y), matching the existing ptrtoint fold. Notably, this fold is not generally correct for the case where the address size and the pointer size differ, as the ptrtoaddr variant will only compare the address bits, while the icmp will compare all bits.

Both of the examples here can still be folded, but not through the logic we currently use. We'd have to add dedicated folds for that. For now add fixmes.

Related discussion: https://discourse.llvm.org/t/clarifiying-the-semantics-of-ptrtoint/83987/60?u=nikic


Full diff: https://github.com/llvm/llvm-project/pull/162653.diff

2 Files Affected:

  • (modified) llvm/lib/Analysis/ConstantFolding.cpp (+4-2)
  • (modified) llvm/test/Transforms/InstCombine/ptrtoaddr.ll (+36)
diff --git a/llvm/lib/Analysis/ConstantFolding.cpp b/llvm/lib/Analysis/ConstantFolding.cpp
index 45c889cef8e2c..689e2e01610e2 100755
--- a/llvm/lib/Analysis/ConstantFolding.cpp
+++ b/llvm/lib/Analysis/ConstantFolding.cpp
@@ -1225,7 +1225,8 @@ Constant *llvm::ConstantFoldCompareInstOperands(
 
       // Only do this transformation if the int is intptrty in size, otherwise
       // there is a truncation or extension that we aren't modeling.
-      if (CE0->getOpcode() == Instruction::PtrToInt) {
+      if (CE0->getOpcode() == Instruction::PtrToInt ||
+          CE0->getOpcode() == Instruction::PtrToAddr) {
         Type *IntPtrTy = DL.getIntPtrType(CE0->getOperand(0)->getType());
         if (CE0->getType() == IntPtrTy) {
           Constant *C = CE0->getOperand(0);
@@ -1252,7 +1253,8 @@ Constant *llvm::ConstantFoldCompareInstOperands(
 
         // Only do this transformation if the int is intptrty in size, otherwise
         // there is a truncation or extension that we aren't modeling.
-        if (CE0->getOpcode() == Instruction::PtrToInt) {
+        if (CE0->getOpcode() == Instruction::PtrToInt ||
+            CE0->getOpcode() == Instruction::PtrToAddr) {
           Type *IntPtrTy = DL.getIntPtrType(CE0->getOperand(0)->getType());
           if (CE0->getType() == IntPtrTy &&
               CE0->getOperand(0)->getType() == CE1->getOperand(0)->getType()) {
diff --git a/llvm/test/Transforms/InstCombine/ptrtoaddr.ll b/llvm/test/Transforms/InstCombine/ptrtoaddr.ll
index 5211fbd8b897a..2e688d8b7872c 100644
--- a/llvm/test/Transforms/InstCombine/ptrtoaddr.ll
+++ b/llvm/test/Transforms/InstCombine/ptrtoaddr.ll
@@ -130,3 +130,39 @@ define i32 @ptrtoaddr_sub_consts_offset_addrsize() {
 ;
   ret i32 sub (i32 ptrtoaddr (ptr addrspace(1) getelementptr (i8, ptr addrspace(1) @g.as1, i32 42) to i32), i32 ptrtoaddr (ptr addrspace(1) @g.as1 to i32))
 }
+
+define i1 @icmp_ptrtoaddr_0() {
+; CHECK-LABEL: define i1 @icmp_ptrtoaddr_0() {
+; CHECK-NEXT:    ret i1 true
+;
+  %cmp = icmp ne i64 ptrtoaddr (ptr @g to i64), 0
+  ret i1 %cmp
+}
+
+; FIXME: This can still fold, but not by converting to pointer comparison.
+define i1 @icmp_ptrtoaddr_0_addrsize() {
+; CHECK-LABEL: define i1 @icmp_ptrtoaddr_0_addrsize() {
+; CHECK-NEXT:    [[CMP:%.*]] = icmp ne i32 ptrtoaddr (ptr addrspace(1) @g.as1 to i32), 0
+; CHECK-NEXT:    ret i1 [[CMP]]
+;
+  %cmp = icmp ne i32 ptrtoaddr (ptr addrspace(1) @g.as1 to i32), 0
+  ret i1 %cmp
+}
+
+define i1 @icmp_ptrtoaddr_ptrtoaddr() {
+; CHECK-LABEL: define i1 @icmp_ptrtoaddr_ptrtoaddr() {
+; CHECK-NEXT:    ret i1 true
+;
+  %cmp = icmp ne i64 ptrtoaddr (ptr @g to i64), ptrtoaddr (ptr @g2 to i64)
+  ret i1 %cmp
+}
+
+; FIXME: This can still fold, but not by converting to pointer comparison.
+define i1 @icmp_ptrtoaddr_ptrtoaddr_addrsize() {
+; CHECK-LABEL: define i1 @icmp_ptrtoaddr_ptrtoaddr_addrsize() {
+; CHECK-NEXT:    [[CMP:%.*]] = icmp ne i32 ptrtoaddr (ptr addrspace(1) @g.as1 to i32), ptrtoaddr (ptr addrspace(1) @g2.as1 to i32)
+; CHECK-NEXT:    ret i1 [[CMP]]
+;
+  %cmp = icmp ne i32 ptrtoaddr (ptr addrspace(1) @g.as1 to i32), ptrtoaddr (ptr addrspace(1) @g2.as1 to i32)
+  ret i1 %cmp
+}

if (CE0->getOpcode() == Instruction::PtrToInt) {
if (CE0->getOpcode() == Instruction::PtrToInt ||
CE0->getOpcode() == Instruction::PtrToAddr) {
Type *IntPtrTy = DL.getIntPtrType(CE0->getOperand(0)->getType());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we seem to be aligned on using address width only for icmp ptr, I believe all of these DL.getIntPtrType() should be DL.getAddressType()? This change probably won't be caught by existing tests since we won't have many for index size != pointer size.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is under the assumption that icmp compares all bits. I felt that there wasn't sufficient consensus for the address-only interpretation yet. I'll put this into draft for now and revisit this once we have more clarity.

@nikic nikic marked this pull request as draft October 9, 2025 17:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

llvm:analysis Includes value tracking, cost tables and constant folding llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants