Skip to content

[mlir][LLVMIR] Add folders for llvm.inttoptr and llvm.ptrtoint #141891

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

dcaballe
Copy link
Contributor

This PR adds folders for inttoptr(ptrtoint(x)) and ptrtoint(inttoptr(x)).

This PR adds folders for `inttoptr(ptrtoint(x))` and `ptrtoint(inttoptr(x))`.
@llvmbot
Copy link
Member

llvmbot commented May 29, 2025

@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-mlir-llvm

Author: Diego Caballero (dcaballe)

Changes

This PR adds folders for inttoptr(ptrtoint(x)) and ptrtoint(inttoptr(x)).


Full diff: https://github.com/llvm/llvm-project/pull/141891.diff

3 Files Affected:

  • (modified) mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td (+6-2)
  • (modified) mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp (+29)
  • (modified) mlir/test/Dialect/LLVMIR/canonicalize.mlir (+29-3)
diff --git a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td
index 6fde45ce5c556..3eb48964bd378 100644
--- a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td
+++ b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td
@@ -622,10 +622,14 @@ def LLVM_AddrSpaceCastOp : LLVM_CastOp<"addrspacecast", "AddrSpaceCast",
 }
 def LLVM_IntToPtrOp : LLVM_DereferenceableCastOp<"inttoptr", "IntToPtr",
                                   LLVM_ScalarOrVectorOf<AnySignlessInteger>,
-                                  LLVM_ScalarOrVectorOf<LLVM_AnyPointer>>;
+                                  LLVM_ScalarOrVectorOf<LLVM_AnyPointer>> {
+  let hasFolder = 1;
+}
 def LLVM_PtrToIntOp : LLVM_CastOp<"ptrtoint", "PtrToInt",
                                   LLVM_ScalarOrVectorOf<LLVM_AnyPointer>,
-                                  LLVM_ScalarOrVectorOf<AnySignlessInteger>>;
+                                  LLVM_ScalarOrVectorOf<AnySignlessInteger>> {
+  let hasFolder = 1;
+}
 def LLVM_SExtOp : LLVM_CastOp<"sext", "SExt",
                               LLVM_ScalarOrVectorOf<AnySignlessInteger>,
                               LLVM_ScalarOrVectorOf<AnySignlessInteger>> {
diff --git a/mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp b/mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
index c7528c970a4ba..74745752fd3dd 100644
--- a/mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
+++ b/mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
@@ -3562,6 +3562,35 @@ LogicalResult LLVM::BitcastOp::verify() {
   return success();
 }
 
+//===----------------------------------------------------------------------===//
+// IntToPtrOp
+//===----------------------------------------------------------------------===//
+
+template <typename CastOp, typename ReverseCastOp>
+static OpFoldResult foldReversibleCast(CastOp castOp) {
+  auto reverseCastOp = castOp.getArg().template getDefiningOp<ReverseCastOp>();
+  if (!reverseCastOp)
+    return {};
+
+  // cast(reverse_cast(x)) -> x
+  if (reverseCastOp.getArg().getType() == castOp.getType())
+    return reverseCastOp.getArg();
+
+  return {};
+}
+
+OpFoldResult LLVM::IntToPtrOp::fold(FoldAdaptor adaptor) {
+  return foldReversibleCast<LLVM::IntToPtrOp, LLVM::PtrToIntOp>(*this);
+}
+
+//===----------------------------------------------------------------------===//
+// PtrToIntOp
+//===----------------------------------------------------------------------===//
+
+OpFoldResult LLVM::PtrToIntOp::fold(FoldAdaptor adaptor) {
+  return foldReversibleCast<LLVM::PtrToIntOp, LLVM::IntToPtrOp>(*this);
+}
+
 //===----------------------------------------------------------------------===//
 // Folder for LLVM::AddrSpaceCastOp
 //===----------------------------------------------------------------------===//
diff --git a/mlir/test/Dialect/LLVMIR/canonicalize.mlir b/mlir/test/Dialect/LLVMIR/canonicalize.mlir
index 8accf6e263863..51f583868aee2 100644
--- a/mlir/test/Dialect/LLVMIR/canonicalize.mlir
+++ b/mlir/test/Dialect/LLVMIR/canonicalize.mlir
@@ -103,10 +103,10 @@ llvm.func @fold_unrelated_extractvalue(%arr: !llvm.array<4 x f32>) -> f32 {
 // -----
 // CHECK-LABEL: fold_extract_extractvalue
 llvm.func @fold_extract_extractvalue(%arr: !llvm.struct<(i64, array<1 x ptr<1>>)>) -> !llvm.ptr<1> {
-  // CHECK: llvm.extractvalue %{{.*}}[1, 0] 
+  // CHECK: llvm.extractvalue %{{.*}}[1, 0]
   // CHECK-NOT: extractvalue
-  %a = llvm.extractvalue %arr[1] : !llvm.struct<(i64, array<1 x ptr<1>>)> 
-  %b = llvm.extractvalue %a[0] : !llvm.array<1 x ptr<1>> 
+  %a = llvm.extractvalue %arr[1] : !llvm.struct<(i64, array<1 x ptr<1>>)>
+  %b = llvm.extractvalue %a[0] : !llvm.array<1 x ptr<1>>
   llvm.return %b : !llvm.ptr<1>
 }
 
@@ -134,6 +134,32 @@ llvm.func @fold_extract_splat() -> f64 {
 
 // -----
 
+// CHECK-LABEL: fold_inttoptr_ptrtoint
+//  CHECK-SAME: %[[ARG:.+]]: i32) -> i32
+//   CHECK-NOT: inttoptr
+//   CHECK-NOT: ptrtoint
+//  CHECK-NEXT: llvm.return %[[ARG]]
+llvm.func @fold_inttoptr_ptrtoint(%x : i32) -> i32 {
+  %c = llvm.inttoptr %x : i32 to !llvm.ptr
+  %d = llvm.ptrtoint %c : !llvm.ptr to i32
+  llvm.return %d : i32
+}
+
+// -----
+
+// CHECK-LABEL: fold_ptrtoint_inttoptr
+//  CHECK-SAME: %[[ARG:.+]]: !llvm.ptr<3>) -> !llvm.ptr<3>
+//   CHECK-NOT: inttoptr
+//   CHECK-NOT: ptrtoint
+//  CHECK-NEXT: llvm.return %[[ARG]]
+llvm.func @fold_ptrtoint_inttoptr(%x : !llvm.ptr<3>) -> !llvm.ptr<3> {
+  %c = llvm.ptrtoint %x : !llvm.ptr<3> to i32
+  %d = llvm.inttoptr %c : i32 to !llvm.ptr<3>
+  llvm.return %d : !llvm.ptr<3>
+}
+
+// -----
+
 // CHECK-LABEL: fold_bitcast
 // CHECK-SAME: %[[ARG:[[:alnum:]]+]]
 // CHECK-NEXT: llvm.return %[[ARG]]

llvm.func @fold_ptrtoint_inttoptr(%x : !llvm.ptr<3>) -> !llvm.ptr<3> {
%c = llvm.ptrtoint %x : !llvm.ptr<3> to i32
%d = llvm.inttoptr %c : i32 to !llvm.ptr<3>
llvm.return %d : !llvm.ptr<3>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this should not be folded since the ptrtoint operation truncates the pointer value to an i32, if I understand correctly? After the folding all bits of the function argument are forwarded to the result.

Do you know what folders LLVM proper implements? I suspect there are some edge cases around these operations that require special handling.

Copy link
Collaborator

@joker-eph joker-eph May 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In particular, I'm not sure about how the provenance is supposed to be handled for this kind of things. I try these example with InstCombine first. Also maybe see what Alive is thinking about these?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, that's me sending PRs late at night :)

Thanks for pointing that out. Yes, that folding is not correct unless the size of the pointer is 32 bits.
This is what InstCombine does:

define ptr @test_i32(ptr %p1) {
  %i = ptrtoint ptr %p1 to i32
  %p2 = inttoptr i32 %i to ptr
  ret ptr %p2
}

define ptr @test_i64(ptr %p1) {
  %i = ptrtoint ptr %p1 to i64
  %p2 = inttoptr i64 %i to ptr
  ret ptr %p2
}

-->

define ptr @test_i32(ptr %p1) {
  %1 = ptrtoint ptr %p1 to i64
  %2 = and i64 %1, 4294967295
  %p2 = inttoptr i64 %2 to ptr
  ret ptr %p2
}

define ptr @test_i64(ptr %p1) {
  ret ptr %p1
}

That makes sense as LLVM defaults to 64-bit pointers. If we add 32-bit pointer info:

target datalayout = "e-p:32:32"

-->

define ptr @test_i32(ptr %p1) {
  ret ptr %p1
}

define ptr @test_i64(ptr %p1) {
  ret ptr %p1
}

So I think we should only apply these folders if we are able to retrieve the pointer size from the DL info and we can prove that no actual data truncation happens. Questions:

  1. Is retrieving DL info allowed withing a folder?
  2. Have we agreed upon how to represent pointer information in the DL for the LLVM dialect already (@rengolin)?

Thanks!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I don't think this fold is allowed for non-integral pointers

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is retrieving DL info allowed withing a folder?

I am not aware of a specific rule but I would not recommend to query the data layout from a folder. For such a transformation, a separate optimization pass sounds like the better choice? That way the data layout doesn't need to queried for every fold.

Have we agreed upon how to represent pointer information in the DL for the LLVM dialect already (@rengolin)?

AFAIK the pointer bit width is available today but I don't think non-integral pointer info that @krzysz00 mentioned is available.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I can definitely turn this into a pass and see how it goes. Let's do that. Thanks for the suggestion!

dcaballe added a commit to dcaballe/llvm-project that referenced this pull request Jun 6, 2025
This PR is a follow-up to llvm#141891.
It introduces a pass that can fold `inttoptr(ptrtoint(x)) -> x` and
`ptrtoint(inttoptr(x)) -> x`. The pass takes in a list of address space
bitwidths and makes sure that the folding is applied only when it's safe.
@dcaballe
Copy link
Contributor Author

dcaballe commented Jun 6, 2025

Pass implementation: #143066

@dcaballe dcaballe closed this Jun 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants