Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GlobalIsel][AArch64] legalize ptr add #89218

Merged
merged 1 commit into from
May 8, 2024
Merged

Conversation

tschuett
Copy link
Member

LLVM ERROR: unable to legalize instruction: %275:(<4 x p0>) = G_PTR_ADD %268:, %274:_(<4 x s64>) (in function: prepare_for_pass)

LLVM ERROR: unable to legalize instruction: %275:_(<4 x p0>) = G_PTR_ADD %268:_, %274:_(<4 x s64>) (in function: prepare_for_pass)
@tschuett
Copy link
Member Author

3 x p0 asserts

@llvmbot
Copy link
Collaborator

llvmbot commented Apr 18, 2024

@llvm/pr-subscribers-llvm-globalisel

@llvm/pr-subscribers-backend-aarch64

Author: Thorsten Schütt (tschuett)

Changes

LLVM ERROR: unable to legalize instruction: %275:(<4 x p0>) = G_PTR_ADD %268:, %274:_(<4 x s64>) (in function: prepare_for_pass)


Full diff: https://github.com/llvm/llvm-project/pull/89218.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp (+3-1)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-ptr-add.mir (+56-3)
diff --git a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
index 85dd0f2eb192d9..3d2d5f8287a70a 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
@@ -183,7 +183,9 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST)
 
   getActionDefinitionsBuilder(G_PTR_ADD)
       .legalFor({{p0, s64}, {v2p0, v2s64}})
-      .clampScalar(1, s64, s64);
+      .clampScalar(1, s64, s64)
+      .clampNumElements(0, v2p0, v2p0)
+      .clampNumElements(1, v2s64, v2s64);
 
   getActionDefinitionsBuilder(G_PTRMASK).legalFor({{p0, s64}});
 
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-ptr-add.mir b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-ptr-add.mir
index 1ecd36b55380a6..1d3f7eab79d69d 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-ptr-add.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-ptr-add.mir
@@ -6,12 +6,65 @@ body:             |
   bb.0.entry:
     ; CHECK-LABEL: name: test_ptr_add_vec_p0
     ; CHECK: [[COPY:%[0-9]+]]:_(<2 x p0>) = COPY $q0
-    ; CHECK: [[COPY1:%[0-9]+]]:_(<2 x s64>) = COPY $q1
-    ; CHECK: [[PTR_ADD:%[0-9]+]]:_(<2 x p0>) = G_PTR_ADD [[COPY]], [[COPY1]](<2 x s64>)
-    ; CHECK: $q0 = COPY [[PTR_ADD]](<2 x p0>)
+    ; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(<2 x s64>) = COPY $q1
+    ; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(<2 x p0>) = G_PTR_ADD [[COPY]], [[COPY1]](<2 x s64>)
+    ; CHECK-NEXT: $q0 = COPY [[PTR_ADD]](<2 x p0>)
     %0:_(<2 x p0>) = COPY $q0
     %1:_(<2 x s64>) = COPY $q1
     %3:_(<2 x p0>) = G_PTR_ADD %0, %1(<2 x s64>)
     $q0 = COPY %3(<2 x p0>)
 
 ...
+---
+name:            test_ptr_add_vec_4xp0
+body:             |
+  bb.0.entry:
+    ; CHECK-LABEL: name: test_ptr_add_vec_4xp0
+    ; CHECK: [[COPY:%[0-9]+]]:_(p0) = COPY $x0
+    ; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(p0) = COPY $x1
+    ; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(p0) = COPY $x2
+    ; CHECK-NEXT: [[COPY3:%[0-9]+]]:_(p0) = COPY $x3
+    ; CHECK-NEXT: [[COPY4:%[0-9]+]]:_(s64) = COPY $x4
+    ; CHECK-NEXT: [[COPY5:%[0-9]+]]:_(s64) = COPY $x5
+    ; CHECK-NEXT: [[COPY6:%[0-9]+]]:_(s64) = COPY $x6
+    ; CHECK-NEXT: [[COPY7:%[0-9]+]]:_(s64) = COPY $x7
+    ; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p0>) = G_BUILD_VECTOR [[COPY]](p0), [[COPY1]](p0)
+    ; CHECK-NEXT: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x p0>) = G_BUILD_VECTOR [[COPY2]](p0), [[COPY3]](p0)
+    ; CHECK-NEXT: [[BUILD_VECTOR2:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[COPY4]](s64), [[COPY5]](s64)
+    ; CHECK-NEXT: [[BUILD_VECTOR3:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[COPY6]](s64), [[COPY7]](s64)
+    ; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(<2 x p0>) = G_PTR_ADD [[BUILD_VECTOR]], [[BUILD_VECTOR2]](<2 x s64>)
+    ; CHECK-NEXT: [[PTR_ADD1:%[0-9]+]]:_(<2 x p0>) = G_PTR_ADD [[BUILD_VECTOR1]], [[BUILD_VECTOR3]](<2 x s64>)
+    ; CHECK-NEXT: %zero:_(s64) = G_CONSTANT i64 0
+    ; CHECK-NEXT: %one:_(s64) = G_CONSTANT i64 1
+    ; CHECK-NEXT: %extract0:_(p0) = G_EXTRACT_VECTOR_ELT [[PTR_ADD]](<2 x p0>), %zero(s64)
+    ; CHECK-NEXT: %extract1:_(p0) = G_EXTRACT_VECTOR_ELT [[PTR_ADD]](<2 x p0>), %one(s64)
+    ; CHECK-NEXT: %extract2:_(p0) = G_EXTRACT_VECTOR_ELT [[PTR_ADD1]](<2 x p0>), %zero(s64)
+    ; CHECK-NEXT: %extract3:_(p0) = G_EXTRACT_VECTOR_ELT [[PTR_ADD1]](<2 x p0>), %one(s64)
+    ; CHECK-NEXT: $x0 = COPY %extract0(p0)
+    ; CHECK-NEXT: $x1 = COPY %extract1(p0)
+    ; CHECK-NEXT: $x2 = COPY %extract2(p0)
+    ; CHECK-NEXT: $x3 = COPY %extract3(p0)
+    %0:_(p0) = COPY $x0
+    %1:_(p0) = COPY $x1
+    %2:_(p0) = COPY $x2
+    %3:_(p0) = COPY $x3
+    %4:_(s64) = COPY $x4
+    %5:_(s64) = COPY $x5
+    %6:_(s64) = COPY $x6
+    %7:_(s64) = COPY $x7
+    %ptr:_(<4 x p0>) = G_BUILD_VECTOR %0(p0), %1(p0), %2(p0), %3(p0)
+    %add:_(<4 x s64>) = G_BUILD_VECTOR %4(s64), %5(s64), %6(s64), %7(s64)
+    %res:_(<4 x p0>) = G_PTR_ADD %ptr, %add(<4 x s64>)
+    %zero:_(s64) = G_CONSTANT i64 0
+    %one:_(s64) = G_CONSTANT i64 1
+    %two:_(s64) = G_CONSTANT i64 2
+    %three:_(s64) = G_CONSTANT i64 3
+    %extract0:_(p0) = G_EXTRACT_VECTOR_ELT %res(<4 x p0>), %zero(s64)
+    %extract1:_(p0) = G_EXTRACT_VECTOR_ELT %res(<4 x p0>), %one(s64)
+    %extract2:_(p0) = G_EXTRACT_VECTOR_ELT %res(<4 x p0>), %two(s64)
+    %extract3:_(p0) = G_EXTRACT_VECTOR_ELT %res(<4 x p0>), %three(s64)
+    $x0 = COPY %extract0(p0)
+    $x1 = COPY %extract1(p0)
+    $x2 = COPY %extract2(p0)
+    $x3 = COPY %extract3(p0)
+...

@tschuett
Copy link
Member Author

Assertion failed: (llvm::all_of(SrcOps, [&, this](const SrcOp &Op) { return Op.getLLTTy(*getMRI()) == SrcOps[0].getLLTTy(*getMRI()); }) && "type mismatch in input list"), function buildInstr, file MachineIRBuilder.cpp, line 1355.

@dc03-work
Copy link
Contributor

3 x p0 asserts

Does your patch fix this or is it still the case after applying your patch?

@tschuett
Copy link
Member Author

After this patch, it still asserts.

@dc03-work
Copy link
Contributor

dc03-work commented Apr 18, 2024

After this patch, it still asserts.

You can likely put a moreElementsToNextPow2 before the clampNumElements to address this.

Edit: this case should also be added as a test case if it doesn't already exist.

@tschuett
Copy link
Member Author

The new error after moreElementsToNextPow2 is:

unable to legalize instruction: %res:_(<3 x p0>) = G_PTR_ADD %ptr:_, %add:_(<3 x s64>) (in function: test_ptr_add_vec_3xp0)

Comment on lines +187 to +188
.clampNumElements(0, v2p0, v2p0)
.clampNumElements(1, v2s64, v2s64);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should be redundant? Do we support implicit splat in the second operand?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// TODO: Is the offset allowed to be a scalar with a vector?

I would argue for safeness despite redundancy.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To state the obvious, AArch64 does not.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I vote for both: we get fewer and more elements for both type indices.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drop the second one - it's adding untested paths

@tschuett
Copy link
Member Author

Ping.

@dc03-work
Copy link
Contributor

The new error after moreElementsToNextPow2 is:

unable to legalize instruction: %res:_(<3 x p0>) = G_PTR_ADD %ptr:_, %add:_(<3 x s64>) (in function: test_ptr_add_vec_3xp0)

Do you have a more detailed log? I guess scalarization can be done for odd-sized vectors, but having moreElements support would be a better solution.

@tschuett
Copy link
Member Author

I want to address the reported issue.

@davemgreen
Copy link
Collaborator

Can you add a .ll test to make sure we test the entire process? .mir tests are good for specific cases, but don't test the whole pipeline. I think I would also vote for just clamping the first index (and it only needs to clamp the max), but either way that sounds OK to me so long as you can add the test.

@tschuett
Copy link
Member Author

I only got the error message about legalization..

@tschuett
Copy link
Member Author

Unfortunately, I cannot add a ll file there are more issues:

bb.1 (%ir-block.0):
  liveins: $q0, $x0
  %0:_(p0) = COPY $x0
  %1:_(<4 x s32>) = COPY $q0
  %2:_(<4 x p0>) = G_BUILD_VECTOR %0:_(p0), %0:_(p0), %0:_(p0), %0:_(p0)
  %3:_(<4 x s64>) = G_SEXT %1:_(<4 x s32>)
  %4:_(<4 x p0>) = G_PTR_ADD %2:_, %3:_(<4 x s64>)
  %5:_(<4 x p0>) = COPY %4:_(<4 x p0>)
  %6:_(<2 x s64>), %7:_(<2 x s64>) = G_UNMERGE_VALUES %5:_(<4 x p0>)
  $q0 = COPY %6:_(<2 x s64>)
  $q1 = COPY %7:_(<2 x s64>)
  RET_ReallyLR implicit $q0, implicit $q1

# End machine code for function ptr_add_4_vector_ptr.

*** Bad machine code: G_UNMERGE_VALUES source operand does not match vector destination operands ***

@tschuett
Copy link
Member Author

Test:

define <4 x ptr> @ptr_add_4_vector_ptr(ptr %vec, <4 x i32> %offset) {
  %ptr_add = getelementptr i8, ptr %vec, <4 x i32> %offset
  ret <4 x ptr> %ptr_add
}

@tschuett
Copy link
Member Author

opened issue for the crash
#90202

@davemgreen
Copy link
Collaborator

Thanks - it looks like that G_UNMERGE_VALUES is maybe invalid, and it is probably the return type that is going wrong.

Can you change the test to store a value to show this is working now, and we can look at #90202 separately.

@tschuett
Copy link
Member Author

What do you mean by store? The mir test shows that it successfully legalized the ptradd.

@davemgreen
Copy link
Collaborator

What do you mean by store? The mir test shows that it successfully legalized the ptradd.

define void @ptr_add_4_vector_ptr(ptr %vec, <4 x i32> %offset) {
  %ptr_add = getelementptr i8, ptr %vec, <4 x i32> %offset
  store <4 x ptr> %ptr_add, ptr %vec
  ret void
}

As the problems you have encountered has shown, mir tests are not very useful in isolation and end-to-end tests work better if they are available.

@tschuett
Copy link
Member Author

tschuett commented Apr 28, 2024

The error message at the top is about legalization. The error is about calling conventions. It is unrelated.

Actually, it is the opposite. There is a fleet of legalize-opcode.mir files. We test legalization in isolation with mir files. End-to-end test are a distraction for testing legalization. The mir file exactly shows how it legalized the ptradd.

Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but I'd prefer to drop the untested clamp source type

Comment on lines +187 to +188
.clampNumElements(0, v2p0, v2p0)
.clampNumElements(1, v2s64, v2s64);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drop the second one - it's adding untested paths

@tschuett tschuett merged commit 812c302 into llvm:main May 8, 2024
7 checks passed
@tschuett tschuett deleted the gisel-ptr-add2 branch May 8, 2024 03:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants