-
Notifications
You must be signed in to change notification settings - Fork 10.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[GlobalIsel][AArch64] legalize ptr add #89218
Conversation
LLVM ERROR: unable to legalize instruction: %275:_(<4 x p0>) = G_PTR_ADD %268:_, %274:_(<4 x s64>) (in function: prepare_for_pass)
3 x p0 asserts |
@llvm/pr-subscribers-llvm-globalisel @llvm/pr-subscribers-backend-aarch64 Author: Thorsten Schütt (tschuett) ChangesLLVM ERROR: unable to legalize instruction: %275:(<4 x p0>) = G_PTR_ADD %268:, %274:_(<4 x s64>) (in function: prepare_for_pass) Full diff: https://github.com/llvm/llvm-project/pull/89218.diff 2 Files Affected:
diff --git a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
index 85dd0f2eb192d9..3d2d5f8287a70a 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
@@ -183,7 +183,9 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST)
getActionDefinitionsBuilder(G_PTR_ADD)
.legalFor({{p0, s64}, {v2p0, v2s64}})
- .clampScalar(1, s64, s64);
+ .clampScalar(1, s64, s64)
+ .clampNumElements(0, v2p0, v2p0)
+ .clampNumElements(1, v2s64, v2s64);
getActionDefinitionsBuilder(G_PTRMASK).legalFor({{p0, s64}});
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-ptr-add.mir b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-ptr-add.mir
index 1ecd36b55380a6..1d3f7eab79d69d 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-ptr-add.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-ptr-add.mir
@@ -6,12 +6,65 @@ body: |
bb.0.entry:
; CHECK-LABEL: name: test_ptr_add_vec_p0
; CHECK: [[COPY:%[0-9]+]]:_(<2 x p0>) = COPY $q0
- ; CHECK: [[COPY1:%[0-9]+]]:_(<2 x s64>) = COPY $q1
- ; CHECK: [[PTR_ADD:%[0-9]+]]:_(<2 x p0>) = G_PTR_ADD [[COPY]], [[COPY1]](<2 x s64>)
- ; CHECK: $q0 = COPY [[PTR_ADD]](<2 x p0>)
+ ; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(<2 x s64>) = COPY $q1
+ ; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(<2 x p0>) = G_PTR_ADD [[COPY]], [[COPY1]](<2 x s64>)
+ ; CHECK-NEXT: $q0 = COPY [[PTR_ADD]](<2 x p0>)
%0:_(<2 x p0>) = COPY $q0
%1:_(<2 x s64>) = COPY $q1
%3:_(<2 x p0>) = G_PTR_ADD %0, %1(<2 x s64>)
$q0 = COPY %3(<2 x p0>)
...
+---
+name: test_ptr_add_vec_4xp0
+body: |
+ bb.0.entry:
+ ; CHECK-LABEL: name: test_ptr_add_vec_4xp0
+ ; CHECK: [[COPY:%[0-9]+]]:_(p0) = COPY $x0
+ ; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(p0) = COPY $x1
+ ; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(p0) = COPY $x2
+ ; CHECK-NEXT: [[COPY3:%[0-9]+]]:_(p0) = COPY $x3
+ ; CHECK-NEXT: [[COPY4:%[0-9]+]]:_(s64) = COPY $x4
+ ; CHECK-NEXT: [[COPY5:%[0-9]+]]:_(s64) = COPY $x5
+ ; CHECK-NEXT: [[COPY6:%[0-9]+]]:_(s64) = COPY $x6
+ ; CHECK-NEXT: [[COPY7:%[0-9]+]]:_(s64) = COPY $x7
+ ; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x p0>) = G_BUILD_VECTOR [[COPY]](p0), [[COPY1]](p0)
+ ; CHECK-NEXT: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x p0>) = G_BUILD_VECTOR [[COPY2]](p0), [[COPY3]](p0)
+ ; CHECK-NEXT: [[BUILD_VECTOR2:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[COPY4]](s64), [[COPY5]](s64)
+ ; CHECK-NEXT: [[BUILD_VECTOR3:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR [[COPY6]](s64), [[COPY7]](s64)
+ ; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(<2 x p0>) = G_PTR_ADD [[BUILD_VECTOR]], [[BUILD_VECTOR2]](<2 x s64>)
+ ; CHECK-NEXT: [[PTR_ADD1:%[0-9]+]]:_(<2 x p0>) = G_PTR_ADD [[BUILD_VECTOR1]], [[BUILD_VECTOR3]](<2 x s64>)
+ ; CHECK-NEXT: %zero:_(s64) = G_CONSTANT i64 0
+ ; CHECK-NEXT: %one:_(s64) = G_CONSTANT i64 1
+ ; CHECK-NEXT: %extract0:_(p0) = G_EXTRACT_VECTOR_ELT [[PTR_ADD]](<2 x p0>), %zero(s64)
+ ; CHECK-NEXT: %extract1:_(p0) = G_EXTRACT_VECTOR_ELT [[PTR_ADD]](<2 x p0>), %one(s64)
+ ; CHECK-NEXT: %extract2:_(p0) = G_EXTRACT_VECTOR_ELT [[PTR_ADD1]](<2 x p0>), %zero(s64)
+ ; CHECK-NEXT: %extract3:_(p0) = G_EXTRACT_VECTOR_ELT [[PTR_ADD1]](<2 x p0>), %one(s64)
+ ; CHECK-NEXT: $x0 = COPY %extract0(p0)
+ ; CHECK-NEXT: $x1 = COPY %extract1(p0)
+ ; CHECK-NEXT: $x2 = COPY %extract2(p0)
+ ; CHECK-NEXT: $x3 = COPY %extract3(p0)
+ %0:_(p0) = COPY $x0
+ %1:_(p0) = COPY $x1
+ %2:_(p0) = COPY $x2
+ %3:_(p0) = COPY $x3
+ %4:_(s64) = COPY $x4
+ %5:_(s64) = COPY $x5
+ %6:_(s64) = COPY $x6
+ %7:_(s64) = COPY $x7
+ %ptr:_(<4 x p0>) = G_BUILD_VECTOR %0(p0), %1(p0), %2(p0), %3(p0)
+ %add:_(<4 x s64>) = G_BUILD_VECTOR %4(s64), %5(s64), %6(s64), %7(s64)
+ %res:_(<4 x p0>) = G_PTR_ADD %ptr, %add(<4 x s64>)
+ %zero:_(s64) = G_CONSTANT i64 0
+ %one:_(s64) = G_CONSTANT i64 1
+ %two:_(s64) = G_CONSTANT i64 2
+ %three:_(s64) = G_CONSTANT i64 3
+ %extract0:_(p0) = G_EXTRACT_VECTOR_ELT %res(<4 x p0>), %zero(s64)
+ %extract1:_(p0) = G_EXTRACT_VECTOR_ELT %res(<4 x p0>), %one(s64)
+ %extract2:_(p0) = G_EXTRACT_VECTOR_ELT %res(<4 x p0>), %two(s64)
+ %extract3:_(p0) = G_EXTRACT_VECTOR_ELT %res(<4 x p0>), %three(s64)
+ $x0 = COPY %extract0(p0)
+ $x1 = COPY %extract1(p0)
+ $x2 = COPY %extract2(p0)
+ $x3 = COPY %extract3(p0)
+...
|
|
Does your patch fix this or is it still the case after applying your patch? |
After this patch, it still asserts. |
You can likely put a Edit: this case should also be added as a test case if it doesn't already exist. |
The new error after
|
.clampNumElements(0, v2p0, v2p0) | ||
.clampNumElements(1, v2s64, v2s64); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should be redundant? Do we support implicit splat in the second operand?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// TODO: Is the offset allowed to be a scalar with a vector?
I would argue for safeness despite redundancy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To state the obvious, AArch64 does not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I vote for both: we get fewer and more elements for both type indices.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Drop the second one - it's adding untested paths
Ping. |
Do you have a more detailed log? I guess scalarization can be done for odd-sized vectors, but having moreElements support would be a better solution. |
I want to address the reported issue. |
Can you add a .ll test to make sure we test the entire process? .mir tests are good for specific cases, but don't test the whole pipeline. I think I would also vote for just clamping the first index (and it only needs to clamp the max), but either way that sounds OK to me so long as you can add the test. |
I only got the error message about legalization.. |
Unfortunately, I cannot add a ll file there are more issues:
|
Test:
|
opened issue for the crash |
Thanks - it looks like that G_UNMERGE_VALUES is maybe invalid, and it is probably the return type that is going wrong. Can you change the test to store a value to show this is working now, and we can look at #90202 separately. |
What do you mean by store? The mir test shows that it successfully legalized the ptradd. |
As the problems you have encountered has shown, mir tests are not very useful in isolation and end-to-end tests work better if they are available. |
The error message at the top is about legalization. The error is about calling conventions. It is unrelated. Actually, it is the opposite. There is a fleet of legalize-opcode.mir files. We test legalization in isolation with mir files. End-to-end test are a distraction for testing legalization. The mir file exactly shows how it legalized the ptradd. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM but I'd prefer to drop the untested clamp source type
.clampNumElements(0, v2p0, v2p0) | ||
.clampNumElements(1, v2s64, v2s64); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Drop the second one - it's adding untested paths
LLVM ERROR: unable to legalize instruction: %275:(<4 x p0>) = G_PTR_ADD %268:, %274:_(<4 x s64>) (in function: prepare_for_pass)