-
Notifications
You must be signed in to change notification settings - Fork 15.3k
[LoadStoreVectorizer] Fix one-element vector handling #169671
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@llvm/pr-subscribers-backend-amdgpu @llvm/pr-subscribers-llvm-transforms Author: Gang Chen (cmc-rep) ChangesThis is the followup of #168135 Full diff: https://github.com/llvm/llvm-project/pull/169671.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp b/llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
index 6d24c407eb5f4..844c761c0e556 100644
--- a/llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
@@ -953,15 +953,15 @@ bool Vectorizer::vectorizeChain(Chain &C) {
unsigned EOffset =
(E.OffsetFromLeader - C[0].OffsetFromLeader).getZExtValue();
unsigned VecIdx = 8 * EOffset / DL.getTypeSizeInBits(VecElemTy);
- if (auto *VT = dyn_cast<FixedVectorType>(T)) {
+ if (VecTy == VecElemTy) {
+ V = VecInst;
+ } else if (auto *VT = dyn_cast<FixedVectorType>(T)) {
auto Mask = llvm::to_vector<8>(
llvm::seq<int>(VecIdx, VecIdx + VT->getNumElements()));
V = Builder.CreateShuffleVector(VecInst, Mask, I->getName());
- } else if (VecTy != VecElemTy) {
+ } else {
V = Builder.CreateExtractElement(VecInst, Builder.getInt32(VecIdx),
I->getName());
- } else {
- V = VecInst;
}
if (V->getType() != I->getType())
V = Builder.CreateBitOrPointerCast(V, I->getType());
diff --git a/llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/vectorize-redund-loads.ll b/llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/vectorize-redund-loads.ll
index 55b511fd51a2b..802795da47894 100644
--- a/llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/vectorize-redund-loads.ll
+++ b/llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/vectorize-redund-loads.ll
@@ -1,6 +1,33 @@
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 6
; RUN: opt -mtriple=amdgcn-amd-amdhsa -passes=load-store-vectorizer -S -o - %s | FileCheck %s
+define void @onevec(ptr %ptr) {
+; CHECK-LABEL: define void @onevec(
+; CHECK-SAME: ptr [[PTR:%.*]]) {
+; CHECK-NEXT: [[TMP1:%.*]] = load i32, ptr [[PTR]], align 4
+; CHECK-NEXT: [[TMP2:%.*]] = bitcast i32 [[TMP1]] to <1 x i32>
+; CHECK-NEXT: [[GEP1:%.*]] = getelementptr inbounds i8, ptr [[PTR]], i32 16
+; CHECK-NEXT: [[TMP3:%.*]] = load i32, ptr [[GEP1]], align 4
+; CHECK-NEXT: [[TMP4:%.*]] = bitcast i32 [[TMP3]] to <1 x i32>
+; CHECK-NEXT: [[GEP2:%.*]] = getelementptr inbounds i8, ptr [[PTR]], i32 32
+; CHECK-NEXT: [[TMP5:%.*]] = load i32, ptr [[GEP2]], align 4
+; CHECK-NEXT: [[TMP6:%.*]] = bitcast i32 [[TMP5]] to <1 x i32>
+; CHECK-NEXT: [[TMP7:%.*]] = bitcast i32 [[TMP5]] to <1 x i32>
+; CHECK-NEXT: ret void
+;
+ %ld0 = load <1 x i32>, ptr %ptr, align 4
+ %ld1 = load i32, ptr %ptr, align 4
+
+ %gep1 = getelementptr inbounds i8, ptr %ptr, i32 16
+ %ld2 = load i32, ptr %gep1, align 4
+ %ld3 = load <1 x i32>, ptr %gep1, align 4
+
+ %gep2 = getelementptr inbounds i8, ptr %ptr, i32 32
+ %ld4 = load <1 x i32>, ptr %gep2, align 4
+ %ld5 = load <1 x i32>, ptr %gep2, align 4
+ ret void
+}
+
define void @test(ptr %ptr) {
; CHECK-LABEL: define void @test(
; CHECK-SAME: ptr [[PTR:%.*]]) {
|
| (E.OffsetFromLeader - C[0].OffsetFromLeader).getZExtValue(); | ||
| unsigned VecIdx = 8 * EOffset / DL.getTypeSizeInBits(VecElemTy); | ||
| if (auto *VT = dyn_cast<FixedVectorType>(T)) { | ||
| if (VecTy == VecElemTy) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be equivalent to change the condition to this? I think it would be clearer to the reader.
| if (VecTy == VecElemTy) { | |
| if (!VecTy->isVectorTy()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to have <1 x <2 x i16>> changed to <2 x i16>?
If so, VecTy is still a vector type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
VecElemTy comes from getChainElemTy which has to return a scalar type, right? So for the special case we are handling here where NumElem == 1, VecElemTy and therefore VecTy should always be a scalar type.
This is the followup of llvm#168135
220a12d to
558e3f6
Compare
|
@cmc-rep feel free to merge this before my change if you think it is ready |
This is the followup of llvm#168135
This is the followup of llvm#168135
This is the followup of llvm#168135
This is the followup of #168135