Skip to content

Conversation

@cmc-rep
Copy link
Contributor

@cmc-rep cmc-rep commented Nov 26, 2025

This is the followup of #168135

@llvmbot
Copy link
Member

llvmbot commented Nov 26, 2025

@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-vectorizers

@llvm/pr-subscribers-llvm-transforms

Author: Gang Chen (cmc-rep)

Changes

This is the followup of #168135


Full diff: https://github.com/llvm/llvm-project/pull/169671.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp (+4-4)
  • (modified) llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/vectorize-redund-loads.ll (+27)
diff --git a/llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp b/llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
index 6d24c407eb5f4..844c761c0e556 100644
--- a/llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
@@ -953,15 +953,15 @@ bool Vectorizer::vectorizeChain(Chain &C) {
       unsigned EOffset =
           (E.OffsetFromLeader - C[0].OffsetFromLeader).getZExtValue();
       unsigned VecIdx = 8 * EOffset / DL.getTypeSizeInBits(VecElemTy);
-      if (auto *VT = dyn_cast<FixedVectorType>(T)) {
+      if (VecTy == VecElemTy) {
+        V = VecInst;
+      } else if (auto *VT = dyn_cast<FixedVectorType>(T)) {
         auto Mask = llvm::to_vector<8>(
             llvm::seq<int>(VecIdx, VecIdx + VT->getNumElements()));
         V = Builder.CreateShuffleVector(VecInst, Mask, I->getName());
-      } else if (VecTy != VecElemTy) {
+      } else {
         V = Builder.CreateExtractElement(VecInst, Builder.getInt32(VecIdx),
                                          I->getName());
-      } else {
-        V = VecInst;
       }
       if (V->getType() != I->getType())
         V = Builder.CreateBitOrPointerCast(V, I->getType());
diff --git a/llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/vectorize-redund-loads.ll b/llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/vectorize-redund-loads.ll
index 55b511fd51a2b..802795da47894 100644
--- a/llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/vectorize-redund-loads.ll
+++ b/llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/vectorize-redund-loads.ll
@@ -1,6 +1,33 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 6
 ; RUN: opt -mtriple=amdgcn-amd-amdhsa -passes=load-store-vectorizer -S -o - %s | FileCheck %s
 
+define void @onevec(ptr %ptr) {
+; CHECK-LABEL: define void @onevec(
+; CHECK-SAME: ptr [[PTR:%.*]]) {
+; CHECK-NEXT:    [[TMP1:%.*]] = load i32, ptr [[PTR]], align 4
+; CHECK-NEXT:    [[TMP2:%.*]] = bitcast i32 [[TMP1]] to <1 x i32>
+; CHECK-NEXT:    [[GEP1:%.*]] = getelementptr inbounds i8, ptr [[PTR]], i32 16
+; CHECK-NEXT:    [[TMP3:%.*]] = load i32, ptr [[GEP1]], align 4
+; CHECK-NEXT:    [[TMP4:%.*]] = bitcast i32 [[TMP3]] to <1 x i32>
+; CHECK-NEXT:    [[GEP2:%.*]] = getelementptr inbounds i8, ptr [[PTR]], i32 32
+; CHECK-NEXT:    [[TMP5:%.*]] = load i32, ptr [[GEP2]], align 4
+; CHECK-NEXT:    [[TMP6:%.*]] = bitcast i32 [[TMP5]] to <1 x i32>
+; CHECK-NEXT:    [[TMP7:%.*]] = bitcast i32 [[TMP5]] to <1 x i32>
+; CHECK-NEXT:    ret void
+;
+  %ld0 = load <1 x i32>, ptr %ptr, align 4
+  %ld1 = load i32, ptr %ptr, align 4
+
+  %gep1 = getelementptr inbounds i8, ptr %ptr, i32 16
+  %ld2 = load i32, ptr %gep1, align 4
+  %ld3 = load <1 x i32>, ptr %gep1, align 4
+
+  %gep2 = getelementptr inbounds i8, ptr %ptr, i32 32
+  %ld4 = load <1 x i32>, ptr %gep2, align 4
+  %ld5 = load <1 x i32>, ptr %gep2, align 4
+  ret void
+}
+
 define void @test(ptr %ptr) {
 ; CHECK-LABEL: define void @test(
 ; CHECK-SAME: ptr [[PTR:%.*]]) {

(E.OffsetFromLeader - C[0].OffsetFromLeader).getZExtValue();
unsigned VecIdx = 8 * EOffset / DL.getTypeSizeInBits(VecElemTy);
if (auto *VT = dyn_cast<FixedVectorType>(T)) {
if (VecTy == VecElemTy) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be equivalent to change the condition to this? I think it would be clearer to the reader.

Suggested change
if (VecTy == VecElemTy) {
if (!VecTy->isVectorTy()) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to have <1 x <2 x i16>> changed to <2 x i16>?
If so, VecTy is still a vector type

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VecElemTy comes from getChainElemTy which has to return a scalar type, right? So for the special case we are handling here where NumElem == 1, VecElemTy and therefore VecTy should always be a scalar type.

@ronlieb ronlieb self-requested a review November 26, 2025 16:47
@dakersnar
Copy link
Contributor

@cmc-rep feel free to merge this before my change if you think it is ready

@cmc-rep cmc-rep merged commit ceba82f into llvm:main Nov 27, 2025
8 of 9 checks passed
tanji-dg pushed a commit to tanji-dg/llvm-project that referenced this pull request Nov 27, 2025
ronlieb pushed a commit to ROCm/llvm-project that referenced this pull request Nov 27, 2025
GeneraluseAI pushed a commit to GeneraluseAI/llvm-project that referenced this pull request Nov 27, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants