[SROA] Use !tbaa instead of !tbaa.struct if op matches field. #81289

fhahn · 2024-02-09T18:29:24Z

If a split memory access introduced by SROA accesses precisely a single field of the original operation's !tbaa.struct, use the !tbaa tag for the accessed field directly instead of the full !tbaa.struct.

InstCombine already had a similar logic.

Motivation for this and follow-on patches is to improve codegen for libc++, where using memcpy limits optimizations, like vectorization for code iteration over std::vector<std::complex>: https://godbolt.org/z/f3vqYos3c

Depends on #81285.

Retain TBAAStruct if we fail to match the access to a single field. All users at the moment use this when using the full size of the original access. SROA also retains the original TBAAStruct when accessing parts at offset 0. Motivation for this and follow-on patches is to improve codegen for libc++, where using memcpy limits optimizations, like vectorization for code iteration over std::vector<std::complex<float>>: https://godbolt.org/z/f3vqYos3c Depends on #81284

If a split memory access introduced by SROA accesses precisely a single field of the original operation's !tbaa.struct, use the !tbaa tag for the accessed field directly instead of the full !tbaa.struct. InstCombine already had a similar logic. Motivation for this and follow-on patches is to improve codegen for libc++, where using memcpy limits optimizations, like vectorization for code iteration over std::vector<std::complex<float>>: https://godbolt.org/z/f3vqYos3c Depends on #81285.

llvmbot · 2024-02-09T18:29:54Z

@llvm/pr-subscribers-llvm-analysis
@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-llvm-transforms

Author: Florian Hahn (fhahn)

Changes

If a split memory access introduced by SROA accesses precisely a single field of the original operation's !tbaa.struct, use the !tbaa tag for the accessed field directly instead of the full !tbaa.struct.

InstCombine already had a similar logic.

Motivation for this and follow-on patches is to improve codegen for libc++, where using memcpy limits optimizations, like vectorization for code iteration over std::vector<std::complex<float>>: https://godbolt.org/z/f3vqYos3c

Depends on #81285.

Full diff: https://github.com/llvm/llvm-project/pull/81289.diff

5 Files Affected:

(modified) llvm/include/llvm/IR/Metadata.h (+2)
(modified) llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp (+13)
(modified) llvm/lib/Transforms/Scalar/SROA.cpp (+34-14)
(modified) llvm/test/Transforms/SROA/tbaa-struct2.ll (+10-11)
(modified) llvm/test/Transforms/SROA/tbaa-struct3.ll (+8-8)

diff --git a/llvm/include/llvm/IR/Metadata.h b/llvm/include/llvm/IR/Metadata.h
index 6f23ac44dee968..33363a271d4823 100644
--- a/llvm/include/llvm/IR/Metadata.h
+++ b/llvm/include/llvm/IR/Metadata.h
@@ -849,6 +849,8 @@ struct AAMDNodes {
   /// If his AAMDNode has !tbaa.struct and \p AccessSize matches the size of the
   /// field at offset 0, get the TBAA tag describing the accessed field.
   AAMDNodes adjustForAccess(unsigned AccessSize);
+  AAMDNodes adjustForAccess(size_t Offset, Type *AccessTy,
+                            const DataLayout &DL);
 };
 
 // Specialize DenseMapInfo for AAMDNodes.
diff --git a/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp b/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp
index bfd70414c0340c..b2dc451d581939 100644
--- a/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp
+++ b/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp
@@ -833,3 +833,16 @@ AAMDNodes AAMDNodes::adjustForAccess(unsigned AccessSize) {
   }
   return New;
 }
+
+AAMDNodes AAMDNodes::adjustForAccess(size_t Offset, Type *AccessTy,
+                                     const DataLayout &DL) {
+
+  AAMDNodes New = shift(Offset);
+  if (!DL.typeSizeEqualsStoreSize(AccessTy))
+    return New;
+  TypeSize Size = DL.getTypeStoreSize(AccessTy);
+  if (Size.isScalable())
+    return New;
+
+  return New.adjustForAccess(Size.getKnownMinValue());
+}
diff --git a/llvm/lib/Transforms/Scalar/SROA.cpp b/llvm/lib/Transforms/Scalar/SROA.cpp
index 138dc38b5c14ce..f24cbbc1fe0591 100644
--- a/llvm/lib/Transforms/Scalar/SROA.cpp
+++ b/llvm/lib/Transforms/Scalar/SROA.cpp
@@ -2914,7 +2914,8 @@ class AllocaSliceRewriter : public InstVisitor<AllocaSliceRewriter, bool> {
 
       // Do this after copyMetadataForLoad() to preserve the TBAA shift.
       if (AATags)
-        NewLI->setAAMetadata(AATags.shift(NewBeginOffset - BeginOffset));
+        NewLI->setAAMetadata(AATags.adjustForAccess(
+            NewBeginOffset - BeginOffset, NewLI->getType(), DL));
 
       // Try to preserve nonnull metadata
       V = NewLI;
@@ -2936,7 +2937,9 @@ class AllocaSliceRewriter : public InstVisitor<AllocaSliceRewriter, bool> {
           IRB.CreateAlignedLoad(TargetTy, getNewAllocaSlicePtr(IRB, LTy),
                                 getSliceAlign(), LI.isVolatile(), LI.getName());
       if (AATags)
-        NewLI->setAAMetadata(AATags.shift(NewBeginOffset - BeginOffset));
+        NewLI->setAAMetadata(AATags.adjustForAccess(
+            NewBeginOffset - BeginOffset, NewLI->getType(), DL));
+
       if (LI.isVolatile())
         NewLI->setAtomic(LI.getOrdering(), LI.getSyncScopeID());
       NewLI->copyMetadata(LI, {LLVMContext::MD_mem_parallel_loop_access,
@@ -3011,7 +3014,8 @@ class AllocaSliceRewriter : public InstVisitor<AllocaSliceRewriter, bool> {
     Store->copyMetadata(SI, {LLVMContext::MD_mem_parallel_loop_access,
                              LLVMContext::MD_access_group});
     if (AATags)
-      Store->setAAMetadata(AATags.shift(NewBeginOffset - BeginOffset));
+      Store->setAAMetadata(AATags.adjustForAccess(NewBeginOffset - BeginOffset,
+                                                  V->getType(), DL));
     Pass.DeadInsts.push_back(&SI);
 
     // NOTE: Careful to use OrigV rather than V.
@@ -3038,7 +3042,8 @@ class AllocaSliceRewriter : public InstVisitor<AllocaSliceRewriter, bool> {
     Store->copyMetadata(SI, {LLVMContext::MD_mem_parallel_loop_access,
                              LLVMContext::MD_access_group});
     if (AATags)
-      Store->setAAMetadata(AATags.shift(NewBeginOffset - BeginOffset));
+      Store->setAAMetadata(AATags.adjustForAccess(NewBeginOffset - BeginOffset,
+                                                  V->getType(), DL));
 
     migrateDebugInfo(&OldAI, IsSplit, NewBeginOffset * 8, SliceSize * 8, &SI,
                      Store, Store->getPointerOperand(),
@@ -3097,8 +3102,10 @@ class AllocaSliceRewriter : public InstVisitor<AllocaSliceRewriter, bool> {
     }
     NewSI->copyMetadata(SI, {LLVMContext::MD_mem_parallel_loop_access,
                              LLVMContext::MD_access_group});
-    if (AATags)
-      NewSI->setAAMetadata(AATags.shift(NewBeginOffset - BeginOffset));
+    if (AATags) {
+      NewSI->setAAMetadata(AATags.adjustForAccess(NewBeginOffset - BeginOffset,
+                                                  V->getType(), DL));
+    }
     if (SI.isVolatile())
       NewSI->setAtomic(SI.getOrdering(), SI.getSyncScopeID());
     if (NewSI->isAtomic())
@@ -3280,8 +3287,10 @@ class AllocaSliceRewriter : public InstVisitor<AllocaSliceRewriter, bool> {
         IRB.CreateAlignedStore(V, NewPtr, NewAI.getAlign(), II.isVolatile());
     New->copyMetadata(II, {LLVMContext::MD_mem_parallel_loop_access,
                            LLVMContext::MD_access_group});
-    if (AATags)
-      New->setAAMetadata(AATags.shift(NewBeginOffset - BeginOffset));
+    if (AATags) {
+      New->setAAMetadata(AATags.adjustForAccess(NewBeginOffset - BeginOffset,
+                                                V->getType(), DL));
+    }
 
     migrateDebugInfo(&OldAI, IsSplit, NewBeginOffset * 8, SliceSize * 8, &II,
                      New, New->getPointerOperand(), V, DL);
@@ -3486,7 +3495,8 @@ class AllocaSliceRewriter : public InstVisitor<AllocaSliceRewriter, bool> {
       Load->copyMetadata(II, {LLVMContext::MD_mem_parallel_loop_access,
                               LLVMContext::MD_access_group});
       if (AATags)
-        Load->setAAMetadata(AATags.shift(NewBeginOffset - BeginOffset));
+        Load->setAAMetadata(AATags.adjustForAccess(NewBeginOffset - BeginOffset,
+                                                   Load->getType(), DL));
       Src = Load;
     }
 
@@ -3507,8 +3517,10 @@ class AllocaSliceRewriter : public InstVisitor<AllocaSliceRewriter, bool> {
         IRB.CreateAlignedStore(Src, DstPtr, DstAlign, II.isVolatile()));
     Store->copyMetadata(II, {LLVMContext::MD_mem_parallel_loop_access,
                              LLVMContext::MD_access_group});
-    if (AATags)
-      Store->setAAMetadata(AATags.shift(NewBeginOffset - BeginOffset));
+    if (AATags) {
+      Store->setAAMetadata(AATags.adjustForAccess(NewBeginOffset - BeginOffset,
+                                                  Src->getType(), DL));
+    }
 
     APInt Offset(DL.getIndexTypeSizeInBits(DstPtr->getType()), 0);
     if (IsDest) {
@@ -3836,7 +3848,8 @@ class AggLoadStoreRewriter : public InstVisitor<AggLoadStoreRewriter, bool> {
           DL.getIndexSizeInBits(Ptr->getType()->getPointerAddressSpace()), 0);
       if (AATags &&
           GEPOperator::accumulateConstantOffset(BaseTy, GEPIndices, DL, Offset))
-        Load->setAAMetadata(AATags.shift(Offset.getZExtValue()));
+        Load->setAAMetadata(
+            AATags.adjustForAccess(Offset.getZExtValue(), Load->getType(), DL));
 
       Agg = IRB.CreateInsertValue(Agg, Load, Indices, Name + ".insert");
       LLVM_DEBUG(dbgs() << "          to: " << *Load << "\n");
@@ -3887,8 +3900,10 @@ class AggLoadStoreRewriter : public InstVisitor<AggLoadStoreRewriter, bool> {
       APInt Offset(
           DL.getIndexSizeInBits(Ptr->getType()->getPointerAddressSpace()), 0);
       GEPOperator::accumulateConstantOffset(BaseTy, GEPIndices, DL, Offset);
-      if (AATags)
-        Store->setAAMetadata(AATags.shift(Offset.getZExtValue()));
+      if (AATags) {
+        Store->setAAMetadata(AATags.adjustForAccess(
+            Offset.getZExtValue(), ExtractValue->getType(), DL));
+      }
 
       // migrateDebugInfo requires the base Alloca. Walk to it from this gep.
       // If we cannot (because there's an intervening non-const or unbounded
@@ -4542,6 +4557,7 @@ bool SROA::presplitLoadsAndStores(AllocaInst &AI, AllocaSlices &AS) {
 
       Value *StoreBasePtr = SI->getPointerOperand();
       IRB.SetInsertPoint(SI);
+      AAMDNodes AATags = SI->getAAMetadata();
 
       LLVM_DEBUG(dbgs() << "    Splitting store of load: " << *SI << "\n");
 
@@ -4561,6 +4577,10 @@ bool SROA::presplitLoadsAndStores(AllocaInst &AI, AllocaSlices &AS) {
         PStore->copyMetadata(*SI, {LLVMContext::MD_mem_parallel_loop_access,
                                    LLVMContext::MD_access_group,
                                    LLVMContext::MD_DIAssignID});
+
+        if (AATags)
+          PStore->setAAMetadata(
+              AATags.adjustForAccess(PartOffset, PLoad->getType(), DL));
         LLVM_DEBUG(dbgs() << "      +" << PartOffset << ":" << *PStore << "\n");
       }
 
diff --git a/llvm/test/Transforms/SROA/tbaa-struct2.ll b/llvm/test/Transforms/SROA/tbaa-struct2.ll
index 1fd37e82d67775..02c99a2b329457 100644
--- a/llvm/test/Transforms/SROA/tbaa-struct2.ll
+++ b/llvm/test/Transforms/SROA/tbaa-struct2.ll
@@ -13,9 +13,9 @@ define double @bar(ptr %wishart) {
 ; CHECK-NEXT:    [[TMP_SROA_3:%.*]] = alloca [4 x i8], align 4
 ; CHECK-NEXT:    [[TMP_SROA_0_0_COPYLOAD:%.*]] = load double, ptr [[WISHART:%.*]], align 8, !tbaa.struct [[TBAA_STRUCT0:![0-9]+]]
 ; CHECK-NEXT:    [[TMP_SROA_2_0_WISHART_SROA_IDX:%.*]] = getelementptr inbounds i8, ptr [[WISHART]], i64 8
-; CHECK-NEXT:    [[TMP_SROA_2_0_COPYLOAD:%.*]] = load i32, ptr [[TMP_SROA_2_0_WISHART_SROA_IDX]], align 8, !tbaa.struct [[TBAA_STRUCT7:![0-9]+]]
+; CHECK-NEXT:    [[TMP_SROA_2_0_COPYLOAD:%.*]] = load i32, ptr [[TMP_SROA_2_0_WISHART_SROA_IDX]], align 8, !tbaa [[TBAA5:![0-9]+]]
 ; CHECK-NEXT:    [[TMP_SROA_3_0_WISHART_SROA_IDX:%.*]] = getelementptr inbounds i8, ptr [[WISHART]], i64 12
-; CHECK-NEXT:    call void @llvm.memcpy.p0.p0.i64(ptr align 4 [[TMP_SROA_3]], ptr align 4 [[TMP_SROA_3_0_WISHART_SROA_IDX]], i64 4, i1 false), !tbaa.struct [[TBAA_STRUCT8:![0-9]+]]
+; CHECK-NEXT:    call void @llvm.memcpy.p0.p0.i64(ptr align 4 [[TMP_SROA_3]], ptr align 4 [[TMP_SROA_3_0_WISHART_SROA_IDX]], i64 4, i1 false), !tbaa.struct [[TBAA_STRUCT7:![0-9]+]]
 ; CHECK-NEXT:    [[CALL:%.*]] = call double @subcall(double [[TMP_SROA_0_0_COPYLOAD]], i32 [[TMP_SROA_2_0_COPYLOAD]])
 ; CHECK-NEXT:    ret double [[CALL]]
 ;
@@ -38,15 +38,14 @@ define double @bar(ptr %wishart) {
 ;.
 ; CHECK: attributes #[[ATTR0:[0-9]+]] = { nocallback nofree nounwind willreturn memory(argmem: readwrite) }
 ;.
-; CHECK: [[TBAA_STRUCT0]] = !{i64 0, i64 8, !1, i64 8, i64 4, !5}
-; CHECK: [[META1:![0-9]+]] = !{!2, !2, i64 0}
-; CHECK: [[META2:![0-9]+]] = !{!"double", !3, i64 0}
-; CHECK: [[META3:![0-9]+]] = !{!"omnipotent char", !4, i64 0}
-; CHECK: [[META4:![0-9]+]] = !{!"Simple C++ TBAA"}
-; CHECK: [[META5:![0-9]+]] = !{!6, !6, i64 0}
-; CHECK: [[META6:![0-9]+]] = !{!"int", !3, i64 0}
-; CHECK: [[TBAA_STRUCT7]] = !{i64 0, i64 4, !5}
-; CHECK: [[TBAA_STRUCT8]] = !{}
+; CHECK: [[TBAA_STRUCT0]] = !{i64 0, i64 8, [[META1:![0-9]+]], i64 8, i64 4, [[TBAA5]]}
+; CHECK: [[META1]] = !{[[META2:![0-9]+]], [[META2]], i64 0}
+; CHECK: [[META2]] = !{!"double", [[META3:![0-9]+]], i64 0}
+; CHECK: [[META3]] = !{!"omnipotent char", [[META4:![0-9]+]], i64 0}
+; CHECK: [[META4]] = !{!"Simple C++ TBAA"}
+; CHECK: [[TBAA5]] = !{[[META6:![0-9]+]], [[META6]], i64 0}
+; CHECK: [[META6]] = !{!"int", [[META3]], i64 0}
+; CHECK: [[TBAA_STRUCT7]] = !{}
 ;.
 ;; NOTE: These prefixes are unused and the list is autogenerated. Do not add tests below this line:
 ; CHECK-MODIFY-CFG: {{.*}}
diff --git a/llvm/test/Transforms/SROA/tbaa-struct3.ll b/llvm/test/Transforms/SROA/tbaa-struct3.ll
index 4910e0e07ae380..603e7d708647fc 100644
--- a/llvm/test/Transforms/SROA/tbaa-struct3.ll
+++ b/llvm/test/Transforms/SROA/tbaa-struct3.ll
@@ -7,9 +7,9 @@ define void @load_store_transfer_split_struct_tbaa_2_float(ptr dereferenceable(2
 ; CHECK-NEXT:  entry:
 ; CHECK-NEXT:    [[TMP0:%.*]] = bitcast float [[A]] to i32
 ; CHECK-NEXT:    [[TMP1:%.*]] = bitcast float [[B]] to i32
-; CHECK-NEXT:    store i32 [[TMP0]], ptr [[RES]], align 4
+; CHECK-NEXT:    store i32 [[TMP0]], ptr [[RES]], align 4, !tbaa.struct [[TBAA_STRUCT0:![0-9]+]]
 ; CHECK-NEXT:    [[RES_SROA_IDX:%.*]] = getelementptr inbounds i8, ptr [[RES]], i64 4
-; CHECK-NEXT:    store i32 [[TMP1]], ptr [[RES_SROA_IDX]], align 4
+; CHECK-NEXT:    store i32 [[TMP1]], ptr [[RES_SROA_IDX]], align 4, !tbaa [[TBAA1:![0-9]+]]
 ; CHECK-NEXT:    [[P:%.*]] = load ptr, ptr [[RES]], align 8
 ; CHECK-NEXT:    ret void
 ;
@@ -29,9 +29,9 @@ define void @memcpy_transfer(ptr dereferenceable(24) %res, float %a, float %b) {
 ; CHECK-SAME: ptr dereferenceable(24) [[RES:%.*]], float [[A:%.*]], float [[B:%.*]]) {
 ; CHECK-NEXT:  entry:
 ; CHECK-NEXT:    [[L_PTR:%.*]] = load ptr, ptr [[RES]], align 8
-; CHECK-NEXT:    store float [[A]], ptr [[L_PTR]], align 1, !tbaa.struct [[TBAA_STRUCT0:![0-9]+]]
+; CHECK-NEXT:    store float [[A]], ptr [[L_PTR]], align 1, !tbaa.struct [[TBAA_STRUCT0]]
 ; CHECK-NEXT:    [[TMP_SROA_2_0_L_PTR_SROA_IDX:%.*]] = getelementptr inbounds i8, ptr [[L_PTR]], i64 4
-; CHECK-NEXT:    store float [[B]], ptr [[TMP_SROA_2_0_L_PTR_SROA_IDX]], align 1, !tbaa.struct [[TBAA_STRUCT5:![0-9]+]]
+; CHECK-NEXT:    store float [[B]], ptr [[TMP_SROA_2_0_L_PTR_SROA_IDX]], align 1, !tbaa [[TBAA1]]
 ; CHECK-NEXT:    ret void
 ;
 entry:
@@ -53,7 +53,7 @@ define void @memcpy_transfer_tbaa_field_and_size_do_not_align(ptr dereferenceabl
 ; CHECK-NEXT:    [[TMP_SROA_2_0_L_PTR_SROA_IDX:%.*]] = getelementptr inbounds i8, ptr [[L_PTR]], i64 4
 ; CHECK-NEXT:    [[TMP0:%.*]] = bitcast float [[B]] to i32
 ; CHECK-NEXT:    [[TMP_SROA_2_0_EXTRACT_TRUNC:%.*]] = trunc i32 [[TMP0]] to i16
-; CHECK-NEXT:    store i16 [[TMP_SROA_2_0_EXTRACT_TRUNC]], ptr [[TMP_SROA_2_0_L_PTR_SROA_IDX]], align 1, !tbaa.struct [[TBAA_STRUCT5]]
+; CHECK-NEXT:    store i16 [[TMP_SROA_2_0_EXTRACT_TRUNC]], ptr [[TMP_SROA_2_0_L_PTR_SROA_IDX]], align 1, !tbaa.struct [[TBAA_STRUCT5:![0-9]+]]
 ; CHECK-NEXT:    ret void
 ;
 entry:
@@ -98,10 +98,10 @@ declare void @llvm.memcpy.p0.p0.i64(ptr noalias nocapture writeonly, ptr noalias
 !3 = !{!"omnipotent char", !4, i64 0}
 !4 = !{!"Simple C++ TBAA"}
 ;.
-; CHECK: [[TBAA_STRUCT0]] = !{i64 0, i64 4, [[META1:![0-9]+]], i64 4, i64 4, [[META1]]}
-; CHECK: [[META1]] = !{[[META2:![0-9]+]], [[META2]], i64 0}
+; CHECK: [[TBAA_STRUCT0]] = !{i64 0, i64 4, [[TBAA1]], i64 4, i64 4, [[TBAA1]]}
+; CHECK: [[TBAA1]] = !{[[META2:![0-9]+]], [[META2]], i64 0}
 ; CHECK: [[META2]] = !{!"float", [[META3:![0-9]+]], i64 0}
 ; CHECK: [[META3]] = !{!"omnipotent char", [[META4:![0-9]+]], i64 0}
 ; CHECK: [[META4]] = !{!"Simple C++ TBAA"}
-; CHECK: [[TBAA_STRUCT5]] = !{i64 0, i64 4, [[META1]]}
+; CHECK: [[TBAA_STRUCT5]] = !{i64 0, i64 4, [[TBAA1]]}
 ;.

dobbelaj-snps · 2024-02-09T19:32:59Z

llvm/lib/Transforms/Scalar/SROA.cpp

@@ -4561,6 +4577,10 @@ bool SROA::presplitLoadsAndStores(AllocaInst &AI, AllocaSlices &AS) {
        PStore->copyMetadata(*SI, {LLVMContext::MD_mem_parallel_loop_access,
                                   LLVMContext::MD_access_group,
                                   LLVMContext::MD_DIAssignID});
+
+        if (AATags)
+          PStore->setAAMetadata(


Is this part covered in a testcase ?

Should be now, I added a number of additional tests that should cover all cases here

llvm/test/Transforms/SROA/tbaa-struct3.ll

dobbelaj-snps · 2024-02-09T19:42:09Z

Hmm. 10 changes + 1 new usage of setAAMetaData, But only 4 relevant changes in tests.. That part of SROA seems to lack some testing ?

Motivation for this and follow-on patches is to improve codegen for libc++, where using memcpy limits optimizations, like vectorization for code iteration over std::vector<std::complex<float>>: https://godbolt.org/z/f3vqYos3c Depends on #81289.

fhahn · 2024-02-09T20:15:12Z

Hmm. 10 changes + 1 new usage of setAAMetaData, But only 4 relevant changes in tests.. That part of SROA seems to lack some testing ?

Yes, will add the missing coverage, just wanted to make sure this makes sense in general beforehand

…-clear-struct-if-field-matches Conflicts: llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp

dobbelaj-snps · 2024-02-12T14:14:08Z

Hmm. 10 changes + 1 new usage of setAAMetaData, But only 4 relevant changes in tests.. That part of SROA seems to lack some testing ?

Yes, will add the missing coverage, just wanted to make sure this makes sense in general beforehand

I think it makes sense.

For this one to go through, I think we need at least a testcase that triggers the new 'PStore->setAAMetaData'... case. (line 4582)

…matches

…to users/fhahn/sroa-use-tbaa-for-matching-field-access

…tbaa-for-matching-field-access

Add tests to cover missing cases for #81289 and #81313.

…tbaa-for-matching-field-access Conflicts: llvm/test/Transforms/SROA/tbaa-struct3.ll

…matches

fhahn

@dobbelaj-snps Added a substantial number of tests that should cover all cases now in 2a9b86c

fhahn · 2024-02-15T21:38:40Z

llvm/lib/Transforms/Scalar/SROA.cpp

@@ -4561,6 +4577,10 @@ bool SROA::presplitLoadsAndStores(AllocaInst &AI, AllocaSlices &AS) {
        PStore->copyMetadata(*SI, {LLVMContext::MD_mem_parallel_loop_access,
                                   LLVMContext::MD_access_group,
                                   LLVMContext::MD_DIAssignID});
+
+        if (AATags)
+          PStore->setAAMetadata(


Should be now, I added a number of additional tests that should cover all cases here

dobbelaj-snps

lgtm

…-access

…offset and size. (#81313) Motivation for this and follow-on patches is to improve codegen for libc++, where using memcpy limits optimizations, like vectorization for code iteration over std::vector<std::complex<float>>: https://godbolt.org/z/f3vqYos3c Depends on #81289. PR: #81313

Add tests to cover missing cases for llvm#81289 and llvm#81313. (cherry-picked from 2a9b86c)

…1289) If a split memory access introduced by SROA accesses precisely a single field of the original operation's !tbaa.struct, use the !tbaa tag for the accessed field directly instead of the full !tbaa.struct. InstCombine already had a similar logic. Motivation for this and follow-on patches is to improve codegen for libc++, where using memcpy limits optimizations, like vectorization for code iteration over std::vector<std::complex<float>>: https://godbolt.org/z/f3vqYos3c Depends on llvm#81285. (cherry-picked from 53c0e80)

…offset and size. (llvm#81313) Motivation for this and follow-on patches is to improve codegen for libc++, where using memcpy limits optimizations, like vectorization for code iteration over std::vector<std::complex<float>>: https://godbolt.org/z/f3vqYos3c Depends on llvm#81289. PR: llvm#81313 (cherry-picked from dc85719)

fhahn added 3 commits February 9, 2024 17:16

[TBAA] Extract logic to use TBAA tag for field of !tbaa.struct (NFC).

e6de9f2

fhahn requested review from nikic, wsmoses, efriedma-quic and dobbelaj-snps February 9, 2024 18:29

llvmbot added llvm:ir llvm:analysis llvm:transforms labels Feb 9, 2024

dobbelaj-snps mentioned this pull request Feb 9, 2024

[IR][TBAA] Allow multiple fileds with same offset in TBAA struct-path #76356

Open

dobbelaj-snps reviewed Feb 9, 2024

View reviewed changes

llvm/test/Transforms/SROA/tbaa-struct3.ll Show resolved Hide resolved

fhahn mentioned this pull request Feb 9, 2024

[TBAA] Use !tbaa for first accessed field if it is an exact match in offset and size. #81313

Merged

Merge remote-tracking branch 'origin/main' into users/fhahn/tbaa-only…

993aa19

…-clear-struct-if-field-matches Conflicts: llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp

fhahn added 3 commits February 14, 2024 14:12

Merge branch 'main' into users/fhahn/tbaa-only-clear-struct-if-field-…

cf510a1

…matches

Merge branch 'users/fhahn/tbaa-only-clear-struct-if-field-matches' in…

ee3cf18

…to users/fhahn/sroa-use-tbaa-for-matching-field-access

Merge remote-tracking branch 'origin/main' into users/fhahn/sroa-use-…

591a3f3

…tbaa-for-matching-field-access

fhahn added a commit that referenced this pull request Feb 15, 2024

[SROA] Extend !tbaa.struct test coverage with multiple missing cases.

2a9b86c

Add tests to cover missing cases for #81289 and #81313.

fhahn added 3 commits February 15, 2024 21:22

Merge remote-tracking branch 'origin/main' into users/fhahn/sroa-use-…

ab4b11c

…tbaa-for-matching-field-access Conflicts: llvm/test/Transforms/SROA/tbaa-struct3.ll

!fixup update new test cases, add handle missing cases.

137fe54

Merge branch 'main' into users/fhahn/tbaa-only-clear-struct-if-field-…

b5fbe7f

…matches

fhahn requested review from aaupov, maksfb and rafaelauler as code owners February 15, 2024 21:32

fhahn removed request for JDevlieghere, aaupov, matthias-springer, banach-space, dcci, hanhanW, maksfb, rafaelauler, Superty, nicolasvasilache, Endilll, dcaballe, PeimingLiu, aartbik, ayermolo, Groverkss and yinying-lisa-li February 15, 2024 21:35

fhahn commented Feb 15, 2024

View reviewed changes

dobbelaj-snps approved these changes Feb 16, 2024

View reviewed changes

Base automatically changed from users/fhahn/tbaa-only-clear-struct-if-field-matches to main February 16, 2024 09:49

fhahn added 2 commits February 16, 2024 12:19

Merge branch 'main' into users/fhahn/sroa-use-tbaa-for-matching-field…

a3baf66

…-access

!fixup update clang test.

51c915c

fhahn merged commit 53c0e80 into main Feb 16, 2024
3 of 4 checks passed

fhahn deleted the users/fhahn/sroa-use-tbaa-for-matching-field-access branch February 16, 2024 13:45

fhahn added a commit to fhahn/llvm-project that referenced this pull request Mar 1, 2024

[SROA] Extend !tbaa.struct test coverage with multiple missing cases.

268147a

Add tests to cover missing cases for llvm#81289 and llvm#81313. (cherry-picked from 2a9b86c)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[SROA] Use !tbaa instead of !tbaa.struct if op matches field. #81289

[SROA] Use !tbaa instead of !tbaa.struct if op matches field. #81289

fhahn commented Feb 9, 2024

llvmbot commented Feb 9, 2024 •

edited

dobbelaj-snps Feb 9, 2024

fhahn Feb 15, 2024

dobbelaj-snps Feb 16, 2024

dobbelaj-snps commented Feb 9, 2024 •

edited

fhahn commented Feb 9, 2024

dobbelaj-snps commented Feb 12, 2024

fhahn left a comment

fhahn Feb 15, 2024

dobbelaj-snps left a comment

[SROA] Use !tbaa instead of !tbaa.struct if op matches field. #81289

[SROA] Use !tbaa instead of !tbaa.struct if op matches field. #81289

Conversation

fhahn commented Feb 9, 2024

llvmbot commented Feb 9, 2024 • edited

dobbelaj-snps Feb 9, 2024

Choose a reason for hiding this comment

fhahn Feb 15, 2024

Choose a reason for hiding this comment

dobbelaj-snps Feb 16, 2024

Choose a reason for hiding this comment

dobbelaj-snps commented Feb 9, 2024 • edited

fhahn commented Feb 9, 2024

dobbelaj-snps commented Feb 12, 2024

fhahn left a comment

Choose a reason for hiding this comment

fhahn Feb 15, 2024

Choose a reason for hiding this comment

dobbelaj-snps left a comment

Choose a reason for hiding this comment

llvmbot commented Feb 9, 2024 •

edited

dobbelaj-snps commented Feb 9, 2024 •

edited