[AArch64] use `isTRNMask` to calculate shuffle costs #171524

ginsbach · 2025-12-09T22:53:07Z

This builds on #169858 to fix the divergence in codegen between two very similar functions initially observed in #137447 (represented in the diff by test cases @transpose_splat_constants and @transpose_constants_splat:

int8x16_t f(int8_t x)
{
  return (int8x16_t) { x, 0, x, 1, x, 2, x, 3,
                       x, 4, x, 5, x, 6, x, 7 };
}

int8x16_t g(int8_t x)
{
  return (int8x16_t) { 0, x, 1, x, 2, x, 3, x,
                       4, x, 5, x, 6, x, 7, x };
}

The PR uses an additional isTRNMask call in AArch64TTIImpl::getShuffleCost to ensure that we treat shuffle masks as transpose masks even if isTransposeMask fails to recognise them (meaning that Kind == TTI::SK_Transpose cannot be relied upon).

Follow-up work could consider modifying isTransposeMask, but that would also impact other backends than AArch64.

github-actions · 2025-12-09T22:54:53Z

✅ With the latest revision this PR passed the C/C++ code formatter.

llvmbot · 2025-12-10T08:27:50Z

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-transforms

Author: Philip Ginsbach-Chen (ginsbach)

Changes

This builds on #169858 to fix the divergence in codegen between two very similar functions initially observed in #137447 (represented in the diff by test cases @transpose_splat_constants and @transpose_constants_splat:

int8x16_t f(int8_t x)
{
  return (int8x16_t) { x, 0, x, 1, x, 2, x, 3,
                       x, 4, x, 5, x, 6, x, 7 };
}

int8x16_t g(int8_t x)
{
  return (int8x16_t) { 0, x, 1, x, 2, x, 3, x,
                       4, x, 5, x, 6, x, 7, x };
}

The PR consists of two commits:

commit 1 introduces test cases that show the inconsistent effect of SLP;
commit 2 overrides improveShuffleKindFromMask in AArch64TTIImpl to use isTRNMask. Its original implementation relies on ShuffleVectorInst::isTransposeMask, which is more restrictive (doesn't allow flipped operands).

I can think of several other ways to achieve the same result:

make isTransposeMask match isTRNMask, but that would affect codegen also for other targets;
remove ShuffleKind::SK_Transpose and instead add logic to getShuffleCost similar to how we deal with isZIPMask, but that would require updating getShuffleCost also for other targets;
add code to AArch64TTIImpl::getShuffleCost that checks isTRNMask when Kind == TTI::SK_PermuteTwoSrc.

Option 3 seems less clean, whereas options 1&2 are not contained nicely inside AArch64. If we do want 1 or 2, I think it would still be better to leave that to a follow-up.

Full diff: https://github.com/llvm/llvm-project/pull/171524.diff

3 Files Affected:

(modified) llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp (+13)
(modified) llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h (+5)
(added) llvm/test/Transforms/SLPVectorizer/AArch64/transpose-with-constants.ll (+47)

diff --git a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
index 043be554f8441..0e667e6ecea86 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
@@ -5918,6 +5918,19 @@ InstructionCost AArch64TTIImpl::getPartialReductionCost(
   return Cost + 2;
 }
 
+TTI::ShuffleKind AArch64TTIImpl::improveShuffleKindFromMask(
+    TTI::ShuffleKind Kind, ArrayRef<int> Mask, VectorType *SrcTy, int &Index,
+    VectorType *&SubTy) const {
+  TTI::ShuffleKind SuperResult = BasicTTIImplBase::improveShuffleKindFromMask(
+      Kind, Mask, SrcTy, Index, SubTy);
+  if (SuperResult == TTI::ShuffleKind::SK_PermuteTwoSrc && !Mask.empty()) {
+    unsigned DummyUnsigned;
+    if (isTRNMask(Mask, Mask.size(), DummyUnsigned, DummyUnsigned))
+      return TTI::ShuffleKind::SK_Transpose;
+  }
+  return SuperResult;
+}
+
 InstructionCost
 AArch64TTIImpl::getShuffleCost(TTI::ShuffleKind Kind, VectorType *DstTy,
                                VectorType *SrcTy, ArrayRef<int> Mask,
diff --git a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
index ecefe2a7f1380..084c1af4298a1 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
+++ b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
@@ -494,6 +494,11 @@ class AArch64TTIImpl final : public BasicTTIImplBase<AArch64TTIImpl> {
       bool IsUnsigned, unsigned RedOpcode, Type *ResTy, VectorType *Ty,
       TTI::TargetCostKind CostKind = TTI::TCK_RecipThroughput) const override;
 
+  TTI::ShuffleKind improveShuffleKindFromMask(TTI::ShuffleKind Kind,
+                                              ArrayRef<int> Mask,
+                                              VectorType *SrcTy, int &Index,
+                                              VectorType *&SubTy) const;
+
   InstructionCost
   getShuffleCost(TTI::ShuffleKind Kind, VectorType *DstTy, VectorType *SrcTy,
                  ArrayRef<int> Mask, TTI::TargetCostKind CostKind, int Index,
diff --git a/llvm/test/Transforms/SLPVectorizer/AArch64/transpose-with-constants.ll b/llvm/test/Transforms/SLPVectorizer/AArch64/transpose-with-constants.ll
new file mode 100644
index 0000000000000..66461c9d91064
--- /dev/null
+++ b/llvm/test/Transforms/SLPVectorizer/AArch64/transpose-with-constants.ll
@@ -0,0 +1,47 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt < %s -passes=slp-vectorizer -S | FileCheck %s
+
+target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
+target triple = "aarch64--linux-gnu"
+
+define dso_local <16 x i8> @transpose_splat_constants(i8 noundef %x) {
+; CHECK-LABEL: @transpose_splat_constants(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[TMP0:%.*]] = insertelement <8 x i8> poison, i8 [[X:%.*]], i32 0
+; CHECK-NEXT:    [[TMP1:%.*]] = shufflevector <8 x i8> [[TMP0]], <8 x i8> poison, <8 x i32> zeroinitializer
+; CHECK-NEXT:    [[TMP2:%.*]] = shufflevector <8 x i8> [[TMP1]], <8 x i8> poison, <16 x i32> <i32 0, i32 poison, i32 1, i32 poison, i32 2, i32 poison, i32 3, i32 poison, i32 4, i32 poison, i32 5, i32 poison, i32 6, i32 poison, i32 7, i32 poison>
+; CHECK-NEXT:    [[TMP3:%.*]] = shufflevector <16 x i8> <i8 poison, i8 0, i8 poison, i8 1, i8 poison, i8 2, i8 poison, i8 3, i8 poison, i8 4, i8 poison, i8 5, i8 poison, i8 6, i8 poison, i8 7>, <16 x i8> [[TMP2]], <16 x i32> <i32 16, i32 1, i32 18, i32 3, i32 20, i32 5, i32 22, i32 7, i32 24, i32 9, i32 26, i32 11, i32 28, i32 13, i32 30, i32 15>
+; CHECK-NEXT:    ret <16 x i8> [[TMP3]]
+;
+entry:
+  %0 = insertelement <16 x i8> <i8 poison, i8 0, i8 poison, i8 1, i8 poison, i8 2, i8 poison, i8 3, i8 poison, i8 4, i8 poison, i8 5, i8 poison, i8 6, i8 poison, i8 7>, i8 %x, i64 0
+  %1 = insertelement <16 x i8> %0, i8 %x, i64 2
+  %2 = insertelement <16 x i8> %1, i8 %x, i64 4
+  %3 = insertelement <16 x i8> %2, i8 %x, i64 6
+  %4 = insertelement <16 x i8> %3, i8 %x, i64 8
+  %5 = insertelement <16 x i8> %4, i8 %x, i64 10
+  %6 = insertelement <16 x i8> %5, i8 %x, i64 12
+  %7 = insertelement <16 x i8> %6, i8 %x, i64 14
+  ret <16 x i8> %7
+}
+
+define dso_local <16 x i8> @transpose_constants_splat(i8 noundef %x) {
+; CHECK-LABEL: @transpose_constants_splat(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[TMP0:%.*]] = insertelement <8 x i8> poison, i8 [[X:%.*]], i32 0
+; CHECK-NEXT:    [[TMP1:%.*]] = shufflevector <8 x i8> [[TMP0]], <8 x i8> poison, <8 x i32> zeroinitializer
+; CHECK-NEXT:    [[TMP2:%.*]] = shufflevector <8 x i8> [[TMP1]], <8 x i8> poison, <16 x i32> <i32 0, i32 poison, i32 1, i32 poison, i32 2, i32 poison, i32 3, i32 poison, i32 4, i32 poison, i32 5, i32 poison, i32 6, i32 poison, i32 7, i32 poison>
+; CHECK-NEXT:    [[TMP3:%.*]] = shufflevector <16 x i8> <i8 0, i8 poison, i8 1, i8 poison, i8 2, i8 poison, i8 3, i8 poison, i8 4, i8 poison, i8 5, i8 poison, i8 6, i8 poison, i8 7, i8 poison>, <16 x i8> [[TMP2]], <16 x i32> <i32 0, i32 16, i32 2, i32 18, i32 4, i32 20, i32 6, i32 22, i32 8, i32 24, i32 10, i32 26, i32 12, i32 28, i32 14, i32 30>
+; CHECK-NEXT:    ret <16 x i8> [[TMP3]]
+;
+entry:
+  %0 = insertelement <16 x i8> <i8 0, i8 poison, i8 1, i8 poison, i8 2, i8 poison, i8 3, i8 poison, i8 4, i8 poison, i8 5, i8 poison, i8 6, i8 poison, i8 7, i8 poison>, i8 %x, i64 1
+  %1 = insertelement <16 x i8> %0, i8 %x, i64 3
+  %2 = insertelement <16 x i8> %1, i8 %x, i64 5
+  %3 = insertelement <16 x i8> %2, i8 %x, i64 7
+  %4 = insertelement <16 x i8> %3, i8 %x, i64 9
+  %5 = insertelement <16 x i8> %4, i8 %x, i64 11
+  %6 = insertelement <16 x i8> %5, i8 %x, i64 13
+  %7 = insertelement <16 x i8> %6, i8 %x, i64 15
+  ret <16 x i8> %7
+}

davemgreen

I think that either the generic ShuffleVectorInst::isTransposeMask should match transposes like this, or the isTRNMask can be reused in the same places we currently check isZipMask/isUZPMask/etc.

Can you add direct cost-modelling tests for shuffles of these kinds?

ginsbach · 2025-12-12T17:17:54Z

@davemgreen I have added three more commits that should address your suggestions.

I think that either the generic ShuffleVectorInst::isTransposeMask should match transposes like this, or the isTRNMask can be reused in the same places we currently check isZipMask/isUZPMask/etc.

With commit 3, I have picked the latter approach and used isTRNMask directly in getShuffleCost next to isZipMask, isUZPMask, etc. I have updated the PR description accordingly.

Can you add direct cost-modelling tests for shuffles of these kinds?

Added in commit 4.

Two of the test cases added in commit 4 showed that things can go wrong with two-element shuffles
shufflevector <2 x ???> %v0, <2 x ???> %v1, <2 x ???> <i32 2, i32 0>.
The reason is that these get labelled as TTI::SK_InsertSubvector. I have fixed the problem in commit 5.

github-actions · 2025-12-12T17:26:09Z

🪟 Windows x64 Test Results

128637 tests passed
2817 tests skipped

✅ The build succeeded and all tests passed.

github-actions · 2025-12-12T17:26:09Z

🐧 Linux x64 Test Results

187468 tests passed
4961 tests skipped

✅ The build succeeded and all tests passed.

ginsbach · 2025-12-12T17:31:08Z

I think the test failure is unrelated and seems to have been introduced by #167798.
Edit: confirmed (#167798 (comment))

davemgreen · 2025-12-13T10:36:41Z

Thanks, sounds great. LGTM.

MacDue · 2025-12-14T12:36:00Z

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp

      LT.second.getVectorNumElements() == Mask.size() &&
-      (Kind == TTI::SK_PermuteTwoSrc || Kind == TTI::SK_PermuteSingleSrc) &&
+      (Kind == TTI::SK_PermuteTwoSrc || Kind == TTI::SK_PermuteSingleSrc ||
+       Kind == TTI::SK_InsertSubvector) &&


Maybe leave a comment about why SK_InsertSubvector is included here?

Commit 7 adds such a comment.

MacDue

(Otherwise, LGTM)

ginsbach · 2025-12-14T22:08:37Z

(Otherwise, LGTM)

Thank you. Please, would you merge this for me?

llvm-ci · 2025-12-15T06:08:39Z

LLVM Buildbot has detected a new failure on builder sanitizer-x86_64-linux running on sanitizer-buildbot2 while building llvm at step 2 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/66/builds/23669

Here is the relevant piece of the build log for the reference

Step 2 (annotate) failure: 'python ../sanitizer_buildbot/sanitizers/zorg/buildbot/builders/sanitizers/buildbot_selector.py' (failure)
...
llvm-lit: /home/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/test/lit.common.cfg.py:243: warning: COMPILER_RT_TEST_STANDALONE_BUILD_LIBS=ON, but this test suite does not support testing the just-built runtime libraries when the test compiler is configured to use different runtime libraries. Either modify this test suite to support this test configuration, or set COMPILER_RT_TEST_STANDALONE_BUILD_LIBS=OFF to test the runtime libraries included in the compiler instead.
llvm-lit: /home/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/test/lit.common.cfg.py:254: note: Testing using libraries in "/home/b/sanitizer-x86_64-linux/build/build_default/./lib/../lib/clang/22/lib/i386-unknown-linux-gnu"
llvm-lit: /home/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/test/lit.common.cfg.py:232: warning: Compiler lib dir != compiler-rt lib dir
Compiler libdir:     "/home/b/sanitizer-x86_64-linux/build/build_default/lib/clang/22/lib/i386-unknown-linux-gnu"
compiler-rt libdir:  "/home/b/sanitizer-x86_64-linux/build/build_default/lib/clang/22/lib/x86_64-unknown-linux-gnu"
llvm-lit: /home/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/test/lit.common.cfg.py:243: warning: COMPILER_RT_TEST_STANDALONE_BUILD_LIBS=ON, but this test suite does not support testing the just-built runtime libraries when the test compiler is configured to use different runtime libraries. Either modify this test suite to support this test configuration, or set COMPILER_RT_TEST_STANDALONE_BUILD_LIBS=OFF to test the runtime libraries included in the compiler instead.
llvm-lit: /home/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/test/lit.common.cfg.py:254: note: Testing using libraries in "/home/b/sanitizer-x86_64-linux/build/build_default/./lib/../lib/clang/22/lib/x86_64-unknown-linux-gnu"
llvm-lit: /home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/utils/lit/lit/main.py:74: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 10689 tests, 64 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.
FAIL: XRay-x86_64-linux :: TestCases/Posix/basic-filtering.cpp (9264 of 10689)
******************** TEST 'XRay-x86_64-linux :: TestCases/Posix/basic-filtering.cpp' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 4
/home/b/sanitizer-x86_64-linux/build/build_default/./bin/clang  --driver-mode=g++ -fxray-instrument  -m64   -std=c++11 /home/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/test/xray/TestCases/Posix/basic-filtering.cpp -o /home/b/sanitizer-x86_64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/xray/X86_64LinuxConfig/TestCases/Posix/Output/basic-filtering.cpp.tmp -g
# executed command: /home/b/sanitizer-x86_64-linux/build/build_default/./bin/clang --driver-mode=g++ -fxray-instrument -m64 -std=c++11 /home/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/test/xray/TestCases/Posix/basic-filtering.cpp -o /home/b/sanitizer-x86_64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/xray/X86_64LinuxConfig/TestCases/Posix/Output/basic-filtering.cpp.tmp -g
# note: command had no output on stdout or stderr
# RUN: at line 5
rm -f basic-filtering-*
# executed command: rm -f 'basic-filtering-*'
# note: command had no output on stdout or stderr
# RUN: at line 6
env XRAY_OPTIONS="patch_premain=true xray_mode=xray-basic verbosity=1      xray_logfile_base=basic-filtering-      xray_naive_log_func_duration_threshold_us=1000      xray_naive_log_max_stack_depth=2"  /home/b/sanitizer-x86_64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/xray/X86_64LinuxConfig/TestCases/Posix/Output/basic-filtering.cpp.tmp 2>&1 |      FileCheck /home/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/test/xray/TestCases/Posix/basic-filtering.cpp
# executed command: env 'XRAY_OPTIONS=patch_premain=true xray_mode=xray-basic verbosity=1      xray_logfile_base=basic-filtering-      xray_naive_log_func_duration_threshold_us=1000      xray_naive_log_max_stack_depth=2' /home/b/sanitizer-x86_64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/xray/X86_64LinuxConfig/TestCases/Posix/Output/basic-filtering.cpp.tmp
# note: command had no output on stdout or stderr
# executed command: FileCheck /home/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/test/xray/TestCases/Posix/basic-filtering.cpp
# note: command had no output on stdout or stderr
# RUN: at line 11
ls basic-filtering-* | head -1 | tr -d '\n' > /home/b/sanitizer-x86_64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/xray/X86_64LinuxConfig/TestCases/Posix/Output/basic-filtering.cpp.tmp.log
# executed command: ls 'basic-filtering-*'
# note: command had no output on stdout or stderr
# executed command: head -1
# note: command had no output on stdout or stderr
# executed command: tr -d '\n'
# note: command had no output on stdout or stderr
# RUN: at line 12
/home/b/sanitizer-x86_64-linux/build/build_default/./bin/llvm-xray convert --symbolize --output-format=yaml -instr_map=/home/b/sanitizer-x86_64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/xray/X86_64LinuxConfig/TestCases/Posix/Output/basic-filtering.cpp.tmp      "/home/b/sanitizer-x86_64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/xray/X86_64LinuxConfig/TestCases/Posix/basic-filtering-basic-filtering.cpp.tmp.VmGAMz" |      FileCheck /home/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/test/xray/TestCases/Posix/basic-filtering.cpp --check-prefix TRACE
# executed command: /home/b/sanitizer-x86_64-linux/build/build_default/./bin/llvm-xray convert --symbolize --output-format=yaml -instr_map=/home/b/sanitizer-x86_64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/xray/X86_64LinuxConfig/TestCases/Posix/Output/basic-filtering.cpp.tmp '%{readfile:/home/b/sanitizer-x86_64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/xray/X86_64LinuxConfig/TestCases/Posix/Output/basic-filtering.cpp.tmp.log}'
# note: command had no output on stdout or stderr
# executed command: FileCheck /home/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/test/xray/TestCases/Posix/basic-filtering.cpp --check-prefix TRACE
# .---command stderr------------
# | /home/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/test/xray/TestCases/Posix/basic-filtering.cpp:61:15: error: TRACE-NOT: excluded string found in input
# | // TRACE-NOT: - { type: 0, func-id: {{.*}}, function: {{.*filtered.*}}, {{.*}} }
# |               ^
# | <stdin>:10:2: note: found here
# |  - { type: 0, func-id: 1, function: 'filtered()', cpu: 0, thread: 51950, process: 51950, kind: function-enter, tsc: 1765778601752388251, data: '' }
Step 14 (test compiler-rt default) failure: test compiler-rt default (failure)
...
llvm-lit: /home/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/test/lit.common.cfg.py:243: warning: COMPILER_RT_TEST_STANDALONE_BUILD_LIBS=ON, but this test suite does not support testing the just-built runtime libraries when the test compiler is configured to use different runtime libraries. Either modify this test suite to support this test configuration, or set COMPILER_RT_TEST_STANDALONE_BUILD_LIBS=OFF to test the runtime libraries included in the compiler instead.
llvm-lit: /home/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/test/lit.common.cfg.py:254: note: Testing using libraries in "/home/b/sanitizer-x86_64-linux/build/build_default/./lib/../lib/clang/22/lib/i386-unknown-linux-gnu"
llvm-lit: /home/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/test/lit.common.cfg.py:232: warning: Compiler lib dir != compiler-rt lib dir
Compiler libdir:     "/home/b/sanitizer-x86_64-linux/build/build_default/lib/clang/22/lib/i386-unknown-linux-gnu"
compiler-rt libdir:  "/home/b/sanitizer-x86_64-linux/build/build_default/lib/clang/22/lib/x86_64-unknown-linux-gnu"
llvm-lit: /home/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/test/lit.common.cfg.py:243: warning: COMPILER_RT_TEST_STANDALONE_BUILD_LIBS=ON, but this test suite does not support testing the just-built runtime libraries when the test compiler is configured to use different runtime libraries. Either modify this test suite to support this test configuration, or set COMPILER_RT_TEST_STANDALONE_BUILD_LIBS=OFF to test the runtime libraries included in the compiler instead.
llvm-lit: /home/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/test/lit.common.cfg.py:254: note: Testing using libraries in "/home/b/sanitizer-x86_64-linux/build/build_default/./lib/../lib/clang/22/lib/x86_64-unknown-linux-gnu"
llvm-lit: /home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/utils/lit/lit/main.py:74: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 10689 tests, 64 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.
FAIL: XRay-x86_64-linux :: TestCases/Posix/basic-filtering.cpp (9264 of 10689)
******************** TEST 'XRay-x86_64-linux :: TestCases/Posix/basic-filtering.cpp' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 4
/home/b/sanitizer-x86_64-linux/build/build_default/./bin/clang  --driver-mode=g++ -fxray-instrument  -m64   -std=c++11 /home/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/test/xray/TestCases/Posix/basic-filtering.cpp -o /home/b/sanitizer-x86_64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/xray/X86_64LinuxConfig/TestCases/Posix/Output/basic-filtering.cpp.tmp -g
# executed command: /home/b/sanitizer-x86_64-linux/build/build_default/./bin/clang --driver-mode=g++ -fxray-instrument -m64 -std=c++11 /home/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/test/xray/TestCases/Posix/basic-filtering.cpp -o /home/b/sanitizer-x86_64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/xray/X86_64LinuxConfig/TestCases/Posix/Output/basic-filtering.cpp.tmp -g
# note: command had no output on stdout or stderr
# RUN: at line 5
rm -f basic-filtering-*
# executed command: rm -f 'basic-filtering-*'
# note: command had no output on stdout or stderr
# RUN: at line 6
env XRAY_OPTIONS="patch_premain=true xray_mode=xray-basic verbosity=1      xray_logfile_base=basic-filtering-      xray_naive_log_func_duration_threshold_us=1000      xray_naive_log_max_stack_depth=2"  /home/b/sanitizer-x86_64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/xray/X86_64LinuxConfig/TestCases/Posix/Output/basic-filtering.cpp.tmp 2>&1 |      FileCheck /home/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/test/xray/TestCases/Posix/basic-filtering.cpp
# executed command: env 'XRAY_OPTIONS=patch_premain=true xray_mode=xray-basic verbosity=1      xray_logfile_base=basic-filtering-      xray_naive_log_func_duration_threshold_us=1000      xray_naive_log_max_stack_depth=2' /home/b/sanitizer-x86_64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/xray/X86_64LinuxConfig/TestCases/Posix/Output/basic-filtering.cpp.tmp
# note: command had no output on stdout or stderr
# executed command: FileCheck /home/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/test/xray/TestCases/Posix/basic-filtering.cpp
# note: command had no output on stdout or stderr
# RUN: at line 11
ls basic-filtering-* | head -1 | tr -d '\n' > /home/b/sanitizer-x86_64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/xray/X86_64LinuxConfig/TestCases/Posix/Output/basic-filtering.cpp.tmp.log
# executed command: ls 'basic-filtering-*'
# note: command had no output on stdout or stderr
# executed command: head -1
# note: command had no output on stdout or stderr
# executed command: tr -d '\n'
# note: command had no output on stdout or stderr
# RUN: at line 12
/home/b/sanitizer-x86_64-linux/build/build_default/./bin/llvm-xray convert --symbolize --output-format=yaml -instr_map=/home/b/sanitizer-x86_64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/xray/X86_64LinuxConfig/TestCases/Posix/Output/basic-filtering.cpp.tmp      "/home/b/sanitizer-x86_64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/xray/X86_64LinuxConfig/TestCases/Posix/basic-filtering-basic-filtering.cpp.tmp.VmGAMz" |      FileCheck /home/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/test/xray/TestCases/Posix/basic-filtering.cpp --check-prefix TRACE
# executed command: /home/b/sanitizer-x86_64-linux/build/build_default/./bin/llvm-xray convert --symbolize --output-format=yaml -instr_map=/home/b/sanitizer-x86_64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/xray/X86_64LinuxConfig/TestCases/Posix/Output/basic-filtering.cpp.tmp '%{readfile:/home/b/sanitizer-x86_64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/xray/X86_64LinuxConfig/TestCases/Posix/Output/basic-filtering.cpp.tmp.log}'
# note: command had no output on stdout or stderr
# executed command: FileCheck /home/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/test/xray/TestCases/Posix/basic-filtering.cpp --check-prefix TRACE
# .---command stderr------------
# | /home/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/test/xray/TestCases/Posix/basic-filtering.cpp:61:15: error: TRACE-NOT: excluded string found in input
# | // TRACE-NOT: - { type: 0, func-id: {{.*}}, function: {{.*filtered.*}}, {{.*}} }
# |               ^
# | <stdin>:10:2: note: found here
# |  - { type: 0, func-id: 1, function: 'filtered()', cpu: 0, thread: 51950, process: 51950, kind: function-enter, tsc: 1765778601752388251, data: '' }

[AArch64] add slp vectorizer test for new isTRNMask

657317c

ginsbach force-pushed the isTRNMask-for-shuffle-cost branch from 7f26478 to 17d3d39 Compare December 9, 2025 22:56

[AArch64] use isTRNMask to calculate shuffle costs

17d3d39

ginsbach marked this pull request as ready for review December 10, 2025 08:27

llvmbot added backend:AArch64 llvm:transforms labels Dec 10, 2025

davemgreen reviewed Dec 10, 2025

View reviewed changes

llvmbot added the llvm:analysis Includes value tracking, cost tables and constant folding label Dec 12, 2025

ginsbach force-pushed the isTRNMask-for-shuffle-cost branch from 4b835da to dbe240a Compare December 12, 2025 17:03

ginsbach added 3 commits December 12, 2025 17:03

alternative implementation directly in getShuffleCost

b320449

add cost model tests for trn with flipped inputs

bba438d

detect flipped trn miscategorised as SK_InsertSubvector

dbe240a

ginsbach mentioned this pull request Dec 12, 2025

[AMDGPU][GlobalISel] Add register bank legalization for buffer_load byte and short #167798

Merged

davemgreen requested a review from MacDue December 13, 2025 10:36

Merge branch 'main' into isTRNMask-for-shuffle-cost

ae3f0fa

MacDue reviewed Dec 14, 2025

View reviewed changes

MacDue approved these changes Dec 14, 2025

View reviewed changes

explanatory comment for SK_InsertSubvector

be5f7c6

davemgreen merged commit 1d821b0 into llvm:main Dec 15, 2025
10 checks passed

ginsbach deleted the isTRNMask-for-shuffle-cost branch December 15, 2025 19:36

[AArch64] use isTRNMask to calculate shuffle costs #171524

[AArch64] use isTRNMask to calculate shuffle costs #171524

Conversation

ginsbach commented Dec 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Dec 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Dec 10, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

davemgreen left a comment

Choose a reason for hiding this comment

Uh oh!

ginsbach commented Dec 12, 2025

Uh oh!

github-actions bot commented Dec 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🪟 Windows x64 Test Results

Uh oh!

github-actions bot commented Dec 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🐧 Linux x64 Test Results

Uh oh!

ginsbach commented Dec 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

davemgreen commented Dec 13, 2025

Uh oh!

MacDue Dec 14, 2025

Choose a reason for hiding this comment

Uh oh!

ginsbach Dec 14, 2025

Choose a reason for hiding this comment

Uh oh!

MacDue left a comment

Choose a reason for hiding this comment

Uh oh!

ginsbach commented Dec 14, 2025

Uh oh!

Uh oh!

llvm-ci commented Dec 15, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

[AArch64] use `isTRNMask` to calculate shuffle costs #171524

[AArch64] use `isTRNMask` to calculate shuffle costs #171524

ginsbach commented Dec 9, 2025 •

edited

Loading

github-actions bot commented Dec 9, 2025 •

edited

Loading

llvmbot commented Dec 10, 2025 •

edited

Loading

github-actions bot commented Dec 12, 2025 •

edited

Loading

github-actions bot commented Dec 12, 2025 •

edited

Loading

ginsbach commented Dec 12, 2025 •

edited

Loading