arsenm
Contributor

@arsenm arsenm commented Sep 30, 2025

Add tests which show missed folds of subregister extracts with
intermediate full copies.

Contributor Author

arsenm commented Sep 30, 2025

@llvmbot
Member

llvmbot commented Sep 30, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Matt Arsenault (arsenm)

Changes

Add tests which show missed folds of subregister extracts with
intermediate full copies.


Full diff: https://github.com/llvm/llvm-project/pull/161309.diff

1 Files Affected:

  • (modified) llvm/test/CodeGen/AMDGPU/peephole-opt-regseq-removal.mir (+148)
diff --git a/llvm/test/CodeGen/AMDGPU/peephole-opt-regseq-removal.mir b/llvm/test/CodeGen/AMDGPU/peephole-opt-regseq-removal.mir
index f1f2eb6baf008..0c723a09809c6 100644
--- a/llvm/test/CodeGen/AMDGPU/peephole-opt-regseq-removal.mir
+++ b/llvm/test/CodeGen/AMDGPU/peephole-opt-regseq-removal.mir
@@ -80,3 +80,151 @@ body:             |
     %4:vreg_128 = REG_SEQUENCE %3.sub0, %subreg.sub0, %3.sub1, %subreg.sub1, %3.sub2, %subreg.sub2, %3.sub3, %subreg.sub3
     KILL implicit %4
 ...
+
+---
+name: copy_vreg_64_subreg_from_vgpr_reg_sequence
+body:             |
+  bb.0:
+    liveins: $vgpr0, $vgpr1
+    ; GCN-LABEL: name: copy_vreg_64_subreg_from_vgpr_reg_sequence
+    ; GCN: liveins: $vgpr0, $vgpr1
+    ; GCN-NEXT: {{  $}}
+    ; GCN-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
+    ; GCN-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
+    ; GCN-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY]], %subreg.sub0, [[COPY1]], %subreg.sub1
+    ; GCN-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]]
+    ; GCN-NEXT: $vgpr0 = COPY [[COPY2]]
+    %0:vgpr_32 = COPY $vgpr0
+    %1:vgpr_32 = COPY $vgpr1
+    %2:vreg_64 = REG_SEQUENCE %0, %subreg.sub0, %1, %subreg.sub1
+    %3:vgpr_32 = COPY %2.sub0
+    $vgpr0 = COPY %3
+...
+
+---
+name: copy_vreg_64_subreg_from_vgpr_reg_sequence_extra_copy
+body:             |
+  bb.0:
+    liveins: $vgpr0, $vgpr1
+    ; GCN-LABEL: name: copy_vreg_64_subreg_from_vgpr_reg_sequence_extra_copy
+    ; GCN: liveins: $vgpr0, $vgpr1
+    ; GCN-NEXT: {{  $}}
+    ; GCN-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
+    ; GCN-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
+    ; GCN-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY]], %subreg.sub0, [[COPY1]], %subreg.sub1
+    ; GCN-NEXT: [[COPY2:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
+    ; GCN-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY2]].sub0
+    ; GCN-NEXT: $vgpr0 = COPY [[COPY3]]
+    %0:vgpr_32 = COPY $vgpr0
+    %1:vgpr_32 = COPY $vgpr1
+    %2:vreg_64 = REG_SEQUENCE %0, %subreg.sub0, %1, %subreg.sub1
+    %3:vreg_64 = COPY %2
+    %4:vgpr_32 = COPY %3.sub0
+    $vgpr0 = COPY %4
+...
+
+---
+name: copy_av_64_subreg_from_vgpr_reg_sequence
+body:             |
+  bb.0:
+    liveins: $vgpr0, $vgpr1
+    ; GCN-LABEL: name: copy_av_64_subreg_from_vgpr_reg_sequence
+    ; GCN: liveins: $vgpr0, $vgpr1
+    ; GCN-NEXT: {{  $}}
+    ; GCN-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
+    ; GCN-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
+    ; GCN-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64_align2 = REG_SEQUENCE [[COPY]], %subreg.sub0, [[COPY1]], %subreg.sub1
+    ; GCN-NEXT: [[COPY2:%[0-9]+]]:av_64_align2 = COPY [[REG_SEQUENCE]]
+    ; GCN-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY2]].sub0
+    ; GCN-NEXT: $vgpr0 = COPY [[COPY3]]
+    %0:vgpr_32 = COPY $vgpr0
+    %1:vgpr_32 = COPY $vgpr1
+    %2:vreg_64_align2 = REG_SEQUENCE %0, %subreg.sub0, %1, %subreg.sub1
+    %3:av_64_align2 = COPY %2
+    %4:vgpr_32 = COPY %3.sub0
+    $vgpr0 = COPY %4
+...
+
+---
+name: copy_vreg_64_subreg_from_vgpr_reg_sequence_with_sub0_compose
+body:             |
+  bb.0:
+    liveins: $vgpr0_vgpr1
+    ; GCN-LABEL: name: copy_vreg_64_subreg_from_vgpr_reg_sequence_with_sub0_compose
+    ; GCN: liveins: $vgpr0_vgpr1
+    ; GCN-NEXT: {{  $}}
+    ; GCN-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
+    ; GCN-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
+    ; GCN-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY]].sub0, %subreg.sub0, [[COPY1]], %subreg.sub1
+    ; GCN-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+    ; GCN-NEXT: $vgpr0 = COPY [[COPY2]]
+    %0:vreg_64 = COPY $vgpr0_vgpr1
+    %1:vgpr_32 = COPY $vgpr1
+    %2:vreg_64 = REG_SEQUENCE %0.sub0, %subreg.sub0, %1, %subreg.sub1
+    %3:vgpr_32 = COPY %2.sub0
+    $vgpr0 = COPY %3
+...
+
+---
+name: copy_vreg_64_subreg_from_vgpr_reg_sequence_with_sub1_compose
+body:             |
+  bb.0:
+    liveins: $vgpr0_vgpr1
+    ; GCN-LABEL: name: copy_vreg_64_subreg_from_vgpr_reg_sequence_with_sub1_compose
+    ; GCN: liveins: $vgpr0_vgpr1
+    ; GCN-NEXT: {{  $}}
+    ; GCN-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
+    ; GCN-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
+    ; GCN-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY]].sub1, %subreg.sub0, [[COPY1]], %subreg.sub1
+    ; GCN-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+    ; GCN-NEXT: $vgpr0 = COPY [[COPY2]]
+    %0:vreg_64 = COPY $vgpr0_vgpr1
+    %1:vgpr_32 = COPY $vgpr1
+    %2:vreg_64 = REG_SEQUENCE %0.sub1, %subreg.sub0, %1, %subreg.sub1
+    %3:vgpr_32 = COPY %2.sub0
+    $vgpr0 = COPY %3
+...
+
+---
+name: copy_vreg_64_subreg_from_multiple_vgpr_reg_sequence
+body:             |
+  bb.0:
+    liveins: $vgpr0, $vgpr1, $vgpr2, $vgpr3
+    ; GCN-LABEL: name: copy_vreg_64_subreg_from_multiple_vgpr_reg_sequence
+    ; GCN: liveins: $vgpr0, $vgpr1, $vgpr2, $vgpr3
+    ; GCN-NEXT: {{  $}}
+    ; GCN-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
+    ; GCN-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
+    ; GCN-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2
+    ; GCN-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY $vgpr3
+    ; GCN-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY]], %subreg.sub0, [[COPY1]], %subreg.sub1
+    ; GCN-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY2]], %subreg.sub0, [[COPY3]], %subreg.sub1
+    ; GCN-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:vreg_128 = REG_SEQUENCE [[REG_SEQUENCE]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+    ; GCN-NEXT: [[COPY4:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
+    ; GCN-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE2]].sub1_sub2
+    ; GCN-NEXT: [[COPY6:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE1]]
+    ; GCN-NEXT: [[COPY7:%[0-9]+]]:vgpr_32 = COPY [[REG_SEQUENCE1]].sub0
+    ; GCN-NEXT: [[COPY8:%[0-9]+]]:vgpr_32 = COPY [[REG_SEQUENCE]].sub0
+    ; GCN-NEXT: $vgpr0_vgpr1 = COPY [[COPY4]]
+    ; GCN-NEXT: $vgpr2_vgpr3 = COPY [[COPY5]]
+    ; GCN-NEXT: $vgpr4_vgpr5 = COPY [[COPY6]]
+    ; GCN-NEXT: $vgpr6 = COPY [[COPY7]]
+    ; GCN-NEXT: $vgpr6 = COPY [[COPY8]]
+    %0:vgpr_32 = COPY $vgpr0
+    %1:vgpr_32 = COPY $vgpr1
+    %2:vgpr_32 = COPY $vgpr2
+    %3:vgpr_32 = COPY $vgpr3
+    %4:vreg_64 = REG_SEQUENCE %0, %subreg.sub0, %1, %subreg.sub1
+    %5:vreg_64 = REG_SEQUENCE %2, %subreg.sub0, %3, %subreg.sub1
+    %6:vreg_128 = REG_SEQUENCE %4, %subreg.sub0_sub1, %5, %subreg.sub2_sub3
+    %7:vreg_64 = COPY %6.sub0_sub1
+    %8:vreg_64 = COPY %6.sub1_sub2
+    %9:vreg_64 = COPY %6.sub2_sub3
+    %10:vgpr_32 = COPY %6.sub2
+    %11:vgpr_32 = COPY %6.sub0
+    $vgpr0_vgpr1 = COPY %7
+    $vgpr2_vgpr3 = COPY %8
+    $vgpr4_vgpr5 = COPY %9
+    $vgpr6 = COPY %10
+    $vgpr6 = COPY %11
+...
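The checks in `copy_vreg_64_subreg_from_multiple_vgpr_reg_sequence` can be read as a lane-containment rule: an extract from a REG_SEQUENCE result is forwarded to one of its inputs only when the extracted lanes lie entirely inside that input. The Python sketch below is a simplified model of that rule, not the PeepholeOptimizer implementation. The lane table, the `resolve_extract` helper, and the tuple encoding are hypothetical names introduced for illustration only.

```python
# Hypothetical model: each subregister index covers a run of 32-bit lanes,
# written as (first_lane, lane_count). Real indices come from TableGen.
SUBREG_LANES = {
    "sub0": (0, 1), "sub1": (1, 1), "sub2": (2, 1), "sub3": (3, 1),
    "sub0_sub1": (0, 2), "sub1_sub2": (1, 2), "sub2_sub3": (2, 2),
}
LANES_TO_SUBREG = {lanes: name for name, lanes in SUBREG_LANES.items()}


def resolve_extract(reg_sequence, subreg):
    """Map an extract of `subreg` from a REG_SEQUENCE result to one input.

    reg_sequence: list of (source_vreg, destination_subreg) pairs.
    Returns (source_vreg, subreg_within_source_or_None), or None when the
    requested lanes are not contained in a single input.
    """
    first, count = SUBREG_LANES[subreg]
    for source, dest_subreg in reg_sequence:
        in_first, in_count = SUBREG_LANES[dest_subreg]
        contained = in_first <= first and first + count <= in_first + in_count
        if not contained:
            continue
        if count == in_count:
            return (source, None)  # the extract is the whole input
        # Re-express the lanes relative to the start of the input.
        return (source, LANES_TO_SUBREG[(first - in_first, count)])
    return None  # lanes straddle inputs: no single source to forward


# %6:vreg_128 = REG_SEQUENCE %4, %subreg.sub0_sub1, %5, %subreg.sub2_sub3
REG6 = [("%4", "sub0_sub1"), ("%5", "sub2_sub3")]

if __name__ == "__main__":
    for index in ("sub0_sub1", "sub1_sub2", "sub2_sub3", "sub2", "sub0"):
        print(index, "->", resolve_extract(REG6, index))
```

In this model `sub1_sub2` straddles both inputs, which is consistent with the CHECK line that still copies from `[[REG_SEQUENCE2]].sub1_sub2`, while `sub2` resolves to `sub0` of the second input. The model does not cover inputs that themselves carry a subregister index, as in the `*_compose` tests.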

@arsenm arsenm marked this pull request as ready for review September 30, 2025 04:22
KILL implicit %4
...

---
Contributor

Perhaps add a comment noting that these tests document missed folds?
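Judging from the CHECK lines in the diff, the cases that still show a subregister read of a full copy (`COPY [[COPY2]].sub0`) are `copy_vreg_64_subreg_from_vgpr_reg_sequence_extra_copy` and `copy_av_64_subreg_from_vgpr_reg_sequence`. The Python sketch below illustrates the difference between a lookup that stops at the defining instruction and one that looks through full copies. It is a toy model under stated assumptions: the `DEFS` table and `find_subreg_source` are hypothetical, and register-class constraints (such as `av_64_align2` versus `vreg_64_align2`) are not modeled.

```python
# Hypothetical def table mirroring the *_extra_copy test body:
#   %2:vreg_64 = REG_SEQUENCE %0, %subreg.sub0, %1, %subreg.sub1
#   %3:vreg_64 = COPY %2
DEFS = {
    "%2": ("REG_SEQUENCE", {"sub0": "%0", "sub1": "%1"}),
    "%3": ("COPY", "%2"),  # full-register copy, no subregister index
}


def find_subreg_source(reg, subreg, look_through_copies):
    """Return the vreg that supplies `reg.subreg`, or None if not found.

    With look_through_copies=False the search only inspects the instruction
    defining `reg`; with True it first walks back through full COPYs.
    """
    opcode, operands = DEFS.get(reg, (None, None))
    while look_through_copies and opcode == "COPY":
        reg = operands
        opcode, operands = DEFS.get(reg, (None, None))
    if opcode == "REG_SEQUENCE" and subreg in operands:
        return operands[subreg]
    return None


if __name__ == "__main__":
    # Direct extract from the REG_SEQUENCE result.
    print(find_subreg_source("%2", "sub0", look_through_copies=False))
    # Extract through the intermediate full copy, without and with look-through.
    print(find_subreg_source("%3", "sub0", look_through_copies=False))
    print(find_subreg_source("%3", "sub0", look_through_copies=True))
```

The first call corresponds to the case where the CHECK lines already read `COPY [[COPY]]`; the second corresponds to the missed fold these tests record.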

@arsenm arsenm merged commit 8df0575 into main Oct 1, 2025
14 checks passed
@arsenm arsenm deleted the users/amdgpu/add-reg-sequence-peephole-opt-tests branch October 1, 2025 13:48
@llvm-ci
Collaborator

llvm-ci commented Oct 1, 2025

LLVM Buildbot has detected a new failure on builder clang-aarch64-quick running on linaro-clang-aarch64-quick while building llvm at step 5 "ninja check 1".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/65/builds/23393

Here is the relevant piece of the build log for the reference
Step 5 (ninja check 1) failure: stage 1 checked (failure)
******************** TEST 'Clangd Unit Tests :: ./ClangdTests/51/333' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:/home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests-Clangd Unit Tests-900601-51-333.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=333 GTEST_SHARD_INDEX=51 /home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests
--

Note: This is test shard 52 of 333.
[==========] Running 4 tests from 4 test suites.
[----------] Global test environment set-up.
[----------] 1 test from ClangdServerTest
[ RUN      ] ClangdServerTest.ForceReparseCompileCommandDefines
ASTWorker building file /clangd-test/foo.cpp version null with command 
[/clangd-test]
clang -DWITH_ERROR /clangd-test/foo.cpp
Driver produced command: cc1 -cc1 -triple aarch64-unknown-linux-gnu -fsyntax-only -disable-free -clear-ast-before-backend -main-file-name foo.cpp -mrelocation-model pic -pic-level 2 -pic-is-pie -mframe-pointer=non-leaf -fmath-errno -ffp-contract=on -fno-rounding-math -mconstructor-aliases -funwind-tables=2 -enable-tlsdesc -target-cpu generic -target-feature +v8a -target-feature +fp-armv8 -target-feature +neon -target-abi aapcs -debugger-tuning=gdb -fdebug-compilation-dir=/clangd-test -fcoverage-compilation-dir=/clangd-test -resource-dir lib/clang/22 -D WITH_ERROR -internal-isystem lib/clang/22/include -internal-isystem /usr/local/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -fdeprecated-macro -ferror-limit 19 -fno-signed-char -fgnuc-version=4.2.1 -fskip-odr-check-in-gmf -fcxx-exceptions -fexceptions -no-round-trip-args -target-feature -fmv -faddrsig -D__GCC_HAVE_DWARF2_CFI_ASM=1 -x c++ /clangd-test/foo.cpp
Building first preamble for /clangd-test/foo.cpp version null
not idle after addDocument
UNREACHABLE executed at ../llvm/clang-tools-extra/clangd/unittests/SyncAPI.cpp:22!
Built preamble of size 817532 for file /clangd-test/foo.cpp version null in 10.66 seconds
 #0 0x0000af036a644b8c llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests+0xc94b8c)
 #1 0x0000af036a642654 llvm::sys::RunSignalHandlers() (/home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests+0xc92654)
 #2 0x0000af036a6459e8 SignalHandler(int, siginfo_t*, void*) Signals.cpp:0:0
 #3 0x0000e614ddd2e8f8 (linux-vdso.so.1+0x8f8)
 #4 0x0000e614dd83f1f0 __pthread_kill_implementation ./nptl/./nptl/pthread_kill.c:44:76
 #5 0x0000e614dd7fa67c gsignal ./signal/../sysdeps/posix/raise.c:27:6
 #6 0x0000e614dd7e7130 abort ./stdlib/./stdlib/abort.c:81:7
 #7 0x0000af036a5f15f8 llvm::RTTIRoot::anchor() (/home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests+0xc415f8)
 #8 0x0000af036a49b720 clang::clangd::runCodeComplete(clang::clangd::ClangdServer&, llvm::StringRef, clang::clangd::Position, clang::clangd::CodeCompleteOptions) (/home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests+0xaeb720)
 #9 0x0000af0369fa368c clang::clangd::(anonymous namespace)::ClangdServerTest_ForceReparseCompileCommandDefines_Test::TestBody() ClangdTests.cpp:0:0
#10 0x0000af036a69d3d8 testing::Test::Run() (/home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests+0xced3d8)
#11 0x0000af036a69e6fc testing::TestInfo::Run() (/home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests+0xcee6fc)
#12 0x0000af036a69f338 testing::TestSuite::Run() (/home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests+0xcef338)
#13 0x0000af036a6af6b8 testing::internal::UnitTestImpl::RunAllTests() (/home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests+0xcff6b8)
#14 0x0000af036a6af004 testing::UnitTest::Run() (/home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests+0xcff004)
#15 0x0000af036a68a120 main (/home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests+0xcda120)
#16 0x0000e614dd7e73fc __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:74:3
#17 0x0000e614dd7e74cc call_init ./csu/../csu/libc-start.c:128:20
#18 0x0000e614dd7e74cc __libc_start_main ./csu/../csu/libc-start.c:379:5
#19 0x0000af0369e3aab0 _start (/home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests+0x48aab0)

--
exit: -6
--
shard JSON output does not exist: /home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests-Clangd Unit Tests-900601-51-333.json
********************


mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Oct 3, 2025
Add tests which show missed folds of subregister extracts with
intermediate full copies.