RegisterCoalescer: Enable terminal rule by default #161621
base: main
This appears to be a forgotten switch flip from 2015. It seems to do a nicer job with subregister copies. Most of the test changes are improvements or neutral; only a few are light regressions. The worst AMDGPU regressions are for true16 in the atomic tests, but I think that's due to existing true16 issues. I also had to hack many Hexagon tests to disable the rule, since I have no idea how to update them. They appear to be testing specific scheduling and packet formation in later machine passes, so any change in the incoming MIR is likely hiding whatever was originally intended. I'll open an issue to fix up these tests once this lands.
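For anyone who needs the old behavior in a single test, here is a minimal sketch of a lit test that turns the rule off from the RUN line. The flag name `terminal-rule` comes from the `cl::opt` changed in this patch, and boolean `cl::opt` flags accept `=0`. The function `@f`, the CHECK line, and the choice of `-mtriple=hexagon` are illustrative assumptions and are not copied from the Hexagon tests touched here.

```llvm
; Hypothetical test file: the function and CHECK line are placeholders.
; -terminal-rule=0 overrides the new default (cl::init(true)) for this test only.
; RUN: llc -mtriple=hexagon -terminal-rule=0 < %s | FileCheck %s

; CHECK-LABEL: f:
define i32 @f(i32 %a) {
  %r = add i32 %a, 1
  ret i32 %r
}
```

Dropping `-terminal-rule=0` from the RUN line gives the new default behavior.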
@llvm/pr-subscribers-backend-systemz @llvm/pr-subscribers-llvm-globalisel
Author: Matt Arsenault (arsenm)
Patch is 1.65 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/161621.diff
152 Files Affected:
diff --git a/llvm/lib/CodeGen/RegisterCoalescer.cpp b/llvm/lib/CodeGen/RegisterCoalescer.cpp
index 7ac1aef83777a..5bd38a916fe4d 100644
--- a/llvm/lib/CodeGen/RegisterCoalescer.cpp
+++ b/llvm/lib/CodeGen/RegisterCoalescer.cpp
@@ -81,7 +81,7 @@ static cl::opt<bool> EnableJoining("join-liveintervals",
static cl::opt<bool> UseTerminalRule("terminal-rule",
cl::desc("Apply the terminal rule"),
- cl::init(false), cl::Hidden);
+ cl::init(true), cl::Hidden);
/// Temporary flag to test critical edge unsplitting.
static cl::opt<bool> EnableJoinSplits(
diff --git a/llvm/test/CodeGen/AArch64/build-vector-two-dup.ll b/llvm/test/CodeGen/AArch64/build-vector-two-dup.ll
index dbbfbea9176f6..f725c19081deb 100644
--- a/llvm/test/CodeGen/AArch64/build-vector-two-dup.ll
+++ b/llvm/test/CodeGen/AArch64/build-vector-two-dup.ll
@@ -188,11 +188,11 @@ entry:
define <8 x i8> @test11(ptr nocapture noundef readonly %a, ptr nocapture noundef readonly %b) {
; CHECK-LABEL: test11:
; CHECK: // %bb.0: // %entry
-; CHECK-NEXT: ld1r { v1.8b }, [x0]
-; CHECK-NEXT: ld1r { v2.8b }, [x1]
-; CHECK-NEXT: mov v0.16b, v1.16b
-; CHECK-NEXT: mov v0.h[2], v2.h[0]
-; CHECK-NEXT: mov v0.h[3], v1.h[0]
+; CHECK-NEXT: ld1r { v0.8b }, [x0]
+; CHECK-NEXT: ld1r { v1.8b }, [x1]
+; CHECK-NEXT: fmov d2, d0
+; CHECK-NEXT: mov v0.h[2], v1.h[0]
+; CHECK-NEXT: mov v0.h[3], v2.h[0]
; CHECK-NEXT: // kill: def $d0 killed $d0 killed $q0
; CHECK-NEXT: ret
entry:
diff --git a/llvm/test/CodeGen/AArch64/machine-licm-sink-instr.ll b/llvm/test/CodeGen/AArch64/machine-licm-sink-instr.ll
index 3230c9e946da7..b3a7ec961b736 100644
--- a/llvm/test/CodeGen/AArch64/machine-licm-sink-instr.ll
+++ b/llvm/test/CodeGen/AArch64/machine-licm-sink-instr.ll
@@ -20,20 +20,17 @@ define i32 @sink_load_and_copy(i32 %n) {
; CHECK-NEXT: b.lt .LBB0_3
; CHECK-NEXT: // %bb.1: // %for.body.preheader
; CHECK-NEXT: adrp x8, A
-; CHECK-NEXT: mov w20, w19
-; CHECK-NEXT: ldr w21, [x8, :lo12:A]
+; CHECK-NEXT: mov w21, w19
+; CHECK-NEXT: ldr w20, [x8, :lo12:A]
; CHECK-NEXT: .LBB0_2: // %for.body
; CHECK-NEXT: // =>This Inner Loop Header: Depth=1
-; CHECK-NEXT: mov w0, w21
+; CHECK-NEXT: mov w0, w20
; CHECK-NEXT: bl _Z3usei
-; CHECK-NEXT: sdiv w20, w20, w0
-; CHECK-NEXT: subs w19, w19, #1
+; CHECK-NEXT: sdiv w19, w19, w0
+; CHECK-NEXT: subs w21, w21, #1
; CHECK-NEXT: b.ne .LBB0_2
-; CHECK-NEXT: b .LBB0_4
-; CHECK-NEXT: .LBB0_3:
-; CHECK-NEXT: mov w20, w19
-; CHECK-NEXT: .LBB0_4: // %for.cond.cleanup
-; CHECK-NEXT: mov w0, w20
+; CHECK-NEXT: .LBB0_3: // %for.cond.cleanup
+; CHECK-NEXT: mov w0, w19
; CHECK-NEXT: ldp x20, x19, [sp, #16] // 16-byte Folded Reload
; CHECK-NEXT: ldp x30, x21, [sp], #32 // 16-byte Folded Reload
; CHECK-NEXT: ret
@@ -82,15 +79,12 @@ define i32 @cant_sink_successive_call(i32 %n) {
; CHECK-NEXT: // =>This Inner Loop Header: Depth=1
; CHECK-NEXT: mov w0, w20
; CHECK-NEXT: bl _Z3usei
-; CHECK-NEXT: sdiv w21, w21, w0
-; CHECK-NEXT: subs w19, w19, #1
+; CHECK-NEXT: sdiv w19, w19, w0
+; CHECK-NEXT: subs w21, w21, #1
; CHECK-NEXT: b.ne .LBB1_2
-; CHECK-NEXT: b .LBB1_4
-; CHECK-NEXT: .LBB1_3:
-; CHECK-NEXT: mov w21, w19
-; CHECK-NEXT: .LBB1_4: // %for.cond.cleanup
+; CHECK-NEXT: .LBB1_3: // %for.cond.cleanup
+; CHECK-NEXT: mov w0, w19
; CHECK-NEXT: ldp x20, x19, [sp, #16] // 16-byte Folded Reload
-; CHECK-NEXT: mov w0, w21
; CHECK-NEXT: ldp x30, x21, [sp], #32 // 16-byte Folded Reload
; CHECK-NEXT: ret
entry:
@@ -139,15 +133,12 @@ define i32 @cant_sink_successive_store(ptr nocapture readnone %store, i32 %n) {
; CHECK-NEXT: // =>This Inner Loop Header: Depth=1
; CHECK-NEXT: mov w0, w20
; CHECK-NEXT: bl _Z3usei
-; CHECK-NEXT: sdiv w21, w21, w0
-; CHECK-NEXT: subs w19, w19, #1
+; CHECK-NEXT: sdiv w19, w19, w0
+; CHECK-NEXT: subs w21, w21, #1
; CHECK-NEXT: b.ne .LBB2_2
-; CHECK-NEXT: b .LBB2_4
-; CHECK-NEXT: .LBB2_3:
-; CHECK-NEXT: mov w21, w19
-; CHECK-NEXT: .LBB2_4: // %for.cond.cleanup
+; CHECK-NEXT: .LBB2_3: // %for.cond.cleanup
+; CHECK-NEXT: mov w0, w19
; CHECK-NEXT: ldp x20, x19, [sp, #16] // 16-byte Folded Reload
-; CHECK-NEXT: mov w0, w21
; CHECK-NEXT: ldp x30, x21, [sp], #32 // 16-byte Folded Reload
; CHECK-NEXT: ret
entry:
diff --git a/llvm/test/CodeGen/AArch64/machine-sink-kill-flags.ll b/llvm/test/CodeGen/AArch64/machine-sink-kill-flags.ll
index e7e109170d6a1..338084295fc7f 100644
--- a/llvm/test/CodeGen/AArch64/machine-sink-kill-flags.ll
+++ b/llvm/test/CodeGen/AArch64/machine-sink-kill-flags.ll
@@ -16,13 +16,12 @@ define i32 @test(ptr %ptr) {
; CHECK-NEXT: mov w9, wzr
; CHECK-NEXT: LBB0_1: ; %.thread
; CHECK-NEXT: ; =>This Inner Loop Header: Depth=1
-; CHECK-NEXT: lsr w11, w9, #1
; CHECK-NEXT: sub w10, w9, #1
-; CHECK-NEXT: mov w9, w11
+; CHECK-NEXT: lsr w9, w9, #1
; CHECK-NEXT: tbnz w10, #0, LBB0_1
; CHECK-NEXT: ; %bb.2: ; %bb343
; CHECK-NEXT: and w9, w10, #0x1
-; CHECK-NEXT: mov w0, #-1
+; CHECK-NEXT: mov w0, #-1 ; =0xffffffff
; CHECK-NEXT: str w9, [x8]
; CHECK-NEXT: ret
bb:
diff --git a/llvm/test/CodeGen/AArch64/sme-pstate-sm-changing-call-disable-coalescing.ll b/llvm/test/CodeGen/AArch64/sme-pstate-sm-changing-call-disable-coalescing.ll
index b947c943ba448..72f6646930624 100644
--- a/llvm/test/CodeGen/AArch64/sme-pstate-sm-changing-call-disable-coalescing.ll
+++ b/llvm/test/CodeGen/AArch64/sme-pstate-sm-changing-call-disable-coalescing.ll
@@ -151,12 +151,11 @@ define void @dont_coalesce_arg_f16(half %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $h0 killed $h0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $h0 killed $h0 killed $z0
; CHECK-NEXT: str h0, [sp, #14] // 2-byte Folded Spill
+; CHECK-NEXT: // kill: def $h0 killed $h0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr h0, [sp, #14] // 2-byte Folded Reload
; CHECK-NEXT: bl use_f16
@@ -190,12 +189,11 @@ define void @dont_coalesce_arg_f32(float %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $s0 killed $s0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $s0 killed $s0 killed $z0
; CHECK-NEXT: str s0, [sp, #12] // 4-byte Folded Spill
+; CHECK-NEXT: // kill: def $s0 killed $s0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr s0, [sp, #12] // 4-byte Folded Reload
; CHECK-NEXT: bl use_f32
@@ -229,12 +227,11 @@ define void @dont_coalesce_arg_f64(double %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $d0 killed $d0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $d0 killed $d0 killed $z0
; CHECK-NEXT: str d0, [sp, #8] // 8-byte Folded Spill
+; CHECK-NEXT: // kill: def $d0 killed $d0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr d0, [sp, #8] // 8-byte Folded Reload
; CHECK-NEXT: bl use_f64
@@ -273,12 +270,11 @@ define void @dont_coalesce_arg_v1i8(<1 x i8> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $d0 killed $d0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $d0 killed $d0 killed $z0
; CHECK-NEXT: str d0, [sp, #8] // 8-byte Folded Spill
+; CHECK-NEXT: // kill: def $d0 killed $d0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr d0, [sp, #8] // 8-byte Folded Reload
; CHECK-NEXT: bl use_v16i8
@@ -313,12 +309,11 @@ define void @dont_coalesce_arg_v1i16(<1 x i16> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $d0 killed $d0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $d0 killed $d0 killed $z0
; CHECK-NEXT: str d0, [sp, #8] // 8-byte Folded Spill
+; CHECK-NEXT: // kill: def $d0 killed $d0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr d0, [sp, #8] // 8-byte Folded Reload
; CHECK-NEXT: bl use_v8i16
@@ -353,12 +348,11 @@ define void @dont_coalesce_arg_v1i32(<1 x i32> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $d0 killed $d0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $d0 killed $d0 killed $z0
; CHECK-NEXT: str d0, [sp, #8] // 8-byte Folded Spill
+; CHECK-NEXT: // kill: def $d0 killed $d0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr d0, [sp, #8] // 8-byte Folded Reload
; CHECK-NEXT: bl use_v4i32
@@ -393,12 +387,11 @@ define void @dont_coalesce_arg_v1i64(<1 x i64> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $d0 killed $d0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $d0 killed $d0 killed $z0
; CHECK-NEXT: str d0, [sp, #8] // 8-byte Folded Spill
+; CHECK-NEXT: // kill: def $d0 killed $d0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr d0, [sp, #8] // 8-byte Folded Reload
; CHECK-NEXT: bl use_v2i64
@@ -433,12 +426,11 @@ define void @dont_coalesce_arg_v1f16(<1 x half> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $h0 killed $h0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $h0 killed $h0 killed $z0
; CHECK-NEXT: str h0, [sp, #14] // 2-byte Folded Spill
+; CHECK-NEXT: // kill: def $h0 killed $h0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr h0, [sp, #14] // 2-byte Folded Reload
; CHECK-NEXT: bl use_v8f16
@@ -513,12 +505,11 @@ define void @dont_coalesce_arg_v1f64(<1 x double> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $d0 killed $d0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $d0 killed $d0 killed $z0
; CHECK-NEXT: str d0, [sp, #8] // 8-byte Folded Spill
+; CHECK-NEXT: // kill: def $d0 killed $d0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr d0, [sp, #8] // 8-byte Folded Reload
; CHECK-NEXT: bl use_v2f64
@@ -557,12 +548,11 @@ define void @dont_coalesce_arg_v16i8(<16 x i8> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0
; CHECK-NEXT: str q0, [sp] // 16-byte Folded Spill
+; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr q0, [sp] // 16-byte Folded Reload
; CHECK-NEXT: bl use_v16i8
@@ -596,12 +586,11 @@ define void @dont_coalesce_arg_v8i16(<8 x i16> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0
; CHECK-NEXT: str q0, [sp] // 16-byte Folded Spill
+; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr q0, [sp] // 16-byte Folded Reload
; CHECK-NEXT: bl use_v8i16
@@ -635,12 +624,11 @@ define void @dont_coalesce_arg_v4i32(<4 x i32> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0
; CHECK-NEXT: str q0, [sp] // 16-byte Folded Spill
+; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr q0, [sp] // 16-byte Folded Reload
; CHECK-NEXT: bl use_v4i32
@@ -674,12 +662,11 @@ define void @dont_coalesce_arg_v2i64(<2 x i64> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0
; CHECK-NEXT: str q0, [sp] // 16-byte Folded Spill
+; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr q0, [sp] // 16-byte Folded Reload
; CHECK-NEXT: bl use_v2i64
@@ -713,12 +700,11 @@ define void @dont_coalesce_arg_v8f16(<8 x half> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0
; CHECK-NEXT: str q0, [sp] // 16-byte Folded Spill
+; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr q0, [sp] // 16-byte Folded Reload
; CHECK-NEXT: bl use_v8f16
@@ -752,12 +738,11 @@ define void @dont_coalesce_arg_v8bf16(<8 x bfloat> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0
; CHECK-NEXT: str q0, [sp] // 16-byte Folded Spill
+; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr q0, [sp] // 16-byte Folded Reload
; CHECK-NEXT: bl use_v8bf16
@@ -791,12 +776,11 @@ define void @dont_coalesce_arg_v4f32(<4 x float> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0
; CHECK-NEXT: str q0, [sp] // 16-byte Folded Spill
+; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr q0, [sp] // 16-byte Folded Reload
; CHECK-NEXT: bl use_v4f32
@@ -830,12 +814,11 @@ define void @dont_coalesce_arg_v2f64(<2 x double> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0
; CHECK-NEXT: str q0, [sp] // 16-byte Folded Spill
+; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr q0, [sp] // 16-byte Folded Reload
; CHECK-NEXT: bl use_v2f64
diff --git a/llvm/test/CodeGen/AArch64/sme-streaming-compatible-interface.ll b/llvm/test/CodeGen/AArch64/sme-streaming-compatible-interface.ll
index f2163ad15bafc..df88f37195ed6 100644
--- a/llvm/test/CodeGen/AArch64/sme-streaming-compatible-interface.ll
+++ b/llvm/test/CodeGen/AArch64/sme-streaming-compatible-interface.ll
@@ -129,12 +129,11 @@ define <2 x double> @streaming_compatible_with_neon_vectors(<2 x double> %arg) "
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
+; CHECK-NEXT: str q0, [sp] // 16-byte Folded Spill
+; CHECK-NEXT: mrs x19, SVCR
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: mrs x19, SVCR
-; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0
-; CHECK-NEXT: str q0, [sp] // 16-byte Folded Spill
; CHECK-NEXT: tbz w19, #0, .LBB4_2
; CHECK-NEXT: // %bb.1:
; CHECK-NEXT: smstop sm
diff --git a/llvm/test/CodeGen/AArch64/sve-extract-fixed-from-scalable-vector.ll b/llvm/test/CodeGen/AArch64/sve-extract-fixed-from-scalable-vector.ll
index 6c6a691760af3..52a77cb396909 100644
--- a/llvm/test/CodeGen/AArch64/sve-extract-fixed-from-scalable-vector.ll
+++ b/llvm/test/CodeGen/AArch64/sve-extract-fixed-from-scalable-vector.ll
@@ -147,15 +147,15 @@ define <2 x float> @extract_v2f32_nxv16f32_2(<vscale x 16 x float> %arg) {
define <4 x i1> @extract_v4i1_nxv32i1_0(<vscale x 32 x i1> %arg) {
; CHECK-LABEL: extract_v4i1_nxv32i1_0:
; CHECK: // %bb.0:
-; CHECK-NEXT: mov z1.b, p0/z, #1 // =0x1
-; CHECK-NEXT: umov w8, v1.b[1]
-; CHECK-NEXT: mov v0.16b, v1.16b
-; CHECK-NEXT: umov w9, v1.b[2]
+; CHECK-NEXT: mov z0.b, p0/z, #1 // =0x1
+; CHECK-NEXT: umov w8, v0.b[1]
+; CHECK-NEXT: mov v1.16b, v0.16b
; CHECK-NEXT: mov v0.h[1], w8
+; CHECK-NEXT: umov w8, v1.b[2]
+; CHECK-NEXT: mov v0.h[2], w8
; CHECK-NEXT: umov w8, v1.b[3]
-; CHECK-NEXT: mov v0.h[2], w9
; CHECK-NEXT: mov v0.h[3], w8
-; CHECK-NEXT: // kill: def $d0 killed $d0 killed $q0
+; CHECK-NEXT: // kill: def $d0 killed $d0 killed $z0
; CHECK-NEXT: ret
%ext = call <4 x i1> @llvm.vector.extract.v4i1.n...
[truncated]
-; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0
; CHECK-NEXT: str q0, [sp] // 16-byte Folded Spill
+; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr q0, [sp] // 16-byte Folded Reload
; CHECK-NEXT: bl use_v4i32
@@ -674,12 +662,11 @@ define void @dont_coalesce_arg_v2i64(<2 x i64> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0
; CHECK-NEXT: str q0, [sp] // 16-byte Folded Spill
+; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr q0, [sp] // 16-byte Folded Reload
; CHECK-NEXT: bl use_v2i64
@@ -713,12 +700,11 @@ define void @dont_coalesce_arg_v8f16(<8 x half> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0
; CHECK-NEXT: str q0, [sp] // 16-byte Folded Spill
+; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr q0, [sp] // 16-byte Folded Reload
; CHECK-NEXT: bl use_v8f16
@@ -752,12 +738,11 @@ define void @dont_coalesce_arg_v8bf16(<8 x bfloat> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0
; CHECK-NEXT: str q0, [sp] // 16-byte Folded Spill
+; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr q0, [sp] // 16-byte Folded Reload
; CHECK-NEXT: bl use_v8bf16
@@ -791,12 +776,11 @@ define void @dont_coalesce_arg_v4f32(<4 x float> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0
; CHECK-NEXT: str q0, [sp] // 16-byte Folded Spill
+; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr q0, [sp] // 16-byte Folded Reload
; CHECK-NEXT: bl use_v4f32
@@ -830,12 +814,11 @@ define void @dont_coalesce_arg_v2f64(<2 x double> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0
; CHECK-NEXT: str q0, [sp] // 16-byte Folded Spill
+; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr q0, [sp] // 16-byte Folded Reload
; CHECK-NEXT: bl use_v2f64
diff --git a/llvm/test/CodeGen/AArch64/sme-streaming-compatible-interface.ll b/llvm/test/CodeGen/AArch64/sme-streaming-compatible-interface.ll
index f2163ad15bafc..df88f37195ed6 100644
--- a/llvm/test/CodeGen/AArch64/sme-streaming-compatible-interface.ll
+++ b/llvm/test/CodeGen/AArch64/sme-streaming-compatible-interface.ll
@@ -129,12 +129,11 @@ define <2 x double> @streaming_compatible_with_neon_vectors(<2 x double> %arg) "
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
+; CHECK-NEXT: str q0, [sp] // 16-byte Folded Spill
+; CHECK-NEXT: mrs x19, SVCR
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: mrs x19, SVCR
-; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0
-; CHECK-NEXT: str q0, [sp] // 16-byte Folded Spill
; CHECK-NEXT: tbz w19, #0, .LBB4_2
; CHECK-NEXT: // %bb.1:
; CHECK-NEXT: smstop sm
diff --git a/llvm/test/CodeGen/AArch64/sve-extract-fixed-from-scalable-vector.ll b/llvm/test/CodeGen/AArch64/sve-extract-fixed-from-scalable-vector.ll
index 6c6a691760af3..52a77cb396909 100644
--- a/llvm/test/CodeGen/AArch64/sve-extract-fixed-from-scalable-vector.ll
+++ b/llvm/test/CodeGen/AArch64/sve-extract-fixed-from-scalable-vector.ll
@@ -147,15 +147,15 @@ define <2 x float> @extract_v2f32_nxv16f32_2(<vscale x 16 x float> %arg) {
define <4 x i1> @extract_v4i1_nxv32i1_0(<vscale x 32 x i1> %arg) {
; CHECK-LABEL: extract_v4i1_nxv32i1_0:
; CHECK: // %bb.0:
-; CHECK-NEXT: mov z1.b, p0/z, #1 // =0x1
-; CHECK-NEXT: umov w8, v1.b[1]
-; CHECK-NEXT: mov v0.16b, v1.16b
-; CHECK-NEXT: umov w9, v1.b[2]
+; CHECK-NEXT: mov z0.b, p0/z, #1 // =0x1
+; CHECK-NEXT: umov w8, v0.b[1]
+; CHECK-NEXT: mov v1.16b, v0.16b
; CHECK-NEXT: mov v0.h[1], w8
+; CHECK-NEXT: umov w8, v1.b[2]
+; CHECK-NEXT: mov v0.h[2], w8
; CHECK-NEXT: umov w8, v1.b[3]
-; CHECK-NEXT: mov v0.h[2], w9
; CHECK-NEXT: mov v0.h[3], w8
-; CHECK-NEXT: // kill: def $d0 killed $d0 killed $q0
+; CHECK-NEXT: // kill: def $d0 killed $d0 killed $z0
; CHECK-NEXT: ret
%ext = call <4 x i1> @llvm.vector.extract.v4i1.n...
[truncated]
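For reference, a test that needs to keep its pre-patch output can presumably pass the flag explicitly, since `terminal-rule` is an ordinary (hidden) boolean `cl::opt`. The following is a minimal sketch only: the triple, the function `@f`, and the CHECK lines are made up for illustration, and the actual RUN lines used for the Hexagon tests are not visible in this truncated diff.

```llvm
; Hypothetical test, not taken from the patch.
; -terminal-rule=0 requests the old default (rule disabled).
; RUN: llc -mtriple=aarch64 -terminal-rule=0 < %s | FileCheck %s

; CHECK-LABEL: f:
; CHECK: ret
define i32 @f(i32 %x) {
  ret i32 %x
}
```

Pinning the flag this way keeps a test's expectations stable, but it also means the test no longer exercises the new default.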
@llvm/pr-subscribers-llvm-regalloc
Author: Matt Arsenault (arsenm)
@llvm/pr-subscribers-backend-amdgpu
Author: Matt Arsenault (arsenm)
Changes
This appears to be a forgotten switch flip from 2015. This seems to do a nicer job with subregister copies.
I also had to hack many Hexagon tests to disable the rule. I have no idea how to update these tests.
Patch is 1.65 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/161621.diff
152 Files Affected:
diff --git a/llvm/lib/CodeGen/RegisterCoalescer.cpp b/llvm/lib/CodeGen/RegisterCoalescer.cpp
index 7ac1aef83777a..5bd38a916fe4d 100644
--- a/llvm/lib/CodeGen/RegisterCoalescer.cpp
+++ b/llvm/lib/CodeGen/RegisterCoalescer.cpp
@@ -81,7 +81,7 @@ static cl::opt<bool> EnableJoining("join-liveintervals",
static cl::opt<bool> UseTerminalRule("terminal-rule",
cl::desc("Apply the terminal rule"),
- cl::init(false), cl::Hidden);
+ cl::init(true), cl::Hidden);
/// Temporary flag to test critical edge unsplitting.
static cl::opt<bool> EnableJoinSplits(
diff --git a/llvm/test/CodeGen/AArch64/build-vector-two-dup.ll b/llvm/test/CodeGen/AArch64/build-vector-two-dup.ll
index dbbfbea9176f6..f725c19081deb 100644
--- a/llvm/test/CodeGen/AArch64/build-vector-two-dup.ll
+++ b/llvm/test/CodeGen/AArch64/build-vector-two-dup.ll
@@ -188,11 +188,11 @@ entry:
define <8 x i8> @test11(ptr nocapture noundef readonly %a, ptr nocapture noundef readonly %b) {
; CHECK-LABEL: test11:
; CHECK: // %bb.0: // %entry
-; CHECK-NEXT: ld1r { v1.8b }, [x0]
-; CHECK-NEXT: ld1r { v2.8b }, [x1]
-; CHECK-NEXT: mov v0.16b, v1.16b
-; CHECK-NEXT: mov v0.h[2], v2.h[0]
-; CHECK-NEXT: mov v0.h[3], v1.h[0]
+; CHECK-NEXT: ld1r { v0.8b }, [x0]
+; CHECK-NEXT: ld1r { v1.8b }, [x1]
+; CHECK-NEXT: fmov d2, d0
+; CHECK-NEXT: mov v0.h[2], v1.h[0]
+; CHECK-NEXT: mov v0.h[3], v2.h[0]
; CHECK-NEXT: // kill: def $d0 killed $d0 killed $q0
; CHECK-NEXT: ret
entry:
diff --git a/llvm/test/CodeGen/AArch64/machine-licm-sink-instr.ll b/llvm/test/CodeGen/AArch64/machine-licm-sink-instr.ll
index 3230c9e946da7..b3a7ec961b736 100644
--- a/llvm/test/CodeGen/AArch64/machine-licm-sink-instr.ll
+++ b/llvm/test/CodeGen/AArch64/machine-licm-sink-instr.ll
@@ -20,20 +20,17 @@ define i32 @sink_load_and_copy(i32 %n) {
; CHECK-NEXT: b.lt .LBB0_3
; CHECK-NEXT: // %bb.1: // %for.body.preheader
; CHECK-NEXT: adrp x8, A
-; CHECK-NEXT: mov w20, w19
-; CHECK-NEXT: ldr w21, [x8, :lo12:A]
+; CHECK-NEXT: mov w21, w19
+; CHECK-NEXT: ldr w20, [x8, :lo12:A]
; CHECK-NEXT: .LBB0_2: // %for.body
; CHECK-NEXT: // =>This Inner Loop Header: Depth=1
-; CHECK-NEXT: mov w0, w21
+; CHECK-NEXT: mov w0, w20
; CHECK-NEXT: bl _Z3usei
-; CHECK-NEXT: sdiv w20, w20, w0
-; CHECK-NEXT: subs w19, w19, #1
+; CHECK-NEXT: sdiv w19, w19, w0
+; CHECK-NEXT: subs w21, w21, #1
; CHECK-NEXT: b.ne .LBB0_2
-; CHECK-NEXT: b .LBB0_4
-; CHECK-NEXT: .LBB0_3:
-; CHECK-NEXT: mov w20, w19
-; CHECK-NEXT: .LBB0_4: // %for.cond.cleanup
-; CHECK-NEXT: mov w0, w20
+; CHECK-NEXT: .LBB0_3: // %for.cond.cleanup
+; CHECK-NEXT: mov w0, w19
; CHECK-NEXT: ldp x20, x19, [sp, #16] // 16-byte Folded Reload
; CHECK-NEXT: ldp x30, x21, [sp], #32 // 16-byte Folded Reload
; CHECK-NEXT: ret
@@ -82,15 +79,12 @@ define i32 @cant_sink_successive_call(i32 %n) {
; CHECK-NEXT: // =>This Inner Loop Header: Depth=1
; CHECK-NEXT: mov w0, w20
; CHECK-NEXT: bl _Z3usei
-; CHECK-NEXT: sdiv w21, w21, w0
-; CHECK-NEXT: subs w19, w19, #1
+; CHECK-NEXT: sdiv w19, w19, w0
+; CHECK-NEXT: subs w21, w21, #1
; CHECK-NEXT: b.ne .LBB1_2
-; CHECK-NEXT: b .LBB1_4
-; CHECK-NEXT: .LBB1_3:
-; CHECK-NEXT: mov w21, w19
-; CHECK-NEXT: .LBB1_4: // %for.cond.cleanup
+; CHECK-NEXT: .LBB1_3: // %for.cond.cleanup
+; CHECK-NEXT: mov w0, w19
; CHECK-NEXT: ldp x20, x19, [sp, #16] // 16-byte Folded Reload
-; CHECK-NEXT: mov w0, w21
; CHECK-NEXT: ldp x30, x21, [sp], #32 // 16-byte Folded Reload
; CHECK-NEXT: ret
entry:
@@ -139,15 +133,12 @@ define i32 @cant_sink_successive_store(ptr nocapture readnone %store, i32 %n) {
; CHECK-NEXT: // =>This Inner Loop Header: Depth=1
; CHECK-NEXT: mov w0, w20
; CHECK-NEXT: bl _Z3usei
-; CHECK-NEXT: sdiv w21, w21, w0
-; CHECK-NEXT: subs w19, w19, #1
+; CHECK-NEXT: sdiv w19, w19, w0
+; CHECK-NEXT: subs w21, w21, #1
; CHECK-NEXT: b.ne .LBB2_2
-; CHECK-NEXT: b .LBB2_4
-; CHECK-NEXT: .LBB2_3:
-; CHECK-NEXT: mov w21, w19
-; CHECK-NEXT: .LBB2_4: // %for.cond.cleanup
+; CHECK-NEXT: .LBB2_3: // %for.cond.cleanup
+; CHECK-NEXT: mov w0, w19
; CHECK-NEXT: ldp x20, x19, [sp, #16] // 16-byte Folded Reload
-; CHECK-NEXT: mov w0, w21
; CHECK-NEXT: ldp x30, x21, [sp], #32 // 16-byte Folded Reload
; CHECK-NEXT: ret
entry:
diff --git a/llvm/test/CodeGen/AArch64/machine-sink-kill-flags.ll b/llvm/test/CodeGen/AArch64/machine-sink-kill-flags.ll
index e7e109170d6a1..338084295fc7f 100644
--- a/llvm/test/CodeGen/AArch64/machine-sink-kill-flags.ll
+++ b/llvm/test/CodeGen/AArch64/machine-sink-kill-flags.ll
@@ -16,13 +16,12 @@ define i32 @test(ptr %ptr) {
; CHECK-NEXT: mov w9, wzr
; CHECK-NEXT: LBB0_1: ; %.thread
; CHECK-NEXT: ; =>This Inner Loop Header: Depth=1
-; CHECK-NEXT: lsr w11, w9, #1
; CHECK-NEXT: sub w10, w9, #1
-; CHECK-NEXT: mov w9, w11
+; CHECK-NEXT: lsr w9, w9, #1
; CHECK-NEXT: tbnz w10, #0, LBB0_1
; CHECK-NEXT: ; %bb.2: ; %bb343
; CHECK-NEXT: and w9, w10, #0x1
-; CHECK-NEXT: mov w0, #-1
+; CHECK-NEXT: mov w0, #-1 ; =0xffffffff
; CHECK-NEXT: str w9, [x8]
; CHECK-NEXT: ret
bb:
diff --git a/llvm/test/CodeGen/AArch64/sme-pstate-sm-changing-call-disable-coalescing.ll b/llvm/test/CodeGen/AArch64/sme-pstate-sm-changing-call-disable-coalescing.ll
index b947c943ba448..72f6646930624 100644
--- a/llvm/test/CodeGen/AArch64/sme-pstate-sm-changing-call-disable-coalescing.ll
+++ b/llvm/test/CodeGen/AArch64/sme-pstate-sm-changing-call-disable-coalescing.ll
@@ -151,12 +151,11 @@ define void @dont_coalesce_arg_f16(half %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $h0 killed $h0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $h0 killed $h0 killed $z0
; CHECK-NEXT: str h0, [sp, #14] // 2-byte Folded Spill
+; CHECK-NEXT: // kill: def $h0 killed $h0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr h0, [sp, #14] // 2-byte Folded Reload
; CHECK-NEXT: bl use_f16
@@ -190,12 +189,11 @@ define void @dont_coalesce_arg_f32(float %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $s0 killed $s0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $s0 killed $s0 killed $z0
; CHECK-NEXT: str s0, [sp, #12] // 4-byte Folded Spill
+; CHECK-NEXT: // kill: def $s0 killed $s0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr s0, [sp, #12] // 4-byte Folded Reload
; CHECK-NEXT: bl use_f32
@@ -229,12 +227,11 @@ define void @dont_coalesce_arg_f64(double %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $d0 killed $d0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $d0 killed $d0 killed $z0
; CHECK-NEXT: str d0, [sp, #8] // 8-byte Folded Spill
+; CHECK-NEXT: // kill: def $d0 killed $d0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr d0, [sp, #8] // 8-byte Folded Reload
; CHECK-NEXT: bl use_f64
@@ -273,12 +270,11 @@ define void @dont_coalesce_arg_v1i8(<1 x i8> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $d0 killed $d0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $d0 killed $d0 killed $z0
; CHECK-NEXT: str d0, [sp, #8] // 8-byte Folded Spill
+; CHECK-NEXT: // kill: def $d0 killed $d0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr d0, [sp, #8] // 8-byte Folded Reload
; CHECK-NEXT: bl use_v16i8
@@ -313,12 +309,11 @@ define void @dont_coalesce_arg_v1i16(<1 x i16> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $d0 killed $d0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $d0 killed $d0 killed $z0
; CHECK-NEXT: str d0, [sp, #8] // 8-byte Folded Spill
+; CHECK-NEXT: // kill: def $d0 killed $d0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr d0, [sp, #8] // 8-byte Folded Reload
; CHECK-NEXT: bl use_v8i16
@@ -353,12 +348,11 @@ define void @dont_coalesce_arg_v1i32(<1 x i32> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $d0 killed $d0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $d0 killed $d0 killed $z0
; CHECK-NEXT: str d0, [sp, #8] // 8-byte Folded Spill
+; CHECK-NEXT: // kill: def $d0 killed $d0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr d0, [sp, #8] // 8-byte Folded Reload
; CHECK-NEXT: bl use_v4i32
@@ -393,12 +387,11 @@ define void @dont_coalesce_arg_v1i64(<1 x i64> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $d0 killed $d0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $d0 killed $d0 killed $z0
; CHECK-NEXT: str d0, [sp, #8] // 8-byte Folded Spill
+; CHECK-NEXT: // kill: def $d0 killed $d0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr d0, [sp, #8] // 8-byte Folded Reload
; CHECK-NEXT: bl use_v2i64
@@ -433,12 +426,11 @@ define void @dont_coalesce_arg_v1f16(<1 x half> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $h0 killed $h0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $h0 killed $h0 killed $z0
; CHECK-NEXT: str h0, [sp, #14] // 2-byte Folded Spill
+; CHECK-NEXT: // kill: def $h0 killed $h0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr h0, [sp, #14] // 2-byte Folded Reload
; CHECK-NEXT: bl use_v8f16
@@ -513,12 +505,11 @@ define void @dont_coalesce_arg_v1f64(<1 x double> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $d0 killed $d0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $d0 killed $d0 killed $z0
; CHECK-NEXT: str d0, [sp, #8] // 8-byte Folded Spill
+; CHECK-NEXT: // kill: def $d0 killed $d0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr d0, [sp, #8] // 8-byte Folded Reload
; CHECK-NEXT: bl use_v2f64
@@ -557,12 +548,11 @@ define void @dont_coalesce_arg_v16i8(<16 x i8> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0
; CHECK-NEXT: str q0, [sp] // 16-byte Folded Spill
+; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr q0, [sp] // 16-byte Folded Reload
; CHECK-NEXT: bl use_v16i8
@@ -596,12 +586,11 @@ define void @dont_coalesce_arg_v8i16(<8 x i16> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0
; CHECK-NEXT: str q0, [sp] // 16-byte Folded Spill
+; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr q0, [sp] // 16-byte Folded Reload
; CHECK-NEXT: bl use_v8i16
@@ -635,12 +624,11 @@ define void @dont_coalesce_arg_v4i32(<4 x i32> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0
; CHECK-NEXT: str q0, [sp] // 16-byte Folded Spill
+; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr q0, [sp] // 16-byte Folded Reload
; CHECK-NEXT: bl use_v4i32
@@ -674,12 +662,11 @@ define void @dont_coalesce_arg_v2i64(<2 x i64> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0
; CHECK-NEXT: str q0, [sp] // 16-byte Folded Spill
+; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr q0, [sp] // 16-byte Folded Reload
; CHECK-NEXT: bl use_v2i64
@@ -713,12 +700,11 @@ define void @dont_coalesce_arg_v8f16(<8 x half> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0
; CHECK-NEXT: str q0, [sp] // 16-byte Folded Spill
+; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr q0, [sp] // 16-byte Folded Reload
; CHECK-NEXT: bl use_v8f16
@@ -752,12 +738,11 @@ define void @dont_coalesce_arg_v8bf16(<8 x bfloat> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0
; CHECK-NEXT: str q0, [sp] // 16-byte Folded Spill
+; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr q0, [sp] // 16-byte Folded Reload
; CHECK-NEXT: bl use_v8bf16
@@ -791,12 +776,11 @@ define void @dont_coalesce_arg_v4f32(<4 x float> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0
; CHECK-NEXT: str q0, [sp] // 16-byte Folded Spill
+; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr q0, [sp] // 16-byte Folded Reload
; CHECK-NEXT: bl use_v4f32
@@ -830,12 +814,11 @@ define void @dont_coalesce_arg_v2f64(<2 x double> %arg, ptr %ptr) #0 {
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: mov x19, x0
-; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0
; CHECK-NEXT: str q0, [sp] // 16-byte Folded Spill
+; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
+; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
; CHECK-NEXT: smstop sm
; CHECK-NEXT: ldr q0, [sp] // 16-byte Folded Reload
; CHECK-NEXT: bl use_v2f64
diff --git a/llvm/test/CodeGen/AArch64/sme-streaming-compatible-interface.ll b/llvm/test/CodeGen/AArch64/sme-streaming-compatible-interface.ll
index f2163ad15bafc..df88f37195ed6 100644
--- a/llvm/test/CodeGen/AArch64/sme-streaming-compatible-interface.ll
+++ b/llvm/test/CodeGen/AArch64/sme-streaming-compatible-interface.ll
@@ -129,12 +129,11 @@ define <2 x double> @streaming_compatible_with_neon_vectors(<2 x double> %arg) "
; CHECK-NEXT: stp x30, x19, [sp, #80] // 16-byte Folded Spill
; CHECK-NEXT: sub sp, sp, #16
; CHECK-NEXT: addvl sp, sp, #-1
+; CHECK-NEXT: str q0, [sp] // 16-byte Folded Spill
+; CHECK-NEXT: mrs x19, SVCR
; CHECK-NEXT: add x8, sp, #16
; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
; CHECK-NEXT: str z0, [x8] // 16-byte Folded Spill
-; CHECK-NEXT: mrs x19, SVCR
-; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0
-; CHECK-NEXT: str q0, [sp] // 16-byte Folded Spill
; CHECK-NEXT: tbz w19, #0, .LBB4_2
; CHECK-NEXT: // %bb.1:
; CHECK-NEXT: smstop sm
diff --git a/llvm/test/CodeGen/AArch64/sve-extract-fixed-from-scalable-vector.ll b/llvm/test/CodeGen/AArch64/sve-extract-fixed-from-scalable-vector.ll
index 6c6a691760af3..52a77cb396909 100644
--- a/llvm/test/CodeGen/AArch64/sve-extract-fixed-from-scalable-vector.ll
+++ b/llvm/test/CodeGen/AArch64/sve-extract-fixed-from-scalable-vector.ll
@@ -147,15 +147,15 @@ define <2 x float> @extract_v2f32_nxv16f32_2(<vscale x 16 x float> %arg) {
define <4 x i1> @extract_v4i1_nxv32i1_0(<vscale x 32 x i1> %arg) {
; CHECK-LABEL: extract_v4i1_nxv32i1_0:
; CHECK: // %bb.0:
-; CHECK-NEXT: mov z1.b, p0/z, #1 // =0x1
-; CHECK-NEXT: umov w8, v1.b[1]
-; CHECK-NEXT: mov v0.16b, v1.16b
-; CHECK-NEXT: umov w9, v1.b[2]
+; CHECK-NEXT: mov z0.b, p0/z, #1 // =0x1
+; CHECK-NEXT: umov w8, v0.b[1]
+; CHECK-NEXT: mov v1.16b, v0.16b
; CHECK-NEXT: mov v0.h[1], w8
+; CHECK-NEXT: umov w8, v1.b[2]
+; CHECK-NEXT: mov v0.h[2], w8
; CHECK-NEXT: umov w8, v1.b[3]
-; CHECK-NEXT: mov v0.h[2], w9
; CHECK-NEXT: mov v0.h[3], w8
-; CHECK-NEXT: // kill: def $d0 killed $d0 killed $q0
+; CHECK-NEXT: // kill: def $d0 killed $d0 killed $z0
; CHECK-NEXT: ret
%ext = call <4 x i1> @llvm.vector.extract.v4i1.n...
[truncated]
To give targets an opportunity to adopt the terminal rule on their own, I think it makes sense to have a target hook for it, like what we do with join-globalcopies and ::enableJoinGlobalCopies.
I don't remember why I didn't flip the switch when I introduced this rule, so I want to make sure we are friendly to downstream users.
I specifically think we should not have a control for this, and have a policy against introducing this style of target hook. I think these are lazy shortcuts that let whoever adds a feature avoid touching the targets they don't care about, and the project is worse off for it. The reality of target maintenance is that this will never be implemented by any target. If we wait several years, a few might discover it and turn it on.
@arsenm I generally agree with you here, but I think this is a case where adding a hook would make review much easier. I'd propose adding the hook, enabling it for AMDGPU, then making individual changes for the other backends, then (if desired) removing the hook. LGTM from the RISC-V perspective, and the change generally seems reasonable.
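For readers following the hook discussion, here is a minimal standalone sketch of what a staged, per-target opt-in could look like. It is not LLVM code: `SubtargetModel`, `OptedInTarget`, `enableTerminalRule`, and `shouldApplyTerminalRule` are hypothetical names invented for illustration, and the command-line flag is modelled as a plain `std::optional<bool>` rather than a real `cl::opt`.

```cpp
#include <optional>

// Hypothetical stand-in for a per-target subtarget class; not LLVM's real API.
struct SubtargetModel {
  virtual ~SubtargetModel() = default;
  // Hypothetical hook: the base class keeps the old behaviour (rule off),
  // and a target opts in by overriding it.
  virtual bool enableTerminalRule() const { return false; }
};

// Hypothetical target that has reviewed its test changes and opted in.
struct OptedInTarget : SubtargetModel {
  bool enableTerminalRule() const override { return true; }
};

// An explicitly supplied flag value always wins; when the flag is unset,
// the decision is delegated to the target's hook.
bool shouldApplyTerminalRule(std::optional<bool> flag,
                             const SubtargetModel &subtarget) {
  return flag.has_value() ? *flag : subtarget.enableTerminalRule();
}
```

Under this scheme each backend's test churn could be reviewed separately as it overrides the hook, and the hook could be deleted once every target returns true. The counter-argument above still applies: targets that never override it would stay on the old behaviour indefinitely.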
This appears to be a forgotten switch flip from 2015. This
seems to do a nicer job with subregister copies. Most of the
test changes are improvements or neutral; only a few are
light regressions. The worst AMDGPU regressions are for true16
in the atomic tests, but I think that's due to existing true16
issues.
I also had to hack many Hexagon tests to disable the rule. I have
no idea how to update these tests. They appear to be testing specific
scheduling and packet formation of later machine passes, so any change
in the incoming MIR is likely hiding whatever was originally intended.
I'll open an issue to fix up these tests once this lands.