[GlobalISel] Disable CSE in IRTranslator & Legalizer at O0 by c-rhodes · Pull Request #197000 · llvm/llvm-project

c-rhodes · 2026-05-11T17:13:21Z

CTMark -0.54% geomean improvement on stage1-aarch64-O0-g [1] with no change to
code-size [2]. Sqlite is -1.30%.

I also measured without -g locally and geomean is -0.62%:

                 instructions:u                 diff
                            old           new
7zip               131599609457  131463795723 -0.10%
Bullet              57300798645   57215708638 -0.15%
ClamAV              12687796637   12595161387 -0.73%
SPASS               12494927389   12442127229 -0.42%
consumer-typeset    10419961416   10306472249 -1.09%
kimwitu++           20354279265   20261925075 -0.45%
lencod              10986384595   10899231247 -0.79%
mafft                6039142913    6005083779 -0.56%
sqlite3              3582744938    3525792360 -1.59%
tramp3d-v4          16245375037   16196807778 -0.30%
geomean             15815512869   15717416669 -0.62%
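
The `diff` column can be re-derived from the `old` and `new` counts. Below is a minimal sketch, assuming the usual definitions (relative change in percent, and a log-based geometric mean); the helper names `percentChange` and `geomean` are illustrative and are not part of the compile-time tracker or LLVM.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Relative change from Old to New in percent; negative means fewer
// instructions were executed with the patch applied.
double percentChange(double Old, double New) {
  assert(Old > 0 && "baseline count must be positive");
  return (New - Old) / Old * 100.0;
}

// Geometric mean, computed through logarithms so that multiplying many
// large instruction counts cannot overflow.
double geomean(const std::vector<double> &Values) {
  assert(!Values.empty() && "need at least one value");
  double LogSum = 0.0;
  for (double V : Values)
    LogSum += std::log(V);
  return std::exp(LogSum / Values.size());
}
```

For example, `percentChange(3582744938.0, 3525792360.0)` is about -1.59, matching the sqlite3 row, and applying it to the two geomean values gives about -0.62.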

CSE at O0 was enabled for constants only in 946b124, which improved
compile-time and code-size.

Assisted-by: codex

[1] https://llvm-compile-time-tracker.com/compare.php?from=f0c84b3ac80c4c594045aff9b4f88ba564614361&to=98bfcb7aa08ba8ecf5ed5a89e64d2e7f0878be5b&stat=instructions%3Au
[2] https://llvm-compile-time-tracker.com/compare.php?from=f0c84b3ac80c4c594045aff9b4f88ba564614361&to=98bfcb7aa08ba8ecf5ed5a89e64d2e7f0878be5b&stat=size-total

c-rhodes · 2026-05-11T17:16:42Z

https://llvm-compile-time-tracker.com/compare.php?from=ed50ea52004259af958bb3e5636268342c49ee62&to=0fce1265d04213d57b031eb746296c2c77b7eb1f&stat=instructions%3Au

This is showing positive results: -0.31% geomean improvement on aarch64-O0-g with no change in code size. @aemerson I noticed you turned this on a while back (946b124) but only for constants at O0. Is this result unexpected?

It breaks tons of tests of course, and I've not updated them yet, hence the draft. I just wanted to check before I post a full PR.
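
To illustrate why so many test expectations change, here is a toy model of a builder that optionally CSEs constants. It is only a sketch under simplifying assumptions, not LLVM's `CSEMIRBuilder`: `ToyBuilder` and `countConstants` are hypothetical names, and real CSE covers more than constants. Without CSE, a repeated request for the same constant emits a second definition, much like the extra `G_CONSTANT i64 16` that appears in the updated `arm64-irtranslator-gep.ll` checks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Toy builder: every emitted constant defines a fresh virtual register,
// unless CSE is on and an identical constant was already emitted.
class ToyBuilder {
public:
  explicit ToyBuilder(bool EnableCSE) : EnableCSE(EnableCSE) {}

  // Returns the virtual register number that holds Value.
  unsigned buildConstant(std::int64_t Value) {
    if (EnableCSE) {
      auto It = Seen.find(Value);
      if (It != Seen.end())
        return It->second; // reuse the earlier definition
    }
    unsigned Reg = static_cast<unsigned>(Emitted.size());
    Emitted.push_back(Value);
    if (EnableCSE)
      Seen.emplace(Value, Reg);
    return Reg;
  }

  std::size_t numInstructions() const { return Emitted.size(); }

private:
  bool EnableCSE;
  std::vector<std::int64_t> Emitted;     // one entry per emitted constant
  std::map<std::int64_t, unsigned> Seen; // value -> defining register
};

// Requests the constants 16, 16 and 4 and reports how many were emitted.
std::size_t countConstants(bool EnableCSE) {
  ToyBuilder B(EnableCSE);
  unsigned First = B.buildConstant(16);
  unsigned Second = B.buildConstant(16); // duplicate request
  B.buildConstant(4);
  // The duplicate shares a register only when CSE is on.
  assert((First == Second) == EnableCSE);
  return B.numInstructions();
}
```

In this model the lookup map is the bookkeeping that is skipped when CSE is off, at the cost of emitting the duplicate.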

aemerson · 2026-05-11T18:34:33Z

I am quite surprised at these results but I guess we may have added the same CSE combines into the O0 prelegalizer combiner anyway.

In principle, if this doesn't regress anything I'm not opposed to it, and a 0.3% compile-time saving is quite significant.

github-actions · 2026-05-12T07:22:57Z

🐧 Linux x64 Test Results

195291 tests passed
5202 tests skipped

✅ The build succeeded and all tests passed.

github-actions · 2026-05-12T07:22:57Z

🪟 Windows x64 Test Results

134663 tests passed
3269 tests skipped

✅ The build succeeded and all tests passed.

c-rhodes · 2026-05-12T08:37:23Z

Also disabling it for the IRTranslator is a further improvement, still with no impact on code size: geomean -0.54%, sqlite -1.30%:
https://llvm-compile-time-tracker.com/compare.php?from=f0c84b3ac80c4c594045aff9b4f88ba564614361&to=98bfcb7aa08ba8ecf5ed5a89e64d2e7f0878be5b&stat=instructions%3Au
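
The gate changed by this patch can be modelled in isolation. The sketch below is a stand-alone model rather than LLVM code: `OptLevel` and `PassConfigModel` are hypothetical stand-ins for `CodeGenOptLevel` and `TargetPassConfig`, and only the return expression mirrors the one-line change in `TargetPassConfig.cpp`.

```cpp
#include <cassert>

// Stand-in for llvm::CodeGenOptLevel (assumed to have these four levels).
enum class OptLevel { None, Less, Default, Aggressive };

// Stand-in for TargetPassConfig, reduced to the one query of interest.
struct PassConfigModel {
  OptLevel Level;

  // Before the patch this returned true unconditionally; after it,
  // CSE is reported as disabled only at O0 (OptLevel::None).
  bool isGISelCSEEnabled() const { return Level != OptLevel::None; }
};

// Usage: the gate is off at O0 and on at every other level.
inline void checkGate() {
  assert(!PassConfigModel{OptLevel::None}.isGISelCSEEnabled());
  assert(PassConfigModel{OptLevel::Default}.isGISelCSEEnabled());
}
```

Because the condition is on the optimization level alone, output above O0 is left as it was.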

llvmorg-github-actions · 2026-05-12T10:31:40Z

@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-backend-webassembly
@llvm/pr-subscribers-backend-arm
@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-backend-aarch64

Author: Cullen Rhodes (c-rhodes)

Changes

Patch is 6.07 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/197000.diff

140 Files Affected:

(modified) llvm/lib/CodeGen/TargetPassConfig.cpp (+1-1)
(modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc.ll (+16-16)
(modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-v8a.ll (+16-16)
(modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-store-rcpc_immo.ll (+1-2)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic-128.ll (+8-6)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator-gep.ll (+4-3)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll (+19-10)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/call-translator-ios.ll (+3-2)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/call-translator.ll (+2-1)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-bitcast.ll (+16-16)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-gep-flags.ll (+20-15)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-sincos.ll (+2-1)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-tbaa.ll (+2-2)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-build-vector.mir (+4-2)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-cmp.mir (+152-74)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-concat-vectors.mir (+8-2)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-constant.mir (+48-38)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-ctlz.mir (+11-7)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-div.mir (+2-1)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-ext.mir (+6-4)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-extract-vector-elt.mir (+4-3)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-fptrunc.mir (+2-1)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-freeze.mir (+7-2)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-fshl.mir (+149-75)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-fshr.mir (+131-66)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-insert-vector-elt.mir (+9-3)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-inserts.mir (+298-133)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-load-store.mir (+38-21)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-merge-values.mir (+2-2)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-or.mir (+32-19)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-phi-insertpt-decrement.mir (+2-2)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-phi.mir (+10-8)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-rem.mir (+2-1)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-select.mir (+16-10)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-shift.mir (+33-19)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-shuffle-vector.mir (+16-12)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-simple.mir (+8-4)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-store-vector-bools.mir (+46-31)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-threeway-cmp.mir (+5-4)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-uadd-sat.mir (+18-12)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-undef.mir (+6-3)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-usub-sat.mir (+18-12)
(modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-vaarg.mir (+15-13)
(modified) llvm/test/CodeGen/AArch64/aarch64-mops-mte.ll (+22-21)
(modified) llvm/test/CodeGen/AArch64/aarch64-tbz.ll (+30-23)
(modified) llvm/test/CodeGen/AArch64/popcount.ll (+8-6)
(modified) llvm/test/CodeGen/AArch64/pr48188.ll (+1-1)
(modified) llvm/test/CodeGen/AArch64/pr53315-returned-i128.ll (+3-2)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/artifact-combiner-extract.mir (+13-7)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/artifact-combiner-sext.mir (+4-3)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/artifact-combiner-unmerge-values.mir (+142-77)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/artifact-combiner-zext.mir (+4-4)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-constant-fold-vector-op.ll (+2-3)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-function-args.ll (+11-9)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-add.mir (+42-23)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-addrspacecast.mir (+14-7)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-and.mir (+170-66)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-ashr.mir (+234-144)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-bitreverse.mir (+6-3)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-bswap.mir (+67-56)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-build-vector-trunc.mir (+5-3)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-build-vector.s16.mir (+173-90)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-ctls.mir (+24-12)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-ctlz-zero-poison.mir (+10-6)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-ctlz.mir (+20-10)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-ctpop.mir (+2-1)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-cttz-zero-poison.mir (+2-1)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-cttz.mir (+8-4)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fcmp.mir (+18-9)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-freeze.mir (+66-21)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fshl.mir (+277-177)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fshr.mir (+248-161)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-icmp.mir (+58-30)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-implicit-def-s1025.mir (+186-124)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-implicit-def.mir (+51-23)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-constant-32bit.mir (+6-4)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-constant.mir (+2551-1503)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-flat.mir (+3270-1927)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-global.mir (+4200-2521)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-local.mir (+5010-2905)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-private.mir (+5509-3162)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-lshr.mir (+277-165)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-merge-values.mir (+529-289)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-mul.mir (+46-25)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-or.mir (+100-42)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-saddo.mir (+102-67)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-sbfx.mir (+4-2)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-sdiv.mir (+382-247)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-select.mir (+46-20)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-sext-inreg.mir (+155-89)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-sext.mir (+56-29)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-shl.mir (+270-168)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-shuffle-vector.mir (+45-19)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-smulh.mir (+80-46)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-smulo.mir (+58-34)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-srem.mir (+320-197)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-ssubo.mir (+102-67)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-store-global.mir (+2845-1803)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-store.mir (+262-194)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-sub.mir (+64-23)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-ubfx.mir (+8-4)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-udiv.mir (+494-372)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-umulh.mir (+214-128)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-umulo.mir (+218-124)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-unmerge-values.mir (+111-81)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-urem.mir (+394-284)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-xor.mir (+100-42)
(modified) llvm/test/CodeGen/AMDGPU/GlobalISel/no-ctlz-from-umul-to-lshr-in-postlegalizer.ll (+39-32)
(modified) llvm/test/CodeGen/AMDGPU/div_i128.ll (+251-218)
(modified) llvm/test/CodeGen/AMDGPU/overlapping-tuple-copy-implicit-op-failure.ll (+56-57)
(modified) llvm/test/CodeGen/ARM/GlobalISel/arm-legalize-bitcounts.mir (+6-4)
(modified) llvm/test/CodeGen/ARM/GlobalISel/arm-legalize-divmod.mir (+20-12)
(modified) llvm/test/CodeGen/ARM/GlobalISel/arm-legalize-fp.mir (+8-4)
(modified) llvm/test/CodeGen/Mips/GlobalISel/irtranslator/aggregate_struct_return.ll (+14-12)
(modified) llvm/test/CodeGen/Mips/GlobalISel/legalizer/add.mir (+4-2)
(modified) llvm/test/CodeGen/Mips/GlobalISel/legalizer/bitwise.mir (+4-2)
(modified) llvm/test/CodeGen/Mips/GlobalISel/legalizer/constants.mir (+8-4)
(modified) llvm/test/CodeGen/Mips/GlobalISel/legalizer/ctlz.mir (+2-1)
(modified) llvm/test/CodeGen/Mips/GlobalISel/legalizer/ctpop.mir (+19-11)
(modified) llvm/test/CodeGen/Mips/GlobalISel/legalizer/cttz.mir (+32-20)
(modified) llvm/test/CodeGen/Mips/GlobalISel/legalizer/icmp.mir (+5-3)
(modified) llvm/test/CodeGen/Mips/GlobalISel/legalizer/implicit_def.mir (+2-1)
(modified) llvm/test/CodeGen/Mips/GlobalISel/legalizer/rem_and_div.mir (+204-176)
(modified) llvm/test/CodeGen/Mips/GlobalISel/legalizer/sub.mir (+4-2)
(modified) llvm/test/CodeGen/Mips/GlobalISel/legalizer/truncStore_and_aExtLoad.mir (+2-1)
(modified) llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/bitreverse.ll (+60-50)
(modified) llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/bitwise.ll (+12-12)
(modified) llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/constants.ll (+4-2)
(modified) llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/ctlz.ll (+1-1)
(modified) llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/ctpop.ll (+25-17)
(modified) llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/cttz.ll (+33-28)
(modified) llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/implicit_def.ll (+9-5)
(modified) llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/long_ambiguous_chain_s64.ll (+67-65)
(modified) llvm/test/CodeGen/WebAssembly/GlobalISel/instructions/constant.ll (+6-2)
(modified) llvm/test/CodeGen/WebAssembly/GlobalISel/instructions/fshl.ll (+78-42)
(modified) llvm/test/CodeGen/WebAssembly/GlobalISel/instructions/fshr.ll (+72-36)
(modified) llvm/test/CodeGen/X86/GlobalISel/irtranslator-callingconv.ll (+4-2)
(modified) llvm/test/CodeGen/X86/GlobalISel/legalize-ext-x86-64.mir (+2-2)
(modified) llvm/test/CodeGen/X86/GlobalISel/legalize-memop-scalar-32.mir (+2-1)
(modified) llvm/test/CodeGen/X86/GlobalISel/x86_64-legalize-sitofp.mir (+64-56)

diff --git a/llvm/lib/CodeGen/TargetPassConfig.cpp b/llvm/lib/CodeGen/TargetPassConfig.cpp
index 096e9a6f2b1dc..c3af2acc622ba 100644
--- a/llvm/lib/CodeGen/TargetPassConfig.cpp
+++ b/llvm/lib/CodeGen/TargetPassConfig.cpp
@@ -1587,7 +1587,7 @@ bool TargetPassConfig::reportDiagnosticWhenGlobalISelFallback() const {
 }
 
 bool TargetPassConfig::isGISelCSEEnabled() const {
-  return true;
+  return getOptLevel() != CodeGenOptLevel::None;
 }
 
 std::unique_ptr<CSEConfigBase> TargetPassConfig::getCSEConfig() const {
diff --git a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc.ll b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc.ll
index 373b040ebec65..7c08cf1273b32 100644
--- a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc.ll
+++ b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc.ll
@@ -231,8 +231,8 @@ define dso_local i128 @load_atomic_i128_aligned_unordered(ptr %ptr) {
 ; -O0-LABEL: load_atomic_i128_aligned_unordered:
 ; -O0:    ldxp x0, x1, [x9]
 ; -O0:    cmp x0, x10
-; -O0:    cmp x1, x10
-; -O0:    stxp w8, x10, x10, [x9]
+; -O0:    cmp x1, x11
+; -O0:    stxp w8, x10, x11, [x9]
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_unordered:
@@ -246,8 +246,8 @@ define dso_local i128 @load_atomic_i128_aligned_unordered_const(ptr readonly %pt
 ; -O0-LABEL: load_atomic_i128_aligned_unordered_const:
 ; -O0:    ldxp x0, x1, [x9]
 ; -O0:    cmp x0, x10
-; -O0:    cmp x1, x10
-; -O0:    stxp w8, x10, x10, [x9]
+; -O0:    cmp x1, x11
+; -O0:    stxp w8, x10, x11, [x9]
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_unordered_const:
@@ -261,8 +261,8 @@ define dso_local i128 @load_atomic_i128_aligned_monotonic(ptr %ptr) {
 ; -O0-LABEL: load_atomic_i128_aligned_monotonic:
 ; -O0:    ldxp x0, x1, [x9]
 ; -O0:    cmp x0, x10
-; -O0:    cmp x1, x10
-; -O0:    stxp w8, x10, x10, [x9]
+; -O0:    cmp x1, x11
+; -O0:    stxp w8, x10, x11, [x9]
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_monotonic:
@@ -276,8 +276,8 @@ define dso_local i128 @load_atomic_i128_aligned_monotonic_const(ptr readonly %pt
 ; -O0-LABEL: load_atomic_i128_aligned_monotonic_const:
 ; -O0:    ldxp x0, x1, [x9]
 ; -O0:    cmp x0, x10
-; -O0:    cmp x1, x10
-; -O0:    stxp w8, x10, x10, [x9]
+; -O0:    cmp x1, x11
+; -O0:    stxp w8, x10, x11, [x9]
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_monotonic_const:
@@ -291,8 +291,8 @@ define dso_local i128 @load_atomic_i128_aligned_acquire(ptr %ptr) {
 ; -O0-LABEL: load_atomic_i128_aligned_acquire:
 ; -O0:    ldaxp x0, x1, [x9]
 ; -O0:    cmp x0, x10
-; -O0:    cmp x1, x10
-; -O0:    stxp w8, x10, x10, [x9]
+; -O0:    cmp x1, x11
+; -O0:    stxp w8, x10, x11, [x9]
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_acquire:
@@ -306,8 +306,8 @@ define dso_local i128 @load_atomic_i128_aligned_acquire_const(ptr readonly %ptr)
 ; -O0-LABEL: load_atomic_i128_aligned_acquire_const:
 ; -O0:    ldaxp x0, x1, [x9]
 ; -O0:    cmp x0, x10
-; -O0:    cmp x1, x10
-; -O0:    stxp w8, x10, x10, [x9]
+; -O0:    cmp x1, x11
+; -O0:    stxp w8, x10, x11, [x9]
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_acquire_const:
@@ -321,8 +321,8 @@ define dso_local i128 @load_atomic_i128_aligned_seq_cst(ptr %ptr) {
 ; -O0-LABEL: load_atomic_i128_aligned_seq_cst:
 ; -O0:    ldaxp x0, x1, [x9]
 ; -O0:    cmp x0, x10
-; -O0:    cmp x1, x10
-; -O0:    stlxp w8, x10, x10, [x9]
+; -O0:    cmp x1, x11
+; -O0:    stlxp w8, x10, x11, [x9]
 ; -O0:    stlxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_seq_cst:
@@ -336,8 +336,8 @@ define dso_local i128 @load_atomic_i128_aligned_seq_cst_const(ptr readonly %ptr)
 ; -O0-LABEL: load_atomic_i128_aligned_seq_cst_const:
 ; -O0:    ldaxp x0, x1, [x9]
 ; -O0:    cmp x0, x10
-; -O0:    cmp x1, x10
-; -O0:    stlxp w8, x10, x10, [x9]
+; -O0:    cmp x1, x11
+; -O0:    stlxp w8, x10, x11, [x9]
 ; -O0:    stlxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_seq_cst_const:
diff --git a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-v8a.ll b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-v8a.ll
index 045e080983d5f..8ba7f6c235696 100644
--- a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-v8a.ll
+++ b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-v8a.ll
@@ -231,8 +231,8 @@ define dso_local i128 @load_atomic_i128_aligned_unordered(ptr %ptr) {
 ; -O0-LABEL: load_atomic_i128_aligned_unordered:
 ; -O0:    ldxp x0, x1, [x9]
 ; -O0:    cmp x0, x10
-; -O0:    cmp x1, x10
-; -O0:    stxp w8, x10, x10, [x9]
+; -O0:    cmp x1, x11
+; -O0:    stxp w8, x10, x11, [x9]
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_unordered:
@@ -246,8 +246,8 @@ define dso_local i128 @load_atomic_i128_aligned_unordered_const(ptr readonly %pt
 ; -O0-LABEL: load_atomic_i128_aligned_unordered_const:
 ; -O0:    ldxp x0, x1, [x9]
 ; -O0:    cmp x0, x10
-; -O0:    cmp x1, x10
-; -O0:    stxp w8, x10, x10, [x9]
+; -O0:    cmp x1, x11
+; -O0:    stxp w8, x10, x11, [x9]
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_unordered_const:
@@ -261,8 +261,8 @@ define dso_local i128 @load_atomic_i128_aligned_monotonic(ptr %ptr) {
 ; -O0-LABEL: load_atomic_i128_aligned_monotonic:
 ; -O0:    ldxp x0, x1, [x9]
 ; -O0:    cmp x0, x10
-; -O0:    cmp x1, x10
-; -O0:    stxp w8, x10, x10, [x9]
+; -O0:    cmp x1, x11
+; -O0:    stxp w8, x10, x11, [x9]
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_monotonic:
@@ -276,8 +276,8 @@ define dso_local i128 @load_atomic_i128_aligned_monotonic_const(ptr readonly %pt
 ; -O0-LABEL: load_atomic_i128_aligned_monotonic_const:
 ; -O0:    ldxp x0, x1, [x9]
 ; -O0:    cmp x0, x10
-; -O0:    cmp x1, x10
-; -O0:    stxp w8, x10, x10, [x9]
+; -O0:    cmp x1, x11
+; -O0:    stxp w8, x10, x11, [x9]
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_monotonic_const:
@@ -291,8 +291,8 @@ define dso_local i128 @load_atomic_i128_aligned_acquire(ptr %ptr) {
 ; -O0-LABEL: load_atomic_i128_aligned_acquire:
 ; -O0:    ldaxp x0, x1, [x9]
 ; -O0:    cmp x0, x10
-; -O0:    cmp x1, x10
-; -O0:    stxp w8, x10, x10, [x9]
+; -O0:    cmp x1, x11
+; -O0:    stxp w8, x10, x11, [x9]
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_acquire:
@@ -306,8 +306,8 @@ define dso_local i128 @load_atomic_i128_aligned_acquire_const(ptr readonly %ptr)
 ; -O0-LABEL: load_atomic_i128_aligned_acquire_const:
 ; -O0:    ldaxp x0, x1, [x9]
 ; -O0:    cmp x0, x10
-; -O0:    cmp x1, x10
-; -O0:    stxp w8, x10, x10, [x9]
+; -O0:    cmp x1, x11
+; -O0:    stxp w8, x10, x11, [x9]
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_acquire_const:
@@ -321,8 +321,8 @@ define dso_local i128 @load_atomic_i128_aligned_seq_cst(ptr %ptr) {
 ; -O0-LABEL: load_atomic_i128_aligned_seq_cst:
 ; -O0:    ldaxp x0, x1, [x9]
 ; -O0:    cmp x0, x10
-; -O0:    cmp x1, x10
-; -O0:    stlxp w8, x10, x10, [x9]
+; -O0:    cmp x1, x11
+; -O0:    stlxp w8, x10, x11, [x9]
 ; -O0:    stlxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_seq_cst:
@@ -336,8 +336,8 @@ define dso_local i128 @load_atomic_i128_aligned_seq_cst_const(ptr readonly %ptr)
 ; -O0-LABEL: load_atomic_i128_aligned_seq_cst_const:
 ; -O0:    ldaxp x0, x1, [x9]
 ; -O0:    cmp x0, x10
-; -O0:    cmp x1, x10
-; -O0:    stlxp w8, x10, x10, [x9]
+; -O0:    cmp x1, x11
+; -O0:    stlxp w8, x10, x11, [x9]
 ; -O0:    stlxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_seq_cst_const:
diff --git a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-store-rcpc_immo.ll b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-store-rcpc_immo.ll
index 8c958459d93e8..3369db8b5f7d8 100644
--- a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-store-rcpc_immo.ll
+++ b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-store-rcpc_immo.ll
@@ -421,12 +421,11 @@ define void @store_atomic_i128_from_gep() {
 ; GISEL-LABEL: store_atomic_i128_from_gep:
 ; GISEL:    bl init
 ; GISEL:    dmb ish
-; GISEL:    stp x8, x8, [x9, #16]
+; GISEL:    stp x8, x9, [x10, #16]
 ;
 ; SDAG-LABEL: store_atomic_i128_from_gep:
 ; SDAG:    bl init
 ; SDAG:    dmb ish
-; SDAG:    stp xzr, xzr, [sp, #16]
   %a = alloca [3 x i128]
   call void @init(ptr %a)
   %arrayidx  = getelementptr [3 x i128], ptr %a, i64 0, i64 1
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic-128.ll b/llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic-128.ll
index be51210882eaa..04a76ffba9e2b 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic-128.ll
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic-128.ll
@@ -556,15 +556,16 @@ define void @atomic_load_relaxed(i64, i64, ptr %p, ptr %p2) {
 ; CHECK-LLSC-O0-LABEL: atomic_load_relaxed:
 ; CHECK-LLSC-O0:       // %bb.0:
 ; CHECK-LLSC-O0-NEXT:    mov x11, xzr
+; CHECK-LLSC-O0-NEXT:    mov x12, xzr
 ; CHECK-LLSC-O0-NEXT:  .LBB4_1: // =>This Inner Loop Header: Depth=1
 ; CHECK-LLSC-O0-NEXT:    ldxp x9, x8, [x2]
 ; CHECK-LLSC-O0-NEXT:    cmp x9, x11
 ; CHECK-LLSC-O0-NEXT:    cset w10, ne
-; CHECK-LLSC-O0-NEXT:    cmp x8, x11
+; CHECK-LLSC-O0-NEXT:    cmp x8, x12
 ; CHECK-LLSC-O0-NEXT:    cinc w10, w10, ne
 ; CHECK-LLSC-O0-NEXT:    cbnz w10, .LBB4_3
 ; CHECK-LLSC-O0-NEXT:  // %bb.2: // in Loop: Header=BB4_1 Depth=1
-; CHECK-LLSC-O0-NEXT:    stxp w10, x11, x11, [x2]
+; CHECK-LLSC-O0-NEXT:    stxp w10, x11, x12, [x2]
 ; CHECK-LLSC-O0-NEXT:    cbnz w10, .LBB4_1
 ; CHECK-LLSC-O0-NEXT:    b .LBB4_4
 ; CHECK-LLSC-O0-NEXT:  .LBB4_3: // in Loop: Header=BB4_1 Depth=1
@@ -585,10 +586,10 @@ define void @atomic_load_relaxed(i64, i64, ptr %p, ptr %p2) {
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    .cfi_offset w30, -16
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x4, x2
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    str x3, [sp, #8] // 8-byte Spill
+; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x2, xzr
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x3, xzr
-; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x0, x3
+; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x0, x2
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x1, x3
-; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x2, x3
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    bl __aarch64_cas16_relax
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x3, [sp, #8] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    // implicit-def: $q0
@@ -601,10 +602,11 @@ define void @atomic_load_relaxed(i64, i64, ptr %p, ptr %p2) {
 ;
 ; CHECK-CAS-O0-LABEL: atomic_load_relaxed:
 ; CHECK-CAS-O0:       // %bb.0:
+; CHECK-CAS-O0-NEXT:    mov x4, xzr
 ; CHECK-CAS-O0-NEXT:    mov x8, xzr
-; CHECK-CAS-O0-NEXT:    mov x0, x8
+; CHECK-CAS-O0-NEXT:    mov x0, x4
 ; CHECK-CAS-O0-NEXT:    mov x1, x8
-; CHECK-CAS-O0-NEXT:    mov x4, x8
+; CHECK-CAS-O0-NEXT:    // kill: def $x4 killed $x4 def $x4_x5
 ; CHECK-CAS-O0-NEXT:    mov x5, x8
 ; CHECK-CAS-O0-NEXT:    casp x0, x1, x4, x5, [x2]
 ; CHECK-CAS-O0-NEXT:    mov x9, x0
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator-gep.ll b/llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator-gep.ll
index 269597cbd730b..71cddad8c904d 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator-gep.ll
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator-gep.ll
@@ -17,10 +17,11 @@ define i32 @cse_gep(ptr %ptr, i32 %idx) {
   ; O0-NEXT:   [[PTR_ADD:%[0-9]+]]:_(p0) = nusw inbounds G_PTR_ADD [[COPY]], [[MUL]](i64)
   ; O0-NEXT:   [[COPY2:%[0-9]+]]:_(p0) = COPY [[PTR_ADD]](p0)
   ; O0-NEXT:   [[LOAD:%[0-9]+]]:_(i32) = G_LOAD [[COPY2]](p0) :: (load (i32) from %ir.gep1)
-  ; O0-NEXT:   [[MUL1:%[0-9]+]]:_(i64) = nsw G_MUL [[SEXT]], [[C]]
+  ; O0-NEXT:   [[C1:%[0-9]+]]:_(i64) = G_CONSTANT i64 16
+  ; O0-NEXT:   [[MUL1:%[0-9]+]]:_(i64) = nsw G_MUL [[SEXT]], [[C1]]
   ; O0-NEXT:   [[PTR_ADD1:%[0-9]+]]:_(p0) = nusw inbounds G_PTR_ADD [[COPY]], [[MUL1]](i64)
-  ; O0-NEXT:   [[C1:%[0-9]+]]:_(i64) = G_CONSTANT i64 4
-  ; O0-NEXT:   [[PTR_ADD2:%[0-9]+]]:_(p0) = nuw nusw inbounds G_PTR_ADD [[PTR_ADD1]], [[C1]](i64)
+  ; O0-NEXT:   [[C2:%[0-9]+]]:_(i64) = G_CONSTANT i64 4
+  ; O0-NEXT:   [[PTR_ADD2:%[0-9]+]]:_(p0) = nuw nusw inbounds G_PTR_ADD [[PTR_ADD1]], [[C2]](i64)
   ; O0-NEXT:   [[LOAD1:%[0-9]+]]:_(i32) = G_LOAD [[PTR_ADD2]](p0) :: (load (i32) from %ir.gep2)
   ; O0-NEXT:   [[ADD:%[0-9]+]]:_(i32) = G_ADD [[LOAD]], [[LOAD1]]
   ; O0-NEXT:   $w0 = COPY [[ADD]](i32)
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll b/llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll
index 8548f63bd1150..8af250766de74 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll
@@ -465,7 +465,7 @@ next:
 ; CHECK-LABEL: name: constant_int_start
 ; CHECK: [[TWO:%[0-9]+]]:_(i32) = G_CONSTANT i32 2
 ; CHECK: [[ANSWER:%[0-9]+]]:_(i32) = G_CONSTANT i32 42
-; CHECK: [[RES:%[0-9]+]]:_(i32) = G_CONSTANT i32 44
+; CHECK: [[RES:%[0-9]+]]:_(i32) = G_ADD [[TWO]], [[ANSWER]]
 define i32 @constant_int_start() {
   %res = add i32 2, 42
   ret i32 %res
@@ -605,7 +605,8 @@ define ptr @test_constant_null() {
 ; CHECK: [[GEP1:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD [[ADDR]], [[CST1]](i64)
 ; CHECK: [[VAL2:%[0-9]+]]:_(i32) = G_LOAD [[GEP1]](p0) :: (load (i32) from %ir.addr + 4)
 ; CHECK: G_STORE [[VAL1]](i8), [[ADDR]](p0) :: (store (i8) into %ir.addr, align 4)
-; CHECK: [[GEP2:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD [[ADDR]], [[CST1]](i64)
+; CHECK: [[CST1B:%[0-9]+]]:_(i64) = G_CONSTANT i64 4
+; CHECK: [[GEP2:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD [[ADDR]], [[CST1B]](i64)
 ; CHECK: G_STORE [[VAL2]](i32), [[GEP2]](p0) :: (store (i32) into %ir.addr + 4)
 define void @test_struct_memops(ptr %addr) {
   %val = load { i8, i32 }, ptr %addr
@@ -832,7 +833,8 @@ define i32 @test_extractvalue(ptr %addr) {
 ; CHECK: [[GEP3:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST3]](i64)
 ; CHECK: [[LD4:%[0-9]+]]:_(i32) = G_LOAD [[GEP3]](p0) :: (load (i32) from %ir.addr + 12)
 ; CHECK: G_STORE [[LD2]](i8), %1(p0) :: (store (i8) into %ir.addr2, align 4)
-; CHECK: [[GEP4:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %1, [[CST1]](i64)
+; CHECK: [[CST4:%[0-9]+]]:_(i64) = G_CONSTANT i64 4
+; CHECK: [[GEP4:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %1, [[CST4]](i64)
 ; CHECK: G_STORE [[LD3]](i32), [[GEP4]](p0) :: (store (i32) into %ir.addr2 + 4)
 define void @test_extractvalue_agg(ptr %addr, ptr %addr2) {
   %struct = load %struct.nested, ptr %addr
@@ -866,11 +868,14 @@ define void @test_trivial_extract_ptr([1 x ptr] %s, i8 %val) {
 ; CHECK: [[GEP3:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST3]](i64)
 ; CHECK: [[LD4:%[0-9]+]]:_(i32) = G_LOAD [[GEP3]](p0) :: (load (i32) from %ir.addr + 12)
 ; CHECK: G_STORE [[LD1]](i8), %0(p0) :: (store (i8) into %ir.addr, align 4)
-; CHECK: [[GEP4:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST1]](i64)
+; CHECK: [[CST1B:%[0-9]+]]:_(i64) = G_CONSTANT i64 4
+; CHECK: [[GEP4:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST1B]](i64)
 ; CHECK: G_STORE [[LD2]](i8), [[GEP4]](p0) :: (store (i8) into %ir.addr + 4, align 4)
-; CHECK: [[GEP5:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST2]](i64)
+; CHECK: [[CST2B:%[0-9]+]]:_(i64) = G_CONSTANT i64 8
+; CHECK: [[GEP5:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST2B]](i64)
 ; CHECK: G_STORE %1(i32), [[GEP5]](p0) :: (store (i32) into %ir.addr + 8)
-; CHECK: [[GEP6:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST3]](i64)
+; CHECK: [[CST3B:%[0-9]+]]:_(i64) = G_CONSTANT i64 12
+; CHECK: [[GEP6:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST3B]](i64)
 ; CHECK: G_STORE [[LD4]](i32), [[GEP6]](p0) :: (store (i32) into %ir.addr + 12)
 define void @test_insertvalue(ptr %addr, i32 %val) {
   %struct = load %struct.nested, ptr %addr
@@ -905,7 +910,8 @@ define [1 x ptr] @test_trivial_insert_ptr([1 x ptr] %s, ptr %val) {
 ; CHECK: [[GEP1:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %1, [[CST1]](i64)
 ; CHECK: [[LD2:%[0-9]+]]:_(i32) = G_LOAD [[GEP1]](p0) :: (load (i32) from %ir.addr2 + 4)
 ; CHECK: [[LD3:%[0-9]+]]:_(i8) = G_LOAD %0(p0) :: (load (i8) from %ir.addr, align 4)
-; CHECK: [[GEP2:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST1]](i64)
+; CHECK: [[CST2:%[0-9]+]]:_(i64) = G_CONSTANT i64 4
+; CHECK: [[GEP2:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST2]](i64)
 ; CHECK: [[LD4:%[0-9]+]]:_(i8) = G_LOAD [[GEP2]](p0) :: (load (i8) from %ir.addr + 4, align 4)
 ; CHECK: [[CST3:%[0-9]+]]:_(i64) = G_CONSTANT i64 8
 ; CHECK: [[GEP3:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST3]](i64)
@@ -914,11 +920,14 @@ define [1 x ptr] @test_trivial_insert_ptr([1 x ptr] %s, ptr %val) {
 ; CHECK: [[GEP4:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST4]](i64)
 ; CHECK: [[LD6:%[0-9]+]]:_(i32) = G_LOAD [[GEP4]](p0) :: (load (i32) from %ir.addr + 12)
 ; CHECK: G_STORE [[LD3]](i8), %0(p0) :: (store (i8) into %ir.addr, align 4)
-; CHECK: [[GEP5:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST1]](i64)
+; CHECK: [[CST2B:%[0-9]+]]:_(i64) = G_CONSTANT i64 4
+; CHECK: [[GEP5:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST2B]](i64)
 ; CHECK: G_STORE [[LD1]](i8), [[GEP5]](p0) :: (store (i8) into %ir.addr + 4, align 4)
-; CHECK: [[GEP6:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST3]](i64)
+; CHECK: [[CST3B:%[0-9]+]]:_(i64) = G_CONSTANT i64 8
+; CHECK: [[GEP6:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST3B]](i64)
 ; CHECK: G_STORE [[LD2]](i32), [[GEP6]](p0) :: (store (i32) into %ir.addr + 8)
-; CHECK: [[GEP7:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST4]](i64)
+; CHECK: [[CST4B:%[0-9]+]]:_(i64) = G_CONSTANT i64 12
+; CHECK: [[GEP7:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST4B]](i64)
 ; CHECK: G_STORE [[LD6]](i32), [[GEP7]](p0) :: (store (i32) into %ir.addr + 12)
 define void @test_insertvalue_agg(ptr %addr, ptr %addr2) {
   %smallstruct = load {i8, i32}, ptr %addr2
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/call-translator-ios.ll b/llvm/test/CodeGen/AArch64/GlobalISel/call-translator-ios.ll
index d5ffcb2b9b556..7f37a07a387b5 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/call-translator-ios.ll
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/call-translator-ios.ll
@@ -70,8 +70,9 @@ define void @take_128bit_struct(ptr %ptr, [2 x i64] %in) {
 ; CHECK: [[ADDR:%[0-9]+]]:_(p0) = G_PTR_ADD [[SP]], [[OFF]](i64)
 ; CHECK: G_STORE [[LD1]](i64), [[ADDR]](p0) :: (store (i64) into stack, align 1)
 
-; CHECK: [[ADDR:%[0-9]+]]:_(p0) = G_PTR_ADD [[SP]], [[CST]]
-; CHECK: G_STORE [[LD2]](i64), [[ADDR]](p0) :: (store (i64) into stack + 8, align 1)
+; CHECK: [[CST2:%[0-9]+]]:_(i64) = G_CONSTANT i64 8
+; CHECK: [[ADDR2:%[0-9]+]]:_(p0) = G_PTR_ADD [[SP]], [[CST2]](i64)
+; CHECK: G_STORE [[LD2]](i64), [[ADDR2]](p0) :: (store (i64) into stack + 8, align 1)
 define void @test_split_struct(ptr %ptr) {
   %struct = load [2 x i64], ptr %ptr
   call void @take_split_struct(ptr null, i64 1, i64 2, i64 3,
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/call-translator.ll b/llvm/test/CodeGen/AArch64/GlobalISel/call-translator.ll
index 25baf6a295b14..d8b83c951b0c3 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/call-translator.ll
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/call-translator.ll
@@ -293,7 +293,8 @@ define void @take_128bit_struct(ptr %ptr, [2 x i64] %in) {
 ; CHECK: [[CST2:%[0-9]+]]:_(i64) = G_CONSTANT i64 0
 ; CHECK: [[GEP2:%[0-9]+]]:_(p0) = G_PTR_ADD [[SP]], [[CST2]](i64)
 ; CHECK: G_STORE [[LO]](i64), [[GEP2]](p0) :: (store (i64) into stack, align 1)
-; CHECK: [[GEP3:%[0-9]+]]:_(p0) = G_PTR_ADD [[SP]], [[CST]](i64)
+; CHECK: [[CST3:%[0-9]+]]:_(i64) = G_CONSTANT i64 8
+; CHECK: [[GEP3:%[0-9]+]]:_(p0) = G_PTR_ADD [[SP]], [[CST3]](i64)
 ; CHECK: G_STORE [[HI]](i64), [[GEP3]](p0) :: (store (i64) into stack + 8, align 1)
 define void @test_split_struct(ptr %ptr) {
   %struct = load [2 x i64], ptr %ptr
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-bitcast.ll b/llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-bitcast.ll
index 6749a32e237db..4f676283cd62d 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-bitcast.ll
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-bitcast.ll
@@ -37,22 +37,22 @@ define i32 @test_bitcast_invalid_vreg() {
   ; CHECK-NEXT:   [[C28:%[0-9]+]]:_(i32) = G_CONSTANT i32 29
   ; CHECK-NEXT:   [[C29:%[0-9]+]]:_(i32) = G_CONSTANT i32 30
   ; CHECK-NEXT:   [[C30:%[0-9]+]]:_(i32) = G_CONSTANT i32 100
-  ; CHECK-NEXT:   [[C31:%[0-9]+]]:_(i32) = G_CONSTANT i32 3
-  ; CHECK-NEXT:   [[C32:%[0-9]+]]:_(i32) = G_CONSTANT i32 7
-  ; CHECK-NEXT:   [[C33:%[0-9]+]]:_(i32) = G_CONSTANT i32 11
-  ; CHE...
[truncated]

arsenm

I thought the whole reason we had the CSE builder was to improve compile time. Is it worth enabling at all? Especially in all these legalizer tests where the instruction count multiplies in the end

c-rhodes · 2026-05-12T12:24:52Z

> I thought the whole reason we had the CSE builder was to improve compile time. Is it worth enabling at all? Especially in all these legalizer tests where the instruction count multiplies in the end

Do you mean at other opt levels? Looking at https://reviews.llvm.org/D52803, which added it, I don't get the impression it was added to improve compile time. If anything it's framed as the opposite (a small compile-time regression), but perhaps @aemerson knows the history better.

aemerson

It's definitely possible that the environment is just different now than 8 years ago when it was implemented. For one thing, the combiners are just more fleshed out.

I'm somewhat nervous about wholesale removing it but again if the data suggests it's not useful anymore it's fine with me. It is nice though that for testing purposes it creates simplified MIR for dumb folds.

Anyway, for this PR specifically at -O0 I think it's fine to do. I do have a question on one of the test changes...
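
For readers skimming the test churn above, here is a minimal sketch of the trade-off under discussion. It is a toy model, not LLVM's `MachineIRBuilder` or `CSEMIRBuilder` API: `ToyBuilder`, `buildConstant`, and `countConstantsForLoadStorePairs` are hypothetical names, and basic blocks and dominance are ignored. With the cache on, a repeated constant reuses its existing virtual register. With it off, each request emits a fresh `G_CONSTANT`, which is the pattern the regenerated `test_insertvalue` checks show for offsets 4, 8 and 12.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <initializer_list>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a machine-IR builder. Hypothetical; not LLVM's API.
struct ToyBuilder {
  bool EnableCSE;                 // assumed switch, e.g. turned off at -O0
  std::vector<std::string> Insts; // emitted instructions, in order
  std::map<std::pair<unsigned, int64_t>, unsigned> ConstCache; // (bits, value) -> vreg
  unsigned NextVReg = 0;

  explicit ToyBuilder(bool CSE) : EnableCSE(CSE) {}

  // Emit a G_CONSTANT-like instruction, or reuse an earlier one when CSE is on.
  // Returns the virtual register number holding the constant.
  unsigned buildConstant(unsigned Bits, int64_t Val) {
    assert(Bits > 0 && "constant must have a non-zero width");
    const auto Key = std::make_pair(Bits, Val);
    if (EnableCSE) {
      auto It = ConstCache.find(Key);
      if (It != ConstCache.end())
        return It->second; // cache hit: no new instruction
    }
    const unsigned VReg = NextVReg++;
    const std::string Ty = "i" + std::to_string(Bits);
    Insts.push_back("%" + std::to_string(VReg) + ":_(" + Ty + ") = G_CONSTANT " +
                    Ty + " " + std::to_string(Val));
    if (EnableCSE)
      ConstCache[Key] = VReg; // bookkeeping is only paid when CSE is on
    return VReg;
  }
};

// Request offsets 4, 8, 12 twice (once for the loads, once for the stores)
// and report how many constant instructions were emitted.
inline std::size_t countConstantsForLoadStorePairs(bool EnableCSE) {
  ToyBuilder B(EnableCSE);
  for (int Pass = 0; Pass < 2; ++Pass)
    for (int64_t Off : {4, 8, 12})
      B.buildConstant(64, Off);
  return B.Insts.size();
}
```

In this model the CSE path emits fewer instructions but pays for a map lookup and insert on every build. The measurements quoted in this PR suggest that, at O0, skipping that bookkeeping is the cheaper option overall.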

c-rhodes force-pushed the perf/gisel-disable-legalizer-cse-O0 branch from 7958550 to 98bfcb7 Compare May 12, 2026 06:46

c-rhodes changed the title ~~[GlobalISel] Disable Legalizer CSE at O0~~ [GlobalISel] Disable CSE in IRTranslator & Legalizer at O0 May 12, 2026

c-rhodes added 4 commits May 12, 2026 09:26

[GlobalISel] Disable Legalizer CSE at O0

8c725d6

re-generate tests

c04458c

also disable in IRTranslator

f8e4e33

re-generate tests

2cd7f4f

c-rhodes force-pushed the perf/gisel-disable-legalizer-cse-O0 branch from a05963a to 2cd7f4f Compare May 12, 2026 09:44

c-rhodes marked this pull request as ready for review May 12, 2026 10:30

llvmorg-github-actions Bot added backend:ARM backend:AArch64 backend:AMDGPU backend:MIPS backend:WebAssembly backend:X86 llvm:codegen llvm:globalisel labels May 12, 2026

c-rhodes requested review from aemerson, aengelke, arsenm and davemgreen May 12, 2026 10:31

arsenm reviewed May 12, 2026

aemerson reviewed May 12, 2026

Comment thread llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-store-rcpc_immo.ll

address comments

0812b44

c-rhodes wants to merge 5 commits into llvm:main from c-rhodes:perf/gisel-disable-legalizer-cse-O0

3 participants
