[GlobalISel] Disable CSE in IRTranslator & Legalizer at O0 #197000

Open
c-rhodes wants to merge 5 commits into llvm:main from c-rhodes:perf/gisel-disable-legalizer-cse-O0

Conversation

@c-rhodes
Contributor

@c-rhodes c-rhodes commented May 11, 2026

CTMark -0.54% geomean improvement on stage1-aarch64-O0-g [1] with no change to
code-size [2]. Sqlite is -1.30%.

I also measured without -g locally and geomean is -0.62%:

                 instructions:u                 diff
                            old           new
7zip               131599609457  131463795723 -0.10%
Bullet              57300798645   57215708638 -0.15%
ClamAV              12687796637   12595161387 -0.73%
SPASS               12494927389   12442127229 -0.42%
consumer-typeset    10419961416   10306472249 -1.09%
kimwitu++           20354279265   20261925075 -0.45%
lencod              10986384595   10899231247 -0.79%
mafft                6039142913    6005083779 -0.56%
sqlite3              3582744938    3525792360 -1.59%
tramp3d-v4          16245375037   16196807778 -0.30%
geomean             15815512869   15717416669 -0.62%
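
The diff column and the geomean row above can be reproduced from the raw counts. The following is a minimal sketch, not the tooling used for the measurement: the names `kOld`, `kNew`, `geomean` and `percentChange` are made up for illustration, and the numbers are copied from the table in row order.

```cpp
#include <cmath>
#include <vector>

// instructions:u counts copied from the table above (measured without -g).
// Row order: 7zip, Bullet, ClamAV, SPASS, consumer-typeset, kimwitu++,
// lencod, mafft, sqlite3, tramp3d-v4.
const std::vector<double> kOld = {
    131599609457.0, 57300798645.0, 12687796637.0, 12494927389.0,
    10419961416.0,  20354279265.0, 10986384595.0, 6039142913.0,
    3582744938.0,   16245375037.0};
const std::vector<double> kNew = {
    131463795723.0, 57215708638.0, 12595161387.0, 12442127229.0,
    10306472249.0,  20261925075.0, 10899231247.0, 6005083779.0,
    3525792360.0,   16196807778.0};

// Geometric mean computed as exp(mean(log(x))), which avoids overflowing
// the product of ten values of this magnitude.
double geomean(const std::vector<double> &Values) {
  double LogSum = 0.0;
  for (double V : Values)
    LogSum += std::log(V);
  return std::exp(LogSum / static_cast<double>(Values.size()));
}

// Relative change from Old to New in percent; negative means fewer
// instructions were executed after the change.
double percentChange(double Old, double New) {
  return (New - Old) / Old * 100.0;
}
```

For example, `percentChange(kOld[8], kNew[8])` gives the sqlite3 row, and `percentChange(geomean(kOld), geomean(kNew))` gives the overall figure. The geomean is used rather than the arithmetic mean so that large benchmarks such as 7zip do not dominate the summary.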

CSE at O0 was enabled for constants only in 946b124, which improved
compile-time and code-size.

Assisted-by: codex

[1] https://llvm-compile-time-tracker.com/compare.php?from=f0c84b3ac80c4c594045aff9b4f88ba564614361&to=98bfcb7aa08ba8ecf5ed5a89e64d2e7f0878be5b&stat=instructions%3Au
[2] https://llvm-compile-time-tracker.com/compare.php?from=f0c84b3ac80c4c594045aff9b4f88ba564614361&to=98bfcb7aa08ba8ecf5ed5a89e64d2e7f0878be5b&stat=size-total

@c-rhodes
Contributor Author

https://llvm-compile-time-tracker.com/compare.php?from=ed50ea52004259af958bb3e5636268342c49ee62&to=0fce1265d04213d57b031eb746296c2c77b7eb1f&stat=instructions%3Au

This shows positive results: a -0.31% geomean improvement on aarch64-O0-g with no change in code size. @aemerson, I noticed you turned this on a while back (946b124), but only for constants at O0. Is this result unexpected?

It breaks tons of tests, of course, and I haven't updated them yet, hence the draft. I just wanted to check before posting a full PR.
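
Much of the test churn visible in the truncated diff further down is of one kind: a constant that used to be shared is now emitted again at each use (for example a second `G_CONSTANT i64 4`). A toy model of that trade-off follows. It is a sketch under stated assumptions only: `ToyBuilder` and its members are hypothetical and do not correspond to LLVM's builder classes.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Toy instruction builder; hypothetical, not LLVM's MachineIRBuilder.
struct ToyBuilder {
  bool EnableCSE;                 // whether identical constants are shared
  std::vector<int64_t> Emitted;   // one entry per emitted constant instruction
  std::map<int64_t, int> Seen;    // constant value -> register that holds it

  // Returns the register holding Value, emitting a new instruction unless
  // CSE is on and an identical constant was emitted earlier.
  int buildConstant(int64_t Value) {
    if (EnableCSE) {
      auto It = Seen.find(Value);
      if (It != Seen.end())
        return It->second;        // reuse: pays for a lookup, saves an instruction
    }
    int Reg = static_cast<int>(Emitted.size());
    Emitted.push_back(Value);
    if (EnableCSE)
      Seen.emplace(Value, Reg);   // bookkeeping cost paid on every new constant
    return Reg;
  }
};
```

With `EnableCSE` set, two requests for the constant 4 return the same register and emit one instruction; with it cleared, they return two registers and emit two instructions, but no lookup or bookkeeping is performed. That is the compile-time versus instruction-count trade-off being measured in this PR.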

@aemerson
Contributor

I am quite surprised at these results, but I guess we may have added the same CSE combines into the O0 prelegalizer combiner anyway.

In principle, if this doesn't regress anything, I'm not opposed to it, and a 0.3% compile-time saving is quite significant.

@c-rhodes c-rhodes force-pushed the perf/gisel-disable-legalizer-cse-O0 branch from 7958550 to 98bfcb7 on May 12, 2026 06:46
@github-actions

github-actions Bot commented May 12, 2026

🐧 Linux x64 Test Results

  • 195291 tests passed
  • 5202 tests skipped

✅ The build succeeded and all tests passed.

@github-actions

github-actions Bot commented May 12, 2026

🪟 Windows x64 Test Results

  • 134663 tests passed
  • 3269 tests skipped

✅ The build succeeded and all tests passed.

@c-rhodes
Contributor Author

Disabling it for IRTranslator as well improves things further, still with no impact on code size (geomean -0.54%, sqlite -1.30%):
https://llvm-compile-time-tracker.com/compare.php?from=f0c84b3ac80c4c594045aff9b4f88ba564614361&to=98bfcb7aa08ba8ecf5ed5a89e64d2e7f0878be5b&stat=instructions%3Au
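
The file list further down shows a single non-test change, in TargetPassConfig.cpp, even though the PR covers both IRTranslator and the Legalizer. A minimal sketch of how one hook could gate both passes, assuming each pass consults the same query before using CSE. `OptLevel`, `ToyPassConfig` and the two helper functions are hypothetical names chosen for illustration, not LLVM's API.

```cpp
// Hypothetical stand-in for an optimization-level enum; not LLVM's header.
enum class OptLevel { None, Less, Default, Aggressive };

// Toy pass configuration. The hook mirrors the shape of the one-line change:
// CSE is reported as enabled at every level except O0.
struct ToyPassConfig {
  OptLevel Level;
  bool isCSEEnabled() const { return Level != OptLevel::None; }
};

// Assumed for illustration: both passes ask the same hook, so flipping the
// hook changes the behaviour of both at once.
bool translatorUsesCSE(const ToyPassConfig &Config) {
  return Config.isCSEEnabled();
}
bool legalizerUsesCSE(const ToyPassConfig &Config) {
  return Config.isCSEEnabled();
}
```

Centralising the decision in one query keeps the two passes consistent, at the cost of not being able to tune them independently from the configuration alone.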

@c-rhodes c-rhodes changed the title from "[GlobalISel] Disable Legalizer CSE at O0" to "[GlobalISel] Disable CSE in IRTranslator & Legalizer at O0" on May 12, 2026
@llvmorg-github-actions

llvmorg-github-actions Bot commented May 12, 2026

@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-backend-webassembly
@llvm/pr-subscribers-backend-arm
@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-backend-aarch64

Author: Cullen Rhodes (c-rhodes)


Patch is 6.07 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/197000.diff

140 Files Affected:

  • (modified) llvm/lib/CodeGen/TargetPassConfig.cpp (+1-1)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc.ll (+16-16)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-v8a.ll (+16-16)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-store-rcpc_immo.ll (+1-2)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic-128.ll (+8-6)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator-gep.ll (+4-3)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll (+19-10)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/call-translator-ios.ll (+3-2)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/call-translator.ll (+2-1)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-bitcast.ll (+16-16)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-gep-flags.ll (+20-15)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-sincos.ll (+2-1)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-tbaa.ll (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-build-vector.mir (+4-2)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-cmp.mir (+152-74)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-concat-vectors.mir (+8-2)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-constant.mir (+48-38)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-ctlz.mir (+11-7)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-div.mir (+2-1)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-ext.mir (+6-4)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-extract-vector-elt.mir (+4-3)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-fptrunc.mir (+2-1)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-freeze.mir (+7-2)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-fshl.mir (+149-75)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-fshr.mir (+131-66)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-insert-vector-elt.mir (+9-3)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-inserts.mir (+298-133)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-load-store.mir (+38-21)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-merge-values.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-or.mir (+32-19)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-phi-insertpt-decrement.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-phi.mir (+10-8)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-rem.mir (+2-1)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-select.mir (+16-10)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-shift.mir (+33-19)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-shuffle-vector.mir (+16-12)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-simple.mir (+8-4)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-store-vector-bools.mir (+46-31)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-threeway-cmp.mir (+5-4)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-uadd-sat.mir (+18-12)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-undef.mir (+6-3)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-usub-sat.mir (+18-12)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-vaarg.mir (+15-13)
  • (modified) llvm/test/CodeGen/AArch64/aarch64-mops-mte.ll (+22-21)
  • (modified) llvm/test/CodeGen/AArch64/aarch64-tbz.ll (+30-23)
  • (modified) llvm/test/CodeGen/AArch64/popcount.ll (+8-6)
  • (modified) llvm/test/CodeGen/AArch64/pr48188.ll (+1-1)
  • (modified) llvm/test/CodeGen/AArch64/pr53315-returned-i128.ll (+3-2)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/artifact-combiner-extract.mir (+13-7)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/artifact-combiner-sext.mir (+4-3)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/artifact-combiner-unmerge-values.mir (+142-77)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/artifact-combiner-zext.mir (+4-4)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-constant-fold-vector-op.ll (+2-3)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-function-args.ll (+11-9)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-add.mir (+42-23)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-addrspacecast.mir (+14-7)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-and.mir (+170-66)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-ashr.mir (+234-144)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-bitreverse.mir (+6-3)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-bswap.mir (+67-56)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-build-vector-trunc.mir (+5-3)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-build-vector.s16.mir (+173-90)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-ctls.mir (+24-12)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-ctlz-zero-poison.mir (+10-6)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-ctlz.mir (+20-10)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-ctpop.mir (+2-1)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-cttz-zero-poison.mir (+2-1)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-cttz.mir (+8-4)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fcmp.mir (+18-9)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-freeze.mir (+66-21)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fshl.mir (+277-177)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fshr.mir (+248-161)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-icmp.mir (+58-30)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-implicit-def-s1025.mir (+186-124)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-implicit-def.mir (+51-23)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-constant-32bit.mir (+6-4)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-constant.mir (+2551-1503)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-flat.mir (+3270-1927)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-global.mir (+4200-2521)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-local.mir (+5010-2905)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-private.mir (+5509-3162)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-lshr.mir (+277-165)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-merge-values.mir (+529-289)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-mul.mir (+46-25)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-or.mir (+100-42)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-saddo.mir (+102-67)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-sbfx.mir (+4-2)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-sdiv.mir (+382-247)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-select.mir (+46-20)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-sext-inreg.mir (+155-89)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-sext.mir (+56-29)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-shl.mir (+270-168)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-shuffle-vector.mir (+45-19)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-smulh.mir (+80-46)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-smulo.mir (+58-34)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-srem.mir (+320-197)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-ssubo.mir (+102-67)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-store-global.mir (+2845-1803)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-store.mir (+262-194)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-sub.mir (+64-23)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-ubfx.mir (+8-4)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-udiv.mir (+494-372)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-umulh.mir (+214-128)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-umulo.mir (+218-124)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-unmerge-values.mir (+111-81)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-urem.mir (+394-284)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-xor.mir (+100-42)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/no-ctlz-from-umul-to-lshr-in-postlegalizer.ll (+39-32)
  • (modified) llvm/test/CodeGen/AMDGPU/div_i128.ll (+251-218)
  • (modified) llvm/test/CodeGen/AMDGPU/overlapping-tuple-copy-implicit-op-failure.ll (+56-57)
  • (modified) llvm/test/CodeGen/ARM/GlobalISel/arm-legalize-bitcounts.mir (+6-4)
  • (modified) llvm/test/CodeGen/ARM/GlobalISel/arm-legalize-divmod.mir (+20-12)
  • (modified) llvm/test/CodeGen/ARM/GlobalISel/arm-legalize-fp.mir (+8-4)
  • (modified) llvm/test/CodeGen/Mips/GlobalISel/irtranslator/aggregate_struct_return.ll (+14-12)
  • (modified) llvm/test/CodeGen/Mips/GlobalISel/legalizer/add.mir (+4-2)
  • (modified) llvm/test/CodeGen/Mips/GlobalISel/legalizer/bitwise.mir (+4-2)
  • (modified) llvm/test/CodeGen/Mips/GlobalISel/legalizer/constants.mir (+8-4)
  • (modified) llvm/test/CodeGen/Mips/GlobalISel/legalizer/ctlz.mir (+2-1)
  • (modified) llvm/test/CodeGen/Mips/GlobalISel/legalizer/ctpop.mir (+19-11)
  • (modified) llvm/test/CodeGen/Mips/GlobalISel/legalizer/cttz.mir (+32-20)
  • (modified) llvm/test/CodeGen/Mips/GlobalISel/legalizer/icmp.mir (+5-3)
  • (modified) llvm/test/CodeGen/Mips/GlobalISel/legalizer/implicit_def.mir (+2-1)
  • (modified) llvm/test/CodeGen/Mips/GlobalISel/legalizer/rem_and_div.mir (+204-176)
  • (modified) llvm/test/CodeGen/Mips/GlobalISel/legalizer/sub.mir (+4-2)
  • (modified) llvm/test/CodeGen/Mips/GlobalISel/legalizer/truncStore_and_aExtLoad.mir (+2-1)
  • (modified) llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/bitreverse.ll (+60-50)
  • (modified) llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/bitwise.ll (+12-12)
  • (modified) llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/constants.ll (+4-2)
  • (modified) llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/ctlz.ll (+1-1)
  • (modified) llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/ctpop.ll (+25-17)
  • (modified) llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/cttz.ll (+33-28)
  • (modified) llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/implicit_def.ll (+9-5)
  • (modified) llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/long_ambiguous_chain_s64.ll (+67-65)
  • (modified) llvm/test/CodeGen/WebAssembly/GlobalISel/instructions/constant.ll (+6-2)
  • (modified) llvm/test/CodeGen/WebAssembly/GlobalISel/instructions/fshl.ll (+78-42)
  • (modified) llvm/test/CodeGen/WebAssembly/GlobalISel/instructions/fshr.ll (+72-36)
  • (modified) llvm/test/CodeGen/X86/GlobalISel/irtranslator-callingconv.ll (+4-2)
  • (modified) llvm/test/CodeGen/X86/GlobalISel/legalize-ext-x86-64.mir (+2-2)
  • (modified) llvm/test/CodeGen/X86/GlobalISel/legalize-memop-scalar-32.mir (+2-1)
  • (modified) llvm/test/CodeGen/X86/GlobalISel/x86_64-legalize-sitofp.mir (+64-56)
diff --git a/llvm/lib/CodeGen/TargetPassConfig.cpp b/llvm/lib/CodeGen/TargetPassConfig.cpp
index 096e9a6f2b1dc..c3af2acc622ba 100644
--- a/llvm/lib/CodeGen/TargetPassConfig.cpp
+++ b/llvm/lib/CodeGen/TargetPassConfig.cpp
@@ -1587,7 +1587,7 @@ bool TargetPassConfig::reportDiagnosticWhenGlobalISelFallback() const {
 }
 
 bool TargetPassConfig::isGISelCSEEnabled() const {
-  return true;
+  return getOptLevel() != CodeGenOptLevel::None;
 }
 
 std::unique_ptr<CSEConfigBase> TargetPassConfig::getCSEConfig() const {
diff --git a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc.ll b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc.ll
index 373b040ebec65..7c08cf1273b32 100644
--- a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc.ll
+++ b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc.ll
@@ -231,8 +231,8 @@ define dso_local i128 @load_atomic_i128_aligned_unordered(ptr %ptr) {
 ; -O0-LABEL: load_atomic_i128_aligned_unordered:
 ; -O0:    ldxp x0, x1, [x9]
 ; -O0:    cmp x0, x10
-; -O0:    cmp x1, x10
-; -O0:    stxp w8, x10, x10, [x9]
+; -O0:    cmp x1, x11
+; -O0:    stxp w8, x10, x11, [x9]
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_unordered:
@@ -246,8 +246,8 @@ define dso_local i128 @load_atomic_i128_aligned_unordered_const(ptr readonly %pt
 ; -O0-LABEL: load_atomic_i128_aligned_unordered_const:
 ; -O0:    ldxp x0, x1, [x9]
 ; -O0:    cmp x0, x10
-; -O0:    cmp x1, x10
-; -O0:    stxp w8, x10, x10, [x9]
+; -O0:    cmp x1, x11
+; -O0:    stxp w8, x10, x11, [x9]
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_unordered_const:
@@ -261,8 +261,8 @@ define dso_local i128 @load_atomic_i128_aligned_monotonic(ptr %ptr) {
 ; -O0-LABEL: load_atomic_i128_aligned_monotonic:
 ; -O0:    ldxp x0, x1, [x9]
 ; -O0:    cmp x0, x10
-; -O0:    cmp x1, x10
-; -O0:    stxp w8, x10, x10, [x9]
+; -O0:    cmp x1, x11
+; -O0:    stxp w8, x10, x11, [x9]
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_monotonic:
@@ -276,8 +276,8 @@ define dso_local i128 @load_atomic_i128_aligned_monotonic_const(ptr readonly %pt
 ; -O0-LABEL: load_atomic_i128_aligned_monotonic_const:
 ; -O0:    ldxp x0, x1, [x9]
 ; -O0:    cmp x0, x10
-; -O0:    cmp x1, x10
-; -O0:    stxp w8, x10, x10, [x9]
+; -O0:    cmp x1, x11
+; -O0:    stxp w8, x10, x11, [x9]
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_monotonic_const:
@@ -291,8 +291,8 @@ define dso_local i128 @load_atomic_i128_aligned_acquire(ptr %ptr) {
 ; -O0-LABEL: load_atomic_i128_aligned_acquire:
 ; -O0:    ldaxp x0, x1, [x9]
 ; -O0:    cmp x0, x10
-; -O0:    cmp x1, x10
-; -O0:    stxp w8, x10, x10, [x9]
+; -O0:    cmp x1, x11
+; -O0:    stxp w8, x10, x11, [x9]
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_acquire:
@@ -306,8 +306,8 @@ define dso_local i128 @load_atomic_i128_aligned_acquire_const(ptr readonly %ptr)
 ; -O0-LABEL: load_atomic_i128_aligned_acquire_const:
 ; -O0:    ldaxp x0, x1, [x9]
 ; -O0:    cmp x0, x10
-; -O0:    cmp x1, x10
-; -O0:    stxp w8, x10, x10, [x9]
+; -O0:    cmp x1, x11
+; -O0:    stxp w8, x10, x11, [x9]
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_acquire_const:
@@ -321,8 +321,8 @@ define dso_local i128 @load_atomic_i128_aligned_seq_cst(ptr %ptr) {
 ; -O0-LABEL: load_atomic_i128_aligned_seq_cst:
 ; -O0:    ldaxp x0, x1, [x9]
 ; -O0:    cmp x0, x10
-; -O0:    cmp x1, x10
-; -O0:    stlxp w8, x10, x10, [x9]
+; -O0:    cmp x1, x11
+; -O0:    stlxp w8, x10, x11, [x9]
 ; -O0:    stlxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_seq_cst:
@@ -336,8 +336,8 @@ define dso_local i128 @load_atomic_i128_aligned_seq_cst_const(ptr readonly %ptr)
 ; -O0-LABEL: load_atomic_i128_aligned_seq_cst_const:
 ; -O0:    ldaxp x0, x1, [x9]
 ; -O0:    cmp x0, x10
-; -O0:    cmp x1, x10
-; -O0:    stlxp w8, x10, x10, [x9]
+; -O0:    cmp x1, x11
+; -O0:    stlxp w8, x10, x11, [x9]
 ; -O0:    stlxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_seq_cst_const:
diff --git a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-v8a.ll b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-v8a.ll
index 045e080983d5f..8ba7f6c235696 100644
--- a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-v8a.ll
+++ b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-v8a.ll
@@ -231,8 +231,8 @@ define dso_local i128 @load_atomic_i128_aligned_unordered(ptr %ptr) {
 ; -O0-LABEL: load_atomic_i128_aligned_unordered:
 ; -O0:    ldxp x0, x1, [x9]
 ; -O0:    cmp x0, x10
-; -O0:    cmp x1, x10
-; -O0:    stxp w8, x10, x10, [x9]
+; -O0:    cmp x1, x11
+; -O0:    stxp w8, x10, x11, [x9]
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_unordered:
@@ -246,8 +246,8 @@ define dso_local i128 @load_atomic_i128_aligned_unordered_const(ptr readonly %pt
 ; -O0-LABEL: load_atomic_i128_aligned_unordered_const:
 ; -O0:    ldxp x0, x1, [x9]
 ; -O0:    cmp x0, x10
-; -O0:    cmp x1, x10
-; -O0:    stxp w8, x10, x10, [x9]
+; -O0:    cmp x1, x11
+; -O0:    stxp w8, x10, x11, [x9]
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_unordered_const:
@@ -261,8 +261,8 @@ define dso_local i128 @load_atomic_i128_aligned_monotonic(ptr %ptr) {
 ; -O0-LABEL: load_atomic_i128_aligned_monotonic:
 ; -O0:    ldxp x0, x1, [x9]
 ; -O0:    cmp x0, x10
-; -O0:    cmp x1, x10
-; -O0:    stxp w8, x10, x10, [x9]
+; -O0:    cmp x1, x11
+; -O0:    stxp w8, x10, x11, [x9]
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_monotonic:
@@ -276,8 +276,8 @@ define dso_local i128 @load_atomic_i128_aligned_monotonic_const(ptr readonly %pt
 ; -O0-LABEL: load_atomic_i128_aligned_monotonic_const:
 ; -O0:    ldxp x0, x1, [x9]
 ; -O0:    cmp x0, x10
-; -O0:    cmp x1, x10
-; -O0:    stxp w8, x10, x10, [x9]
+; -O0:    cmp x1, x11
+; -O0:    stxp w8, x10, x11, [x9]
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_monotonic_const:
@@ -291,8 +291,8 @@ define dso_local i128 @load_atomic_i128_aligned_acquire(ptr %ptr) {
 ; -O0-LABEL: load_atomic_i128_aligned_acquire:
 ; -O0:    ldaxp x0, x1, [x9]
 ; -O0:    cmp x0, x10
-; -O0:    cmp x1, x10
-; -O0:    stxp w8, x10, x10, [x9]
+; -O0:    cmp x1, x11
+; -O0:    stxp w8, x10, x11, [x9]
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_acquire:
@@ -306,8 +306,8 @@ define dso_local i128 @load_atomic_i128_aligned_acquire_const(ptr readonly %ptr)
 ; -O0-LABEL: load_atomic_i128_aligned_acquire_const:
 ; -O0:    ldaxp x0, x1, [x9]
 ; -O0:    cmp x0, x10
-; -O0:    cmp x1, x10
-; -O0:    stxp w8, x10, x10, [x9]
+; -O0:    cmp x1, x11
+; -O0:    stxp w8, x10, x11, [x9]
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_acquire_const:
@@ -321,8 +321,8 @@ define dso_local i128 @load_atomic_i128_aligned_seq_cst(ptr %ptr) {
 ; -O0-LABEL: load_atomic_i128_aligned_seq_cst:
 ; -O0:    ldaxp x0, x1, [x9]
 ; -O0:    cmp x0, x10
-; -O0:    cmp x1, x10
-; -O0:    stlxp w8, x10, x10, [x9]
+; -O0:    cmp x1, x11
+; -O0:    stlxp w8, x10, x11, [x9]
 ; -O0:    stlxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_seq_cst:
@@ -336,8 +336,8 @@ define dso_local i128 @load_atomic_i128_aligned_seq_cst_const(ptr readonly %ptr)
 ; -O0-LABEL: load_atomic_i128_aligned_seq_cst_const:
 ; -O0:    ldaxp x0, x1, [x9]
 ; -O0:    cmp x0, x10
-; -O0:    cmp x1, x10
-; -O0:    stlxp w8, x10, x10, [x9]
+; -O0:    cmp x1, x11
+; -O0:    stlxp w8, x10, x11, [x9]
 ; -O0:    stlxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_seq_cst_const:
diff --git a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-store-rcpc_immo.ll b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-store-rcpc_immo.ll
index 8c958459d93e8..3369db8b5f7d8 100644
--- a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-store-rcpc_immo.ll
+++ b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-store-rcpc_immo.ll
@@ -421,12 +421,11 @@ define void @store_atomic_i128_from_gep() {
 ; GISEL-LABEL: store_atomic_i128_from_gep:
 ; GISEL:    bl init
 ; GISEL:    dmb ish
-; GISEL:    stp x8, x8, [x9, #16]
+; GISEL:    stp x8, x9, [x10, #16]
 ;
 ; SDAG-LABEL: store_atomic_i128_from_gep:
 ; SDAG:    bl init
 ; SDAG:    dmb ish
-; SDAG:    stp xzr, xzr, [sp, #16]
   %a = alloca [3 x i128]
   call void @init(ptr %a)
   %arrayidx  = getelementptr [3 x i128], ptr %a, i64 0, i64 1
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic-128.ll b/llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic-128.ll
index be51210882eaa..04a76ffba9e2b 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic-128.ll
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic-128.ll
@@ -556,15 +556,16 @@ define void @atomic_load_relaxed(i64, i64, ptr %p, ptr %p2) {
 ; CHECK-LLSC-O0-LABEL: atomic_load_relaxed:
 ; CHECK-LLSC-O0:       // %bb.0:
 ; CHECK-LLSC-O0-NEXT:    mov x11, xzr
+; CHECK-LLSC-O0-NEXT:    mov x12, xzr
 ; CHECK-LLSC-O0-NEXT:  .LBB4_1: // =>This Inner Loop Header: Depth=1
 ; CHECK-LLSC-O0-NEXT:    ldxp x9, x8, [x2]
 ; CHECK-LLSC-O0-NEXT:    cmp x9, x11
 ; CHECK-LLSC-O0-NEXT:    cset w10, ne
-; CHECK-LLSC-O0-NEXT:    cmp x8, x11
+; CHECK-LLSC-O0-NEXT:    cmp x8, x12
 ; CHECK-LLSC-O0-NEXT:    cinc w10, w10, ne
 ; CHECK-LLSC-O0-NEXT:    cbnz w10, .LBB4_3
 ; CHECK-LLSC-O0-NEXT:  // %bb.2: // in Loop: Header=BB4_1 Depth=1
-; CHECK-LLSC-O0-NEXT:    stxp w10, x11, x11, [x2]
+; CHECK-LLSC-O0-NEXT:    stxp w10, x11, x12, [x2]
 ; CHECK-LLSC-O0-NEXT:    cbnz w10, .LBB4_1
 ; CHECK-LLSC-O0-NEXT:    b .LBB4_4
 ; CHECK-LLSC-O0-NEXT:  .LBB4_3: // in Loop: Header=BB4_1 Depth=1
@@ -585,10 +586,10 @@ define void @atomic_load_relaxed(i64, i64, ptr %p, ptr %p2) {
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    .cfi_offset w30, -16
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x4, x2
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    str x3, [sp, #8] // 8-byte Spill
+; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x2, xzr
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x3, xzr
-; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x0, x3
+; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x0, x2
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x1, x3
-; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x2, x3
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    bl __aarch64_cas16_relax
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x3, [sp, #8] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    // implicit-def: $q0
@@ -601,10 +602,11 @@ define void @atomic_load_relaxed(i64, i64, ptr %p, ptr %p2) {
 ;
 ; CHECK-CAS-O0-LABEL: atomic_load_relaxed:
 ; CHECK-CAS-O0:       // %bb.0:
+; CHECK-CAS-O0-NEXT:    mov x4, xzr
 ; CHECK-CAS-O0-NEXT:    mov x8, xzr
-; CHECK-CAS-O0-NEXT:    mov x0, x8
+; CHECK-CAS-O0-NEXT:    mov x0, x4
 ; CHECK-CAS-O0-NEXT:    mov x1, x8
-; CHECK-CAS-O0-NEXT:    mov x4, x8
+; CHECK-CAS-O0-NEXT:    // kill: def $x4 killed $x4 def $x4_x5
 ; CHECK-CAS-O0-NEXT:    mov x5, x8
 ; CHECK-CAS-O0-NEXT:    casp x0, x1, x4, x5, [x2]
 ; CHECK-CAS-O0-NEXT:    mov x9, x0
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator-gep.ll b/llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator-gep.ll
index 269597cbd730b..71cddad8c904d 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator-gep.ll
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator-gep.ll
@@ -17,10 +17,11 @@ define i32 @cse_gep(ptr %ptr, i32 %idx) {
   ; O0-NEXT:   [[PTR_ADD:%[0-9]+]]:_(p0) = nusw inbounds G_PTR_ADD [[COPY]], [[MUL]](i64)
   ; O0-NEXT:   [[COPY2:%[0-9]+]]:_(p0) = COPY [[PTR_ADD]](p0)
   ; O0-NEXT:   [[LOAD:%[0-9]+]]:_(i32) = G_LOAD [[COPY2]](p0) :: (load (i32) from %ir.gep1)
-  ; O0-NEXT:   [[MUL1:%[0-9]+]]:_(i64) = nsw G_MUL [[SEXT]], [[C]]
+  ; O0-NEXT:   [[C1:%[0-9]+]]:_(i64) = G_CONSTANT i64 16
+  ; O0-NEXT:   [[MUL1:%[0-9]+]]:_(i64) = nsw G_MUL [[SEXT]], [[C1]]
   ; O0-NEXT:   [[PTR_ADD1:%[0-9]+]]:_(p0) = nusw inbounds G_PTR_ADD [[COPY]], [[MUL1]](i64)
-  ; O0-NEXT:   [[C1:%[0-9]+]]:_(i64) = G_CONSTANT i64 4
-  ; O0-NEXT:   [[PTR_ADD2:%[0-9]+]]:_(p0) = nuw nusw inbounds G_PTR_ADD [[PTR_ADD1]], [[C1]](i64)
+  ; O0-NEXT:   [[C2:%[0-9]+]]:_(i64) = G_CONSTANT i64 4
+  ; O0-NEXT:   [[PTR_ADD2:%[0-9]+]]:_(p0) = nuw nusw inbounds G_PTR_ADD [[PTR_ADD1]], [[C2]](i64)
   ; O0-NEXT:   [[LOAD1:%[0-9]+]]:_(i32) = G_LOAD [[PTR_ADD2]](p0) :: (load (i32) from %ir.gep2)
   ; O0-NEXT:   [[ADD:%[0-9]+]]:_(i32) = G_ADD [[LOAD]], [[LOAD1]]
   ; O0-NEXT:   $w0 = COPY [[ADD]](i32)
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll b/llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll
index 8548f63bd1150..8af250766de74 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll
@@ -465,7 +465,7 @@ next:
 ; CHECK-LABEL: name: constant_int_start
 ; CHECK: [[TWO:%[0-9]+]]:_(i32) = G_CONSTANT i32 2
 ; CHECK: [[ANSWER:%[0-9]+]]:_(i32) = G_CONSTANT i32 42
-; CHECK: [[RES:%[0-9]+]]:_(i32) = G_CONSTANT i32 44
+; CHECK: [[RES:%[0-9]+]]:_(i32) = G_ADD [[TWO]], [[ANSWER]]
 define i32 @constant_int_start() {
   %res = add i32 2, 42
   ret i32 %res
@@ -605,7 +605,8 @@ define ptr @test_constant_null() {
 ; CHECK: [[GEP1:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD [[ADDR]], [[CST1]](i64)
 ; CHECK: [[VAL2:%[0-9]+]]:_(i32) = G_LOAD [[GEP1]](p0) :: (load (i32) from %ir.addr + 4)
 ; CHECK: G_STORE [[VAL1]](i8), [[ADDR]](p0) :: (store (i8) into %ir.addr, align 4)
-; CHECK: [[GEP2:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD [[ADDR]], [[CST1]](i64)
+; CHECK: [[CST1B:%[0-9]+]]:_(i64) = G_CONSTANT i64 4
+; CHECK: [[GEP2:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD [[ADDR]], [[CST1B]](i64)
 ; CHECK: G_STORE [[VAL2]](i32), [[GEP2]](p0) :: (store (i32) into %ir.addr + 4)
 define void @test_struct_memops(ptr %addr) {
   %val = load { i8, i32 }, ptr %addr
@@ -832,7 +833,8 @@ define i32 @test_extractvalue(ptr %addr) {
 ; CHECK: [[GEP3:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST3]](i64)
 ; CHECK: [[LD4:%[0-9]+]]:_(i32) = G_LOAD [[GEP3]](p0) :: (load (i32) from %ir.addr + 12)
 ; CHECK: G_STORE [[LD2]](i8), %1(p0) :: (store (i8) into %ir.addr2, align 4)
-; CHECK: [[GEP4:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %1, [[CST1]](i64)
+; CHECK: [[CST4:%[0-9]+]]:_(i64) = G_CONSTANT i64 4
+; CHECK: [[GEP4:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %1, [[CST4]](i64)
 ; CHECK: G_STORE [[LD3]](i32), [[GEP4]](p0) :: (store (i32) into %ir.addr2 + 4)
 define void @test_extractvalue_agg(ptr %addr, ptr %addr2) {
   %struct = load %struct.nested, ptr %addr
@@ -866,11 +868,14 @@ define void @test_trivial_extract_ptr([1 x ptr] %s, i8 %val) {
 ; CHECK: [[GEP3:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST3]](i64)
 ; CHECK: [[LD4:%[0-9]+]]:_(i32) = G_LOAD [[GEP3]](p0) :: (load (i32) from %ir.addr + 12)
 ; CHECK: G_STORE [[LD1]](i8), %0(p0) :: (store (i8) into %ir.addr, align 4)
-; CHECK: [[GEP4:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST1]](i64)
+; CHECK: [[CST1B:%[0-9]+]]:_(i64) = G_CONSTANT i64 4
+; CHECK: [[GEP4:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST1B]](i64)
 ; CHECK: G_STORE [[LD2]](i8), [[GEP4]](p0) :: (store (i8) into %ir.addr + 4, align 4)
-; CHECK: [[GEP5:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST2]](i64)
+; CHECK: [[CST2B:%[0-9]+]]:_(i64) = G_CONSTANT i64 8
+; CHECK: [[GEP5:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST2B]](i64)
 ; CHECK: G_STORE %1(i32), [[GEP5]](p0) :: (store (i32) into %ir.addr + 8)
-; CHECK: [[GEP6:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST3]](i64)
+; CHECK: [[CST3B:%[0-9]+]]:_(i64) = G_CONSTANT i64 12
+; CHECK: [[GEP6:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST3B]](i64)
 ; CHECK: G_STORE [[LD4]](i32), [[GEP6]](p0) :: (store (i32) into %ir.addr + 12)
 define void @test_insertvalue(ptr %addr, i32 %val) {
   %struct = load %struct.nested, ptr %addr
@@ -905,7 +910,8 @@ define [1 x ptr] @test_trivial_insert_ptr([1 x ptr] %s, ptr %val) {
 ; CHECK: [[GEP1:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %1, [[CST1]](i64)
 ; CHECK: [[LD2:%[0-9]+]]:_(i32) = G_LOAD [[GEP1]](p0) :: (load (i32) from %ir.addr2 + 4)
 ; CHECK: [[LD3:%[0-9]+]]:_(i8) = G_LOAD %0(p0) :: (load (i8) from %ir.addr, align 4)
-; CHECK: [[GEP2:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST1]](i64)
+; CHECK: [[CST2:%[0-9]+]]:_(i64) = G_CONSTANT i64 4
+; CHECK: [[GEP2:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST2]](i64)
 ; CHECK: [[LD4:%[0-9]+]]:_(i8) = G_LOAD [[GEP2]](p0) :: (load (i8) from %ir.addr + 4, align 4)
 ; CHECK: [[CST3:%[0-9]+]]:_(i64) = G_CONSTANT i64 8
 ; CHECK: [[GEP3:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST3]](i64)
@@ -914,11 +920,14 @@ define [1 x ptr] @test_trivial_insert_ptr([1 x ptr] %s, ptr %val) {
 ; CHECK: [[GEP4:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST4]](i64)
 ; CHECK: [[LD6:%[0-9]+]]:_(i32) = G_LOAD [[GEP4]](p0) :: (load (i32) from %ir.addr + 12)
 ; CHECK: G_STORE [[LD3]](i8), %0(p0) :: (store (i8) into %ir.addr, align 4)
-; CHECK: [[GEP5:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST1]](i64)
+; CHECK: [[CST2B:%[0-9]+]]:_(i64) = G_CONSTANT i64 4
+; CHECK: [[GEP5:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST2B]](i64)
 ; CHECK: G_STORE [[LD1]](i8), [[GEP5]](p0) :: (store (i8) into %ir.addr + 4, align 4)
-; CHECK: [[GEP6:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST3]](i64)
+; CHECK: [[CST3B:%[0-9]+]]:_(i64) = G_CONSTANT i64 8
+; CHECK: [[GEP6:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST3B]](i64)
 ; CHECK: G_STORE [[LD2]](i32), [[GEP6]](p0) :: (store (i32) into %ir.addr + 8)
-; CHECK: [[GEP7:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST4]](i64)
+; CHECK: [[CST4B:%[0-9]+]]:_(i64) = G_CONSTANT i64 12
+; CHECK: [[GEP7:%[0-9]+]]:_(p0) = nuw inbounds G_PTR_ADD %0, [[CST4B]](i64)
 ; CHECK: G_STORE [[LD6]](i32), [[GEP7]](p0) :: (store (i32) into %ir.addr + 12)
 define void @test_insertvalue_agg(ptr %addr, ptr %addr2) {
   %smallstruct = load {i8, i32}, ptr %addr2
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/call-translator-ios.ll b/llvm/test/CodeGen/AArch64/GlobalISel/call-translator-ios.ll
index d5ffcb2b9b556..7f37a07a387b5 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/call-translator-ios.ll
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/call-translator-ios.ll
@@ -70,8 +70,9 @@ define void @take_128bit_struct(ptr %ptr, [2 x i64] %in) {
 ; CHECK: [[ADDR:%[0-9]+]]:_(p0) = G_PTR_ADD [[SP]], [[OFF]](i64)
 ; CHECK: G_STORE [[LD1]](i64), [[ADDR]](p0) :: (store (i64) into stack, align 1)
 
-; CHECK: [[ADDR:%[0-9]+]]:_(p0) = G_PTR_ADD [[SP]], [[CST]]
-; CHECK: G_STORE [[LD2]](i64), [[ADDR]](p0) :: (store (i64) into stack + 8, align 1)
+; CHECK: [[CST2:%[0-9]+]]:_(i64) = G_CONSTANT i64 8
+; CHECK: [[ADDR2:%[0-9]+]]:_(p0) = G_PTR_ADD [[SP]], [[CST2]](i64)
+; CHECK: G_STORE [[LD2]](i64), [[ADDR2]](p0) :: (store (i64) into stack + 8, align 1)
 define void @test_split_struct(ptr %ptr) {
   %struct = load [2 x i64], ptr %ptr
   call void @take_split_struct(ptr null, i64 1, i64 2, i64 3,
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/call-translator.ll b/llvm/test/CodeGen/AArch64/GlobalISel/call-translator.ll
index 25baf6a295b14..d8b83c951b0c3 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/call-translator.ll
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/call-translator.ll
@@ -293,7 +293,8 @@ define void @take_128bit_struct(ptr %ptr, [2 x i64] %in) {
 ; CHECK: [[CST2:%[0-9]+]]:_(i64) = G_CONSTANT i64 0
 ; CHECK: [[GEP2:%[0-9]+]]:_(p0) = G_PTR_ADD [[SP]], [[CST2]](i64)
 ; CHECK: G_STORE [[LO]](i64), [[GEP2]](p0) :: (store (i64) into stack, align 1)
-; CHECK: [[GEP3:%[0-9]+]]:_(p0) = G_PTR_ADD [[SP]], [[CST]](i64)
+; CHECK: [[CST3:%[0-9]+]]:_(i64) = G_CONSTANT i64 8
+; CHECK: [[GEP3:%[0-9]+]]:_(p0) = G_PTR_ADD [[SP]], [[CST3]](i64)
 ; CHECK: G_STORE [[HI]](i64), [[GEP3]](p0) :: (store (i64) into stack + 8, align 1)
 define void @test_split_struct(ptr %ptr) {
   %struct = load [2 x i64], ptr %ptr
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-bitcast.ll b/llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-bitcast.ll
index 6749a32e237db..4f676283cd62d 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-bitcast.ll
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-bitcast.ll
@@ -37,22 +37,22 @@ define i32 @test_bitcast_invalid_vreg() {
   ; CHECK-NEXT:   [[C28:%[0-9]+]]:_(i32) = G_CONSTANT i32 29
   ; CHECK-NEXT:   [[C29:%[0-9]+]]:_(i32) = G_CONSTANT i32 30
   ; CHECK-NEXT:   [[C30:%[0-9]+]]:_(i32) = G_CONSTANT i32 100
-  ; CHECK-NEXT:   [[C31:%[0-9]+]]:_(i32) = G_CONSTANT i32 3
-  ; CHECK-NEXT:   [[C32:%[0-9]+]]:_(i32) = G_CONSTANT i32 7
-  ; CHECK-NEXT:   [[C33:%[0-9]+]]:_(i32) = G_CONSTANT i32 11
-  ; CHE...
[truncated]
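
The test updates above follow one pattern: where a later `G_PTR_ADD` used to reuse an earlier `G_CONSTANT`, the expected output now materializes a fresh constant for each use. As a rough illustration of that difference, here is a toy C++ model of a builder that optionally deduplicates constants. It is a sketch only and does not use the LLVM API; `ToyInst`, `ToyBuilder`, `buildConstant`, `numInsts`, and `checkToyBuilder` are hypothetical names invented for this example.

```cpp
// Toy model only: none of these names exist in LLVM.
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// A drastically simplified stand-in for a generic machine instruction.
struct ToyInst {
  std::string Opcode; // e.g. "G_CONSTANT"
  unsigned TypeBits;  // e.g. 64 for i64
  int64_t Imm;        // immediate value
};

// Emits constants, optionally reusing an identical one emitted earlier.
class ToyBuilder {
public:
  explicit ToyBuilder(bool EnableCSE) : EnableCSE(EnableCSE) {}

  // Returns the index ("virtual register") of the instruction that
  // defines the requested constant.
  unsigned buildConstant(unsigned TypeBits, int64_t Imm) {
    const std::pair<unsigned, int64_t> Key(TypeBits, Imm);
    if (EnableCSE) {
      auto It = Seen.find(Key);
      if (It != Seen.end())
        return It->second; // Reuse: no new instruction is created.
    }
    Insts.push_back({"G_CONSTANT", TypeBits, Imm});
    const unsigned Reg = static_cast<unsigned>(Insts.size() - 1);
    if (EnableCSE)
      Seen.emplace(Key, Reg); // Bookkeeping cost paid only with CSE on.
    return Reg;
  }

  std::size_t numInsts() const { return Insts.size(); }

private:
  bool EnableCSE;
  std::vector<ToyInst> Insts;
  std::map<std::pair<unsigned, int64_t>, unsigned> Seen;
};

// Usage: requesting "i64 4" twice yields one instruction with CSE
// and two instructions without it.
inline void checkToyBuilder() {
  ToyBuilder WithCSE(/*EnableCSE=*/true);
  assert(WithCSE.buildConstant(64, 4) == WithCSE.buildConstant(64, 4));
  assert(WithCSE.numInsts() == 1);

  ToyBuilder NoCSE(/*EnableCSE=*/false);
  assert(NoCSE.buildConstant(64, 4) != NoCSE.buildConstant(64, 4));
  assert(NoCSE.numInsts() == 2);
}
```

In this toy model the non-CSE path skips the lookup and the map insertion entirely, at the price of emitting duplicate constants. That mirrors the trade-off the PR measures, but the sketch makes no claim about how the real builder is implemented.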

Contributor

@arsenm arsenm left a comment

I thought the whole reason we had the CSE builder was to improve compile time. Is it worth enabling at all? Especially given all these legalizer tests, where the instruction count multiplies in the end.

@c-rhodes
Contributor Author

> I thought the whole reason we had the CSE builder was to improve compile time. Is it worth enabling at all? Especially in all these legalizer tests where the instruction count multiplies in the end

Do you mean at other opt levels? Looking at https://reviews.llvm.org/D52803, which added it, I don't get the impression it was added to improve compile time; if anything, it's framed as the opposite (a small compile-time regression). Perhaps @aemerson knows the history better.

Contributor

@aemerson aemerson left a comment

It's definitely possible that the environment is just different now than it was 8 years ago, when this was implemented. For one thing, the combiners are more fleshed out.

I'm somewhat nervous about removing it wholesale, but again, if the data suggests it's not useful anymore, it's fine with me. It is nice, though, that for testing purposes it creates simplified MIR for dumb folds.

Anyway, for this PR specifically at -O0 I think it's fine to do. I do have a question on one of the test changes...

Comment thread llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-store-rcpc_immo.ll
3 participants