Skip to content

Conversation

ReVe1uv
Copy link
Contributor

@ReVe1uv ReVe1uv commented Sep 9, 2025

This change introduces legalization for G_ATOMIC_CMPXCHG and G_ATOMIC_CMPXCHG_WITH_SUCCESS. Additionally, support for the riscv_masked_cmpxchg intrinsic is added to legalizeIntrinsic, ensuring that masked compare-and-exchange operations are recognized during legalization.

…TH_SUCCESS

This change introduces legalization for G_ATOMIC_CMPXCHG and G_ATOMIC_CMPXCHG_WITH_SUCCESS.
Additionally, support for the riscv_masked_cmpxchg intrinsic is added to legalizeIntrinsic, ensuring that masked compare-and-exchange operations are recognized during legalization.
@llvmbot
Copy link
Member

llvmbot commented Sep 9, 2025

@llvm/pr-subscribers-llvm-globalisel

Author: Kane Wang (ReVe1uv)

Changes

This change introduces legalization for G_ATOMIC_CMPXCHG and G_ATOMIC_CMPXCHG_WITH_SUCCESS. Additionally, support for the riscv_masked_cmpxchg intrinsic is added to legalizeIntrinsic, ensuring that masked compare-and-exchange operations are recognized during legalization.


Patch is 256.98 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/157634.diff

7 Files Affected:

  • (modified) llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp (+5-1)
  • (added) llvm/test/CodeGen/RISCV/GlobalISel/atomic-cmpxchg.ll (+5910)
  • (added) llvm/test/CodeGen/RISCV/GlobalISel/instruction-select/atomic-cmpxchg-rv32.mir (+119)
  • (added) llvm/test/CodeGen/RISCV/GlobalISel/instruction-select/atomic-cmpxchg-rv64.mir (+144)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/legalizer-info-validation.mir (+5-4)
  • (added) llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-atomic-cmpxchg-rv32.mir (+155)
  • (added) llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-atomic-cmpxchg-rv64.mir (+240)
diff --git a/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp b/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
index ab5c9e17b9a37..21bcbbb0ded65 100644
--- a/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
+++ b/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
@@ -697,7 +697,10 @@ RISCVLegalizerInfo::RISCVLegalizerInfo(const RISCVSubtarget &ST)
       .customIf(all(typeIsLegalIntOrFPVec(0, IntOrFPVecTys, ST),
                     typeIsLegalIntOrFPVec(1, IntOrFPVecTys, ST)));
 
-  getActionDefinitionsBuilder(G_ATOMICRMW_ADD)
+  getActionDefinitionsBuilder(G_ATOMIC_CMPXCHG_WITH_SUCCESS)
+      .lowerIf(all(typeInSet(0, {s8, s16, s32, s64}), typeIs(2, p0)));
+
+  getActionDefinitionsBuilder({G_ATOMIC_CMPXCHG, G_ATOMICRMW_ADD})
       .legalFor(ST.hasStdExtA(), {{sXLen, p0}})
       .libcallFor(!ST.hasStdExtA(), {{s8, p0}, {s16, p0}, {s32, p0}, {s64, p0}})
       .clampScalar(0, sXLen, sXLen);
@@ -746,6 +749,7 @@ bool RISCVLegalizerInfo::legalizeIntrinsic(LegalizerHelper &Helper,
   }
   case Intrinsic::riscv_masked_atomicrmw_add:
   case Intrinsic::riscv_masked_atomicrmw_sub:
+  case Intrinsic::riscv_masked_cmpxchg:
     return true;
   }
 }
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/atomic-cmpxchg.ll b/llvm/test/CodeGen/RISCV/GlobalISel/atomic-cmpxchg.ll
new file mode 100644
index 0000000000000..cffb714ac873f
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/atomic-cmpxchg.ll
@@ -0,0 +1,5910 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -global-isel -mtriple=riscv32 -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefix=RV32I %s
+; RUN: llc -global-isel -mtriple=riscv32 -mattr=+a -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=RV32IA,RV32IA-WMO %s
+; RUN: llc -global-isel -mtriple=riscv32 -mattr=+a,+zacas -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=RV32IA,RV32IA-ZACAS,RV32IA-WMO-ZACAS %s
+; RUN: llc -global-isel -mtriple=riscv32 -mattr=+a,+ztso -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=RV32IA,RV32IA-TSO %s
+; RUN: llc -global-isel -mtriple=riscv32 -mattr=+a,+ztso,+zacas -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=RV32IA,RV32IA-ZACAS,RV32IA-TSO-ZACAS %s
+; RUN: llc -global-isel -mtriple=riscv64 -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefix=RV64I %s
+; RUN: llc -global-isel -mtriple=riscv64 -mattr=+a -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=RV64IA,RV64IA-WMO %s
+; RUN: llc -global-isel -mtriple=riscv64 -mattr=+a,+zacas -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=RV64IA,RV64IA-ZACAS,RV64IA-WMO-ZACAS %s
+; RUN: llc -global-isel -mtriple=riscv64 -mattr=+a,+zacas,+zabha -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=RV64IA,RV64IA-ZABHA,RV64IA-WMO-ZABHA %s
+; RUN: llc -global-isel -mtriple=riscv64 -mattr=+a,+ztso -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=RV64IA,RV64IA-TSO %s
+; RUN: llc -global-isel -mtriple=riscv64 -mattr=+a,+ztso,+zacas -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=RV64IA,RV64IA-ZACAS,RV64IA-TSO-ZACAS %s
+; RUN: llc -global-isel -mtriple=riscv64 -mattr=+a,+ztso,+zacas,+zabha -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=RV64IA,RV64IA-ZABHA,RV64IA-TSO-ZABHA %s
+
+define void @cmpxchg_i8_monotonic_monotonic(ptr %ptr, i8 %cmp, i8 %val) nounwind {
+; RV32I-LABEL: cmpxchg_i8_monotonic_monotonic:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    addi sp, sp, -16
+; RV32I-NEXT:    sw ra, 12(sp) # 4-byte Folded Spill
+; RV32I-NEXT:    sb a1, 11(sp)
+; RV32I-NEXT:    addi a1, sp, 11
+; RV32I-NEXT:    li a3, 0
+; RV32I-NEXT:    li a4, 0
+; RV32I-NEXT:    call __atomic_compare_exchange_1
+; RV32I-NEXT:    lw ra, 12(sp) # 4-byte Folded Reload
+; RV32I-NEXT:    addi sp, sp, 16
+; RV32I-NEXT:    ret
+;
+; RV32IA-LABEL: cmpxchg_i8_monotonic_monotonic:
+; RV32IA:       # %bb.0:
+; RV32IA-NEXT:    li a3, 255
+; RV32IA-NEXT:    andi a4, a0, -4
+; RV32IA-NEXT:    andi a0, a0, 3
+; RV32IA-NEXT:    zext.b a1, a1
+; RV32IA-NEXT:    zext.b a2, a2
+; RV32IA-NEXT:    slli a0, a0, 3
+; RV32IA-NEXT:    sll a3, a3, a0
+; RV32IA-NEXT:    sll a1, a1, a0
+; RV32IA-NEXT:    sll a0, a2, a0
+; RV32IA-NEXT:  .LBB0_1: # =>This Inner Loop Header: Depth=1
+; RV32IA-NEXT:    lr.w a2, (a4)
+; RV32IA-NEXT:    and a5, a2, a3
+; RV32IA-NEXT:    bne a5, a1, .LBB0_3
+; RV32IA-NEXT:  # %bb.2: # in Loop: Header=BB0_1 Depth=1
+; RV32IA-NEXT:    xor a5, a2, a0
+; RV32IA-NEXT:    and a5, a5, a3
+; RV32IA-NEXT:    xor a5, a2, a5
+; RV32IA-NEXT:    sc.w a5, a5, (a4)
+; RV32IA-NEXT:    bnez a5, .LBB0_1
+; RV32IA-NEXT:  .LBB0_3:
+; RV32IA-NEXT:    ret
+;
+; RV64I-LABEL: cmpxchg_i8_monotonic_monotonic:
+; RV64I:       # %bb.0:
+; RV64I-NEXT:    addi sp, sp, -16
+; RV64I-NEXT:    sd ra, 8(sp) # 8-byte Folded Spill
+; RV64I-NEXT:    sb a1, 7(sp)
+; RV64I-NEXT:    addi a1, sp, 7
+; RV64I-NEXT:    li a3, 0
+; RV64I-NEXT:    li a4, 0
+; RV64I-NEXT:    call __atomic_compare_exchange_1
+; RV64I-NEXT:    ld ra, 8(sp) # 8-byte Folded Reload
+; RV64I-NEXT:    addi sp, sp, 16
+; RV64I-NEXT:    ret
+;
+; RV64IA-WMO-LABEL: cmpxchg_i8_monotonic_monotonic:
+; RV64IA-WMO:       # %bb.0:
+; RV64IA-WMO-NEXT:    li a3, 255
+; RV64IA-WMO-NEXT:    andi a4, a0, -4
+; RV64IA-WMO-NEXT:    andi a0, a0, 3
+; RV64IA-WMO-NEXT:    zext.b a1, a1
+; RV64IA-WMO-NEXT:    zext.b a2, a2
+; RV64IA-WMO-NEXT:    slli a0, a0, 3
+; RV64IA-WMO-NEXT:    sllw a3, a3, a0
+; RV64IA-WMO-NEXT:    sllw a1, a1, a0
+; RV64IA-WMO-NEXT:    sllw a0, a2, a0
+; RV64IA-WMO-NEXT:  .LBB0_1: # =>This Inner Loop Header: Depth=1
+; RV64IA-WMO-NEXT:    lr.w a2, (a4)
+; RV64IA-WMO-NEXT:    and a5, a2, a3
+; RV64IA-WMO-NEXT:    bne a5, a1, .LBB0_3
+; RV64IA-WMO-NEXT:  # %bb.2: # in Loop: Header=BB0_1 Depth=1
+; RV64IA-WMO-NEXT:    xor a5, a2, a0
+; RV64IA-WMO-NEXT:    and a5, a5, a3
+; RV64IA-WMO-NEXT:    xor a5, a2, a5
+; RV64IA-WMO-NEXT:    sc.w a5, a5, (a4)
+; RV64IA-WMO-NEXT:    bnez a5, .LBB0_1
+; RV64IA-WMO-NEXT:  .LBB0_3:
+; RV64IA-WMO-NEXT:    ret
+;
+; RV64IA-ZACAS-LABEL: cmpxchg_i8_monotonic_monotonic:
+; RV64IA-ZACAS:       # %bb.0:
+; RV64IA-ZACAS-NEXT:    li a3, 255
+; RV64IA-ZACAS-NEXT:    andi a4, a0, -4
+; RV64IA-ZACAS-NEXT:    andi a0, a0, 3
+; RV64IA-ZACAS-NEXT:    zext.b a1, a1
+; RV64IA-ZACAS-NEXT:    zext.b a2, a2
+; RV64IA-ZACAS-NEXT:    slli a0, a0, 3
+; RV64IA-ZACAS-NEXT:    sllw a3, a3, a0
+; RV64IA-ZACAS-NEXT:    sllw a1, a1, a0
+; RV64IA-ZACAS-NEXT:    sllw a0, a2, a0
+; RV64IA-ZACAS-NEXT:  .LBB0_1: # =>This Inner Loop Header: Depth=1
+; RV64IA-ZACAS-NEXT:    lr.w a2, (a4)
+; RV64IA-ZACAS-NEXT:    and a5, a2, a3
+; RV64IA-ZACAS-NEXT:    bne a5, a1, .LBB0_3
+; RV64IA-ZACAS-NEXT:  # %bb.2: # in Loop: Header=BB0_1 Depth=1
+; RV64IA-ZACAS-NEXT:    xor a5, a2, a0
+; RV64IA-ZACAS-NEXT:    and a5, a5, a3
+; RV64IA-ZACAS-NEXT:    xor a5, a2, a5
+; RV64IA-ZACAS-NEXT:    sc.w a5, a5, (a4)
+; RV64IA-ZACAS-NEXT:    bnez a5, .LBB0_1
+; RV64IA-ZACAS-NEXT:  .LBB0_3:
+; RV64IA-ZACAS-NEXT:    ret
+;
+; RV64IA-ZABHA-LABEL: cmpxchg_i8_monotonic_monotonic:
+; RV64IA-ZABHA:       # %bb.0:
+; RV64IA-ZABHA-NEXT:    amocas.b a1, a2, (a0)
+; RV64IA-ZABHA-NEXT:    ret
+;
+; RV64IA-TSO-LABEL: cmpxchg_i8_monotonic_monotonic:
+; RV64IA-TSO:       # %bb.0:
+; RV64IA-TSO-NEXT:    li a3, 255
+; RV64IA-TSO-NEXT:    andi a4, a0, -4
+; RV64IA-TSO-NEXT:    andi a0, a0, 3
+; RV64IA-TSO-NEXT:    zext.b a1, a1
+; RV64IA-TSO-NEXT:    zext.b a2, a2
+; RV64IA-TSO-NEXT:    slli a0, a0, 3
+; RV64IA-TSO-NEXT:    sllw a3, a3, a0
+; RV64IA-TSO-NEXT:    sllw a1, a1, a0
+; RV64IA-TSO-NEXT:    sllw a0, a2, a0
+; RV64IA-TSO-NEXT:  .LBB0_1: # =>This Inner Loop Header: Depth=1
+; RV64IA-TSO-NEXT:    lr.w a2, (a4)
+; RV64IA-TSO-NEXT:    and a5, a2, a3
+; RV64IA-TSO-NEXT:    bne a5, a1, .LBB0_3
+; RV64IA-TSO-NEXT:  # %bb.2: # in Loop: Header=BB0_1 Depth=1
+; RV64IA-TSO-NEXT:    xor a5, a2, a0
+; RV64IA-TSO-NEXT:    and a5, a5, a3
+; RV64IA-TSO-NEXT:    xor a5, a2, a5
+; RV64IA-TSO-NEXT:    sc.w a5, a5, (a4)
+; RV64IA-TSO-NEXT:    bnez a5, .LBB0_1
+; RV64IA-TSO-NEXT:  .LBB0_3:
+; RV64IA-TSO-NEXT:    ret
+  %res = cmpxchg ptr %ptr, i8 %cmp, i8 %val monotonic monotonic
+  ret void
+}
+
+define void @cmpxchg_i8_acquire_monotonic(ptr %ptr, i8 %cmp, i8 %val) nounwind {
+; RV32I-LABEL: cmpxchg_i8_acquire_monotonic:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    addi sp, sp, -16
+; RV32I-NEXT:    sw ra, 12(sp) # 4-byte Folded Spill
+; RV32I-NEXT:    sb a1, 11(sp)
+; RV32I-NEXT:    addi a1, sp, 11
+; RV32I-NEXT:    li a3, 2
+; RV32I-NEXT:    li a4, 0
+; RV32I-NEXT:    call __atomic_compare_exchange_1
+; RV32I-NEXT:    lw ra, 12(sp) # 4-byte Folded Reload
+; RV32I-NEXT:    addi sp, sp, 16
+; RV32I-NEXT:    ret
+;
+; RV32IA-WMO-LABEL: cmpxchg_i8_acquire_monotonic:
+; RV32IA-WMO:       # %bb.0:
+; RV32IA-WMO-NEXT:    li a3, 255
+; RV32IA-WMO-NEXT:    andi a4, a0, -4
+; RV32IA-WMO-NEXT:    andi a0, a0, 3
+; RV32IA-WMO-NEXT:    zext.b a1, a1
+; RV32IA-WMO-NEXT:    zext.b a2, a2
+; RV32IA-WMO-NEXT:    slli a0, a0, 3
+; RV32IA-WMO-NEXT:    sll a3, a3, a0
+; RV32IA-WMO-NEXT:    sll a1, a1, a0
+; RV32IA-WMO-NEXT:    sll a0, a2, a0
+; RV32IA-WMO-NEXT:  .LBB1_1: # =>This Inner Loop Header: Depth=1
+; RV32IA-WMO-NEXT:    lr.w.aq a2, (a4)
+; RV32IA-WMO-NEXT:    and a5, a2, a3
+; RV32IA-WMO-NEXT:    bne a5, a1, .LBB1_3
+; RV32IA-WMO-NEXT:  # %bb.2: # in Loop: Header=BB1_1 Depth=1
+; RV32IA-WMO-NEXT:    xor a5, a2, a0
+; RV32IA-WMO-NEXT:    and a5, a5, a3
+; RV32IA-WMO-NEXT:    xor a5, a2, a5
+; RV32IA-WMO-NEXT:    sc.w a5, a5, (a4)
+; RV32IA-WMO-NEXT:    bnez a5, .LBB1_1
+; RV32IA-WMO-NEXT:  .LBB1_3:
+; RV32IA-WMO-NEXT:    ret
+;
+; RV32IA-WMO-ZACAS-LABEL: cmpxchg_i8_acquire_monotonic:
+; RV32IA-WMO-ZACAS:       # %bb.0:
+; RV32IA-WMO-ZACAS-NEXT:    li a3, 255
+; RV32IA-WMO-ZACAS-NEXT:    andi a4, a0, -4
+; RV32IA-WMO-ZACAS-NEXT:    andi a0, a0, 3
+; RV32IA-WMO-ZACAS-NEXT:    zext.b a1, a1
+; RV32IA-WMO-ZACAS-NEXT:    zext.b a2, a2
+; RV32IA-WMO-ZACAS-NEXT:    slli a0, a0, 3
+; RV32IA-WMO-ZACAS-NEXT:    sll a3, a3, a0
+; RV32IA-WMO-ZACAS-NEXT:    sll a1, a1, a0
+; RV32IA-WMO-ZACAS-NEXT:    sll a0, a2, a0
+; RV32IA-WMO-ZACAS-NEXT:  .LBB1_1: # =>This Inner Loop Header: Depth=1
+; RV32IA-WMO-ZACAS-NEXT:    lr.w.aq a2, (a4)
+; RV32IA-WMO-ZACAS-NEXT:    and a5, a2, a3
+; RV32IA-WMO-ZACAS-NEXT:    bne a5, a1, .LBB1_3
+; RV32IA-WMO-ZACAS-NEXT:  # %bb.2: # in Loop: Header=BB1_1 Depth=1
+; RV32IA-WMO-ZACAS-NEXT:    xor a5, a2, a0
+; RV32IA-WMO-ZACAS-NEXT:    and a5, a5, a3
+; RV32IA-WMO-ZACAS-NEXT:    xor a5, a2, a5
+; RV32IA-WMO-ZACAS-NEXT:    sc.w a5, a5, (a4)
+; RV32IA-WMO-ZACAS-NEXT:    bnez a5, .LBB1_1
+; RV32IA-WMO-ZACAS-NEXT:  .LBB1_3:
+; RV32IA-WMO-ZACAS-NEXT:    ret
+;
+; RV32IA-TSO-LABEL: cmpxchg_i8_acquire_monotonic:
+; RV32IA-TSO:       # %bb.0:
+; RV32IA-TSO-NEXT:    li a3, 255
+; RV32IA-TSO-NEXT:    andi a4, a0, -4
+; RV32IA-TSO-NEXT:    andi a0, a0, 3
+; RV32IA-TSO-NEXT:    zext.b a1, a1
+; RV32IA-TSO-NEXT:    zext.b a2, a2
+; RV32IA-TSO-NEXT:    slli a0, a0, 3
+; RV32IA-TSO-NEXT:    sll a3, a3, a0
+; RV32IA-TSO-NEXT:    sll a1, a1, a0
+; RV32IA-TSO-NEXT:    sll a0, a2, a0
+; RV32IA-TSO-NEXT:  .LBB1_1: # =>This Inner Loop Header: Depth=1
+; RV32IA-TSO-NEXT:    lr.w a2, (a4)
+; RV32IA-TSO-NEXT:    and a5, a2, a3
+; RV32IA-TSO-NEXT:    bne a5, a1, .LBB1_3
+; RV32IA-TSO-NEXT:  # %bb.2: # in Loop: Header=BB1_1 Depth=1
+; RV32IA-TSO-NEXT:    xor a5, a2, a0
+; RV32IA-TSO-NEXT:    and a5, a5, a3
+; RV32IA-TSO-NEXT:    xor a5, a2, a5
+; RV32IA-TSO-NEXT:    sc.w a5, a5, (a4)
+; RV32IA-TSO-NEXT:    bnez a5, .LBB1_1
+; RV32IA-TSO-NEXT:  .LBB1_3:
+; RV32IA-TSO-NEXT:    ret
+;
+; RV32IA-TSO-ZACAS-LABEL: cmpxchg_i8_acquire_monotonic:
+; RV32IA-TSO-ZACAS:       # %bb.0:
+; RV32IA-TSO-ZACAS-NEXT:    li a3, 255
+; RV32IA-TSO-ZACAS-NEXT:    andi a4, a0, -4
+; RV32IA-TSO-ZACAS-NEXT:    andi a0, a0, 3
+; RV32IA-TSO-ZACAS-NEXT:    zext.b a1, a1
+; RV32IA-TSO-ZACAS-NEXT:    zext.b a2, a2
+; RV32IA-TSO-ZACAS-NEXT:    slli a0, a0, 3
+; RV32IA-TSO-ZACAS-NEXT:    sll a3, a3, a0
+; RV32IA-TSO-ZACAS-NEXT:    sll a1, a1, a0
+; RV32IA-TSO-ZACAS-NEXT:    sll a0, a2, a0
+; RV32IA-TSO-ZACAS-NEXT:  .LBB1_1: # =>This Inner Loop Header: Depth=1
+; RV32IA-TSO-ZACAS-NEXT:    lr.w a2, (a4)
+; RV32IA-TSO-ZACAS-NEXT:    and a5, a2, a3
+; RV32IA-TSO-ZACAS-NEXT:    bne a5, a1, .LBB1_3
+; RV32IA-TSO-ZACAS-NEXT:  # %bb.2: # in Loop: Header=BB1_1 Depth=1
+; RV32IA-TSO-ZACAS-NEXT:    xor a5, a2, a0
+; RV32IA-TSO-ZACAS-NEXT:    and a5, a5, a3
+; RV32IA-TSO-ZACAS-NEXT:    xor a5, a2, a5
+; RV32IA-TSO-ZACAS-NEXT:    sc.w a5, a5, (a4)
+; RV32IA-TSO-ZACAS-NEXT:    bnez a5, .LBB1_1
+; RV32IA-TSO-ZACAS-NEXT:  .LBB1_3:
+; RV32IA-TSO-ZACAS-NEXT:    ret
+;
+; RV64I-LABEL: cmpxchg_i8_acquire_monotonic:
+; RV64I:       # %bb.0:
+; RV64I-NEXT:    addi sp, sp, -16
+; RV64I-NEXT:    sd ra, 8(sp) # 8-byte Folded Spill
+; RV64I-NEXT:    sb a1, 7(sp)
+; RV64I-NEXT:    addi a1, sp, 7
+; RV64I-NEXT:    li a3, 2
+; RV64I-NEXT:    li a4, 0
+; RV64I-NEXT:    call __atomic_compare_exchange_1
+; RV64I-NEXT:    ld ra, 8(sp) # 8-byte Folded Reload
+; RV64I-NEXT:    addi sp, sp, 16
+; RV64I-NEXT:    ret
+;
+; RV64IA-WMO-LABEL: cmpxchg_i8_acquire_monotonic:
+; RV64IA-WMO:       # %bb.0:
+; RV64IA-WMO-NEXT:    li a3, 255
+; RV64IA-WMO-NEXT:    andi a4, a0, -4
+; RV64IA-WMO-NEXT:    andi a0, a0, 3
+; RV64IA-WMO-NEXT:    zext.b a1, a1
+; RV64IA-WMO-NEXT:    zext.b a2, a2
+; RV64IA-WMO-NEXT:    slli a0, a0, 3
+; RV64IA-WMO-NEXT:    sllw a3, a3, a0
+; RV64IA-WMO-NEXT:    sllw a1, a1, a0
+; RV64IA-WMO-NEXT:    sllw a0, a2, a0
+; RV64IA-WMO-NEXT:  .LBB1_1: # =>This Inner Loop Header: Depth=1
+; RV64IA-WMO-NEXT:    lr.w.aq a2, (a4)
+; RV64IA-WMO-NEXT:    and a5, a2, a3
+; RV64IA-WMO-NEXT:    bne a5, a1, .LBB1_3
+; RV64IA-WMO-NEXT:  # %bb.2: # in Loop: Header=BB1_1 Depth=1
+; RV64IA-WMO-NEXT:    xor a5, a2, a0
+; RV64IA-WMO-NEXT:    and a5, a5, a3
+; RV64IA-WMO-NEXT:    xor a5, a2, a5
+; RV64IA-WMO-NEXT:    sc.w a5, a5, (a4)
+; RV64IA-WMO-NEXT:    bnez a5, .LBB1_1
+; RV64IA-WMO-NEXT:  .LBB1_3:
+; RV64IA-WMO-NEXT:    ret
+;
+; RV64IA-WMO-ZACAS-LABEL: cmpxchg_i8_acquire_monotonic:
+; RV64IA-WMO-ZACAS:       # %bb.0:
+; RV64IA-WMO-ZACAS-NEXT:    li a3, 255
+; RV64IA-WMO-ZACAS-NEXT:    andi a4, a0, -4
+; RV64IA-WMO-ZACAS-NEXT:    andi a0, a0, 3
+; RV64IA-WMO-ZACAS-NEXT:    zext.b a1, a1
+; RV64IA-WMO-ZACAS-NEXT:    zext.b a2, a2
+; RV64IA-WMO-ZACAS-NEXT:    slli a0, a0, 3
+; RV64IA-WMO-ZACAS-NEXT:    sllw a3, a3, a0
+; RV64IA-WMO-ZACAS-NEXT:    sllw a1, a1, a0
+; RV64IA-WMO-ZACAS-NEXT:    sllw a0, a2, a0
+; RV64IA-WMO-ZACAS-NEXT:  .LBB1_1: # =>This Inner Loop Header: Depth=1
+; RV64IA-WMO-ZACAS-NEXT:    lr.w.aq a2, (a4)
+; RV64IA-WMO-ZACAS-NEXT:    and a5, a2, a3
+; RV64IA-WMO-ZACAS-NEXT:    bne a5, a1, .LBB1_3
+; RV64IA-WMO-ZACAS-NEXT:  # %bb.2: # in Loop: Header=BB1_1 Depth=1
+; RV64IA-WMO-ZACAS-NEXT:    xor a5, a2, a0
+; RV64IA-WMO-ZACAS-NEXT:    and a5, a5, a3
+; RV64IA-WMO-ZACAS-NEXT:    xor a5, a2, a5
+; RV64IA-WMO-ZACAS-NEXT:    sc.w a5, a5, (a4)
+; RV64IA-WMO-ZACAS-NEXT:    bnez a5, .LBB1_1
+; RV64IA-WMO-ZACAS-NEXT:  .LBB1_3:
+; RV64IA-WMO-ZACAS-NEXT:    ret
+;
+; RV64IA-WMO-ZABHA-LABEL: cmpxchg_i8_acquire_monotonic:
+; RV64IA-WMO-ZABHA:       # %bb.0:
+; RV64IA-WMO-ZABHA-NEXT:    amocas.b.aq a1, a2, (a0)
+; RV64IA-WMO-ZABHA-NEXT:    ret
+;
+; RV64IA-TSO-LABEL: cmpxchg_i8_acquire_monotonic:
+; RV64IA-TSO:       # %bb.0:
+; RV64IA-TSO-NEXT:    li a3, 255
+; RV64IA-TSO-NEXT:    andi a4, a0, -4
+; RV64IA-TSO-NEXT:    andi a0, a0, 3
+; RV64IA-TSO-NEXT:    zext.b a1, a1
+; RV64IA-TSO-NEXT:    zext.b a2, a2
+; RV64IA-TSO-NEXT:    slli a0, a0, 3
+; RV64IA-TSO-NEXT:    sllw a3, a3, a0
+; RV64IA-TSO-NEXT:    sllw a1, a1, a0
+; RV64IA-TSO-NEXT:    sllw a0, a2, a0
+; RV64IA-TSO-NEXT:  .LBB1_1: # =>This Inner Loop Header: Depth=1
+; RV64IA-TSO-NEXT:    lr.w a2, (a4)
+; RV64IA-TSO-NEXT:    and a5, a2, a3
+; RV64IA-TSO-NEXT:    bne a5, a1, .LBB1_3
+; RV64IA-TSO-NEXT:  # %bb.2: # in Loop: Header=BB1_1 Depth=1
+; RV64IA-TSO-NEXT:    xor a5, a2, a0
+; RV64IA-TSO-NEXT:    and a5, a5, a3
+; RV64IA-TSO-NEXT:    xor a5, a2, a5
+; RV64IA-TSO-NEXT:    sc.w a5, a5, (a4)
+; RV64IA-TSO-NEXT:    bnez a5, .LBB1_1
+; RV64IA-TSO-NEXT:  .LBB1_3:
+; RV64IA-TSO-NEXT:    ret
+;
+; RV64IA-TSO-ZACAS-LABEL: cmpxchg_i8_acquire_monotonic:
+; RV64IA-TSO-ZACAS:       # %bb.0:
+; RV64IA-TSO-ZACAS-NEXT:    li a3, 255
+; RV64IA-TSO-ZACAS-NEXT:    andi a4, a0, -4
+; RV64IA-TSO-ZACAS-NEXT:    andi a0, a0, 3
+; RV64IA-TSO-ZACAS-NEXT:    zext.b a1, a1
+; RV64IA-TSO-ZACAS-NEXT:    zext.b a2, a2
+; RV64IA-TSO-ZACAS-NEXT:    slli a0, a0, 3
+; RV64IA-TSO-ZACAS-NEXT:    sllw a3, a3, a0
+; RV64IA-TSO-ZACAS-NEXT:    sllw a1, a1, a0
+; RV64IA-TSO-ZACAS-NEXT:    sllw a0, a2, a0
+; RV64IA-TSO-ZACAS-NEXT:  .LBB1_1: # =>This Inner Loop Header: Depth=1
+; RV64IA-TSO-ZACAS-NEXT:    lr.w a2, (a4)
+; RV64IA-TSO-ZACAS-NEXT:    and a5, a2, a3
+; RV64IA-TSO-ZACAS-NEXT:    bne a5, a1, .LBB1_3
+; RV64IA-TSO-ZACAS-NEXT:  # %bb.2: # in Loop: Header=BB1_1 Depth=1
+; RV64IA-TSO-ZACAS-NEXT:    xor a5, a2, a0
+; RV64IA-TSO-ZACAS-NEXT:    and a5, a5, a3
+; RV64IA-TSO-ZACAS-NEXT:    xor a5, a2, a5
+; RV64IA-TSO-ZACAS-NEXT:    sc.w a5, a5, (a4)
+; RV64IA-TSO-ZACAS-NEXT:    bnez a5, .LBB1_1
+; RV64IA-TSO-ZACAS-NEXT:  .LBB1_3:
+; RV64IA-TSO-ZACAS-NEXT:    ret
+;
+; RV64IA-TSO-ZABHA-LABEL: cmpxchg_i8_acquire_monotonic:
+; RV64IA-TSO-ZABHA:       # %bb.0:
+; RV64IA-TSO-ZABHA-NEXT:    amocas.b a1, a2, (a0)
+; RV64IA-TSO-ZABHA-NEXT:    ret
+  %res = cmpxchg ptr %ptr, i8 %cmp, i8 %val acquire monotonic
+  ret void
+}
+
+define void @cmpxchg_i8_acquire_acquire(ptr %ptr, i8 %cmp, i8 %val) nounwind {
+; RV32I-LABEL: cmpxchg_i8_acquire_acquire:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    addi sp, sp, -16
+; RV32I-NEXT:    sw ra, 12(sp) # 4-byte Folded Spill
+; RV32I-NEXT:    sb a1, 11(sp)
+; RV32I-NEXT:    addi a1, sp, 11
+; RV32I-NEXT:    li a3, 2
+; RV32I-NEXT:    li a4, 2
+; RV32I-NEXT:    call __atomic_compare_exchange_1
+; RV32I-NEXT:    lw ra, 12(sp) # 4-byte Folded Reload
+; RV32I-NEXT:    addi sp, sp, 16
+; RV32I-NEXT:    ret
+;
+; RV32IA-WMO-LABEL: cmpxchg_i8_acquire_acquire:
+; RV32IA-WMO:       # %bb.0:
+; RV32IA-WMO-NEXT:    li a3, 255
+; RV32IA-WMO-NEXT:    andi a4, a0, -4
+; RV32IA-WMO-NEXT:    andi a0, a0, 3
+; RV32IA-WMO-NEXT:    zext.b a1, a1
+; RV32IA-WMO-NEXT:    zext.b a2, a2
+; RV32IA-WMO-NEXT:    slli a0, a0, 3
+; RV32IA-WMO-NEXT:    sll a3, a3, a0
+; RV32IA-WMO-NEXT:    sll a1, a1, a0
+; RV32IA-WMO-NEXT:    sll a0, a2, a0
+; RV32IA-WMO-NEXT:  .LBB2_1: # =>This Inner Loop Header: Depth=1
+; RV32IA-WMO-NEXT:    lr.w.aq a2, (a4)
+; RV32IA-WMO-NEXT:    and a5, a2, a3
+; RV32IA-WMO-NEXT:    bne a5, a1, .LBB2_3
+; RV32IA-WMO-NEXT:  # %bb.2: # in Loop: Header=BB2_1 Depth=1
+; RV32IA-WMO-NEXT:    xor a5, a2, a0
+; RV32IA-WMO-NEXT:    and a5, a5, a3
+; RV32IA-WMO-NEXT:    xor a5, a2, a5
+; RV32IA-WMO-NEXT:    sc.w a5, a5, (a4)
+; RV32IA-WMO-NEXT:    bnez a5, .LBB2_1
+; RV32IA-WMO-NEXT:  .LBB2_3:
+; RV32IA-WMO-NEXT:    ret
+;
+; RV32IA-WMO-ZACAS-LABEL: cmpxchg_i8_acquire_acquire:
+; RV32IA-WMO-ZACAS:       # %bb.0:
+; RV32IA-WMO-ZACAS-NEXT:    li a3, 255
+; RV32IA-WMO-ZACAS-NEXT:    andi a4, a0, -4
+; RV32IA-WMO-ZACAS-NEXT:    andi a0, a0, 3
+; RV32IA-WMO-ZACAS-NEXT:    zext.b a1, a1
+; RV32IA-WMO-ZACAS-NEXT:    zext.b a2, a2
+; RV32IA-WMO-ZACAS-NEXT:    slli a0, a0, 3
+; RV32IA-WMO-ZACAS-NEXT:    sll a3, a3, a0
+; RV32IA-WMO-ZACAS-NEXT:    sll a1, a1, a0
+; RV32IA-WMO-ZACAS-NEXT:    sll a0, a2, a0
+; RV32IA-WMO-ZACAS-NEXT:  .LBB2_1: # =>This Inner Loop Header: Depth=1
+; RV32IA-WMO-ZACAS-NEXT:    lr.w.aq a2, (a4)
+; RV32IA-WMO-ZACAS-NEXT:    and a5, a2, a3
+; RV32IA-WMO-ZACAS-NEXT:    bne a5, a1, .LBB2_3
+; RV32IA-WMO-ZACAS-NEXT:  # %bb.2: # in Loop: Header=BB2_1 Depth=1
+; RV32IA-WMO-ZACAS-NEXT:    xor a5, a2, a0
+; RV32IA-WMO-ZACAS-NEXT:    and a5, a5, a3
+; RV32IA-WMO-ZACAS-NEXT:    xor a5, a2, a5
+; RV32IA-WMO-ZACAS-NEXT:    sc.w a5, a5, (a4)
+; RV32IA-WMO-ZACAS-NEXT:    bnez a5, .LBB2_1
+; RV32IA-WMO-ZACAS-NEXT:  .LBB2_3:
+; RV32IA-WMO-ZACAS-NEXT:    ret
+;
+; RV32IA-TSO-LABEL: cmpxchg_i8_acquire_acquire:
+; RV32IA-TSO:       # %bb.0:
+; RV32IA-TSO-NEXT:    li a3, 255
+; RV32IA-TSO-NEXT:    andi a4, a0, -4
+; RV32IA-TSO-NEXT:    andi a0, a0, 3
+; RV32IA-TSO-NEXT:    zext.b a1, a1
+; RV32IA-TSO-NEXT:    zext.b a2, a2
+; RV32IA-TSO-NEXT:    slli a0, a0, 3
+; RV32IA-TSO-NEXT:    sll a3, a3, a0
+; RV32IA-TSO-NEXT:    sll a1, a1, a0
+; RV32IA-TSO-NEXT:    sll a0, a2, a0
+; RV32IA...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Sep 9, 2025

@llvm/pr-subscribers-backend-risc-v

Author: Kane Wang (ReVe1uv)

Changes

This change introduces legalization for G_ATOMIC_CMPXCHG and G_ATOMIC_CMPXCHG_WITH_SUCCESS. Additionally, support for the riscv_masked_cmpxchg intrinsic is added to legalizeIntrinsic, ensuring that masked compare-and-exchange operations are recognized during legalization.


Patch is 256.98 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/157634.diff

7 Files Affected:

  • (modified) llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp (+5-1)
  • (added) llvm/test/CodeGen/RISCV/GlobalISel/atomic-cmpxchg.ll (+5910)
  • (added) llvm/test/CodeGen/RISCV/GlobalISel/instruction-select/atomic-cmpxchg-rv32.mir (+119)
  • (added) llvm/test/CodeGen/RISCV/GlobalISel/instruction-select/atomic-cmpxchg-rv64.mir (+144)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/legalizer-info-validation.mir (+5-4)
  • (added) llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-atomic-cmpxchg-rv32.mir (+155)
  • (added) llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-atomic-cmpxchg-rv64.mir (+240)
diff --git a/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp b/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
index ab5c9e17b9a37..21bcbbb0ded65 100644
--- a/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
+++ b/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
@@ -697,7 +697,10 @@ RISCVLegalizerInfo::RISCVLegalizerInfo(const RISCVSubtarget &ST)
       .customIf(all(typeIsLegalIntOrFPVec(0, IntOrFPVecTys, ST),
                     typeIsLegalIntOrFPVec(1, IntOrFPVecTys, ST)));
 
-  getActionDefinitionsBuilder(G_ATOMICRMW_ADD)
+  getActionDefinitionsBuilder(G_ATOMIC_CMPXCHG_WITH_SUCCESS)
+      .lowerIf(all(typeInSet(0, {s8, s16, s32, s64}), typeIs(2, p0)));
+
+  getActionDefinitionsBuilder({G_ATOMIC_CMPXCHG, G_ATOMICRMW_ADD})
       .legalFor(ST.hasStdExtA(), {{sXLen, p0}})
       .libcallFor(!ST.hasStdExtA(), {{s8, p0}, {s16, p0}, {s32, p0}, {s64, p0}})
       .clampScalar(0, sXLen, sXLen);
@@ -746,6 +749,7 @@ bool RISCVLegalizerInfo::legalizeIntrinsic(LegalizerHelper &Helper,
   }
   case Intrinsic::riscv_masked_atomicrmw_add:
   case Intrinsic::riscv_masked_atomicrmw_sub:
+  case Intrinsic::riscv_masked_cmpxchg:
     return true;
   }
 }
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/atomic-cmpxchg.ll b/llvm/test/CodeGen/RISCV/GlobalISel/atomic-cmpxchg.ll
new file mode 100644
index 0000000000000..cffb714ac873f
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/atomic-cmpxchg.ll
@@ -0,0 +1,5910 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -global-isel -mtriple=riscv32 -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefix=RV32I %s
+; RUN: llc -global-isel -mtriple=riscv32 -mattr=+a -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=RV32IA,RV32IA-WMO %s
+; RUN: llc -global-isel -mtriple=riscv32 -mattr=+a,+zacas -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=RV32IA,RV32IA-ZACAS,RV32IA-WMO-ZACAS %s
+; RUN: llc -global-isel -mtriple=riscv32 -mattr=+a,+ztso -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=RV32IA,RV32IA-TSO %s
+; RUN: llc -global-isel -mtriple=riscv32 -mattr=+a,+ztso,+zacas -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=RV32IA,RV32IA-ZACAS,RV32IA-TSO-ZACAS %s
+; RUN: llc -global-isel -mtriple=riscv64 -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefix=RV64I %s
+; RUN: llc -global-isel -mtriple=riscv64 -mattr=+a -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=RV64IA,RV64IA-WMO %s
+; RUN: llc -global-isel -mtriple=riscv64 -mattr=+a,+zacas -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=RV64IA,RV64IA-ZACAS,RV64IA-WMO-ZACAS %s
+; RUN: llc -global-isel -mtriple=riscv64 -mattr=+a,+zacas,+zabha -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=RV64IA,RV64IA-ZABHA,RV64IA-WMO-ZABHA %s
+; RUN: llc -global-isel -mtriple=riscv64 -mattr=+a,+ztso -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=RV64IA,RV64IA-TSO %s
+; RUN: llc -global-isel -mtriple=riscv64 -mattr=+a,+ztso,+zacas -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=RV64IA,RV64IA-ZACAS,RV64IA-TSO-ZACAS %s
+; RUN: llc -global-isel -mtriple=riscv64 -mattr=+a,+ztso,+zacas,+zabha -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=RV64IA,RV64IA-ZABHA,RV64IA-TSO-ZABHA %s
+
+define void @cmpxchg_i8_monotonic_monotonic(ptr %ptr, i8 %cmp, i8 %val) nounwind {
+; RV32I-LABEL: cmpxchg_i8_monotonic_monotonic:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    addi sp, sp, -16
+; RV32I-NEXT:    sw ra, 12(sp) # 4-byte Folded Spill
+; RV32I-NEXT:    sb a1, 11(sp)
+; RV32I-NEXT:    addi a1, sp, 11
+; RV32I-NEXT:    li a3, 0
+; RV32I-NEXT:    li a4, 0
+; RV32I-NEXT:    call __atomic_compare_exchange_1
+; RV32I-NEXT:    lw ra, 12(sp) # 4-byte Folded Reload
+; RV32I-NEXT:    addi sp, sp, 16
+; RV32I-NEXT:    ret
+;
+; RV32IA-LABEL: cmpxchg_i8_monotonic_monotonic:
+; RV32IA:       # %bb.0:
+; RV32IA-NEXT:    li a3, 255
+; RV32IA-NEXT:    andi a4, a0, -4
+; RV32IA-NEXT:    andi a0, a0, 3
+; RV32IA-NEXT:    zext.b a1, a1
+; RV32IA-NEXT:    zext.b a2, a2
+; RV32IA-NEXT:    slli a0, a0, 3
+; RV32IA-NEXT:    sll a3, a3, a0
+; RV32IA-NEXT:    sll a1, a1, a0
+; RV32IA-NEXT:    sll a0, a2, a0
+; RV32IA-NEXT:  .LBB0_1: # =>This Inner Loop Header: Depth=1
+; RV32IA-NEXT:    lr.w a2, (a4)
+; RV32IA-NEXT:    and a5, a2, a3
+; RV32IA-NEXT:    bne a5, a1, .LBB0_3
+; RV32IA-NEXT:  # %bb.2: # in Loop: Header=BB0_1 Depth=1
+; RV32IA-NEXT:    xor a5, a2, a0
+; RV32IA-NEXT:    and a5, a5, a3
+; RV32IA-NEXT:    xor a5, a2, a5
+; RV32IA-NEXT:    sc.w a5, a5, (a4)
+; RV32IA-NEXT:    bnez a5, .LBB0_1
+; RV32IA-NEXT:  .LBB0_3:
+; RV32IA-NEXT:    ret
+;
+; RV64I-LABEL: cmpxchg_i8_monotonic_monotonic:
+; RV64I:       # %bb.0:
+; RV64I-NEXT:    addi sp, sp, -16
+; RV64I-NEXT:    sd ra, 8(sp) # 8-byte Folded Spill
+; RV64I-NEXT:    sb a1, 7(sp)
+; RV64I-NEXT:    addi a1, sp, 7
+; RV64I-NEXT:    li a3, 0
+; RV64I-NEXT:    li a4, 0
+; RV64I-NEXT:    call __atomic_compare_exchange_1
+; RV64I-NEXT:    ld ra, 8(sp) # 8-byte Folded Reload
+; RV64I-NEXT:    addi sp, sp, 16
+; RV64I-NEXT:    ret
+;
+; RV64IA-WMO-LABEL: cmpxchg_i8_monotonic_monotonic:
+; RV64IA-WMO:       # %bb.0:
+; RV64IA-WMO-NEXT:    li a3, 255
+; RV64IA-WMO-NEXT:    andi a4, a0, -4
+; RV64IA-WMO-NEXT:    andi a0, a0, 3
+; RV64IA-WMO-NEXT:    zext.b a1, a1
+; RV64IA-WMO-NEXT:    zext.b a2, a2
+; RV64IA-WMO-NEXT:    slli a0, a0, 3
+; RV64IA-WMO-NEXT:    sllw a3, a3, a0
+; RV64IA-WMO-NEXT:    sllw a1, a1, a0
+; RV64IA-WMO-NEXT:    sllw a0, a2, a0
+; RV64IA-WMO-NEXT:  .LBB0_1: # =>This Inner Loop Header: Depth=1
+; RV64IA-WMO-NEXT:    lr.w a2, (a4)
+; RV64IA-WMO-NEXT:    and a5, a2, a3
+; RV64IA-WMO-NEXT:    bne a5, a1, .LBB0_3
+; RV64IA-WMO-NEXT:  # %bb.2: # in Loop: Header=BB0_1 Depth=1
+; RV64IA-WMO-NEXT:    xor a5, a2, a0
+; RV64IA-WMO-NEXT:    and a5, a5, a3
+; RV64IA-WMO-NEXT:    xor a5, a2, a5
+; RV64IA-WMO-NEXT:    sc.w a5, a5, (a4)
+; RV64IA-WMO-NEXT:    bnez a5, .LBB0_1
+; RV64IA-WMO-NEXT:  .LBB0_3:
+; RV64IA-WMO-NEXT:    ret
+;
+; RV64IA-ZACAS-LABEL: cmpxchg_i8_monotonic_monotonic:
+; RV64IA-ZACAS:       # %bb.0:
+; RV64IA-ZACAS-NEXT:    li a3, 255
+; RV64IA-ZACAS-NEXT:    andi a4, a0, -4
+; RV64IA-ZACAS-NEXT:    andi a0, a0, 3
+; RV64IA-ZACAS-NEXT:    zext.b a1, a1
+; RV64IA-ZACAS-NEXT:    zext.b a2, a2
+; RV64IA-ZACAS-NEXT:    slli a0, a0, 3
+; RV64IA-ZACAS-NEXT:    sllw a3, a3, a0
+; RV64IA-ZACAS-NEXT:    sllw a1, a1, a0
+; RV64IA-ZACAS-NEXT:    sllw a0, a2, a0
+; RV64IA-ZACAS-NEXT:  .LBB0_1: # =>This Inner Loop Header: Depth=1
+; RV64IA-ZACAS-NEXT:    lr.w a2, (a4)
+; RV64IA-ZACAS-NEXT:    and a5, a2, a3
+; RV64IA-ZACAS-NEXT:    bne a5, a1, .LBB0_3
+; RV64IA-ZACAS-NEXT:  # %bb.2: # in Loop: Header=BB0_1 Depth=1
+; RV64IA-ZACAS-NEXT:    xor a5, a2, a0
+; RV64IA-ZACAS-NEXT:    and a5, a5, a3
+; RV64IA-ZACAS-NEXT:    xor a5, a2, a5
+; RV64IA-ZACAS-NEXT:    sc.w a5, a5, (a4)
+; RV64IA-ZACAS-NEXT:    bnez a5, .LBB0_1
+; RV64IA-ZACAS-NEXT:  .LBB0_3:
+; RV64IA-ZACAS-NEXT:    ret
+;
+; RV64IA-ZABHA-LABEL: cmpxchg_i8_monotonic_monotonic:
+; RV64IA-ZABHA:       # %bb.0:
+; RV64IA-ZABHA-NEXT:    amocas.b a1, a2, (a0)
+; RV64IA-ZABHA-NEXT:    ret
+;
+; RV64IA-TSO-LABEL: cmpxchg_i8_monotonic_monotonic:
+; RV64IA-TSO:       # %bb.0:
+; RV64IA-TSO-NEXT:    li a3, 255
+; RV64IA-TSO-NEXT:    andi a4, a0, -4
+; RV64IA-TSO-NEXT:    andi a0, a0, 3
+; RV64IA-TSO-NEXT:    zext.b a1, a1
+; RV64IA-TSO-NEXT:    zext.b a2, a2
+; RV64IA-TSO-NEXT:    slli a0, a0, 3
+; RV64IA-TSO-NEXT:    sllw a3, a3, a0
+; RV64IA-TSO-NEXT:    sllw a1, a1, a0
+; RV64IA-TSO-NEXT:    sllw a0, a2, a0
+; RV64IA-TSO-NEXT:  .LBB0_1: # =>This Inner Loop Header: Depth=1
+; RV64IA-TSO-NEXT:    lr.w a2, (a4)
+; RV64IA-TSO-NEXT:    and a5, a2, a3
+; RV64IA-TSO-NEXT:    bne a5, a1, .LBB0_3
+; RV64IA-TSO-NEXT:  # %bb.2: # in Loop: Header=BB0_1 Depth=1
+; RV64IA-TSO-NEXT:    xor a5, a2, a0
+; RV64IA-TSO-NEXT:    and a5, a5, a3
+; RV64IA-TSO-NEXT:    xor a5, a2, a5
+; RV64IA-TSO-NEXT:    sc.w a5, a5, (a4)
+; RV64IA-TSO-NEXT:    bnez a5, .LBB0_1
+; RV64IA-TSO-NEXT:  .LBB0_3:
+; RV64IA-TSO-NEXT:    ret
+  %res = cmpxchg ptr %ptr, i8 %cmp, i8 %val monotonic monotonic
+  ret void
+}
+
+define void @cmpxchg_i8_acquire_monotonic(ptr %ptr, i8 %cmp, i8 %val) nounwind {
+; RV32I-LABEL: cmpxchg_i8_acquire_monotonic:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    addi sp, sp, -16
+; RV32I-NEXT:    sw ra, 12(sp) # 4-byte Folded Spill
+; RV32I-NEXT:    sb a1, 11(sp)
+; RV32I-NEXT:    addi a1, sp, 11
+; RV32I-NEXT:    li a3, 2
+; RV32I-NEXT:    li a4, 0
+; RV32I-NEXT:    call __atomic_compare_exchange_1
+; RV32I-NEXT:    lw ra, 12(sp) # 4-byte Folded Reload
+; RV32I-NEXT:    addi sp, sp, 16
+; RV32I-NEXT:    ret
+;
+; RV32IA-WMO-LABEL: cmpxchg_i8_acquire_monotonic:
+; RV32IA-WMO:       # %bb.0:
+; RV32IA-WMO-NEXT:    li a3, 255
+; RV32IA-WMO-NEXT:    andi a4, a0, -4
+; RV32IA-WMO-NEXT:    andi a0, a0, 3
+; RV32IA-WMO-NEXT:    zext.b a1, a1
+; RV32IA-WMO-NEXT:    zext.b a2, a2
+; RV32IA-WMO-NEXT:    slli a0, a0, 3
+; RV32IA-WMO-NEXT:    sll a3, a3, a0
+; RV32IA-WMO-NEXT:    sll a1, a1, a0
+; RV32IA-WMO-NEXT:    sll a0, a2, a0
+; RV32IA-WMO-NEXT:  .LBB1_1: # =>This Inner Loop Header: Depth=1
+; RV32IA-WMO-NEXT:    lr.w.aq a2, (a4)
+; RV32IA-WMO-NEXT:    and a5, a2, a3
+; RV32IA-WMO-NEXT:    bne a5, a1, .LBB1_3
+; RV32IA-WMO-NEXT:  # %bb.2: # in Loop: Header=BB1_1 Depth=1
+; RV32IA-WMO-NEXT:    xor a5, a2, a0
+; RV32IA-WMO-NEXT:    and a5, a5, a3
+; RV32IA-WMO-NEXT:    xor a5, a2, a5
+; RV32IA-WMO-NEXT:    sc.w a5, a5, (a4)
+; RV32IA-WMO-NEXT:    bnez a5, .LBB1_1
+; RV32IA-WMO-NEXT:  .LBB1_3:
+; RV32IA-WMO-NEXT:    ret
+;
+; RV32IA-WMO-ZACAS-LABEL: cmpxchg_i8_acquire_monotonic:
+; RV32IA-WMO-ZACAS:       # %bb.0:
+; RV32IA-WMO-ZACAS-NEXT:    li a3, 255
+; RV32IA-WMO-ZACAS-NEXT:    andi a4, a0, -4
+; RV32IA-WMO-ZACAS-NEXT:    andi a0, a0, 3
+; RV32IA-WMO-ZACAS-NEXT:    zext.b a1, a1
+; RV32IA-WMO-ZACAS-NEXT:    zext.b a2, a2
+; RV32IA-WMO-ZACAS-NEXT:    slli a0, a0, 3
+; RV32IA-WMO-ZACAS-NEXT:    sll a3, a3, a0
+; RV32IA-WMO-ZACAS-NEXT:    sll a1, a1, a0
+; RV32IA-WMO-ZACAS-NEXT:    sll a0, a2, a0
+; RV32IA-WMO-ZACAS-NEXT:  .LBB1_1: # =>This Inner Loop Header: Depth=1
+; RV32IA-WMO-ZACAS-NEXT:    lr.w.aq a2, (a4)
+; RV32IA-WMO-ZACAS-NEXT:    and a5, a2, a3
+; RV32IA-WMO-ZACAS-NEXT:    bne a5, a1, .LBB1_3
+; RV32IA-WMO-ZACAS-NEXT:  # %bb.2: # in Loop: Header=BB1_1 Depth=1
+; RV32IA-WMO-ZACAS-NEXT:    xor a5, a2, a0
+; RV32IA-WMO-ZACAS-NEXT:    and a5, a5, a3
+; RV32IA-WMO-ZACAS-NEXT:    xor a5, a2, a5
+; RV32IA-WMO-ZACAS-NEXT:    sc.w a5, a5, (a4)
+; RV32IA-WMO-ZACAS-NEXT:    bnez a5, .LBB1_1
+; RV32IA-WMO-ZACAS-NEXT:  .LBB1_3:
+; RV32IA-WMO-ZACAS-NEXT:    ret
+;
+; RV32IA-TSO-LABEL: cmpxchg_i8_acquire_monotonic:
+; RV32IA-TSO:       # %bb.0:
+; RV32IA-TSO-NEXT:    li a3, 255
+; RV32IA-TSO-NEXT:    andi a4, a0, -4
+; RV32IA-TSO-NEXT:    andi a0, a0, 3
+; RV32IA-TSO-NEXT:    zext.b a1, a1
+; RV32IA-TSO-NEXT:    zext.b a2, a2
+; RV32IA-TSO-NEXT:    slli a0, a0, 3
+; RV32IA-TSO-NEXT:    sll a3, a3, a0
+; RV32IA-TSO-NEXT:    sll a1, a1, a0
+; RV32IA-TSO-NEXT:    sll a0, a2, a0
+; RV32IA-TSO-NEXT:  .LBB1_1: # =>This Inner Loop Header: Depth=1
+; RV32IA-TSO-NEXT:    lr.w a2, (a4)
+; RV32IA-TSO-NEXT:    and a5, a2, a3
+; RV32IA-TSO-NEXT:    bne a5, a1, .LBB1_3
+; RV32IA-TSO-NEXT:  # %bb.2: # in Loop: Header=BB1_1 Depth=1
+; RV32IA-TSO-NEXT:    xor a5, a2, a0
+; RV32IA-TSO-NEXT:    and a5, a5, a3
+; RV32IA-TSO-NEXT:    xor a5, a2, a5
+; RV32IA-TSO-NEXT:    sc.w a5, a5, (a4)
+; RV32IA-TSO-NEXT:    bnez a5, .LBB1_1
+; RV32IA-TSO-NEXT:  .LBB1_3:
+; RV32IA-TSO-NEXT:    ret
+;
+; RV32IA-TSO-ZACAS-LABEL: cmpxchg_i8_acquire_monotonic:
+; RV32IA-TSO-ZACAS:       # %bb.0:
+; RV32IA-TSO-ZACAS-NEXT:    li a3, 255
+; RV32IA-TSO-ZACAS-NEXT:    andi a4, a0, -4
+; RV32IA-TSO-ZACAS-NEXT:    andi a0, a0, 3
+; RV32IA-TSO-ZACAS-NEXT:    zext.b a1, a1
+; RV32IA-TSO-ZACAS-NEXT:    zext.b a2, a2
+; RV32IA-TSO-ZACAS-NEXT:    slli a0, a0, 3
+; RV32IA-TSO-ZACAS-NEXT:    sll a3, a3, a0
+; RV32IA-TSO-ZACAS-NEXT:    sll a1, a1, a0
+; RV32IA-TSO-ZACAS-NEXT:    sll a0, a2, a0
+; RV32IA-TSO-ZACAS-NEXT:  .LBB1_1: # =>This Inner Loop Header: Depth=1
+; RV32IA-TSO-ZACAS-NEXT:    lr.w a2, (a4)
+; RV32IA-TSO-ZACAS-NEXT:    and a5, a2, a3
+; RV32IA-TSO-ZACAS-NEXT:    bne a5, a1, .LBB1_3
+; RV32IA-TSO-ZACAS-NEXT:  # %bb.2: # in Loop: Header=BB1_1 Depth=1
+; RV32IA-TSO-ZACAS-NEXT:    xor a5, a2, a0
+; RV32IA-TSO-ZACAS-NEXT:    and a5, a5, a3
+; RV32IA-TSO-ZACAS-NEXT:    xor a5, a2, a5
+; RV32IA-TSO-ZACAS-NEXT:    sc.w a5, a5, (a4)
+; RV32IA-TSO-ZACAS-NEXT:    bnez a5, .LBB1_1
+; RV32IA-TSO-ZACAS-NEXT:  .LBB1_3:
+; RV32IA-TSO-ZACAS-NEXT:    ret
+;
+; RV64I-LABEL: cmpxchg_i8_acquire_monotonic:
+; RV64I:       # %bb.0:
+; RV64I-NEXT:    addi sp, sp, -16
+; RV64I-NEXT:    sd ra, 8(sp) # 8-byte Folded Spill
+; RV64I-NEXT:    sb a1, 7(sp)
+; RV64I-NEXT:    addi a1, sp, 7
+; RV64I-NEXT:    li a3, 2
+; RV64I-NEXT:    li a4, 0
+; RV64I-NEXT:    call __atomic_compare_exchange_1
+; RV64I-NEXT:    ld ra, 8(sp) # 8-byte Folded Reload
+; RV64I-NEXT:    addi sp, sp, 16
+; RV64I-NEXT:    ret
+;
+; RV64IA-WMO-LABEL: cmpxchg_i8_acquire_monotonic:
+; RV64IA-WMO:       # %bb.0:
+; RV64IA-WMO-NEXT:    li a3, 255
+; RV64IA-WMO-NEXT:    andi a4, a0, -4
+; RV64IA-WMO-NEXT:    andi a0, a0, 3
+; RV64IA-WMO-NEXT:    zext.b a1, a1
+; RV64IA-WMO-NEXT:    zext.b a2, a2
+; RV64IA-WMO-NEXT:    slli a0, a0, 3
+; RV64IA-WMO-NEXT:    sllw a3, a3, a0
+; RV64IA-WMO-NEXT:    sllw a1, a1, a0
+; RV64IA-WMO-NEXT:    sllw a0, a2, a0
+; RV64IA-WMO-NEXT:  .LBB1_1: # =>This Inner Loop Header: Depth=1
+; RV64IA-WMO-NEXT:    lr.w.aq a2, (a4)
+; RV64IA-WMO-NEXT:    and a5, a2, a3
+; RV64IA-WMO-NEXT:    bne a5, a1, .LBB1_3
+; RV64IA-WMO-NEXT:  # %bb.2: # in Loop: Header=BB1_1 Depth=1
+; RV64IA-WMO-NEXT:    xor a5, a2, a0
+; RV64IA-WMO-NEXT:    and a5, a5, a3
+; RV64IA-WMO-NEXT:    xor a5, a2, a5
+; RV64IA-WMO-NEXT:    sc.w a5, a5, (a4)
+; RV64IA-WMO-NEXT:    bnez a5, .LBB1_1
+; RV64IA-WMO-NEXT:  .LBB1_3:
+; RV64IA-WMO-NEXT:    ret
+;
+; RV64IA-WMO-ZACAS-LABEL: cmpxchg_i8_acquire_monotonic:
+; RV64IA-WMO-ZACAS:       # %bb.0:
+; RV64IA-WMO-ZACAS-NEXT:    li a3, 255
+; RV64IA-WMO-ZACAS-NEXT:    andi a4, a0, -4
+; RV64IA-WMO-ZACAS-NEXT:    andi a0, a0, 3
+; RV64IA-WMO-ZACAS-NEXT:    zext.b a1, a1
+; RV64IA-WMO-ZACAS-NEXT:    zext.b a2, a2
+; RV64IA-WMO-ZACAS-NEXT:    slli a0, a0, 3
+; RV64IA-WMO-ZACAS-NEXT:    sllw a3, a3, a0
+; RV64IA-WMO-ZACAS-NEXT:    sllw a1, a1, a0
+; RV64IA-WMO-ZACAS-NEXT:    sllw a0, a2, a0
+; RV64IA-WMO-ZACAS-NEXT:  .LBB1_1: # =>This Inner Loop Header: Depth=1
+; RV64IA-WMO-ZACAS-NEXT:    lr.w.aq a2, (a4)
+; RV64IA-WMO-ZACAS-NEXT:    and a5, a2, a3
+; RV64IA-WMO-ZACAS-NEXT:    bne a5, a1, .LBB1_3
+; RV64IA-WMO-ZACAS-NEXT:  # %bb.2: # in Loop: Header=BB1_1 Depth=1
+; RV64IA-WMO-ZACAS-NEXT:    xor a5, a2, a0
+; RV64IA-WMO-ZACAS-NEXT:    and a5, a5, a3
+; RV64IA-WMO-ZACAS-NEXT:    xor a5, a2, a5
+; RV64IA-WMO-ZACAS-NEXT:    sc.w a5, a5, (a4)
+; RV64IA-WMO-ZACAS-NEXT:    bnez a5, .LBB1_1
+; RV64IA-WMO-ZACAS-NEXT:  .LBB1_3:
+; RV64IA-WMO-ZACAS-NEXT:    ret
+;
+; RV64IA-WMO-ZABHA-LABEL: cmpxchg_i8_acquire_monotonic:
+; RV64IA-WMO-ZABHA:       # %bb.0:
+; RV64IA-WMO-ZABHA-NEXT:    amocas.b.aq a1, a2, (a0)
+; RV64IA-WMO-ZABHA-NEXT:    ret
+;
+; RV64IA-TSO-LABEL: cmpxchg_i8_acquire_monotonic:
+; RV64IA-TSO:       # %bb.0:
+; RV64IA-TSO-NEXT:    li a3, 255
+; RV64IA-TSO-NEXT:    andi a4, a0, -4
+; RV64IA-TSO-NEXT:    andi a0, a0, 3
+; RV64IA-TSO-NEXT:    zext.b a1, a1
+; RV64IA-TSO-NEXT:    zext.b a2, a2
+; RV64IA-TSO-NEXT:    slli a0, a0, 3
+; RV64IA-TSO-NEXT:    sllw a3, a3, a0
+; RV64IA-TSO-NEXT:    sllw a1, a1, a0
+; RV64IA-TSO-NEXT:    sllw a0, a2, a0
+; RV64IA-TSO-NEXT:  .LBB1_1: # =>This Inner Loop Header: Depth=1
+; RV64IA-TSO-NEXT:    lr.w a2, (a4)
+; RV64IA-TSO-NEXT:    and a5, a2, a3
+; RV64IA-TSO-NEXT:    bne a5, a1, .LBB1_3
+; RV64IA-TSO-NEXT:  # %bb.2: # in Loop: Header=BB1_1 Depth=1
+; RV64IA-TSO-NEXT:    xor a5, a2, a0
+; RV64IA-TSO-NEXT:    and a5, a5, a3
+; RV64IA-TSO-NEXT:    xor a5, a2, a5
+; RV64IA-TSO-NEXT:    sc.w a5, a5, (a4)
+; RV64IA-TSO-NEXT:    bnez a5, .LBB1_1
+; RV64IA-TSO-NEXT:  .LBB1_3:
+; RV64IA-TSO-NEXT:    ret
+;
+; RV64IA-TSO-ZACAS-LABEL: cmpxchg_i8_acquire_monotonic:
+; RV64IA-TSO-ZACAS:       # %bb.0:
+; RV64IA-TSO-ZACAS-NEXT:    li a3, 255
+; RV64IA-TSO-ZACAS-NEXT:    andi a4, a0, -4
+; RV64IA-TSO-ZACAS-NEXT:    andi a0, a0, 3
+; RV64IA-TSO-ZACAS-NEXT:    zext.b a1, a1
+; RV64IA-TSO-ZACAS-NEXT:    zext.b a2, a2
+; RV64IA-TSO-ZACAS-NEXT:    slli a0, a0, 3
+; RV64IA-TSO-ZACAS-NEXT:    sllw a3, a3, a0
+; RV64IA-TSO-ZACAS-NEXT:    sllw a1, a1, a0
+; RV64IA-TSO-ZACAS-NEXT:    sllw a0, a2, a0
+; RV64IA-TSO-ZACAS-NEXT:  .LBB1_1: # =>This Inner Loop Header: Depth=1
+; RV64IA-TSO-ZACAS-NEXT:    lr.w a2, (a4)
+; RV64IA-TSO-ZACAS-NEXT:    and a5, a2, a3
+; RV64IA-TSO-ZACAS-NEXT:    bne a5, a1, .LBB1_3
+; RV64IA-TSO-ZACAS-NEXT:  # %bb.2: # in Loop: Header=BB1_1 Depth=1
+; RV64IA-TSO-ZACAS-NEXT:    xor a5, a2, a0
+; RV64IA-TSO-ZACAS-NEXT:    and a5, a5, a3
+; RV64IA-TSO-ZACAS-NEXT:    xor a5, a2, a5
+; RV64IA-TSO-ZACAS-NEXT:    sc.w a5, a5, (a4)
+; RV64IA-TSO-ZACAS-NEXT:    bnez a5, .LBB1_1
+; RV64IA-TSO-ZACAS-NEXT:  .LBB1_3:
+; RV64IA-TSO-ZACAS-NEXT:    ret
+;
+; RV64IA-TSO-ZABHA-LABEL: cmpxchg_i8_acquire_monotonic:
+; RV64IA-TSO-ZABHA:       # %bb.0:
+; RV64IA-TSO-ZABHA-NEXT:    amocas.b a1, a2, (a0)
+; RV64IA-TSO-ZABHA-NEXT:    ret
+  %res = cmpxchg ptr %ptr, i8 %cmp, i8 %val acquire monotonic
+  ret void
+}
+
+define void @cmpxchg_i8_acquire_acquire(ptr %ptr, i8 %cmp, i8 %val) nounwind {
+; RV32I-LABEL: cmpxchg_i8_acquire_acquire:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    addi sp, sp, -16
+; RV32I-NEXT:    sw ra, 12(sp) # 4-byte Folded Spill
+; RV32I-NEXT:    sb a1, 11(sp)
+; RV32I-NEXT:    addi a1, sp, 11
+; RV32I-NEXT:    li a3, 2
+; RV32I-NEXT:    li a4, 2
+; RV32I-NEXT:    call __atomic_compare_exchange_1
+; RV32I-NEXT:    lw ra, 12(sp) # 4-byte Folded Reload
+; RV32I-NEXT:    addi sp, sp, 16
+; RV32I-NEXT:    ret
+;
+; RV32IA-WMO-LABEL: cmpxchg_i8_acquire_acquire:
+; RV32IA-WMO:       # %bb.0:
+; RV32IA-WMO-NEXT:    li a3, 255
+; RV32IA-WMO-NEXT:    andi a4, a0, -4
+; RV32IA-WMO-NEXT:    andi a0, a0, 3
+; RV32IA-WMO-NEXT:    zext.b a1, a1
+; RV32IA-WMO-NEXT:    zext.b a2, a2
+; RV32IA-WMO-NEXT:    slli a0, a0, 3
+; RV32IA-WMO-NEXT:    sll a3, a3, a0
+; RV32IA-WMO-NEXT:    sll a1, a1, a0
+; RV32IA-WMO-NEXT:    sll a0, a2, a0
+; RV32IA-WMO-NEXT:  .LBB2_1: # =>This Inner Loop Header: Depth=1
+; RV32IA-WMO-NEXT:    lr.w.aq a2, (a4)
+; RV32IA-WMO-NEXT:    and a5, a2, a3
+; RV32IA-WMO-NEXT:    bne a5, a1, .LBB2_3
+; RV32IA-WMO-NEXT:  # %bb.2: # in Loop: Header=BB2_1 Depth=1
+; RV32IA-WMO-NEXT:    xor a5, a2, a0
+; RV32IA-WMO-NEXT:    and a5, a5, a3
+; RV32IA-WMO-NEXT:    xor a5, a2, a5
+; RV32IA-WMO-NEXT:    sc.w a5, a5, (a4)
+; RV32IA-WMO-NEXT:    bnez a5, .LBB2_1
+; RV32IA-WMO-NEXT:  .LBB2_3:
+; RV32IA-WMO-NEXT:    ret
+;
+; RV32IA-WMO-ZACAS-LABEL: cmpxchg_i8_acquire_acquire:
+; RV32IA-WMO-ZACAS:       # %bb.0:
+; RV32IA-WMO-ZACAS-NEXT:    li a3, 255
+; RV32IA-WMO-ZACAS-NEXT:    andi a4, a0, -4
+; RV32IA-WMO-ZACAS-NEXT:    andi a0, a0, 3
+; RV32IA-WMO-ZACAS-NEXT:    zext.b a1, a1
+; RV32IA-WMO-ZACAS-NEXT:    zext.b a2, a2
+; RV32IA-WMO-ZACAS-NEXT:    slli a0, a0, 3
+; RV32IA-WMO-ZACAS-NEXT:    sll a3, a3, a0
+; RV32IA-WMO-ZACAS-NEXT:    sll a1, a1, a0
+; RV32IA-WMO-ZACAS-NEXT:    sll a0, a2, a0
+; RV32IA-WMO-ZACAS-NEXT:  .LBB2_1: # =>This Inner Loop Header: Depth=1
+; RV32IA-WMO-ZACAS-NEXT:    lr.w.aq a2, (a4)
+; RV32IA-WMO-ZACAS-NEXT:    and a5, a2, a3
+; RV32IA-WMO-ZACAS-NEXT:    bne a5, a1, .LBB2_3
+; RV32IA-WMO-ZACAS-NEXT:  # %bb.2: # in Loop: Header=BB2_1 Depth=1
+; RV32IA-WMO-ZACAS-NEXT:    xor a5, a2, a0
+; RV32IA-WMO-ZACAS-NEXT:    and a5, a5, a3
+; RV32IA-WMO-ZACAS-NEXT:    xor a5, a2, a5
+; RV32IA-WMO-ZACAS-NEXT:    sc.w a5, a5, (a4)
+; RV32IA-WMO-ZACAS-NEXT:    bnez a5, .LBB2_1
+; RV32IA-WMO-ZACAS-NEXT:  .LBB2_3:
+; RV32IA-WMO-ZACAS-NEXT:    ret
+;
+; RV32IA-TSO-LABEL: cmpxchg_i8_acquire_acquire:
+; RV32IA-TSO:       # %bb.0:
+; RV32IA-TSO-NEXT:    li a3, 255
+; RV32IA-TSO-NEXT:    andi a4, a0, -4
+; RV32IA-TSO-NEXT:    andi a0, a0, 3
+; RV32IA-TSO-NEXT:    zext.b a1, a1
+; RV32IA-TSO-NEXT:    zext.b a2, a2
+; RV32IA-TSO-NEXT:    slli a0, a0, 3
+; RV32IA-TSO-NEXT:    sll a3, a3, a0
+; RV32IA-TSO-NEXT:    sll a1, a1, a0
+; RV32IA-TSO-NEXT:    sll a0, a2, a0
+; RV32IA...
[truncated]

@@ -0,0 +1,5910 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc -global-isel -mtriple=riscv32 -verify-machineinstrs < %s \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need all the -verify-machineinstrs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, done.

@ReVe1uv ReVe1uv requested a review from arsenm September 10, 2025 01:32
Copy link
Collaborator

@topperc topperc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ReVe1uv
Copy link
Contributor Author

ReVe1uv commented Sep 11, 2025

Hi @topperc ! Could you please help to merge this change? Thanks!

@arsenm arsenm merged commit fa63642 into llvm:main Sep 11, 2025
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants