[RISCV][ISEL] Lowering to load-acquire/store-release for RISCV Zalasr #82914

Open · wants to merge 1 commit into main from dev/brs/zalasrlower
Conversation

mehnadnerd (Contributor)

Lowering to load-acquire/store-release for RISCV Zalasr.

This currently uses the psABI lowerings for WMO load-acquire/store-release (which are identical to A.7). These are incompatible with the A.6 lowerings currently used by LLVM. That should be OK for now, since Zalasr is gated behind the experimental-extensions flag, but it needs to be fixed before the extension moves out of experimental.

For TSO, it uses the standard Ztso mappings, except that seq_cst loads/stores are lowered to load-acquire/store-release; I had Andrea review that.
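
To make the intended WMO mapping concrete, here is a minimal sketch in LLVM IR (not part of the patch; the function names are hypothetical, and the expected instructions are taken from the test diff below):

; Sketch of the psABI/A.7 WMO mapping this patch implements.
define i32 @load_acquire(ptr %p) {
  ; with +a,+experimental-zalasr this should select "lw.aq a0, (a0)"
  ; rather than the A.6 sequence of "lw" followed by "fence r, rw"
  %v = load atomic i32, ptr %p acquire, align 4
  ret i32 %v
}

define void @store_release(ptr %p, i32 %v) {
  ; with +a,+experimental-zalasr this should select "sw.rl a1, (a0)"
  ; rather than the A.6 sequence of "fence rw, w" followed by "sw"
  store atomic i32 %v, ptr %p release, align 4
  ret void
}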

llvmbot (Collaborator) commented Feb 25, 2024

@llvm/pr-subscribers-backend-risc-v

Author: Brendan Sweeney (mehnadnerd)

Changes

Lowering to load-acquire/store-release for RISCV Zalasr.

This currently uses the psABI lowerings for WMO load-acquire/store-release (which are identical to A.7). These are incompatible with the A.6 lowerings currently used by LLVM. That should be OK for now, since Zalasr is gated behind the experimental-extensions flag, but it needs to be fixed before the extension moves out of experimental.

For TSO, it uses the standard Ztso mappings, except that seq_cst loads/stores are lowered to load-acquire/store-release; I had Andrea review that.


Patch is 24.80 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/82914.diff

5 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+24)
  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.h (+2-3)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoA.td (+58-14)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoZalasr.td (+44)
  • (modified) llvm/test/CodeGen/RISCV/atomic-load-store.ll (+218-1)
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 540c2e7476dc18..d1829903a0dc11 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -20832,6 +20832,30 @@ unsigned RISCVTargetLowering::getCustomCtpopCost(EVT VT,
   return isCtpopFast(VT) ? 0 : 1;
 }
 
+bool RISCVTargetLowering::shouldInsertFencesForAtomic(
+    const Instruction *I) const {
+  if (Subtarget.hasStdExtZalasr()) {
+    if (Subtarget.hasStdExtZtso()) {
+      // Zalasr + TSO means that atomic_load_acquire and atomic_store_release
+      //  should be lowered to plain load/store. The easiest way to do this is
+      //  to say we should insert fences for them, and the fence insertion code
+      //  will just not insert any fences
+      auto LI = dyn_cast<LoadInst>(I);
+      auto SI = dyn_cast<StoreInst>(I);
+      if ((LI && (LI->getOrdering() == AtomicOrdering::SequentiallyConsistent))
+       || (SI && (SI->getOrdering() == AtomicOrdering::SequentiallyConsistent))) {
+        // Here, this is a load or store which is seq_cst, and needs a .aq or .rl
+        //  therefore we shouldn't try to insert fences
+        return false;
+      }
+      // Here, we are a TSO inst that isn't a seq_cst load/store
+      return isa<LoadInst>(I) || isa<StoreInst>(I);
+    }
+    return false;
+  }
+  return isa<LoadInst>(I) || isa<StoreInst>(I);
+}
+
 bool RISCVTargetLowering::fallBackToDAGISel(const Instruction &Inst) const {
 
   // GISel support is in progress or complete for these opcodes.
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.h b/llvm/lib/Target/RISCV/RISCVISelLowering.h
index a38463f810270a..76cfe09817c676 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.h
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.h
@@ -650,9 +650,8 @@ class RISCVTargetLowering : public TargetLowering {
 
   bool preferZeroCompareBranch() const override { return true; }
 
-  bool shouldInsertFencesForAtomic(const Instruction *I) const override {
-    return isa<LoadInst>(I) || isa<StoreInst>(I);
-  }
+  bool shouldInsertFencesForAtomic(const Instruction *I) const override;
+
   Instruction *emitLeadingFence(IRBuilderBase &Builder, Instruction *Inst,
                                 AtomicOrdering Ord) const override;
   Instruction *emitTrailingFence(IRBuilderBase &Builder, Instruction *Inst,
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoA.td b/llvm/lib/Target/RISCV/RISCVInstrInfoA.td
index 36842ceb49bfb8..13b219685297c2 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoA.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoA.td
@@ -105,22 +105,66 @@ defm AMOMAXU_D  : AMO_rr_aq_rl<0b11100, 0b011, "amomaxu.d">,
 // Pseudo-instructions and codegen patterns
 //===----------------------------------------------------------------------===//
 
+// An atomic load operation that does not need either acquire or release
+// semantics.
+class relaxed_load<PatFrags base>
+  : PatFrag<(ops node:$ptr), (base node:$ptr)> {
+  let IsAtomic = 1;
+  let IsAtomicOrderingAcquireOrStronger = 0;
+}
+
+// An atomic load operation that actually needs acquire semantics.
+class acquiring_load<PatFrags base>
+  : PatFrag<(ops node:$ptr), (base node:$ptr)> {
+  let IsAtomic = 1;
+  let IsAtomicOrderingAcquire = 1;
+}
+
+// An atomic load operation that needs sequential consistency.
+class seq_cst_load<PatFrags base>
+  : PatFrag<(ops node:$ptr), (base node:$ptr)> {
+  let IsAtomic = 1;
+  let IsAtomicOrderingSequentiallyConsistent = 1;
+}
+
+// An atomic store operation that does not need either acquire or release
+// semantics.
+class relaxed_store<PatFrag base>
+  : PatFrag<(ops node:$val, node:$ptr), (base node:$val, node:$ptr)> {
+  let IsAtomic = 1;
+  let IsAtomicOrderingReleaseOrStronger = 0;
+}
+
+// A store operation that actually needs release semantics.
+class releasing_store<PatFrag base>
+  : PatFrag<(ops node:$val, node:$ptr), (base node:$val, node:$ptr)> {
+  let IsAtomic = 1;
+  let IsAtomicOrderingRelease = 1;
+}
+
+// A store operation that actually needs sequential consistency.
+class seq_cst_store<PatFrag base>
+  : PatFrag<(ops node:$val, node:$ptr), (base node:$val, node:$ptr)> {
+  let IsAtomic = 1;
+  let IsAtomicOrderingSequentiallyConsistent = 1;
+}
+
 // Atomic load/store are available under both +a and +force-atomics.
 // Fences will be inserted for atomic load/stores according to the logic in
 // RISCVTargetLowering::{emitLeadingFence,emitTrailingFence}.
 let Predicates = [HasAtomicLdSt] in {
-  def : LdPat<atomic_load_8,  LB>;
-  def : LdPat<atomic_load_16, LH>;
-  def : LdPat<atomic_load_32, LW>;
+  def : LdPat<relaxed_load<atomic_load_8>,  LB>;
+  def : LdPat<relaxed_load<atomic_load_16>, LH>;
+  def : LdPat<relaxed_load<atomic_load_32>, LW>;
 
-  def : StPat<atomic_store_8,  SB, GPR, XLenVT>;
-  def : StPat<atomic_store_16, SH, GPR, XLenVT>;
-  def : StPat<atomic_store_32, SW, GPR, XLenVT>;
+  def : StPat<relaxed_store<atomic_store_8>,  SB, GPR, XLenVT>;
+  def : StPat<relaxed_store<atomic_store_16>, SH, GPR, XLenVT>;
+  def : StPat<relaxed_store<atomic_store_32>, SW, GPR, XLenVT>;
 }
 
 let Predicates = [HasAtomicLdSt, IsRV64] in {
-  def : LdPat<atomic_load_64, LD, i64>;
-  def : StPat<atomic_store_64, SD, GPR, i64>;
+  def : LdPat<relaxed_load<atomic_load_64>, LD, i64>;
+  def : StPat<relaxed_store<atomic_store_64>, SD, GPR, i64>;
 }
 
 /// AMOs
@@ -423,12 +467,12 @@ let Predicates = [HasStdExtA, IsRV64] in
 defm : PseudoCmpXchgPat<"atomic_cmp_swap_32", PseudoCmpXchg32, i32>;
 
 let Predicates = [HasAtomicLdSt] in {
-  def : LdPat<atomic_load_8,  LB, i32>;
-  def : LdPat<atomic_load_16, LH, i32>;
-  def : LdPat<atomic_load_32, LW, i32>;
+  def : LdPat<relaxed_load<atomic_load_8>,  LB, i32>;
+  def : LdPat<relaxed_load<atomic_load_16>, LH, i32>;
+  def : LdPat<relaxed_load<atomic_load_32>, LW, i32>;
 
-  def : StPat<atomic_store_8,  SB, GPR, i32>;
-  def : StPat<atomic_store_16, SH, GPR, i32>;
-  def : StPat<atomic_store_32, SW, GPR, i32>;
+  def : StPat<relaxed_store<atomic_store_8>,  SB, GPR, i32>;
+  def : StPat<relaxed_store<atomic_store_16>, SH, GPR, i32>;
+  def : StPat<relaxed_store<atomic_store_32>, SW, GPR, i32>;
 }
 
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoZalasr.td b/llvm/lib/Target/RISCV/RISCVInstrInfoZalasr.td
index fd5bdea95c7223..d1aa21af92d2e2 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoZalasr.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoZalasr.td
@@ -57,3 +57,47 @@ let Predicates = [HasStdExtZalasr, IsRV64] in {
 defm LD : LAQ_r_aq_rl<0b011, "ld">;
 defm SD : SRL_r_aq_rl<0b011, "sd">;
 } // Predicates = [HasStdExtZalasr, IsRV64]
+
+//===----------------------------------------------------------------------===//
+// Pseudo-instructions and codegen patterns
+//===----------------------------------------------------------------------===//
+
+class PatLAQ<SDPatternOperator OpNode, RVInst Inst, ValueType vt = XLenVT>
+    : Pat<(vt (OpNode (vt GPRMemZeroOffset:$rs1))), (Inst GPRMemZeroOffset:$rs1)>;
+
+class PatSRL<SDPatternOperator OpNode, RVInst Inst, ValueType vt = XLenVT>
+    : Pat<(OpNode (vt GPR:$rs2), (vt GPRMemZeroOffset:$rs1)),
+          (Inst GPRMemZeroOffset:$rs1, GPR:$rs2)>; // n.b. this switches order of arguments
+                                                   // to deal with the fact that SRL has addr, data
+                                                   // while atomic_store has data, addr
+
+let Predicates = [HasStdExtZalasr] in {
+  def : PatLAQ<acquiring_load<atomic_load_8>, LB_AQ>;
+  def : PatLAQ<seq_cst_load<atomic_load_8>, LB_AQ>; // the sequentially consistent loads use
+                                                    // .aq instead of .aqrl to match the psABI/A.7
+
+  def : PatLAQ<acquiring_load<atomic_load_16>, LH_AQ>;
+  def : PatLAQ<seq_cst_load<atomic_load_16>, LH_AQ>;
+
+  def : PatLAQ<acquiring_load<atomic_load_32>, LW_AQ>;
+  def : PatLAQ<seq_cst_load<atomic_load_32>, LW_AQ>;
+
+  def : PatSRL<releasing_store<atomic_store_8>, SB_RL>;
+  def : PatSRL<seq_cst_store<atomic_store_8>, SB_RL>; // the sequentially consistent stores use
+                                                      // .rl instead of .aqrl to match the psABI/A.7
+
+  def : PatSRL<releasing_store<atomic_store_16>, SH_RL>;
+  def : PatSRL<seq_cst_store<atomic_store_16>, SH_RL>;
+
+  def : PatSRL<releasing_store<atomic_store_32>, SW_RL>;
+  def : PatSRL<seq_cst_store<atomic_store_32>, SW_RL>;
+} // Predicates HasStdExtZalasr
+
+let Predicates = [HasStdExtZalasr, IsRV64] in {
+  def : PatLAQ<acquiring_load<atomic_load_64>, LD_AQ>;
+  def : PatLAQ<seq_cst_load<atomic_load_64>, LD_AQ>;
+
+  def : PatSRL<releasing_store<atomic_store_64>, SD_RL>;
+  def : PatSRL<seq_cst_store<atomic_store_64>, SD_RL>;
+} // Predicates HasStdExtZalasr, IsRV64
diff --git a/llvm/test/CodeGen/RISCV/atomic-load-store.ll b/llvm/test/CodeGen/RISCV/atomic-load-store.ll
index 2d1fc21cda89b0..d6e08efc370438 100644
--- a/llvm/test/CodeGen/RISCV/atomic-load-store.ll
+++ b/llvm/test/CodeGen/RISCV/atomic-load-store.ll
@@ -5,13 +5,20 @@
 ; RUN:   | FileCheck -check-prefixes=RV32IA,RV32IA-WMO %s
 ; RUN: llc -mtriple=riscv32 -mattr=+a,+experimental-ztso -verify-machineinstrs < %s \
 ; RUN:   | FileCheck -check-prefixes=RV32IA,RV32IA-TSO %s
+; RUN: llc -mtriple=riscv32 -mattr=+a,+experimental-zalasr -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=RV32IA,RV32IA-ZALASR,RV32IA-ZALASR-WMO %s
+; RUN: llc -mtriple=riscv32 -mattr=+a,+experimental-zalasr,+experimental-ztso -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=RV32IA,RV32IA-ZALASR,RV32IA-ZALASR-TSO %s
 ; RUN: llc -mtriple=riscv64 -verify-machineinstrs < %s \
 ; RUN:   | FileCheck -check-prefix=RV64I %s
 ; RUN: llc -mtriple=riscv64 -mattr=+a -verify-machineinstrs < %s \
 ; RUN:   | FileCheck -check-prefixes=RV64IA,RV64IA-WMO %s
 ; RUN: llc -mtriple=riscv64 -mattr=+a,+experimental-ztso -verify-machineinstrs < %s \
 ; RUN:   | FileCheck -check-prefixes=RV64IA,RV64IA-TSO %s
-
+; RUN: llc -mtriple=riscv64 -mattr=+a,+experimental-zalasr -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=RV64IA,RV64IA-ZALASR,RV64IA-ZALASR-WMO %s
+; RUN: llc -mtriple=riscv64 -mattr=+a,+experimental-zalasr,+experimental-ztso -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=RV64IA,RV64IA-ZALASR,RV64IA-ZALASR-TSO %s
 
 ; RUN: llc -mtriple=riscv32 -mattr=+a,+seq-cst-trailing-fence -verify-machineinstrs < %s \
 ; RUN:   | FileCheck -check-prefixes=RV32IA,RV32IA-WMO-TRAILING-FENCE %s
@@ -114,6 +121,16 @@ define i8 @atomic_load_i8_acquire(ptr %a) nounwind {
 ; RV32IA-TSO-NEXT:    lb a0, 0(a0)
 ; RV32IA-TSO-NEXT:    ret
 ;
+; RV32IA-ZALASR-WMO-LABEL: atomic_load_i8_acquire:
+; RV32IA-ZALASR-WMO:       # %bb.0:
+; RV32IA-ZALASR-WMO-NEXT:    lb.aq a0, (a0)
+; RV32IA-ZALASR-WMO-NEXT:    ret
+;
+; RV32IA-ZALASR-TSO-LABEL: atomic_load_i8_acquire:
+; RV32IA-ZALASR-TSO:       # %bb.0:
+; RV32IA-ZALASR-TSO-NEXT:    lb a0, 0(a0)
+; RV32IA-ZALASR-TSO-NEXT:    ret
+;
 ; RV64I-LABEL: atomic_load_i8_acquire:
 ; RV64I:       # %bb.0:
 ; RV64I-NEXT:    addi sp, sp, -16
@@ -135,6 +152,16 @@ define i8 @atomic_load_i8_acquire(ptr %a) nounwind {
 ; RV64IA-TSO-NEXT:    lb a0, 0(a0)
 ; RV64IA-TSO-NEXT:    ret
 ;
+; RV64IA-ZALASR-WMO-LABEL: atomic_load_i8_acquire:
+; RV64IA-ZALASR-WMO:       # %bb.0:
+; RV64IA-ZALASR-WMO-NEXT:    lb.aq a0, (a0)
+; RV64IA-ZALASR-WMO-NEXT:    ret
+;
+; RV64IA-ZALASR-TSO-LABEL: atomic_load_i8_acquire:
+; RV64IA-ZALASR-TSO:       # %bb.0:
+; RV64IA-ZALASR-TSO-NEXT:    lb a0, 0(a0)
+; RV64IA-ZALASR-TSO-NEXT:    ret
+;
 ; RV32IA-WMO-TRAILING-FENCE-LABEL: atomic_load_i8_acquire:
 ; RV32IA-WMO-TRAILING-FENCE:       # %bb.0:
 ; RV32IA-WMO-TRAILING-FENCE-NEXT:    lb a0, 0(a0)
@@ -184,6 +211,11 @@ define i8 @atomic_load_i8_seq_cst(ptr %a) nounwind {
 ; RV32IA-TSO-NEXT:    lb a0, 0(a0)
 ; RV32IA-TSO-NEXT:    ret
 ;
+; RV32IA-ZALASR-LABEL: atomic_load_i8_seq_cst:
+; RV32IA-ZALASR:       # %bb.0:
+; RV32IA-ZALASR-NEXT:    lb.aq a0, (a0)
+; RV32IA-ZALASR-NEXT:    ret
+;
 ; RV64I-LABEL: atomic_load_i8_seq_cst:
 ; RV64I:       # %bb.0:
 ; RV64I-NEXT:    addi sp, sp, -16
@@ -207,6 +239,11 @@ define i8 @atomic_load_i8_seq_cst(ptr %a) nounwind {
 ; RV64IA-TSO-NEXT:    lb a0, 0(a0)
 ; RV64IA-TSO-NEXT:    ret
 ;
+; RV64IA-ZALASR-LABEL: atomic_load_i8_seq_cst:
+; RV64IA-ZALASR:       # %bb.0:
+; RV64IA-ZALASR-NEXT:    lb.aq a0, (a0)
+; RV64IA-ZALASR-NEXT:    ret
+;
 ; RV32IA-WMO-TRAILING-FENCE-LABEL: atomic_load_i8_seq_cst:
 ; RV32IA-WMO-TRAILING-FENCE:       # %bb.0:
 ; RV32IA-WMO-TRAILING-FENCE-NEXT:    fence rw, rw
@@ -326,6 +363,16 @@ define i16 @atomic_load_i16_acquire(ptr %a) nounwind {
 ; RV32IA-TSO-NEXT:    lh a0, 0(a0)
 ; RV32IA-TSO-NEXT:    ret
 ;
+; RV32IA-ZALASR-WMO-LABEL: atomic_load_i16_acquire:
+; RV32IA-ZALASR-WMO:       # %bb.0:
+; RV32IA-ZALASR-WMO-NEXT:    lh.aq a0, (a0)
+; RV32IA-ZALASR-WMO-NEXT:    ret
+;
+; RV32IA-ZALASR-TSO-LABEL: atomic_load_i16_acquire:
+; RV32IA-ZALASR-TSO:       # %bb.0:
+; RV32IA-ZALASR-TSO-NEXT:    lh a0, 0(a0)
+; RV32IA-ZALASR-TSO-NEXT:    ret
+;
 ; RV64I-LABEL: atomic_load_i16_acquire:
 ; RV64I:       # %bb.0:
 ; RV64I-NEXT:    addi sp, sp, -16
@@ -347,6 +394,16 @@ define i16 @atomic_load_i16_acquire(ptr %a) nounwind {
 ; RV64IA-TSO-NEXT:    lh a0, 0(a0)
 ; RV64IA-TSO-NEXT:    ret
 ;
+; RV64IA-ZALASR-WMO-LABEL: atomic_load_i16_acquire:
+; RV64IA-ZALASR-WMO:       # %bb.0:
+; RV64IA-ZALASR-WMO-NEXT:    lh.aq a0, (a0)
+; RV64IA-ZALASR-WMO-NEXT:    ret
+;
+; RV64IA-ZALASR-TSO-LABEL: atomic_load_i16_acquire:
+; RV64IA-ZALASR-TSO:       # %bb.0:
+; RV64IA-ZALASR-TSO-NEXT:    lh a0, 0(a0)
+; RV64IA-ZALASR-TSO-NEXT:    ret
+;
 ; RV32IA-WMO-TRAILING-FENCE-LABEL: atomic_load_i16_acquire:
 ; RV32IA-WMO-TRAILING-FENCE:       # %bb.0:
 ; RV32IA-WMO-TRAILING-FENCE-NEXT:    lh a0, 0(a0)
@@ -396,6 +453,11 @@ define i16 @atomic_load_i16_seq_cst(ptr %a) nounwind {
 ; RV32IA-TSO-NEXT:    lh a0, 0(a0)
 ; RV32IA-TSO-NEXT:    ret
 ;
+; RV32IA-ZALASR-LABEL: atomic_load_i16_seq_cst:
+; RV32IA-ZALASR:       # %bb.0:
+; RV32IA-ZALASR-NEXT:    lh.aq a0, (a0)
+; RV32IA-ZALASR-NEXT:    ret
+;
 ; RV64I-LABEL: atomic_load_i16_seq_cst:
 ; RV64I:       # %bb.0:
 ; RV64I-NEXT:    addi sp, sp, -16
@@ -419,6 +481,11 @@ define i16 @atomic_load_i16_seq_cst(ptr %a) nounwind {
 ; RV64IA-TSO-NEXT:    lh a0, 0(a0)
 ; RV64IA-TSO-NEXT:    ret
 ;
+; RV64IA-ZALASR-LABEL: atomic_load_i16_seq_cst:
+; RV64IA-ZALASR:       # %bb.0:
+; RV64IA-ZALASR-NEXT:    lh.aq a0, (a0)
+; RV64IA-ZALASR-NEXT:    ret
+;
 ; RV32IA-WMO-TRAILING-FENCE-LABEL: atomic_load_i16_seq_cst:
 ; RV32IA-WMO-TRAILING-FENCE:       # %bb.0:
 ; RV32IA-WMO-TRAILING-FENCE-NEXT:    fence rw, rw
@@ -538,6 +605,16 @@ define i32 @atomic_load_i32_acquire(ptr %a) nounwind {
 ; RV32IA-TSO-NEXT:    lw a0, 0(a0)
 ; RV32IA-TSO-NEXT:    ret
 ;
+; RV32IA-ZALASR-WMO-LABEL: atomic_load_i32_acquire:
+; RV32IA-ZALASR-WMO:       # %bb.0:
+; RV32IA-ZALASR-WMO-NEXT:    lw.aq a0, (a0)
+; RV32IA-ZALASR-WMO-NEXT:    ret
+;
+; RV32IA-ZALASR-TSO-LABEL: atomic_load_i32_acquire:
+; RV32IA-ZALASR-TSO:       # %bb.0:
+; RV32IA-ZALASR-TSO-NEXT:    lw a0, 0(a0)
+; RV32IA-ZALASR-TSO-NEXT:    ret
+;
 ; RV64I-LABEL: atomic_load_i32_acquire:
 ; RV64I:       # %bb.0:
 ; RV64I-NEXT:    addi sp, sp, -16
@@ -559,6 +636,16 @@ define i32 @atomic_load_i32_acquire(ptr %a) nounwind {
 ; RV64IA-TSO-NEXT:    lw a0, 0(a0)
 ; RV64IA-TSO-NEXT:    ret
 ;
+; RV64IA-ZALASR-WMO-LABEL: atomic_load_i32_acquire:
+; RV64IA-ZALASR-WMO:       # %bb.0:
+; RV64IA-ZALASR-WMO-NEXT:    lw.aq a0, (a0)
+; RV64IA-ZALASR-WMO-NEXT:    ret
+;
+; RV64IA-ZALASR-TSO-LABEL: atomic_load_i32_acquire:
+; RV64IA-ZALASR-TSO:       # %bb.0:
+; RV64IA-ZALASR-TSO-NEXT:    lw a0, 0(a0)
+; RV64IA-ZALASR-TSO-NEXT:    ret
+;
 ; RV32IA-WMO-TRAILING-FENCE-LABEL: atomic_load_i32_acquire:
 ; RV32IA-WMO-TRAILING-FENCE:       # %bb.0:
 ; RV32IA-WMO-TRAILING-FENCE-NEXT:    lw a0, 0(a0)
@@ -608,6 +695,11 @@ define i32 @atomic_load_i32_seq_cst(ptr %a) nounwind {
 ; RV32IA-TSO-NEXT:    lw a0, 0(a0)
 ; RV32IA-TSO-NEXT:    ret
 ;
+; RV32IA-ZALASR-LABEL: atomic_load_i32_seq_cst:
+; RV32IA-ZALASR:       # %bb.0:
+; RV32IA-ZALASR-NEXT:    lw.aq a0, (a0)
+; RV32IA-ZALASR-NEXT:    ret
+;
 ; RV64I-LABEL: atomic_load_i32_seq_cst:
 ; RV64I:       # %bb.0:
 ; RV64I-NEXT:    addi sp, sp, -16
@@ -631,6 +723,11 @@ define i32 @atomic_load_i32_seq_cst(ptr %a) nounwind {
 ; RV64IA-TSO-NEXT:    lw a0, 0(a0)
 ; RV64IA-TSO-NEXT:    ret
 ;
+; RV64IA-ZALASR-LABEL: atomic_load_i32_seq_cst:
+; RV64IA-ZALASR:       # %bb.0:
+; RV64IA-ZALASR-NEXT:    lw.aq a0, (a0)
+; RV64IA-ZALASR-NEXT:    ret
+;
 ; RV32IA-WMO-TRAILING-FENCE-LABEL: atomic_load_i32_seq_cst:
 ; RV32IA-WMO-TRAILING-FENCE:       # %bb.0:
 ; RV32IA-WMO-TRAILING-FENCE-NEXT:    fence rw, rw
@@ -780,6 +877,16 @@ define i64 @atomic_load_i64_acquire(ptr %a) nounwind {
 ; RV64IA-TSO-NEXT:    ld a0, 0(a0)
 ; RV64IA-TSO-NEXT:    ret
 ;
+; RV64IA-ZALASR-WMO-LABEL: atomic_load_i64_acquire:
+; RV64IA-ZALASR-WMO:       # %bb.0:
+; RV64IA-ZALASR-WMO-NEXT:    ld.aq a0, (a0)
+; RV64IA-ZALASR-WMO-NEXT:    ret
+;
+; RV64IA-ZALASR-TSO-LABEL: atomic_load_i64_acquire:
+; RV64IA-ZALASR-TSO:       # %bb.0:
+; RV64IA-ZALASR-TSO-NEXT:    ld a0, 0(a0)
+; RV64IA-ZALASR-TSO-NEXT:    ret
+;
 ; RV64IA-WMO-TRAILING-FENCE-LABEL: atomic_load_i64_acquire:
 ; RV64IA-WMO-TRAILING-FENCE:       # %bb.0:
 ; RV64IA-WMO-TRAILING-FENCE-NEXT:    ld a0, 0(a0)
@@ -838,6 +945,11 @@ define i64 @atomic_load_i64_seq_cst(ptr %a) nounwind {
 ; RV64IA-TSO-NEXT:    ld a0, 0(a0)
 ; RV64IA-TSO-NEXT:    ret
 ;
+; RV64IA-ZALASR-LABEL: atomic_load_i64_seq_cst:
+; RV64IA-ZALASR:       # %bb.0:
+; RV64IA-ZALASR-NEXT:    ld.aq a0, (a0)
+; RV64IA-ZALASR-NEXT:    ret
+;
 ; RV64IA-WMO-TRAILING-FENCE-LABEL: atomic_load_i64_seq_cst:
 ; RV64IA-WMO-TRAILING-FENCE:       # %bb.0:
 ; RV64IA-WMO-TRAILING-FENCE-NEXT:    fence rw, rw
@@ -944,6 +1056,16 @@ define void @atomic_store_i8_release(ptr %a, i8 %b) nounwind {
 ; RV32IA-TSO-NEXT:    sb a1, 0(a0)
 ; RV32IA-TSO-NEXT:    ret
 ;
+; RV32IA-ZALASR-WMO-LABEL: atomic_store_i8_release:
+; RV32IA-ZALASR-WMO:       # %bb.0:
+; RV32IA-ZALASR-WMO-NEXT:    sb.rl a1, (a0)
+; RV32IA-ZALASR-WMO-NEXT:    ret
+;
+; RV32IA-ZALASR-TSO-LABEL: atomic_store_i8_release:
+; RV32IA-ZALASR-TSO:       # %bb.0:
+; RV32IA-ZALASR-TSO-NEXT:    sb a1, 0(a0)
+; RV32IA-ZALASR-TSO-NEXT:    ret
+;
 ; RV64I-LABEL: atomic_store_i8_release:
 ; RV64I:       # %bb.0:
 ; RV64I-NEXT:    addi sp, sp, -16
@@ -965,6 +1087,16 @@ define void @atomic_store_i8_release(ptr %a, i8 %b) nounwind {
 ; RV64IA-TSO-NEXT:    sb a1, 0(a0)
 ; RV64IA-TSO-NEXT:    ret
 ;
+; RV64IA-ZALASR-WMO-LABEL: atomic_store_i8_release:
+; RV64IA-ZALASR-WMO:       # %bb.0:
+; RV64IA-ZALASR-WMO-NEXT:    sb.rl a1, (a0)
+; RV64IA-ZALASR-WMO-NEXT:    ret
+;
+; RV64IA-ZALASR-TSO-LABEL: atomic_store_i8_release:
+; RV64IA-ZALASR-TSO:       # %bb.0:
+; RV64IA-ZALASR-TSO-NEXT:    sb a1, 0(a0)
+; RV64IA-ZALASR-TSO-NEXT:    ret
+;
 ; RV32IA-WMO-TRAILING-FENCE-LABEL: atomic_store_i8_release:
 ; RV32IA-WMO-TRAILING-FENCE:       # %bb.0:
 ; RV32IA-WMO-TRAILING-FENCE-NEXT:    fence rw, w
@@ -1013,6 +1145,11 @@ define void @atomic_store_i8_seq_cst(ptr %a, i8 %b) nounwind {
 ; RV32IA-TSO-NEXT:    fence rw, rw
 ; RV32IA-TSO-NEXT:    ret
 ;
+; RV32IA-ZALASR-LABEL: atomic_store_i8_seq_cst:
+; RV32IA-ZALASR:       # %bb.0:
+; RV32IA-ZALASR-NEXT:    sb.rl a1, (a0)
+; RV32IA-ZALASR-NEXT:    ret
+;
 ; RV64I-LABEL: atomic_store_i8_seq_cst:
 ; RV64I:       # %bb.0:
 ; RV64I-NEXT:    addi sp, sp, -16
@@ -1035,6 +1172,11 @@ define void @atomic_store_i8_seq_cst(ptr %a, i8 %b) nounwind {
 ; RV64IA-TSO-NEXT:    fence rw, rw
 ; RV64IA-TSO-NEXT:    ret
 ;
+; RV64IA-ZALASR-LABEL: atomic_store_i8_seq_cst:
+; RV64IA-ZALASR:       # %bb.0:
+; RV64IA-ZALASR-NEXT:    sb.rl a1, (a0)
+; RV64IA-ZALASR-NEXT:    ret
+;
 ; RV32IA-WMO-TRAILING-FENCE-LABEL: atomic_store_i8_seq_cst:
 ; RV32IA-WMO-TRAILING-FENCE:       # %bb.0:
 ; RV32IA-WMO-TRAILING-FENCE-NEXT:    fence rw, w
@@ -1154,6 +1296,16 @@ define void @atomic_store_i16_release(ptr %a, i16 %b) nounwind {
 ; RV32IA-TSO-NEXT:    sh a1, 0(a0)
 ; RV32IA-TSO-NEXT:    ret
 ;
+; RV32IA-ZALASR-WMO-LABEL: atomic_store_i16_release:
+; RV32IA-ZALASR-WMO:       # %bb.0:
+; RV32IA-ZALASR-WMO-NEXT:    sh.rl a1, (a0)
+; RV32IA-ZALASR-WMO-NEXT:    ret
+;
+; RV32IA-ZALASR-TSO-LABEL: atomic_store_i16_release:
+; RV32IA-ZALASR-TSO:       # %bb.0:
+; RV32IA-ZALASR-TSO-NEXT:    sh a1, 0(a0)
+; RV32IA-ZALASR-TS...
[truncated]
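
To make the TSO behavior concrete, here is a small reproduction sketch (a hypothetical file, assuming a build with this patch applied; the expected instructions are copied from the CHECK lines above):

; RUN: llc -mtriple=riscv64 -mattr=+a,+experimental-zalasr,+experimental-ztso -verify-machineinstrs < %s
; Under Ztso+Zalasr, acquire loads and release stores fold to plain loads and
; stores (TSO already provides those orderings); only seq_cst loads/stores use
; the new instructions.

define i8 @acquire_load(ptr %a) {
  ; expected under Ztso+Zalasr: lb a0, 0(a0)   (no fence, no .aq needed)
  %1 = load atomic i8, ptr %a acquire, align 1
  ret i8 %1
}

define i8 @seq_cst_load(ptr %a) {
  ; expected under Ztso+Zalasr: lb.aq a0, (a0)   (.aq, not .aqrl, per psABI/A.7)
  %1 = load atomic i8, ptr %a seq_cst, align 1
  ret i8 %1
}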


github-actions bot commented Feb 25, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

wangpc-pp (Contributor) left a comment

These formatting changes don't belong in this PR; please remove them, as they cause a lot of noise.

(Resolved inline review threads on llvm/lib/Target/RISCV/RISCVInstrInfoA.td, llvm/lib/Target/RISCV/RISCVInstrInfoZalasr.td, llvm/lib/Target/RISCV/RISCVISelLowering.h, and llvm/test/CodeGen/RISCV/atomic-load-store.ll.)
mehnadnerd force-pushed the dev/brs/zalasrlower branch 3 times, most recently from cbd8968 to 5e69134 on February 26, 2024 at 08:49
…Zalasr extension)

Currently uses the psABI lowerings for WMO load-acquire/store-release (which are identical to A.7).
These are incompatible with the A.6 lowerings currently used by LLVM.
This should be OK for now, since Zalasr is gated behind the experimental-extensions flag, but it needs to be fixed before the extension moves out of experimental.

For TSO, it uses the standard Ztso mappings, except that seq_cst loads/stores are lowered to load-acquire/store-release; I had Andrea review that.
@@ -5,13 +5,20 @@
; RUN: | FileCheck -check-prefixes=RV32IA,RV32IA-WMO %s
; RUN: llc -mtriple=riscv32 -mattr=+a,+experimental-ztso -verify-machineinstrs < %s \
; RUN: | FileCheck -check-prefixes=RV32IA,RV32IA-TSO %s
; RUN: llc -mtriple=riscv32 -mattr=+a,+experimental-zalasr -verify-machineinstrs < %s \
wangpc-pp (Contributor) commented Feb 27, 2024

Sorry, I may have misled you. The CHECKs look weird now.
Here is my example:

 ; RUN: llc -mtriple=riscv32 -mattr=+a -verify-machineinstrs < %s \
 ; RUN:   | FileCheck -check-prefixes=RV32IA,RV32IA-WMO %s
 ; RUN: llc -mtriple=riscv32 -mattr=+a,+experimental-ztso -verify-machineinstrs < %s \
-; RUN:   | FileCheck -check-prefixes=RV32IA,RV32IA-TSO %s
+; RUN:   | FileCheck -check-prefixes=RV32IA,RV32IA-TSO,RV32IA-NOZALASR-TSO %s
+; RUN: llc -mtriple=riscv32 -mattr=+a,+experimental-zalasr -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=RV32IA,RV32IA-ZALASR-WMO %s
+; RUN: llc -mtriple=riscv32 -mattr=+a,+experimental-zalasr,+experimental-ztso -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=RV32IA,RV32IA-TSO,RV32IA-ZALASR-TSO %s
 ; RUN: llc -mtriple=riscv64 -verify-machineinstrs < %s \
 ; RUN:   | FileCheck -check-prefix=RV64I %s
 ; RUN: llc -mtriple=riscv64 -mattr=+a -verify-machineinstrs < %s \
 ; RUN:   | FileCheck -check-prefixes=RV64IA,RV64IA-WMO %s
 ; RUN: llc -mtriple=riscv64 -mattr=+a,+experimental-ztso -verify-machineinstrs < %s \
-; RUN:   | FileCheck -check-prefixes=RV64IA,RV64IA-TSO %s
-
+; RUN:   | FileCheck -check-prefixes=RV64IA,RV64IA-TSO,RV64IA-NOZALASR-TSO %s
+; RUN: llc -mtriple=riscv64 -mattr=+a,+experimental-zalasr -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=RV64IA,RV64IA-ZALASR-WMO %s
+; RUN: llc -mtriple=riscv64 -mattr=+a,+experimental-zalasr,+experimental-ztso -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefixes=RV64IA,RV64IA-TSO,RV64IA-ZALASR-TSO %s

Since there are differences between RV32IA-TSO and RV32IA-ZALASR-TSO (because seq_cst is lowered differently), we need to add a prefix to the RUN line that is TSO without Zalasr in order to generate correct CHECKs. I named them RV32IA-NOZALASR-TSO and RV64IA-NOZALASR-TSO. I don't know if they are good names; maybe RV32IA-STANDARD-TSO, RV32IA-NORMAL-TSO, ...?

wangpc-pp (Contributor) left a comment

LGTM. I don't have any more comments if the CHECKs are generated correctly. Thanks for your work!
