-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[RISCV][GISel] Support Zalasr #161774
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RISCV][GISel] Support Zalasr #161774
Conversation
We need additional patterns for GISel because we make s16 and s32 legal for load/store. GISel does not distinquish integer and FP scalar types in LLT. We only know whether the load should be integer or FP after register bank selection.
@llvm/pr-subscribers-llvm-globalisel @llvm/pr-subscribers-backend-risc-v Author: Craig Topper (topperc) ChangesWe need additional patterns for GISel because we make s16 and s32 legal for load/store. GISel does not distinquish integer and FP scalar types in LLT. We only know whether the load should be integer or FP after register bank selection. Full diff: https://github.com/llvm/llvm-project/pull/161774.diff 3 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVGISel.td b/llvm/lib/Target/RISCV/RISCVGISel.td
index cf6f83a09610d..97cf2b4956fb5 100644
--- a/llvm/lib/Target/RISCV/RISCVGISel.td
+++ b/llvm/lib/Target/RISCV/RISCVGISel.td
@@ -202,3 +202,29 @@ let Predicates = [HasStdExtZbkb, NoStdExtZbb, IsRV64] in {
def : Pat<(i64 (zext (i16 GPR:$rs))), (PACKW GPR:$rs, (XLenVT X0))>;
def : Pat<(i32 (zext (i16 GPR:$rs))), (PACKW GPR:$rs, (XLenVT X0))>;
}
+
+//===----------------------------------------------------------------------===//
+// Zalasr patterns not used by SelectionDAG
+//===----------------------------------------------------------------------===//
+
+let Predicates = [HasStdExtZalasr] in {
+ // the sequentially consistent loads use
+ // .aq instead of .aqrl to match the psABI/A.7
+ def : PatLAQ<acquiring_load<atomic_load_aext_8>, LB_AQ, i16>;
+ def : PatLAQ<seq_cst_load<atomic_load_aext_8>, LB_AQ, i16>;
+
+ def : PatLAQ<acquiring_load<atomic_load_nonext_16>, LH_AQ, i16>;
+ def : PatLAQ<seq_cst_load<atomic_load_nonext_16>, LH_AQ, i16>;
+
+ def : PatSRL<releasing_store<atomic_store_8>, SB_RL, i16>;
+ def : PatSRL<seq_cst_store<atomic_store_8>, SB_RL, i16>;
+
+ def : PatSRL<releasing_store<atomic_store_16>, SH_RL, i16>;
+ def : PatSRL<seq_cst_store<atomic_store_16>, SH_RL, i16>;
+}
+
+let Predicates = [HasStdExtZalasr, IsRV64] in {
+ // Load pattern is in RISCVInstrInfoZalasr.td and shared with RV32.
+ def : PatSRL<releasing_store<atomic_store_32>, SW_RL, i32>;
+ def : PatSRL<seq_cst_store<atomic_store_32>, SW_RL, i32>;
+}
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoZalasr.td b/llvm/lib/Target/RISCV/RISCVInstrInfoZalasr.td
index 1674c957b6579..aa37679f68ddf 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoZalasr.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoZalasr.td
@@ -93,12 +93,11 @@ let Predicates = [HasStdExtZalasr] in {
def : PatSRL<releasing_store<atomic_store_32>, SW_RL>;
def : PatSRL<seq_cst_store<atomic_store_32>, SW_RL>;
-} // Predicates = [HasStdExtZalasr]
-let Predicates = [HasStdExtZalasr, IsRV32] in {
- def : PatLAQ<acquiring_load<atomic_load_nonext_32>, LW_AQ>;
- def : PatLAQ<seq_cst_load<atomic_load_nonext_32>, LW_AQ>;
-} // Predicates = [HasStdExtZalasr, IsRV32]
+ // Used by GISel for RV32 and RV64.
+ def : PatLAQ<acquiring_load<atomic_load_nonext_32>, LW_AQ, i32>;
+ def : PatLAQ<seq_cst_load<atomic_load_nonext_32>, LW_AQ, i32>;
+} // Predicates = [HasStdExtZalasr]
let Predicates = [HasStdExtZalasr, IsRV64] in {
def : PatLAQ<acquiring_load<atomic_load_asext_32>, LW_AQ, i64>;
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/atomic-load-store.ll b/llvm/test/CodeGen/RISCV/GlobalISel/atomic-load-store.ll
index 1d5d918422b28..5d3fed48bf82b 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/atomic-load-store.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/atomic-load-store.ll
@@ -23,6 +23,15 @@
; RUN: llc -mtriple=riscv64 -global-isel -mattr=+a,+ztso -verify-machineinstrs < %s \
; RUN: | FileCheck -check-prefixes=RV64IA,RV64IA-TSO-TRAILING-FENCE %s
+; RUN: llc -mtriple=riscv32 -global-isel -mattr=+a,+experimental-zalasr -verify-machineinstrs < %s \
+; RUN: | FileCheck -check-prefixes=RV32IA,RV32IA-ZALASR,RV32IA-ZALASR-WMO %s
+; RUN: llc -mtriple=riscv32 -global-isel -mattr=+a,+experimental-zalasr,+ztso -verify-machineinstrs < %s \
+; RUN: | FileCheck -check-prefixes=RV32IA,RV32IA-ZALASR,RV32IA-ZALASR-TSO %s
+
+; RUN: llc -mtriple=riscv64 -global-isel -mattr=+a,+experimental-zalasr -verify-machineinstrs < %s \
+; RUN: | FileCheck -check-prefixes=RV64IA,RV64IA-ZALASR,RV64IA-ZALASR-WMO %s
+; RUN: llc -mtriple=riscv64 -global-isel -mattr=+a,+experimental-zalasr,+ztso -verify-machineinstrs < %s \
+; RUN: | FileCheck -check-prefixes=RV64IA,RV64IA-ZALASR,RV64IA-ZALASR-TSO %s
define i8 @atomic_load_i8_unordered(ptr %a) nounwind {
; RV32I-LABEL: atomic_load_i8_unordered:
@@ -156,6 +165,26 @@ define i8 @atomic_load_i8_acquire(ptr %a) nounwind {
; RV64IA-TSO-TRAILING-FENCE: # %bb.0:
; RV64IA-TSO-TRAILING-FENCE-NEXT: lbu a0, 0(a0)
; RV64IA-TSO-TRAILING-FENCE-NEXT: ret
+;
+; RV32IA-ZALASR-WMO-LABEL: atomic_load_i8_acquire:
+; RV32IA-ZALASR-WMO: # %bb.0:
+; RV32IA-ZALASR-WMO-NEXT: lb.aq a0, (a0)
+; RV32IA-ZALASR-WMO-NEXT: ret
+;
+; RV32IA-ZALASR-TSO-LABEL: atomic_load_i8_acquire:
+; RV32IA-ZALASR-TSO: # %bb.0:
+; RV32IA-ZALASR-TSO-NEXT: lbu a0, 0(a0)
+; RV32IA-ZALASR-TSO-NEXT: ret
+;
+; RV64IA-ZALASR-WMO-LABEL: atomic_load_i8_acquire:
+; RV64IA-ZALASR-WMO: # %bb.0:
+; RV64IA-ZALASR-WMO-NEXT: lb.aq a0, (a0)
+; RV64IA-ZALASR-WMO-NEXT: ret
+;
+; RV64IA-ZALASR-TSO-LABEL: atomic_load_i8_acquire:
+; RV64IA-ZALASR-TSO: # %bb.0:
+; RV64IA-ZALASR-TSO-NEXT: lbu a0, 0(a0)
+; RV64IA-ZALASR-TSO-NEXT: ret
%1 = load atomic i8, ptr %a acquire, align 1
ret i8 %1
}
@@ -232,6 +261,16 @@ define i8 @atomic_load_i8_seq_cst(ptr %a) nounwind {
; RV64IA-TSO-TRAILING-FENCE-NEXT: fence rw, rw
; RV64IA-TSO-TRAILING-FENCE-NEXT: lbu a0, 0(a0)
; RV64IA-TSO-TRAILING-FENCE-NEXT: ret
+;
+; RV32IA-ZALASR-LABEL: atomic_load_i8_seq_cst:
+; RV32IA-ZALASR: # %bb.0:
+; RV32IA-ZALASR-NEXT: lb.aq a0, (a0)
+; RV32IA-ZALASR-NEXT: ret
+;
+; RV64IA-ZALASR-LABEL: atomic_load_i8_seq_cst:
+; RV64IA-ZALASR: # %bb.0:
+; RV64IA-ZALASR-NEXT: lb.aq a0, (a0)
+; RV64IA-ZALASR-NEXT: ret
%1 = load atomic i8, ptr %a seq_cst, align 1
ret i8 %1
}
@@ -368,6 +407,26 @@ define i16 @atomic_load_i16_acquire(ptr %a) nounwind {
; RV64IA-TSO-TRAILING-FENCE: # %bb.0:
; RV64IA-TSO-TRAILING-FENCE-NEXT: lh a0, 0(a0)
; RV64IA-TSO-TRAILING-FENCE-NEXT: ret
+;
+; RV32IA-ZALASR-WMO-LABEL: atomic_load_i16_acquire:
+; RV32IA-ZALASR-WMO: # %bb.0:
+; RV32IA-ZALASR-WMO-NEXT: lh.aq a0, (a0)
+; RV32IA-ZALASR-WMO-NEXT: ret
+;
+; RV32IA-ZALASR-TSO-LABEL: atomic_load_i16_acquire:
+; RV32IA-ZALASR-TSO: # %bb.0:
+; RV32IA-ZALASR-TSO-NEXT: lh a0, 0(a0)
+; RV32IA-ZALASR-TSO-NEXT: ret
+;
+; RV64IA-ZALASR-WMO-LABEL: atomic_load_i16_acquire:
+; RV64IA-ZALASR-WMO: # %bb.0:
+; RV64IA-ZALASR-WMO-NEXT: lh.aq a0, (a0)
+; RV64IA-ZALASR-WMO-NEXT: ret
+;
+; RV64IA-ZALASR-TSO-LABEL: atomic_load_i16_acquire:
+; RV64IA-ZALASR-TSO: # %bb.0:
+; RV64IA-ZALASR-TSO-NEXT: lh a0, 0(a0)
+; RV64IA-ZALASR-TSO-NEXT: ret
%1 = load atomic i16, ptr %a acquire, align 2
ret i16 %1
}
@@ -444,6 +503,16 @@ define i16 @atomic_load_i16_seq_cst(ptr %a) nounwind {
; RV64IA-TSO-TRAILING-FENCE-NEXT: fence rw, rw
; RV64IA-TSO-TRAILING-FENCE-NEXT: lh a0, 0(a0)
; RV64IA-TSO-TRAILING-FENCE-NEXT: ret
+;
+; RV32IA-ZALASR-LABEL: atomic_load_i16_seq_cst:
+; RV32IA-ZALASR: # %bb.0:
+; RV32IA-ZALASR-NEXT: lh.aq a0, (a0)
+; RV32IA-ZALASR-NEXT: ret
+;
+; RV64IA-ZALASR-LABEL: atomic_load_i16_seq_cst:
+; RV64IA-ZALASR: # %bb.0:
+; RV64IA-ZALASR-NEXT: lh.aq a0, (a0)
+; RV64IA-ZALASR-NEXT: ret
%1 = load atomic i16, ptr %a seq_cst, align 2
ret i16 %1
}
@@ -580,6 +649,26 @@ define i32 @atomic_load_i32_acquire(ptr %a) nounwind {
; RV64IA-TSO-TRAILING-FENCE: # %bb.0:
; RV64IA-TSO-TRAILING-FENCE-NEXT: lw a0, 0(a0)
; RV64IA-TSO-TRAILING-FENCE-NEXT: ret
+;
+; RV32IA-ZALASR-WMO-LABEL: atomic_load_i32_acquire:
+; RV32IA-ZALASR-WMO: # %bb.0:
+; RV32IA-ZALASR-WMO-NEXT: lw.aq a0, (a0)
+; RV32IA-ZALASR-WMO-NEXT: ret
+;
+; RV32IA-ZALASR-TSO-LABEL: atomic_load_i32_acquire:
+; RV32IA-ZALASR-TSO: # %bb.0:
+; RV32IA-ZALASR-TSO-NEXT: lw a0, 0(a0)
+; RV32IA-ZALASR-TSO-NEXT: ret
+;
+; RV64IA-ZALASR-WMO-LABEL: atomic_load_i32_acquire:
+; RV64IA-ZALASR-WMO: # %bb.0:
+; RV64IA-ZALASR-WMO-NEXT: lw.aq a0, (a0)
+; RV64IA-ZALASR-WMO-NEXT: ret
+;
+; RV64IA-ZALASR-TSO-LABEL: atomic_load_i32_acquire:
+; RV64IA-ZALASR-TSO: # %bb.0:
+; RV64IA-ZALASR-TSO-NEXT: lw a0, 0(a0)
+; RV64IA-ZALASR-TSO-NEXT: ret
%1 = load atomic i32, ptr %a acquire, align 4
ret i32 %1
}
@@ -656,6 +745,16 @@ define i32 @atomic_load_i32_seq_cst(ptr %a) nounwind {
; RV64IA-TSO-TRAILING-FENCE-NEXT: fence rw, rw
; RV64IA-TSO-TRAILING-FENCE-NEXT: lw a0, 0(a0)
; RV64IA-TSO-TRAILING-FENCE-NEXT: ret
+;
+; RV32IA-ZALASR-LABEL: atomic_load_i32_seq_cst:
+; RV32IA-ZALASR: # %bb.0:
+; RV32IA-ZALASR-NEXT: lw.aq a0, (a0)
+; RV32IA-ZALASR-NEXT: ret
+;
+; RV64IA-ZALASR-LABEL: atomic_load_i32_seq_cst:
+; RV64IA-ZALASR: # %bb.0:
+; RV64IA-ZALASR-NEXT: lw.aq a0, (a0)
+; RV64IA-ZALASR-NEXT: ret
%1 = load atomic i32, ptr %a seq_cst, align 4
ret i32 %1
}
@@ -790,6 +889,16 @@ define i64 @atomic_load_i64_acquire(ptr %a) nounwind {
; RV64IA-TSO-TRAILING-FENCE: # %bb.0:
; RV64IA-TSO-TRAILING-FENCE-NEXT: ld a0, 0(a0)
; RV64IA-TSO-TRAILING-FENCE-NEXT: ret
+;
+; RV64IA-ZALASR-WMO-LABEL: atomic_load_i64_acquire:
+; RV64IA-ZALASR-WMO: # %bb.0:
+; RV64IA-ZALASR-WMO-NEXT: ld.aq a0, (a0)
+; RV64IA-ZALASR-WMO-NEXT: ret
+;
+; RV64IA-ZALASR-TSO-LABEL: atomic_load_i64_acquire:
+; RV64IA-ZALASR-TSO: # %bb.0:
+; RV64IA-ZALASR-TSO-NEXT: ld a0, 0(a0)
+; RV64IA-ZALASR-TSO-NEXT: ret
%1 = load atomic i64, ptr %a acquire, align 8
ret i64 %1
}
@@ -850,6 +959,11 @@ define i64 @atomic_load_i64_seq_cst(ptr %a) nounwind {
; RV64IA-TSO-TRAILING-FENCE-NEXT: fence rw, rw
; RV64IA-TSO-TRAILING-FENCE-NEXT: ld a0, 0(a0)
; RV64IA-TSO-TRAILING-FENCE-NEXT: ret
+;
+; RV64IA-ZALASR-LABEL: atomic_load_i64_seq_cst:
+; RV64IA-ZALASR: # %bb.0:
+; RV64IA-ZALASR-NEXT: ld.aq a0, (a0)
+; RV64IA-ZALASR-NEXT: ret
%1 = load atomic i64, ptr %a seq_cst, align 8
ret i64 %1
}
@@ -986,6 +1100,26 @@ define void @atomic_store_i8_release(ptr %a, i8 %b) nounwind {
; RV64IA-TSO-TRAILING-FENCE: # %bb.0:
; RV64IA-TSO-TRAILING-FENCE-NEXT: sb a1, 0(a0)
; RV64IA-TSO-TRAILING-FENCE-NEXT: ret
+;
+; RV32IA-ZALASR-WMO-LABEL: atomic_store_i8_release:
+; RV32IA-ZALASR-WMO: # %bb.0:
+; RV32IA-ZALASR-WMO-NEXT: sb.rl a1, (a0)
+; RV32IA-ZALASR-WMO-NEXT: ret
+;
+; RV32IA-ZALASR-TSO-LABEL: atomic_store_i8_release:
+; RV32IA-ZALASR-TSO: # %bb.0:
+; RV32IA-ZALASR-TSO-NEXT: sb a1, 0(a0)
+; RV32IA-ZALASR-TSO-NEXT: ret
+;
+; RV64IA-ZALASR-WMO-LABEL: atomic_store_i8_release:
+; RV64IA-ZALASR-WMO: # %bb.0:
+; RV64IA-ZALASR-WMO-NEXT: sb.rl a1, (a0)
+; RV64IA-ZALASR-WMO-NEXT: ret
+;
+; RV64IA-ZALASR-TSO-LABEL: atomic_store_i8_release:
+; RV64IA-ZALASR-TSO: # %bb.0:
+; RV64IA-ZALASR-TSO-NEXT: sb a1, 0(a0)
+; RV64IA-ZALASR-TSO-NEXT: ret
store atomic i8 %b, ptr %a release, align 1
ret void
}
@@ -1060,6 +1194,16 @@ define void @atomic_store_i8_seq_cst(ptr %a, i8 %b) nounwind {
; RV64IA-TSO-TRAILING-FENCE-NEXT: sb a1, 0(a0)
; RV64IA-TSO-TRAILING-FENCE-NEXT: fence rw, rw
; RV64IA-TSO-TRAILING-FENCE-NEXT: ret
+;
+; RV32IA-ZALASR-LABEL: atomic_store_i8_seq_cst:
+; RV32IA-ZALASR: # %bb.0:
+; RV32IA-ZALASR-NEXT: sb.rl a1, (a0)
+; RV32IA-ZALASR-NEXT: ret
+;
+; RV64IA-ZALASR-LABEL: atomic_store_i8_seq_cst:
+; RV64IA-ZALASR: # %bb.0:
+; RV64IA-ZALASR-NEXT: sb.rl a1, (a0)
+; RV64IA-ZALASR-NEXT: ret
store atomic i8 %b, ptr %a seq_cst, align 1
ret void
}
@@ -1196,6 +1340,26 @@ define void @atomic_store_i16_release(ptr %a, i16 %b) nounwind {
; RV64IA-TSO-TRAILING-FENCE: # %bb.0:
; RV64IA-TSO-TRAILING-FENCE-NEXT: sh a1, 0(a0)
; RV64IA-TSO-TRAILING-FENCE-NEXT: ret
+;
+; RV32IA-ZALASR-WMO-LABEL: atomic_store_i16_release:
+; RV32IA-ZALASR-WMO: # %bb.0:
+; RV32IA-ZALASR-WMO-NEXT: sh.rl a1, (a0)
+; RV32IA-ZALASR-WMO-NEXT: ret
+;
+; RV32IA-ZALASR-TSO-LABEL: atomic_store_i16_release:
+; RV32IA-ZALASR-TSO: # %bb.0:
+; RV32IA-ZALASR-TSO-NEXT: sh a1, 0(a0)
+; RV32IA-ZALASR-TSO-NEXT: ret
+;
+; RV64IA-ZALASR-WMO-LABEL: atomic_store_i16_release:
+; RV64IA-ZALASR-WMO: # %bb.0:
+; RV64IA-ZALASR-WMO-NEXT: sh.rl a1, (a0)
+; RV64IA-ZALASR-WMO-NEXT: ret
+;
+; RV64IA-ZALASR-TSO-LABEL: atomic_store_i16_release:
+; RV64IA-ZALASR-TSO: # %bb.0:
+; RV64IA-ZALASR-TSO-NEXT: sh a1, 0(a0)
+; RV64IA-ZALASR-TSO-NEXT: ret
store atomic i16 %b, ptr %a release, align 2
ret void
}
@@ -1270,6 +1434,16 @@ define void @atomic_store_i16_seq_cst(ptr %a, i16 %b) nounwind {
; RV64IA-TSO-TRAILING-FENCE-NEXT: sh a1, 0(a0)
; RV64IA-TSO-TRAILING-FENCE-NEXT: fence rw, rw
; RV64IA-TSO-TRAILING-FENCE-NEXT: ret
+;
+; RV32IA-ZALASR-LABEL: atomic_store_i16_seq_cst:
+; RV32IA-ZALASR: # %bb.0:
+; RV32IA-ZALASR-NEXT: sh.rl a1, (a0)
+; RV32IA-ZALASR-NEXT: ret
+;
+; RV64IA-ZALASR-LABEL: atomic_store_i16_seq_cst:
+; RV64IA-ZALASR: # %bb.0:
+; RV64IA-ZALASR-NEXT: sh.rl a1, (a0)
+; RV64IA-ZALASR-NEXT: ret
store atomic i16 %b, ptr %a seq_cst, align 2
ret void
}
@@ -1406,6 +1580,26 @@ define void @atomic_store_i32_release(ptr %a, i32 %b) nounwind {
; RV64IA-TSO-TRAILING-FENCE: # %bb.0:
; RV64IA-TSO-TRAILING-FENCE-NEXT: sw a1, 0(a0)
; RV64IA-TSO-TRAILING-FENCE-NEXT: ret
+;
+; RV32IA-ZALASR-WMO-LABEL: atomic_store_i32_release:
+; RV32IA-ZALASR-WMO: # %bb.0:
+; RV32IA-ZALASR-WMO-NEXT: sw.rl a1, (a0)
+; RV32IA-ZALASR-WMO-NEXT: ret
+;
+; RV32IA-ZALASR-TSO-LABEL: atomic_store_i32_release:
+; RV32IA-ZALASR-TSO: # %bb.0:
+; RV32IA-ZALASR-TSO-NEXT: sw a1, 0(a0)
+; RV32IA-ZALASR-TSO-NEXT: ret
+;
+; RV64IA-ZALASR-WMO-LABEL: atomic_store_i32_release:
+; RV64IA-ZALASR-WMO: # %bb.0:
+; RV64IA-ZALASR-WMO-NEXT: sw.rl a1, (a0)
+; RV64IA-ZALASR-WMO-NEXT: ret
+;
+; RV64IA-ZALASR-TSO-LABEL: atomic_store_i32_release:
+; RV64IA-ZALASR-TSO: # %bb.0:
+; RV64IA-ZALASR-TSO-NEXT: sw a1, 0(a0)
+; RV64IA-ZALASR-TSO-NEXT: ret
store atomic i32 %b, ptr %a release, align 4
ret void
}
@@ -1480,6 +1674,16 @@ define void @atomic_store_i32_seq_cst(ptr %a, i32 %b) nounwind {
; RV64IA-TSO-TRAILING-FENCE-NEXT: sw a1, 0(a0)
; RV64IA-TSO-TRAILING-FENCE-NEXT: fence rw, rw
; RV64IA-TSO-TRAILING-FENCE-NEXT: ret
+;
+; RV32IA-ZALASR-LABEL: atomic_store_i32_seq_cst:
+; RV32IA-ZALASR: # %bb.0:
+; RV32IA-ZALASR-NEXT: sw.rl a1, (a0)
+; RV32IA-ZALASR-NEXT: ret
+;
+; RV64IA-ZALASR-LABEL: atomic_store_i32_seq_cst:
+; RV64IA-ZALASR: # %bb.0:
+; RV64IA-ZALASR-NEXT: sw.rl a1, (a0)
+; RV64IA-ZALASR-NEXT: ret
store atomic i32 %b, ptr %a seq_cst, align 4
ret void
}
@@ -1614,6 +1818,16 @@ define void @atomic_store_i64_release(ptr %a, i64 %b) nounwind {
; RV64IA-TSO-TRAILING-FENCE: # %bb.0:
; RV64IA-TSO-TRAILING-FENCE-NEXT: sd a1, 0(a0)
; RV64IA-TSO-TRAILING-FENCE-NEXT: ret
+;
+; RV64IA-ZALASR-WMO-LABEL: atomic_store_i64_release:
+; RV64IA-ZALASR-WMO: # %bb.0:
+; RV64IA-ZALASR-WMO-NEXT: sd.rl a1, (a0)
+; RV64IA-ZALASR-WMO-NEXT: ret
+;
+; RV64IA-ZALASR-TSO-LABEL: atomic_store_i64_release:
+; RV64IA-ZALASR-TSO: # %bb.0:
+; RV64IA-ZALASR-TSO-NEXT: sd a1, 0(a0)
+; RV64IA-ZALASR-TSO-NEXT: ret
store atomic i64 %b, ptr %a release, align 8
ret void
}
@@ -1673,6 +1887,11 @@ define void @atomic_store_i64_seq_cst(ptr %a, i64 %b) nounwind {
; RV64IA-TSO-TRAILING-FENCE-NEXT: sd a1, 0(a0)
; RV64IA-TSO-TRAILING-FENCE-NEXT: fence rw, rw
; RV64IA-TSO-TRAILING-FENCE-NEXT: ret
+;
+; RV64IA-ZALASR-LABEL: atomic_store_i64_seq_cst:
+; RV64IA-ZALASR: # %bb.0:
+; RV64IA-ZALASR-NEXT: sd.rl a1, (a0)
+; RV64IA-ZALASR-NEXT: ret
store atomic i64 %b, ptr %a seq_cst, align 8
ret void
}
|
…bank. GISel doesn't distinquish integer and FP loads and stores. We only know which it is after register bank selection. This results in s16/s32 loads/stores on the GPR register bank that need to be selected. This required extra isel patterns not needed for SDAG and adding i16 and i32 to the GPR register class. Having i16/i32 on the GPR register class makes type interfence in tablegen less effective, requiring explicit casts to be added to patterns. This patch removes the extra isel patterns and replaces it with custom instruction selection similar to what is done on AArch64. A future patch will remove i16 and i32 from the GPR register class. Stacked on llvm#161774.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. One comment that is not blocking.
; | ||
; RV32IA-ZALASR-TSO-LABEL: atomic_load_i8_acquire: | ||
; RV32IA-ZALASR-TSO: # %bb.0: | ||
; RV32IA-ZALASR-TSO-NEXT: lbu a0, 0(a0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a blocker: One minor difference between these cases and the equivalent SDag cases, is that these use lbu a0, 0(a0)
rather than lb a0, 0(a0)
, even after your patch to change this with GISel on Oct 1st (129d5ce). I think the GISel codegen (as here) is the preferred codegen though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this was noted in 129d5ce "This only affects global isel because SelectionDAG won't create an
anyext i8 atomic_load today."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I did read comments trying to look for a note like this, I didn't read the commit notes.
After llvm#161774 and llvm#162042, this works correctly.
We need additional patterns for GISel because we make s16 and s32 legal for load/store. GISel does not distinquish integer and FP scalar types in LLT. We only know whether the load should be integer or FP after register bank selection.