-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Xtensa multicore: AtomicXX::swap() broken for atomics smaller than 32 bit #106
Comments
Forgot to mention: tried esp Rustc 1.57 and 1.58, both are affected. Not sure if the LLVM Espressif fork was updated for 1.58 but the point is, that problem may have been always there. |
Thanks for the test case @ivmarkov, we've fixed this in our LLVM fork. I'll update this issue when the new LLVM release is ready. |
Is this fix available in the GH Espressif LLVM repo? I might want to rebuild the compiler. That is, unless a new LLVM release is imminent. |
Not available on github just yet, but here is a patchfile you can apply. I hope we can have a release out by next week. From a1c14dd680bdea61750431657b15ab10ae943fc5 Mon Sep 17 00:00:00 2001
From: Andrei Safronov <safronov@espressif.com>
Date: Wed, 9 Feb 2022 00:59:23 +0300
Subject: [PATCH] [Xtensa] Fix atomic swap for 8/16 bit operands.
---
llvm/lib/Target/Xtensa/XtensaISelLowering.cpp | 2 +-
llvm/test/CodeGen/Xtensa/atomicrmw.ll | 103 ++++++++++++++++++
2 files changed, 104 insertions(+), 1 deletion(-)
create mode 100644 llvm/test/CodeGen/Xtensa/atomicrmw.ll
diff --git a/llvm/lib/Target/Xtensa/XtensaISelLowering.cpp b/llvm/lib/Target/Xtensa/XtensaISelLowering.cpp
index 2598d3c9dd99..dca51a1f65da 100644
--- a/llvm/lib/Target/Xtensa/XtensaISelLowering.cpp
+++ b/llvm/lib/Target/Xtensa/XtensaISelLowering.cpp
@@ -2359,7 +2359,7 @@ XtensaTargetLowering::emitAtomicSwap(MachineInstr &MI, MachineBasicBlock *BB,
unsigned R8 = MRI.createVirtualRegister(RC);
BuildMI(*BB, St, DL, TII.get(Xtensa::SSR)).addReg(BitOffs);
- BuildMI(*BB, St, DL, TII.get(Xtensa::SLL), R8).addReg(AtomValLoop);
+ BuildMI(*BB, St, DL, TII.get(Xtensa::SRL), R8).addReg(AtomValLoop);
if (isByteOperand) {
BuildMI(*BB, St, DL, TII.get(Xtensa::SEXT), Res.getReg())
diff --git a/llvm/test/CodeGen/Xtensa/atomicrmw.ll b/llvm/test/CodeGen/Xtensa/atomicrmw.ll
new file mode 100644
index 000000000000..f2b7526e33e8
--- /dev/null
+++ b/llvm/test/CodeGen/Xtensa/atomicrmw.ll
@@ -0,0 +1,103 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=xtensa -verify-machineinstrs < %s \
+; RUN: | FileCheck -check-prefix=CHECK-XTENSA %s
+
+define i8 @atomicrmw_xchg_i8_seq_cst(i8* %a, i8 %b) nounwind {
+; CHECK-XTENSA-LABEL: atomicrmw_xchg_i8_seq_cst:
+; CHECK-XTENSA: # %bb.0:
+; CHECK-XTENSA-NEXT: entry a1, 32
+; CHECK-XTENSA-NEXT: memw
+; CHECK-XTENSA-NEXT: movi.n a8, 3
+; CHECK-XTENSA-NEXT: and a8, a8, a2
+; CHECK-XTENSA-NEXT: sub a9, a2, a8
+; CHECK-XTENSA-NEXT: slli a8, a8, 3
+; CHECK-XTENSA-NEXT: movi a10, 255
+; CHECK-XTENSA-NEXT: ssl a8
+; CHECK-XTENSA-NEXT: movi.n a11, -1
+; CHECK-XTENSA-NEXT: sll a10, a10
+; CHECK-XTENSA-NEXT: xor a11, a10, a11
+; CHECK-XTENSA-NEXT: l32i.n a12, a9, 0
+; CHECK-XTENSA-NEXT: sll a12, a3
+; CHECK-XTENSA-NEXT: l32i.n a13, a9, 0
+; CHECK-XTENSA-NEXT: and a14, a13, a10
+; CHECK-XTENSA-NEXT: .LBB0_1: # =>This Loop Header: Depth=1
+; CHECK-XTENSA-NEXT: # Child Loop BB0_2 Depth 2
+; CHECK-XTENSA-NEXT: mov.n a13, a14
+; CHECK-XTENSA-NEXT: memw
+; CHECK-XTENSA-NEXT: l32i.n a14, a9, 0
+; CHECK-XTENSA-NEXT: and a7, a14, a11
+; CHECK-XTENSA-NEXT: .LBB0_2: # Parent Loop BB0_1 Depth=1
+; CHECK-XTENSA-NEXT: # => This Inner Loop Header: Depth=2
+; CHECK-XTENSA-NEXT: mov.n a15, a7
+; CHECK-XTENSA-NEXT: or a14, a12, a15
+; CHECK-XTENSA-NEXT: or a7, a13, a15
+; CHECK-XTENSA-NEXT: wsr a7, scompare1
+; CHECK-XTENSA-NEXT: s32c1i a14, a9, 0
+; CHECK-XTENSA-NEXT: beq a7, a14, .LBB0_4
+; CHECK-XTENSA-NEXT: # %bb.3: # in Loop: Header=BB0_2 Depth=2
+; CHECK-XTENSA-NEXT: and a7, a14, a11
+; CHECK-XTENSA-NEXT: bne a7, a15, .LBB0_2
+; CHECK-XTENSA-NEXT: .LBB0_4: # in Loop: Header=BB0_1 Depth=1
+; CHECK-XTENSA-NEXT: and a14, a14, a10
+; CHECK-XTENSA-NEXT: bne a14, a13, .LBB0_1
+; CHECK-XTENSA-NEXT: # %bb.5:
+; CHECK-XTENSA-NEXT: ssr a8
+; CHECK-XTENSA-NEXT: srl a8, a14
+; CHECK-XTENSA-NEXT: sext a2, a8, 7
+; CHECK-XTENSA-NEXT: memw
+; CHECK-XTENSA-NEXT: retw.n
+
+ %1 = atomicrmw xchg i8* %a, i8 %b seq_cst
+ ret i8 %1
+}
+
+define i16 @atomicrmw_xchg_i16_seq_cst(i16* %a, i16 %b) nounwind {
+; CHECK-XTENSA-LABEL: atomicrmw_xchg_i16_seq_cst:
+; CHECK-XTENSA: # %bb.0:
+; CHECK-XTENSA-NEXT: entry a1, 32
+; CHECK-XTENSA-NEXT: memw
+; CHECK-XTENSA-NEXT: movi.n a8, 3
+; CHECK-XTENSA-NEXT: and a8, a8, a2
+; CHECK-XTENSA-NEXT: sub a9, a2, a8
+; CHECK-XTENSA-NEXT: slli a8, a8, 3
+; CHECK-XTENSA-NEXT: movi.n a10, 1
+; CHECK-XTENSA-NEXT: slli a10, a10, 16
+; CHECK-XTENSA-NEXT: addi.n a10, a10, -1
+; CHECK-XTENSA-NEXT: ssl a8
+; CHECK-XTENSA-NEXT: movi.n a11, -1
+; CHECK-XTENSA-NEXT: sll a10, a10
+; CHECK-XTENSA-NEXT: xor a11, a10, a11
+; CHECK-XTENSA-NEXT: l32i.n a12, a9, 0
+; CHECK-XTENSA-NEXT: sll a12, a3
+; CHECK-XTENSA-NEXT: l32i.n a13, a9, 0
+; CHECK-XTENSA-NEXT: and a14, a13, a10
+; CHECK-XTENSA-NEXT: .LBB1_1: # =>This Loop Header: Depth=1
+; CHECK-XTENSA-NEXT: # Child Loop BB1_2 Depth 2
+; CHECK-XTENSA-NEXT: mov.n a13, a14
+; CHECK-XTENSA-NEXT: memw
+; CHECK-XTENSA-NEXT: l32i.n a14, a9, 0
+; CHECK-XTENSA-NEXT: and a7, a14, a11
+; CHECK-XTENSA-NEXT: .LBB1_2: # Parent Loop BB1_1 Depth=1
+; CHECK-XTENSA-NEXT: # => This Inner Loop Header: Depth=2
+; CHECK-XTENSA-NEXT: mov.n a15, a7
+; CHECK-XTENSA-NEXT: or a14, a12, a15
+; CHECK-XTENSA-NEXT: or a7, a13, a15
+; CHECK-XTENSA-NEXT: wsr a7, scompare1
+; CHECK-XTENSA-NEXT: s32c1i a14, a9, 0
+; CHECK-XTENSA-NEXT: beq a7, a14, .LBB1_4
+; CHECK-XTENSA-NEXT: # %bb.3: # in Loop: Header=BB1_2 Depth=2
+; CHECK-XTENSA-NEXT: and a7, a14, a11
+; CHECK-XTENSA-NEXT: bne a7, a15, .LBB1_2
+; CHECK-XTENSA-NEXT: .LBB1_4: # in Loop: Header=BB1_1 Depth=1
+; CHECK-XTENSA-NEXT: and a14, a14, a10
+; CHECK-XTENSA-NEXT: bne a14, a13, .LBB1_1
+; CHECK-XTENSA-NEXT: # %bb.5:
+; CHECK-XTENSA-NEXT: ssr a8
+; CHECK-XTENSA-NEXT: srl a8, a14
+; CHECK-XTENSA-NEXT: sext a2, a8, 15
+; CHECK-XTENSA-NEXT: memw
+; CHECK-XTENSA-NEXT: retw.n
+
+ %1 = atomicrmw xchg i16* %a, i16 %b seq_cst
+ ret i16 %1
+}
-- |
Now available in the the github repo: espressif/llvm-project@8a93494. Will be live in this fork with the next Rust release. |
Fixed in the 1.59 release, thanks for the report :) |
AtomicXX::swap()
does not work correctly when the atomic is smaller than 32 bit (i.e. AtomicU16, AtomicBool, possibly AtomicU8 too).I suspect the LLVM atomic generation code is wrong, as these work just fine on riscv32imc (
esp32c3
) and (I would assume) single-core Xtensa (esp32s2
), where atomics are lowered to libcalls using critical sections (implemented in ESP-IDF).Here's a repo that demonstrates the problem. I've only tested it on the
esp32
target, butesp32s3
might be affected as well, as it is also multicore:cargo build
and then flash on esp32 - theassert_eq
s commented with// This breaks?!
will failcargo build --target riscv32imc-esp-espidf
- code should complete up toprintln!("All done correctly!")
The text was updated successfully, but these errors were encountered: