Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Xtensa multicore: AtomicXX::swap() broken for atomics smaller than 32 bit #106

Closed
ivmarkov opened this issue Feb 7, 2022 · 7 comments
Closed
Assignees

Comments

@ivmarkov
Copy link

ivmarkov commented Feb 7, 2022

AtomicXX::swap() does not work correctly when the atomic is smaller than 32 bit (i.e. AtomicU16, AtomicBool, possibly AtomicU8 too).

I suspect the LLVM atomic generation code is wrong, as these work just fine on riscv32imc (esp32c3) and (I would assume) single-core Xtensa (esp32s2), where atomics are lowered to libcalls using critical sections (implemented in ESP-IDF).

Here's a repo that demonstrates the problem. I've only tested it on the esp32 target, but esp32s3 might be affected as well, as it is also multicore:

  • cargo build and then flash on esp32 - the assert_eqs commented with // This breaks?! will fail
  • cargo build --target riscv32imc-esp-espidf - code should complete up to println!("All done correctly!")
@ivmarkov
Copy link
Author

ivmarkov commented Feb 7, 2022

@MabezDev @igrr ^^^ Kind of severe.

@ivmarkov
Copy link
Author

ivmarkov commented Feb 7, 2022

Forgot to mention: tried esp Rustc 1.57 and 1.58, both are affected. Not sure if the LLVM Espressif fork was updated for 1.58 but the point is, that problem may have been always there.

@MabezDev MabezDev self-assigned this Feb 7, 2022
@MabezDev
Copy link
Member

Thanks for the test case @ivmarkov, we've fixed this in our LLVM fork. I'll update this issue when the new LLVM release is ready.

@ivmarkov
Copy link
Author

Is this fix available in the GH Espressif LLVM repo? I might want to rebuild the compiler. That is, unless a new LLVM release is imminent.

@MabezDev
Copy link
Member

Not available on github just yet, but here is a patchfile you can apply. I hope we can have a release out by next week.

From a1c14dd680bdea61750431657b15ab10ae943fc5 Mon Sep 17 00:00:00 2001
From: Andrei Safronov <safronov@espressif.com>
Date: Wed, 9 Feb 2022 00:59:23 +0300
Subject: [PATCH] [Xtensa] Fix atomic swap for 8/16 bit operands.

---
 llvm/lib/Target/Xtensa/XtensaISelLowering.cpp |   2 +-
 llvm/test/CodeGen/Xtensa/atomicrmw.ll         | 103 ++++++++++++++++++
 2 files changed, 104 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/CodeGen/Xtensa/atomicrmw.ll

diff --git a/llvm/lib/Target/Xtensa/XtensaISelLowering.cpp b/llvm/lib/Target/Xtensa/XtensaISelLowering.cpp
index 2598d3c9dd99..dca51a1f65da 100644
--- a/llvm/lib/Target/Xtensa/XtensaISelLowering.cpp
+++ b/llvm/lib/Target/Xtensa/XtensaISelLowering.cpp
@@ -2359,7 +2359,7 @@ XtensaTargetLowering::emitAtomicSwap(MachineInstr &MI, MachineBasicBlock *BB,
   unsigned R8 = MRI.createVirtualRegister(RC);
 
   BuildMI(*BB, St, DL, TII.get(Xtensa::SSR)).addReg(BitOffs);
-  BuildMI(*BB, St, DL, TII.get(Xtensa::SLL), R8).addReg(AtomValLoop);
+  BuildMI(*BB, St, DL, TII.get(Xtensa::SRL), R8).addReg(AtomValLoop);
 
   if (isByteOperand) {
     BuildMI(*BB, St, DL, TII.get(Xtensa::SEXT), Res.getReg())
diff --git a/llvm/test/CodeGen/Xtensa/atomicrmw.ll b/llvm/test/CodeGen/Xtensa/atomicrmw.ll
new file mode 100644
index 000000000000..f2b7526e33e8
--- /dev/null
+++ b/llvm/test/CodeGen/Xtensa/atomicrmw.ll
@@ -0,0 +1,103 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=xtensa -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefix=CHECK-XTENSA %s
+
+define i8 @atomicrmw_xchg_i8_seq_cst(i8* %a, i8 %b) nounwind {
+; CHECK-XTENSA-LABEL: atomicrmw_xchg_i8_seq_cst:
+; CHECK-XTENSA:       # %bb.0:
+; CHECK-XTENSA-NEXT:    entry	a1, 32
+; CHECK-XTENSA-NEXT:    memw
+; CHECK-XTENSA-NEXT:    movi.n	a8, 3
+; CHECK-XTENSA-NEXT:    and	a8, a8, a2
+; CHECK-XTENSA-NEXT:    sub	a9, a2, a8
+; CHECK-XTENSA-NEXT:    slli	a8, a8, 3
+; CHECK-XTENSA-NEXT:    movi	a10, 255
+; CHECK-XTENSA-NEXT:    ssl	a8
+; CHECK-XTENSA-NEXT:    movi.n	a11, -1
+; CHECK-XTENSA-NEXT:    sll	a10, a10
+; CHECK-XTENSA-NEXT:    xor	a11, a10, a11
+; CHECK-XTENSA-NEXT:    l32i.n	a12, a9, 0
+; CHECK-XTENSA-NEXT:    sll	a12, a3
+; CHECK-XTENSA-NEXT:    l32i.n	a13, a9, 0
+; CHECK-XTENSA-NEXT:    and	a14, a13, a10
+; CHECK-XTENSA-NEXT:  .LBB0_1: # =>This Loop Header: Depth=1
+; CHECK-XTENSA-NEXT:    #     Child Loop BB0_2 Depth 2
+; CHECK-XTENSA-NEXT:    mov.n	a13, a14
+; CHECK-XTENSA-NEXT:    memw
+; CHECK-XTENSA-NEXT:    l32i.n	a14, a9, 0
+; CHECK-XTENSA-NEXT:    and	a7, a14, a11
+; CHECK-XTENSA-NEXT:  .LBB0_2: #   Parent Loop BB0_1 Depth=1
+; CHECK-XTENSA-NEXT:  # =>  This Inner Loop Header: Depth=2
+; CHECK-XTENSA-NEXT:    mov.n	a15, a7
+; CHECK-XTENSA-NEXT:    or	a14, a12, a15
+; CHECK-XTENSA-NEXT:    or	a7, a13, a15
+; CHECK-XTENSA-NEXT:    wsr	a7, scompare1
+; CHECK-XTENSA-NEXT:    s32c1i	a14, a9, 0
+; CHECK-XTENSA-NEXT:    beq	a7, a14, .LBB0_4
+; CHECK-XTENSA-NEXT:  # %bb.3: #   in Loop: Header=BB0_2 Depth=2
+; CHECK-XTENSA-NEXT:    and	a7, a14, a11
+; CHECK-XTENSA-NEXT:    bne	a7, a15, .LBB0_2
+; CHECK-XTENSA-NEXT:  .LBB0_4: #   in Loop: Header=BB0_1 Depth=1
+; CHECK-XTENSA-NEXT:    and	a14, a14, a10
+; CHECK-XTENSA-NEXT:    bne	a14, a13, .LBB0_1
+; CHECK-XTENSA-NEXT:  # %bb.5:
+; CHECK-XTENSA-NEXT:    ssr	a8
+; CHECK-XTENSA-NEXT:    srl	a8, a14
+; CHECK-XTENSA-NEXT:    sext	a2, a8, 7
+; CHECK-XTENSA-NEXT:    memw
+; CHECK-XTENSA-NEXT:    retw.n
+
+  %1 = atomicrmw xchg i8* %a, i8 %b seq_cst
+  ret i8 %1
+}
+
+define i16 @atomicrmw_xchg_i16_seq_cst(i16* %a, i16 %b) nounwind {
+; CHECK-XTENSA-LABEL: atomicrmw_xchg_i16_seq_cst:
+; CHECK-XTENSA:       # %bb.0:
+; CHECK-XTENSA-NEXT:    entry	a1, 32
+; CHECK-XTENSA-NEXT:    memw
+; CHECK-XTENSA-NEXT:    movi.n	a8, 3
+; CHECK-XTENSA-NEXT:    and	a8, a8, a2
+; CHECK-XTENSA-NEXT:    sub	a9, a2, a8
+; CHECK-XTENSA-NEXT:    slli	a8, a8, 3
+; CHECK-XTENSA-NEXT:    movi.n	a10, 1
+; CHECK-XTENSA-NEXT:    slli	a10, a10, 16
+; CHECK-XTENSA-NEXT:    addi.n	a10, a10, -1
+; CHECK-XTENSA-NEXT:    ssl	a8
+; CHECK-XTENSA-NEXT:    movi.n	a11, -1
+; CHECK-XTENSA-NEXT:    sll	a10, a10
+; CHECK-XTENSA-NEXT:    xor	a11, a10, a11
+; CHECK-XTENSA-NEXT:    l32i.n	a12, a9, 0
+; CHECK-XTENSA-NEXT:    sll	a12, a3
+; CHECK-XTENSA-NEXT:    l32i.n	a13, a9, 0
+; CHECK-XTENSA-NEXT:    and	a14, a13, a10
+; CHECK-XTENSA-NEXT:  .LBB1_1: # =>This Loop Header: Depth=1
+; CHECK-XTENSA-NEXT:    #     Child Loop BB1_2 Depth 2
+; CHECK-XTENSA-NEXT:    mov.n	a13, a14
+; CHECK-XTENSA-NEXT:    memw
+; CHECK-XTENSA-NEXT:    l32i.n	a14, a9, 0
+; CHECK-XTENSA-NEXT:    and	a7, a14, a11
+; CHECK-XTENSA-NEXT:  .LBB1_2: #   Parent Loop BB1_1 Depth=1
+; CHECK-XTENSA-NEXT:    # =>  This Inner Loop Header: Depth=2
+; CHECK-XTENSA-NEXT:    mov.n	a15, a7
+; CHECK-XTENSA-NEXT:    or	a14, a12, a15
+; CHECK-XTENSA-NEXT:    or	a7, a13, a15
+; CHECK-XTENSA-NEXT:    wsr	a7, scompare1
+; CHECK-XTENSA-NEXT:    s32c1i	a14, a9, 0
+; CHECK-XTENSA-NEXT:    beq	a7, a14, .LBB1_4
+; CHECK-XTENSA-NEXT:  # %bb.3: #   in Loop: Header=BB1_2 Depth=2
+; CHECK-XTENSA-NEXT:    and	a7, a14, a11
+; CHECK-XTENSA-NEXT:    bne	a7, a15, .LBB1_2
+; CHECK-XTENSA-NEXT:  .LBB1_4: #   in Loop: Header=BB1_1 Depth=1
+; CHECK-XTENSA-NEXT:    and	a14, a14, a10
+; CHECK-XTENSA-NEXT:    bne	a14, a13, .LBB1_1
+; CHECK-XTENSA-NEXT:  # %bb.5:
+; CHECK-XTENSA-NEXT:    ssr	a8
+; CHECK-XTENSA-NEXT:    srl	a8, a14
+; CHECK-XTENSA-NEXT:    sext	a2, a8, 15
+; CHECK-XTENSA-NEXT:    memw
+; CHECK-XTENSA-NEXT:    retw.n
+
+  %1 = atomicrmw xchg i16* %a, i16 %b seq_cst
+  ret i16 %1
+}
-- 

ivmarkov added a commit to ivmarkov/llvm-project that referenced this issue Feb 11, 2022
@MabezDev
Copy link
Member

Now available in the the github repo: espressif/llvm-project@8a93494.

Will be live in this fork with the next Rust release.

@MabezDev
Copy link
Member

Fixed in the 1.59 release, thanks for the report :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

No branches or pull requests

2 participants