[CHERI-RISC-V] Report true for __atomic_always_lock_free(sizeof(__intcap)) #721

Open · wants to merge 18 commits from atomic-always-lock-free-fixes into dev
Conversation

arichardson (Member)

This fixes problems such as std::atomic<void*>::is_lock_free reporting
false in C++14 mode, as well as a compilation error in compiler-rt's
atomic.c.

Ideally the builtin would take a type rather than a size, but unfortunately that can no longer be changed. So instead we just ensure that all capability-sized atomics are lock free.
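For illustration, a minimal sketch of the user-visible behaviour this enables, assuming a purecap CHERI target where sizeof(void *) == sizeof(__intcap):

#include <atomic>

// With this change, a purecap build should satisfy both checks below;
// previously std::atomic<void*>::is_lock_free returned false in C++14 mode.
static_assert(__atomic_always_lock_free(sizeof(void *), 0),
              "capability-sized atomics should be lock free");

bool pointer_atomics_lock_free() {
  std::atomic<void *> p{nullptr};
  return p.is_lock_free();
}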

Making this possible involves quite a lot of changes:

  • lower atomic __int128 (i64 for RV32) operations using capabilities
  • as part of this, add an exact flag to cmpxchg to ensure all capability bits are compared

Note: we do not yet report true for hybrid mode, since we don't yet support all atomic ops on capability pointers: see #490

This is a rather large change but each commit should be small enough to review.

arichardson added a commit that referenced this pull request Dec 12, 2023

These projects now check for __atomic_always_lock_free on uintptr_t, which
returns false for purecap until pull request #721 is merged.
arichardson force-pushed the atomic-always-lock-free-fixes branch 2 times, most recently from 056a8a4 to a541ac1 on February 18, 2024 00:54
-  unsigned FailureOrdering : 4; // enum AtomicOrdering
+  unsigned FailureOrdering : 3; // enum AtomicOrdering
+  /// Whether capability comparisons are exact for cmpxchg atomic operations.
+  unsigned ExactCompare : 1;
arichardson (Member Author)

I considered adding two new ISD:: operations for the exact CMPXCHG, but adding a flag here seems less invasive.

arichardson changed the base branch from upstream-llvm-merge to dev on February 18, 2024 00:55
This will be used in the follow-up commits to determine whether to lower
cmpxchg with an exact capability comparison or just the address.
This will be used to support exact comparisons in cmpxchg lowering.
This also allows dropping one test that is now redundant since only the
pointee type differed.
This should use an exact capability comparison instead of an address
compare.
If the exact flag is set on the cmpxchg instruction, we use an exact
comparison in the LL/SC expansion. This will allow use of capability
cmpxchg for capability-sized integers (since we have to compare all
bits in that case) and will also make it possible to correctly handle
the exact-equals semantics in the future.
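To illustrate the two comparison semantics the flag selects, a hedged source-level sketch (purecap CHERI C/C++; __builtin_cheri_equal_exact is the CHERI Clang builtin for a full-bit comparison, while == on capabilities compares addresses only):

// Inexact: what cmpxchg did before - only the address is compared.
bool address_equal(void *a, void *b) {
  return a == b;
}

// Exact: what the new flag requests - every capability bit, including the
// tag, must match (the semantics an exact LL/SC expansion must preserve).
bool exact_equal(void *a, void *b) {
  return __builtin_cheri_equal_exact(a, b);
}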
In order to report true for __atomic_always_lock_free(sizeof(uintptr_t), 0)
we will have to also expand i128/i64 atomics using capability ops.
We can perform an atomic capability load and extract the raw bits. As can
be seen from the test case, the IR-level conversion to i128 using shift+or
is converted to a direct read into the register holding the high part of
an i128.
We can perform an atomic capability store from the raw bits. As can
be seen from the test case, the IR-level i128 shift is elided and
directly replaced with a csethigh.
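Taken together, the two commits above should make operations like the following compile to inline capability loads/stores on purecap RV64 rather than __atomic_load_16/__atomic_store_16 libcalls (a sketch; std::atomic<unsigned __int128> assumes the standard library accepts the type):

#include <atomic>

std::atomic<unsigned __int128> value;  // i64 would be the RV32 equivalent

unsigned __int128 load_bits() {
  // Expected lowering per the commit message: one atomic capability load,
  // with the high half read directly instead of reassembled via shift+or.
  return value.load(std::memory_order_acquire);
}

void store_bits(unsigned __int128 v) {
  // Expected lowering: build the raw bits with csethigh, then perform one
  // atomic capability store.
  value.store(v, std::memory_order_release);
}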
Now that we have an exact bit on cmpxchg we can lower 2*XLen integer
cmpxchg using a capability operation with the raw bits. While this
also compares the tag bit in addition to the data bits, this should not
matter. IMO writing a valid capability with the same bit pattern but a
different tag bit to the location should be treated as a conflicting store.
This completes the set of atomic operations on i128/i64 and will allow
us to return true for `__atomic_always_lock_free(sizeof(__intcap))`.
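A sketch of the kind of 2*XLen compare-and-swap this lowering now expands inline (same hedged std::atomic<unsigned __int128> assumption as above):

#include <atomic>

std::atomic<unsigned __int128> slot;

bool replace_bits(unsigned __int128 expected, unsigned __int128 desired) {
  // Lowered to an exact capability cmpxchg on the raw bits; as the commit
  // notes, the tag bit participates in the comparison, which it argues is
  // acceptable since a tag-only difference should count as a conflict.
  return slot.compare_exchange_strong(expected, desired);
}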
This is a lot more efficient than a CMPXCHG loop.
We were just missing the necessary tablegen patterns to support this.
With the latest commit we can lower i64/i128 atomic loads and stores as
atomic capability operations with the appropriate fences.
We don't yet return true here for purecap CHERI since right now clang
will emit libcalls for atomic operations on non-capability types of
sizeof(uintptr_t), such as __int128 on 64-bit RISC-V.
…cap))

Now that we can expand all 2*XLen atomics inline (at least for purecap),
we can report true for this builtin.

This fixes problems such as std::atomic<void*>::is_lock_free reporting
false in C++14 mode as well as a compilation error in compiler-rt
atomic.c.
We were assuming there can only ever be one memory operand on the pseudo
instructions, but there can be more than one if the branch-folder pass
merges identical blocks. Found while building CheriBSD.
Using separate pseudos for exact and inexact comparisons ensures that
our lowering does not depend on the MachineMemOperand (after SDAG), since
passes could drop it (which would force the most conservative approach).
This adds a bit of boilerplate, but it's not as bad as I expected and is
less fragile than the previous approach.
// This can happen when expanding an RMW to a cmpxchg that also needs
// expanding.
// TODO: assert that we don't recurse more than once or use a worklist?
runOnFunction(F);
arichardson (Member Author)

This should be fixed to just run the expansion when needed. Otherwise this PR should be ready for review.
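For reference, a hypothetical sketch of the worklist shape the TODO suggests (Inst and expandOnce are illustrative stand-ins, not LLVM API):

#include <vector>

struct Inst { bool isRMW; };  // stand-in for an atomic instruction

// Expand one instruction; expansion may create new instructions that
// themselves need expanding (e.g. an RMW lowered to a cmpxchg).
static void expandOnce(Inst in, std::vector<Inst> &work) {
  if (in.isRMW)
    work.push_back(Inst{false});  // queue the generated cmpxchg as well
}

// Iterate to a fixed point instead of recursively re-running the pass.
static void expandAll(std::vector<Inst> work) {
  while (!work.empty()) {
    Inst in = work.back();
    work.pop_back();
    expandOnce(in, work);
  }
}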
