ARM32 (and possibly some others) might take locks in Interlocked FCalls #97452

MichalPetryka · 2024-01-24T16:12:28Z

While working on #92974, I've realised that on some platforms without builtin atomics for all types, compilers might use a locking based implementation for them.
Checking the C11 defines, it seems like at least ARM32 is affected with Clang, but other platforms might be affected with GCC too.

Looking at the exact codegen, ARM32 and ARM 8.0 without atomics seem to use helper calls for Interlocked, only arm32 gives warnings about locking though.

The code should be fixed to avoid potential issues with this, I see 4 possible solutions:

Make the helper calls QCalls - expensive
Switch to preemptive in the FCall on affected platforms - also expensive
Implement Interlocked with always expand intrinsics on affected platforms - requires JIT work (should already be handled on ARM 8.0 in .NET 9)
Rewrite the unmanaged helpers in raw assembly in a way that doesn't use locks - requires VM work

Relying on the compiler to be lock free on any platform that uses helper calls for atomics seems unreliable as AFAIR those are free to implement them using locks.
cc @jkotas

jkotas · 2024-01-24T16:42:18Z

Do __sync... intrinsics have this problem? I think the simplest solution is to avoid __atomic_exchange_n and switch to __sync... intrinsics: #92974 (comment) .

MichalPetryka · 2024-01-24T16:43:44Z

Do __sync... intrinsics have this problem? I think the simplest solution is to avoid __atomic_exchange_n and switch to __sync... intrinsics: #92974 (comment) .

They also emit helper calls, they don't issue a warning though. I wouldn't rely on it being fine however as their docs say:

Not all operations are supported by all target processors. If a particular operation cannot be implemented on the target processor, a warning will be generated and a call an external function will be generated. The external function will carry the same name as the builtin, with an additional suffix `_n' where n is the size of the data type.

which makes it seem like there should be a warning in such case.

jkotas · 2024-01-24T22:40:18Z

It is reasonable to assume that the helper calls that __sync intrinsics compile into do not use locks. The helper calls can be used e.g. to select the most efficient implementation for given hardware.

Per LLVM documentation, __sync_... intrinsics are required to not use locks. From https://github.com/llvm/llvm-project/blob/main/llvm/docs/Atomics.rst#libcalls-__sync_ : "the Target in LLVM can claim support for atomics of an appropriate size, and then implement some subset of the operations via libcalls to a _sync* function. Such functions must not use locks in their implementation"

MichalPetryka · 2024-01-24T23:25:28Z

This doesn't seem to be guaranteed in GCC:

These functions are implemented in terms of the ‘__atomic’ builtins (see https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html).
They should not be used for new code which should use the ‘__atomic’ builtins instead.

Not all operations are supported by all target processors.
If a particular operation cannot be implemented on the target processor, a warning is generated and a call to an external function is generated.
The external function carries the same name as the built-in version, with an additional suffix ‘_n’ where n is the size of the data type.

jkotas · 2024-01-24T23:31:44Z

It may not be guaranteed, but it is reasonable to assume that it is the case.

Also, our officially supported build is clang/llvm. There is very little testing of gcc based builds. It is likely that gcc based builds have number of issues, so one more potential gcc-specific issue is not really a problem.

MichalPetryka · 2024-01-25T00:11:35Z

It seems however that with GCC all atomics are lock free with the exception of 64bit integers on RISC-V 32bit.
It seems to not use helpers on arm32.

Makes PAL use intrinsics that are guaranteed to not lock with Clang and with other compilers (probably only GCC) checks the C11 `lock free atomics` defines as there seems to be no better way to verify safety there. Fixes dotnet#97452.

ghost added the untriaged New issue has not been triaged by the area owner label Jan 24, 2024

dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Jan 24, 2024

MichalPetryka mentioned this issue Jan 24, 2024

Implement Interlocked for small types. #92974

Merged

filipnavara added arch-arm32 area-PAL-coreclr and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Jan 25, 2024

MichalPetryka mentioned this issue Jan 25, 2024

Use non locking atomic intrinsics #97527

Closed

ghost added the in-pr There is an active PR which will close this issue when it is merged label Jan 25, 2024

mangod9 removed the untriaged New issue has not been triaged by the area owner label Jul 3, 2024

mangod9 added this to the Future milestone Jul 3, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ARM32 (and possibly some others) might take locks in Interlocked FCalls #97452

ARM32 (and possibly some others) might take locks in Interlocked FCalls #97452

MichalPetryka commented Jan 24, 2024

jkotas commented Jan 24, 2024

MichalPetryka commented Jan 24, 2024

jkotas commented Jan 24, 2024

MichalPetryka commented Jan 24, 2024

jkotas commented Jan 24, 2024

MichalPetryka commented Jan 25, 2024

ARM32 (and possibly some others) might take locks in Interlocked FCalls #97452

ARM32 (and possibly some others) might take locks in Interlocked FCalls #97452

Comments

MichalPetryka commented Jan 24, 2024

jkotas commented Jan 24, 2024

MichalPetryka commented Jan 24, 2024

jkotas commented Jan 24, 2024

MichalPetryka commented Jan 24, 2024

jkotas commented Jan 24, 2024

MichalPetryka commented Jan 25, 2024