Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core: rmutex compilation issue on ARM with llvm #12048

Closed
kaspar030 opened this issue Aug 20, 2019 · 5 comments · Fixed by #12401
Closed

core: rmutex compilation issue on ARM with llvm #12048

kaspar030 opened this issue Aug 20, 2019 · 5 comments · Fixed by #12401
Labels
Area: core Area: RIOT kernel. Handle PRs marked with this with care! Area: toolchain Area: toolchains; everything related to compilation, libc, linking, … Platform: ARM Platform: This PR/issue effects ARM-based platforms Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors)

Comments

@kaspar030
Copy link
Contributor

Description

Cannot build anything with TOOLCHAIN=llvm for ARM.
It fails with a warning when compiling rmutex.c.

Googling the error points to where the warning was introduced in llvm.

Quote: "If an atomic variable is misaligned Clang will emit accesses to it as a libcall. These calls are likely to be slow and involve locks; they are almost certainly not what the developer intended. So this patch adds a warning (promotable or ignorable) for tat situation."

It seems like clang warns that it cannot use single-instruction magic (cmpxcg?), but has to resort to a compiler-rt library function call that uses locks etc.

I don't quite understand the issue. First I thought the owner variable sharing memory with the refcount variable was causing this, but changing "owner" to (word-sized) atomic_int doesn't solve the issue.

Steps to reproduce the issue

Try "BOARD=samr21-xpro make -C examples/hello-world".

Expected results

Actual results

rmutex.c:128:12: error: large atomic operation may incur significant performance penalty [-Werror,-Watomic-alignment]
    assert(atomic_load_explicit(&rmutex->owner,memory_order_relaxed) == thread_getpid());
           ^
/usr/arm-none-eabi/include/stdatomic.h:257:2: note: expanded from macro 'atomic_load_explicit'
        __c11_atomic_load(object, order)
        ^
rmutex.c:143:9: error: large atomic operation may incur significant performance penalty [-Werror,-Watomic-alignment]
        atomic_store_explicit(&rmutex->owner, KERNEL_PID_UNDEF, memory_order_relaxed);
        ^
/usr/arm-none-eabi/include/stdatomic.h:259:2: note: expanded from macro 'atomic_store_explicit'
        __c11_atomic_store(object, desired, order)
        ^
2 errors generated.

Versions

Using arch clang 8.0.1 and gcc 9.1.0 headers (full toolchain versions gist)

@kaspar030 kaspar030 added Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) Platform: ARM Platform: This PR/issue effects ARM-based platforms Area: core Area: RIOT kernel. Handle PRs marked with this with care! Area: toolchain Area: toolchains; everything related to compilation, libc, linking, … labels Aug 20, 2019
@maribu
Copy link
Member

maribu commented Oct 8, 2019

I cannot reproduce the issue with LLVM 9.0.0 - I guess it has been fixed upstream. Can you confirm that the issue has been resolved?

@maribu
Copy link
Member

maribu commented Oct 8, 2019

OK, works for nucleo-f767zi and bluepill (and does not insert the library calls to implement atomic accesses), but does not compile on samd21-xpro with TOOLCHAIN=llvm

@maribu
Copy link
Member

maribu commented Oct 8, 2019

I can also reproduce this with the nucleo-l073rz, another Cortex M0+ board.

@maribu
Copy link
Member

maribu commented Oct 8, 2019

I opened a bug report at the LLVM bug tracker: https://bugs.llvm.org/show_bug.cgi?id=43603

@maribu
Copy link
Member

maribu commented Oct 9, 2019

Some Background

  • GCC does implement atomic operations on embedded1 targets lock-free, whenever possible
    • This enforces that implementations of the library functions like __sync_fetch_and_add_2() and friends actually disable interrupts to achieve atomic behavior. RIOT already does this and it seems to be the best implementation to me anyway
  • If clang is not able to implement at least one possible atomic operation on a given type without resorting to the library call, it will using library calls for all atomic operations on that type
    • The reason is that clang takes into account that __sync_fetch_and_add_2() and friends could be using e.g. a mutex instead of disabling interrupts. But the mutex would not exclude other operations on the same value, if they are not using the library call. So in that case the guarantees of libatomic would be violated.

To us, the behavior of GCC is clearly better, as the assumption of disabling interrupts during the implementation of the library functions is totally sane for embedded1 targets, applies to RIOT, and yields better performance. I hope that I can convince the LLVM guys to just also assume that on embedded bare metal targets.

1 Embedded target here shall mean: Single-threaded, single-core bare metal targets without MMU and any meaningful distinction between user and kernel mode

The Actual Issue

The warning of clang that it used the library function to implement the atomic operation is preventing the compilation due to -Werror. But that warning has no value to us. We are not using C11 atomics to achieve the best performance in a highly concurrent high performance application on CPUs with many cores.

Taking the following example:

extern atomic_int_least32_t bar;

void foo1(void)
{
    unsigned state = irq_disable();
    bar += 42;
    irq_restore(state);
}

void foo2(void)
{
    atomic_fetch_add(&bar, 42);
}

In the context of RIOT: On systems supporting lock free implementations of atomic_fetch_add() the version in foo2() would be faster, on every other platform (a call to __sync_fetch_and_add_2() is generated to implement atomic_fetch_add(&bar, 42)) the performance is exactly the same with LTO enabled. And also foo2() only is one line of code, where foo1() is three.

So to us the warning really has no value - we are totally fine with the library calls. We should therefor just disable the warning.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: core Area: RIOT kernel. Handle PRs marked with this with care! Area: toolchain Area: toolchains; everything related to compilation, libc, linking, … Platform: ARM Platform: This PR/issue effects ARM-based platforms Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors)
Projects
None yet
2 participants