PPC64 - may have problems with atomic make check tests #2610
Comments
FWIW I tested the same RC on multiple PPC64 systems without any "make check" failures. Additionally, I have just now tried gcc-6.2 on both platforms, and "make check" passes just fine. v2.x defaults to using the gcc builtin atomics when present. So the changes to the inline asm in PR #2178 might be unrelated, unless some non-default configure flags were used, or a compiler that lacks GNU inline asm support (note that xlc-13.1 for Linux does provide the gcc builtins).
Full Fedora build logs: gcc is 6.2.1-2.fc26
I am able to reproduce by configuring with
@hjelmn my OpenPOWER VM, which you used for PR #2178, has gcc-5.4, but I suspect (given the failures with 4.x and 6.2) that it will also reproduce the problem if you lack alternatives.
I see this in the configure logs:
@opoplawski Yes, I also noticed
It turns out that the builtin atomics are disabled by default when --enable-builtin-atomics is not specified. Is that expected? Should I be specifying --enable-builtin-atomics in my builds?
I realize now that my original (no error) builds included --enable-debug, while my later ones removed that flag in addition to adding --disable-builtin-atomics. So I am no longer certain about what I claimed regarding which atomics are enabled by default, or about the correlation between disabling the builtins and reproducing your error. I am going to step away from this issue now and leave it to Nathan and Josh (who probably have a better idea of how to fix the problem).
I'll note that it looks like if I set --enable-builtin-atomics my builds succeed: https://koji.fedoraproject.org/koji/taskinfo?taskID=17005965
@opoplawski is it okay to require --enable-builtin-atomics for your purposes for 2.0.2?
Seems fine to me - I know nothing about these options though.
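For readers unfamiliar with the option, here is a rough sketch of what a "builtin atomics" code path looks like in general. It uses the GCC `__atomic` builtins directly. The function names `builtin_sketch_cmpset_64` and `builtin_sketch_add_64` and the chosen memory orders are assumptions made for illustration only; this is not the actual Open MPI implementation selected by --enable-builtin-atomics.

```c
#include <stdint.h>

/* Hypothetical helper (not an Open MPI function): compare-and-set built on
 * the GCC __atomic builtins. Returns 1 if *addr equaled oldval and was
 * replaced by newval, 0 otherwise. */
static inline int builtin_sketch_cmpset_64(volatile int64_t *addr,
                                           int64_t oldval, int64_t newval)
{
    /* The builtin overwrites its local copy of oldval on failure; the
     * caller's value is unaffected because oldval is passed by value. */
    return __atomic_compare_exchange_n(addr, &oldval, newval,
                                       0 /* strong compare-exchange */,
                                       __ATOMIC_ACQ_REL, __ATOMIC_RELAXED);
}

/* Hypothetical helper: atomic add that returns the new value. */
static inline int64_t builtin_sketch_add_64(volatile int64_t *addr,
                                            int64_t inc)
{
    return __atomic_add_fetch(addr, inc, __ATOMIC_RELAXED);
}
```

With builtins like these the compiler generates the load-reserve/store-conditional loop itself, so no hand-written operand constraints are involved.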
@hppritcha
@PHHargrove I wasn't intending to imply we'd fix this in the documentation, but we can use this issue as such.
I'm making this a blocker for 2.0.2. There appear to be more issues with PPC64 and atomics than we'd thought.
Removed @hjelmn; he has enough to do.
This looks like a regression caused by:
Adding the constraint to the input operands list seems to fix the problem in my limited tests. Please review/test and I'll open a PR. The issue with the build not using the builtin atomics needs investigation. However it shouldn't be a blocker. It might actually be better if we switch over to the builtin atomics on PPC only after we verify there are no performance regressions.

--- a/opal/include/opal/sys/powerpc/atomic.h
+++ b/opal/include/opal/sys/powerpc/atomic.h
@@ -223,6 +223,7 @@ static inline int32_t opal_atomic_swap_32(volatile int32_t *addr, int32_t newval
 #if (OPAL_ASSEMBLY_ARCH == OPAL_POWERPC64)
 #if OPAL_GCC_INLINE_ASSEMBLY
+
 static inline int64_t opal_atomic_add_64 (volatile int64_t* v, int64_t inc)
 {
     int64_t t;
@@ -232,7 +233,7 @@ static inline int64_t opal_atomic_add_64 (volatile int64_t* v, int64_t inc)
         " stdcx. %0, 0, %3 \n\t"
         " bne- 1b \n\t"
         : "=&r" (t), "=m" (*v)
-        : "r" (OPAL_ASM_VALUE64(inc)), "r" OPAL_ASM_ADDR(v)
+        : "r" (OPAL_ASM_VALUE64(inc)), "r" OPAL_ASM_ADDR(v), "m" (*v)
         : "cc");
     return t;
@@ -249,7 +250,7 @@ static inline int64_t opal_atomic_sub_64 (volatile int64_t* v, int64_t dec)
         " stdcx. %0,0,%3 \n\t"
         " bne- 1b \n\t"
         : "=&r" (t), "=m" (*v)
-        : "r" (OPAL_ASM_VALUE64(dec)), "r" OPAL_ASM_ADDR(v)
+        : "r" (OPAL_ASM_VALUE64(dec)), "r" OPAL_ASM_ADDR(v), "m" (*v)
         : "cc");
     return t;
@@ -268,7 +269,7 @@ static inline int opal_atomic_cmpset_64(volatile int64_t *addr,
         " bne- 1b \n\t"
         "2:"
         : "=&r" (ret), "=m" (*addr)
-        : "r" (addr), "r" (OPAL_ASM_VALUE64(oldval)), "r" (OPAL_ASM_VALUE64(newval))
+        : "r" (addr), "r" (OPAL_ASM_VALUE64(oldval)), "r" (OPAL_ASM_VALUE64(newval)), "m" (*addr)
         : "cc", "memory");
     return (ret == oldval);
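To illustrate why the extra "m" (*v) input operand in the patch above matters, here is a standalone sketch of the same pattern. With only the "=m" (*v) output, the compiler is told that the asm writes *v but not that it reads it, so it may treat an earlier store to *v as dead or reorder it past the asm; listing *v as an input as well declares the read. The name `asm_sketch_add_64` is hypothetical, the OPAL_ASM_* macros are left out, and the non-PowerPC fallback to a GCC builtin is an assumption added only so the sketch compiles on other architectures.

```c
#include <stdint.h>

/* Hypothetical standalone version of the patched add pattern (not the
 * Open MPI source). Atomically adds inc to *v and returns the new value. */
static inline int64_t asm_sketch_add_64(volatile int64_t *v, int64_t inc)
{
#if defined(__powerpc64__) && defined(__GNUC__)
    int64_t t;
    __asm__ __volatile__(
        "1:  ldarx   %0, 0, %3   \n\t"  /* load *v and take a reservation */
        "    add     %0, %2, %0  \n\t"  /* t = inc + t */
        "    stdcx.  %0, 0, %3   \n\t"  /* store if reservation still held */
        "    bne-    1b          \n\t"  /* otherwise retry */
        : "=&r" (t), "=m" (*v)          /* outputs: result, memory written */
        : "r" (inc), "r" (v), "m" (*v)  /* inputs: *v is also read */
        : "cc");
    return t;
#else
    /* Assumption for portability: on other targets fall back to a builtin. */
    return __atomic_add_fetch(v, inc, __ATOMIC_RELAXED);
#endif
}
```

An alternative with a similar effect on the compiler is a "memory" clobber, as opal_atomic_cmpset_64 in the patch already has, but that is a broader barrier than declaring the single memory operand.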
Do we want to get @nysal's proposed fix in for 2.0.2 or push it out to 2.0.3?
@hppritcha I'd like to get this in 2.0.2. PR coming up shortly.
@opoplawski is reporting a problem with the 2.0.2 rc1 candidate on PPC64:
The suspect is PR #2178, which looks like the only major change to the PPC atomics between 2.0.1 (where make distcheck works) and 2.0.2rc.
https://www.mail-archive.com/devel@lists.open-mpi.org//msg19851.html