-
Notifications
You must be signed in to change notification settings - Fork 931
opal/fifo: add missing memory barrier #770
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
|
This looks good to me. I verified that the opal_fifo test now passes on ppc. Although unrelated to this patch, opal_lifo seems to hit a segmentation fault. Does the opal_lifo test pass for you? |
|
I couldn't reproduce the lifo crash in the virtual machine. I will take another look today to see if I can identify the issue. |
opal/fifo: add missing memory barrier
|
Here's the stack: static inline opal_list_item_t *opal_lifo_pop_atomic (opal_lifo_t* lifo)
{
opal_list_item_t *item;
while ((item = (opal_list_item_t *) lifo->opal_lifo_head.data.item) != &lifo->opal_lifo_ghost) {
opal_atomic_rmb();
/* ensure it is safe to pop the head */
if (opal_atomic_swap_32((volatile int32_t *) &item->item_free, 1)) {
continue;
}Is the read barrier above for ordering the loads for "item" and "item->item_free"? If so, it might not be needed as dependent (data dependency) loads are always executed in order on most architectures (Probably Alpha is the exception). I tried adding a write barrier as shown below and that seems to help. I commented out the read barrier, but thats inconsequential to the segv issue: --- a/opal/class/opal_lifo.h
+++ b/opal/class/opal_lifo.h
@@ -187,13 +187,14 @@ static inline opal_list_item_t *opal_lifo_pop_atomic (opal_lifo_t* lifo)
{
opal_list_item_t *item;
while ((item = (opal_list_item_t *) lifo->opal_lifo_head.data.item) != &lifo->opal_lifo_ghost) {
- opal_atomic_rmb();
+ //opal_atomic_rmb();
/* ensure it is safe to pop the head */
if (opal_atomic_swap_32((volatile int32_t *) &item->item_free, 1)) {
continue;
}
+ opal_atomic_wmb ();
/* try to swap out the head pointer */
if (opal_atomic_cmpset_ptr (&lifo->opal_lifo_head.data.item, item,
(void *) item->opal_list_next)) { |
|
Thanks Nathan. The fix works fine here. |
|
@nysal Great! Thanks for testing. I will open a PR for 2.x. |
Pr/commits of week 47
Signed-off-by: Nathan Hjelm hjelmn@lanl.gov