v4.0.x: adding op-codes for syscall ipc for shmat/shmdt #7778

Merged on Jun 8, 2020 (1 commit)

@markalle markalle commented Jun 3, 2020

FOR USERS WHO ENDED UP HERE FROM THE README

This PR fixes a bug triggered by glibc >= v2.23, which removed some constants from a header file that Open MPI relied on to intercept the shmat() and shmdt() function calls.

Does this bug affect you?

This bug affects your application only if all of the following are satisfied:

  • The system uses glibc >= v2.23 (e.g., RHEL 8/CentOS 8).
  • The application calls shmat() (i.e., the SYSV shared memory APIs) and then invokes MPI messaging APIs such as MPI_SEND or MPI_RECV on buffers in that shared memory segment.
  • The environment uses Open MPI's memory registration cache (e.g., the openib BTL, the uct BTL, the pami PML, and/or the pami OSC).
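
A minimal sketch for checking only the first condition at run time. It relies on glibc's `gnu_get_libc_version()`; the helper names `version_at_least` and `glibc_is_2_23_or_later` are hypothetical and not part of Open MPI. It says nothing about the other two conditions, which depend on the application and on how Open MPI is configured.

```c
#include <stdio.h>
#include <gnu/libc-version.h>

/*
 * Hypothetical helper: compares a "major.minor" version string against a
 * wanted version. Returns 1 if version >= wanted, 0 if it is older, and
 * -1 if the string cannot be parsed.
 */
int version_at_least(const char *version, int want_major, int want_minor)
{
    int major = 0, minor = 0;

    if (sscanf(version, "%d.%d", &major, &minor) != 2) {
        return -1;
    }
    if (major != want_major) {
        return major > want_major;
    }
    return minor >= want_minor;
}

/* Hypothetical helper: 1 if the running glibc is v2.23 or later. */
int glibc_is_2_23_or_later(void)
{
    return version_at_least(gnu_get_libc_version(), 2, 23);
}
```

A result of 1 from `glibc_is_2_23_or_later()` only means the system falls into the glibc range named in the first bullet; the other two conditions still have to hold for the bug to matter.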

This is a fairly narrow set of circumstances, but when all of them hold, silent data corruption can occur. Users are encouraged to upgrade immediately.


These op codes used to be in bits/ipc.h but were removed in glibc in 2015
with a comment saying they should be defined in internal headers:
https://sourceware.org/bugzilla/show_bug.cgi?id=18560
and when glibc uses that syscall it seems to do so from its own definitions:
https://github.com/bminor/glibc/search?q=IPCOP_shmat&unscoped_q=IPCOP_shmat

So I think using #ifndef and defining them if they're not already defined
using the values from glibc is the best option.
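
A sketch of the fallback described above, not the literal patch. The values 21 and 22 are the ones used for these opcodes by glibc's internal definitions; the include and the placement are assumptions made for illustration.

```c
#include <sys/ipc.h>

/*
 * Newer glibc no longer exports these opcodes from its public headers.
 * Define them only if the system headers did not, so that older systems
 * keep using their own definitions.
 */
#ifndef IPCOP_shmat
#define IPCOP_shmat 21
#endif

#ifndef IPCOP_shmdt
#define IPCOP_shmdt 22
#endif
```

Guarding each definition with `#ifndef` keeps the change harmless on systems whose headers still provide the constants.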

At IBM, testing on RHEL 8 exposed this issue: with the opcodes undefined
on the system, the HAS_SHMDT macro evaluated to false, so intercept_shmat /
intercept_shmdt were left undefined and shmat/shmdt memory events went
unintercepted.

(cherry picked from commit e8fab05)
Signed-off-by: Mark Allen <markalle@us.ibm.com>

@awlauria awlauria changed the title adding op-codes for syscall ipc for shmat/shmdt v4.0.x: adding op-codes for syscall ipc for shmat/shmdt Jun 3, 2020

jsquyres commented Jun 3, 2020

Isn't this needed on master, too? It's unusual to bring this to a release branch before master.

EDIT: Disregard: I see that this is a cherry pick from a PR that isn't yet on master (#7777).

gpaulsen commented Jun 4, 2020

@markalle Please update the commit comment. As @bwbarrett pointed out on the master PR #7777, it looks like the commit message was truncated.

@gpaulsen gpaulsen self-requested a review June 4, 2020 16:21

@gpaulsen gpaulsen left a comment

Pending the commit message update, I approve.

We saw this problem on RHEL 8. The opcodes for these two calls are not defined in the OS headers, so if an application calls either of them, the memory hooks don't intercept it correctly, which can produce wrong answers.
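
For readers unfamiliar with the two calls, here is a minimal sketch of the SysV attach/detach lifecycle whose events the memory hooks need to observe. It contains no MPI code; the function name `shm_attach_roundtrip` is hypothetical, and the comment marks where an affected application would hand the buffer to MPI.

```c
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Attach a private SysV segment, write and read it back, then detach.
 * Returns 0 on success, -1 if a SysV call fails, and -2 if the data
 * read back does not match what was written.
 */
int shm_attach_roundtrip(size_t size)
{
    static const char msg[] = "payload";
    if (size < sizeof(msg)) {
        size = sizeof(msg);
    }

    int id = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
    if (id < 0) {
        return -1;
    }

    /* shmat(): the attach event a registration cache needs to see. */
    char *seg = shmat(id, NULL, 0);
    int attach_failed = (seg == (void *) -1);

    /* Mark for removal now so the segment is freed after the last detach. */
    shmctl(id, IPC_RMID, NULL);
    if (attach_failed) {
        return -1;
    }

    memcpy(seg, msg, sizeof(msg));
    /* An affected application would pass `seg` to MPI_Send/MPI_Recv here. */
    int rc = (memcmp(seg, msg, sizeof(msg)) == 0) ? 0 : -2;

    /* shmdt(): the detach event that must invalidate cached registrations. */
    if (shmdt(seg) != 0) {
        return -1;
    }
    return rc;
}
```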

Unfortunately, our only reliable test case is quite complex, and requires our ppc64le math library.

markalle commented Jun 4, 2020

I updated the commit message. I only meant for it to describe how IPCOP_shmat/IPCOP_shmdt not being defined on the system causes the macro to evaluate to false, so intercept_shmat/intercept_shmdt are left undefined.
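
The mechanism can be illustrated with a simplified sketch. All `DEMO_` names are hypothetical stand-ins chosen for this example; the real condition in Open MPI may test different or additional symbols.

```c
/* Hypothetical stand-ins; DEMO_ names are not Open MPI or glibc identifiers. */
#define DEMO_SYS_ipc 1   /* pretend the multiplexed ipc syscall is available */
/* DEMO_IPCOP_shmdt is deliberately NOT defined, mimicking glibc >= v2.23. */

/* The feature macro is true only when both pieces are available. */
#if defined(DEMO_SYS_ipc) && defined(DEMO_IPCOP_shmdt)
#define DEMO_HAS_SHMDT 1
#else
#define DEMO_HAS_SHMDT 0
#endif

#if DEMO_HAS_SHMDT
/* Compiled only when the macro is true; otherwise the hook does not exist. */
int demo_intercept_shmdt(const void *addr)
{
    (void) addr;   /* a real hook would notify the registration cache here */
    return 0;
}
#endif

/* Reports whether the hook was compiled in. */
int demo_shmdt_hook_available(void)
{
    return DEMO_HAS_SHMDT;
}
```

With the opcode missing, the hook is silently compiled out, so nothing fails at build time and the problem only shows up as stale registrations at run time.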

@gpaulsen gpaulsen merged commit 9fb905f into open-mpi:v4.0.x Jun 8, 2020
@markalle

I've submitted a bug report to gcc, with a simple reproducer, to see if they agree that the suspicious early store of r2 should be considered a bug. I'm also submitting a bug report to xlc because our compiler can do the same thing.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95692
