osc/rdma segmentation fault when using IB atomics (c_post_start) #1329
Comments
Here is the stack trace (sorry, no line numbers). The crash happens with only 2 ranks, on rank 1.
EDIT: Added triple-single-ticks to the verbatim section to make it easier to read.
Thanks, Jeff. Nathan, the crash does not always happen on rank 1; sometimes it happens on rank 0 as well.
My MTT was out of date on our clusters. I updated it and now see this issue. Will fix now.
The typo caused SEGVs on systems with only fetching atomic support. Fixes open-mpi#1329 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Simple enough. There was a typo in the osc/rdma MPI_Win_start implementation. Fixed in the referencing PR.
The typo caused SEGVs on systems with only fetching atomic support. Fixes open-mpi/ompi#1329 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov> (cherry picked from open-mpi/ompi@a19c265) Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
The MTT test ibm/onesided/c_post_start still fails with a segmentation fault when using the new IB atomics.
Setting btl_openib_flags to 12599 makes the bug disappear.
The core file I get is corrupted, so I don't have a stack trace yet. I will post it as soon as I get one.
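
To see what the workaround value changes, it can help to look at it as a bitmask. The sketch below only lists which bit positions are set in 12599, assuming `btl_openib_flags` is interpreted as a set of bit flags. The helper name `set_bits` is hypothetical, and the meaning of each bit is not stated in this report, so it would need to be checked against the BTL flag definitions of the Open MPI version in use.

```python
def set_bits(value):
    """Return the positions of the bits set in value, lowest bit first."""
    return [i for i in range(value.bit_length()) if (value >> i) & 1]


flags = 12599  # workaround value quoted in the report above

print(hex(flags))       # hexadecimal form of the same value
print(set_bits(flags))  # bit positions that are enabled
```

MCA parameters such as this one are normally passed on the command line, for example `mpirun --mca btl_openib_flags 12599 ...`, or through the matching `OMPI_MCA_btl_openib_flags` environment variable.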