osc/base: Detect unsupported data types and abort #2928
Using MPI_MINLOC or MPI_MAXLOC with the following data types leads to data corruption:
* MPI_DOUBLE_INT
* MPI_LONG_INT
* MPI_SHORT_INT
* MPI_LONG_DOUBLE_INT

Detect this case, print an error message, and abort.

This workaround should be removed once the following issue is resolved:
* open-mpi#1666

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
(cherry picked from commit 94f92f6)
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
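For readers following along, the check described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the enums and every `sketch_*` name are hypothetical stand-ins for the real MPI op and datatype handles, and the error path uses plain `fprintf` rather than Open MPI's help-message machinery.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for MPI handles. Real code would compare
 * against MPI_MINLOC, MPI_MAXLOC, MPI_DOUBLE_INT, and so on. */
typedef enum { SKETCH_OP_SUM, SKETCH_OP_MINLOC, SKETCH_OP_MAXLOC } sketch_op_t;

typedef enum {
    SKETCH_DT_INT,
    SKETCH_DT_DOUBLE_INT,
    SKETCH_DT_LONG_INT,
    SKETCH_DT_SHORT_INT,
    SKETCH_DT_LONG_DOUBLE_INT
} sketch_dt_t;

/* True for the two location-reporting reduction operations. */
bool sketch_is_loc_op(sketch_op_t op)
{
    return op == SKETCH_OP_MINLOC || op == SKETCH_OP_MAXLOC;
}

/* True for the four pair types listed in the PR description. */
bool sketch_is_affected_type(sketch_dt_t dt)
{
    switch (dt) {
    case SKETCH_DT_DOUBLE_INT:
    case SKETCH_DT_LONG_INT:
    case SKETCH_DT_SHORT_INT:
    case SKETCH_DT_LONG_DOUBLE_INT:
        return true;
    default:
        return false;
    }
}

/* Returns 0 if the accumulate may proceed, or -1 after printing an
 * error when the op/datatype pair is one of the rejected combinations.
 * A caller would abort the job on -1. */
int sketch_check_accumulate(sketch_op_t op, sketch_dt_t dt)
{
    if (sketch_is_loc_op(op) && sketch_is_affected_type(dt)) {
        fprintf(stderr,
                "Unsupported datatype for MINLOC/MAXLOC in one-sided "
                "accumulate; aborting to avoid data corruption.\n");
        return -1;
    }
    return 0;
}
```

The point of keeping the test in one small predicate is that every accumulate entry point can call it before touching the target buffer, and the whole thing is easy to delete once the underlying issue is fixed.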
Refs PR #2832
Please consolidate the opal_output and opal_show_help into a single, descriptive/useful show_help message.
Looks fine to me for a temporary stop-gap.
I'd like to see a write-up of what Nathan's proposed solution might look like. We may (or may not) have time to work on this.
@jjhursey I think this patch will only work for osc/pt2pt. osc/rdma will probably need a similar check in ompi_osc_rdma_rget_accumulate_internal().
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
362ff7b to 13d965f
I've updated this PR to reflect the changes in PR #2927. I'll keep them in sync if any further changes are necessary.
Using MPI_MINLOC or MPI_MAXLOC with the following data types leads to data corruption:
* MPI_DOUBLE_INT
* MPI_LONG_INT
* MPI_SHORT_INT
* MPI_LONG_DOUBLE_INT

Detect this case, print an error message, and abort.

This workaround should be removed once the following issue is resolved:
* open-mpi#1666

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
13d965f to cc5747f
bot:mellanox:retest
@hppritcha Once CI finishes, this is good to go.