-
Notifications
You must be signed in to change notification settings - Fork 929
fix problems with handling empty status #13478
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
hppritcha
commented
Oct 28, 2025
Testing with mpi4py for MPI 4.1 compliance uncovered a long existing problem in the way Open MPI handles (incorrectly) returning an empty status when that is specified by the MPI standard. With this fix, mpi4py passes if we declare MPI 4.1 compliance except for the big count tests. Signed-off-by: Howard Pritchard <howardp@lanl.gov>
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
but without bigcount tests Signed-off-by: Howard Pritchard <howardp@lanl.gov>
|
The fix is not to touch the |
| *indx = MPI_UNDEFINED; | ||
| if (MPI_STATUS_IGNORE != status) { | ||
| OMPI_COPY_STATUS(status, ompi_status_empty, false); | ||
| OMPI_COPY_STATUS(status, ompi_status_empty, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not what the standard mandates. Section 3.2.5 it states:
... message-passing calls do not modify the value of the error code field of
status variables. This field may be updated only by the functions in Section 3.7.5 that
return multiple statuses. The field is updated if and only if such function returns with an
error code of MPI_ERR_IN_STATUS.
In most of these cases the return is MPI_SUCCESS so the MPI_ERROR shall not be modified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there are multiple places in the standard where it states an empty status object is to be returned. see my comments below about that.
|
nope its to set it to MPI_SUCCESS in the special cases. That's what mpi4py was catching for a |
|
I don't think we are talking about the same thing. Whatever is in the empty status is one thing, what I was pointing out is that you alter the field MPI_ERROR in the status object on a call that does not return MPI_ERR_IN_STATUS. This is incorrect according to my reading of the standard. |
|
@dalcinl check out this discussion since it was your tests which showed the problem. |
|
here's an example of wording for |
|
and here's what the standard says about what an empty status is |
|
I also pointed to a different section in the standard that states something different. I think the correct solution is to properly define the empty status as you did, but do not set the MPI_ERROR field unless you are returning |
|
then the mpi4py tests would not pass |
|
Whatever! mpi4py is not the MPI standard, it is an interpretation that might be incorrect. |
|
@hppritcha Can you point me to the problematic test? |
I think bosilca would say line 101 is incorrect. |
|
So, let's reduce the lines to the relevant ones index, flag = MPI.Request.Get_status_any(self.REQUESTS, status)
self.assertEqual(index, MPI.UNDEFINED)
self.assertTrue(flag)
self.assertEqual(status.source, MPI.ANY_SOURCE)
self.assertEqual(status.tag, MPI.ANY_TAG)
self.assertEqual(status.error, MPI.SUCCESS)The call to @bosilca What am I missing? |
|
Please read above. According to Section 3.2.5 we should never alter the MPI_ERROR field in the status object unless we are returning MPI_ERR_IN_STATUS. I'm not stating this makes sense, but it is what the standard mandates. |
|
So basically if I set some value X in the |
|
to me there seem to be conflicting statements in the standard. Where it is state that an empty status is returned under certain conditions, this would imply to me that if it were tested using a status query method that the behavior would be as described by the empty status description. |
|
Well, the section saying to not touch MPI_ERROR is talking about "message passing calls", which is a bit generic. I'm wondering if whoever wrote that had test for completion calls in mind (most certainly not). In my personal taste, the more specific mandate to return empty status in completion calls should trump. |
|
I agree, these are conflicting statement. What I don't agree to is changing OMPI behavior to match a different understanding of a conflicting statement, for the single reason to pass some tests. Instead change the test to accept a different, but still legitimate, reading of the standard. |
|
in any case thanks @dalcinl for rooting out we had an incorrect definition of empty status irrespective of whether to copy the error field over or not. I'll open an issue about this for mpi standard. |
|
@hppritcha I this the only test failing? What about the test for |
|
that was the only one that failed for me. |
We are talking about the behavior in a very new MPI routine, that Open MPI has not yet released, right?
No, that is inaccurate. I fundamentally disagree with your interpretation, as it is based in a part of the standard that was likely written years before the new stuff was added. Also, the behavior that I champion is also the one the other major MPI implementation chose. I respectfully request for the Open MPI community to reconsider this behavior, and put it to vote if there is such procedure. |
|
All tests handling requests and status should fail, because we were very careful never to overwrite the MPI_ERROR. |
|
i'll split out the controversial parts of this PR into another one. for sure we didn't define the empty status correctly and i want that in. |
There you have, then... Is |
|
the mpi4py stopped at the first failure which was for the any test. it probably would have failed the all test if the test framework had kept on going. |
|
if you want to do more testing, modify your testany, wait, and waitall tests to catch the cases where an "empty status" is returned. |
|
Indeed, it seems like @hppritcha I pushed a branch at mpi4py@update-ompi-main that will automatically build with all the new MPI 5.0 stuff. You can use that branch against the legacy OMPI ABI. There are some additional test failures. Sorry, I cannot do more for now, I'm really busy with other stuff. Additionally, and despite @bosilca's claims, it seems that |
This PR does not solely address the behavior in a new MPI routine that Open MPI has yet to release. Instead, it affects all request testing and completion routines.
I believe it is incorrect to dismiss portions of the standard that were established before the introduction of new requirements, unless clearly deprecated. It could also be possible, even highly plausible, that the individual(s) who added the new requirements was unaware of the existing, well-defined standards related to status handling.
I am not sure why there is a preference for the behavior championed by another major MPI implementation. Our approach is clearly mandated by the standard. Why would you not champion ours ?
From a practical standpoint, I find this specific MPI requirement to be cumbersome and unnecessary, particularly given the rationale in the standard (saving a store on an already loaded cache line at the expense of a branch). We adhered to a different interpretation (similar to yours) for a considerable time until it was brought to our attention four years ago that we were not in compliance with the standard. Consequently, we modified our code to ensure compliance. I see no reason to change this until the MPI standard clarifies the outcome of status handling. |
related to open-mpi#13478 but without the controversial stuff. Signed-off-by: Howard Pritchard <howardp@lanl.gov>
related to open-mpi#13478 but without the controversial stuff. Signed-off-by: Howard Pritchard <howardp@lanl.gov> (cherry picked from commit 4220027)