pmix3x: use PMIX_VALUE_LOAD() and PMIX_INFO_LOAD() macros #11472
Hello! The Git Commit Checker CI bot found a few problems with this PR: 381aabf: pmix3x: use PMIX_VALUE_LOAD() and PMIX_INFO_LOAD()...
Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!
Force-pushed from 381aabf to 2b97485.
Hello! The Git Commit Checker CI bot found a few problems with this PR: 2b97485: pmix3x: use PMIX_VALUE_LOAD() and PMIX_INFO_LOAD()...
Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!
Force-pushed from 2b97485 to 51acc18.
Hello! The Git Commit Checker CI bot found a few problems with this PR: 51acc18: pmix3x: use PMIX_VALUE_LOAD() and PMIX_INFO_LOAD()...
Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!
Refs. open-mpi#10416 bot:notacherrypick Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Force-pushed from 51acc18 to 6e8e14f.
@ggouaillardet What's the status of this PR? It's still in "Draft" mode.
I did not have much time to work on it.
It is not fully complete, but I think it can be used.
FWIW, it avoids an immediate problem with the latest PMIx (4.2.3 IIRC) that
will be fixed in 4.2.4.
Well, it won't be "fixed" in 4.2.4 as it isn't really a "problem" in 4.2.3 😄 - it has to do with the OMPI code not sticking to the official public APIs and instead using internal PMIx library functions (albeit publicly visible symbols). All I've done for 4.2.4 is to "typedef" the old symbols to their new standardized versions to help alleviate breakage. I'm not sure I understand why all the changes were made to switch to "info_load" instead of "value_load" as the latter remains a perfectly fine, standardized way to load values. You just need to use the official macros instead of the internal functions. You are welcome to make the change if that's what you want - but it certainly isn't required.
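To illustrate the point about using the official macros rather than the internal functions, here is a minimal, self-contained sketch. It is not PMIx code: every `demo_*`, `Demo_*` and `DEMO_*` name is a hypothetical stand-in, and the layout of the value struct is an assumption made only for this example. It shows a caller that goes through the public macro and is therefore unaffected when the function behind the macro is renamed, plus a compatibility alias in the spirit of the aliasing described above for 4.2.4.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a library's public header (not PMIx). */
typedef enum { DEMO_UINT32, DEMO_STRING } demo_type_t;

typedef struct {
    demo_type_t type;
    union {
        uint32_t    uint32;
        const char *string;
    } data;
} demo_value_t;

/* The loader under its new, standardized name. In an older release of
 * this hypothetical library it was exported as demo_value_load(). */
void Demo_Value_load(demo_value_t *v, const void *data, demo_type_t type)
{
    v->type = type;
    switch (type) {
    case DEMO_UINT32:
        memcpy(&v->data.uint32, data, sizeof(uint32_t));
        break;
    case DEMO_STRING:
        v->data.string = (const char *) data;
        break;
    }
}

/* Official macro: the name callers are expected to use. Only this line
 * changes when the function behind it is renamed. */
#define DEMO_VALUE_LOAD(v, d, t) Demo_Value_load((v), (d), (t))

/* Compatibility alias so that callers still using the old internal
 * name keep compiling (assumption: modeled on the aliasing described
 * in the comment above). */
#define demo_value_load Demo_Value_load

/* Caller written against the macro: unaffected by the rename. */
uint32_t demo_roundtrip_uint32(uint32_t in)
{
    demo_value_t val;
    DEMO_VALUE_LOAD(&val, &in, DEMO_UINT32);
    assert(DEMO_UINT32 == val.type);
    return val.data.uint32;
}

/* Caller written against the old internal name: it only builds
 * because the compatibility alias exists. */
uint32_t demo_roundtrip_legacy(uint32_t in)
{
    demo_value_t val;
    demo_value_load(&val, &in, DEMO_UINT32);
    return val.data.uint32;
}
```

Removing the `demo_value_load` alias makes `demo_roundtrip_legacy()` fail to compile while `demo_roundtrip_uint32()` still builds, which is the kind of breakage the macro-only approach avoids.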
@rhc54 could you point out which commits you are referring to on https://github.com/openpmix/openpmix/tree/v4.2 so that we can potentially cherry-pick from that branch?
Looks like it was 8e70e37e2
@rhc54 I actually tested these two approaches:
But only the first one fixes the issues with |
Quite possible that the fix post-dates the |
The pinpointed commit openpmix/openpmix@8e70e37 is older than the tag - it's 3rd line below:
Could be that there are other commits required - I honestly don't know. Try with the head of the v4.2 branch and see if the problem still exists.
Ok, I just did that and the current head of the v4.2 branch (openpmix/openpmix@cd813ef) behaves the same as the v4.2.4rc1 tag (I get a deadlock).
Then I guess you should use this patch - I have no other advice as this is an OMPI integration issue.
@ggouaillardet It looks like this patch is necessary to both fix #11729 and make Open MPI v4.1.x compatible with any PMIx >= v4.2.3. You still have this patch marked as "Draft". What's the current status?
I believe it is draft solely because I questioned whether all the changes are truly necessary - however, there is no harm done by the larger patch. This patch should be fine as-is.
Ok. I see @ggouaillardet's comment from 14 Mar:
Is there more to be done here? |
We have a report from #11729 (comment) that this PR makes the reporter's usage of Open MPI v4.1.x work with PMIx 4.2.3. @ggouaillardet Is this PR sufficient as-is?
We had some off-PR discussion about this:
I looked it over and think it is okay. There are additional data types supported by PMIx that are not covered here, but they also are not data types used by OMPI - so no point in worrying about something you'll never see.
Thanks @rhc54!
#11472 has been merged, so I'm closing this issue. Thank you for the report!
I needed this to run any MPI application, either with srun with pmix or with plain mpirun. Without it, I would run into OOB/TCP communication errors even for "mpirun hostname" across two nodes (see open-mpi/ompi#11729 for another user with the same problem). This patch is taken from open-mpi/ompi#11472, and with it both srun and mpirun run flawlessly.
Refs. #10416
bot:notacherrypick