PMIx detection is broken in OMPI 4.x and below when PMIx v3.x and above is used #8823

Closed
artpol84 opened this issue Apr 16, 2021 · 3 comments

artpol84 commented Apr 16, 2021

Background information

While helping folks from LANL debug issues with their Slurm/PMIx environment, we observed a bug in the PMIx detection logic: with PMIx v3.x and above, OMPI does not correctly detect that PMIx is present.

  • The ORTE/PRRTE configurations work because the PMIx component doesn't disable itself completely, but rather reduces its priority from "100" down to "5". In the absence of other alternatives, it is still selected.
  • With Slurm the problem went unnoticed because the "s1" component is silently selected instead (it is always available in Slurm as it is part of the core).

What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)

In the experiments, OMPI 4.1.0 was used with internal PMIx v3.2.2.
Slurm was built with 2 versions of PMIx:

  • v2.2.4 (plugin name pmix_v2)
  • v3.1.5 (plugin name pmix_v3)

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

From tarball sources.

Details of the problem

When doing a direct launch with the Slurm PMIx plugin built with PMIx v3.1.5, the s1 component is selected instead of pmix3x:

$ env OMPI_MCA_pmix_base_verbose=100 srun -n 2 --mpi=pmix_v3 ./ompi_hello_c
...
[node001:43165] mca:base:select:( pmix) Querying component [isolated]
[node001:43165] mca:base:select:( pmix) Query of component [isolated] set priority to 0
[node001:43165] mca:base:select:( pmix) Querying component [pmix3x]
[node001:43165] mca:base:select:( pmix) Query of component [pmix3x] set priority to 5
[node001:43165] mca:base:select:( pmix) Querying component [s1]
[node001:43165] mca:base:select:( pmix) Query of component [s1] set priority to 10
[node001:43165] mca:base:select:( pmix) Querying component [s2]
[node001:43165] mca:base:select:( pmix) Selected component [s1]
...

The issue is not observed with the PMIx v2.x based Slurm PMIx plugin.

The selection logic in OMPI's PMIx component is based on the following environment variables:

    if (NULL != (t = getenv("PMIX_SERVER_URI")) ||
        NULL != (id = getenv("PMIX_ID"))) {
        /* if PMIx is present, then we are a client and need to use it */
        *priority = 100;
    } else {
        /* we could be a server, so we still need to be considered */
        *priority = 5;
    }

So the priority is set to "5" when the presence of PMIx is not detected.

Inspecting the environment observed by the application processes for PMIx v3.x and 2.x with the following command:

$ env | grep PMIX

shows the following:

PMIx v2

PMIX_BFROP_BUFFER_TYPE=PMIX_BFROP_BUFFER_FULLY_DESC
PMIX_DSTORE_21_BASE_PATH=/var/spool/slurmd/pmix.1591154.7//pmix_dstor_ds21_41512
PMIX_DSTORE_ESH_BASE_PATH=/var/spool/slurmd/pmix.1591154.7//pmix_dstor_ds12_41512
PMIX_GDS_MODULE=ds21,ds12,hash
PMIX_NAMESPACE=slurm.pmix.1591154.7
PMIX_PTL_MODULE=tcp,usock
PMIX_RANK=0
PMIX_SECURITY_MODE=native,none
PMIX_SERVER_TMPDIR=/var/spool/slurmd/pmix.1591154.7/
PMIX_SERVER_URI21=pmix-server.41512;tcp4://127.0.0.1:60120
PMIX_SERVER_URI2=pmix-server.41512;tcp4://127.0.0.1:60120
PMIX_SERVER_URI2USOCK=pmix-server:41512:/tmp/pmix-41512
PMIX_SERVER_URI=pmix-server:41512:/tmp/pmix-41512
PMIX_SYSTEM_TMPDIR=/tmp
SLURM_PMIX_MAPPING_SERV=(vector,(0,2,1))

PMIx v3

PMIX_BFROP_BUFFER_TYPE=PMIX_BFROP_BUFFER_FULLY_DESC
PMIX_DSTORE_21_BASE_PATH=/var/spool/slurmd/pmix.1591154.6//pmix_dstor_ds21_40985
PMIX_DSTORE_ESH_BASE_PATH=/var/spool/slurmd/pmix.1591154.6//pmix_dstor_ds12_40985
PMIX_GDS_MODULE=ds21,ds12,hash
PMIX_HOSTNAME=ko014.localdomain
PMIX_NAMESPACE=slurm.pmix.1591154.6
PMIX_PTL_MODULE=tcp,usock
PMIX_RANK=0
PMIX_SECURITY_MODE=native
PMIX_SERVER_TMPDIR=/var/spool/slurmd/pmix.1591154.6/
PMIX_SERVER_URI21=pmix-server.40985;tcp4://127.0.0.1:48103
PMIX_SERVER_URI2=pmix-server.40985;tcp4://127.0.0.1:48103
PMIX_SERVER_URI3=pmix-server.40985;tcp4://127.0.0.1:48103
PMIX_SYSTEM_TMPDIR=/tmp
PMIX_VERSION=3.1.5rc4
SLURM_PMIX_MAPPING_SERV=(vector,(0,2,1))

This shows that the PMIX_SERVER_URI envar, which the PMIx selection logic relies on, is no longer present with PMIx v3.
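
To make the failure concrete, here is a small sketch that replays the check quoted above against trimmed copies of the two environments. The helpers `lookup` and `legacy_priority` are hypothetical names used only for illustration and are not OMPI functions; the envar names and the priorities 100 and 5 are taken from the listings above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: find NAME in a NULL-terminated "NAME=value" array,
 * mimicking an exact-name getenv() on an explicit environment. */
static const char *lookup(const char *const *envp, const char *name)
{
    size_t n = strlen(name);
    for (; NULL != *envp; envp++) {
        if (0 == strncmp(*envp, name, n) && '=' == (*envp)[n]) {
            return *envp + n + 1;
        }
    }
    return NULL;
}

/* Same decision as the quoted snippet: 100 if one of the exact envars
 * is present, otherwise 5. */
static int legacy_priority(const char *const *envp)
{
    if (NULL != lookup(envp, "PMIX_SERVER_URI") ||
        NULL != lookup(envp, "PMIX_ID")) {
        return 100;
    }
    return 5;
}

/* Trimmed copies of the two environments shown above. */
static const char *const env_v2[] = {
    "PMIX_NAMESPACE=slurm.pmix.1591154.7",
    "PMIX_SERVER_URI2=pmix-server.41512;tcp4://127.0.0.1:60120",
    "PMIX_SERVER_URI=pmix-server:41512:/tmp/pmix-41512",
    NULL
};
static const char *const env_v3[] = {
    "PMIX_NAMESPACE=slurm.pmix.1591154.6",
    "PMIX_SERVER_URI2=pmix-server.40985;tcp4://127.0.0.1:48103",
    "PMIX_SERVER_URI3=pmix-server.40985;tcp4://127.0.0.1:48103",
    NULL
};
```

With these inputs, `legacy_priority(env_v2)` yields 100 and `legacy_priority(env_v3)` yields 5, which is below the priority of 10 that s1 reports in the srun log.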

The ORTE-based launch only works because no other component is available:

$ env OMPI_MCA_pmix_base_verbose=100 mpirun --map-by ppr:1:node ./ompi_hello_c
...
[ko014.localdomain:43337] mca:base:select: Auto-selecting pmix components
[ko014.localdomain:43337] mca:base:select:( pmix) Querying component [pmix3x]
[ko014.localdomain:43337] mca:base:select:( pmix) Query of component [pmix3x] set priority to 5
[ko014.localdomain:43337] mca:base:select:( pmix) Selected component [pmix3x]
...

As a fix, I suggest checking for any envar whose name matches PMIX_SERVER_URI*.
Perhaps a better solution can be implemented on the PMIx side.
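
A rough sketch of what such a prefix check might look like is shown here. The function name `env_has_prefix` is hypothetical and does not exist in OMPI; it only illustrates scanning a NULL-terminated "NAME=value" array for any name that starts with a given prefix.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if any "NAME=value" entry in the
 * NULL-terminated array has a NAME that starts with PREFIX.
 * PREFIX is assumed not to contain '='. */
static bool env_has_prefix(char *const *envp, const char *prefix)
{
    size_t n = strlen(prefix);
    if (NULL == envp) {
        return false;
    }
    for (; NULL != *envp; envp++) {
        if (0 == strncmp(*envp, prefix, n)) {
            return true;
        }
    }
    return false;
}
```

On a POSIX system this could be called as `env_has_prefix(environ, "PMIX_SERVER_URI")` after declaring `extern char **environ;`. It would match PMIX_SERVER_URI2, PMIX_SERVER_URI21 and PMIX_SERVER_URI3 from the listings above as well as the legacy PMIX_SERVER_URI.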

rhc54 commented Apr 17, 2021

The PMIX_SERVER_URI envar was used by the old usock transport that was dropped starting with PMIx v3. PMIx, therefore, cannot restore it as that would falsely imply that the client could use usock to connect to the server. The better solution is, as you suggest, to have OMPI's component check for a generic envar that starts with PMIX_SERVER_URI. The PMIx library will then find and use the correct version of that envar to make the connection. Alternatively, you could just look for PMIX_NAMESPACE as that would also be generic across PMIx versions, and is a reliable indicator that the process was started by a PMIx server.

Unfortunately, we cannot just call PMIx_Init and declare to use PMIx if it succeeds as the PMIx client library will simply assume we are running as a singleton if no PMIx server is detected. This is fine in OMPI v5 as we only support PMIx there and going forward, but not acceptable in prior OMPI releases.

@artpol84

I see, thank you Ralph.
This is what I suspected about the old envar.
I think that looking for the PMIX_NAMESPACE is a good idea as this is something not specific to the PMIx component but rather a generic envar.
I'll open the PR and test it with my LANL colleagues.
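
For illustration only, a PMIX_NAMESPACE-based check could be sketched as below. This is not the committed patch: the function name `pmix_query_priority` and the two macros are hypothetical, and only the envar name and the priority values 100 and 5 come from the discussion above.

```c
#include <stddef.h>
#include <stdlib.h>

/* Priority values taken from the snippet quoted in the issue. */
#define PRI_CLIENT       100
#define PRI_MAYBE_SERVER 5

/* Hypothetical replacement for the check: PMIX_NAMESPACE appears in both
 * the PMIx v2 and PMIx v3 environments listed in the issue. */
static int pmix_query_priority(void)
{
    if (NULL != getenv("PMIX_NAMESPACE")) {
        /* started by a PMIx server, so we are a client */
        return PRI_CLIENT;
    }
    /* we could be a server, so we still need to be considered */
    return PRI_MAYBE_SERVER;
}
```

Under this sketch both the pmix_v2 and pmix_v3 Slurm plugins would give the component priority 100, above the 10 reported by s1.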

@jsquyres jsquyres modified the milestone: v4.1.1 Apr 19, 2021
artpol84 added a commit to artpol84/ompi that referenced this issue Apr 20, 2021
See open-mpi#8823 for the details.

Signed-off-by: Artem Polyakov <artpol84@gmail.com>
artpol84 added a commit to artpol84/ompi that referenced this issue Apr 20, 2021
See open-mpi#8823 for more details.

Signed-off-by: Artem Polyakov <artpol84@gmail.com>
artpol84 added a commit to artpol84/ompi that referenced this issue Apr 20, 2021
See open-mpi#8823 for the details.

Signed-off-by: Artem Polyakov <artpol84@gmail.com>
(cherry picked from commit 0b3c1d9)
artpol84 added a commit to artpol84/ompi that referenced this issue Apr 20, 2021
See open-mpi#8823 for more details.

Signed-off-by: Artem Polyakov <artpol84@gmail.com>
(cherry picked from commit 2210251)
artpol84 added a commit to artpol84/ompi that referenced this issue Apr 20, 2021
See open-mpi#8823 for the details.

Signed-off-by: Artem Polyakov <artpol84@gmail.com>
(cherry picked from commit 30b29b3)
artpol84 added a commit to artpol84/ompi that referenced this issue Apr 20, 2021
See open-mpi#8823 for more details.

Signed-off-by: Artem Polyakov <artpol84@gmail.com>
(cherry picked from commit ea6e2d8)

rhc54 commented Apr 22, 2021

All committed

@rhc54 rhc54 closed this as completed Apr 22, 2021
nmorey pushed a commit to nmorey/ompi that referenced this issue Oct 17, 2022
See open-mpi#8823 for the details.

Signed-off-by: Artem Polyakov <artpol84@gmail.com>
nmorey pushed a commit to nmorey/ompi that referenced this issue Oct 17, 2022
See open-mpi#8823 for more details.

Signed-off-by: Artem Polyakov <artpol84@gmail.com>