PMIX_ERROR when MPI_Comm_spawn in multiple nodes #12601
Could you set this env variable in the shell where the parent process is started?

export PMIX_MCA_gds=hash

Then rerun and see if the problem persists?
I assume you used the PMIx that was included in OMPI v5.0.3? If so, then the above envar is unlikely to do any good. The bug it addressed is in the PMIx v5 series, and OMPI v5.0.3 uses PMIx v4. Looking at the error output, the problem lies in the RTE's handling of the ... That said, I know we can successfully comm_spawn across multiple nodes because I regularly do so. However, none of my codes follow your pattern, so I cannot say why your code fails.
I tested it and the problem persists. If you tell me that it does not work in the current versions and will be fixed in a future release, that is also a valid answer for me.

+ export PMIX_MCA_gds=hash
+ PMIX_MCA_gds=hash
+ mpiexec -n 3 --hostfile /work/machines_mpi --map-by node:OVERSUBSCRIBE ./spawn
Warning: Permanently added '172.18.0.3' (ED25519) to the list of known hosts.
Warning: Permanently added '172.18.0.4' (ED25519) to the list of known hosts.
Parent from e6d86d2ea3a1: rank 0 out of 3
Parent from 1e80d68b9a8e: rank 2 out of 3
Parent from eba2089b8c2f: rank 1 out of 3
[e6d86d2ea3a1:00072] PMIX ERROR: PMIX_ERROR in file prted/pmix/pmix_server_dyn.c at line 1041
[e6d86d2ea3a1:00072] PMIX ERROR: PMIX_ERROR in file prted/pmix/pmix_server_dyn.c at line 1041
[e6d86d2ea3a1:00072] PMIX ERROR: PMIX_ERR_OUT_OF_RESOURCE in file base/bfrop_base_unpack.c at line 1843
Childfrom e6d86d2ea3a1: rank 0 out of 1
Parent broadcasted message: 42
Child received broadcasted message: 42
Parent broadcasted message: 42
Parent broadcasted message: 42
[e6d86d2ea3a1:00072] PMIX ERROR: PMIX_ERROR in file prted/pmix/pmix_server_dyn.c at line 1041
[e6d86d2ea3a1:00072] PMIX ERROR: PMIX_ERROR in file prted/pmix/pmix_server_dyn.c at line 1041
[e6d86d2ea3a1:00072] PMIX ERROR: PMIX_ERR_OUT_OF_RESOURCE in file base/bfrop_base_unpack.c at line 1843
Childfrom e6d86d2ea3a1: rank 0 out of 1
[1e80d68b9a8e][[41639,1],2][btl_tcp_proc.c:400:mca_btl_tcp_proc_create] opal_modex_recv: failed with return value=-46
[eba2089b8c2f][[41639,1],1][btl_tcp_proc.c:400:mca_btl_tcp_proc_create] opal_modex_recv: failed with return value=-46
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.
Process 1 ([[41639,1],1]) is on host: eba2089b8c2f
Process 2 ([[41639,3],0]) is on host: unknown
BTLs attempted: self tcp
Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------

+ mpiexec -n 3 --hostfile /work/machines_mpi -x PMIX_MCA_gds=hash --map-by node:OVERSUBSCRIBE ./spawn
Warning: Permanently added '172.18.0.4' (ED25519) to the list of known hosts.
Warning: Permanently added '172.18.0.3' (ED25519) to the list of known hosts.
Parent from 0f2fad4c2330: rank 0 out of 3
Parent from 9116950a090e: rank 2 out of 3
Parent from cd50ec8d6901: rank 1 out of 3
[0f2fad4c2330:00072] PMIX ERROR: PMIX_ERROR in file prted/pmix/pmix_server_dyn.c at line 1041
[0f2fad4c2330:00072] PMIX ERROR: PMIX_ERROR in file prted/pmix/pmix_server_dyn.c at line 1041
[0f2fad4c2330:00072] PMIX ERROR: PMIX_ERR_OUT_OF_RESOURCE in file base/bfrop_base_unpack.c at line 1843
Childfrom 0f2fad4c2330: rank 0 out of 1
Parent broadcasted message: 42
Child received broadcasted message: 42
Parent broadcasted message: 42
Parent broadcasted message: 42
[0f2fad4c2330:00072] PMIX ERROR: PMIX_ERROR in file prted/pmix/pmix_server_dyn.c at line 1041
[0f2fad4c2330:00072] PMIX ERROR: PMIX_ERROR in file prted/pmix/pmix_server_dyn.c at line 1041
[0f2fad4c2330:00072] PMIX ERROR: PMIX_ERR_OUT_OF_RESOURCE in file base/bfrop_base_unpack.c at line 1843
Childfrom 0f2fad4c2330: rank 0 out of 1
[cd50ec8d6901][[11691,1],1][btl_tcp_proc.c:400:mca_btl_tcp_proc_create] [9116950a090e][[11691,1],2][btl_tcp_proc.c:400:mca_btl_tcp_proc_create] opal_modex_recv: failed with return value=-46
opal_modex_recv: failed with return value=-46
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.
Process 1 ([[11691,1],2]) is on host: 9116950a090e
Process 2 ([[11691,3],0]) is on host: unknown
BTLs attempted: self tcp
Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
Maybe it is the same problem as in this issue: #12599
As I said, it is a known problem and has probably been fixed, but I have no advice on when that will appear in a release.
No, that is an entirely different issue.
Using Open MPI main with PMIx at e32e0179 and PRRTE at d02ad07c3d, I don't observe this behavior on 3 nodes of a Slurm-managed cluster. If I use the Open MPI internal pmix/prrte submodules, the test case hangs when using multiple nodes.
Well, I'll slightly amend my comment: it seems that if UCX is involved in any way, Open MPI main with embedded openpmix/prrte hangs. If I configure Open MPI with ...
Could you try the 5.0.x nightly tarball? See https://www.open-mpi.org/nightly/v5.0.x/
I launch the test in Docker, simulating the nodes with containers. I installed the version you told me and I get the same error trace as above.

wget https://download.open-mpi.org/nightly/open-mpi/v5.0.x/openmpi-v5.0.x-202406110241-2a43602.tar.gz
tar zxf openmpi-v5.0.x-202406110241-2a43602.tar.gz
ln -s openmpi-v5.0.x-202406110241-2a43602 openmpi
mkdir -p /home/lab/bin
cd ${DESTINATION_PATH}/openmpi
./configure --prefix=/home/lab/bin/openmpi
make -j $(nproc) all
make install

Output of ompi_info:

+ ompi_info
Package: Open MPI root@buildkitsandbox Distribution
Open MPI: 5.0.4a1
Open MPI repo revision: v5.0.3-56-g2a436023eb
Open MPI release date: Unreleased developer copy
MPI API: 3.1.0
Ident string: 5.0.4a1
Prefix: /home/lab/bin/openmpi
Configured architecture: x86_64-pc-linux-gnu
Configured by: root
Configured on: Tue Jun 11 08:51:34 UTC 2024
Configure host: buildkitsandbox
Configure command line: '--prefix=/home/lab/bin/openmpi'
Built by:
Built on: Tue Jun 11 09:02:45 UTC 2024
Built host: buildkitsandbox
C bindings: yes
Fort mpif.h: no
Fort use mpi: no
Fort use mpi size: deprecated-ompi-info-value
Fort use mpi_f08: no
Fort mpi_f08 compliance: The mpi_f08 module was not built
Fort mpi_f08 subarrays: no
Java bindings: no
Wrapper compiler rpath: runpath
C compiler: gcc
C compiler absolute: /bin/gcc
C compiler family name: GNU
C compiler version: 11.4.0
C++ compiler: g++
C++ compiler absolute: /bin/g++
Fort compiler: none
Fort compiler abs: none
Fort ignore TKR: no
Fort 08 assumed shape: no
Fort optional args: no
Fort INTERFACE: no
Fort ISO_FORTRAN_ENV: no
Fort STORAGE_SIZE: no
Fort BIND(C) (all): no
Fort ISO_C_BINDING: no
Fort SUBROUTINE BIND(C): no
Fort TYPE,BIND(C): no
Fort T,BIND(C,name="a"): no
Fort PRIVATE: no
Fort ABSTRACT: no
Fort ASYNCHRONOUS: no
Fort PROCEDURE: no
Fort USE...ONLY: no
Fort C_FUNLOC: no
Fort f08 using wrappers: no
Fort MPI_SIZEOF: no
C profiling: yes
Fort mpif.h profiling: no
Fort use mpi profiling: no
Fort use mpi_f08 prof: no
Thread support: posix (MPI_THREAD_MULTIPLE: yes, OPAL support: yes,
OMPI progress: no, Event lib: yes)
Sparse Groups: no
Internal debug support: no
MPI interface warnings: yes
MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
dl support: yes
Heterogeneous support: no
MPI_WTIME support: native
Symbol vis. support: yes
Host topology support: yes
IPv6 support: no
MPI extensions: affinity, cuda, ftmpi, rocm
Fault Tolerance support: yes
FT MPI support: yes
MPI_MAX_PROCESSOR_NAME: 256
MPI_MAX_ERROR_STRING: 256
MPI_MAX_OBJECT_NAME: 64
MPI_MAX_INFO_KEY: 36
MPI_MAX_INFO_VAL: 256
MPI_MAX_PORT_NAME: 1024
MPI_MAX_DATAREP_STRING: 128
MCA accelerator: null (MCA v2.1.0, API v1.0.0, Component v5.0.4)
MCA allocator: basic (MCA v2.1.0, API v2.0.0, Component v5.0.4)
MCA allocator: bucket (MCA v2.1.0, API v2.0.0, Component v5.0.4)
MCA backtrace: execinfo (MCA v2.1.0, API v2.0.0, Component v5.0.4)
MCA btl: self (MCA v2.1.0, API v3.3.0, Component v5.0.4)
MCA btl: sm (MCA v2.1.0, API v3.3.0, Component v5.0.4)
MCA btl: tcp (MCA v2.1.0, API v3.3.0, Component v5.0.4)
MCA dl: dlopen (MCA v2.1.0, API v1.0.0, Component v5.0.4)
MCA if: linux_ipv6 (MCA v2.1.0, API v2.0.0, Component
v5.0.4)
MCA if: posix_ipv4 (MCA v2.1.0, API v2.0.0, Component
v5.0.4)
MCA installdirs: env (MCA v2.1.0, API v2.0.0, Component v5.0.4)
MCA installdirs: config (MCA v2.1.0, API v2.0.0, Component v5.0.4)
MCA memory: patcher (MCA v2.1.0, API v2.0.0, Component v5.0.4)
MCA mpool: hugepage (MCA v2.1.0, API v3.1.0, Component v5.0.4)
MCA patcher: overwrite (MCA v2.1.0, API v1.0.0, Component
v5.0.4)
MCA rcache: grdma (MCA v2.1.0, API v3.3.0, Component v5.0.4)
MCA reachable: weighted (MCA v2.1.0, API v2.0.0, Component v5.0.4)
MCA shmem: mmap (MCA v2.1.0, API v2.0.0, Component v5.0.4)
MCA shmem: posix (MCA v2.1.0, API v2.0.0, Component v5.0.4)
MCA shmem: sysv (MCA v2.1.0, API v2.0.0, Component v5.0.4)
MCA smsc: cma (MCA v2.1.0, API v1.0.0, Component v5.0.4)
MCA threads: pthreads (MCA v2.1.0, API v1.0.0, Component v5.0.4)
MCA timer: linux (MCA v2.1.0, API v2.0.0, Component v5.0.4)
MCA bml: r2 (MCA v2.1.0, API v2.1.0, Component v5.0.4)
MCA coll: adapt (MCA v2.1.0, API v2.4.0, Component v5.0.4)
MCA coll: basic (MCA v2.1.0, API v2.4.0, Component v5.0.4)
MCA coll: han (MCA v2.1.0, API v2.4.0, Component v5.0.4)
MCA coll: inter (MCA v2.1.0, API v2.4.0, Component v5.0.4)
MCA coll: libnbc (MCA v2.1.0, API v2.4.0, Component v5.0.4)
MCA coll: self (MCA v2.1.0, API v2.4.0, Component v5.0.4)
MCA coll: sync (MCA v2.1.0, API v2.4.0, Component v5.0.4)
MCA coll: tuned (MCA v2.1.0, API v2.4.0, Component v5.0.4)
MCA coll: ftagree (MCA v2.1.0, API v2.4.0, Component v5.0.4)
MCA coll: monitoring (MCA v2.1.0, API v2.4.0, Component
v5.0.4)
MCA coll: sm (MCA v2.1.0, API v2.4.0, Component v5.0.4)
MCA fbtl: posix (MCA v2.1.0, API v2.0.0, Component v5.0.4)
MCA fcoll: dynamic (MCA v2.1.0, API v2.0.0, Component v5.0.4)
MCA fcoll: dynamic_gen2 (MCA v2.1.0, API v2.0.0, Component
v5.0.4)
MCA fcoll: individual (MCA v2.1.0, API v2.0.0, Component
v5.0.4)
MCA fcoll: vulcan (MCA v2.1.0, API v2.0.0, Component v5.0.4)
MCA fs: ufs (MCA v2.1.0, API v2.0.0, Component v5.0.4)
MCA hook: comm_method (MCA v2.1.0, API v1.0.0, Component
v5.0.4)
MCA io: ompio (MCA v2.1.0, API v2.0.0, Component v5.0.4)
MCA io: romio341 (MCA v2.1.0, API v2.0.0, Component v5.0.4)
MCA op: avx (MCA v2.1.0, API v1.0.0, Component v5.0.4)
MCA osc: sm (MCA v2.1.0, API v3.0.0, Component v5.0.4)
MCA osc: monitoring (MCA v2.1.0, API v3.0.0, Component
v5.0.4)
MCA osc: rdma (MCA v2.1.0, API v3.0.0, Component v5.0.4)
MCA part: persist (MCA v2.1.0, API v4.0.0, Component v5.0.4)
MCA pml: cm (MCA v2.1.0, API v2.1.0, Component v5.0.4)
MCA pml: monitoring (MCA v2.1.0, API v2.1.0, Component
v5.0.4)
MCA pml: ob1 (MCA v2.1.0, API v2.1.0, Component v5.0.4)
MCA pml: v (MCA v2.1.0, API v2.1.0, Component v5.0.4)
MCA sharedfp: individual (MCA v2.1.0, API v2.0.0, Component
v5.0.4)
MCA sharedfp: lockedfile (MCA v2.1.0, API v2.0.0, Component
v5.0.4)
MCA sharedfp: sm (MCA v2.1.0, API v2.0.0, Component v5.0.4)
MCA topo: basic (MCA v2.1.0, API v2.2.0, Component v5.0.4)
MCA topo: treematch (MCA v2.1.0, API v2.2.0, Component
v5.0.4)
MCA vprotocol: pessimist (MCA v2.1.0, API v2.0.0, Component
v5.0.4)
Okay, let's try one more thing: could you try our nightly main tarball? I'm beginning to think that you are hitting a different problem on your system that I'm not able to duplicate.
It looks like this issue is expecting a response, but hasn't gotten one yet. If there are no responses in the next 2 weeks, we'll assume that the issue has been abandoned and will close it.
@dariomnz Are you able to test the nightly main tarball, as per @hppritcha's suggestion? I would really hate for this issue to be auto-closed.
Background information
What version of Open MPI are you using? (e.g., v4.1.6, v5.0.1, git branch name and hash, etc.)
Open MPI v5.0.3
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
output of ompi_info
Please describe the system on which you are running
Details of the problem
MPI_Comm_spawn gives an error when the job spans several nodes. If everything runs within the same node, it works perfectly.
Interestingly, communication between the nodes does take place even though the error occurs.
Code spawn.c
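The spawn.c source did not survive in this copy of the thread. As a rough sketch only, a minimal program consistent with the output shown below (each parent rank spawns a single child over MPI_COMM_SELF and broadcasts the value 42 across the intercommunicator) might look like this; the exact code the reporter ran is an assumption, including the "Childfrom" spacing seen in the logs:

```c
/* Hypothetical reconstruction -- the reporter's actual spawn.c is not shown
 * in this thread.  Each parent rank spawns one child of the same binary and
 * broadcasts an int to it over the parent/child intercommunicator. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);

    int rank, size, len;
    char host[MPI_MAX_PROCESSOR_NAME];
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(host, &len);

    MPI_Comm parent;
    MPI_Comm_get_parent(&parent);

    if (parent == MPI_COMM_NULL) {
        /* Parent side: each rank spawns one child on its own. */
        printf("Parent from %s: rank %d out of %d\n", host, rank, size);

        MPI_Comm intercomm;
        MPI_Comm_spawn(argv[0], MPI_ARGV_NULL, 1, MPI_INFO_NULL,
                       0, MPI_COMM_SELF, &intercomm, MPI_ERRCODES_IGNORE);

        /* On an intercommunicator, the broadcasting root passes MPI_ROOT. */
        int msg = 42;
        MPI_Bcast(&msg, 1, MPI_INT, MPI_ROOT, intercomm);
        printf("Parent broadcasted message: %d\n", msg);

        MPI_Comm_disconnect(&intercomm);
    } else {
        /* Child side: receive the broadcast from parent rank 0. */
        printf("Childfrom %s: rank %d out of %d\n", host, rank, size);

        int msg = 0;
        MPI_Bcast(&msg, 1, MPI_INT, 0, parent);
        printf("Child received broadcasted message: %d\n", msg);

        MPI_Comm_disconnect(&parent);
    }

    MPI_Finalize();
    return 0;
}
```

Note the spawn-per-rank pattern over MPI_COMM_SELF: it matches the repeated "rank 0 out of 1" child output and means each of the 3 parent ranks triggers its own dynamic spawn, which is exactly the cross-node code path the PMIX errors point at.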
Good execution in one node:
Bad execution in multiple nodes: