The following command line fails when running on 2 hosts (one process on each node) with the ompi master branch.
The same command line runs fine with ompi's v1.10 release branch.
mpirun -np 2 --bind-to core --display-map -mca pml ob1 -mca btl self,sm,openib --mca btl_openib_cpc_include rdmacm --map-by node ./IMB/src/IMB-MPI1 pingpong
The output is:
--------------------------------------------------------------------------
No OpenFabrics connection schemes reported that they were able to be
used on a specific port. As such, the openib BTL (OpenFabrics
support) will be disabled for this port.
Local host: vegas10
Local device: mlx5_0
Local port: 1
CPCs attempted: rdmacm
--------------------------------------------------------------------------
benchmarks to run pingpong
[vegas11:27048] mca_bml_base_btl_array_get_next: invalid array size
[vegas11:27048] *** Process received signal ***
[vegas11:27048] Signal: Segmentation fault (11)
[vegas11:27048] Signal code: Address not mapped (1)
[vegas11:27048] Failing at address: 0x8
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.
Process 1 ([[65106,1],0]) is on host: vegas11
Process 2 ([[65106,1],1]) is on host: vegas10
BTLs attempted: self
Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
When using udcm instead of rdmacm, the test passes.
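For reference, a sketch of the passing invocation, assuming udcm is selected explicitly via btl_openib_cpc_include and everything else is kept the same (the report only states that udcm works, not the exact flags used):
mpirun -np 2 --bind-to core --display-map -mca pml ob1 -mca btl self,sm,openib --mca btl_openib_cpc_include udcm --map-by node ./IMB/src/IMB-MPI1 pingpong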