How to debug OpenMPI remotely (i.e. if you're remote developing to begin with)

OpenMPI needs to be compiled from source with ./configure --with-cuda

"For Linux 64, Open MPI is built with CUDA awareness but this support is disabled by default. To enable it, please set the environmental variable OMPI_MCA_opal_cuda_support=true before launching your MPI processes. Equivalently, you can set the MCA parameter in the command line: mpiexec --mca opal_cuda_support 1 ..."
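As a minimal sketch of the two options quoted above, the following Python helpers either set the environment variable or build the equivalent mpiexec command line. The helper names, the process count, and the script name my_script.py are hypothetical; only the variable name and the --mca opal_cuda_support 1 parameter come from the quote.

```python
import os


def cuda_aware_env(base_env=None):
    """Return a copy of the environment with Open MPI's CUDA support switched on."""
    env = dict(os.environ if base_env is None else base_env)
    env["OMPI_MCA_opal_cuda_support"] = "true"
    return env


def mpiexec_command(nprocs, script):
    """Build a launch line that passes the MCA parameter on the command line instead."""
    return ["mpiexec", "--mca", "opal_cuda_support", "1",
            "-n", str(nprocs), "python", script]


# Example launch, commented out so that loading this sketch has no side effects
# (my_script.py is a placeholder for your own MPI program):
# import subprocess
# subprocess.run(["mpiexec", "-n", "2", "python", "my_script.py"],
#                env=cuda_aware_env(), check=True)
```

Either route should be enough on its own; setting both is harmless but redundant.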

We use mpi4py==3.1.0a0 features so you need to pip install https://github.com/mpi4py/mpi4py/archive/master.zip if 3.1.0 isn't released yet.
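If it helps, a small guard like the one below can fail early when the installed mpi4py is too old. This is only a sketch: the function names are hypothetical, and it assumes that any 3.1 build (including the 3.1.0a0 pre-release) is acceptable, so it compares only the major and minor numbers.

```python
import re


def mpi4py_is_new_enough(version, minimum=(3, 1)):
    """Compare the major.minor part of a version string against `minimum`."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return (int(match.group(1)), int(match.group(2))) >= minimum


def check_installed_mpi4py():
    """Raise if the installed mpi4py predates 3.1 (imports mpi4py lazily)."""
    import mpi4py

    if not mpi4py_is_new_enough(mpi4py.__version__):
        raise RuntimeError(
            f"mpi4py {mpi4py.__version__} is too old; install 3.1 or newer"
        )
```

Calling check_installed_mpi4py() at the top of the script gives a clearer error than a missing-feature failure later on.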

How to debug OpenMPI remotely (i.e. if you're remote developing to begin with)

https://stackoverflow.com/a/57938838

The short of it is:

Create a Python Debug Server run configuration.
Install the pydevd_pycharm pip package (matching your PyCharm version).
Check "allow parallel runs" in the run configuration.
Start as many debug server runs as there are MPI processes on the remote machine.
Reverse ssh tunnel the ports that the debug servers on your local machine listen on, e.g. ssh max@localhost -p2222 -R 65300:localhost:65300 -R 65303:localhost:65303
Put pydevd_pycharm.settrace("localhost", port=port_mapping[rank], stdoutToServer=True, stderrToServer=True) where you want the breakpoint.
Run the script using mpiexec.
Map the source in the run configuration (you need to map the specific files).
You can only set one port in the run configuration, and that prevents multiple parallel runs. A better alternative (so you don't have to keep changing the ports in settrace) is to change the reverse tunnel instead.
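The steps above can be sketched roughly as follows. The names PORT_MAPPING, reverse_tunnel_command, attach_debugger and main are hypothetical; the port numbers, user, host and ssh port are simply the values from the ssh example above. The idea is to keep the remote-side ports fixed per rank and regenerate only the tunnel when the local debug server ports change.

```python
import shlex

# Fixed ports on the remote machine, one per MPI rank (values taken from the
# ssh example above; extend the dict for more ranks).
PORT_MAPPING = {0: 65300, 1: 65303}


def reverse_tunnel_command(local_ports, user_host="max@localhost",
                           ssh_port=2222, port_mapping=PORT_MAPPING):
    """Build the ssh command to run on the local machine.

    local_ports maps rank -> port that the debug server for that rank
    listens on locally. The remote side always uses port_mapping[rank].
    """
    cmd = ["ssh", user_host, f"-p{ssh_port}"]
    for rank in sorted(port_mapping):
        cmd += ["-R", f"{port_mapping[rank]}:localhost:{local_ports[rank]}"]
    return shlex.join(cmd)


def attach_debugger(rank, port_mapping=PORT_MAPPING):
    """Connect this MPI rank to its debug server through the tunnel."""
    import pydevd_pycharm  # imported lazily; only needed on the remote machine

    pydevd_pycharm.settrace("localhost", port=port_mapping[rank],
                            stdoutToServer=True, stderrToServer=True)


def main():
    """Entry point for the script launched with mpiexec."""
    from mpi4py import MPI

    rank = MPI.COMM_WORLD.Get_rank()
    attach_debugger(rank)
    # ... the code you want to step through goes here ...
```

In the real script you would call main() at the bottom and launch it with mpiexec using as many processes as there are entries in PORT_MAPPING. When the local debug servers come up on different ports, only the output of reverse_tunnel_command changes; the settrace call stays the same.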

If you get mpi4py.MPI.Exception: MPI_ERR_TRUNCATE: message truncated, you probably forgot to pass dtype when creating the cupy array.
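One way this error can arise, assuming the note refers to a dtype mismatch between the send and receive buffers, is plain size arithmetic: MPI reports truncation when the incoming message is larger than the receive buffer. The sketch below uses only the standard library to show the byte counts; the helper names are hypothetical.

```python
import struct

# Standard (platform-independent) element sizes: "d" is float64, "f" is float32.
FLOAT64 = "=d"
FLOAT32 = "=f"


def message_nbytes(count, fmt):
    """Number of bytes occupied by `count` elements of the given struct format."""
    return count * struct.calcsize(fmt)


def would_truncate(send_nbytes, recv_nbytes):
    """A receive is truncated when the message does not fit in the buffer."""
    return send_nbytes > recv_nbytes


# Four float64 values sent into a buffer sized for four float32 values.
sent = message_nbytes(4, FLOAT64)
received = message_nbytes(4, FLOAT32)
print(sent, received, would_truncate(sent, received))
```

Under that assumption, the fix is to pass the same explicit dtype to both arrays, for example cupy.empty(n, dtype=cupy.float32) on the receiving side to match a float32 sender.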

https://www.open-mpi.org/faq/?category=runcuda#mpi-cuda-dev-opa for GPUDirect RDMA in OpenMPI
