Description
Thank you for taking the time to submit an issue!
Background information
What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)
We use 4.0.2, but the same issue exists on the master branch.
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
Built from source of 4.0.2 release tarball
If you are building/installing from a git clone, please copy-n-paste the output from git submodule status.
Please describe the system on which you are running
(We confirmed that the issue is not specific to any OS.)
- Operating system/version:
- Computer hardware:
- Network type:
Details of the problem
We use libompitrace with NCCL-tests, which internally passes MPI_DATATYPE_NULL to MPI_Allgather (see https://github.com/NVIDIA/nccl-tests/blob/master/src/common.cu#L1022). This is a legal use of MPI_DATATYPE_NULL. However, when we enable libompitrace with Open MPI, it reports the error below:
*** An error occurred in MPI_Type_get_name
*** reported by process [1728118785,0]
*** on communicator MPI_COMM_WORLD
*** MPI_ERR_TYPE: invalid datatype
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
Looking at the code of libompitrace/allgather.c, we found that it does not check whether the datatype is MPI_DATATYPE_NULL before calling PMPI_Type_get_name, which causes this error.
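One possible fix is to guard the name lookup. The sketch below is not the actual libompitrace code. `trace_type_name` is a hypothetical helper name. The MPI types, the constants, and `PMPI_Type_get_name` are replaced by small stand-ins so that the snippet compiles without an MPI installation; the stand-in only mimics the behavior reported above (rejecting the null handle).

```c
#include <stdio.h>
#include <string.h>

/* Stand-ins so this sketch compiles without an MPI installation.
 * In libompitrace, drop this section and include <mpi.h> instead. */
#define MPI_MAX_OBJECT_NAME 64
#define MPI_SUCCESS 0
#define MPI_ERR_TYPE 3
typedef const char *MPI_Datatype;            /* mock handle: just a name string */
#define MPI_DATATYPE_NULL ((MPI_Datatype)0)
#define MPI_BYTE ((MPI_Datatype)"MPI_BYTE")

/* Mock of PMPI_Type_get_name: rejects the null handle, as in the report. */
static int PMPI_Type_get_name(MPI_Datatype type, char *name, int *resultlen)
{
    if (type == MPI_DATATYPE_NULL) {
        return MPI_ERR_TYPE;
    }
    *resultlen = snprintf(name, MPI_MAX_OBJECT_NAME, "%s", type);
    return MPI_SUCCESS;
}

/* Hypothetical helper: fill `name` without ever passing MPI_DATATYPE_NULL
 * to PMPI_Type_get_name. `name` must hold MPI_MAX_OBJECT_NAME chars. */
static int trace_type_name(MPI_Datatype type, char *name)
{
    int len = 0;

    if (type == MPI_DATATYPE_NULL) {
        /* Use a fixed label instead of querying the library. */
        strcpy(name, "MPI_DATATYPE_NULL");
        return MPI_SUCCESS;
    }
    return PMPI_Type_get_name(type, name, &len);
}
```

With a helper like this, the tracing wrapper could call it for both the send and the receive datatype instead of calling PMPI_Type_get_name directly. An inline check at each call site would work as well.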