Thank you for taking the time to submit an issue!
Background information
What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)
Version 4.1.5 built from the tarballs
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
Via the Archlinux Repos (pacman), see the build instructions here: https://gitlab.archlinux.org/archlinux/packaging/packages/openmpi/-/blob/main/PKGBUILD?ref_type=heads
Please describe the system on which you are running
- Operating system/version: Archlinux system (rolling release)
- Computer hardware: ThinkPad E14 Gen 3 with AMD Ryzen 5 5500U CPU
- Network type: locally run
Details of the problem
When the following program is compiled and executed twice, one of the invocations hangs in "MPI_File_open" and has to be killed manually:
    #include <stdio.h>
    #include <mpi.h>

    int main() {
        int rank;
        MPI_Init(NULL, NULL);
        MPI_Comm comm = MPI_COMM_WORLD;
        MPI_Comm_rank(comm, &rank);

        MPI_File fh;
        /* This is the call that hangs in one of the two runs. */
        int err = MPI_File_open(comm, "test-file.txt", MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);
        if (err != MPI_SUCCESS) {
            printf("Got error trying to open file\n");
        }
        MPI_File_close(&fh);

        printf("Hello, I am rank %d in the merged comm\n", rank);
        MPI_Barrier(comm);
        MPI_Finalize();
        return 0;
    }

    shell$ mpicc test.c
    shell$ mpirun -np 2 ./a.out
    shell$ mpirun -np 2 ./a.out

However, with version 4.1.4 it seems to work. I am willing to invest some time in debugging this, but so far my attempts to find the underlying issue (e.g. by bisecting between the revisions) have not turned up anything, as I can currently only reproduce it with the release tarballs. Is there any documentation on how I can debug something like this?
I also tried newer releases (i.e. 5.0.0rc10, 4.1.6rc2) without success.
The bug was first reported in the archlinux bugtracker: https://bugs.archlinux.org/task/79543
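Regarding the question of how to debug the hang: one generic option, independent of Open MPI, might be to arm a POSIX watchdog timer around the suspect call so that a stuck process aborts by itself instead of having to be killed manually. The sketch below is only an illustration under assumptions: the helper names `watchdog_arm` and `watchdog_disarm` are hypothetical, it assumes nothing else in the process relies on SIGALRM, and whether a core file is actually written depends on the system's core dump settings.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* SIGALRM handler: report and abort. abort() raises SIGABRT, which can
 * leave a core file if core dumps are enabled on the system.
 * Only async-signal-safe functions (write, abort) are used here. */
static void watchdog_expired(int signo) {
    static const char msg[] = "watchdog: call did not return in time, aborting\n";
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)ignored;
    (void)signo;
    abort();
}

/* Hypothetical helper: install the handler and start a timer.
 * Returns 0 on success, -1 if the handler could not be installed. */
int watchdog_arm(unsigned int seconds) {
    struct sigaction sa;

    assert(seconds > 0); /* alarm(0) would cancel a timer, not start one */
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = watchdog_expired;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0) {
        return -1;
    }
    alarm(seconds);
    return 0;
}

/* Hypothetical helper: cancel the pending timer.
 * Returns the number of seconds that were left on it (0 if none). */
unsigned int watchdog_disarm(void) {
    return alarm(0);
}
```

In the reproducer, `watchdog_arm(30)` would go directly before the `MPI_File_open` call and `watchdog_disarm()` directly after it. If the call hangs, the rank aborts after 30 seconds, and any resulting core file can be opened in a debugger to see where the process was stuck.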