Reading from file (MPI_File_open) gets stuck #11913

@christian-heusel

Background information

What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)

Version 4.1.5 built from the tarballs

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

Via the Arch Linux repos (pacman); see the build instructions here: https://gitlab.archlinux.org/archlinux/packaging/packages/openmpi/-/blob/main/PKGBUILD?ref_type=heads

Please describe the system on which you are running

  • Operating system/version: Arch Linux (rolling release)
  • Computer hardware: ThinkPad E14 Gen 3 with AMD Ryzen 5 5500U CPU
  • Network type: locally run

Details of the problem

When the following program is compiled and executed twice, one of the invocations hangs at "MPI_File_open" and has to be killed manually:

#include <stdio.h>
#include <mpi.h>

int main() {
    int rank;
    MPI_Init(NULL, NULL);
    MPI_Comm comm = MPI_COMM_WORLD;
    MPI_Comm_rank(comm, &rank);

    MPI_File fh;
    int err = MPI_File_open(comm, "test-file.txt", MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);
    if (err != MPI_SUCCESS) {
        printf("Got error trying to open file\n");
    } else {
        /* Only close the handle if the open succeeded. */
        MPI_File_close(&fh);
    }

    printf("Hello, I am rank %d in the merged comm\n", rank);
    MPI_Barrier(comm);

    MPI_Finalize();
    return 0;
}

shell$ mpicc test.c
shell$ mpirun -np 2 ./a.out
shell$ mpirun -np 2 ./a.out

However, with version 4.1.4 it seems to work. I am willing to invest some time in debugging this, but so far my attempts to find the underlying issue (i.e. by bisecting between the revisions) have not found anything, as I can currently only reproduce it with the release tarballs. Is there any documentation on how I can debug something like this?

I also tried newer releases (i.e. 5.0.0rc10, 4.1.6rc2) without success.

The bug was first reported in the Arch Linux bug tracker: https://bugs.archlinux.org/task/79543
