[5.0.0rc13] application crash using OMPI with UCX #11974

@david-edwards-linaro

Background information

What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)

5.0.0rc13 with UCX 1.15.0

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

tarball, using configure --with-ucx=ucx-1.15.0 --enable-oshmem

Please describe the system on which you are running

  • Operating system/version: Ubuntu 22.04
  • Computer hardware: Intel(R) Core(TM) i7-1260P

Details of the problem

Please describe, in detail, the problem that you are having, including the behavior you expect to see, the actual behavior that you are seeing, steps to reproduce the problem, etc. It is most helpful if you can attach a small program that a developer can use to reproduce your problem.

The following application crashes when built & run using Open MPI as configured above (it does not crash when OMPI is built without UCX):

#include "mpi.h"

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    MPI_Win win;
    MPI_Aint lowerbound;
    MPI_Aint sizeofreal;
    /* Query the extent of MPI_REAL to use as window size and displacement unit. */
    MPI_Type_get_extent(MPI_REAL, &lowerbound, &sizeofreal);
    float a[1] = {0};
    /* Expose a single-element buffer as an RMA window, then free it immediately. */
    MPI_Win_create(a, sizeofreal, sizeofreal, MPI_INFO_NULL, MPI_COMM_WORLD, &win);
    MPI_Win_free(&win);

    MPI_Finalize();

    return 0;
}

The crash occurs within opal_common_ucx_support_level() due to dereference of a null pointer (*opal_common_ucx.tls).
The crash also occurs with 5.0.0rc13 using UCX 1.14.0.
The crash does not occur with 5.0.0rc12 using UCX 1.14.0.
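
For illustration only, here is a minimal, self-contained sketch of the class of failure described above: a configuration struct holds a pointer to a transport-list string, and a helper dereferences that pointer without checking whether it was ever set. Every name here (`demo_ucx_config_t`, `demo_transport_listed`) is hypothetical and is not taken from the Open MPI or UCX sources. The sketch assumes only that `tls` is a pointer that must be non-null before `*tls` is read, which is what the crash suggests.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a global config struct; NOT Open MPI's real type. */
typedef struct {
    char **tls; /* points at a comma-separated transport list, or NULL if never set up */
} demo_ucx_config_t;

/* Returns 1 if `name` appears in the comma-separated list, 0 if it does not,
 * and -1 if the list pointer was never initialized. The guard replaces the
 * unchecked `*cfg->tls` read that would otherwise crash. */
static int demo_transport_listed(const demo_ucx_config_t *cfg, const char *name)
{
    if (cfg == NULL || cfg->tls == NULL || *cfg->tls == NULL) {
        return -1; /* an unguarded version would dereference NULL here */
    }

    size_t len = strlen(name);
    const char *p = *cfg->tls;
    while (*p != '\0') {
        const char *end = strchr(p, ',');
        size_t tok = end ? (size_t)(end - p) : strlen(p);
        if (tok == len && strncmp(p, name, len) == 0) {
            return 1;
        }
        if (end == NULL) {
            break; /* last token reached */
        }
        p = end + 1;
    }
    return 0;
}
```

Whether the appropriate fix in Open MPI is a guard of this kind, or ensuring the pointer is initialized before `opal_common_ucx_support_level()` is reached, is not something this report establishes.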
