PMIx_Abort not always aborting #3225

@jsquyres

@rhc54 and I have talked about this on the phone and IM. There are cases where PMIx_Abort deadlocks instead of completing the abort. Ralph knows what the problem is, but he hasn't figured out how to fix it yet. Filing this issue to track it.

This is causing some MPI and OSHMEM tests to hang on the Cisco MTT cluster and consume a lot of CPU while hung.

I'm seeing this on master and 2.x (I haven't started testing 3.x yet, but I'm sure the problem is there, too).
