[Issue]: Cannot run IBRC due to hardcoded pkey_index #82

@QizhouZhang97

Description

How is this issue impacting you?

Application hang

Share Your Debug Logs

Hi NVSHMEM team,

I encountered a hang in IBRC on our internal cluster and traced it back to a hardcoded pkey_index = 0 during QP creation. In environments where the default partition key is not at index 0, this causes CQ polling to fail silently, which ultimately results in a hang.

Steps to Reproduce the Issue

I'd like to suggest replacing the hardcoded value with a configurable environment variable NVSHMEM_IB_PKEY_INDEX (defaulting to 0 to preserve existing behavior). The changes are minimal and touch three sites:

Happy to share the patch or discuss further. Thanks!
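
For illustration, here is a minimal sketch of how the proposed NVSHMEM_IB_PKEY_INDEX variable might be read, assuming a plain decimal value that falls back to 0 when unset or invalid. The helper name get_pkey_index_from_env is hypothetical and is not taken from the NVSHMEM source or from the patch mentioned above.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not an NVSHMEM function): returns the P_Key table
 * index to use when creating a QP. Reads NVSHMEM_IB_PKEY_INDEX and falls
 * back to 0, the current hardcoded value, if the variable is unset,
 * empty, non-numeric, or does not fit in 16 bits. */
static uint16_t get_pkey_index_from_env(void)
{
    const char *val = getenv("NVSHMEM_IB_PKEY_INDEX");
    if (val == NULL || *val == '\0')
        return 0;

    char *end = NULL;
    errno = 0;
    unsigned long parsed = strtoul(val, &end, 10);

    /* Reject trailing garbage, conversion errors, and out-of-range values. */
    if (errno != 0 || *end != '\0' || parsed > UINT16_MAX)
        return 0;

    return (uint16_t)parsed;
}
```

With libibverbs, the returned value would presumably be assigned to the pkey_index field of struct ibv_qp_attr (with IBV_QP_PKEY_INDEX set in the attribute mask) when the QP is moved to the INIT state, in place of the literal 0. Where exactly that happens in the IBRC transport is an assumption here.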

Image

NVSHMEM Version

v3.6.5

Your platform details

No response

Error Message & Behavior

No response

Labels: enhancement
