prov/opx: used by default instead of psm2 even though it's "beta" #7796
Comments
Yes, sorry, I conflated two issues. The issue with symbols occurs with 1.12.1 but not with 1.15.1. Still, defaulting to opx over psm2 is surprising, so I'll edit and leave that.
I'll look into this. Can you assign this to me?
Created PR
This is probably fixed/closed.
@shefty another one that I think has been fixed.
@timothom64 - Do you need write access for the opx provider (and ofiwg more broadly)?
Describe the bug
The opx provider is the default even though it's labelled BETA; psm2 is only used if you disable opx or set FI_PROVIDER=psm2.
If opx is enabled it'll take priority over psm2 even if it's labelled BETA:
src/fabric.c: shouldn't this be
"efa", "psm2", "opx", "psm", "usnic", "gni", "bgq", "verbs",
instead?
Secondly, if you force psm2 via FI_PROVIDER=psm2, all symbols dynamically linked from libpsm2.so (e.g. psm2_mq_irecv2) are duplicated by the psm3 provider inside libfabric.so, so not taken from libpsm2.so. As a consequence all communication goes over the ethernet instead of omnipath.
This was fixed in libfabric 1.15.0 as far as I can see, commit 3f1d52d.
To Reproduce
Steps to reproduce the behavior:
Without FI_PROVIDER set, run a test with FI_LOG_LEVEL=trace, and you see opx is used on omnipath.
Expected behavior
If needed, a clear and concise description of what you expected to happen.
Without FI_PROVIDER set, run a test with FI_LOG_LEVEL=trace, and you see psm2 is used on omnipath.
Environment:
OS (if not Linux), provider, endpoint type, etc.
Additional context
Workaround: set FI_PROVIDER=psm2
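To make the ordering question and the workaround concrete, here is a minimal sketch of how a priority list plus an optional name filter could decide between opx and psm2. This is a simplified model, not libfabric's actual code: pick_provider, contains, ARRAY_LEN and both arrays are hypothetical names, and assumed_current_order (opx ahead of psm2) is only inferred from the behavior described in this issue, since the original src/fabric.c snippet is not reproduced here. The proposed_order list is the one suggested above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* ASSUMPTION: inferred from the reported behavior (opx wins over psm2). */
static const char *const assumed_current_order[] = {
    "efa", "opx", "psm2", "psm", "usnic", "gni", "bgq", "verbs",
};

/* The ordering proposed in this issue: psm2 ahead of opx. */
static const char *const proposed_order[] = {
    "efa", "psm2", "opx", "psm", "usnic", "gni", "bgq", "verbs",
};

/* Return 1 if `name` appears in `names`, else 0. */
static int contains(const char *const *names, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(names[i], name) == 0)
            return 1;
    return 0;
}

/*
 * Hypothetical selection rule: walk `order` from highest to lowest
 * priority and return the first provider that is available.
 * A non-NULL `filter` models forcing one provider by name
 * (the role FI_PROVIDER=psm2 plays in the workaround).
 * Returns NULL if nothing matches.
 */
const char *pick_provider(const char *const *order, size_t order_len,
                          const char *const *available, size_t avail_len,
                          const char *filter)
{
    assert(order != NULL && available != NULL);
    for (size_t i = 0; i < order_len; i++) {
        if (filter != NULL && strcmp(order[i], filter) != 0)
            continue; /* excluded by the filter */
        if (contains(available, avail_len, order[i]))
            return order[i];
    }
    return NULL;
}
```

In this model, on a node where opx, psm2 and verbs are all available, pick_provider with assumed_current_order and no filter returns "opx", with proposed_order it returns "psm2", and with a filter of "psm2" it returns "psm2" under either ordering, which matches the workaround.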