-
Notifications
You must be signed in to change notification settings - Fork 68
v1.10 cm priority #693
v1.10 cm priority #693
Conversation
This patch removes a priority check that disables cm if the previous pml had higher priority. The check was incorrect as coded and is unnecessary as we finalize all but one pml anyway. Fixes open-mpi/ompi#1035 (cherry picked from open-mpi/ompi@2fd176a) Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
8d4bedd to
eaa495b
Compare
|
could you please describe what problem this PR solves? and if it change previous component selection logic - what is a new one? |
|
This PR solves a problem that only occurs during dynamic loading. If ob1 gets loaded before cm then cm never gets selected. Looks like I need to fix a few bugs in the PR before you can review this. |
|
@rhc54 Is it ok to change the arguments of an internal API in a release series? I added an argument to mca_base_select. If not I will rename the new version with _new and have the old version call with NULL as the priority argument. I can drop a bunch of changes from this patch if I do that. |
|
I believe so - @jsquyres: you have any qualms? |
|
We do try to avoid changing major internal APIs so that any developers outside of the OMPI tree (e.g., http://processors.wiki.ti.com/index.php?title=MCSDK_HPC_3.x_OpenMPI) doesn't get unexpectedly screwed with a new Open MPI minor release. I think mca_base_select() qualifies as a major API. So I'd be in favor of @hjelmn's proposed approach: add a |
|
I can live with that, but then let's do that as a custom patch only for 1.10.1 - I'd hate to have that workaround in the future code base. |
|
@rhc54 Agreed. Just want to get this functionality into 1.10.1. |
|
@rhc54 Agreed: this is specific for v1.10.x. Master / v2.x will continue to not have the |
eaa495b to
ff8b6eb
Compare
|
Shoot, lost a commit. Let me rebase this one more time. |
This commit removes code that checks the ob1 priority vs the previous priority. The previous priority is meaningless here and may only cause ob1 to disable itself when it shouldn't. (cherry picked from open-mpi/ompi@bedd802) Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
The mca_base_select function uses returned priorities to select the best component/module. This priority may be of use to the caller so pass that information back in an optional argument. If the priority is not needed pass NULL. This is a modification of open-mpi/ompi@8b5810f . The primary difference is we left the original mca_base_select in place for compatibility. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit changes the priority of mtl components to be relative to pml/ob1 and updates the mtl interface to expose this priority. cm now sets its own priority based on the priority of the selected mtl component. (cherry picked from open-mpi/ompi@53f6b57) Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
ff8b6eb to
0cecbbd
Compare
|
👍 |
|
Also need @regrant and @tkordenbrock to chime in to see what this whole thing does to portals BTL / MTL. |
|
Test PASSed. |
|
Test PASSed. |
|
I'm just going to mark this as pushed-back so I don't accidentally commit it before @regrant and @tkordenbrock can confirm no issue with portals, and @miked-mellanox confirms same for mxm |
(cherry picked from open-mpi/ompi@889c766)
|
Looks good to go with comments from @regrant in open-mpi/ompi#1058 :bot:nolabel:pushed-back |
|
@yburette Included the ofi priority change. |
|
Test PASSed. |
Additional java bindings
:bot🏷️bug
:bot:assign: @miked-mellanox
:bot:milestone:v1.10.1