-
Notifications
You must be signed in to change notification settings - Fork 934
Update ORTE to include all fixes since v3.x branching #3462
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@bwbarrett @hppritcha This update brings the runtime and PMIx up-to-date, resolving a number of reported bugs in 3.x that have previously been fixed on master. The PMIx update takes us to v2.0 minus cross-version and self-notification. The latter is needed to support resilient operations and should probably be brought along. The choice of adding cross-version support is problematic. Be advised that OMPI 3.0 will not work with the SLURM PMIx plugin in any current SLURM release. The next release it could be supported by would be later this year, and there are adoption times to be added to that date. I don't have a precise date for complete PMIx v2.0 release. I'm going to work on resolving the remaining bugs over the next few days, and I anticipate having it functionally correct by mid-week. This would not include the shared memory support, and I don't have a timetable for that to be completed. The remaining ORTE changes are all going to be bug fixes with one exception - adding support for SLURM-based Cray operations where mpirun is executed on a compute node. |
|
Looks like something is wrong with the PR checker - it just keeps spewing the following over and over again: |
|
MTT results with this PR: |
Remove the RML/OFI component and the timing macros from ORTE Update PMIx to match that in master. Note this is not the full PMIx v2.0 as that code isn't complete Bring across remaining required changes in OPAL constants and cmd line processor Signed-off-by: Ralph Castain <rhc@open-mpi.org>
…aemon with mpirun when mpirun is executing on a compute node in that environment. This allows local application procs to inherit their security credential from the daemon as it will have been launched via SLURM Signed-off-by: Ralph Castain <rhc@open-mpi.org> (cherry picked from commit a143800)
ggouaillardet
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you also need e101f2b
out of curiosity, why didn't you cherry-pick several commits individually ?
|
It's already in there. I didn't cherry-pick commits because I'd be here for the next month. Instead, I simply updated everything and then removed the handful of things we didn't want. So please don't add individual changes back into the PR list. I'll add anything that comes since the update to this one. |
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp> (cherry picked from commit e101f2b)
…y the HNP can know the backend topologies at that point. Any needed topologies will be sent along with the launch_apps command Do not pass param file MCA params if the user has requested that no param files be read - required when trying to avoid launch time penalties from large numbers of processes reading default param files. The daemon picks them up and passes them along anyway, so it isn't clear what value we gain from having them all read the defaults Signed-off-by: Ralph Castain <rhc@open-mpi.org> (cherry picked from commit 180809f)
|
Sorry - I didn't realize that you had another commit to the ORTE area. I added it here, along with the last one I submitted. |
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp> (cherry picked from commit 026f3dd)
Signed-off-by: Ralph Castain <rhc@open-mpi.org> (cherry picked from commit 442e307)
Signed-off-by: Ralph Castain <rhc@open-mpi.org> (cherry picked from commit 911961e)
… sometimes fragile, feature Signed-off-by: Ralph Castain <rhc@open-mpi.org> (cherry picked from commit 55f4b82)
…topologies involved, and that the HNP is not allocated. Give up on being cute and just search the darned list of topologies - there won't be that many, and if there are (so the scan takes awhile), then too bad. Signed-off-by: Ralph Castain <rhc@open-mpi.org> (cherry picked from commit f47124e)
… was trying to send because the message is at a lower priority than the termination event. Resolve this by putting the oob in its own progress thread. Also, use only that one thread by default - if someone needs more progress threads in the OOB, they can use the MCA param to get them. Signed-off-by: Ralph Castain <rhc@open-mpi.org> (cherry picked from commit 9164afb)
|
Update MTT: |
On unmanaged allocations, we need to update the total_slots_allocated once the daemons have been launched and "discovered" their topology Signed-off-by: Ralph Castain <rhc@open-mpi.org> (cherry picked from commit 29e083b)
Fix the --nolocal option by ensuring we always check/remove the HNP from the list of available nodes if the flag is set Ensure that the HNP node is included as available when nothing else is given Signed-off-by: Ralph Castain <rhc@open-mpi.org> (cherry picked from commit 45bbd59)
Signed-off-by: Ralph Castain <rhc@open-mpi.org> (cherry picked from commit b527c40)
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp> (cherry picked from commit 16fc099)
|
@bwbarrett If you want to take this prior to renaming the branch, that's fine - I'll just start another rollup from that point. Otherwise, I have no issue refiling it and keeping the rollup here. |
|
The problems I reported have been fixed on master with #3524 |
|
@ggouaillardet I talked to @jsquyres about that very point this morning 😄 Our feeling was that we should commit this as-is since it has been around for awhile and @bwbarrett needs to commit it prior to renaming the release branch. The changes in #3524 were only committed to master in the last 24 hours and haven't gone thru MTT yet. We propose to therefore start a second ORTE rollup with those changes once the branch renaming is complete. Make sense? |
|
@bwbarrett I think we passed outside of @ggouaillardet time zone, so I'm dismissing the negative review as it will be handled in a separate PR |
…ions Start updating the various mappers to the new procedure. Remove the stale lama component as it is now very out-of-date. Bring round_robin and PPR online, and modify the mindist component (but cannot test/debug it). Remove unneeded test Fix memory corruption by re-initializing variable to NULL in loop Resolve the race condition identified by @ggouaillardet by resetting the mapped flag within the same event where it was set. There is no need to retain the flag beyond that point as it isn't used again. Add a new job attribute ORTE_JOB_FULLY_DESCRIBED to indicate that all the job information (including locations and binding) is included in the launch message. Thus, the backend daemons do not need to do any map computation for the job. Use this for the seq, rankfile, and mindist mappers until someone decides to update them. Note that this will maintain functionality, but means that users of those three mappers will see large launch messages and less performant scaling than those using the other mappers. Have the mindist module add procs to the job's proc array as it is a fully described module Protect the hnp-not-in-allocation case Per path suggested by Gilles - protect the HNP node when it gets added in the absence of any other allocation or hostfile Signed-off-by: Ralph Castain <rhc@open-mpi.org> (cherry picked from commit 657e701)
Signed-off-by: Ralph Castain <rhc@open-mpi.org> (cherry picked from commit 8c2a064)
…roper infrastructure and RTE support, including whether we have ompi-server available if the connect/accept spans multiple applications. Print pretty help messages in all cases where we do not have support Signed-off-by: Ralph Castain <rhc@open-mpi.org> (cherry picked from commit 9f60cd0)
Signed-off-by: Ralph Castain <rhc@open-mpi.org> (cherry picked from commit 87201a8)
|
Updated MTT with all commits: |
|
Comparison from master: |
|
MTT of current state: PERFECTLY CLEAN! |
…d of each job (and if MCA param is set), have each daemon compute the number of open fds and their characteristics and print a summary Signed-off-by: Ralph Castain <rhc@open-mpi.org> (cherry picked from commit f3ab326)
…once that job completes. Cleanup a few typos. Silence a Coverity warning Signed-off-by: Ralph Castain <rhc@open-mpi.org> (cherry picked from commit 9a8811a)
Signed-off-by: Ralph Castain <rhc@open-mpi.org> (cherry picked from commit ad108ba)
Signed-off-by: Ralph Castain <rhc@open-mpi.org> (cherry picked from commit 321abfc)
Signed-off-by: Ralph Castain <rhc@open-mpi.org> (cherry picked from commit 5d990b5)
|
Updated MTT: |
…en't apparently fully portable Signed-off-by: Ralph Castain <rhc@open-mpi.org> (cherry picked from commit 26e7515)
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
|
With latest changes, including updated PMIx: |
…t pretty error messages Signed-off-by: Ralph Castain <rhc@open-mpi.org> (cherry picked from commit 9d6b929)
Remove the RML/OFI component and the timing macros from ORTE
Update PMIx to match that in master. Note this is not the full PMIx v2.0 as that code isn't complete
Bring across remaining required changes in OPAL constants and cmd line processor
Signed-off-by: Ralph Castain rhc@open-mpi.org