Initialize ESS before prterun fallback to prun_common() #2365
rhc54 merged 1 commit into openpmix:master
Conversation
Pull request overview
This PR fixes an initialization issue where prte would fail when running with a pre-existing DVM. The fix ensures that the ESS framework is properly initialized before calling prun_common(), which is necessary for signal forwarding list initialization.
Key changes:
- Added ESS framework initialization before the prun_common() call
- Added corresponding cleanup of the ESS framework after prun_common() completes
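For readers without the diff open, here is a minimal sketch of what the described change amounts to. This is not the verbatim PR diff; the symbol names (prte_mca_base_framework_open, prte_ess_base_framework, PRTE_MCA_BASE_OPEN_DEFAULT) are assumed from PRRTE's MCA naming conventions, and prun_common()'s real arguments are elided:

```c
/* Sketch only -- not the verbatim PR diff. Open the ESS framework so
 * prun_common() can build its signal-forwarding list, then close it
 * again once prun_common() returns. Symbol names are assumptions
 * based on PRRTE's MCA conventions. */
int rc;

rc = prte_mca_base_framework_open(&prte_ess_base_framework,
                                  PRTE_MCA_BASE_OPEN_DEFAULT);
if (PRTE_SUCCESS != rc) {
    return rc;
}

rc = prun_common(...);   /* existing call, arguments elided */

(void) prte_mca_base_framework_close(&prte_ess_base_framework);
```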
Force-pushed from b56faa8 to 6a4542c
Signed-off-by: Jacek J. Lakis <mlody3k@gmail.com>
|
Ummmm....that feels weird. Opening and closing a framework just to call a base function shouldn't be the solution. I'll have to see how that can be done a little cleaner. Frankly, this is a strange use of "prterun" that someone added - never tested, and this is honestly the first time I've heard from someone using it. I wonder if we should just remove it? Is there some reason not to just use "prun"? |
|
I think one reason for … |
|
I just noticed that prun.c#L339 also does something similar to initialize ESS before calling prun_common(). |
This is exactly my use case: mpirun is a wrapper for prterun. @rhc54, this definitely should be done cleaner; I copy-pasted the same approach from prun.c. |
|
@jjlakis BTW, I've never seen the DeepWiki you referenced. Interesting tool/site. |
|
I'll explain my concern - "prterun" absolutely does require the ess framework. "prun" does not. "prterun" opens the ess framework in "prte_init", which it calls after this switch to support using a persistent DVM but before it calls prun_common. So that code down low in "prterun" is incorrect and should be removed - it is double-opening the framework. Fortunately, the framework protects itself from that mistake.

What needs to be explored is: does "prterun" in this mode actually set up the required infrastructure to forward signals? If not, then there is no point in creating the list of signals to forward, and we should relocate where that call occurs. If it does, or if we want it to do so, then we probably should do this a little differently. |
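To make the "protects itself" remark concrete: MCA-style frameworks typically refcount their open/close calls, so a redundant open is a no-op rather than a second initialization. A self-contained illustration of that pattern (not PRRTE's actual code):

```c
#include <stdio.h>

/* Illustration of a refcounted framework open/close -- the pattern
 * that makes prterun's double-open harmless. Not PRRTE's code. */
typedef struct {
    int refcnt;
    const char *name;
} framework_t;

static int framework_open(framework_t *fw) {
    if (fw->refcnt++ > 0) {
        return 0;                 /* already open: just bump the count */
    }
    printf("%s: real initialization\n", fw->name);
    return 0;
}

static void framework_close(framework_t *fw) {
    if (--fw->refcnt > 0) {
        return;                   /* still held open elsewhere */
    }
    printf("%s: real teardown\n", fw->name);
}

int main(void) {
    framework_t ess = { 0, "ess" };
    framework_open(&ess);         /* first open: initializes */
    framework_open(&ess);         /* redundant second open: no-op */
    framework_close(&ess);
    framework_close(&ess);        /* last close tears down */
    return 0;
}
```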
Could you elaborate more on this? How is it that the signals are required for prterun and not for prun - what are they for (see the sketch below)? I was thinking, considering that the same maneuver is done in prun.c, that this partial initialization could perhaps happen inside prun_common() itself.
Please don't remove support for persistent DVM from prterun 😅 . The ability to run MPI jobs on a pre-existing, container-based DVM infrastructure is critical to the future native integration of MPI and Kubernetes that I dream about! |
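On the signal question above: "forwarding" means the launcher traps signals delivered to it (Ctrl-C, a SIGTERM from a scheduler, etc.) and relays them to the remote job instead of just handling them locally. A minimal, self-contained sketch of the idea, with a recorded signal standing in for the real runtime messaging; the actual list of signals PRRTE forwards may differ:

```c
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Illustrative only -- not PRRTE's implementation. A real launcher
 * would ask the DVM to deliver the signal to the job's processes. */
static volatile sig_atomic_t pending_sig = 0;

static void forward_handler(int sig) {
    pending_sig = sig;            /* async-signal-safe: just record it */
}

/* Hypothetical forwarding list; PRRTE builds its own. */
static const int fwd_signals[] = { SIGTERM, SIGINT, SIGUSR1, SIGUSR2 };

static void setup_signal_forwarding(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = forward_handler;
    for (unsigned i = 0; i < sizeof(fwd_signals) / sizeof(fwd_signals[0]); i++) {
        sigaction(fwd_signals[i], &sa, NULL);
    }
}

int main(void) {
    setup_signal_forwarding();
    while (0 == pending_sig) {
        pause();                  /* the event loop would relay pending_sig */
    }
    return 0;
}
```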
|
I wouldn't remove it unilaterally - my comment was a question, not a declaration. Here's what continues to puzzle me. If you have an mpirun cmd line that works on its own, then to make it use a persistent DVM you must modify the cmd line - you have to at least add options that point it at the DVM to use, providing contact info or some other rendezvous method. So you had to manually make a change. So why is it such a big deal to also edit "mpirun" to be "prun" at the same time?

Think about it for a minute. If you take an mpirun cmd that works independently, and you want to run that exact same application under Slurm, you have to (a) alter the cmd line to fit srun's requirements, and (b) replace "mpirun" with "srun". The same is true for running under PALS - you have to change "mpirun" to "aprun" plus make the other cmd line changes. In other words, replacing the launcher name to match the environment is the status quo. So why are we asking PRRTE to do something different?

Is it a misperception that you "require" mpirun to run an MPI job? If so, that is incorrect - prun launches anything (MPI, SHMEM, etc). So long as your programming model uses PMIx, prun and PRRTE are fine. There are some minor differences between how mpirun and prun cmd line options are parsed by default - OMPI has its own personality module for parsing those. However, prun and prterun are happy to use that module if you simply tell them to do so on the command line.

Re-using mpirun this way requires a 100-line or so plug of code to duplicate parsing of the cmd line the way prun would have done, refactoring prun so prterun can effectively call it under the covers, etc. It also leads to problems such as this one, where a change required by one causes a problem for the other.

So again - I wonder if we are making this all harder than it has to be? Given that the user has to modify their cmd line anyway, should we not just require that they modify "mpirun" to "prun" as well? |
|
Thanks so much for raising the concerns! One important clarification:
Not true at all! I can already run mpirun against a pre-existing DVM, and it works perfectly with an upstream MPI installation and its bundled PRRTE - confirmed on my end.
I'm only asking PRRTE to do stuff it was already capable of doing!
I can and I will use prun.
For me the situation is as simple as: this worked perfectly for me on previous versions, and now it's broken - this is how mpirun is wired up on my end. If you think, though, that this is the wrong path, then perhaps the existing DVM support should just be removed from prterun. |
Not my point - you had to add the option pointing at the DVM to your cmd line anyway. We had to make a lot of changes to allow that extended usage, just so you wouldn't have to use "prun". I'm questioning why that is necessary, given that the cmd line for working with an existing DVM must be different from the cmd line when working alone. |
How come there are a lot of changes to make if this works as expected on previous versions? What's the solution you are leaning toward, then - removing the existing DVM support from prterun? |
Sigh - I clearly am not communicating well. I reiterate that we made (past tense) a lot of changes to enable this mode of operation. Those changes increased code complexity and made it more difficult to maintain.

I retired years ago, but have continued serving the community in the absence of anyone else stepping forward. I am, however, reaching my limits and will be departing at some not-long-term point. I am therefore scrutinizing the code base, looking for ways to reduce code complexity and improve maintainability in the hope that someone someday will pick these things up - and hopefully make it somewhat easier for them to do so.

So I'm simply asking whether we should continue to support this mode, mostly from the standpoint of what it truly adds. Perhaps a better way to phrase it is: is there something else we need to do to make this truly of value? For example, is there an MCA param/envar that would allow someone to not have to modify the cmd line at all to switch from one-shot to operating against a persistent DVM? Should we link "prterun" to be "prun" instead of its own executable, maybe having prun operate as a one-shot if it doesn't see a DVM? Or...?

Yeah, those questions are beyond this PR. But I'm wondering if there is a better way to fix this than just another patch. From my point-of-view, that's worth a little thought before proceeding. Not something for PRRTE v3.0, but maybe going forward. |
So sorry, that's me going in circles here and not reading carefully. I understand your point and motivations, and I agree that if nobody is stepping into the project, there's no sense in making such ad-hoc changes when no wider strategy is established. Fingers crossed for the future of PRRTE. I do have some bigger plans around PRRTE and Kubernetes, so expect to see something from me in the next weeks/months - I hope to get your input before your actual retirement. Thank you so much. Closing this one. Leaving #2364 open as the issue persists; feel free to close it. |
|
No, let's not close it - the patch is probably correct for v3.0. Just wanting to get people thinking a bit. |
Fix for #2364
The root cause was found with the help from DeepWiki - https://deepwiki.com/search/i-cant-run-anything-on-a-preex_1862a831-88d6-4520-8b2c-13bd1237119b?mode=deep
Works as expected with pre-existing DVM after applying the fix.
What's not clear to me, though, is which change actually caused the problem (I haven't seen it in, e.g., the MPI-bundled version of PRRTE).