Orted prolog and epilog hooks #35
Comments
Imported from trac issue 1269. Created by jsquyres on 2008-04-11T11:23:25, last modified: 2011-01-11T07:45:51 |
Trac comment by jsquyres on 2008-06-23 13:32:57: Yo Ralph -- I'm assuming there's no plans for this kind of feature in v1.3 (Terry and I were talking "pie in the sky" kinds of ideas when we came up with this one). Should we shift it to "Future"? |
Trac comment by rhc on 2008-06-23 13:58:28: As noted, it would be easy to implement, so I guess I don't care - could throw it into 1.3 or not. Kinda up to you guys as to how badly you want it. Ralph |
Trac comment by tdd on 2008-06-24 10:25:59: This feature is not absolutely necessary for 1.3 but I would like it in 1.3.1. I've discussed this with the RMs (Brad and George) and they are fine with this feature being added to 1.3.1. |
Trac comment by rhc on 2010-01-27 22:29:02: Damien has a somewhat related issue - what he needs is basically a "spawn agent" similar to our "launch agent". If provided, this would be a cmd that executes each app when spawned. In other words, you take the argv that is going to be fork/exec'd and prepend the spawn agent in it. Thus, the spawn agent is what actually executes the app. I'm not sure if Damien is doing this work or not - perhaps he could confirm? Otherwise, I'll implement it over the next week or two (actually rather trivial to do). |
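A minimal sketch of the "spawn agent" idea described in the comment above, written in Python purely for illustration (Open MPI itself is C). The function name `apply_spawn_agent` and the way the agent string is tokenized are assumptions, not anything taken from the Open MPI code base; the sketch only shows the agent's tokens being prepended to the argv that would otherwise be fork/exec'd.

```python
import shlex


def apply_spawn_agent(app_argv, spawn_agent=""):
    """Return the argv to exec for one app process.

    Hypothetical helper: if a spawn agent command is configured, its
    tokens are prepended to the app's argv, so the agent is what
    actually executes the app. An empty agent leaves the argv unchanged.
    """
    agent_argv = shlex.split(spawn_agent)  # "" -> []
    return agent_argv + list(app_argv)
```

With an agent of `valgrind --leak-check=full` and an app argv of `["./a.out", "-n", "4"]`, the result is the agent's two tokens followed by the app's three; with no agent configured the app argv comes back untouched.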
Trac comment by jsquyres on 2010-02-02 18:27:17: Damien was having problems posting; he mailed his reply to me directly (see below). Damien: note that you can sign up for an account on our Trac and therefore be able to comment on tickets directly (our Trac does not currently accept emails as input).
I have received a query from the Openmpi tracker. The ticket https://svn.open-mpi.org/trac/ompi/ticket/1269 is an This feature is not an issue for me. My problem is to have an "orted local to mpirun", this is the only one
Please refer to Ralph for history and sorry for confusion. |
Trac comment by jsquyres on 2011-01-11 07:45:51: I think it's pretty safe to say that this won't happen any time soon unless someone can free up some cycles to implement it. |
…-v1.8 OSHMEM: spml ikrit: complete puts b4 memheap destruction
…lease sync with ompi-release/v1.8
The IPMI plugin tries to read the bmc credentials in an endless loop in case it fails to read for some reason, and no other compute node tries to send the bmc credential data. Fixes open-mpi#35
The nodepower plugin uses the ipmi_cmdraw command to retrieve data from the bmc. It needs to pass the length of the response buffer to the API so that the library knows how much memory is allocated to it and can pack the data accordingly. The current implementation does pass the response length, but it doesn't initialize the length, which could lead to nondeterministic results. Initialized the length to 1024 to reflect the size of the data buffer. Fixes open-mpi#33
If the ipmi_cmdraw call in the nodepower plugin fails for some reason, then the failure is currently being ignored and the responseData is copied out and passed to the application. Added the code to handle a return failure of the ipmi_cmdraw call.
When get_bmc_cred fails in an aggregator, and no other compute node sends the ipmi credential data, the aggregator tries to read its bmc credential in a loop. Implemented a timeout as well as printing out a debug message indicating the issue. Refs open-mpi#35
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
Terry and I were talking about the possibility of having per-job prolog and epilog steps in the orted. That is, an MCA parameter that identifies an argv to run before the first local proc of a job is launched on the node and after the last local proc of a job has completed. Typical argv would usually be a local script (perhaps to perform some site-specific administrative stuff). If the argv for the prolog/epilog is blank (which would be the default), then nothing would be launched for these steps. Hence, these would be hooks available to sysadmins if they want to use them.
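The behavior proposed above could be sketched roughly as follows. This is an illustration in Python, not orted code: the class `JobHooks`, its method names, and the injectable `runner` are all hypothetical, and the two strings stand in for whatever MCA parameters would carry the prolog and epilog argv. It assumes the daemon knows how many local procs a job has when the job starts on the node.

```python
import shlex
import subprocess


class JobHooks:
    """Run an optional prolog/epilog argv around each job's local procs.

    Hypothetical sketch: `prolog` and `epilog` are command strings that
    default to blank, in which case nothing is launched for that step.
    """

    def __init__(self, prolog="", epilog="", runner=subprocess.call):
        self.prolog = shlex.split(prolog)
        self.epilog = shlex.split(epilog)
        self.runner = runner  # callable that executes an argv list
        self.live = {}        # job id -> local procs still running

    def start_job(self, job_id, num_local_procs):
        """Call once per job on this node, before its first local proc is launched."""
        if self.prolog:
            self.runner(self.prolog)
        self.live[job_id] = num_local_procs

    def proc_exited(self, job_id):
        """Call when a local proc ends; runs the epilog after the last one."""
        self.live[job_id] -= 1
        if self.live[job_id] == 0:
            del self.live[job_id]
            if self.epilog:
                self.runner(self.epilog)
```

Keeping a per-job counter is one simple way to detect "last local proc completed"; a per-process variant of the hooks would instead call the runner around every launch and exit.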
I'm guessing/assuming that this would not be difficult to do -- it's mainly a matter of:
It ''might'' be useful to also have the same prolog/epilog hooks for each process in a job on the host as well. [shrug]
I'm initially marking this as a 1.3 milestone, but have no real requirement for it in v1.3 -- it seems like an easy / neat / useful idea, but there is no ''need'' to have it in v1.3. It could be pushed forward.