pipeline example waits for more units than it submits? #113
I think that is from rp. The reporting seems to be cumulative.
PS.: don't bother fixing this for the tutorial...
Every step has "N" CUs and every step waits for the "N" CUs to finish. So I
On Wed, Oct 21, 2015 at 6:35 PM, Andre Merzky notifications@github.com
If you want to wait for N CUs, then you need to pass the UIDs for those N CUs to
But the first N CUs have finished executing (they reached the 'Done' state). I cannot understand "waiting" for those completed CUs.
consider (pseudo code):
Is the umgr checking one or two units? Both obviously, because the application submitted both, and would otherwise not know if the first one is done or not... Please use
if you only want to wait for the second...
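For illustration, a minimal sketch of the two-unit case discussed above. `ToyUnitManager` is a hypothetical stand-in written for this thread, not RADICAL-Pilot's unit manager: its units finish immediately, and its `wait_units(uids=None)` signature is an assumption. It only models which units a wait call covers.

```python
class ToyUnitManager:
    """Hypothetical stand-in for a unit manager (NOT the RADICAL-Pilot API)."""

    def __init__(self):
        self._states = {}   # uid -> state, in submission order
        self._next = 1

    def submit(self, n=1):
        """Register n new units and return their uids."""
        uids = []
        for _ in range(n):
            uid = "unit.%04d" % self._next
            self._next += 1
            self._states[uid] = "DONE"   # toy assumption: units finish at once
            uids.append(uid)
        return uids

    def wait_units(self, uids=None):
        """Return the states of the given units, or of ALL managed units."""
        if uids is None:
            uids = list(self._states)
        return [self._states[uid] for uid in uids]


umgr = ToyUnitManager()

first = umgr.submit()
umgr.wait_units()                        # covers unit 1

second = umgr.submit()
all_states = umgr.wait_units()           # covers unit 1 AND unit 2
only_second = umgr.wait_units(second)    # covers unit 2 only

print(len(all_states), len(only_second))  # prints "2 1"
```

The second call without arguments still reports the first unit because the toy manager never forgets a unit it was given, which mirrors the behaviour described in this thread.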
Oh ok. So even if the CUs are "Done" they are not flushed out of the unit
On Thu, Oct 22, 2015 at 3:08 AM, Andre Merzky notifications@github.com
No - they are still managed by that unit manager. wait_units returns the states of the units the umgr manages - it would be inconsistent if it only returned DONE states of units which were not DONE when the call started - it would return a different number every time, and none at all if all are done. Uh... ;)
I agree with the semantics, but I can also see why it confused Vivek. We probably need to document that better.
Maybe add 'that includes previously completed units'?
Yeah, be more verbose about it. Especially given that we have this reporting now, it is not really intuitive.
so wait_units() is not designed to be a Barrier function (as in MPI), we just happen to use it in such a mode. (?)
Even in this case, IF the time taken to check if unit_1 is "Done" is small ... doesn't this serve the purpose of waiting for unit_2? Not complaining against using
Let's not try to bring in analogies that do not apply :-) Here is my attempt at a definition: wait_units(units, state) waits for all units under control of the UM (or a user-specified subset of them) to reach the specified state (or the default set of final states).
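To make that definition concrete, here is a rough sketch in Python. Everything in it is hypothetical: `ToyUnit`, the polling loop, the `max_polls` guard and the exact set of final states are assumptions for illustration, not RADICAL-Pilot code.

```python
FINAL = ("DONE", "FAILED", "CANCELED")   # assumed default set of final states


class ToyUnit:
    """Hypothetical unit that moves to its next state on every poll."""

    def __init__(self, uid, path):
        self.uid = uid
        self._path = list(path)          # states the unit will pass through
        self.state = self._path.pop(0)

    def poll(self):
        if self._path:
            self.state = self._path.pop(0)


def wait_units(units, uids=None, state=FINAL, max_polls=100):
    """Wait until all selected units reach `state`; return their states.

    units : every unit under control of the (toy) manager
    uids  : optional subset to wait for; None means all managed units
    state : one state name or a collection of acceptable states
    """
    targets = (state,) if isinstance(state, str) else tuple(state)
    selected = [u for u in units if uids is None or u.uid in uids]
    for _ in range(max_polls):
        if all(u.state in targets for u in selected):
            return [u.state for u in selected]
        for u in units:
            u.poll()                     # let every unit make progress
    raise TimeoutError("units did not reach the requested state")


units = [ToyUnit("u1", ["NEW", "EXECUTING", "DONE"]),
         ToyUnit("u2", ["NEW", "EXECUTING", "FAILED"])]

# Wait for a subset to reach a specific, non-final state.
partial = wait_units(units, uids=["u1"], state="EXECUTING")

# Wait for all managed units to reach any final state.
final = wait_units(units)

print(partial, final)   # prints "['EXECUTING'] ['DONE', 'FAILED']"
```

Note that the call returns one state per selected unit, whether or not that unit was already final when the call started.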
I understand that. But isn't waiting for a unit which is already "Done" ~0 work?
if |
But I am not interested in what it returns, I simply want to wait till all CUs (say of the current iteration) are "Done". If I use wait_units(), it also returns CUs of the previous iterations (since they are "done") as well, agreed. But it achieves what I wanted.
If all I require is that "check" should be printed after unit_1 is "Done", isn't the above script doing exactly that?
It is ok that you are not interested in what it returns, but returning the states is what the call does :P
the output shows:
so the second step submits 16 units, but waits for 32? I assume it waits for the units from step_1 again -- but that is not very intuitive. Also, not needed...
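The 16-versus-32 count can be reproduced with a small counting sketch. This is only a model of the bookkeeping, under the assumption that the manager keeps every unit it was ever given; `run_pipeline` and its parameters are made up for illustration and do not call RADICAL-Pilot.

```python
def run_pipeline(steps, units_per_step, wait_on_step_only):
    """Return how many units each per-step wait call covers."""
    managed = []    # every uid the toy manager has been given so far
    waited = []     # number of units covered by the wait in each step
    for step in range(steps):
        new = ["s%d.u%d" % (step, i) for i in range(units_per_step)]
        managed.extend(new)
        # Either wait on this step's uids only, or on everything managed.
        covered = new if wait_on_step_only else managed
        waited.append(len(covered))
    return waited


print(run_pipeline(2, 16, wait_on_step_only=False))   # prints "[16, 32]"
print(run_pipeline(2, 16, wait_on_step_only=True))    # prints "[16, 16]"
```

Waiting without a uid list makes the reported count cumulative across steps, while passing the current step's uids keeps it at the number of units just submitted.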