-
Notifications
You must be signed in to change notification settings - Fork 101
Issue#160 Straggler due to list-after-write consistency #170
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
pywren/wait.py
Outdated
callids_done.update(callids_found) | ||
|
||
# break if not all N tasks completed | ||
if (len(fs_found) < len(fs_samples)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why should we break if not all N tasks completed ? (also where is fs_found defined)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is a trade-off. Think this block runs when the tasks are far from finish. Then we are essentially scanning all tasks, and doing this many times before they actually complete. (yeah, that's a bug. I fixed it.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I am still missing something. So if we break out of it during say the first few blocks we sample then how do we get back into this loop for the last few tasks ?
@ooq Do we have any plots or example tests to show how this mitigates stragglers? |
pywren/wait.py
Outdated
|
||
pool = ThreadPool(num_samples) | ||
# repeat util all futures are done | ||
while still_not_done_futures: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this actually the design contract of _wait
? I thought it would perform one pass of checking if things are done and downloading those; it would not block waiting for all to finish (note the original _wait
has no looping, that's handeled in wait()
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not block waiting for all to finish, with if (len(callids_found) < len(fs_samples)): break
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see so it's still blocking, just not for all of them. Would it make sense to have num_samples
be passed in (and the policy managed at the layer above) and just have it make a single best-effort to get this number of samples? That is, is there a reason to have this control logic at this layer but the all_completed logic at a layer higher?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it'll be good to have a clean contract between the two functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So say all 100 tasks finish but they only show up with list() 30 seconds after.
With this alternative solution, it would take min(100/N * wait_interval, 30) to find them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually think if we assume get_callset_status
and get_call_status
are instantaneous, then the contract is exactly same here, i.e., use whatever way to give one pass and returns.
@ericmjonas btw, I'll make some plot examples today. |
|
||
while query_count < max_queries: | ||
|
||
if len(done_call_ids) >= return_early_n: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think with this, we assume that there are few stragglers? Let's say we have 100 stragglers (not showing up with list), then it will take 100/return_early_n * WAIT_DUR_SEC to finish. I personally have seen the case with a large number of stragglers yet, but just want to point out the possibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think in the future if we want and this becomes a performance pain point we can make return_early_n to be a user-facing parameter.
if len(done_call_ids) >= return_early_n: | ||
break | ||
num_to_query_at_once = THREADPOOL_SIZE | ||
fs_to_query = still_not_done_futures[query_count:query_count + num_to_query_at_once] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think query_count + num_to_query_at_once
could go out of bound?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Python list slices always return as much as they can:
In [3]: x = list(range(10))
In [4]: x
Out[4]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
In [5]: x[5:15]
Out[5]: [5, 6, 7, 8, 9]
so I think this should be ok?
Thanks for taking it over, @ericmjonas . The code is clean to follow! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
REview from @ooq in PR
In wait(), we signal the completion of tasks by probing status files. Because S3 does not provide list-after-write consistency (which might also apply to other storage backend). List() can be only used as an optimization but not a timely way to signal completion. Thus, our strategy is to:
1) do list()
2) use get() to signal N tasks that do not show up in 1)
3) repeat 2) if all N tasks completed, otherwise stop
Note: a small N is probably preferred here. N is set to 4.
#160