-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce planner_multi_t #349
Conversation
Codecov Report
@@ Coverage Diff @@
## master #349 +/- ##
=======================================
Coverage 73.35% 73.35%
=======================================
Files 59 59
Lines 9829 9829
=======================================
Hits 7210 7210
Misses 2619 2619 Continue to review full report at Codecov.
|
182db05
to
5335e23
Compare
@SteVwonder and I discussed this. We will try to land this with a minimal amount of work so that @SteVwonder can start to integrate the hwloc reader into the new graph-based infrastructure. |
@dongahn can you rebase onto master? I can merge afterwards. Also, you mentioned in the PR description
Do you happen to remember what those corner cases were? If so, can you document them in an issue so that we do not forget to circle back around to them? |
Yes, I will do those. Thanks! |
@SteVwonder: OK. I did rebase this to the upstream master. I think the main thing was to change some method names in traverser classes and etc to increase readability and add generally more testing. Both can come later. f99c21e is so that users can program prune filters more effectively. But this can also come much later. Perhaps I can change this commit as "add a placeholder command line option" This way, completing this functionality by me don't have to be in the way of your progress. |
Sounds like a good idea |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like this abstraction into single and multi! Most of the complexity has been isolated to the "single" case and the "multi" case is a straightforward use of the "single" case.
My main review comments are really just nitpicking the naming of the "resource counts". I think the phrase request
is overloaded and causes a bit of confusion in places. resource_counts
and request_counts
seem more clear to me. There are also a few inconsistencies in the naming across the "single" and "multi" interfaces as well as within the "multi" interfaces.
The testing looks good for now. It seems to cover the common cases. As per the discussion earlier today, the extended testing of the corner cases will be saved for later. For example, I don't think passing a duration of 0
is tested.
resource/planner/planner.c
Outdated
|
||
typedef struct request { | ||
int64_t on_or_after; | ||
uint64_t duration; | ||
resource_array_t resources; | ||
size_t dimension; | ||
int64_t request; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been tripping over the use of request
in the planner a little bit. It is an internal struct, a member of that struct, and an argument in many of the public functions. So you end up with things like request = ctx->current_request->request
at line 794. I think the struct name makes a lot of sense, but maybe size
or count
would make more sense here and in the function arguments. So that would turn the previous example into size = ctx->current_request->size
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To keep in line with what is done in planner_multi.h
maybe resource_count
would be best?
*/ | ||
int planner_multi_avail_during (planner_multi_t *ctx, int64_t at, | ||
uint64_t duration, | ||
const uint64_t *resource_requests, size_t len); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the comments and other functions, resource_requests -> resource_counts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, on further thought, maybe for the functions that consume the counts, resource_requests
makes more sense. And for functions that copy the counts into the array, resource_counts
makes more sense.
I am not exactly sure, and I don't have super strong feelings either way. But there is an inconsistency between the function argument and the comments.
resource/planner/planner_multi.h
Outdated
* range, e.g., reserving more than available. | ||
*/ | ||
int64_t planner_multi_add_span (planner_multi_t *ctx, int64_t start_time, | ||
uint64_t duration, const uint64_t *request, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least should be changed to the plural (requests
), but I think like in planner_multi_avail_time_first
and planner_multi_avail_during
, the naming request_counts
would fit too.
@SteVwonder: Thank you for the review. Yes, I will have one more pass on those names and at least have those names consistent. |
Does this PR fix #319? |
Yes. |
Solve a problem where the mintime resource tree within planner does not work for multi-resource types. The problem is described in Issue flux-framework#319. Change the API and implementation of the planner layer so that it now only concerns only a single resource type. With this change, the upper layer (resource-query etc) will now have to use multiple planner objects if it needs to track multiple resource types. Note that we considered other alternatives: e.g., introducing kd tree or making each scheduled point node in the resource binary search tree to maintain sub-resource BST for the next significant resource types. But these alternatives will significantly increase the software complexity for a single software layer, and we decide to push the complexity to the upper layer. Finally, adjust the unit test case as well.
Add an option to allow users to configure the planner-based caches to speed up graph walks in the future.
@SteVwonder: I think I addressed all of your suggestions. Please take a look and merge if this looks good to you. Thanks. |
Looks like one of the Travis jobs stalled out and didn't even produce any output. I just restart it. Once Travis turns green, I will merge. Thanks @dongahn! |
This PR is still experimental: I thought I should post this now as I will be busy with travel for awhile.
This PR mainly addresses a minimum time resource tree problem within the planner layer. This reverse search tree is designed to satisify a query like "when is the earliest scheduled point at which I can schedule n amounts of resources" in
O(lg N)
time. However, this advanced algorithm only works when it deals with a single resource type and doesn't work correctly when it deals with multiple resource types.In response, this PR limits the planner layer such that that it only manages a single resource type. Instead, it adds
planner_multi_t
layer to handle multiple resources. It also adjusted the rest of the resource code for this change.Finally a bunch of test cases has also been added.
When I circle back to this, I will do further cleanup and tighten some loosen ends to ensure correctness for a few more corner cases.