
Add support for checking a job's satisfiability #503

Merged
merged 20 commits into from Aug 8, 2019

Conversation

@dongahn
Contributor

dongahn commented Jul 30, 2019

This PR adds the following support:

  • Detect unsatisfiable jobs
    As discussed in Issue #478, the match RPC within resource only supports two match operations: allocate and allocate_orelse_reserve. The former doesn't tell whether the allocation request failed because of the current resource state (i.e., no resources available now) or because the jobspec cannot be satisfied at all. By contrast, if the latter fails, that means the jobspec is not satisfiable: no matter how far in the timeline I move my schedule point, the jobspec cannot be matched.
    The main feature of this PR is to add allocate_with_satisfiability to overcome the challenge with allocate. It first attempts to allocate. If that succeeds, it returns the matching info as before. If it fails, however, it sets the scheduled time to a point as late as possible within the time box of the planner and retries the match. Then, if it still can't find matching resources at that scheduled point, the jobspec is deemed unsatisfiable. This adds no additional overhead for "qualified" jobspecs. However, for "unqualified" jobspecs, checking satisfiability requires an entire tree walk. Going forward, we will more rigorously validate the input jobspecs before they come to resource, so we will be able to manage this cost better.

  • Tighten up qmanager's error handling
    Integrate this feature into the rest of the system, including qmanager and its queue policy layer. This tightens up qmanager's error handling, making it more robust.

  • Trim message logging for resource and qmanager
    Significantly reduce the number of per-job messages in particular.

Resolves #495, #478, and #489.
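The two-phase logic described above can be illustrated with a deliberately simplified sketch. This is not flux-sched code: the toy `cluster_t` model and `allocate_with_satisfiability` signature are invented for this example; the real implementation walks the resource graph and consults the planner's time box. Only the errno convention (EBUSY vs. ENODEV) mirrors the PR.

```cpp
#include <cerrno>

// Toy model: a cluster has a fixed total node count and some nodes
// currently busy. A jobspec here is just a requested node count.
struct cluster_t {
    int total_nodes;   // all resources that will ever exist
    int busy_nodes;    // resources allocated right now
};

// Phase 1: try to allocate against the *current* state.
// Phase 2 (on failure): try again as if the schedule point were moved
// as late as possible, i.e., as if every current allocation had been
// released. If even that fails, the jobspec can never be satisfied.
//
// Returns 0 on success; -1 with errno=EBUSY (satisfiable, just not
// now) or errno=ENODEV (unsatisfiable).
inline int allocate_with_satisfiability (const cluster_t &c, int want)
{
    if (want <= c.total_nodes - c.busy_nodes)
        return 0;                 // allocatable right now
    if (want <= c.total_nodes) {
        errno = EBUSY;            // would fit once resources free up
        return -1;
    }
    errno = ENODEV;               // can never fit: unsatisfiable
    return -1;
}
```

Note how the satisfiability check only costs extra work on the failure path, matching the PR's claim that "qualified" jobspecs pay no additional overhead.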

@codecov-io

codecov-io commented Jul 30, 2019

Codecov Report

Merging #503 into master will increase coverage by 0.09%.
The diff coverage is 74.7%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #503      +/-   ##
==========================================
+ Coverage   75.52%   75.61%   +0.09%     
==========================================
  Files          60       60              
  Lines        6161     6210      +49     
==========================================
+ Hits         4653     4696      +43     
- Misses       1508     1514       +6
Impacted Files Coverage Δ
qmanager/policies/base/queue_policy_base.hpp 100% <ø> (ø) ⬆️
resource/policies/base/matcher.hpp 100% <ø> (ø) ⬆️
resource/hlapi/bindings/c++/reapi_module_impl.hpp 43.75% <ø> (ø) ⬆️
resource/traversers/dfu_impl.hpp 100% <ø> (ø) ⬆️
resource/utilities/command.cpp 68.75% <100%> (+1.25%) ⬆️
resource/traversers/dfu.cpp 81.41% <100%> (+4.49%) ⬆️
resource/traversers/dfu_impl.cpp 83.72% <100%> (+1.42%) ⬆️
qmanager/policies/base/queue_policy_base_impl.hpp 73.57% <23.8%> (-8.79%) ⬇️
qmanager/modules/qmanager.cpp 70.64% <50%> (-5.09%) ⬇️
resource/modules/resource_match.cpp 72.65% <75.67%> (+1.19%) ⬆️
... and 2 more

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 376b82a...9083658. Read the comment docs.

@dongahn

Contributor Author

dongahn commented Jul 30, 2019

@SteVwonder, @grondo and @garlick: this is also ready to be reviewed again. I had to comment out one test because this builds on the by_excl() bug fix PR. If the bug fix PR lands first, I will turn it back on.

@dongahn dongahn force-pushed the dongahn:sat2 branch from 66531ce to a86beb7 Jul 31, 2019
@dongahn

Contributor Author

dongahn commented Jul 31, 2019

Rebased to the master after PR #502 and forced a push.

Member

SteVwonder left a comment

Thanks @dongahn! LGTM! The way the commits were broken up made this really easy to follow and review. So thanks for that!

One minor nit and one open question on best practices listed below. I'm happy to merge this once Travis passes, and we can pontificate on best practices in an issue rather than holding this PR up.

Per the Travis failure, it looks like you just need to rename a dangling error label to restore_errno.

errno = EINVAL;
flux_log_error (h, "existent job (%jd).", (intmax_t)jobid);

SteVwonder Aug 6, 2019

Member
Suggested change
flux_log_error (h, "existent job (%jd).", (intmax_t)jobid);
flux_log_error (h, "nonexistent job (%jd).", (intmax_t)jobid);

Further down in info_request_cb, the same nonexistent job error results in an ENOENT. I wonder if that fits better here too.

dongahn Aug 7, 2019

Author Contributor

Ah. Thanks. Yes, ENOENT should be used instead unless it has a side effect (I don't believe so). I will follow up.

dongahn Aug 7, 2019

Author Contributor

Looking at this once again, this is actually intended. In this case, the code is checking whether the job id already exists and, if so, it refuses the match request. So EINVAL should be correct as well. I will just push one commit that fixes the done label issue.

I was confused myself as well. So thank you for bringing this up!

job->schedule.R.c_str (), NULL) < 0) {
flux_log_error (ctx->h, "%s: schedutil_alloc_respond_R",
__FUNCTION__);
goto out;

SteVwonder Aug 6, 2019

Member

This is more thinking out loud than it is giving concrete review feedback, but I wonder what the best practice is for handling an error like this. Should the first failure cause the qmanager to bail out of trying to respond to the remaining messages? Or should it continue on trying to respond to the other messages, even if they too error out. I'm not sure. I'm kind of leaning towards the latter if only because then the alloced and rejected queues are empty at the end of the scheduling loop and every request has gotten a response (or a response has at least been attempted rather than delayed indeterminately).

dongahn Aug 7, 2019

Author Contributor

Good point! I am not sure either. I think what is best would depend on the most common failure mode of schedutil_alloc_respond_R. If the failure mode is a network/protocol error or the job-manager is in a limbo state, it doesn't seem like issuing more responses would help... Even if we error out right away, the allocated and rejected jobs will still be in the corresponding queues, so the responses will be sent at the next invocation of the scheduler loop. So I'm kind of leaning toward the former.

I'd suggest we merge this as-is. And then, have a ticket created so that if this becomes an issue, we can easily go back to this discussion.
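The error-out-early option discussed above can be sketched roughly as follows. This is a hypothetical illustration, not qmanager's actual code: `respond_all` and the int job ids are invented, and the real code responds via schedutil. The key property is that jobs are popped only after a successful response, so on failure the remainder stay queued for the next scheduler-loop invocation.

```cpp
#include <deque>
#include <functional>

// respond() returns 0 on success, -1 on failure (e.g., a protocol
// error). Pop a job only after its response succeeds; on the first
// failure, bail out and leave the remaining jobs in the queue so the
// next invocation of the scheduler loop retries them.
inline int respond_all (std::deque<int> &alloced,
                        const std::function<int (int)> &respond)
{
    while (!alloced.empty ()) {
        if (respond (alloced.front ()) < 0)
            return -1;      // bail out; unacknowledged jobs stay queued
        alloced.pop_front ();
    }
    return 0;
}
```

Under this scheme no response is lost; it is only delayed until the next loop, which is the behavior the comment above leans toward.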

@dongahn

Contributor Author

dongahn commented Aug 7, 2019

I pushed the commit and Travis is happy now. @SteVwonder: once you look at it briefly, I will squash that. Then, this PR is good to be merged.

@SteVwonder

Member

SteVwonder commented Aug 7, 2019

once you look at it briefly, I will squash that. Then, this PR is good to be merged.

LGTM! Squash away!

dongahn added 20 commits Jul 15, 2019
Add MATCH_ALLOCATE_W_SATISFIABILITY.

This operation attempts to allocate first.
If that succeeds, it returns the matching info as before.

If it fails, however, it sets the scheduled time to a point
as late as possible within the time box of the planner
and retries the match.
Then, if it still can't find matching resources
at that scheduled point, the jobspec is deemed
unsatisfiable.
This satisfiability query must not change the visible
state of the graph data store. To support this,
overload the update method that only cleans up
the state changes created by the satisfiability
checking walk.

dfu_impl_t::schedule() (and therefore also run())
returns this information with -1 return code
with errno=ENODEV.
This is different than EBUSY, which is set when
dfu_impl_t::schedule() (and therefore also run())
cannot find the matching resources
at a given scheduled point (e.g., now).

Cleaned up how various errnos are propagated
in dfu_impl.cpp to simplify how errno=ENODEV
and errno=EBUSY are determined at dfu_impl_t::schedule().

Update in-line documentation with error numbers
for the public dfu_traverser_t::run() method.
Incorporate the newly added
MATCH_ALLOCATE_W_SATISFIABILITY capability within
the matcher and traverser layer into resource-query.
Add support for the rejected-queue API.

Refactor the common post run_sched_loop routines into the
post_sched_loop function.
Also drop checking the user's id with flux_msg_get_userid().
Only instance owners are permitted to invoke RPC by default
so we don't need to check the user id at this callback level.
Trim INFO level log messages.

Drop checking user's id with flux_msg_get_userid().
Only instance owners are permitted to invoke RPC by default
so we don't need to check the user id at this callback level.
Pertaining to module initialization routines.
Make better use of errno returned from the matcher's
run () method.

 - ENODEV means the jobspec cannot be satisfied at all.

 - EBUSY means the jobspec MAY be satisfied but the matcher
cannot find matching resources given the current resource
states.

 - Other errno codes mean internal errors occurred with
the matching logic.
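A caller could consume this errno convention roughly as follows. This is a hedged sketch, not qmanager's actual API: `disposition_t` and `dispatch` are invented names; only the errno-to-decision mapping (success, EBUSY, ENODEV, other) comes from the commit message above.

```cpp
#include <cerrno>
#include <string>

// Possible dispositions for a job after one match attempt.
enum class disposition_t { allocated, keep_pending, reject };

// Map the matcher's return/errno convention onto a queueing decision:
//   rc == 0         -> allocated
//   errno == EBUSY  -> leave in the pending queue (retry later)
//   errno == ENODEV -> move to the rejected queue (unsatisfiable)
//   anything else   -> reject with an internal-error note
inline disposition_t dispatch (int rc, int err, std::string &note)
{
    if (rc == 0)
        return disposition_t::allocated;
    if (err == EBUSY)
        return disposition_t::keep_pending;
    note = (err == ENODEV) ? "unsatisfiable" : "match error";
    return disposition_t::reject;
}
```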
Trim per-job-based INFO level log messages.

Use flux_log_error.

Drop checking user's id with flux_msg_get_userid().
Only instance owners are permitted to invoke RPC by default
so we don't need to check the user id at this callback level.
Better handle errors that can emerge from remove() method
within the traverser.
Trim per-job-based INFO level log messages.

Drop checking user's id with flux_msg_get_userid().
Only instance owners are permitted to invoke RPC by default
so we don't need to check the user id at this callback level.
Add resource-query commands and jobspecs that will
be used by upcoming satisfiability sharness tests.
Test 1: match allocate_with_satisfiability and
match allocate_orelse_reserve work correctly
on unsatisfiable jobspecs with respect to
low-level resource constraints.

Test 2: match allocate_with_satisfiability and
match allocate_orelse_reserve work correctly
on unsatisfiable jobspecs with respect to
high-level resource constraints.

Test 3: match allocate_with_satisfiability and
match allocate_orelse_reserve correctly distinguish
unsatisfiable jobspecs vs. ultimately satisfiable
jobspecs.
Introduce a rejected-job queue that tracks those jobs
that must be denied (e.g., due to jobspec unsatisfiability or
internal matching error) into the queue policy layer.

These jobs are dequeued from the pending job queue
and moved to the rejected queue with a note that explains
the reason for denial.

The upper layer such as qmanager will be able to
pop jobs from this queue and act on it (e.g., use
a job's flux message field to respond to job-manager's
alloc request with "alloc denied").
Incorporate the newly added rejected queue support
APIs into the FCFS policy's schedule loop implementation.

Clean up the ways in which errors returned from
the high level resource API are handled.
Extend the flux-resource script to handle
    $ flux resource match allocate_with_satisfiability jobspec

and exit with 19 when errno=ENODEV is returned from
the match.allocate RPC of the resource matching service.
Test 1: Use match allocate_with_satisfiability on
satisfiable jobspecs to make sure that the new RPC works
exactly the same way as match-allocate on satisfiable
jobs.

Test 2: Now that no resources are available, use
match allocate_with_satisfiability on the same
satisfiable jobspecs to test errno=EBUSY
is returned.

Test 3: Use match allocate_with_satisfiability on
unsatisfiable jobspecs to make sure the RPC will
return errno=ENODEV this time instead.
Add the expected output from resource-query that will
be used by upcoming satisfiability sharness tests.
@dongahn dongahn force-pushed the dongahn:sat2 branch from 7b2f57f to 9083658 Aug 7, 2019
@dongahn

Contributor Author

dongahn commented Aug 7, 2019

Ok. Squashed. Good to go. Thanks!

@SteVwonder SteVwonder merged commit 59c2417 into flux-framework:master Aug 8, 2019
2 of 3 checks passed
codecov/patch 74.7% of diff hit (target 75.52%)
codecov/project 75.61% (+0.09%) compared to 376b82a
continuous-integration/travis-ci/pr The Travis CI build passed
@SteVwonder

Member

SteVwonder commented Aug 8, 2019

Thanks @dongahn!
