Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libjob and job-ingest: improve submit API and submit/queue synchronization #1734

Merged
merged 19 commits into from Oct 23, 2018

Conversation

Projects
None yet
4 participants
@garlick
Copy link
Member

garlick commented Oct 12, 2018

This PR solves a few problems with the flux_job_submit() API call and the job-ingest module identified in discussions and while building the job-manager module.

First, flux_job_submit() requires the user to sign jobspec before submitting it, which means the user must interact with the flux-security API. This is extra work for little gain, since most of the time the default config file and signing method is appropriate. In addition, flux-core can be built without flux-security, and then flux_job_submit() expects jobspec to be sent in the clear. This means the user has to know how flux-core was built.

To clean this up, add a FLUX_JOB_PRE_SIGNED flag to indicate that jobspec is already signed when passed to flux_job_submit() for the uncommon case. In the common case, this flag is not used, and the flux-security API is used internally to sign using the default configuration When flux-core is not built with flux-security, a dumbed down wrap/unwrap interface is used internally to wrap the jobspec with the "none" method using the same encoding as flux-security.

Next, the job-ingest module announces new jobs with an event message, and the event payload contains only an array of jobid's. The job-manager module must immediately put these into a queue ordered by priority and regulate access to them based on the userid, so it must fetch the userid and priority from the KVS. To fix this, add userid and priority to the payload so instead of an array of integers, we send an array of objects containing id, userid, and priority.

Finally, the event created an "eventually consistent" relationship between the submit command/API and the queue which at minimum makes testing the job manager difficult, and at worst, might confuse users. Change the announcing event to an RPC to the job manager, and add a dummy job manager module for testing.

@garlick garlick force-pushed the garlick:ingest_rework branch from 8235e0a to 99150af Oct 12, 2018

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Oct 12, 2018

I went ahead and rebased this on current master, and also improved some commit messages. Apologies if I did that while someone was looking at code!

@SteVwonder

This comment has been minimized.

Copy link
Member

SteVwonder commented Oct 13, 2018

After going through the code, it looks good to me. I'll try and play around with this over the weekend. See if I can throw some jobs at it with my simple python initial program.

@codecov-io

This comment has been minimized.

Copy link

codecov-io commented Oct 13, 2018

Codecov Report

Merging #1734 into master will decrease coverage by 0.03%.
The diff coverage is 75.57%.

@@            Coverage Diff             @@
##           master    #1734      +/-   ##
==========================================
- Coverage   79.36%   79.32%   -0.04%     
==========================================
  Files         184      185       +1     
  Lines       34552    34728     +176     
==========================================
+ Hits        27421    27548     +127     
- Misses       7131     7180      +49
Impacted Files Coverage Δ
src/common/libjob/job.c 39.53% <21.21%> (-60.47%) ⬇️
src/cmd/flux-job.c 87.89% <76.19%> (-0.11%) ⬇️
src/modules/job-ingest/job-ingest.c 70.83% <78.75%> (+7.48%) ⬆️
src/common/libjob/sign_none.c 93.97% <93.97%> (ø)
src/common/libflux/mrpc.c 85.55% <0%> (-1.15%) ⬇️
src/bindings/lua/flux-lua.c 82.12% <0%> (-0.7%) ⬇️
src/common/libsubprocess/server.c 69.07% <0%> (-0.69%) ⬇️
src/modules/kvs/kvs.c 65.81% <0%> (-0.16%) ⬇️
src/modules/content-sqlite/content-sqlite.c 56.35% <0%> (+0.01%) ⬆️
... and 2 more

@garlick garlick force-pushed the garlick:ingest_rework branch 2 times, most recently from b79a18b to 5594ec1 Oct 15, 2018

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Oct 15, 2018

I tacked on some commits to fill in testing gaps and fix up some nits. @SteVwonder and @grondo: OK to squash the incremental development and rebase on master?

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Oct 15, 2018

That is fine with me!

@garlick garlick force-pushed the garlick:ingest_rework branch from 181c45b to 3bee63e Oct 15, 2018

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Oct 15, 2018

OK, I went ahead and forced a push.

@garlick garlick force-pushed the garlick:ingest_rework branch from 3bee63e to b453194 Oct 15, 2018

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Oct 15, 2018

Rebased again.

@garlick garlick force-pushed the garlick:ingest_rework branch 2 times, most recently from c86a062 to 49dc84e Oct 16, 2018

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Oct 16, 2018

@SteVwonder brought up the KVS eventlog yesterday in our discussion, and it gradually sunk in afterward that probably the KVS eventlog should be created by the job-ingest module after all, since it's already doing a KVS commit on the job's behalf, and the job-manager module shouldn't need to do anything until the next event. So I'll add that on here.

@garlick garlick force-pushed the garlick:ingest_rework branch from 59e0234 to d1cf0e5 Oct 16, 2018

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Oct 16, 2018

Rebased.

@grondo
Copy link
Contributor

grondo left a comment

This looks very well done to me! Just a couple comments/questions came up as I was poking at the implementation:

  • flux_job_submit fails with an error when job-ingest is loaded without job-manager. However, since the ingest module commits its KVS transaction before announcing jobs to the job-manager module, the job for which submit failed has an entry in the KVS and its eventlog has a submit event logged. Is the intent to have the job-manager module pick up these jobs from the KVS when it is loaded? If so, would a warning be more appropriate? (As I write that, maybe not since the job can't be "waited" on until a job-manager module exists). In any event, loading job-ingest without job-manager would not likely ever happen (once we have a job-manager module), so this is not high priority.

  • Side note: my job eventlogs seem to have an extra line?

$ flux kvs get job.active.0000.0012.3500.0000.eventlog
1539792319.625607 submit

$ flux kvs get job.active.0000.0012.3500.0000.eventlog | wc -l
2
  • For some reason, on instance shutdown I'm always getting a warning that job-ingest is not cleanly shutdown. I didn't investigate any further:
lt-flux-broker: module 'job-ingest' was not cleanly shutdown

That is all I have for now

@@ -214,7 +218,7 @@ void submitbench_continuation (flux_future_t *f, void *arg)

if (flux_job_submit_get_id (f, &id) < 0) {
if (errno == ENOSYS)
log_msg_exit ("submit: job-ingest module is not loaded");
log_msg_exit ("submit: job-ingest or job-manager module(s) not loaded");
else if ((errmsg = flux_future_error_string (f)))

This comment has been minimized.

@grondo

grondo Oct 17, 2018

Contributor

I would actually argue for flipping the order of processing the error message here.

First check for a valid flux_future_error_string(), and print with

log_msg_exit ("sumbit: %s: %s", errmsg, flux_strerror (errno))

this handles specifically the case of the job-manager module not loaded, since (if I read code correctly), batch_respond_error does include a specific "job-manager request failed" error message.

Then, the ENOSYS error means specifically that the job-ingest module is not loaded as before.

This comment has been minimized.

@garlick

garlick Oct 17, 2018

Author Member

Sounds good, I'll fix that.

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Oct 17, 2018

Side note: my job eventlogs seem to have an extra line?

I think that's an artifact of flux kvs get. Try --raw or use flux kvs eventlog get.

For some reason, on instance shutdown I'm always getting a warning that job-ingest is not cleanly shutdown.

Oh, I had noticed that but had forgotten that instance shutdown is supposed to shut down modules as well. I'll have a look.

the job for which submit failed has an entry in the KVS and its eventlog has a submit event logged

Er, I think it should be cleaned out as though it never existed if the jobid hasn't been returned to the user yet. Will fix that.

Thanks, good feedback!

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Oct 17, 2018

I think that's an artifact of flux kvs get. Try --raw or use flux kvs eventlog get.

Oh yeah, duh!

if (!sec)
goto error;
if (!(J = flux_sign_wrap (sec, jobspec, strlen (jobspec), NULL, 0)))
goto error;

This comment has been minimized.

@SteVwonder

SteVwonder Oct 17, 2018

Member

If this fails, is it appropriate to grab/set the errnum and grab/log the errmsg from the security context? Given the enclosure of the security context, I don't think it is possible to get that information otherwise.

       if (!(J = flux_sign_wrap (sec, jobspec, strlen (jobspec), NULL, 0))) {
           flux_log (h, LOG_ERR, "%s: %s", __FUNCTION__, flux_security_last_error (sec));
           errno = flux_security_last_errnum (sec);
           goto error;
       }

This comment has been minimized.

@garlick

garlick Oct 18, 2018

Author Member

Good point. Certainly the errno should pass through if it doesn't already. We shouldn't log from a library and there is currently no way to return an error string to the user since we are not returning a future. I suppose we could go ahead and return a future and immediately fulfill it with an error and error string in that case (also if setting up the security context fails e.g. due to bad config file).

Hmm pondering...

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Oct 18, 2018

For some reason, on instance shutdown I'm always getting a warning that job-ingest is not cleanly shutdown.

Oh, I had noticed that but had forgotten that instance shutdown is supposed to shut down modules as well. I'll have a look.

Sorry, I was confused and needed to review that code in the broker. Normally we have balanced load and remove calls in rc1 and rc3 and any loaded module not explicitly removed will trigger that error. (rmmod sends a <name>.shutdown request to the module, and the default handler for this causes the reactor to exit cleanly; if this doesn't happen by the time ctx->modhash is destroyed, then the module thread is pthread_cancel()ed, and this causes the warning about not being cleanly shutdown).

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Oct 18, 2018

Sorry, I was confused and needed to review that code in the broker. Normally we have balanced load and remove calls in rc1 and rc3 and any loaded module not explicitly removed will trigger that error. (rmmod sends a .shutdown request to the module, and the default handler for this causes the reactor to exit cleanly; if this doesn't happen by the time ctx->modhash is destroyed, then the module thread is pthread_cancel()ed, and this causes the warning about not being cleanly shutdown).

Oh yeah, I had forgotten about that too and now it seems obvious. Sorry!

@garlick garlick force-pushed the garlick:ingest_rework branch from 6b03c08 to 6f61ba9 Oct 19, 2018

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Oct 19, 2018

I think I've addressed all the review comments thus far. I did throw some flux-job cleanup just now also.

The coverage is below the threshold, though much of the missed coverage is in error handling that's hard to hit.

One bit of missing coverage is allowing flux_job_submit() to internally sign jobspec. That's somewhat tricky since (as @SteVwonder pointed out elsewhere) we don't have munge running in travis yet, and the installed config file selects munge as the default signing mechanism. Maybe that can come later?

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Oct 19, 2018

Unless there are objections, i'm going to squash down the incremental development here.

garlick added some commits Oct 9, 2018

libjob: add priority to flux_job_submit()
Add a priority argument to flux_job_submit().
The priority range is [0:31], default=16.

Update flux-job submitbench with -p,--priority=N
option.  Update libjob unit test.
modules/job-ingest: fix memory leak
Problem: batch->txn isn't freed when the batch is
destroyed, thus causing a memory leak.

Add a flux_kvs_txn_destroy() to batch_destroy().
modules/job-ingest: move batch response functions
Move batch_respond_success() and batch_respond_error()
up next to batch_create() and batch_destroy().
modules/job-ingest: announce more job data
Expand the payload of the new job announcement event
sent by job-ingest to include the userid, and prepare
to add other attributes, such as priority, in the future.

The job-manager must immediately fetch this information
from the KVS to populate its queue.  This will be more
efficient.

garlick added some commits Oct 9, 2018

modules/job-ingest: handle submit priority
Accept priority as input from submit request.
Verify range of [0:31].
Allow instance owner full range, guest [0:16].

Include priority in job-ingest.submit event.
Add priority key to KVS job schema.
modules/job-ingest: add submit event to eventlog
Create job eventlog per RFC 16.
modules/job-ingest: call job-manager.submit
Instead of publishing an open-loop event when new active
jobs are accepted, send a request to the job manager to
add the new job to the queue.  Wait for the response before
responding to the original ingest.submit request(s).
cmd/flux-job: change ENOSYS explanation
Problem: flux-job tells user that job-ingest is
not loaded if it gets ENOSYS from a job submission.
However, ENOSYS is also returned if job-manager
isn't loaded.

Since job-ingest returns a textual error message when
it fails to submit to job-manager, first print the
textual error if available, then if errno == ENOSYS,
suggest job-ingest might not be loaded, otherwise print
the flux_strerror() error message.
libjob: add sign_none_wrap(), sign_none_unwrap()
Problem: when flux-core is built without flux-security,
we send jobspec in the clear (unwrapped).  The
flux_job_submit() API call works differently, the
job-ingest.submit RPC behaves differently, and the
KVS job schema is populated differently.

Add functions for wrapping and unwrapping payloads with
the no-op "none" signing method, using the same encoding
used in flux-security so these interfaces can work
the same with or without flux-security.
libjob: include job.h in <flux/core.h>
Problem: job.h was not included by the installed <flux/core.h>
or the in-tree version, yet this will be a primary API
component.

Add job.h to those includes.
modules/job-ingest: unlink KVS entry on submit error
Problem: job entries remain in the KVS in three cases
where job-ingest.submit fails.

Unlink the KVS entries associated with a batch of submit
requests if the job-manager.submit RPC fails (two cases)
or if we fail to set up the continuation for the KVS commit.

Perform the unlink in parallel with responding to the user.
This makes writing a test specifically for this case difficult
in the standalone ingest sharness test.  However, a future
job-manager test that loads the job manager after some ingest
ENOSYS failures should find unexpected jobs in its queue if
the KVS entries remain afterwards.

Rather than make the cleanup code more complicated so it
completes before responding to the user, keep it simple and
defer testing to the job manager.
cmd/flux-job: avoid unnecessary jobspec copy
Problem: when not pre-signing jobspec, flux job submitbench
copies the jobspec read by read_all() in order to add a
terminating NULL for flux_job_submit().  Recently read_all()
was modified to add a terminating NULL that is not reflected
in the size, so this extra copy is no longer needed.

Remove the extra copy.
libjob: add FLUX_JOB_PRE_SIGNED flag
Problem: users of flux_job_submit() must know whether
flux-core was built --with-flux-security or not.

Add a FLUX_JOB_PRE_SIGNED flag that allows jobspec to
be signed before submission, but change the default
so flux_job_submit() signs the jobspec internally.

If built --with-flux-security, configure a security context using
default config file path and cache it in the flux_t handle.
Each time flux_job_submit() is called, the context is
reused, and the jobspec is signed with the default mechanism.

Take special care to enable the user to access a textual
error message from flux-security upon failure.  If a textual
error message is available, instead of returning NULL,
arrange for flux_job_submit() to return a future pre-fulfilled
with the error string.

If built without --with-flux-security, sign unconditionally
with the simplified local "sign_none" class.

Update flux-job to use the FLUX_JOB_PRE_SIGNED flag if
built --with-flux-security and the user requests non-default
security options that can only be set by signing the jobspec
with full access to the flux-security API.

Fixes #1715
modules/job-ingest: always expect J to be signed
In the case where flux was built without --with-flux-security,
decode J using the sign_none class.
testsuite: add dummy job-manager module
Problem: the job-ingest module cannot respond to
a submit request until the job-manager has accepted
the new job, but we need to test job-ingest in
isolation.

Create a dummy job-manager module that handles the
job-manager.submit RPC, logging each new job to a
KVS eventlog before responding.
testsuite: load dummy job-manager during ingest test
Load the 'job-manager-dummy' module so that jobs
can be submitted to 'job-ingest' in test.

Add tests to:
- verify that submission fails if the job-manager
  module is not loaded.
- verify that job data is accurately submitted to the
  job-manager.

@garlick garlick force-pushed the garlick:ingest_rework branch from 6d28612 to 1a8c69b Oct 19, 2018

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Oct 19, 2018

Squashed and rebased on current master.

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Oct 22, 2018

I think we agreed this could go in - could somebody please press the button? :-)

@SteVwonder SteVwonder merged commit 27d2efc into flux-framework:master Oct 23, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Oct 23, 2018

Thanks!

@garlick garlick deleted the garlick:ingest_rework branch Oct 23, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.