job-ingest: add ingest module, flux job command, libjob library #1626
Just curious why not |
Honestly I can't remember. Happy to change it if that makes more sense. |
Well, it is only my opinion but a |
Yes will change. Not sure what I was on about there. |
Codecov Report
@@ Coverage Diff @@
## master #1626 +/- ##
========================================
Coverage ? 79.4%
========================================
Files ? 179
Lines ? 32512
Branches ? 0
========================================
Hits ? 25816
Misses ? 6696
Partials ? 0
OK, just renamed "add" to "submit" in the API, the RPC request topic, and the event topic. I went ahead and squashed that down.
The "program" to test for It does feel like maybe we'd want a way outside of compiling a C program to determine if an installed flux-core was built with flux-security support though... maybe we should annotate
or similar? Edit: meant to add this can be done in a later PR. Just wanted to make a general comment here. |
Also, if built without flux-security, should the |
Error from
Sorry about these trivial comments coming in disjoint like this, just trying to note small issues before I leave for the day. |
Thanks! I opened #1632 to track the |
LGTM! Thanks @garlick!
Now that options are suppressed in |
Problem: DOTHEX-encoded FLUIDs can be used to map FLUIDs onto the KVS namespace, but without zero padding, flux-kvs ls doesn't display or sort them nicely. Pad each dotted hex number out to four digits.
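The padding described above can be sketched as follows. This is only an illustration, not the code from the commit: `dothex_encode` is a hypothetical helper name, and the sample id is taken from the `job.active.0000.011c.ae00.0002` example in the job-ingest commit message of this PR.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not the flux-core API): format a 64-bit id as four
 * 16-bit groups, each zero-padded to four hex digits and joined by dots.
 * Returns 0 on success, -1 if the buffer is too small. */
int dothex_encode(uint64_t id, char *buf, size_t bufsz)
{
    int n = snprintf(buf, bufsz, "%04x.%04x.%04x.%04x",
                     (unsigned)((id >> 48) & 0xffff),
                     (unsigned)((id >> 32) & 0xffff),
                     (unsigned)((id >> 16) & 0xffff),
                     (unsigned)(id & 0xffff));

    /* snprintf reports the length it wanted to write; treat truncation
     * as an error rather than returning a partial key. */
    if (n < 0 || (size_t)n >= bufsz)
        return -1;
    return 0;
}
```

Because every group has the same width, a plain lexicographic sort of the resulting KVS key names matches the numeric order of the ids, which is what makes the directory listing sort nicely.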
Problem: fluid_decode (type=MNEMONIC) does not fail if presented with words not in its dictionary or a phrase too short to represent a uint64_t.

mn_decode()'s inline documentation is incorrect:

    /* Return value:
     *  This function may return all the value returned by mn_decode_word_index
     *  plus the following result code:
     *
     *  MN_EWORD - Unrecognized word.
     */

It actually returns the number of bytes successfully written to the output. Require mn_decode() to return 8 in order for fluid_decode() to be successful.
Problem: jobspec sharness test was not run. Add jobspec inputs to EXTRA_DIST and run jobspec test if ENABLE_JOBSPEC.
Problem: while there is an ENABLE_JOBSPEC Makefile conditional, there is nothing that can be tested in config.h to determine if jobspec is being built. Add HAVE_JOBSPEC to config.h
If the user requests it by specifying --with-flux-security, have configure locate flux-security using pkg-config. Defines HAVE_FLUX_SECURITY in config.h. Defines HAVE_FLUX_SECURITY for Makefile.am's
Add a library that provides an API for job creation, monitoring, and control. For now it contains only an interface for job submission.
Add a module that handles job-ingest.submit RPCs to add new jobs to the KVS using a rudimentary form of the RFC 16 job schema format. The user is returned a jobid based on the FLUID proposal, which allows 64-bit id generation to occur in parallel across ranks, while retaining a loose ordering of ids based on the time submitted. KVS per-job directories are generated using the DOTHEX FLUID encoding, e.g. job.active.0000.011c.ae00.0002

The instance owner and any user with ROLE_USER may submit jobs. Jobs must be signed, and the user authenticated as the submitter (by the connector) must match the signature, but the job signature is not authenticated at ingest time. The connector-authenticated userid is recorded in the KVS under the "userid" key. The signed blob is recorded under the "J-signed" key.

The job submission consists of signed RFC 14 jobspec, which is validated by temporarily instantiating a C++ Jobspec object and recording any parse errors. The parsed Jobspec may be further validated in the future, for example to find resource requests that cannot be fulfilled, but for now we ingest all valid jobspec. (Validation is performed in a standalone .cpp file linked against libjobspec.la. The standalone C++, which exports a validate function callable from C, is compiled with the C++ compiler; automake then knows to link job-ingest against libstdc++). The unwrapped jobspec is written to the KVS under the "jobspec" key.

The module is completely event driven, and KVS overhead is reduced and ingest rate increased by batching job-ingest.submit requests that arrive together within 10ms. A "job-ingest.submit" event is generated after the KVS commit which contains an array of new jobids. This can be consumed by the job-manager module in the future, which will handle listing jobs for users, and informing the scheduler when new active jobs have been submitted.
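To illustrate how an id can be generated independently on each rank yet stay loosely time-ordered, here is a minimal sketch. The bit layout (40-bit millisecond timestamp, 14-bit generator id, 10-bit sequence number) is an assumption made for this example and is not stated in this PR; `sketch_fluid_compose` and the `SKETCH_*` constants are hypothetical names, not the libutil API.

```c
#include <stdint.h>

/* Assumed layout, for illustration only:
 *   bits 63..24  timestamp in milliseconds (40 bits)
 *   bits 23..10  generator id, e.g. the broker rank (14 bits)
 *   bits  9..0   per-millisecond sequence number (10 bits)
 */
#define SKETCH_TS_BITS   40
#define SKETCH_GEN_BITS  14
#define SKETCH_SEQ_BITS  10

/* Hypothetical helper: pack the three fields into one 64-bit id.
 * Returns 0 on success, -1 if any field does not fit its width. */
int sketch_fluid_compose(uint64_t ts_ms, uint32_t gen_id, uint32_t seq,
                         uint64_t *out)
{
    if (ts_ms >= (UINT64_C(1) << SKETCH_TS_BITS)
        || gen_id >= (UINT32_C(1) << SKETCH_GEN_BITS)
        || seq >= (UINT32_C(1) << SKETCH_SEQ_BITS))
        return -1;

    /* The timestamp occupies the most significant bits, so ids from a
     * later millisecond always compare greater, whichever rank made them. */
    *out = (ts_ms << (SKETCH_GEN_BITS + SKETCH_SEQ_BITS))
         | ((uint64_t)gen_id << SKETCH_SEQ_BITS)
         | (uint64_t)seq;
    return 0;
}
```

Under this assumed layout, two ranks never need to coordinate: they differ in the generator field, so their ids cannot collide. The ordering is only loose because ids created in the same millisecond on different ranks sort by rank rather than by exact arrival time.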
Good point! OK, I dropped the little proggie and updated the sharness test to look at the output of I also took the opportunity to rebase on current master. If this is looking close, LMK and I'll squash the incremental development. |
I'm going to go ahead and squash. |
oops, just remembered there are two sharness scripts that need updating to check |
Add a front end command that will eventually contain the primary user interfaces for submitting and managing jobs such as list, run, or cancel. For now, it contains two subcommands for testing the job-ingest module:
submitbench - test ingest throughput, maintaining a minimum number of outstanding RPCs
id - convert jobids between representations
Add a test_under_flux "profile" for testing the new execution system, starting with job-ingest module.
Pull in flux-security-0.2.0 via the travis-dep-builder script, then add --with-flux-security to some builders in the travis build matrix.
Went through this quickly and gave it a spin w/ and w/out flux-security and I don't have any other comments. LGTM.
Thanks, @garlick!
OK, I think this can go in - could somebody press the button? |
Thanks @garlick! |
Thanks! |
This PR resurrects #1543.
The interface in job.h (under libjob) for submitting work is:
The flux job command currently has two subcommands: and
There is an opt-in configure option for building with flux-security. If building without, jobs can be submitted and they are tagged with the submitting user, but they are not signed and thus will only be usable if we implement a short-circuit that allows the instance owner to launch its own work without signature verification.
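A rough sketch of how code might branch on the opt-in build follows. Only the HAVE_FLUX_SECURITY macro comes from this PR (configure defines it in config.h); the function names are hypothetical and the real submission path is not shown.

```c
#include <stdbool.h>

/* In a real build HAVE_FLUX_SECURITY would come from config.h when
 * configure was run with --with-flux-security. This sketch does not
 * include config.h, so the macro is simply undefined unless the
 * compiler is invoked with -DHAVE_FLUX_SECURITY=1. */

/* Hypothetical helper: report whether signing support was compiled in. */
bool sketch_built_with_security(void)
{
#if HAVE_FLUX_SECURITY
    return true;
#else
    return false;
#endif
}

/* Hypothetical helper: describe how a submitted job would be recorded.
 * Without flux-security the job is still tagged with the submitting
 * user, but it carries no signature. */
const char *sketch_submit_mode(void)
{
    return sketch_built_with_security() ? "signed" : "unsigned";
}
```

Keeping the conditional inside one small function, rather than scattering `#if` blocks through the callers, is a design choice of this sketch and not something the PR prescribes.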