job-info & job-manager & flux-jobs: Support job annotations (annotations part 3) #3065
That might be a good idea. We could also have meaningful default headings for those cases.
Have anything in mind? I could see a couple of special cases, such as
5730bec to 95b2338
rebased on master & fixed up some conflicts given the unit tests added in PR #3062
Yeah,
95b2338 to 8f358f9
Went ahead and squashed since my tweaks were a bit jumbled up, I hope that's ok :P Edit: simple output example
What am I doing wrong here?
Error messages are not helping me figure this out :-)
The first two error messages are b/c "xyz" isn't legal JSON. Should be The latter is b/c you're trying to annotate a completed job. I should definitely put in a better error message in that case.
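To illustrate the JSON point above: a bare xyz is not a JSON value, while a quoted "xyz" is a JSON string. The sketch below uses Python's json module as a stand-in for the jansson parsing done in C. The helper name is_legal_json is made up for this example and is not part of flux-core.

```python
import json


def is_legal_json(text):
    """Return True if text parses as a JSON value (hypothetical helper)."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


print(is_legal_json('xyz'))              # bare word: not legal JSON
print(is_legal_json('"xyz"'))            # quoted: a JSON string
print(is_legal_json('{"note": "xyz"}'))  # an object is legal JSON too
```

On a shell command line the inner double quotes usually need protecting from the shell, for example by wrapping the whole value in single quotes.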
Thanks, sorry, first naive experience! Note that jansson functions don't set errno, so the No error is because log_err_exit() is being used to log errors. Better messages would be good. Edit: looks like the service sends back useful string error messages, but
My similarly naive first implementation :-) I implemented the
Realized it too :-) Pushed a fixup for some better error messages.
OK, here's a first round of suggestions for a bit of cleanup.
9bd4db0 to 044bb75
rebased on master & re-pushed addressing the stuff @garlick mentioned above. There's a lot of fixups and a new cleanup commit that I should re-order in the commit series. Would it be easier to review by just squashing everything?
Thanks! Squashing would probably be preferable.
c6aeb11 to d5f9f57
Squashed and fixed a few nits along the way. Rebased while I was at it.
How about one more rebase to resolve the conflict and then I'll do another review pass?
Already working on it :-) but hit #3071
d5f9f57 to c209ef9
c209ef9 to 88e7c59
rebased and repushed. I removed the #3074 fix since that won't affect travis. I'll put that fix in a separate PR. Edit: I also added a
88e7c59 to 687b620
So the annotations added with flux job annotate are not sticky across a restart. Is that intended? Should there be an annotate event in the job eventlog that is recovered by the job manager?
Possibly that could just be opened as an issue for now and we could circle back on that later if it's the desired behavior (I'm not super clear if it is?)
src/modules/job-manager/annotate.c
Outdated
if (update_annotation_recursive (job,
                                 job->annotations,
                                 annotations) < 0) {
    flux_log_error (h, "update_annotation_recursive");
Since both places where annotate_update() is called log a message on error, this log message is redundant.
src/modules/job-manager/annotate.c
Outdated
if (annotations_update (ctx->h, job, annotations) < 0) {
    flux_log_error (h, "%s: annotations_update", __FUNCTION__);
    goto error;
}
Maybe logging could be skipped here since the user is going to get a failure response?
src/modules/job-info/job_state.c
Outdated
json_t *value;

if (!json_is_array (annotations)) {
    flux_log (ctx->h, LOG_ERR, "%s: annotations EPROTO", __FUNCTION__);
Improve error message? Maybe "annotations event is not an array" and skip the __FUNCTION__?
flux_jobid_t id;
json_t *aValue;

if (parse_annotation (value, &id, &aValue) < 0) {
How about "job %ju annotation parse error"
but I won't know the ID :-) (it's parsed in this line), I'll adjust to "annotation parse error", to differentiate from the error above it
src/cmd/flux-job.c
Outdated
if (!(h = flux_open (NULL, 0)))
    log_err_exit ("flux_open");

id = parse_arg_unsigned (argv[optindex], "id");
Job IDs should now be parsed with parse_jobid() to support F58 and other representations.
Sounds good. We'll have to update RFC27 as well. The key points in there:
Shall I squash?
sure!
e473fd6 to b5688d0
squashed and re-pushed
I get a reproducible failure in
Maybe some synchronization is needed between the cancellation of jobs and the next test?
test_expect_success 'job-manager: cancel all jobs' '
	flux job cancel $(cat job1.id) &&
	flux job cancel $(cat job4.id) &&
	flux job cancel $(cat job3.id) &&
	flux job cancel $(cat job4.id)
	flux job cancel $(cat job1.id)
Just a suggestion, I think flux job cancelall takes a --states argument, so this test could be shortened and made self-documenting with something like (untested):
flux job cancelall --states=SCHED -f &&
flux job cancelall -f
I think you're right. I removed the "job states" tests b/c I thought them superfluous, but they are probably necessary for synchronization.
I'm going to push @chu11's latest work here. All I did was run
b5688d0 to f6e7e56
Setting MWP. Thanks!
jansson functions do not reliably set errno, so call flux_log() instead of flux_log_error(). In addition, to reduce excessive calls to flux_log(), place parsing of job state transitions into its own function.
Support a new job-annotations event so other modules, such as job-info, can be aware of changes to job annotations.
Listen to the newly created job-annotations event and make job annotations available to be returned by job listing services.
Add flux job list annotation tests to t2203-job-manager-dummysched-single and t2204-job-manager-dummysched-unlimited. Add list attribute tests to t/t2230-job-info-list.
Add job-manager.annotate RPC target, allowing users to be able to annotate their own jobs. Move several convenience functions from alloc.c to annotate.c, adjusting interfaces. Update unit tests accordingly.
Support new annotate command to annotate a specific job id.
Add new tests for job-manager.annotate service.
Add support for handling arbitrary annotations set up by the scheduler / user by checking for the special prefix "annotations." in a formatting field. Ensure that any arbitrary missing field will return an empty string through a special AnnotationsInfo class created based on the annotations dictionary returned from job-info. Closes flux-framework#2608
Add flux jobs annotation tests to t2203-job-manager-dummysched-single, t2204-job-manager-dummysched-unlimited, and t2205-job-manager-annotate. Add output header and corner case tests to t2800-job-cmds.
For a few specially defined annotations in RFC27, output special case headers for those cases. Update header tests in t/t2800-jobs-cmd.t accordingly.
As a convenience to users, support the "sched" field as the same as "annotations.sched" and the "user" field as the same as "annotations.user". Update tests and documentation accordingly.
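The convenience aliases described in the last commit message could be sketched roughly as below. This is an illustration of the idea only, not the code in flux-jobs: the names ALIASES and expand_field are hypothetical, and the sub-key reason is a made-up example key.

```python
# Short field prefixes mapped to their full annotation paths, per the commit
# message above. The mapping name and helper are hypothetical.
ALIASES = {
    "sched": "annotations.sched",
    "user": "annotations.user",
}


def expand_field(field):
    """Expand a leading 'sched' or 'user' component to its 'annotations.' form."""
    head, sep, rest = field.partition(".")
    if head in ALIASES:
        return ALIASES[head] + sep + rest
    return field


print(expand_field("sched.reason"))      # expands the "sched" prefix
print(expand_field("annotations.user"))  # already fully qualified, left alone
print(expand_field("userid"))            # unrelated field, left alone
```

Matching on the first dot-separated component, rather than on a plain string prefix, keeps unrelated fields that merely start with "user" from being rewritten.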
f6e7e56 to 7da96d2
Codecov Report
@@ Coverage Diff @@
## master #3065 +/- ##
==========================================
+ Coverage 81.18% 81.19% +0.01%
==========================================
Files 285 286 +1
Lines 44233 44398 +165
==========================================
+ Hits 35911 36051 +140
- Misses 8322 8347 +25
Built on top of PR #3062, this PR completes phase 1 job annotation support in flux-core.
job-manager - support a new annotations event, which will publish annotations
job-info - read in annotations events from the above for querying by job-list services
job-manager - support a new user annotation service, allowing users to annotate their jobs
flux job annotate
flux job annotate id key value syntax, and value can be - for stdin.
user object in the annotations.
flux-jobs - support listing any annotations
flux-jobs has a rule that any annotation that doesn't exist results in an empty string. This holds even if the user inputs bogus annotations. The reason we have to do this is b/c there is no way to know what annotations are legal or illegal, given any scheduler or user can come up with any annotations they want.
sched and/or user? So the user doesn't have to type "annotations." for each of them?
An example output for:
example w/ output headers (sorta ugly for this example)
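The empty-string rule described above can be sketched with a small wrapper class. This is a minimal illustration under assumptions, not the AnnotationsInfo class from this PR: the implementation details are guessed, and the annotation keys (resource_summary, bogus, note) are made-up examples.

```python
import json


class AnnotationsInfo:
    """Hypothetical stand-in: missing annotations format as empty strings."""

    def __init__(self, data=None):
        # Anything that is not a dict (including None) is treated as empty.
        self._data = data if isinstance(data, dict) else {}

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        value = self._data.get(name)
        if value is None or isinstance(value, dict):
            # Wrap nested objects; missing keys become an empty wrapper.
            return AnnotationsInfo(value)
        return value

    def __str__(self):
        return json.dumps(self._data) if self._data else ""

    def __format__(self, spec):
        return format(str(self), spec)


annotations = AnnotationsInfo({"sched": {"resource_summary": "rank0/core0"}})
fmt = "{a.sched.resource_summary}|{a.sched.bogus}|{a.user.note}"
print(fmt.format(a=annotations))
```

Because str.format resolves dotted names with attribute lookups, returning an empty wrapper for unknown keys lets arbitrarily deep bogus paths collapse to an empty string instead of raising an error.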