job-list: do not assume alloc event context always contains annotations #4907

Merged

Member

@chu11 chu11 commented Feb 2, 2023

Per discussion in #4906

So far I have only added a regression test; the wider coverage tests discussed in #4906 are not included yet.

I hope to have those coverage tests ready soon. In case they can't be finished in a timely manner, I wanted to make sure we at least get this fix in before the next release. We can always add the extra tests later.

@chu11 chu11 force-pushed the issue4906_job_list_alloc_bypass branch 2 times, most recently from e764a97 to 1d0386c Compare February 2, 2023 16:45
Contributor

grondo commented Feb 2, 2023

Sounds good to me!

Member Author

chu11 commented Feb 2, 2023

Added a coverage test on top. Instead of the dump file or jobtap plugin approach mentioned in #4904, I just "hand" wrote an eventlog over an existing one. That way we can easily add extra events in the future (vs. the dump file approach).

@@ -0,0 +1,25 @@
#!/bin/bash -e
Contributor

Since you have the follow-on commit, which adds a regression test to the job-list module, maybe this separate regression test isn't required? I ask because the t4000-issues test runs all the reproducers serially and is really slow at this point, so avoiding another flux start test there would keep the testsuite a little leaner.

Member Author

@chu11 chu11 Feb 3, 2023

Actually, the follow-on test doesn't cover it exactly: the regression test exercises an alloc event with context {"bypass":1}, whereas the new test uses alloc {"annotations":<something>, "etc": 1}.

But I could tweak it accordingly.

Contributor

Ah, that might be good just to save some unnecessary work for the issues test driver.

Member Author

Ahhh, this was a good idea. It turned out there was a corner case with exception notes (which are optional).

@grondo grondo added this to the flux-core v0.47.0 milestone Feb 3, 2023
@chu11 chu11 force-pushed the issue4906_job_list_alloc_bypass branch from 2575eb7 to 38dc79f Compare February 3, 2023 23:17
Member Author

chu11 commented Feb 3, 2023

Re-pushed, removing the regression test and adding a new test in t2260-job-list.t instead. I decided to add another test to check for optional context fields. The only other ones I found were userid/note in exceptions, and it turned out there was a corner case in job-list for that, so I fixed that too.

Contributor

@grondo grondo left a comment

This LGTM. Nice job catching the exception note corner case!

Problem: When replaying the eventlog from the KVS, the job-list module
assumes an alloc event context must have an annotations field.  This
assumption is invalid: an alloc context may omit the annotations field
and may contain other fields.

Solution: Do not return EPROTO error if an alloc event context does not
contain annotations.

Fixes flux-framework#4906
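
The behavior change in this commit message can be illustrated with a minimal Python sketch. This is not the job-list module's code (which is written in C); `parse_alloc_context` and `ProtocolError` are hypothetical names used only for illustration, and the sketch assumes the event context has already been decoded into a dictionary.

```python
import errno


class ProtocolError(Exception):
    """Hypothetical stand-in for an EPROTO failure."""
    code = errno.EPROTO


def parse_alloc_context(context):
    """Return the annotations from an alloc event context, or None.

    A context that is not an object is still treated as a protocol
    error, but a missing "annotations" key is not. Unrelated keys
    such as "bypass" are simply ignored.
    """
    if not isinstance(context, dict):
        raise ProtocolError("alloc context must be an object")
    # "annotations" is optional, so do not fail when it is absent.
    return context.get("annotations")
```

With this approach, a context of `{"bypass": 1}` yields no annotations instead of an error, and a context carrying both annotations and extra keys yields only the annotations.
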

Problem: Exception notes are optional, but this is not handled
when retrieving an exception note for a job-list request.

Solution: Ensure that an exception note is non-NULL when returning
it to the user.
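
As a rough Python sketch of the idea (again, not the C code in job-list): `exception_summary` is a hypothetical helper, the returned key names are illustrative, and falling back to an empty string is an assumption about what "non-NULL" means here.

```python
def exception_summary(context):
    """Build the exception fields returned to a caller.

    The "note" key is optional in an exception event context, so a
    missing (or non-string) note is replaced with an empty string
    rather than being passed through as None.
    """
    note = context.get("note")
    return {
        "exception_type": context.get("type"),
        "exception_severity": context.get("severity"),
        # Assumption: an empty string is the fallback for a missing note.
        "exception_note": note if isinstance(note, str) else "",
    }
```
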

Problem: There are several eventlog parsing corner cases that are not
covered.

Solution: Add several eventlog parsing corner case tests.  Specifically
add:

- missing annotations in alloc event context
- missing note in exception event context
- extraneous keys added to all event contexts
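
The three cases listed above could be captured in a hand-written eventlog along the following lines. This is a simplified Python sketch, not the test added to t2260-job-list.t: it assumes each eventlog entry is a JSON object with `timestamp`, `name`, and an optional `context`, and the helper names, timestamps, and field values are made up for illustration. A real job eventlog contains more events than shown here.

```python
import json


def event(timestamp, name, context=None):
    """Encode one eventlog entry as a single line of JSON."""
    entry = {"timestamp": timestamp, "name": name}
    if context is not None:
        entry["context"] = context
    return json.dumps(entry)


def corner_case_eventlog():
    """Return eventlog text that exercises the three corner cases."""
    extra = {"etc": 1}  # extraneous key added to every context
    lines = [
        event(1.0, "submit", {"userid": 1000, "urgency": 16, "flags": 0, **extra}),
        # alloc context without an "annotations" key
        event(2.0, "alloc", dict(extra)),
        # exception context without a "note" key
        event(3.0, "exception", {"type": "cancel", "severity": 0, **extra}),
    ]
    return "\n".join(lines) + "\n"
```

Writing the eventlog by hand keeps it easy to append further events later, which is the advantage mentioned earlier in the conversation over the dump file approach.
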

codecov bot commented Feb 4, 2023

Codecov Report

Merging #4907 (c72f485) into master (c72f485) will not change coverage.
The diff coverage is n/a.

❗ Current head c72f485 differs from pull request most recent head c498ad8. Consider uploading reports for the commit c498ad8 to get more accurate results

@@           Coverage Diff           @@
##           master    #4907   +/-   ##
=======================================
  Coverage   82.90%   82.90%           
=======================================
  Files         426      426           
  Lines       75251    75251           
=======================================
  Hits        62386    62386           
  Misses      12865    12865           

@mergify mergify bot merged commit 04d8826 into flux-framework:master Feb 4, 2023
@chu11 chu11 deleted the issue4906_job_list_alloc_bypass branch February 4, 2023 01:41