Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

job-info: Return ENODATA when reaching the end of eventlogs #2374

Merged
merged 14 commits into from Sep 20, 2019

Conversation

@chu11
Copy link
Contributor

commented Sep 13, 2019

Per discussion #2338. This is built on top of #2366.

  • It is easy to "reach the end" of

    • the main job eventlog (event "clean")
    • a guest eventlog in the guest namespace
    • a guest eventlog copied into the primary KVS namespace
  • It is hard to know how to "reach the end" of a hypothetical namespace in the primary KVS that we don't know about. So I did not support that case. In that situation, the wait-event watches until it receives a cancellation notice.

  • I don't return any events after "clean" in the main job eventlog. So if the user were to do something stupid like, "flux kvs eventlog append" on it, that event won't show up via an eventlog watch.

  • As a side effect of this work #2356 and #2361 are basically closed too.

  • Once finishing this work, removed the now unncessary calls to "eventlog_watch_cancel" in cmd/flux-job.

  • I realized there is a race condition (although never seen) where an entry placed into a guest eventlog will not be seen by job-info. The race is:

    • user write to eventlog, kvs completes write, sends setroot event

    • user deletes namespace, kvs sends namespace-remove event kvs-watch

    • kvs-watch receives setroot event, does lookupat on key

    • kvs-watch receives namespace-remove event BEFORE lookupat returns, sends ENOTSUP to watchers

    • response from lookupat arrives after ENOTSUP sent

    • I elected to solve this by having watched eventlogs in the guest namespace ALWAYS fallthrough to the main KVS namespace afterwards. Some offsets checks ensure that all data in the guest eventlog has been read, and if it hasn't, send it over.

@codecov-io

This comment has been minimized.

Copy link

commented Sep 13, 2019

Codecov Report

Merging #2374 into master will decrease coverage by <.01%.
The diff coverage is 79.51%.

@@            Coverage Diff             @@
##           master    #2374      +/-   ##
==========================================
- Coverage   80.84%   80.83%   -0.01%     
==========================================
  Files         218      218              
  Lines       34540    34645     +105     
==========================================
+ Hits        27925    28007      +82     
- Misses       6615     6638      +23
Impacted Files Coverage Δ
src/shell/output.c 73.61% <79.2%> (-3.02%) ⬇️
src/cmd/flux-job.c 84.79% <79.26%> (-0.34%) ⬇️
src/modules/job-info/guest_watch.c 74.14% <79.48%> (+2.55%) ⬆️
src/modules/job-info/watch.c 69.94% <81.48%> (+0.9%) ⬆️
src/modules/connector-local/local.c 73.11% <0%> (-1.17%) ⬇️
src/common/libutil/veb.c 98.42% <0%> (-0.53%) ⬇️
src/common/libflux/message.c 81.02% <0%> (ø) ⬆️
... and 5 more
@chu11 chu11 referenced this pull request Sep 19, 2019
@chu11 chu11 force-pushed the chu11:shell_live_output_issue2338 branch from 983cd4a to 6cc3ac3 Sep 19, 2019
chu11 added 4 commits Sep 12, 2019
Fix naming of output files in tests, which were out of order and
sometimes incorrectly named.
Remove check for w->path being NULL, as it is always required to
be non-NULL.
@chu11 chu11 force-pushed the chu11:shell_live_output_issue2338 branch from 6cc3ac3 to 1320444 Sep 19, 2019
@chu11

This comment has been minimized.

Copy link
Contributor Author

commented Sep 19, 2019

rebased on master and re-organized so that this can go in without #2366.

@garlick

This comment has been minimized.

Copy link
Member

commented Sep 19, 2019

It is hard to know how to "reach the end" of a hypothetical namespace in the primary KVS that we don't know about. So I did not support that case. In that situation, the wait-event watches until it receives a cancellation notice.

To clarify: the rule from RFC 21 is that the job directory is no longer written to after the job reaches INACTIVE state, so in theory you could return ENODATA on any eventlog under the job directory once INACTIVE has been reached.

@chu11

This comment has been minimized.

Copy link
Contributor Author

commented Sep 19, 2019

To clarify: the rule from RFC 21 is that the job directory is no longer written to after the job reaches INACTIVE state, so in theory you could return ENODATA on any eventlog under the job directory once INACTIVE has been reached.

Understood. "hard to know" wasn't probably the right word, perhaps I should have said, "it's a lot of work to know": monitor job state transitions, but that's only if job is active, so gotta check main eventlog first (I think?) if job active, hopefully nothing too racy there, etc.

@garlick

This comment has been minimized.

Copy link
Member

commented Sep 20, 2019

Pondering if there's anything job manager can do to make life easier here...

Maybe that doesn't have to hold this up though.

@garlick

This comment has been minimized.

Copy link
Member

commented Sep 20, 2019

I Have to head home but I'll plan to review some more tonight/tomorrow AM.

@chu11 chu11 force-pushed the chu11:shell_live_output_issue2338 branch from 1320444 to 4c66e7b Sep 20, 2019
@chu11

This comment has been minimized.

Copy link
Contributor Author

commented Sep 20, 2019

checked diff coverage, 79% seems good. Missed "impossible" errors or very racy bits.

re-pushed, just noticed a typo in a comment.

@chu11

This comment has been minimized.

Copy link
Contributor Author

commented Sep 20, 2019

Pondering if there's anything job manager can do to make life easier here...

could job state transitions be put into the eventlog? or perhaps another eventlog with a known end event?

Maybe that doesn't have to hold this up though.

That was my thought. Adding a lot of complexity for a feature we have no use case for.

Copy link
Member

left a comment

First pass review. I'm a bit lost in the weeds so I'll have another pass when I get to work.

# yet. Then we cancel the second job before we know it has started.

test_expect_success NO_CHAIN_LINT 'flux job wait-event -p guest.exec.eventlog works (never start job)' '
jobidall=$(submit_job_live test-all.json)

This comment has been minimized.

Copy link
@garlick

garlick Sep 20, 2019

Member

Couple suggestions on submit_job_live() and other functions at the top of this test script:

  • check for failure in commands, e.g. use && chain internally
  • rename test.json to sleeplong.json or similar and use integer sleep seconds rather than inf (portability)
  • declare local variables to avoid unintended side-effects

(I double checked that tests are still passing with && chain in submit_job() and submit_job_wait())

* events in the guest eventlog, assume job was moved into
* the main namespace before we began watching in the
* guest namespace.
/* Guest namespace has been removed. If we have read

This comment has been minimized.

Copy link
@garlick

garlick Sep 20, 2019

Member

Commit a3967b7 is mostly overwritten by later commits (squash?)

This comment has been minimized.

Copy link
@garlick

garlick Sep 20, 2019

Member

The overwritten comment talks about handling the case of an empty eventlog. I don't think that can happen. Eventlogs are created when the first event is appended (perhaps that needs calling out in RFC 18?)

This comment has been minimized.

Copy link
@chu11

chu11 Sep 20, 2019

Author Contributor

Yeah, it is overwritten within this commit series. I decided to leave the commit series b/c of how some individual issues were fixed along the way. Could squash.

I assumed there could be cases of mkdir without an append. If that's not possible, could close that issue and squash together. Let me see about how to re-order squash the commits.

This comment has been minimized.

Copy link
@garlick

garlick Sep 20, 2019

Member

I assumed there could be cases of mkdir without an append.

Did you mean an empty put?

This comment has been minimized.

Copy link
@chu11

chu11 Sep 20, 2019

Author Contributor

Did you mean an empty put?

What I meant to say was "mkdir and nothing written afterwards". But yeah, an empty put.

/* Guest namespace has been removed and eventlog has been
* moved to primary KVS namespace. Fallthrough to primary
* KVS namespace to ensure that we have output all data to
* the user. This is very small race possibility where

This comment has been minimized.

Copy link
@garlick

garlick Sep 20, 2019

Member

Is the "very small race possibility" documenting a known race or describing code that fixes the race? It's right on top of another comment block describing what looks like a fix for a different race.

This comment has been minimized.

Copy link
@chu11

chu11 Sep 20, 2019

Author Contributor

let me see how fixing up those comments. There are really two races, and both are fixed in the same place.

@garlick

This comment has been minimized.

Copy link
Member

commented Sep 20, 2019

could job state transitions be put into the eventlog? or perhaps another eventlog with a known end event?

You mean so we aren't replicating state machine details across modules? At one point I had extracted the state machine code in libjob but I think that was part of a PR that didn't go in...

@garlick

This comment has been minimized.

Copy link
Member

commented Sep 20, 2019

I never got back to #1665, which was supposed to be a single job "state watch" service offered by the job manager. It seemed like between the eventlog and the event state transition events, we had it covered. Now I wonder if this would be helpful to reduce complexity in job-info?

(sorry, definitely off topic for this PR. Maybe just something to talk through next week)

chu11 added 8 commits Sep 20, 2019
In t2204-job-info.t and t2205-job-info-security.t, fixup helper
functions in tests for correctness and clarity.

Includes:

- For portability, use "sleep 300" instead of "sleep inf"
- Rename test jobspecs from "test" and "test-all", to "sleeplong" and
  "sleeplong-all-rsrc".
- In helper functions, declare local variables local
- In helper functions, check for failures by chaining commands
When watching the main job eventlog, when reaching the end of
the eventlog (i.e. "clean" event), return ENODATA to the user.
Update job-info watch tests for behavior change to watch, in which
ENODATA is immediately returned upon reaching the end of the main
job eventlog.  Add new tests for coverage as well.
As a consequence of the main eventlog returning ENODATA at the
of the eventlog, a hang can no longer occur while waiting for
a guest namespace to be created.

Update comments to note this fact.

Closes #2361
After it has been determined that a guest eventlog has been migrated
to the primary KVS namespace, there is no need to "watch" the eventlog
in the namespace.  We know that the eventlog is complete, so instead
"lookup" the eventlog and send the eventlog to the caller.
Add test for monitoring a guest namespace eventlog, but the guest
namespace will never be created.
When watching a guest eventlog in the guest namespace, return ENODATA
when the guest eventlog has reached its end.  We know the guest
eventlog has reached the end when the guest namespace has been removed
and ENOTSUP is received.

Closes #2338
Adjust comments on tests, as wait-event no longer "hangs" when an
event doesn't happen, but rather it "times out" if the event doesn't
occur in the time period expected.
chu11 added 2 commits Sep 13, 2019
In the kvs-watch module, there is a small possibility the namespace
remove event could be received before the last bits of data from
a watch are sent to a watcher.

- user write to eventlog, kvs commits write, sends setroot event
- user deletes namespace, kvs sends namespace-remove event
- kvs-watch receives setroot event, does lookupat on key
- kvs-watch receives namespace-remove event BEFORE lookupat returns,
  sends ENOTSUP to watchers
- response from lookupat arrives after ENOTSUP sent

To workaround this, in the job-info module, always track how much data
from the guest eventlog has been sent.  When the namespace has been
removed, fallthrough from the guest namespace to the primary KVS
namespace, and check that all data that could be sent to the caller
has been sent.

Fixes #2386
With the primary job eventlog and guest eventlogs now automatically
sending ENODATA upon completion, attach no longer needs to call
flux_job_event_watch_cancel() under several circumstances.

In particular:

- it does not need to call flux_job_event_watch_cancel()
  after receiving the "clean" event in the primary job eventlog.
- it does not need to monitor EOF counts in guest.output, the
  output will complete when it reaches the end.
@chu11 chu11 force-pushed the chu11:shell_live_output_issue2338 branch from 4c66e7b to 68ea5be Sep 20, 2019
@chu11

This comment has been minimized.

Copy link
Contributor Author

commented Sep 20, 2019

Re-pushed with fixes based on comments.

  • Added a new commit in the series that cleans up helper functions in t2204-job-info and t2205-job-info-security.

  • I nixed issue #2356, since we can assume no eventlog can be empty. The previous commit I had that fixed that issue is now gone.

  • I created issue #2386 to describe the race between kvs-watch and job-info.

  • The fix for #2338 (send ENODATA when eventlog at end) is spread out over multiple commits in this PR. The fix for issue #2386 fixes a specific issue that happens to (almost entirely) overwrite one of the commits for #2338. Since I fixed #2338 first and #2386 second, it didn't seem right to squash those commits together, so I left them separate. That keeps the progression of fixes in the commit log.

  • I modified the comments so they are hopefully more clear about racy bits in the code.

@codecov-io

This comment has been minimized.

Copy link

commented Sep 20, 2019

Codecov Report

Merging #2374 into master will increase coverage by 0.05%.
The diff coverage is 79.41%.

@@            Coverage Diff             @@
##           master    #2374      +/-   ##
==========================================
+ Coverage   80.87%   80.93%   +0.05%     
==========================================
  Files         222      222              
  Lines       34892    34901       +9     
==========================================
+ Hits        28218    28246      +28     
+ Misses       6674     6655      -19
Impacted Files Coverage Δ
src/cmd/flux-job.c 85.52% <100%> (+0.39%) ⬆️
src/modules/job-info/guest_watch.c 73.29% <76.92%> (+1.7%) ⬆️
src/modules/job-info/watch.c 69.94% <81.48%> (+0.9%) ⬆️
src/common/libflux/message.c 80.36% <0%> (-0.14%) ⬇️
src/cmd/flux-module.c 84.19% <0%> (+0.23%) ⬆️
src/broker/module.c 75.17% <0%> (+0.23%) ⬆️
src/broker/modservice.c 71.42% <0%> (+0.75%) ⬆️
src/modules/job-info/lookup.c 68.54% <0%> (+0.8%) ⬆️
... and 2 more
Copy link
Member

left a comment

This looks good to me now - approving. Thanks for the changes!

I'll go ahead and set merge-when-passing.

@mergify mergify bot merged commit e03e2c8 into flux-framework:master Sep 20, 2019
3 of 4 checks passed
3 of 4 checks passed
codecov/patch 79.41% of diff hit (target 80.87%)
Details
Summary 1 rule matches
Details
codecov/project 80.93% (+0.05%) compared to a494a7f
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@garlick garlick referenced this pull request Sep 30, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.