job-info: Return ENODATA when reaching the end of eventlogs #2374

Merged
merged 14 commits into from Sep 20, 2019

Commits on Sep 19, 2019

  1. t/t2204-job-info: Fix output file renaming

    Fix naming of output files in tests, which were out of order and
    sometimes incorrectly named.
    chu11 committed Sep 19, 2019
    7f88756
  2. job-info: Fix tabbing

    chu11 committed Sep 19, 2019
    2826322
  3. 5dfc85d
  4. job-info: Cleanup w->path checks

    Remove check for w->path being NULL, as it is always required to
    be non-NULL.
    chu11 committed Sep 19, 2019
    6b2aa1b

Commits on Sep 20, 2019

  1. t: Update helper functions in job-info tests

    In t2204-job-info.t and t2205-job-info-security.t, fix up helper
    functions in tests for correctness and clarity.
    
    Includes:
    
    - For portability, use "sleep 300" instead of "sleep inf"
    - Rename test jobspecs from "test" and "test-all" to "sleeplong" and
      "sleeplong-all-rsrc"
    - In helper functions, declare local variables with "local"
    - In helper functions, check for failures by chaining commands
    chu11 committed Sep 20, 2019
    0489c38
  2. job-info: Return ENODATA at end of main eventlog

    When watching the main job eventlog, return ENODATA to the user
    upon reaching the end of the eventlog (i.e. the "clean" event).
    chu11 committed Sep 20, 2019
    78e5431
  3. t/t2204-job-info: Update tests for watch change

    Update job-info watch tests for the behavior change to watch, in
    which ENODATA is returned immediately upon reaching the end of the
    main job eventlog.  Add new tests for coverage as well.
    chu11 committed Sep 20, 2019
    225e870
  4. job-info: Update comments for no-hang possibility

    As a consequence of the main eventlog returning ENODATA at the
    end of the eventlog, a hang can no longer occur while waiting for
    a guest namespace to be created.
    
    Update comments to note this fact.
    
    Closes flux-framework#2361
    chu11 committed Sep 20, 2019
    6ca76ed
  5. t/t2204-job-info: Test never started job

    Add test for monitoring a guest namespace eventlog when the guest
    namespace will never be created.
    chu11 committed Sep 20, 2019
    0a51df7
  6. job-info: Do not watch guest eventlog in main KVS

    Once it has been determined that a guest eventlog has been migrated
    to the primary KVS namespace, there is no need to "watch" the eventlog
    in that namespace.  We know that the eventlog is complete, so instead
    "lookup" the eventlog and send it to the caller.
    chu11 committed Sep 20, 2019
    fb915e0
  7. job-info: Return ENODATA at end of guest eventlog

    When watching a guest eventlog in the guest namespace, return ENODATA
    when the guest eventlog has reached its end.  We know the guest
    eventlog has reached the end when the guest namespace has been removed
    and ENOTSUP is received.
    
    Closes flux-framework#2338
    chu11 committed Sep 20, 2019
    be8819d
  8. t/t2204-job-info: Update test comments for "hangs"

    Adjust comments on tests, as wait-event no longer "hangs" when an
    event doesn't happen, but rather it "times out" if the event doesn't
    occur in the time period expected.
    chu11 committed Sep 20, 2019
    8f7c974
  9. job-info: Workaround namespace remove race

    In the kvs-watch module, there is a small possibility that the
    namespace-remove event is received before the last bits of data
    from a watch are sent to a watcher.
    
    - user writes to eventlog, kvs commits write, sends setroot event
    - user deletes namespace, kvs sends namespace-remove event
    - kvs-watch receives setroot event, does lookupat on key
    - kvs-watch receives namespace-remove event BEFORE lookupat returns,
      sends ENOTSUP to watchers
    - response from lookupat arrives after ENOTSUP sent
    
    To work around this, in the job-info module, always track how much data
    from the guest eventlog has been sent.  When the namespace has been
    removed, fall through from the guest namespace to the primary KVS
    namespace, and check that all data that could be sent to the caller
    has been sent.
    
    Fixes flux-framework#2386
    chu11 committed Sep 20, 2019
    78cd709
  10. cmd/flux-job: Update to not use event-watch-cancel

    With the primary job eventlog and guest eventlogs now automatically
    sending ENODATA upon completion, attach no longer needs to call
    flux_job_event_watch_cancel() under several circumstances.
    
    In particular:
    
    - it does not need to call flux_job_event_watch_cancel()
      after receiving the "clean" event in the primary job eventlog.
    - it does not need to monitor EOF counts in guest.output; the
      output will complete when it reaches the end.
    chu11 committed Sep 20, 2019
    68ea5be