Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libflux: change flux_respond(), flux_respond_error() plus misc fixes #2120

Merged
merged 14 commits into from Apr 17, 2019

Commits on Apr 17, 2019

  1. testsuite: fix missing hwloc test prereq

    Problem: t2005-hwloc-basic.t fails the "hwloc: internal
    aggregate-load cmd works" test if jq is not installed
    
    Add missing HAVE_JQ.
    
    Fixes flux-framework#2116
    garlick committed Apr 17, 2019
    Copy the full SHA
    2f9b67f View commit details
    Browse the repository at this point in the history
  2. libflux: drop va string from flux_respond_error()

    Problem: the varargs error string results in extra
    lines of code in common situations where an error
    string is only set conditionally.
    
    Switch to a (const char *errstr) argument.
    
    Fixes flux-framework#2088
    garlick committed Apr 17, 2019
    Copy the full SHA
    e98f00c View commit details
    Browse the repository at this point in the history
  3. broker: add human-readable message to ENOSYS response

    Problem: when a service is not registered, the
    ENOSYS (Function not implemented) response is a bit
    unhelpful.
    
    When the broker cannot find a service to route a request
    to, add a textual error message to the ENOSYS response, e.g.
      No service matching foo.bar is registered
    garlick committed Apr 17, 2019
    Copy the full SHA
    524adc9 View commit details
    Browse the repository at this point in the history
  4. modules/job-ingest: simplify submit error handling

    Problem: the job-manager.submit error path includes
    code to fill in the future error string if missing,
    but the future code already fills in with flux_strerror().
    
    Simplify this code to pass through the error string
    unconditionally.
    garlick committed Apr 17, 2019
    Copy the full SHA
    e536e0d View commit details
    Browse the repository at this point in the history
  5. cmd/flux-job: display error string on submit failure

    Problem: if job-manager is not loaded, flux job submit
    complains that the job-ingest is not loaded.
    
    Pass through the error string without interpretation,
    instead of special casing ENOSYS.
    
    Fixes flux-framework#2113
    garlick committed Apr 17, 2019
    Copy the full SHA
    969d336 View commit details
    Browse the repository at this point in the history
  6. modules/kvs: only unique keys in setroot event

    Problem: duplicate keys can appear in the KVS setroot
    event message, needlessly increasing message size.
    
    This was noted when multiple events to the same
    job eventlog were batched by the job-manager.
    
    Ensure key list is unique.
    
    Fixes flux-framework#2108
    garlick committed Apr 17, 2019
    Copy the full SHA
    2e8de03 View commit details
    Browse the repository at this point in the history
  7. libflux: drop errnum from flux_respond()

    Problem: flux_respond() can return errors as well as
    success, which sometimes leads to awkward and unsafe
    code, such as code that depends on errno being set to zero,
    when it is undefined *unless* there is an error.
    
    Drop errnum from flux_respond().  Now it should only
    be used to return success.
    
    Update tests and users.
    
    Fixes flux-framework#2089
    garlick committed Apr 17, 2019
    Copy the full SHA
    f946e4a View commit details
    Browse the repository at this point in the history
  8. doc/flux_respond(3): drop errnum from flux_respond()

    Update man page to match changes made to the function.
    garlick committed Apr 17, 2019
    Copy the full SHA
    beafb15 View commit details
    Browse the repository at this point in the history
  9. doc/flux_respond(3): drop va from flux_respond_error

    Update man page to match changes made to function
    prototype.
    garlick committed Apr 17, 2019
    Copy the full SHA
    dae0ae3 View commit details
    Browse the repository at this point in the history
  10. don't retry failed response

    Problem: When flux_respond_*() fails in a service's
    request handler, the error should not be handled by
    calling flux_respond_error(), since that is just as
    likely to fail as the first call.
    
    This unlikely error will probably be the result of a
    coding error, or a fatal error on the flux_t handle.
    Reasonable coping mechanisms are stopping the reactor,
    and/or by logging it at LOG_ERR level.
    
    This pattern occurs in the broker and the KVS.
    Log the error instead of retrying.
    garlick committed Apr 17, 2019
    Copy the full SHA
    c9e7bd9 View commit details
    Browse the repository at this point in the history
  11. rework response error handling

    Problem: the new footprint of flux_respond() and
    flux_respond_error() favor a pattern for response
    handlers that separates error and success code paths.
    The error path ends in flux_respond_error() and the
    success path ends in flux_respond().  However, some
    services use a different pattern that combines error
    and success code paths.  This pattern is now a poor
    fit for the respond functions.
    
    Replace this pattern in the content-cache, the broker
    log service, kvs, sched-simple, cron, and aggregator.
    garlick committed Apr 17, 2019
    Copy the full SHA
    3641c19 View commit details
    Browse the repository at this point in the history
  12. testsuite: add test prog for arbitrary rpc payload

    Add a test helper that sends a single RPC, using topic
    string from command line and payload from stdin.  Any
    response payload is displayed on stdout.
    
    $ jq -j -c -n  "{name:\"rank\"}" | t/request/rpc attr.get
    
    If a numerical argument follows the topic string, it indicates
    that a failure expected with errno == the number.  In that case,
    a failure response with the expected errno is treated as success,
    and a success response, or a failure response with the wrong
    errno is treated as failure.
    garlick committed Apr 17, 2019
    Copy the full SHA
    94f0b39 View commit details
    Browse the repository at this point in the history
  13. Copy the full SHA
    b47673f View commit details
    Browse the repository at this point in the history
  14. testsuite: add coverage of malformed requests

    Add test coverage to some request handlers for
    malformed request payloads, focusing on those
    whose response handling logic was just reworked.
    garlick committed Apr 17, 2019
    Copy the full SHA
    379b360 View commit details
    Browse the repository at this point in the history