Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KVS: refactor kvs lookup replays #1696

Merged
merged 9 commits into from Oct 5, 2018

Conversation

Projects
None yet
4 participants
@chu11
Copy link
Contributor

chu11 commented Oct 3, 2018

Per discussion in #1318, refactor replays of lookups to use flux msg aux instead of rpc arg parameter.

This cleans up the code a bit and IMO makes it more clear, but the big win is the removal of the msg_cb_handler mini-library in the KVS module.

This PR follows #1694 and should be merged after it.

@codecov-io

This comment has been minimized.

Copy link

codecov-io commented Oct 3, 2018

Codecov Report

Merging #1696 into master will decrease coverage by 0.01%.
The diff coverage is 86.88%.

@@            Coverage Diff             @@
##           master    #1696      +/-   ##
==========================================
- Coverage    79.3%   79.28%   -0.02%     
==========================================
  Files         186      185       -1     
  Lines       35059    35037      -22     
==========================================
- Hits        27804    27780      -24     
- Misses       7255     7257       +2
Impacted Files Coverage Δ
src/modules/kvs/lookup.c 82.01% <ø> (-0.23%) ⬇️
src/modules/kvs/kvs.c 66% <82.75%> (-0.11%) ⬇️
src/common/libflux/handle.c 83.68% <86.66%> (+0.07%) ⬆️
src/modules/kvs/waitqueue.c 85.84% <94.11%> (+1.84%) ⬆️
src/common/libflux/mrpc.c 85.49% <0%> (-1.15%) ⬇️
src/broker/module.c 83.78% <0%> (-0.28%) ⬇️
src/common/libflux/message.c 80.95% <0%> (-0.24%) ⬇️
src/broker/modservice.c 79.8% <0%> (+0.96%) ⬆️

chu11 added some commits Oct 2, 2018

common/libflux: Add flux_requeue_nocopy
Add flux_requeue_nocopy(), equivalent to flux_requeue() but takes
ownership of the message passed in by the caller.
modules/kvs: Add aux set/get to msg_cb_handler
Add msg_cb_handler_msg_aux_set() and msg_cb_handler_msg_aux_get()
to msg_cb_handler to allow user to set/get aux data to the message
stored within the msg cb handler.
modules/kvs: Add aux set/get to wait_t
Add wait_msg_aux_set() and wait_msg_aux_get() to internal wait_t
structure to allow user to set/get aux data to the message
stored within a wait_t.
modules/kvs: Refactor lookup/watch replay
When a lookup request is replayed, instead of passing the lookup
handle via the additional 'arg' parameter, pass it via auxiliary
data in the rpc message.
modules/kvs: replay msg instead of calling cb
On getroot stalls, instead of calling the original callback
function directly (via msg_cb_handler), instead simply replay
the original message via flux_requeue_nocopy().

Fixes #1318
modules/kvs: Remove lookup_t aux data support
Remove aux data support in a lookup_t during create time and
remove lookup_get_aux_data().  aux data in lookup is no longer
used.
modules/kvs: Refactor waitqueue msg handler use
Refactor waitqueue to internally manage msg callback handler
instead of using msg_cb_handler.

This commit effectively reverts
0c3dfce

@chu11 chu11 force-pushed the chu11:issue1318 branch from 096d1dc to f4ae229 Oct 5, 2018

@chu11

This comment has been minimized.

Copy link
Contributor Author

chu11 commented Oct 5, 2018

rebased on master after the #1694 merge

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Oct 5, 2018

Nice cleanup!

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Oct 5, 2018

Hm, your build hit the same hangs as @SteVwonder saw in #1699 at

 python/t0008-jsc.py:  PASS: N=4   PASS=4   FAIL=0 SKIP=0 XPASS=0 XFAIL=0

I'll try restarting builders for now, but we may have to reopen #1699 and follow up (now that we know it can be reproduced in a PR I guess)

@chu11

This comment has been minimized.

Copy link
Contributor Author

chu11 commented Oct 5, 2018

It seems to have hung again:

       python/t0008-jsc.py:  PASS: N=4   PASS=4   FAIL=0 SKIP=0 XPASS=0 XFAIL=0

No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.
Check the details on how to adjust your build configuration on: https://docs.travis-ci.com/user/common-build-problems/#Build-times-out-because-no-output-was-received
The build has been terminated

Time to re-open #1699?

@garlick

This comment has been minimized.

Copy link
Member

garlick commented Oct 5, 2018

This looks great @chu11!

@chu11

This comment has been minimized.

Copy link
Contributor Author

chu11 commented Oct 5, 2018

stephen seems to have his debug logs, so restarting the builder hoping it'll pass this time

@chu11

This comment has been minimized.

Copy link
Contributor Author

chu11 commented Oct 5, 2018

looks like it passed this time

@grondo grondo merged commit 3802e87 into flux-framework:master Oct 5, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.