Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libflux: fix composite future implementation #1791

Merged
merged 4 commits into from Nov 1, 2018

Conversation

Projects
None yet
3 participants
@grondo
Copy link
Contributor

grondo commented Nov 1, 2018

Unfortunately, the composite future implementation was committed with only sanity unit tests, and as was determined by @garlick in #1750, the current code was broken for use with RPCs (and probably any other real-world usage).

The problem is that current executing context of the future on which the user may call flux_future_then(3) or flux_future_wait_for(3) was not properly propagated to all futures that may be involved in the composite. E.g. if a user composes a future from several back-to-back RPCs, then calls a blocking flux_future_get(3) on the composite, the get() will be sleeping on the "now" context reactor and associated "cloned" flux_t handle, but the predecessor futures will by default use the "main" flux_t handle, and thus those responses will be parked until flux_future_get(3) returns and the chain deadlocks.

The "fix" is to be sure to propagate the current context (reactor and flux_t handle, if any) to other composites children or prev/next futures at key points (creation, init callback, and flux_future_continue).

Also, it was apparent that the reference handling for flux_future_continue(3) wasn't clear from the docs, so this was clarified in the manpage and the future.h header.

There's some incremental development on t/rpc/chained.c here to make it easy to review. If the changes are ok, then we can squash that before merging.

Finally, it would be nice to flesh out the rest of the tests here (still missing tests for flux_future_or_then(3) and flux_future_wait_any/all(3). I can do that in a future PR if we just want to get these fixes in now.

@grondo grondo requested a review from garlick Nov 1, 2018

@codecov-io

This comment has been minimized.

Copy link

codecov-io commented Nov 1, 2018

Codecov Report

Merging #1791 into master will decrease coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff            @@
##           master   #1791      +/-   ##
=========================================
- Coverage    79.6%   79.6%   -0.01%     
=========================================
  Files         186     186              
  Lines       34560   34563       +3     
=========================================
+ Hits        27512   27514       +2     
- Misses       7048    7049       +1
Impacted Files Coverage Δ
src/common/libflux/composite_future.c 80.95% <100%> (+0.39%) ⬆️
src/common/libflux/future.c 86.87% <100%> (+0.93%) ⬆️
src/modules/connector-local/local.c 73.93% <0%> (-1.03%) ⬇️
src/modules/kvs/kvs.c 65.55% <0%> (-0.16%) ⬇️
src/common/libflux/message.c 81.14% <0%> (+0.12%) ⬆️
src/cmd/flux-module.c 85.58% <0%> (+0.3%) ⬆️
src/modules/barrier/barrier.c 78.62% <0%> (+2.06%) ⬆️

@grondo grondo force-pushed the grondo:composite-future-fixes branch from 4b444e1 to 0baf6f7 Nov 1, 2018

@garlick

This comment has been minimized.

Copy link
Member

garlick commented Nov 1, 2018

Nice! Thanks so much for fixing this. I was really perplexed when I tried earlier.

Feel free to squash those test additions into the original commit.

Commit message for 97f5d5b contains a spelling error: becaause

I would vote we get this in ASAP and add more tests later since this is damned useful.

grondo added some commits Nov 1, 2018

doc: clarify flux_future_continue(3)
Problem: There was some evident confusion in the memory handling
for the futures passed to flux_future_continue(3), resulting in
memory leaks.

Clarify by adjusting the documentation in both the man page and
the comments on the function prototype in future.h.
libflux: fix segfault in flux_future_set_flux
Fix segfault in flux_future_set_flux(3) when it is passed a NULL handle.

@grondo grondo force-pushed the grondo:composite-future-fixes branch from 0baf6f7 to 85d8764 Nov 1, 2018

garlick and others added some commits Oct 23, 2018

testsuite: add rpc test for chained futures
Add a unit test t/rpc/chained.t to test chained composite futures
using flux_future_and_then(3) and flux_future_or_then(3).

Co-authored-by: Mark A. Grondona <mgrondona@llnl.gov>
libflux: fix composite futures for rpcs
Problem: composite futures, especially chained composites, do not
currently work for RPCs and probably in many other situations.
This is because the internal reactor and possibly cloned flux_t
handle for a chained future may not match those in the "previous"
future. Thus, a user calling flux_future_get() on a "next" or
parent future may be blocked on a flux_reactor_run() using the
wrong handle, which means a message required by a previous future
in the chain will never arrive.

Fix by ensuring *both* reactor and flux_t handle are propagated
to next, previous, or child futures as necessary during creation and
initialization of composite futures.

Fixes #1750.

@grondo grondo force-pushed the grondo:composite-future-fixes branch from 85d8764 to dbeecbe Nov 1, 2018

@grondo

This comment has been minimized.

Copy link
Contributor Author

grondo commented Nov 1, 2018

Ok, squashed, fixed typo, and rebased on current master.

I also had time to at least add sanity check in "now" context for flux_future_or_then(3) to the new unit test.

@grondo

This comment has been minimized.

Copy link
Contributor Author

grondo commented Nov 1, 2018

Nice! Thanks so much for fixing this.

I am very sorry for authoring the original code that didn't have any hope of working correctly in the first place!

@garlick

garlick approved these changes Nov 1, 2018

Copy link
Member

garlick left a comment

LGTM! Will merge once green.

@garlick garlick merged commit af34809 into flux-framework:master Nov 1, 2018

3 checks passed

codecov/patch 100% of diff hit (target 79.6%)
Details
codecov/project Absolute coverage decreased by -<.01% but relative coverage increased by +20.39% compared to a84b55e
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@grondo grondo deleted the grondo:composite-future-fixes branch Nov 1, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.