Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement KVS transaction object #1107

Merged
merged 18 commits into from Jul 19, 2017

Conversation

Projects
None yet
5 participants
@garlick
Copy link
Member

garlick commented Jul 14, 2017

This PR implements the proposed flux_kvs_txn_t objects for the KVS discussed in #1094:

flux_kvs_txn_t *flux_kvs_txn_create (void);
void flux_kvs_txn_destroy (flux_kvs_txn_t *txn);

int flux_kvs_txn_put (flux_kvs_txn_t *txn, int flags,
                      const char *key, const char *json_str);

int flux_kvs_txn_pack (flux_kvs_txn_t *txn, int flags,
                       const char *key, const char *fmt, ...);

int flux_kvs_txn_link (flux_kvs_txn_t *txn, int flags,
                       const char *key, const char *json_str);

int flux_kvs_txn_mkdir (flux_kvs_txn_t *txn, int flags,
                        const char *key);

int flux_kvs_txn_unlink (flux_kvs_txn_t *txn, int flags,
                         const char *key);

int flux_kvs_txn_symlink (flux_kvs_txn_t *txn, int flags,
                          const char *key, const char *target);

flux_future_t *flux_kvs_commit (flux_t *h, int flags, flux_kvs_txn_t *txn);

flux_future_t *flux_kvs_fence (flux_t *h, int flags, const char *name,
                               int nprocs, flux_kvs_txn_t *txn);

As discussed in #1094 we'll want some convenience functions since having to create/put/commit/destroy to change a single value is a bit cumbersome. For now the old interfaces are still there, rewritten in terms of the above.

This PR also internally converts the KVS client API to use jansson.

@garlick garlick changed the title implement flux_kvs_txn_t implement KVS transaction object Jul 14, 2017

@codecov-io

This comment has been minimized.

Copy link

codecov-io commented Jul 14, 2017

Codecov Report

Merging #1107 into master will decrease coverage by 0.31%.
The diff coverage is 74.62%.

@@            Coverage Diff             @@
##           master    #1107      +/-   ##
==========================================
- Coverage   78.34%   78.03%   -0.32%     
==========================================
  Files         157      159       +2     
  Lines       26132    26232     +100     
==========================================
- Hits        20474    20469       -5     
- Misses       5658     5763     +105
Impacted Files Coverage Δ
src/common/libkvs/kvs.c 100% <ø> (+19.93%) ⬆️
src/modules/kvs/fence.c 85.24% <100%> (-0.24%) ⬇️
src/modules/kvs/commit.c 87.65% <100%> (ø) ⬆️
src/modules/wreck/wrexecd.c 75.08% <50%> (-0.3%) ⬇️
src/common/libkvs/kvs_txn.c 61% <61%> (ø)
src/common/libkvs/jansson_dirent.c 73.33% <73.33%> (ø)
src/common/libkvs/kvs_classic.c 84.54% <82.97%> (-3.34%) ⬇️
src/common/libkvs/kvs_watch.c 86.76% <84.45%> (-6.21%) ⬇️
src/common/libkvs/json_dirent.c 95.45% <0%> (-2.28%) ⬇️
src/common/libflux/handle.c 83.67% <0%> (-2.08%) ⬇️
... and 20 more

@garlick garlick force-pushed the garlick:kvs_api_cleanup2 branch from 339bc17 to 10d3961 Jul 14, 2017

@coveralls

This comment has been minimized.

Copy link

coveralls commented Jul 14, 2017

Coverage Status

Coverage decreased (-0.2%) to 78.318% when pulling e605530 on garlick:kvs_api_cleanup2 into a3add43 on flux-framework:master.

@coveralls

This comment has been minimized.

Copy link

coveralls commented Jul 14, 2017

Coverage Status

Coverage decreased (-0.2%) to 78.309% when pulling 10d3961 on garlick:kvs_api_cleanup2 into a3add43 on flux-framework:master.

@garlick garlick force-pushed the garlick:kvs_api_cleanup2 branch from 10d3961 to 9b389d7 Jul 14, 2017

@coveralls

This comment has been minimized.

Copy link

coveralls commented Jul 15, 2017

Coverage Status

Coverage decreased (-0.1%) to 78.344% when pulling 9b389d7 on garlick:kvs_api_cleanup2 into a3add43 on flux-framework:master.

@garlick garlick force-pushed the garlick:kvs_api_cleanup2 branch from 9b389d7 to 1096b57 Jul 17, 2017

@coveralls

This comment has been minimized.

Copy link

coveralls commented Jul 17, 2017

Coverage Status

Coverage decreased (-0.1%) to 78.338% when pulling 1096b57 on garlick:kvs_api_cleanup2 into a3add43 on flux-framework:master.

@chu11 chu11 referenced this pull request Jul 18, 2017

Merged

KVS Jansson Updates #1108

@chu11

This comment has been minimized.

Copy link
Contributor

chu11 commented Jul 18, 2017

At a high level, things look good to me. I think it was a good idea to document the json "null" vs C NULL, which can be confusing.

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Jul 18, 2017

Doing a bit of tidying up on this PR. Hopefully should be ready for another look by mid morning, then if no objection's I'll squash down the incremental changes and ask for merge consideration.

@coveralls

This comment has been minimized.

Copy link

coveralls commented Jul 18, 2017

Coverage Status

Coverage decreased (-0.2%) to 78.299% when pulling ca21560 on garlick:kvs_api_cleanup2 into a3add43 on flux-framework:master.

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Jul 18, 2017

OK, done with cleanup. This is ready for review IMHO.

chu11 and others added some commits Jul 13, 2017

common/libkvs: Create jansson_dirent
Create jansson dirent, a duplicate of json_dirent but using jansson.

This set of functions is used for conversion of json-c to jansson
throughout flux.
test/kvs: reduce number of simultaneous watches
t1000-kvs-basic.t calls
  t/kvs/watch simulwatch <key> 8192

This is a stress test of sorts for the kvs_watch client
code.  Drop the count from 8192 to 256 to avoid a future problem
at exactly 500 (not 499) concurrent watches of the same key
where the test hangs.  The problem begins occurring after
the jansson rework of kvs_watch - modify the test here to
avoid breaking git bisect.

This may be an actual bug, but since kvs_watch() has significant
other changes pending, and many simulateous watches is not a real
use case anyway, leave it for another time.
test/python: drop kvs test_set_none() check
This test appears to be wrong - putting a null key is
equivalent to an unlink, but the test fails when that
actually happens.  The test may have been relying on
an earlier bug where the value of a key became the JSON
string "null" instead of the json value null.

Drop the test for now.

@garlick garlick force-pushed the garlick:kvs_api_cleanup2 branch from 1914b4e to 33042b8 Jul 18, 2017

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Jul 18, 2017

rebased and added one last test.

@coveralls

This comment has been minimized.

Copy link

coveralls commented Jul 18, 2017

Coverage Status

Coverage decreased (-0.3%) to 78.333% when pulling 33042b8 on garlick:kvs_api_cleanup2 into 5776aeb on flux-framework:master.

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Jul 18, 2017

Had one travis builder stall here:

PASS: t2000-wreck.t 5 - wreckrun: propagates current working directory
PASS: t2000-wreck.t 6 - wreckrun: propagates current environment
*** glibc detected *** lua: corrupted double-linked list: 0x0000000000dc2590 ***
No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.
Check the details on how to adjust your build configuration on: https://docs.travis-ci.com/user/common-build-problems/#Build-times-out-because-no-output-was-received
The build has been terminated

Hmm. Restarting.

@coveralls

This comment has been minimized.

Copy link

coveralls commented Jul 18, 2017

Coverage Status

Coverage decreased (-0.3%) to 78.303% when pulling 9e5416d on garlick:kvs_api_cleanup2 into 5776aeb on flux-framework:master.

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Jul 18, 2017

Maybe it should be obvious, but perhaps the documentation for flux_kvs_txn_link could be improved.

flux_kvs_txn_link() sets key to be a hard link to a KVS tree object looked up with FLUX_KVS_TREEOBJ.

Perhaps two suggestions to make this clearer:

  • Replace the generic const char *json_str argument name in the prototype with const char *treeobj for symmetry with other calls that take treeobj JSON arguments.
  • Instead of referencing only the FLUX_KVS_TREEOBJ enum, would it be ok to reword the above like:

flux_kvs_txn_link() sets key to be a hard link to a KVS tree object obtained with flux_kvs_lookup() or flux_kvs_lookupat() using the FLUX_KVS_TREEOBJ flag.

(or similar)

Also, how does this function compare to the "classic" call flux_kvs_put_treeobj?

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Jul 18, 2017

Those changes sound good, patch coming.

Also, how does this function compare to the "classic" call flux_kvs_put_treeobj?

Identical except that kvs_put_treeobj() operates on the default transaction rather than an explict one.

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Jul 18, 2017

Identical except that kvs_put_treeobj() operates on the default transaction rather than an explict one.

Thanks. In a way the name of kvs_put_treeobj was more clear, so I was wondering about the change to simply "link" (sorry if this was already discussed and I forgot!). flux_kvs_txn_link() makes me think of the link(2) syscall so I was expecting the arguments to be similar -- e.g. link (const char *oldpath, const char *newpath), which might have added to the confusion.

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Jul 18, 2017

The only other question I'll pose is whether the _txn_ in the function names was dropped purposefully for flux_kvs_commit and flux_kvs_commit. If there is any chance "implied transaction" versions of these functions would be needed in the future, we might want to reserve the non-txn names.

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Jul 18, 2017

Other than those minor questions, I did test out this PR on some of our TOSS2 and 3 systems and no gotchas so I'm willing to merge for you once you're ready.

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Jul 18, 2017

There wasn't any discussion that I recall. I sort of made that change on my own initiative, and I'm fine with changing it back. I find my thinking on interface design is often a bit warped when I've had my head in the implementation details.

I'll go ahead and change that, and squash the incremental changes. Thanks for the testing and review!

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Jul 18, 2017

Actually, maybe for symmetry with flux_kvs_lookup(key, flags=FLUX_KVS_TREEOBJ), we should drop the extra function and use a FLUX_KVS_TREEOBJ flag with flux_kvs_txn_put()?

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Jul 18, 2017

Actually, maybe for symmetry with flux_kvs_lookup(key, flags=FLUX_KVS_TREEOBJ), we should drop the extra function and use a FLUX_KVS_TREEOBJ flag with flux_kvs_txn_put()?

That works for me if it is OK with you!

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Jul 19, 2017

Hm, the above might not be related.. it looks like some possible jansson refcounting issue though.

@chu11

This comment has been minimized.

Copy link
Contributor

chu11 commented Jul 19, 2017

Was rebasing on your branch to prepare my PR #1108. Noticed you axed j_dirent_append() in d0125f3. I use this in my unit tests in PR #1108. AFAICT it'll be the last location it can be used, so I could just create a helper function inside the unit tests. But since I do use it when converting over to jansson, perhaps we can keep j_dirent_append() for now. Then I can refactor it out later on?

@garlick garlick force-pushed the garlick:kvs_api_cleanup2 branch from 9b2c239 to d1c16c4 Jul 19, 2017

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Jul 19, 2017

@grondo found the bug - tried to decref a json_t obtained with json_unpack ("s:o").

Just pushed the fix and dropped the proposed lua fix (opened #1112 for future consideration, in case it has merit on its own). I'll run the PMI test that was failing in a loop during our group meeting and if it doesn't fail in an hour, we're probably past that one. Whew! Thanks @grondo!

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Jul 19, 2017

@chu11: ok I'll drop the commit that removes that function if it's more convenient to leave it in for now.

@garlick garlick force-pushed the garlick:kvs_api_cleanup2 branch from d1c16c4 to 8f77614 Jul 19, 2017

garlick added some commits Jul 17, 2017

libkvs/txn: add KVS transaction object
Rework the "write" side of the KVS API in terms of an
explicit transaction object as discussed in #1094.
The interface is essentially
- create txn
- append ops to txn:  put, pack, mkdir, unlink, symlink
- commit/fence txn
- destroy txn

Also: rename the KVS_NOMERGE flag to FLUX_KVS_NOMERGE.

Provide a private interface so that unit tests can
examine internal contents of a commit.
test/kvs: add unit test for kvs_txn
Add unit test for kvs_txn.

Also: ensure that unit tests link against the static libkvs.la
rather than ../libflux.so so they pick up changes during
development, and the unit test hook in kvs_txn.c can be
kept private.
libkvs/kvs: drop kvs_fence_set_context()
Drop the functions
  kvs_fence_set_context()
  kvs_fence_clear_context()

These were added so that a program could build up a fence
transaction while simultaneously performing other KVS commits.
That use case is now handled through explicit transactions.

Drop those functions and convert wrexecd and tests.
libkvs/classic: reimplement kvs_put* functions
Reimplement kvs_put() and kvsdir_put() and associated
functions in terms of txn's.  Create a "default transaction"
that these functions add to, and have kvs_commit() and kvs_fence()
commit the default transaction.

Update tests and wrexecd.

Also: drop some unnecessary includes from kvs.c.
libkvs/watch: avoid exiting on out of memory error
Replace calls to xstrdup() and xzmalloc() that exit on out
of memory error with strdup() and calloc(), and return ENOMEM
errors to the caller if they occur.

Drop some internal #includes that were there for no
apparent reason.
libkvs/watch: convert to jansson
Convert kvs_watch() internals from json-c to jansson.

This conversion may have introduced a bug that causes
t/kvs/watch simulwatch with ntimes > 499 to hang.

For code simplification, one new thing here is that each
watch is assigned a future.  After the first response, the
future is fulfilled, but is kept around to prevent the matchtag
from being freed.  This simplified the code for performing the
initial RPC, but does increase the amount of state kept per
watch.  We really need a more well thought out way to handle
the "one request - multiple response" use case.

After the first response that fulfills the future, a generic
response message handler is used for subsequent responses.
test/kvs: put key=null and ensure valid dirent
A flux_kvs_txn_put (key, "null") should result in an EINVAL.
The t/kvs/basic put subcommand retries an EINVAL with the value
converted to a string.

Add a test that ensures the value is set to a string rather
than a JSON null, which is an invalid directory entry.
test/watch_disconnect: avoid proto.[ch]
Avoid using proto.c::kp_twatch_enc(), and instead use
flux_mrpcf() to encode the mrpc request.  This prepares
for removal of proto.[ch] in a future commit.

Also, drop remaining json-c use in this test, decoding mrpc
stats response using flux_mrpc_getf().
@coveralls

This comment has been minimized.

Copy link

coveralls commented Jul 19, 2017

Coverage Status

Coverage decreased (-0.3%) to 78.317% when pulling d1c16c4 on garlick:kvs_api_cleanup2 into 5776aeb on flux-framework:master.

@coveralls

This comment has been minimized.

Copy link

coveralls commented Jul 19, 2017

Coverage Status

Coverage decreased (-0.3%) to 78.297% when pulling 8f77614 on garlick:kvs_api_cleanup2 into 5776aeb on flux-framework:master.

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Jul 19, 2017

I dropped the commit @chu11 wanted dropped, and verified the PMI test has been running in a loop for about 45m with no failures (where before it usually failed within a few minutes). I think this one's perhaps ready?

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Jul 19, 2017

Yeah, looks good to me!

@grondo grondo merged commit 500e366 into flux-framework:master Jul 19, 2017

2 of 4 checks passed

codecov/patch 74.62% of diff hit (target 78.34%)
Details
codecov/project 78.03% (-0.32%) compared to 5776aeb
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage decreased (-0.3%) to 78.297%
Details

@garlick garlick deleted the garlick:kvs_api_cleanup2 branch Jul 19, 2017

@grondo grondo referenced this pull request Aug 23, 2017

Closed

0.8.0 Release #1160

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.