Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvs: add kvs_getat() and related functions #824

Merged
merged 25 commits into from Oct 3, 2016

Conversation

Projects
None yet
4 participants
@garlick
Copy link
Member

garlick commented Sep 29, 2016

Resubmitting work from the withdrawn PR #820, cleaned up:

This PR adds the following new KVS interfaces:

/* Like kvs_get() but lookup is relative to 'treeobj'.
 */
int kvs_getat (flux_t h, const char *treeobj,
               const char *key, char **json_str);
int kvs_get_dirat (flux_t h, const char *treeobj,
                   const char *key, kvsdir_t **dirp);
int kvs_get_symlinkat (flux_t h, const char *treeobj,
                               const char *key, char **val);

and modifies the kvsdir_get_*() implementations so that a kvsdir_t created by kvs_get_dirat() can be walked recursively relative to the original treeobj, regardless of what is changing in the KVS, which finally addresses #64.

I also implemented @grondo's suggestion of making kvsdir_put_*() fail on a "snapshot" kvsdir_t.

The flux-kvs command has a couple of new sub-commands (getat, dirat, readlinkat), and the kvs sharness test was modified to exercise them.

I spent a long time agonizing over the API here, and came to the conclusion (again) that a sweeping KVS API overhaul is necessary and that these additions should stick to the original theme for the time being.

@garlick garlick added the review label Sep 29, 2016

@garlick garlick force-pushed the garlick:kvs_getat branch 2 times, most recently from 448d2e6 to d22d9e4 Sep 29, 2016

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Sep 29, 2016

Sorry, one of the builds stuck in the libfaketime cron tests (#731 I think). I restarted it.

This all looks great to me -- I'll take a moment to poke at it a little bit before making any further comment.

Thanks!

@coveralls

This comment has been minimized.

Copy link

coveralls commented Sep 29, 2016

Coverage Status

Coverage decreased (-0.2%) to 75.116% when pulling d22d9e4 on garlick:kvs_getat into 927ac17 on flux-framework:master.

@codecov-io

This comment has been minimized.

Copy link

codecov-io commented Sep 29, 2016

Current coverage is 75.05% (diff: 72.96%)

Merging #824 into master will increase coverage by 0.14%

@@             master       #824   diff @@
==========================================
  Files           146        146          
  Lines         25287      25410   +123   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          18943      19071   +128   
+ Misses         6344       6339     -5   
  Partials          0          0          
Diff Coverage File Path
•••••• 65% src/modules/kvs/libkvs.c
•••••••• 81% src/cmd/flux-kvs.c
•••••••••• 100% src/common/libpmi/single.c
•••••••••• 100% src/modules/kvs/proto.c
•••••••••• 100% src/broker/hello.c
•••••••••• 100% src/modules/kvs/kvs.c

Powered by Codecov. Last update 069c18d...fe5c0a0

garlick added some commits Sep 15, 2016

modules/kvs: add kvs_getat()
Add a new function kvs_getat() which is like kvs_get() but has
a new "treeobj" argument that specifies the specific root
(or other directory) "snapshot" that the key will be looked
up in.

Partially fixes #64
test/kvs: add getat tests
Exercise flux-kvs getat via t1000-kvs-basic.t sharness test.
cmd/flux-kvs: add dirat subcommand
Add dirat subcommand:

Usage:  flux-kvs dirat [-r] treeobj key

This command demonstrates walking a snapshot.
modules/kvs: kvs.get codec: treeobj in/out
Allow a treeobj to be sent as an optional argument to a
kvs.get request, so that the key can be looked up relative
to a snapshot.

Echo this back in the response, or if it was not provided,
include a treeobj that identifies the current root that
was used for the lookup.

Update codec test

Adjust callers to compile with the new codec signature.
New functionality based on the codec change will be added in
a subsequent commit.
modules/kvs: handle treeobj args in kvs.get request
If a treeobj is provided in a kvs.get request, look up
the key relative to this "snapshot".

Return the treeobj used to look up the key in the response,
whether it be the one provided in the request, or one
representing the current root.

This may be useful in a future API that allows the current
version of a directory to be fetched and then walked as
a snapshot.
modules/kvs: add treeobj param to internal getobj
Add treeobj argument to getobj(), an internal client function,
so that we can implement kvs_getat() using it.
modules/kvs: drop internal working directory abstraction
The "cwd" abstraction in the KVS user facing API code was
only used to store a directory prefix for the kvsdir_*()
functions.  Get rid of it and just use kvsdir_key_at()
which is much clearer.
modules/kvs: add kvs_get_dirat()
Add kvs_get_dirat(), which is like kvs_get_dir() but looks
up key relative to a snapshot reference.

The snapshot reference is stored in the returned kvsdir_t object,
and is then used in subsequent kvsdir_get_*() operations on that
object.  It is thus possible to recursively walk a snapshot of a
section of the namespace by starting with kvs_get_dirat() and
looking up subdirectories with kvsdir_get_dir().

Fixes #64.
modules/kvs: kvsdir_put_*() fail on snapshot
If kvsdir_put_*() is used on a kvsdir_t created by
kvsdir_get_at(), or kvsdir_get_dir() if the original directory
was created with kvsdir_get_at(), then immediatley fail
with errno == EROFS.

Although allowing the put would have no effect on snapshot
integrity, this at least discourages users from assuming that
kvsdir_get() tracks kvsdir_put() due to read-your-writes consistency.
Since a kvsdir_t's internal root reference does not change once fixed,
kvsdir_get_*() will never see new data.

@garlick garlick force-pushed the garlick:kvs_getat branch from d22d9e4 to 8f5d983 Sep 29, 2016

@coveralls

This comment has been minimized.

Copy link

coveralls commented Sep 29, 2016

Coverage Status

Coverage decreased (-0.2%) to 75.07% when pulling 8f5d983 on garlick:kvs_getat into 069c18d on flux-framework:master.

garlick added some commits Sep 29, 2016

libpmi/single: use getpid for appnum not -1
Problem: a single rank session sets the "session id" to -1,
and when this is used to create the scratch directory, breaks
flux list-instances.

Use getpid() for the appnum instead of -1 in the libpmi "single"
(standalone) personality.  Fixes #797
modules/kvs: fix memory leak
Add a missing free() found by grondo.  Fixes #826.
broker/hello: fix memory leak
Add missing json free found by grondo.  Fixes #827.
@coveralls

This comment has been minimized.

Copy link

coveralls commented Sep 29, 2016

Coverage Status

Coverage decreased (-0.2%) to 75.059% when pulling e64d7ac on garlick:kvs_getat into 069c18d on flux-framework:master.

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Sep 29, 2016

Hmm, I don't recall seeing a jsc error that looks like this. Will have a look tomorrow.

ok 1 - jstat 1: notification works for 1 wreckrun
ok 2 - jstat 2: jstat back-to-back works
ok 3 - jstat 3: notification works for multiple wreckruns
ok 4 # skip jstat 4: notification works under lock-step stress (missing LONGTEST)
ok 5 - jstat 5: notification works for overlapping wreckruns
ok 6 # skip jstat 6: notification works for overlapping stress (missing LONGTEST)
ok 7 - jstat 7.1: run hostname to create some state
ok 8 - jstat 7.2: basic query works: jobid
ok 9 - jstat 7.3: basic query works: state-pair
ok 10 - jstat 7.4: basic query works: rdesc
ok 11 - jstat 7.5: basic query works: pdesc
2016-09-29T23:42:35.542230Z jstat.err[0]: jsc_query_jcb reported an error
2016-09-29T23:42:35.568706Z jstat.err[0]: jsc_query_jcb reported an error
2016-09-29T23:42:35.597947Z jstat.err[0]: key (unknown) not understood
2016-09-29T23:42:35.598404Z jstat.err[0]: jsc_query_jcb reported an error
2016-09-29T23:42:35.627888Z jstat.err[0]: jsc_query_jcb reported an error
ok 12 - jstat 8: query detects bad inputs
ok 13 - jstat 9: update state-pair
ok 14 - jstat 10: update procdescs
ok 15 - jstat 11: update rdesc
ok 16 - jstat 12: update rdl
ok 17 - jstat 13: update rdl_alloc
2016-09-29T23:42:36.016688Z jstat.err[0]: jobid attr cannot be updated
2016-09-29T23:42:36.074213Z jstat.err[0]: key (rdesctypo) not understood
ok 18 - jstat 14: update detects bad inputs
2016-09-29T23:42:36.585315Z lwj.10.err[1]: Error in initialization, terminating job
2016-09-29T23:42:36.585243Z lwj.10.err[0]: Error in initialization, terminating job
not ok 19 - jstat 15: jstat detects failed state2016-09-29T23:42:36.589350Z lwj.10.err[3]: Error in initialization, terminating job
2016-09-29T23:42:36.590889Z lwj.10.err[2]: Error in initialization, terminating job
#   
#       p=$(run_flux_jstat 15) &&
#       test_must_fail run_timeout 4 flux wreckrun -i /bad/input -n4 -N4 hostname &&
#       cat >expected15 <<-EOF &&
#       null->null
#       null->reserved
#       reserved->starting
#       starting->failed
#       EOF
#       cp output.15 output.15.cp &&
#       kill -INT $p &&
#       test_cmp expected15 output.15.cp
#   
# failed 1 among 19 test(s)
1..19
2016-09-29T23:42:36.609083Z broker.err[0]: Run level 2 Exited with non-zero status (rc=1) 5.4s
flux-start: 0 (pid 89620) exited with rc=1
FAIL: t2001-jsc.t
modules/kvs: fix memory leak when commit fails
If a commit fails before the working copy of the root
directory is passed to store(), it needs to be freed
when the fence struct is freed.

Fixes #825
@coveralls

This comment has been minimized.

Copy link

coveralls commented Sep 30, 2016

Coverage Status

Coverage decreased (-0.2%) to 75.061% when pulling 1dc799d on garlick:kvs_getat into 069c18d on flux-framework:master.

test/kvs: test flux kvs dirat
Use the 'flux kvs dirat' command to walk a directory by
reference after it has been unlinked.  This adds minimal
test coverage for kvs_get_dirat().
@coveralls

This comment has been minimized.

Copy link

coveralls commented Oct 3, 2016

Coverage Status

Coverage decreased (-0.07%) to 75.157% when pulling 5ae0b72 on garlick:kvs_getat into 069c18d on flux-framework:master.

test/kvs: test kvsdir accessors
Walk a directory with "flux kvs dir" that includes a symlink,
booelan, and double to increase coverage for kvsdir_get_<type>()
functions.
@coveralls

This comment has been minimized.

Copy link

coveralls commented Oct 3, 2016

Coverage Status

Coverage increased (+0.01%) to 75.237% when pulling 9d570df on garlick:kvs_getat into 069c18d on flux-framework:master.

test/kvs: add coverage for kvsdir_put_*() functions
Add a --mkdir option to t/kvs/dtree which creates the
directory tree recursively using kvsdir_put() functions,
with the intention of increasing coverage for hte kvsidr_put_*()
functions.  Drive dtree --mkdir via the kvs-basic sharness test.
@coveralls

This comment has been minimized.

Copy link

coveralls commented Oct 3, 2016

Coverage Status

Coverage increased (+0.02%) to 75.245% when pulling d6e98d4 on garlick:kvs_getat into 069c18d on flux-framework:master.

modules/kvs: fix memory leak in commit_merge_all()
An extra JSON reference was being taken in the
commit merging code on rank 0.  Drop the extra Jget().

Fixes #825
@coveralls

This comment has been minimized.

Copy link

coveralls commented Oct 3, 2016

Coverage Status

Coverage increased (+0.02%) to 75.241% when pulling fc7f229 on garlick:kvs_getat into 069c18d on flux-framework:master.

test/kvs: add coverage for kvs_watch_once_dir()
Add a count argument to flux-kvs watch-dir, then drive it
from t1000-kvs-basic.t.  This is one of the areas that coveralls
noted had no test coverage.
@coveralls

This comment has been minimized.

Copy link

coveralls commented Oct 3, 2016

Coverage Status

Coverage increased (+0.1%) to 75.356% when pulling fe5c0a0 on garlick:kvs_getat into 069c18d on flux-framework:master.

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Oct 3, 2016

I think improving coverage further here will have diminishing returns as the KVS API is due to for a complete rethink soon, and most of the coverage issues are in the "client" portion of the code. In addition, coverage overall has improved in this PR.

Merge candidate?

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Oct 3, 2016

Yes, this looks great to me.

@grondo grondo merged commit a8793ce into flux-framework:master Oct 3, 2016

3 of 4 checks passed

codecov/patch 72.96% of diff hit (target 74.91%)
Details
codecov/project 75.05% (+0.14%) compared to 069c18d
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage increased (+0.1%) to 75.356%
Details

@grondo grondo removed the review label Oct 3, 2016

@garlick garlick deleted the garlick:kvs_getat branch Oct 3, 2016

@garlick garlick referenced this pull request Oct 26, 2016

Closed

0.5.0 release notes #879

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.