Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

store rank 0 instance URIs in enclosing instance KVS #1429

Merged
merged 5 commits into from Apr 6, 2018

Conversation

Projects
None yet
6 participants
@garlick
Copy link
Member

garlick commented Apr 5, 2018

This PR is an attempt to address #1422 by writing an instance's rank 0 FLUX_URI and a derived ssh:// URI to lwj.X.X.X.flux.local-uri and lwj.X.X.X.flux.remote-uri in the enclosing instance.

I need to provide a test but thought I'd post early to get any feedback from @trws.

@coveralls

This comment has been minimized.

Copy link

coveralls commented Apr 5, 2018

Coverage Status

Coverage decreased (-0.02%) to 79.016% when pulling d4d1090 on garlick:ssh_uri into 95f8d8f on flux-framework:master.

@codecov-io

This comment has been minimized.

Copy link

codecov-io commented Apr 5, 2018

Codecov Report

Merging #1429 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #1429      +/-   ##
==========================================
- Coverage   78.73%   78.71%   -0.02%     
==========================================
  Files         163      163              
  Lines       30113    30114       +1     
==========================================
- Hits        23708    23704       -4     
- Misses       6405     6410       +5
Impacted Files Coverage Δ
src/modules/wreck/wrexecd.c 75.7% <100%> (+0.02%) ⬆️
src/bindings/lua/flux-lua.c 80.97% <0%> (-0.79%) ⬇️
src/broker/overlay.c 74.45% <0%> (-0.32%) ⬇️
src/modules/connector-local/local.c 74.38% <0%> (-0.21%) ⬇️
src/broker/module.c 84.07% <0%> (+0.27%) ⬆️
src/common/libflux/message.c 81.6% <0%> (+0.58%) ⬆️
@trws

This comment has been minimized.

Copy link
Member

trws commented Apr 5, 2018

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Apr 5, 2018

Great. I'll add a flux wreck uri command to make it easier to access.

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Apr 5, 2018

Nice! Is there a requirement here to add to our requirements/use case doc?

@trws

This comment has been minimized.

Copy link
Member

trws commented Apr 5, 2018

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Apr 5, 2018

OK, subcommand looks like this:

$ flux wreck uri
    ID NTASKS STATE     FLUX_URI                                 COMMAND
     1      1 exited                                             hostname
     2      1 exited                                             hostname
     3      1 exited    ssh://jimbo//tmp/flux-qCbPz5             flux
@trws

This comment has been minimized.

Copy link
Member

trws commented Apr 5, 2018

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Apr 5, 2018

Yes, good idea.

@garlick garlick force-pushed the garlick:ssh_uri branch 2 times, most recently from 89eb4c1 to c8f627e Apr 5, 2018

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Apr 5, 2018

OK, I added a --bare option to print just the URI for a single job.

Added a couple tests also. This is probably close to ready if travis agrees.

Nice! Is there a requirement here to add to our requirements/use case doc?

I've made a note to return to this after the splash fire drill - probably! And it may be that we'll want something that works for deeper recursion long term...

@garlick garlick force-pushed the garlick:ssh_uri branch from c8f627e to 08741d5 Apr 6, 2018

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Apr 6, 2018

Dropped the lua f:kvs_rawget() function, which had no tests of its own, because it's not really necessary. I just put the values in the kvs with flux kvs put --json instead and the the old f:kvs_get() works and this PR becomes smaller.

@garlick garlick force-pushed the garlick:ssh_uri branch from 08741d5 to d4d1090 Apr 6, 2018

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Apr 6, 2018

Rebased.

garlick added some commits Apr 5, 2018

wreck: set FLUX_JOB_KVSPATH
Problem: a child instance of Flux doesn't know the path
to its KVS directory in the enclosing instance.

Set the path in the env variable FLUX_JOB_KVSPATH for each
job.  For example, in the current exec implementation, this
might have the value "lwj.0.0.1".
rc1: inform enclosing instance of useful URIs
Problem: it is inconvenient to determine the URI to
use to connect to a sub-instance.

Write URIs to the job's KVS directory in the enclosing
instance:

lwj.X.X.X.flux.local_uri=local://...
lwj.X.X.X.flux.remote_uri=ssh://...

Fixes #1422
cmd/flux-wreck: add "uri" subcommand
Add a subcommand that lists the lwj.X.Y.Z.flux.remote_uri
value, if available.

Usage is similar to flux wreck ls, e.g.
  flux wreck uri [-n, --max=count] [b, --bare] [JOBIDS...]

If called with --bare, only the URI for exactly one job is
listed, by itself for easy parsing.  It is an error if there
is not exactly one job specified, or if the job is not a
Flux instance.

If called without --bare, each job is listed with minimal
state information.  The FLUX_URI fields is left blank for
jobs that are not Flux instances.

$ flux wreck uri
ID NTASKS STATE     FLUX_URI                                 COMMAND
 1      1 exited                                             hostname
 2      1 exited                                             hostname
 3      1 exited    ssh://jimbo//tmp/flux-qCbPz5             flux
t2003-recurse.t: verify URI in enclosing instance
Add a few tests to ensure that Flux running Flux results
in KVS content that can be found by "flux wreck uri".

N.B. this sharness script needed to drop the "wreck"
personality and use the default full personality
in order to execute the rc1.d script that updates
the enclosing KVS.

@chu11 chu11 merged commit 0e0f83f into flux-framework:master Apr 6, 2018

4 checks passed

codecov/patch 100% of diff hit (target 78.73%)
Details
codecov/project Absolute coverage decreased by -0.01% but relative coverage increased by +21.26% compared to 95f8d8f
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage decreased (-0.02%) to 79.016%
Details
@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Apr 6, 2018

Thanks!

@garlick garlick deleted the garlick:ssh_uri branch Apr 7, 2018

@trws

This comment has been minimized.

Copy link
Member

trws commented Apr 8, 2018

This looks really great, looking forward to testing it, thanks!

@grondo grondo referenced this pull request May 10, 2018

Closed

0.9.0 Release #1479

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.