Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libflux: clean up module API #1856

Merged
merged 20 commits into from Nov 30, 2018

Conversation

Projects
None yet
5 participants
@garlick
Copy link
Member

garlick commented Nov 27, 2018

This PR culls some module management functions from the API that were not all that helpful, as proposed in #1854. It also scraps a problematic unit test and replaces it with one that ought to work better. Finally, functions are made more robust to bad arguments.

This module stuff requires more work to eliminate the log to stderr in flux_modname() and to add dynamically registered service names. I started to work on both of those and both turned into a much bigger change than I was comfortable with taking on right now.

This should not be merged until a flux-sched PR goes in, to wean it from functions that are being removed here.

garlick added some commits Nov 25, 2018

testsuite: drop module/basic command
Problem: t/module/basic duplicates flux-module command
functionality, and is the only user of some module
API functions.

Drop t/module/basic utility and convert t0003-module.t
to use flux-module instead.
libflux/module: drop flux_insmod, _rmmod, _lsmod
Problem: flux_insmod(), flux_rmmod(), and flux_lsmod()
have no users.

Drop these functions and transfer the flux_lsmod_f typedef
over to cmd/flux-module.c which was also using it.
libflux/module: drop rmmod codec
Problem: libflux/module defines flux_rmmod_json_decode()
and flux_rmmod_json_encode() functions.  Since the rmmod
message format is defined in RFC 5, and a user can just use
flux_rpc_pack() vs rmmod_json_encode() + flux_rpc(); or
flux_request_unpack() vs flux_request_decode() + json_decode(),
these functions are not very helpful.

Drop flux_rmmod_json_encode(), flux_rmmod_json_decode().

Update users and unit test.
libflux/module: drop insmod codec
Problem: libflux/module defines flux_insmod_json_decode()
and flux_insmod_json_encode() functions.  Since the insmod
message format is defined in RFC 5, and a user can just use
flux_rpc_pack() vs json_encode() + flux_rpc(); or
flux_request_unpack() vs flux_request_decode() + json_decode(),
these functions are not very helpful.

Drop flux_insmod_json_encode(), flux_insmod_json_decode().

Update users and unit test.
libflux/module: drop lsmod codec
Problem: libflux/module defines functions for encoding
and decoding the lsmod response defined in RFC 5, and adds
a typedef to represent a list of module records.  The
abstraction is not really all that helpful with the other
API functions that are available and unnecessarily complicates
our API.

Drop these functions.

Update tests and users.
libflux/module: flux_modfind() handle invalid args
Problem: flux_modfind segfaults if given NULL arguments.

Test for this and fail with EINVAL.
@codecov-io

This comment has been minimized.

Copy link

codecov-io commented Nov 28, 2018

Codecov Report

Merging #1856 into master will increase coverage by <.01%.
The diff coverage is 79.14%.

@@            Coverage Diff             @@
##           master    #1856      +/-   ##
==========================================
+ Coverage   79.91%   79.92%   +<.01%     
==========================================
  Files         196      196              
  Lines       35298    35198     -100     
==========================================
- Hits        28209    28131      -78     
+ Misses       7089     7067      -22
Impacted Files Coverage Δ
src/broker/module.c 82.47% <47.05%> (-1.09%) ⬇️
src/broker/broker.c 77.41% <57.69%> (-0.18%) ⬇️
src/broker/service.c 82.47% <64.28%> (-3.07%) ⬇️
src/cmd/flux-module.c 84.69% <87.35%> (-0.67%) ⬇️
src/common/libflux/module.c 94% <93.02%> (+16.81%) ⬆️
src/modules/connector-local/local.c 73.77% <0%> (-1.04%) ⬇️
src/modules/kvs/kvs.c 65.65% <0%> (-0.16%) ⬇️
src/common/libflux/message.c 81.51% <0%> (-0.13%) ⬇️
src/bindings/lua/flux-lua.c 82.37% <0%> (+0.08%) ⬆️
src/broker/modservice.c 79.8% <0%> (+0.96%) ⬆️
... and 2 more

@garlick garlick force-pushed the garlick:modapi_cleanup branch from 6a9282c to 2d1fe66 Nov 28, 2018

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Nov 28, 2018

On listing dynamically registered service names - actually it wasn't that big of a deal so I added it.
flux module list now shows services, minus the implicit service that matches the module name.

$ flux module list -r all
Module               Size    Digest  Idle  S   Nodeset Service
job                  1182184 41FE11D    0  S     [0-3] 
resource-hwloc       1127680 03725D7    0  S     [0-3] 
userdb               1108216 6FD7788    0  S         0 
cron                 1193656 D2E75FE    0  S         0 
barrier              1106088 18A39F2    0  S     [0-3] 
content-sqlite       1115896 297E6EE    0  S         0 content-backing
kvs-watch            1338368 10B4AD1    0  S     [0-3] 
aggregator           1125232 77C3D23    0  S     [0-3] 
connector-local      1155136 CCDB76A    0  R     [0-3] 
kvs                  1613360 B1DC8F4    0  S     [0-3] 

In addition to the sched update, RFC 5 will need to add a services array to the lsmod response. It can be empty if the base module does not offer dynamic service registration.

garlick added some commits Nov 28, 2018

testsuite: drop module/codec test
Problem: the "codec" test has sensitivity to the
test environment and is now misnamed as it is only
contains unit tests for flux_modname() and flux_modfind().

Remove the test (or what's left of it).

Fixes #1849
broker/service: add service_list_byuuid()
Add a function to retrieve a JSON array of service
names for a given module uuid.  This can be used to
prepare a list of services for each module, for
eventual display by "flux module list", now that
services can be dynamically registered.
broker/module: include service list in lsmod
Amend module_get_modlist() to include an array
of service names, now that services can be dynamically
registered.

A service registered by a client on the local connector
will appear to have been registered by the connector-local
module.
@chu11
Copy link
Contributor

chu11 left a comment

overall looks good, one nit below. Two comments.

  1. Should there be a new test for listing the module services? AFAICT there isn't.

  2. should this be two PRs? One for the cleanup and one for the new service listing feature?

Show resolved Hide resolved src/common/libflux/test/module.c
@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Nov 28, 2018

Should there be a new test for listing the module services? AFAICT there isn't.

Yup, will add that.

should this be two PRs? One for the cleanup and one for the new service listing feature?

Well since both changes involved editing the same code at point of use, it seemed good to combine and get it over with, especially since sched needs to be synced.

I thought of a way to address the stderr logging in flux_modname(), flux_modfind() too, so I'll go ahead and push that and return to address your comments. Thanks!

@garlick garlick force-pushed the garlick:modapi_cleanup branch from 2d1fe66 to 2bd5842 Nov 28, 2018

garlick added some commits Nov 28, 2018

cmd/flux-module: drop gratuitous mod_t typedef
Problem: the mod_t typedef in flux-module is too
terse to be informative, and not justified given
the structure's local scope.

To improve clarity:
- rename 'mod_t' to 'struct module_record'.
- rename mod_create() to module_record_create().
- rename mod_destroy() to module_record_destroy().
cmd/flux-module: change zhash_t to zhashx_t
Problem: zhash_t requires the item destructor to
be specified at each insert.

Switch to zhashx_t and specify destructor at hash
creation.
cmd/flux-module: parse lsmod services array
Parse the services array returend in an lsmod response.
Employ a zhashx_t to build a set of unique services registered
for a given module across multiple ranks.
testsuite: return services in lsmod response
t/module/parent is an example extended base component.
Amend its response to an lsmod request to include
a "services" array with two entries which can be checked
in a sharness test for flux module list.

@garlick garlick force-pushed the garlick:modapi_cleanup branch from 2bd5842 to 5f0e2e8 Nov 28, 2018

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Nov 29, 2018

Just forced a push after adding a test that services are listed in flux module list, adding a comment in the unit test about the FAKE1 and FAKE2 defines, and adding the "dlerror callback" idea to replace stderr logging in flux_modname(), which seems to have worked out OK.

@chu11

This comment has been minimized.

Copy link
Contributor

chu11 commented Nov 29, 2018

lgtm, all ready to merge?

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Nov 29, 2018

lgtm, all ready to merge?

I think we'll need a PR in flux-sched before this PR can safely be merged.

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Nov 29, 2018

flux module list now shows services, minus the implicit service that matches the module name.

Hm, output from flux module list -r all seems to attribute services registered only on one rank to all ranks, which might be confusing. E.g. in this example I've registered myservice on rank 0 using connector-local:

grondo@peep:/tmp/f.git$ flux module list -r 0 | grep local
connector-local      1154992 4E3BF6D    0  R         0 myservice
grondo@peep:/tmp/f.git$ flux module list -r 1 | grep local
connector-local      1154992 4E3BF6D    5  S         1 
grondo@peep:/tmp/f.git$ flux module list -r all | grep local
connector-local      1154992 4E3BF6D    0  R     [0-3] myservice

I'm not sure how big of a deal this is, and from your comment on lsmod_services_string this seems to be intentional. However, it might be less misleading if flux module list -r all split lines with different services, e.g. in this case:

connector-local      1154992 4E3BF6D    0  R         0 myservice
connector-local      1154992 4E3BF6D    0  R     [1-3]
@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Nov 29, 2018

Thanks, I was being lazy there I guess, but what you propose certainly makes more sense. Just pushed a change that should produce the results you described. Will need to add some tests.

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Nov 29, 2018

Thanks! Looks like the change works:

grondo@peep:/tmp/f.git$ flux module list -r all
Module               Size    Digest  Idle  S   Nodeset Service
barrier              1105944 FBC6484    4  S     [0-3] 
aggregator           1125088 E4BE0FF    4  S     [0-3] 
resource-hwloc       1127536 889334B    4  S     [0-3] 
content-sqlite       1115752 C6F1B31    4  S         0 content-backing
userdb               1108072 8B57216    4  S         0 
connector-local      1154992 4E3BF6D    0  R         0 myservice
kvs-watch            1338200 48C9F9F    4  S     [0-3] 
cron                 1193512 E1D745C    0  S         0 
job                  1182040 D55EAAD    4  S     [0-3] 
connector-local      1154992 4E3BF6D    4  S     [1-3] 
kvs                  1613192 9DD0B6E    0  S     [0-3] 

garlick added some commits Nov 28, 2018

cmd/flux-module: list services in output
Add a "Services" column in flux module list output.
The column consists of any services returned from
the lsmod request that are not identical to the module
name, delimited by commas.

If a module registers different services on different ranks,
display each unique module + service as a separate line
of output.

garlick added some commits Nov 28, 2018

libflux/module: add callback for dlopen/dlsym errors
Problem: flux_modname() logs to stderr when there is
a dlopen() or dlsym() error.

When searching a series of directories for a module to
load, it is helpful to not lose any dlopen() or dlsym()
errors that might provide clues as to why a particular
module cannot be loaded.

Since logging to stderr is not appropriate in a libflux
API function, extend flux_modname() and flux_modfind()
to allow an optional callback for dlopen error information,
if available.  The callback is simply informational,
and might be called via flux_modname() multiple times during
search, even if flux_modname() ultimately succeeds.

Update users: flux-module, broker, testsuite.
testsuite: verify module list shows services
Check that test module 'parent' shows fake
service names for 'child' module.

Each unique module + service set should produce its
own line of output in lsmod.

@garlick garlick force-pushed the garlick:modapi_cleanup branch from 99da053 to 0111ec8 Nov 29, 2018

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Nov 29, 2018

Some incremental devel sqaushed. Ready I think?

@grondo grondo merged commit 3466daa into flux-framework:master Nov 30, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Nov 30, 2018

Thanks!


if (!(mods = module_get_modlist (ctx->modhash)))
if (flux_request_decode (msg, NULL, NULL) < 0)

This comment has been minimized.

@SteVwonder

SteVwonder Dec 2, 2018

Member

Out of curiosity more than anything: since both the topic and payload are NULL, is this being decoded to validate that the message is a request? Or can this decoding be skipped?

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Dec 2, 2018

@garlick garlick deleted the garlick:modapi_cleanup branch Dec 8, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.