Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libpmi: clean up simple server wire protocol code #2185

merged 3 commits into from Jun 13, 2019


Copy link

commented Jun 12, 2019

The "server" end of the PMI-1 code in libpmi was pretty crufty, lacked documentation, and was sparse on tests.

This PR implements a little refactoring that hopefully makes it clearer, and adds tests for code that wasn't being exercised.

It also removes the server code from the installed because it was causing that library to have dependencies on zeromq and other libs. (I don't know if this is really a problem, but since it was unnecessary and I did go to some trouble early on to make the library standalone, I went ahead and fixed it).

garlick added 3 commits Jun 11, 2019
Problem: ldd shows installed depends
on libczmq, libzmq, and their dependedencies, but
libpmi was designed to be standalone.

These dependencies are due to inclusion of
simple_server.[ch] in the "noinst" internal,
but the simple_server code is for internal use only and
is not needed in the installed client-only library.

In libpmi, build two noinst libraries:
and, and only include the former in
the installed
Problem: the libpmi/simple_server.[ch] is hard to

Add the rank to pmi_simple_server_request() arguments
and create per-client state, hashed by rank.
The per-client state holds multi-line spawn command
parsing state, and the void *client user-supplied handle.

Simplify barrier handling so that a count rather than
a list of client pointers is accumulated.  On barrier exit,
iterate over the client hash to generate responses.

Simplify the multi-line spawn command parsing so
that lines between 'mcmd' and 'endcmd' are discarded
rather than stored, when we know the only multi-line
command is spawn, which is not implemented.

Add a documentation block at the top of simple_server.c
that hopefully will assist the next person who has to
understand this code.

Update flux-start and test/server_thread.c for the
pmi_simple_server_request() arg change.
Problem: there is no test coverage for server-side
interpretation of publish, unpublish, lookup, and spawn

These requests all return PMI_FAIL and are not fully
implemented on the server, but we should at least exercise
the code and make sure it gets the response to the user.

Add dummy client code to the test and make each request.

Enable tracing on the server side and route trace telemetry
through libtap diag(), then remove a few explicit diag() calls
that are now redundant.

This comment has been minimized.

Copy link

commented Jun 12, 2019

Codecov Report

Merging #2185 into master will increase coverage by 0.16%.
The diff coverage is 93.44%.

@@            Coverage Diff             @@
##           master    #2185      +/-   ##
+ Coverage    80.9%   81.06%   +0.16%     
  Files         199      199              
  Lines       31715    31688      -27     
+ Hits        25658    25687      +29     
+ Misses       6057     6001      -56
Impacted Files Coverage Δ
src/cmd/flux-start.c 78.18% <100%> (ø) ⬆️
src/common/libpmi/simple_server.c 88.12% <93.33%> (+35.18%) ⬆️
src/modules/job-ingest/worker.c 65.03% <0%> (-7.7%) ⬇️
src/common/libflux/mrpc.c 87.79% <0%> (-1.19%) ⬇️
src/broker/module.c 81.64% <0%> (-0.27%) ⬇️
src/cmd/flux-module.c 83.96% <0%> (ø) ⬆️
src/common/libflux/message.c 81% <0%> (+0.25%) ⬆️

This comment has been minimized.

Copy link

commented Jun 12, 2019

Looks nice! Good test coverage etc. Can't really test real world usage yet, so I say this should go in now if it is ready.


This comment has been minimized.

Copy link
Member Author

commented Jun 13, 2019

OK by me, thanks!

@grondo grondo merged commit 5ddfd23 into flux-framework:master Jun 13, 2019
4 checks passed
4 checks passed
Summary 1 potential rule
codecov/patch 93.44% of diff hit (target 80.9%)
codecov/project 81.06% (+0.16%) compared to edfcd12
continuous-integration/travis-ci/pr The Travis CI build passed
@garlick garlick deleted the garlick:pmi_server branch Jun 13, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
None yet
3 participants
You can’t perform that action at this time.