Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libflux: changes to message dispatch to support routers #2367

merged 11 commits into from Sep 13, 2019


Copy link

commented Sep 12, 2019

Here are a few changes peeled off from #2354:

  • Replace ERRNO_SAFE_FREE() macro with generic ERRNO_SAFE_WRAP().
  • Fix bug in message dispatcher where a response that a "router", like connector-local, should deliver to a client might match a local RPC. Probably not hit because connector-local doesn't make many RPCs.
  • Change the dispatcher to match request handlers registered with specific (non-glob) topic strings before those registered with globs, so a router's own service methods match before a catch-all request handler registered for routing purposes (just makes dynamic service routing easier).
  • Fix an unfortunate errno choice in flux_msg_recvfd().
garlick added 2 commits Aug 29, 2019
Instead of creating a wrapper for each function,
create a macro that can wrap any function, e.g.
Before: free(item) -> ERRNO_SAFE_FREE(item)
After:  free(item) -> ERRNO_SAFE_WRAP(free, item)

Update users of ERRNO_SAFE_FREE().
Problem: The message dispatcher does not check whether a
response message has routes remaining in its route stack before
attempting to match an RPC in the matchtag hash.  For a module
like connector-local acting as a router, this could result
in a response being delivered to a connector-local RPC when
it should be routed to a downstream client.

Refuse to match responses to RPC requests (valid matchtag)
unless their route stack is empty.  Other message handlers
for requests continue to match so that routers can be implemented.
Copy link

left a comment

LGTM! One comment inline.

Also, I wouldn't normally worry about documentation at this point, but since we already have a flux_msg_handler_create(3) man page, we may want to update it with the new scheme for matching topic strings? (up to you though, really)

@@ -60,6 +61,13 @@ static void free_msg_handler (flux_msg_handler_t *mh);
static size_t matchtag_hasher (const void *key);
static int matchtag_cmp (const void *key1, const void *key2);

static bool isa_glob (const char *s)
if (!s || strchr (s, '*') || strchr (s, '?'))

This comment has been minimized.

Copy link

grondo Sep 12, 2019


The character '[' also indicates a glob wildcard pattern.


This comment has been minimized.

Copy link
Member Author

commented Sep 12, 2019

Thanks! Fixed the overlooked glob character, and also another case (empty string topic_glob) that I had missed. Renamed isa_glob() to isa_multmatch() to be clearer about its function. I squashed this change.

I also realized that flux_msg_cmp() had overlooked the bracket character as well, so fixed that, and factored out a small function to make the control flow in flux_msg_cmp() clearer. Added a message unit test to cover all the glob characters.

Finally, updated the man page a suggested.

garlick added 9 commits Sep 5, 2019
Problem:  Last-in request handlers have the highest priority,
which allows built-in module methods to be overridden, but prevents
a "fall through" request handler from being registered that would
match methods not explicitly registered earlier.

Instead of pushing all request handlers onto a list, store
non-glob request handlers in a hash by topic string.  Requests
are first matched in the hash, and fall through to the list
if no match.

Registering a handler for a duplicate request method replaces
the old one.  This mimics the current behavior of shadowing
an earlier-registered request handler.
Problem: several errors result in flux_msg_recvfd()
failing with errno set to EPROTO, including EOF from read(2).

Since EOF on read(2) is what we should get when the
connection is closed, change the errno to ECONNRESET
so it can be suppressed separately from other errors.

Update flux-proxy to log EPROTO if returned from recvfd().
Ensure that the dispatcher calls the most recently
registered handler for a non-glob request.
Check the following:
- a RPC response handler is not overridden by a response handler
  that matches all responses
- a request handler for a specific method is not overridden by
  a request handler that matches all requests
Check that a response message with a non-empty route
stack falls through to a catchall response handler
even though there is an RPC response handler waiting
for that matchtag.
Problem: flux_msg_match() tries to literally match a glob
containing square brackets.

The heavy-weight fnmatch() call is avoided in flux_msg_cmp()
unless the match.topic_glob contains a glob character, but
only '*' and '?' were checked.  Add '[' to the check.
Problem: flux_msg_cmp() logic could be clearer

The inline tests on topic_glob for "match any" are
not documented and a bit obscure.  For clarity,
pull this test out to a helper named isa_matchany().
Problem: the dispatch logic described in this man page
is now out of date.

Update the description to describe the dispatch logic
as it exists now.
@garlick garlick force-pushed the garlick:message_fixes branch from 61f382a to 753b6c7 Sep 12, 2019

This comment has been minimized.

Copy link

commented Sep 12, 2019

Codecov Report

Merging #2367 into master will increase coverage by 0.01%.
The diff coverage is 97.05%.

@@            Coverage Diff             @@
##           master    #2367      +/-   ##
+ Coverage   80.81%   80.83%   +0.01%     
  Files         218      218              
  Lines       34514    34539      +25     
+ Hits        27894    27921      +27     
+ Misses       6620     6618       -2
Impacted Files Coverage Δ
src/cmd/builtin/proxy.c 74.12% <100%> (ø) ⬆️
src/common/libsubprocess/server.c 70.9% <100%> (ø) ⬆️
src/common/libflux/msg_handler.c 90.72% <100%> (+0.75%) ⬆️
src/modules/sched-simple/sched.c 73.26% <100%> (ø) ⬆️
src/common/libflux/message.c 81.02% <87.5%> (+0.19%) ⬆️
src/cmd/flux-module.c 84.19% <0%> (+0.23%) ⬆️
grondo approved these changes Sep 12, 2019

This comment has been minimized.

Copy link

commented Sep 13, 2019

Looks good now! One thing that surprised me is the equivalence of "topic glob" "" and "*". That is counter-intuitive and not documented in flux_msg_cmp(3), but I'm assuming there is some internal reason for it?


This comment has been minimized.

Copy link
Member Author

commented Sep 13, 2019

The historical raisins elude me... but since that is how flux_msg_cmp() works, I figured it should work the same in the dispatcher. I'd say if we think it should work differently we should open an issue and handle it separately. It does seem a bit odd.

@grondo grondo merged commit 8f6a9e0 into flux-framework:master Sep 13, 2019
4 checks passed
4 checks passed
Summary 1 potential rule
codecov/patch 97.05% of diff hit (target 80.81%)
codecov/project 80.83% (+0.01%) compared to c02631a
continuous-integration/travis-ci/pr The Travis CI build passed
@garlick garlick deleted the garlick:message_fixes branch Sep 13, 2019
@garlick garlick referenced this pull request Sep 30, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
None yet
3 participants
You can’t perform that action at this time.