Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libflux: make flux_msg_fprint() output clearer #3742

Merged
merged 2 commits into from Jun 29, 2021

Conversation

chu11
Copy link
Member

@chu11 chu11 commented Jun 24, 2021

Per discussion in #3701. With the removal of zmsg internally within the flux_msg_t data structure, we are no longer beholden to follow the output format defined by zframe_fprint(). The hex output can be somewhat confusing and not the most helpful for debugging at times. Also, we can get rid of a call to msg_proto_setup(), which we really don't need in flux_msg_copy() anymore.

I had no strong feelings on what to update the output to, so I just picked this:

>[routes] |id1|
>[topic] foo.bar
>[payload] size=8 ...
>[proto header] type=request flags=0xB userid=4294967295 rolemask=0x0
>[proto aux] nodeid=4294967295 matchtag=0

We could remove the prefix notation (i.e. > vs < for request/reply), but I like it and left it in there. I don't know if using the brackets around the "section" would be confusing b/c that's what previously held frame length in the zframe format.

I could add some extra "if userid == UNKNOWN output "unknown"" kind of logic as well, but left that out as I felt it was unnecessary.

This is built on top of #3701. Mostly just posting so we can debate output format.

@garlick
Copy link
Member

garlick commented Jun 24, 2021

Oh thanks for giving something concrete to look at. That makes it much easier to improve.

Here's a few suggestions:

  • drop the message type since the > prefix already indicates a request
  • drop the bracketed fields since they don't add much information
  • decode values where possible (flags, special userid, special rolemask, special nodeid
>|id1|
>foo.bar
>payload=...
>flags=topic,payload userid=unknown rolemask=none nodeid=any matchtag=none

On the payload, maybe use a heuristic so we get something human readable that doesn't exceed some max line width (with ... indicating truncation?)

Edit: swapping payload and proto order might be useful, so that payload max line width can exceed 80 cols and not break up the flow too much. I don't thnk the frame ordering on the wire necessarily needs to be represented here.

@chu11 chu11 force-pushed the issue3617_flux_msg_fprint branch from 7ae5a60 to 532bc5f Compare June 25, 2021 01:12
@chu11
Copy link
Member Author

chu11 commented Jun 25, 2021

Re-pushed with changes per suggestions per @garlick comments above. So here's some example outputs from some extra coverage tests I added. Perhaps its just me, but I like the "header" to be at the top of the debugging, so moved the "proto frame" stuff to the top and payload is at the bottom since it will most often be the longest.

> flags=topic,payload,route userid=unknown rolemask=none nodeid=any matchtag=0                                                              
> |id1|                                                                                                                                     
> topic=foo.bar                                                                                                                             
> size=8 7878787878787878                                                                                                                   
> flags= userid=42 rolemask=owner nodeid=42 matchtag=0                                                                                      
> flags=payload userid=unknown rolemask=none nodeid=any matchtag=0                                                                          
> size=52 6162636465666768696A6B6C6D6E6F707172737475767778797A41424344454647...                                                             

@garlick
Copy link
Member

garlick commented Jun 25, 2021

Looking better!

I think we could probably drop the payload size.

Also, if the payload is JSON, any chance of seeing it as JSON text?

@chu11
Copy link
Member Author

chu11 commented Jun 25, 2021

Also, if the payload is JSON, any chance of seeing it as JSON text?

oh good idea!

@chu11 chu11 force-pushed the issue3617_flux_msg_fprint branch from 532bc5f to 82e6ce6 Compare June 25, 2021 04:35
@chu11
Copy link
Member Author

chu11 commented Jun 25, 2021

Also, if the payload is JSON, any chance of seeing it as JSON text?

Oh actually, this was already taken care of. The problem was flux_msg_get_string() wasn't working in the tests b/c the NUL byte wasn't written by the payload in the tests. Added a extra test for more coverage:

--------------------------------------
> flags=payload userid=42 rolemask=owner nodeid=42 matchtag=0
> a.special.payload

@chu11 chu11 force-pushed the issue3617_flux_msg_fprint branch from 82e6ce6 to c8e87be Compare June 25, 2021 04:37
@chu11
Copy link
Member Author

chu11 commented Jun 25, 2021

re-pushed, added some tests, removed the size output per comment above.

Aslo added a dumb commit to allow a "verbose" option to ./test_message.t, so its easier to debug the fprint function.

@chu11 chu11 force-pushed the issue3617_flux_msg_fprint branch 3 times, most recently from ccccd23 to 6215b6a Compare June 27, 2021 19:23
@chu11 chu11 changed the title [WIP] libflux: make flux_msg_fprint() output clearer libflux: make flux_msg_fprint() output clearer Jun 28, 2021
@garlick garlick added this to the flux-core v0.28.0 milestone Jun 28, 2021
@chu11 chu11 force-pushed the issue3617_flux_msg_fprint branch 4 times, most recently from 4434f47 to 16b1166 Compare June 28, 2021 23:23
@garlick
Copy link
Member

garlick commented Jun 28, 2021

Quite an improvement:

$ FLUX_HANDLE_TRACE=1 flux ping kvs
--------------------------------------
> flags=topic,payload,route userid=unknown rolemask=none nodeid=any matchtag=1
> ||
> topic=kvs.ping
> {"seq":0,"time.tv_sec":39398,"time.tv_nsec":232011714,"pad":""}
--------------------------------------
< flags=topic,payload,route userid=5588 rolemask=owner errnum=0 matchtag=1
< ||
< topic=kvs.ping
< {"seq": 0, "time.tv_sec": 39398, "time.tv_nsec": 232011714, "pad": "", "route": "abc0bdc0!ad93d6aa!0de7a32e!80f43a27", "userid": 5588, "rolemask": 1}

Would it make it slightly more clear if the topic string were the first thing printed, without the topic= prefix? So like:

> kvs.ping
> flags=topic,payload,route userid=unknown rolemask=none nodeid=any matchtag=1
> ||
> {"seq":0,"time.tv_sec":39398,"time.tv_nsec":232011714,"pad":""}
--------------------------------------
< kvs.ping
< flags=topic,payload,route userid=5588 rolemask=owner errnum=0 matchtag=1
< ||
< {"seq": 0, "time.tv_sec": 39398, "time.tv_nsec": 232011714, "pad": "", "route": "abc0bdc0!ad93d6aa!0de7a32e!80f43a27", "userid": 5588, "rolemask": 1}

Also, if there are no routes should just omit the line with the ||?

@chu11
Copy link
Member Author

chu11 commented Jun 28, 2021

Would it make it slightly more clear if the topic string were the first thing printed, without the topic= prefix? So like:

ahh, yeah, I think that makes more sense. I was thinking more "header goes first", but people tend to identify based on the topic.

Also, if there are no routes should just omit the line with the ||?

Oh yeah, that makes sense.

Copy link
Member

@garlick garlick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with this with or without the changes I proposed, so marking approved.
Feel free to set MWP when you're done.

@chu11 chu11 force-pushed the issue3617_flux_msg_fprint branch from 16b1166 to d2e77e8 Compare June 29, 2021 00:01
@chu11
Copy link
Member Author

chu11 commented Jun 29, 2021

re-pushed per comments above. Re-ordered the output and simply changed a if (hops >=0) to if (hops > 0) to omit the no-routes case.

--------------------------------------
> kvs.ping
> flags=topic,payload,route userid=unknown rolemask=none nodeid=any matchtag=1
> {"seq":0,"time.tv_sec":893526,"time.tv_nsec":950086527,"pad":""}
--------------------------------------
< kvs.ping
< flags=topic,payload,route userid=8556 rolemask=owner errnum=0 matchtag=1
< {"seq": 0, "time.tv_sec": 893526, "time.tv_nsec": 950086527, "pad": "", "route": "fd3fe594!2dd5baad!2fb23123!a0c4d6e7", "userid": 8556, "rolemask": 1}

Problem: It can be difficult to test/debug flux_msg_fprint tests
becauses tests default to routing all output to /dev/null.

Solution: Support a simple verbose option in test/message.c to have
fprintf output go to stderr.
Problem: With zmsg_t being removed from the internals of the
flux_msg_t data structure (commit
cd0cbfb), flux_msg_fprint() is no
longer beholden to the output format from zframe_fprint().  A more
clear format for debugging can be output.

Solution: Update flux_msg_fprint() to output all fields in a more
clear format, with many fields output with key=value output and output
string representations of integer values when valuable.  In addition,
output portion of the payload up to a line limit.

Add additional unit tests for coverage.
@codecov
Copy link

codecov bot commented Jun 29, 2021

Codecov Report

Merging #3742 (a0d3dbb) into master (82cf75e) will increase coverage by 0.01%.
The diff coverage is 92.95%.

@@            Coverage Diff             @@
##           master    #3742      +/-   ##
==========================================
+ Coverage   83.25%   83.27%   +0.01%     
==========================================
  Files         334      334              
  Lines       50909    50967      +58     
==========================================
+ Hits        42385    42443      +58     
  Misses       8524     8524              
Impacted Files Coverage Δ
src/common/libflux/message.c 91.38% <92.95%> (-0.01%) ⬇️
src/broker/overlay.c 89.23% <0.00%> (+0.55%) ⬆️
src/modules/job-archive/job-archive.c 59.83% <0.00%> (+0.80%) ⬆️

@mergify mergify bot merged commit db6b2c2 into flux-framework:master Jun 29, 2021
@chu11 chu11 deleted the issue3617_flux_msg_fprint branch June 29, 2021 03:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants