Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

flux-relay: initialize log prefix to hostname when possible #4454

Merged
merged 1 commit into from
Aug 1, 2022

Conversation

grondo
Copy link
Contributor

@grondo grondo commented Aug 1, 2022

When the ssh connector fails to connect to a remote uri, but the ssh was successfully able to execute flux-relay, an error such as the following is displayed

flux-relay: local:///tmp/flux-xoZRkf/local-0: Permission denied

The flux-relay prefix probably isn't all that helpful for users, and may be confusing, since flux-relay was only indirectly run as part of the ssh connector. Instead it may be more useful to use the hostname as the command prefix. This way, the user is made aware of the host on which the local connection failed e.g

fluke89: local:///tmp/flux-xoZRkf/local-0: Permission denied

Copy link
Member

@garlick garlick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea! This is fine as is, although I did make some nitpicky comments about gethostname() that likely won't amount to much improvement.

* for end users that may be uknowningly using flux-relay as part of
* the ssh connector.
*/
if (gethostname (hostname, MAXHOSTNAMELEN) < 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By my read of gethostname(1) the length parameter should be the size of the buffer not the maximum string length, so use MAXHOSTNAMELEN + 1, or maybe better, sizeof (hostname) here?

Also, not sure if this is important but the man page references HOST_NAME_MAX not MAXHOSTNAME. (Might be an alias but the former is apparently the POSIX one). Looking around our code base we use both :-(

Another thought is to use "unknown-host" or similar as the fallback if gethostname() fails (very unlikely). Could just statically initialize hostname[] to that value and avoid the duplicate call to log_init().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great suggestions!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I did also see the HOST_NAME_MAX reference in the man page as well, but when I spot checked our codebase I saw use of MAXHOSTNAMELEN. Rather than fix it now, I chose MAXHOSTNAMELEN for consistency (though I guess I missed the uses of HOST_NAME_MAX)

Problem: Users almost always use flux-relay indirectly via the ssh
connector by specifying a ssh:// URI or a high-level URI which resolves
to an ssh URI. Therefore, the flux-relay log prefix on errors from
this utility may not be useful to users.

If possible, use the hostname as the log prefix. This will not only
remove the possibly confusing flux-relay prefix, but will allow
pinpointing the actual host on which the flux-relay is failing to,
for example, open a local connector socket. In the unlikely event that
gethostname(3) fails, then fall back to "uknown-host".
@grondo
Copy link
Contributor Author

grondo commented Aug 1, 2022

@garlick, I addressed your comments. I switched to HOST_NAME_MAX assuming that at some point we might want to standardize on that, though now reading POSIX - limits.h, I see that HOST_NAME_MAX is in the section "Runtime Invariant Values (Possibly Indeterminate)" which has the statement:

A definition of one of the symbolic constants in the following list shall be omitted from <limits.h> on specific implementations where the corresponding value is equal to or greater than the stated minimum, but is unspecified.

This indetermination might depend on the amount of available memory space on a specific instance of a specific implementation. The actual value supported by a specific instance shall be provided by the sysconf() function.

Therefore it seems like existence of HOST_NAME_MAX is not guaranteed. I suppose we could handle that in configure.

Copy link
Member

@garlick garlick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sheesh, they make a simple thing so complicated.
Oh well, this looks good to me!

@grondo
Copy link
Contributor Author

grondo commented Aug 1, 2022

Thanks! Setting MWP.

@codecov
Copy link

codecov bot commented Aug 1, 2022

Codecov Report

Merging #4454 (f7142ac) into master (4b0dd80) will increase coverage by 0.00%.
The diff coverage is 100.00%.

❗ Current head f7142ac differs from pull request most recent head bdce25e. Consider uploading reports for the commit bdce25e to get more accurate results

@@           Coverage Diff           @@
##           master    #4454   +/-   ##
=======================================
  Coverage   83.34%   83.34%           
=======================================
  Files         400      400           
  Lines       67258    67226   -32     
=======================================
- Hits        56055    56032   -23     
+ Misses      11203    11194    -9     
Impacted Files Coverage Δ
src/cmd/builtin/relay.c 80.88% <100.00%> (+0.88%) ⬆️
src/cmd/top/top.c 81.63% <0.00%> (-3.14%) ⬇️
src/common/libutil/uri.c 88.88% <0.00%> (-1.59%) ⬇️
src/cmd/top/joblist_pane.c 89.39% <0.00%> (-0.42%) ⬇️
src/broker/overlay.c 86.69% <0.00%> (-0.11%) ⬇️
src/common/libterminus/terminus.c 86.06% <0.00%> (+0.24%) ⬆️
src/common/libsubprocess/local.c 84.39% <0.00%> (+0.48%) ⬆️
src/modules/job-info/guest_watch.c 76.75% <0.00%> (+0.54%) ⬆️
src/modules/job-archive/job-archive.c 63.32% <0.00%> (+0.69%) ⬆️
src/common/libpmi/pmi2.c 89.87% <0.00%> (+1.26%) ⬆️
... and 1 more

@mergify mergify bot merged commit 51e584a into flux-framework:master Aug 1, 2022
@grondo grondo deleted the flux-relay-hostname branch August 1, 2022 21:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants