Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

broker: make tbon.connect_timeout configurable #5140

Merged
merged 7 commits into from
May 3, 2023

Conversation

garlick
Copy link
Member

@garlick garlick commented May 2, 2023

Problem: there is no way to tune the ZMQ_CONNECT_TIMEOUT socket option.

Allow the connect timeout to be set either in the config:

[tbon]
connect_timeout = "10s"

or on the broker command line: -Stbon.connect_timeout=10s.

The broker command line takes precedence.

Set a default of 30s instead of the system default of "a long time" per zmq_setsockopt(3).

Fixes #5134

@garlick garlick added this to the flux-core v0.50.0 milestone May 2, 2023
Copy link
Contributor

@grondo grondo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@garlick
Copy link
Member Author

garlick commented May 3, 2023

Thanks! I'll set MWP.

Problem: connect timeout will require nearly replicating the
overlay_configure_tcp_user_timeout() function.

Restructure the function so it can be used for configuring
tcp_user_timeout and, in the future, connect_timeout.
Problem: there is no way to tune the ZMQ_CONNECT_TIMEOUT
socket option.

Allow the connect timeout to be set either in the config:
  [tbon]
  connect_timeout = "10s"

or on the broker command line:
  -Stbon.connect_timeout=10s

The broker command line takes precedence.

Set a default of 30s instead of the system default of "a long time"
per zmq_setsockopt(3).

Fixes flux-framework#5134
Problem: there is no test coverage for the tbon.connect_timeout
broker attribute/config parameter.

Add tests to t0013-config-file.t.
Problem: the tbon.connect_timeout broker attribute is
undocumented.

Add it to the man page.
Problem: the tbon.connect_timeout config parameter is not documented.

Add it to the man page.
Problem: overlay_parse_timeout() checks the duration result from
fsd_parse_duration() for <= 0  but fsd_parse_duration() cannot
return a value less than zero, and zero itself is documented as
"use system default".

Remove the check for <= 0.
Problem: no tests show that tbon.connect_timeout can be set to 0,
which means "use the system default".

Add tests.
@codecov
Copy link

codecov bot commented May 3, 2023

Codecov Report

Merging #5140 (e675458) into master (c37a8d1) will increase coverage by 0.01%.
The diff coverage is 91.30%.

@@            Coverage Diff             @@
##           master    #5140      +/-   ##
==========================================
+ Coverage   83.10%   83.12%   +0.01%     
==========================================
  Files         453      453              
  Lines       77742    77750       +8     
==========================================
+ Hits        64608    64630      +22     
+ Misses      13134    13120      -14     
Impacted Files Coverage Δ
src/broker/overlay.c 84.57% <91.30%> (-0.27%) ⬇️

... and 11 files with indirect coverage changes

@mergify mergify bot merged commit e79b7fd into flux-framework:master May 3, 2023
31 checks passed
@vsoch
Copy link
Member

vsoch commented May 3, 2023

Holy cow that was so fast!

Will this go into 0.50.0?

@grondo
Copy link
Contributor

grondo commented May 3, 2023

Yes, it made it in. Github actions is being slow so 0.50.0 is delayed until tomorrow morning.

@garlick garlick deleted the issue#5134 branch March 1, 2024 14:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

broker connect timeout should be tunable
3 participants