Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

connector/local: add exponential backoff to connect retry count #1148

Merged
merged 4 commits into from Aug 13, 2017

Conversation

Projects
None yet
4 participants
@grondo
Copy link
Contributor

grondo commented Aug 10, 2017

As @garlick suggested in #677, add an exponential backoff to local connector connect retries, rename the variable to tune the retry count from FLUX_RETRY_COUNT to FLUX_LOCAL_CONNECTOR_RETRY_COUNT, and use this variable in t5000-valgrind.t to give up to 10 retries.

I tried to make the the default retry count of 5 still time out after ~500ms, so I made the starting sleep 16ms, which should sleep a total of 496ms in the default cause -- unless in my haste I did some math wrong.

grondo added some commits Aug 10, 2017

test/spellcheck: fix TAP output when test is skipped
When the spellcheck tests are skipped (due to missing aspell for
example), the skip TAP output is incorrect, or at least is deemed
incorrect by the tap-driver.sh automake TAP driver, which causes
the test to fail. Change the output to `1..0 # skip` to make
the test driver happy.
connector/local: rename connect retry count
Rename local connector retry count from FLUX_RETRY_COUNT to
FLUX_LOCAL_CONNECTOR_RETRY_COUNT ass suggested in issue #677.
connector/local: add exponential backoff during connect
Introduce an exponential backoff to the local connector's attempts
to connect to the flux instance api domain socket. The backoff starts
at 16ms and doubles up to 2s. This is so that the default retry count
of 5 still times out after about 500ms as in the current code.
t5000-valgrind.t: allow more connect retries in flux_open
Allow more retries in flux_open in local connector as suggested in #677.
@grondo

This comment has been minimized.

Copy link
Contributor Author

grondo commented Aug 10, 2017

Oh, oops I also added a fix for the spellcheck test, which isn't producing the skip TAP output correctly (at least according to tap-driver.sh. I can move that to another PR if needed.

@codecov-io

This comment has been minimized.

Copy link

codecov-io commented Aug 10, 2017

Codecov Report

Merging #1148 into master will decrease coverage by 0.05%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #1148      +/-   ##
==========================================
- Coverage   77.62%   77.56%   -0.06%     
==========================================
  Files         157      157              
  Lines       28680    28694      +14     
==========================================
- Hits        22262    22257       -5     
- Misses       6418     6437      +19
Impacted Files Coverage Δ
src/connectors/local/local.c 87.4% <100%> (-0.19%) ⬇️
src/common/libflux/info.c 75% <0%> (-2.78%) ⬇️
src/common/libcompat/info.c 65.82% <0%> (-2.54%) ⬇️
src/common/libflux/ev_flux.c 97.56% <0%> (-2.44%) ⬇️
src/modules/connector-local/local.c 74.08% <0%> (-1.47%) ⬇️
src/broker/modservice.c 79.61% <0%> (-0.98%) ⬇️
src/common/libkvs/kvs_watch.c 86.95% <0%> (-0.49%) ⬇️
src/common/libflux/dispatch.c 85.18% <0%> (-0.38%) ⬇️
src/common/libflux/reactor.c 93.18% <0%> (-0.29%) ⬇️
... and 9 more
@coveralls

This comment has been minimized.

Copy link

coveralls commented Aug 10, 2017

Coverage Status

Coverage decreased (-0.05%) to 77.92% when pulling 56b29b8 on grondo:t5000-valgrind-fix into 1665912 on flux-framework:master.

@garlick

This comment has been minimized.

Copy link
Member

garlick commented Aug 13, 2017

Looks great, thanks! IMHO no big deal having the doc test fix in here also.

@garlick garlick merged commit aeb6cfd into flux-framework:master Aug 13, 2017

4 checks passed

codecov/patch 100% of diff hit (target 77.62%)
Details
codecov/project Absolute coverage decreased by -0.05% but relative coverage increased by +22.37% compared to 1665912
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage decreased (-0.05%) to 77.92%
Details

@grondo grondo referenced this pull request Aug 23, 2017

Closed

0.8.0 Release #1160

@grondo grondo deleted the grondo:t5000-valgrind-fix branch Oct 25, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.