Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configurable MaxQueryTime and DefaultQueryTime #3777

Merged
merged 1 commit into from
Jan 17, 2020

Conversation

urbas
Copy link
Contributor

@urbas urbas commented Jan 2, 2018

This change moves constants maxQueryTime and defaultQueryTime in rpc.go into the configuration.

This provides support for use cases where we're okay with blocking queues that take longer than 10m. For example, AWS ELB allows 1h timeouts. This allows lowering the number of re-connections by orders of magnitude.

Configuring maxQueryTime is also useful for use cases where establishing new connections is at a premium (either because of power or data limitations).

@mkeeler mkeeler added the pr/needs-rebase PR needs to be rebased before merging label Jun 26, 2018
@pierresouchay
Copy link
Contributor

@urbas we are interested as well by this PR, could you rebase it?
Regards

@urbas
Copy link
Contributor Author

urbas commented Nov 27, 2018

@urbas we are interested as well by this PR, could you rebase it?
Regards

Done.

Copy link
Contributor

@pierresouchay pierresouchay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, and I think it is definitly a feature we would like to use.
I just wonder if we could put a max on the jitter time because if we set very high values, the jitter might be quite high (ex: 24h -> more than 1h30 of jitter)

} else if queryOpts.MaxQueryTime <= 0 {
queryOpts.MaxQueryTime = defaultQueryTime
if queryOpts.MaxQueryTime <= 0 {
queryOpts.MaxQueryTime = s.config.DefaultQueryTime
}

// Apply a small amount of jitter to the request.
queryOpts.MaxQueryTime += lib.RandomStagger(queryOpts.MaxQueryTime / jitterFraction)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If MaxQueryTime is set to a very big value, jitter might be quite high.
Limited for now to 37s, but if the value is very high (for instance 24h), then value might be quite high... do you think we might put a max on it ? (such as 60s)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pierresouchay: do you mind if I add the jitter feature in another pull request? The config is changing a lot and rebasing causes conflicts that are tedious to resolve. It would help if I kept the PRs small.

@urbas urbas force-pushed the master branch 2 times, most recently from 124e045 to 4d8fc44 Compare November 29, 2018 19:42
@pierresouchay
Copy link
Contributor

I am not owner of repository, it was just a remark ;)

@hanshasselberg
Copy link
Member

@urbas this sounds like a good idea to me. are you still interested in making this change? And if so, could you update your PR?

@hanshasselberg hanshasselberg added waiting-reply Waiting on response from Original Poster or another individual in the thread and removed pr/needs-rebase PR needs to be rebased before merging labels Dec 12, 2019
@stale stale bot removed the waiting-reply Waiting on response from Original Poster or another individual in the thread label Dec 12, 2019
@hanshasselberg hanshasselberg self-assigned this Dec 12, 2019
@urbas urbas force-pushed the master branch 3 times, most recently from 71a018b to 15e7c5c Compare December 14, 2019 21:12
@urbas
Copy link
Contributor Author

urbas commented Dec 14, 2019

@urbas this sounds like a good idea to me. are you still interested in making this change? And if so, could you update your PR?

Done. However, a unit test failed. It looks unrelated to my change. The test is waiting max 7 seconds for the secondary DC to get CA roots from the primary. This might not be enough time for the transition to happen. Note that the timeout of 7 seconds comes from sdk/testutil/retry/retry.go which is unrelated to my change.

Copy link
Member

@hanshasselberg hanshasselberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your work! This looks good, I noticed that the docs are missing on https://www.consul.io/docs/agent/options.html. Could you also add a test that proves your configuration is working?

@@ -512,8 +512,6 @@ func TestLeader_SecondaryCA_TransitionFromPrimary(t *testing.T) {
func TestLeader_SecondaryCA_UpgradeBeforePrimary(t *testing.T) {
t.Parallel()

maxRootsQueryTime = 500 * time.Millisecond
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is still needed, because it changes how long the roots endpoint is blocking before it returns in the test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added it to the server configurations. Hopefully that does the trick.

@hanshasselberg hanshasselberg added the waiting-reply Waiting on response from Original Poster or another individual in the thread label Dec 17, 2019
@hanshasselberg
Copy link
Member

Any updates?

@stale stale bot removed the waiting-reply Waiting on response from Original Poster or another individual in the thread label Jan 13, 2020
@hanshasselberg hanshasselberg added the waiting-reply Waiting on response from Original Poster or another individual in the thread label Jan 13, 2020
@stale stale bot removed the waiting-reply Waiting on response from Original Poster or another individual in the thread label Jan 14, 2020
@hanshasselberg
Copy link
Member

Thank you for updating your PR. I think the only thing that is missing now is the documentation for the new config options in https://www.consul.io/docs/agent/options.html.

@hanshasselberg hanshasselberg added the waiting-reply Waiting on response from Original Poster or another individual in the thread label Jan 16, 2020
Updated tests to use max and default query time.
@stale stale bot removed the waiting-reply Waiting on response from Original Poster or another individual in the thread label Jan 16, 2020
@codecov
Copy link

codecov bot commented Jan 16, 2020

Codecov Report

Merging #3777 into master will decrease coverage by 0.04%.
The diff coverage is 85.71%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3777      +/-   ##
==========================================
- Coverage   65.63%   65.58%   -0.05%     
==========================================
  Files         443      443              
  Lines       53307    53317      +10     
==========================================
- Hits        34988    34968      -20     
- Misses      14094    14116      +22     
- Partials     4225     4233       +8
Impacted Files Coverage Δ
agent/config/config.go 87.69% <ø> (ø) ⬆️
agent/config/runtime.go 92.26% <ø> (ø) ⬆️
agent/config/default.go 91.57% <100%> (+0.08%) ⬆️
agent/config/builder.go 83.33% <100%> (+0.02%) ⬆️
agent/config/flags.go 94.8% <100%> (+0.13%) ⬆️
agent/agent.go 67.97% <100%> (+0.02%) ⬆️
agent/consul/config.go 89.71% <100%> (+0.19%) ⬆️
agent/consul/leader_connect.go 70.15% <100%> (ø) ⬆️
agent/consul/rpc.go 79.32% <33.33%> (+0.75%) ⬆️
api/session.go 70.68% <0%> (-5.18%) ⬇️
... and 12 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f03153f...91b9405. Read the comment docs.

Copy link
Member

@hanshasselberg hanshasselberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your work, this looks great!

@hanshasselberg hanshasselberg merged commit ce02335 into hashicorp:master Jan 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants