-
Notifications
You must be signed in to change notification settings - Fork 107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add endpoint_dial_timeout_in_seconds to spec #249
Add endpoint_dial_timeout_in_seconds to spec #249
Conversation
|
||
endpoint_dial_timeout_in_seconds: | ||
description: | | ||
Maximum time in seconds for gorouter to establish a TCP connection with a backend. This timeout comes before `tls_handshake_timeout_in_seconds` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The setting is also used to wait for read response from WebSockets: https://github.com/cloudfoundry/gorouter/blob/65207dc5dafa659d343975216701fb2e29e3992a/proxy/handler/request_handler.go#L71 -> https://github.com/cloudfoundry/gorouter/blob/main/proxy/handler/forwarder.go#L40 -> https://github.com/cloudfoundry/gorouter/blob/main/proxy/utils/response_reader.go#L30
Should we also document that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's debate around changing that and making a new timeout specifically for websockets. I am planning on a another PR for that once we assessed the implications of such a change. Meanwhile we can add a statement that this also affects the websocket read timeout and then later remove it again..?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stefanlay I've added a statement for ws timeouts. We can remove it again in a follow-up PR.
A short explanation of the proposed change:
Add spec to make EndpointDialTimeout configurable. It is currently hard-coded at 5s.
An explanation of the use cases your change solves
Some of our customers have apps that need to restart frequently and have thousands of clients connected. After the restart all those clients try to reconnect at once which temporarily puts the app under pressure such that the underlying OS can't keep up with accepting connections quick enough to not hit the 5s timeout. Being able to increase the timeout gives the app more time to eventually accept all connections without disruption clients with a 502.
Instructions to functionally test the behavior change using operator interfaces (BOSH manifest, logs, curl, and metrics)
endpoint_dial_timeout_in_seconds
property which defaults to 5 as beforeExpected result after the change
dial timeout is now configurable but remains at 5s by default
Current result before the change
dial timeout is hard-coded at 5s
Links to any other associated PRs
Make EndpointDialTimeout configurable gorouter#302
I have viewed signed and have submitted the Contributor License Agreement
I have made this pull request to the
develop
branchI have run all the unit tests using
scripts/run-unit-tests-in-docker
(Optional) I have run Routing Acceptance Tests and Routing Smoke Tests on bosh lite
(Optional) I have run CF Acceptance Tests on bosh lite