Add endpoint_dial_timeout_in_seconds to spec #249

Merged

@domdom82 domdom82 commented Dec 14, 2021

  • A short explanation of the proposed change:
    Add a spec property to make EndpointDialTimeout configurable. It is currently hard-coded to 5s.

  • An explanation of the use cases your change solves
    Some of our customers have apps that need to restart frequently and have thousands of clients connected. After a restart, all those clients try to reconnect at once. This temporarily puts the app under so much pressure that the underlying OS cannot accept connections quickly enough to stay within the 5s timeout. Being able to increase the timeout gives the app more time to eventually accept all connections without disrupting clients with a 502.

  • Instructions to functionally test the behavior change using operator interfaces (BOSH manifest, logs, curl, and metrics)

  1. Merge companion PR in gorouter
  2. The BOSH manifest now allows setting an endpoint_dial_timeout_in_seconds property, which defaults to 5 as before:
(...)
  jobs:
  - name: gorouter
    properties:
      endpoint_dial_timeout_in_seconds: 10    # set to 10s in this example
      (...)
  3. Deploy routing-release with the new dial timeout
  • Expected result after the change
    dial timeout is now configurable but remains at 5s by default

  • Current result before the change
    dial timeout is hard-coded at 5s

  • Links to any other associated PRs
    Make EndpointDialTimeout configurable gorouter#302

  • I have viewed, signed, and submitted the Contributor License Agreement

  • I have made this pull request to the develop branch

  • I have run all the unit tests using scripts/run-unit-tests-in-docker

  • (Optional) I have run Routing Acceptance Tests and Routing Smoke Tests on bosh lite

  • (Optional) I have run CF Acceptance Tests on bosh lite

@domdom82 domdom82 marked this pull request as ready for review December 15, 2021 11:24
@ameowlia ameowlia assigned ameowlia and stefanlay and unassigned ameowlia Dec 16, 2021
@ameowlia ameowlia requested review from stefanlay and removed request for ameowlia December 16, 2021 16:38
@ameowlia ameowlia moved this from Inbox to Reviewer Assigned in DEPRECATED App Platform - Networking Dec 16, 2021

endpoint_dial_timeout_in_seconds:
  description: |
    Maximum time in seconds for gorouter to establish a TCP connection with a backend. This timeout comes before `tls_handshake_timeout_in_seconds`
There's debate around changing that and making a new timeout specifically for websockets. I am planning another PR for that once we have assessed the implications of such a change. Meanwhile, we can add a statement that this also affects the websocket read timeout and remove it again later?

@stefanlay I've added a statement for ws timeouts. We can remove it again in a follow-up PR.

@stefanlay stefanlay moved this from Reviewer Assigned to Review in Progress in DEPRECATED App Platform - Networking Dec 19, 2021
@stefanlay stefanlay merged commit 7df0d91 into cloudfoundry:develop Jan 12, 2022
DEPRECATED App Platform - Networking automation moved this from Review in Progress to Done Jan 12, 2022