Clarify GCP TCP setting for `frontend_idle_timeout` #116

ljfranklin · 2018-08-11T00:17:17Z

We recently noticed occasional "connection reset by peer" errors when running test suites against a GCP TCP LB. It turns out GCP forcibly drops all idle TCP connections after 10 minutes: https://cloud.google.com/compute/docs/troubleshooting/general-tips#communicatewithinternet. With the default `router.frontend_idle_timeout` of 900 seconds, our app would come up successfully and open a keep-alive connection through the TCP LB to the gorouter, but the first request after the 10-minute mark would fail with "connection reset by peer". It looks like GCP cuts the connection without shutting down the keep-alive connection properly. So to use a GCP TCP LB you need to set `frontend_idle_timeout` to something less than 600 seconds. Setting this property to 60 seconds fixed the flakiness for us. However, for GCP **HTTP** LBs you still want a value over 600 seconds, as described here: https://blog.percy.io/tuning-nginx-behind-google-cloud-platform-http-s-load-balancer-305982ddb340. Fun times.
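To make the rule of thumb above concrete, here is a minimal Python sketch of a sanity check one might run against a deployment's settings. It is not part of routing-release: the function name, the `"tcp"`/`"http"` labels, and the constant name are all hypothetical, and the 600-second threshold is simply the 10-minute figure quoted above.

```python
# Hypothetical helper, not part of gorouter or routing-release.
# 600 seconds is the 10-minute GCP idle cutoff described in this PR.
GCP_IDLE_CUTOFF_SECONDS = 600


def frontend_idle_timeout_ok(lb_type: str, timeout_seconds: int) -> bool:
    """Return True if the timeout matches the guidance in this PR.

    lb_type is an illustrative label: "tcp" for a GCP TCP LB,
    "http" for a GCP HTTP(S) LB.
    """
    if lb_type == "tcp":
        # The router must close idle connections before GCP silently drops them.
        return timeout_seconds < GCP_IDLE_CUTOFF_SECONDS
    if lb_type == "http":
        # For HTTP LBs the guidance is the opposite: keep the value above 600.
        return timeout_seconds > GCP_IDLE_CUTOFF_SECONDS
    raise ValueError(f"unknown load balancer type: {lb_type!r}")
```

For example, `frontend_idle_timeout_ok("tcp", 900)` is `False` (the default that caused the resets), while `frontend_idle_timeout_ok("tcp", 60)` is `True` (the value that fixed the flakiness for us).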

cfdreddbot · 2018-08-11T00:17:18Z

Hey ljfranklin!

Thanks for submitting this pull request! I'm here to inform the recipients of the pull request that you and the commit authors have already signed the CLA.

cf-gitbot · 2018-08-11T00:17:18Z

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/159712351

The labels on this github issue will be updated when the story is started.

zachgersh · 2018-08-13T15:11:45Z

@ljfranklin cheers for the clarification!

cf-gitbot added the unscheduled label Aug 11, 2018

zachgersh merged commit f7c667b into cloudfoundry:develop Aug 13, 2018

cf-gitbot removed the unscheduled label Aug 13, 2018

cf-gitbot added delivered and removed delivered labels Aug 13, 2018
