proxy: Attempt to reconnect on connection errors #1440

vishvananda · 2014-09-25T00:02:17Z

This modifies the proxy dial command to retry if it hits an
error. Previously it would pass a timeout, but not retry if there
was a connection error of some sort. It also adds a similar retry
for the listening socket.


thockin · 2014-09-25T00:24:05Z

pkg/proxy/proxier.go

+		remaining := endTime.Sub(time.Now())
+		outConn, err := net.DialTimeout(network, address, remaining)
+		if err != nil {
+			if endTime.After(time.Now()) {


Could you reorg this as

    if now.After(end) {
        return err
    }
    retry
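The reorganization suggested above could look roughly like the following. This is a minimal sketch, not the code from the PR: `dialUntilDeadline`, `dialFunc`, `failThenConnect`, `retryPause`, `demoTimeout`, and `errDeadline` are hypothetical names, and the pause between attempts is an assumption because the diff excerpt does not show one. The dialer is injected so the loop can be exercised without a network.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"time"
)

// dialFunc has the same signature as net.DialTimeout, so a fake can stand in for it.
type dialFunc func(network, address string, timeout time.Duration) (net.Conn, error)

// Assumed values for illustration; the diff excerpt shows neither a pause nor a timeout.
const (
	retryPause  = 10 * time.Millisecond
	demoTimeout = 200 * time.Millisecond
)

var errDeadline = errors.New("dial deadline exceeded")

// dialUntilDeadline retries dial until it succeeds or timeout has elapsed.
func dialUntilDeadline(dial dialFunc, network, address string, timeout time.Duration) (net.Conn, error) {
	endTime := time.Now().Add(timeout)
	for {
		remaining := time.Until(endTime)
		if remaining <= 0 {
			// Never hand a zero or negative timeout to the dialer.
			return nil, errDeadline
		}
		outConn, err := dial(network, address, remaining)
		if err == nil {
			return outConn, nil
		}
		// Early return on the failure path, as the review suggests.
		if time.Now().After(endTime) {
			return nil, err
		}
		time.Sleep(retryPause)
	}
}

// failThenConnect is a hypothetical fake dialer: it fails n times, then
// returns one end of an in-memory pipe.
func failThenConnect(n int) dialFunc {
	return func(network, address string, timeout time.Duration) (net.Conn, error) {
		if n > 0 {
			n--
			return nil, errors.New("connection refused")
		}
		client, server := net.Pipe()
		server.Close()
		return client, nil
	}
}

func main() {
	conn, err := dialUntilDeadline(failThenConnect(2), "tcp", "10.0.0.1:80", demoTimeout)
	if err != nil {
		fmt.Println("dial failed:", err)
		return
	}
	defer conn.Close()
	fmt.Println("connected after retries")
}
```

In real use, `net.DialTimeout` can be passed directly as the `dial` argument since it matches `dialFunc`.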

thockin · 2014-09-25T00:30:14Z

Can you explain what the motivation for this is? Did you find a place where retrying a dial made a difference in case of a non-timeout error? I'd almost rather immediately fall back on the next endpoint if a dial fails...

vishvananda · 2014-09-25T00:44:16Z

So both of these issues might only apply to my usage of proxier, but I will explain the rationale. If the proxy is dynamically constructed, then it is possible that the receiving service has not finished being set up yet, so retrying allows it to hold the connection open while it tries. Failing over to the next item is fine if there is another one in the list. Perhaps I could move the retry so that it cycles through the list of backends trying to find one it can connect to until the timeout is reached? The listen side is because the listen can fail if the socket has not yet been released by a previous listener, or if networking has not finished initializing yet (this happens in the case of new network namespaces).

The listen can be retried by the caller so it isn't necessarily needed, although currently the failure will be eaten by OnUpdate. The dial is initiated by someone connecting to the proxy so doing what we can to find a connection before sending an RST to the caller seems preferable.

That said, I'm using proxy in a non-standard way, so if either of these things doesn't seem appropriate to the usage in Kubernetes feel free to say so.
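The listen-side retry described in this comment might be sketched as follows. This is an illustration under stated assumptions, not the PR's code: `listenWithRetry`, `listenFunc`, `busyThenFree`, `stubListener`, `listenPause`, and `demoListenTimeout` are hypothetical names, and the pause and timeout values are made up. The listen function is injected so the sketch runs without binding a real socket.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"time"
)

// listenFunc has the same signature as net.Listen.
type listenFunc func(network, address string) (net.Listener, error)

// Assumed values for illustration only.
const (
	listenPause       = 20 * time.Millisecond
	demoListenTimeout = 200 * time.Millisecond
)

// listenWithRetry keeps calling listen until it succeeds or timeout elapses,
// covering the case where a previous listener has not released the socket yet.
func listenWithRetry(listen listenFunc, network, address string, timeout time.Duration) (net.Listener, error) {
	endTime := time.Now().Add(timeout)
	for {
		l, err := listen(network, address)
		if err == nil {
			return l, nil
		}
		if time.Now().After(endTime) {
			return nil, fmt.Errorf("listen %s %s: %w", network, address, err)
		}
		time.Sleep(listenPause)
	}
}

// stubListener is a do-nothing net.Listener used only by the fake below.
type stubListener struct{}

func (stubListener) Accept() (net.Conn, error) { return nil, errors.New("stub listener") }
func (stubListener) Close() error              { return nil }
func (stubListener) Addr() net.Addr            { return &net.TCPAddr{} }

// busyThenFree is a hypothetical fake: it reports the address as in use
// n times, then succeeds.
func busyThenFree(n int) listenFunc {
	return func(network, address string) (net.Listener, error) {
		if n > 0 {
			n--
			return nil, errors.New("address already in use")
		}
		return stubListener{}, nil
	}
}

func main() {
	l, err := listenWithRetry(busyThenFree(3), "tcp", "127.0.0.1:8080", demoListenTimeout)
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer l.Close()
	fmt.Println("listening after retries")
}
```

`net.Listen` matches `listenFunc` and can be passed in place of the fake. As the comment notes, the caller could also own this retry instead of the proxy.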

thockin · 2014-09-25T03:10:59Z

I can get behind making dial more robust, for sure. I'm less convinced about listen. Can we maybe break this into two parts?

Doing dial in a more robust way would be great, something like

Retry for up to N seconds on a single backend before giving up and trying the next backend, with a total timeout of M seconds. E.g.

Try backend 1, fail, sleep 100ms, try again, for up to 1 second total; then try the next backend with the same pattern. If a total timeout of 5 seconds elapses, give up.

The numbers can be experimented with - good logging will help.
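The per-backend retry with an overall budget proposed in this comment could be sketched like this. It is one possible reading of the proposal, not an implementation from the Kubernetes tree: `dialBackends`, `retryPolicy`, `demoPolicy`, `reachableOnly`, `earlier`, and `errNoBackend` are hypothetical names, and `demoPolicy` scales the 100ms / 1s / 5s example down so the sketch finishes quickly.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"net"
	"time"
)

// dialFunc has the same signature as net.DialTimeout.
type dialFunc func(network, address string, timeout time.Duration) (net.Conn, error)

// retryPolicy holds the knobs from the comment; field names are hypothetical.
type retryPolicy struct {
	pause      time.Duration // sleep between attempts on one backend
	perBackend time.Duration // N: budget for a single backend
	total      time.Duration // M: budget across all backends
}

// demoPolicy is a scaled-down stand-in for the 100ms / 1s / 5s example.
var demoPolicy = retryPolicy{
	pause:      10 * time.Millisecond,
	perBackend: 50 * time.Millisecond,
	total:      250 * time.Millisecond,
}

var errNoBackend = errors.New("no backend reachable within the total timeout")

// earlier returns whichever of the two instants comes first.
func earlier(a, b time.Time) time.Time {
	if a.Before(b) {
		return a
	}
	return b
}

// dialBackends retries each backend until its own budget is spent, then moves
// to the next one, and stops making attempts once the total budget is gone.
func dialBackends(dial dialFunc, backends []string, p retryPolicy) (net.Conn, string, error) {
	totalEnd := time.Now().Add(p.total)
	lastErr := errNoBackend
	for _, addr := range backends {
		backendEnd := earlier(time.Now().Add(p.perBackend), totalEnd)
		for {
			remaining := time.Until(backendEnd)
			if remaining <= 0 {
				break // this backend's budget is spent; try the next one
			}
			conn, err := dial("tcp", addr, remaining)
			if err == nil {
				return conn, addr, nil
			}
			lastErr = err
			log.Printf("dial %s failed, will retry: %v", addr, err)
			pause := p.pause
			if left := time.Until(backendEnd); left < pause {
				pause = left // do not sleep past the backend deadline
			}
			time.Sleep(pause)
		}
	}
	return nil, "", fmt.Errorf("all backends failed: %w", lastErr)
}

// reachableOnly is a hypothetical fake dialer that accepts a single address.
func reachableOnly(good string) dialFunc {
	return func(network, address string, timeout time.Duration) (net.Conn, error) {
		if address != good {
			return nil, errors.New("connection refused")
		}
		client, server := net.Pipe()
		server.Close()
		return client, nil
	}
}

func main() {
	backends := []string{"10.0.0.1:80", "10.0.0.2:80"}
	conn, addr, err := dialBackends(reachableOnly("10.0.0.2:80"), backends, demoPolicy)
	if err != nil {
		fmt.Println("dial failed:", err)
		return
	}
	defer conn.Close()
	fmt.Println("connected to", addr)
}
```

Logging each failed attempt follows the note that good logging will help when tuning the numbers.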

brendandburns · 2014-10-06T18:08:23Z

@vishvananda any thoughts on this?

thockin · 2014-10-20T20:42:36Z

Closing this for now. @vishvananda if you come back to this, please re-open.


proxy: Attempt to reconnect on connection errors

4a53711


vishvananda force-pushed the timeouts branch from cbfb460 to 4a53711 Compare September 25, 2014 00:03

thockin reviewed Sep 25, 2014

brendandburns assigned thockin Sep 25, 2014

jbeda force-pushed the master branch from 89ee618 to f61d434 Compare October 16, 2014 23:44

thockin closed this Oct 20, 2014

tkashem pushed a commit to tkashem/kubernetes that referenced this pull request Dec 15, 2022

Merge pull request kubernetes#1440 from benluddy/release-4.13-routev1…

0003605

…-admission-defaults OCPBUGS-4658: Apply shared defaulters to CRD-based routes.
