net/http: make Transport's idle connection management aware of DNS changes? #23427

Open
szuecs opened this Issue Jan 12, 2018 · 5 comments

szuecs commented Jan 12, 2018

What version of Go are you using (go version)?

go version go1.9 linux/amd64

and

go version go1.9.2 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

% go env
GOARCH="amd64"
GOBIN="/home/sszuecs/go/bin"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/sszuecs/go"
GORACE=""
GOROOT="/usr/share/go"
GOTOOLDIR="/usr/share/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build505089582=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"

What did you do?

I run go run main.go and then edit /etc/hosts to point www.google.de at 127.0.0.1.
Depending on the order of the changes, the transport either fails to switch to the new address after IdleConnTimeout or never queries DNS again. A new lookup is attempted if you first point the name at 127.0.0.1 and then comment out the entry in /etc/hosts. The problem is that if you want to switch your target load balancers via DNS, the switch never takes effect. The workaround, commented out in the code, reliably performs the DNS failover. The cause seems to be that IdleConnTimeout is larger than the time.Sleep duration in the code; you can change either value to see that the failover then works. For an edge proxy with a high request rate, the case IdleConnTimeout < time-until-next-request never occurs.

package main

import (
	"log"
	"net"
	"net/http"
	"time"
)

func main() {

	tr := &http.Transport{
		DialContext: (&net.Dialer{
			Timeout:   5 * time.Second,
			KeepAlive: 30 * time.Second,
			DualStack: true,
		}).DialContext,
		TLSHandshakeTimeout: 5 * time.Second,
		IdleConnTimeout:     5 * time.Second,
	}

	go func(rt http.RoundTripper) {
		for {
			time.Sleep(1 * time.Second)

			req, err := http.NewRequest("GET", "https://www.google.de/", nil)
			if err != nil {
				log.Printf("Failed to create request: %v", err)
				continue
			}

			resp, err := rt.RoundTrip(req)
			if err != nil {
				log.Printf("Failed to do roundtrip: %v", err)
				continue
			}
			//io.Copy(ioutil.Discard, resp.Body)
			resp.Body.Close()
			log.Printf("resp status: %v", resp.Status)
		}
	}(tr)

	// FIX:
	// go func(transport *http.Transport) {
	// 	for {
	// 		time.Sleep(3 * time.Second)
	// 		transport.CloseIdleConnections()
	// 	}
	// }(tr)

	ch := make(chan struct{})
	<-ch
}

What did you expect to see?

I want IdleConnTimeout to reliably close idle connections, so that DNS is queried again for new connections, as the commented-out goroutine in the code does. We need to be able to shift traffic gradually.

What did you see instead?

If you start the application without the /etc/hosts entry and then add it, requests never fail, so no new DNS lookup is made.

@bradfitz bradfitz changed the title from http client and http.Transport IdleConnTimeout is not reliably doing DNS resolve to net/http: make Transport's idle connection management aware of DNS changes? Jan 12, 2018

@gopherbot gopherbot removed the NeedsFix label Jan 12, 2018

@bradfitz bradfitz added this to the Go1.11 milestone Jan 12, 2018

@bradfitz bradfitz added the NeedsFix label Jan 12, 2018

@bradfitz

Member

bradfitz commented Jan 12, 2018

@meirf

Member

meirf commented Feb 27, 2018

Possible Signals

If persistConn.conn is dialed via hostname:

  • TTL expiration. If TTL expires, take action. This would require an API change since net's TTL doesn't appear to be exposed.
  • Poll LookupIP. Store the initial IP set and update it with each polling response; if the IP set changes (or, more strictly, if the dialed IP is no longer in the set), take action. How frequently would we poll? Are we willing to store this extra state and spend time and system resources on polling, even with optimizations?

Action

  • Close/mark for closure. If the connection is idle, close it. If the connection is in use, mark it as not reusable. This would be similar to the current usage of persistConn.broken, though "broken" may be too harsh a term for this case. (The OP uses CloseIdleConnections, which "does not interrupt any connections currently in use", so calling CloseIdleConnections once could leave many connections on the old IPs.)

I guess the use of "?" in the title was due to implementation complexity/propriety of addressing this in the standard library. Please let me know if I've gotten something wrong or if there are other issues I'm missing.

@szuecs szuecs changed the title from net/http: make Transport's idle connection management aware of DNS changes? to net/http: make Transport's idle connection management aware of DNS changes Feb 27, 2018

@szuecs

szuecs commented Feb 27, 2018

@meirf For us CloseIdleConnections works pretty well. We use it in our internet-facing central ingress HTTP proxy, and it's fast enough to do the DNS lookup while we do DNS-based traffic switching (AWS Route 53 weighted DNS records). Using IdleConnTimeout to trigger CloseIdleConnections would not be a breaking change of the API. I bet it would be fine in 99.999% of cases, because applications, including proxies, are CPU bound most of the time.
My question would be: why expose additional data if you already have IdleConnTimeout?

To answer your question about the "?" in the title: I don't remember why. I deleted it from the title. I only dig into the Go source code from time to time, mostly in the net package. I guess I wanted the "?" to reflect that. I hope it's clearer now. :)

@meirf

Member

meirf commented Feb 28, 2018

tl;dr (1) Idle connections' awareness of dns changes appears to be absent from the API. (2) The easy part is forcing a redial via closure. (3) IMO the hard part is figuring out what signal the code should use to redial.

@szuecs
My first comment was directed to bradfitz, who set the title to "net/http: make Transport's idle connection management aware of DNS changes?".

Based on my reading of your most recent comment, you think idle connections' awareness of dns changes is already promised by the current API. I (and bradfitz based on his title) do not agree that idle connections' awareness of dns changes is already promised.

I think we agree that currently the only way to be aware of a DNS change is to force a redial by not having connections kept alive, but IdleConnTimeout won't force a keep-alive connection to be closed unless your situation meets its definition. If you have a keep-alive connection and it never becomes idle, by definition it will never close itself due to IdleConnTimeout. The cumulative amount of time a keep-alive connection is idle doesn't matter. The only way the keep-alive connection will be closed due to IdleConnTimeout is if it's idle for IdleConnTimeout consecutive/contiguous time.

  1. To prevent any connections from being kept alive you can set DisableKeepAlives.
  2. To increase the likelihood of a keep-alive connection being closed, you can decrease IdleConnTimeout.
  3. To close all currently idle keep-alive connections, you can call CloseIdleConnections.

It's clear that all these would be helpful to you in redialing and therefore connecting to the new ip. This indirectly results in dns awareness (and possibly a lot of inefficiency). But in terms of the standard library, there is no promise of direct dns awareness.

> If you would use IdleConnTimeout to call CloseIdleConnections would not do a breaking change of the API.

Briefly touching upon implementation, I don't think dns awareness in the standard library would use CloseIdleConnections since that blindly closes all keep-alive connections and I'd guess that we'd want dns change detection to be done per connection.

@szuecs szuecs changed the title from net/http: make Transport's idle connection management aware of DNS changes to net/http: make Transport's idle connection management aware of DNS changes? Feb 28, 2018

@szuecs

szuecs commented Feb 28, 2018

@meirf OK, I was not sure whether the questions were addressed to me. As far as I understand now, you were not asking me. :)
I put the "?" back in the title, because the original change was not made by me (I should have read the events).
Thanks for your comments!
