
net/http: make Transport's idle connection management aware of DNS changes? #23427

Open
szuecs opened this issue Jan 12, 2018 · 31 comments


@szuecs szuecs commented Jan 12, 2018

What version of Go are you using (go version)?

go version go1.9 linux/amd64

and

go version go1.9.2 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

% go env
GOARCH="amd64"
GOBIN="/home/sszuecs/go/bin"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/sszuecs/go"
GORACE=""
GOROOT="/usr/share/go"
GOTOOLDIR="/usr/share/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build505089582=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"

What did you do?

I run go run main.go and edit /etc/hosts to point www.google.de at 127.0.0.1.
Depending on the order of the changes, it either fails to pick up the DNS change after IdleConnTimeout or it never does a DNS lookup again. A new lookup does happen if you first point the name at 127.0.0.1 and afterwards comment out the entry in /etc/hosts. The problem is that if you want to switch your target load balancers via a DNS change, that switch never happens. The workaround is commented out in the code below and reliably performs the DNS failover. The cause seems to be that IdleConnTimeout is bigger than the time.Sleep duration in the code; you can change the sleep duration to confirm that this works. For an edge proxy with a high number of requests, the case IdleConnTimeout < time-until-the-next-request will never happen.

package main

import (
	"log"
	"net"
	"net/http"
	"time"
)

func main() {

	tr := &http.Transport{
		DialContext: (&net.Dialer{
			Timeout:   5 * time.Second,
			KeepAlive: 30 * time.Second,
			DualStack: true,
		}).DialContext,
		TLSHandshakeTimeout: 5 * time.Second,
		IdleConnTimeout:     5 * time.Second,
	}

	go func(rt http.RoundTripper) {
		for {
			time.Sleep(1 * time.Second)

			req, err := http.NewRequest("GET", "https://www.google.de/", nil)
			if err != nil {
				log.Printf("Failed to do request: %v", err)
				continue
			}

			resp, err := rt.RoundTrip(req)
			if err != nil {
				log.Printf("Failed to do roundtrip: %v", err)
				continue
			}
			//io.Copy(ioutil.Discard, resp.Body)
			resp.Body.Close()
			log.Printf("resp status: %v", resp.Status)
		}
	}(tr)

	// FIX:
	// go func(transport *http.Transport) {
	// 	for {
	// 		time.Sleep(3 * time.Second)
	// 		transport.CloseIdleConnections()
	// 	}
	// }(tr)

	ch := make(chan struct{})
	<-ch
}

What did you expect to see?

I want to see IdleConnTimeout reliably close idle connections, so that DNS is queried again for new connections, similar to the goroutine workaround in the code. We need to be able to gradually shift traffic.

What did you see instead?

If you start the application while the /etc/hosts entry is not set and then add it, the requests never fail, so no new DNS lookup is being made.

@bradfitz bradfitz changed the title http client and http.Transport IdleConnTimeout is not reliably doing DNS resolve net/http: make Transport's idle connection management aware of DNS changes? Jan 12, 2018
@gopherbot gopherbot removed the NeedsFix label Jan 12, 2018
@bradfitz bradfitz added this to the Go1.11 milestone Jan 12, 2018
@bradfitz bradfitz added the NeedsFix label Jan 12, 2018

@bradfitz bradfitz commented Jan 12, 2018


@meirf meirf commented Feb 27, 2018

Possible Signals

If persistConn.conn is dialed via hostname:

  • TTL expiration. If TTL expires, take action. This would require an API change since net's TTL doesn't appear to be exposed.
  • Poll LookupIP. Store the initial IP set and keep updating it with each new polling response; if the IP set changes (or, more strictly, if the dialed IP is no longer in the set), take action. How frequently would we poll? Are we willing to store this extra state and spend time and system resources polling, even with optimizations?

Action

  • Close/mark for closure. If the connection is idle, close it. If the connection is in use, mark it as not reusable. This would be similar to the current usage of persistConn.broken, though "broken" may be too harsh language for this case. (The OP uses CloseIdleConnections, which "does not interrupt any connections currently in use", so calling CloseIdleConnections once could leave many connections on the old IPs.)

I guess the use of "?" in the title reflects uncertainty about the implementation complexity and the propriety of addressing this in the standard library. Please let me know if I've gotten something wrong or if there are other issues I'm missing.
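
To make the "Poll LookupIP" signal concrete, here is a rough userland sketch of the coarse version (watchDNS is an invented name, and it simply drops every idle connection whenever the resolved set changes; a per-connection check would need changes inside net/http itself):

import (
	"context"
	"net"
	"net/http"
	"sort"
	"strings"
	"time"
)

// watchDNS polls the resolver and calls tr.CloseIdleConnections whenever the
// resolved IP set for host changes.
func watchDNS(ctx context.Context, tr *http.Transport, host string, every time.Duration) {
	var last string
	ticker := time.NewTicker(every)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
		}
		addrs, err := net.DefaultResolver.LookupIPAddr(ctx, host)
		if err != nil {
			continue // transient resolver error: keep the current pool
		}
		ips := make([]string, 0, len(addrs))
		for _, a := range addrs {
			ips = append(ips, a.IP.String())
		}
		sort.Strings(ips)
		if cur := strings.Join(ips, ","); cur != last {
			if last != "" {
				tr.CloseIdleConnections()
			}
			last = cur
		}
	}
}

Started as go watchDNS(ctx, tr, "www.google.de", 30*time.Second), this answers the "how frequently" question with a fixed interval rather than the record's TTL.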

@szuecs szuecs changed the title net/http: make Transport's idle connection management aware of DNS changes? net/http: make Transport's idle connection management aware of DNS changes Feb 27, 2018

@szuecs szuecs commented Feb 27, 2018

@meirf For us CloseIdleConnections works pretty well; we use it in our internet-facing central ingress HTTP proxy, and it is fast enough at redoing the DNS lookup while we do DNS-based traffic switching (AWS Route 53 weighted DNS records). Using IdleConnTimeout as the trigger for CloseIdleConnections would not be a breaking API change. I bet it would be fine in 99.999% of cases, because applications, proxies included, are CPU bound most of the time.
My question would be: why expose additional data if you already have IdleConnTimeout?

To answer your question about "?" in the title: I don't remember why. I deleted it from the title. I only dig into the Go source code from time to time, mostly in the net package; I guess the "?" was meant to reflect that. I hope it's clearer now. :)


@meirf meirf commented Feb 28, 2018

tl;dr (1) Idle connections' awareness of dns changes appears to be absent from the API. (2) The easy part is forcing a redial via closure. (3) IMO the hard part is figuring out what signal the code should use to redial.

@szuecs
My first comment was directed to bradfitz, who set the title to "net/http: make Transport's idle connection management aware of DNS changes?".

Based on my reading of your most recent comment, you think idle connections' awareness of dns changes is already promised by the current API. I (and bradfitz based on his title) do not agree that idle connections' awareness of dns changes is already promised.

I think we agree that currently the only way to become aware of a DNS change is to force a redial by not keeping connections alive, but IdleConnTimeout won't force a keep-alive connection to be closed unless your situation meets its definition. If you have a keep-alive connection and it never becomes idle, by definition it will never be closed due to IdleConnTimeout. The cumulative amount of time a keep-alive connection is idle doesn't matter; the only way it will be closed due to IdleConnTimeout is if it stays idle for IdleConnTimeout consecutive/contiguous time.

  1. To prevent any connections from being kept alive you can set DisableKeepAlives.
  2. To increase the likelihood of a keep-alive connection being closed, you can decrease IdleConnTimeout.
  3. To close all currently idle keep-alive connections, you can call CloseIdleConnections.

It's clear that all these would be helpful to you in redialing and therefore connecting to the new ip. This indirectly results in dns awareness (and possibly a lot of inefficiency). But in terms of the standard library, there is no promise of direct dns awareness.
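
For concreteness, the three options above map onto the existing Transport API roughly like this (the values are illustrative, not recommendations):

// (1) Never keep connections alive: every request redials and re-resolves.
tr1 := &http.Transport{DisableKeepAlives: true}

// (2) Keep reuse, but let idle connections age out quickly.
tr2 := &http.Transport{IdleConnTimeout: 10 * time.Second}

// (3) Keep the defaults, but drop whatever is idle at a moment of your choosing,
// e.g. from a ticker or when you know DNS has changed.
tr3 := &http.Transport{}
tr3.CloseIdleConnections()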

Using IdleConnTimeout as the trigger for CloseIdleConnections would not be a breaking API change.

Briefly touching upon implementation, I don't think dns awareness in the standard library would use CloseIdleConnections since that blindly closes all keep-alive connections and I'd guess that we'd want dns change detection to be done per connection.

@szuecs szuecs changed the title net/http: make Transport's idle connection management aware of DNS changes net/http: make Transport's idle connection management aware of DNS changes? Feb 28, 2018

@szuecs szuecs commented Feb 28, 2018

@meirf OK, I was not sure whether the questions were directed at me. As I understand it now, they were not. :)
I put the "?" back in the title, because the change was not made by me (I should have read the events).
Thanks for your comments!


@dragonsinth dragonsinth commented May 22, 2020

Are you willing to take any patches on this issue? This turns out to be a shockingly cross-cutting problem for which the workarounds are neither great nor complete, and the default HTTP transport basically guides you directly into it.

It doesn't seem that nutty to periodically re-resolve DNS for requested hostnames, and refuse to select an open conn whose IP is not in the current set for that hostname.


@rohitkeshwani07 rohitkeshwani07 commented Aug 17, 2020

Close/mark for closure. If the connection is idle, close it. If the connection is in use, mark it as not reusable. This would be similar to the current usage of persistConn.broken though "broken" may be too harsh language for this case. (The OP uses CloseIdleConnections, which "does not interrupt any connections currently in use", so calling CloseIdleConnections once could leave many connections on the old IPs.)

@meirf I think just implementing something like

    // Keep-alive connections are closed after this duration.
    //
    // By default connection duration is unlimited.
    MaxConnDuration time.Duration

will guarantee that my connections are not used beyond a minute, so my servers can assume that DNS propagation takes one minute over and above the TTL.

This would be easy to implement.
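
A rough approximation of such a MaxConnDuration is already possible from the outside by wrapping DialContext so that every connection is force-closed after a maximum age. This is only a sketch (maxLifetimeDialer and timedConn are invented names), and an in-flight request whose connection hits the deadline will simply fail and need a retry:

import (
	"context"
	"net"
	"time"
)

// timedConn force-closes itself via the attached timer after a maximum age.
type timedConn struct {
	net.Conn
	timer *time.Timer
}

func (c *timedConn) Close() error {
	c.timer.Stop()
	return c.Conn.Close()
}

// maxLifetimeDialer wraps a DialContext function and arms a timer that closes
// each returned connection maxAge after it was dialed.
func maxLifetimeDialer(dial func(ctx context.Context, network, addr string) (net.Conn, error),
	maxAge time.Duration) func(ctx context.Context, network, addr string) (net.Conn, error) {
	return func(ctx context.Context, network, addr string) (net.Conn, error) {
		c, err := dial(ctx, network, addr)
		if err != nil {
			return nil, err
		}
		return &timedConn{Conn: c, timer: time.AfterFunc(maxAge, func() { c.Close() })}, nil
	}
}

// Usage:
// d := &net.Dialer{Timeout: 5 * time.Second, KeepAlive: 30 * time.Second}
// tr := &http.Transport{DialContext: maxLifetimeDialer(d.DialContext, time.Minute)}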


@cnmcavoy cnmcavoy commented Sep 18, 2020

We hit this bug recently, and it's very problematic in the Kubernetes ecosystem. Kubernetes manages DNS via CoreDNS and updates service records based on the readiness of the underlying pods in a cluster. With this bug, however, workloads that do not recheck DNS and instead reuse an idle connection make requests to unhealthy pods, consume garbage data from those connections, and cause issues in the cluster.

I was surprised that http.Get did not have a one-to-one relationship with DNS requests in Go applications, given that Go does not do DNS caching. The current http.Transport behavior effectively caches DNS for applications that keep idle connections, and worse, caches it indefinitely. I expected Go to recheck DNS for every HTTP connection, even when it reuses an idle one.

Possible Signals

If persistConn.conn is dialed via hostname:

  • TTL expiration. If TTL expires, take action. This would require an API change since net's TTL doesn't appear to be exposed.

I agree with @meirf that using the TTL to reduce the number of DNS requests would be ideal, but if the TTL is not easy to use, then checking every time would still be fine with me.

Overall, this is very surprising behavior and not something that I can tune from the outside as a system administrator. My current workaround is to restart Go applications when DNS changes, but that is a brute-force solution and has many drawbacks.


@szuecs szuecs commented Sep 18, 2020

@cnmcavoy any HTTP connection pool has some kind of DNS caching, besides maybe Node.js (sorry for bashing it). I think it's wrong to query DNS more often than once in a while, because you want to have a connection open before the HTTP requests that trigger your calls arrive.
Feel free to change all your http.Client instances to use either the Transport replacement https://pkg.go.dev/github.com/zalando/skipper@v0.11.151/net#NewTransport
or the Client https://pkg.go.dev/github.com/zalando/skipper@v0.11.151/net#Client
Both are API-compatible (best-effort compatibility) drop-in replacements with additional features you might want anyway in Kubernetes.


@cnmcavoy cnmcavoy commented Sep 18, 2020

I think it's wrong to query DNS more often than once in a while

@szuecs Why wouldn't the correct behavior here be to respect the DNS record's TTL?

I'm aware that you can work around the existing behavior, although I haven't looked at the project you linked. However, that is not helpful for projects that are already built and packaged for the community on Docker Hub. Repackaging OSS Golang applications just so they respect DNS changes on idle connections sounds like an odd thing to tell a user.

I have witnessed real-world cases of idle connections being reused in a Go application days after the DNS record was changed and out of date. That was very surprising to me, and it seems like a bug in Golang.


@szuecs szuecs commented Sep 19, 2020

@cnmcavoy I am fully with you if you want a TTL-based or "once in a while" DNS update to refresh newly created connections, but as the issue creator I am against resolving every time, which you also suggested.


@dragonsinth dragonsinth commented Sep 19, 2020

Periodic / TTL based would be wonderful. Ideally, we would track the resolved ip for each connection in the idle pool, and when DNS results change, purge any connections whose ips are no longer in the list. (Since a big use case involves IP sets that change incrementally as servers get rolled.)


@networkimprov networkimprov commented Sep 19, 2020

cc @odeke-em @fraenkel since this is marked NeedsFix.


@szuecs szuecs commented Sep 19, 2020

@dragonsinth but in the case of an ALB/NLB spanning 3 AZs (one IP per AZ), you would constantly refresh, which is not a great idea. So "purge the cache as soon as you see a new IP" is not as simple as it sounds. It can be done correctly, but the change should make sure this case is handled correctly.


@dragonsinth dragonsinth commented Sep 21, 2020

@dragonsinth but in the case of an ALB/NLB spanning 3 AZs (one IP per AZ), you would constantly refresh, which is not a great idea. So "purge the cache as soon as you see a new IP" is not as simple as it sounds. It can be done correctly, but the change should make sure this case is handled correctly.

@szuecs would you mind expanding those acronyms for me? I'm not a network engineer. :)


@marcind marcind commented Sep 21, 2020

@dragonsinth when running an application fronted by an AWS Application or Network Load Balancer (ALB/NLB), the load-balancing layer might consist of more servers with distinct IP addresses than the number of records returned in a typical DNS query (this might be a function of how many Availability Zones the LB spans or of how scaled out the LB servers are due to traffic). If that is the case, the set of IPs presented to a client doing frequent DNS queries will appear to change with every query, even though nothing in the LB or the application it fronts has changed.


@dragonsinth dragonsinth commented Sep 21, 2020

Thanks! Makes sense. Maybe TTL can be used to help here? So we purge connections whose IPs have not been resolved from DNS within their specified TTL? This might actually improve global load balancing without too much churn.


@szuecs szuecs commented Sep 28, 2020

@dragonsinth Sorry, but I wonder if I misread your comment #23427 (comment), because you wrote about IP sets, i.e. more than one IP per idle connection, which sounds correct to me. I somehow understood the idea as closing the connection on any IP change, which would happen all the time if you have a hostname resolving to multiple IPs.

In practice it would be fine to purge only the idle connections that are connected to IPs that are no longer being resolved. We have run a proxy with up to 500k rps for years that just closes idle connections regularly (every 30s), so I guess it's fine to only do it for idle connections. It has always been fast enough to move traffic to new endpoints quickly.

And sorry for using AWS specific acronyms.


@dragonsinth dragonsinth commented Sep 28, 2020

No worries. Perhaps, even for my own clarity, I should spell it out a bit. Imagine querying the same hostname repeatedly and getting different, rotating answers:

0s: host.name { ip1, ip2, ip3 }, TTL 300
30s: host.name { ip2, ip3, ip4 }, TTL 300
60s: host.name { ip3, ip4, ip5 }, TTL 300
90s: host.name { ip4, ip5, ip1 }, TTL 300
120s: host.name { ip5, ip1, ip2 }, TTL 300

Basically we could just keep a map of last-seen IPs per hostname with the resolved TTL. After 120s it would look something like this:

"host.name":  map[ip]int {
  "ip1": <300s from now>,
  "ip2": <300s from now>,
  "ip3": <240s from now>,
  "ip4": <270s from now>,
  "ip5": <300s from now>,
}

When we go to borrow an idle connection from the pool, we throw it away if its associated IP has expired (or been removed from the map).


@szuecs szuecs commented Sep 29, 2020

@dragonsinth makes absolute sense to me (the type could be time.Duration instead of int, but the idea is great)!


@dragonsinth dragonsinth commented Sep 29, 2020

Or expiry time.Time -- you'd want to resolve an absolute timestamp when you get the DNS response so you can interpret it later.
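
A minimal sketch of that bookkeeping, with absolute expiry times as suggested (the names are invented; the real check would have to happen inside the Transport when it hands out an idle connection):

import (
	"net"
	"sync"
	"time"
)

// hostIPs remembers, per hostname, until when each resolved IP may be trusted.
type hostIPs struct {
	mu     sync.Mutex
	expiry map[string]map[string]time.Time // hostname -> ip -> expires at
}

// observe records a DNS answer: every IP in the response is valid until now+ttl.
func (h *hostIPs) observe(host string, ips []net.IP, ttl time.Duration) {
	h.mu.Lock()
	defer h.mu.Unlock()
	if h.expiry == nil {
		h.expiry = make(map[string]map[string]time.Time)
	}
	m := h.expiry[host]
	if m == nil {
		m = make(map[string]time.Time)
		h.expiry[host] = m
	}
	deadline := time.Now().Add(ttl)
	for _, ip := range ips {
		m[ip.String()] = deadline
	}
}

// usable reports whether an idle connection to ip may still be handed out.
func (h *hostIPs) usable(host, ip string) bool {
	h.mu.Lock()
	defer h.mu.Unlock()
	deadline, ok := h.expiry[host][ip]
	return ok && time.Now().Before(deadline)
}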


@rohitkeshwani07 rohitkeshwani07 commented Sep 30, 2020

In practice it would be fine to purge only the idle connections that are connected to IPs that are no longer being resolved.

@szuecs
Weighted DNS will return different IPs, but that doesn't mean the old IPs are not valid anymore.


@szuecs szuecs commented Oct 6, 2020

@rohitkeshwani07 The same applies if you want to rotate IPs for whatever reason: the old ones can still be valid. Maybe the infrastructure just waits until there is no more traffic and then stops. So maybe we should resolve again to switch traffic faster, but I don't know. It's probably fine to just use DNS TTL expiry.


@rohitkeshwani07 rohitkeshwani07 commented Oct 6, 2020

As I understand it, there are 3 possible solutions:

  1. If the IP set changes, purge all connections and mark those in use as not reusable (the IP set would be rechecked periodically, every X seconds).
    Con: if I am using weighted DNS, my connections will probably be rotated every X seconds, which is a performance overhead for backends, as they will have to do the SSL handshake again for new connections.
  2. TTL-based expiry.
    Con: if for some reason I have set a high TTL on my DNS, connections won't ever be rotated, which again might raise problems in a scenario where the client assumes the connection is open but the server has actually closed it. Whenever we try to use that half-open connection, we get a socket timeout.
  3. Max connection lifetime (I vote for this).
    This assumes that a connection, once created, will be used for a limited lifetime, after which new DNS IPs will be resolved. Any half-open/stale connection will also be removed eventually.

Also, if we set max_connection_lifetime equal to max_connections, it can be assumed that there will be close to 1 connection establishment per second (averaged over max_connection_lifetime), which will be good for performance-intensive applications.

This is also how other Java HTTP clients solve the problem.

// @szuecs


@szuecs szuecs commented Oct 9, 2020

@rohitkeshwani07 3. seems fine to me as well. I would not do 1. If popular Java clients do 3., I think we can safely assume it's not a bad choice and that the behavior is well understood. To avoid socket timeouts you can always work with the backends to agree on an aligned time.Duration (set the client 1s lower than the backend and you won't get any timeouts for this case).


@rv-rhill rv-rhill commented Oct 12, 2020

.NET also offers the exact same configuration option described in @rohitkeshwani07's solution 3.

My organization faces the same problem originally articulated by @szuecs, since we're using a Route 53 latency policy record to handle dynamic, latency-optimized routing to ELBs + ECS services running the same app in multiple regions. I think this opportunity to differentiate between timeouts for a) connections that have fallen into an idle state and b) connections in active (re)use will make it much easier to solve the common problem of detecting changes in the IP set for a single DNS record over time.

It will also more clearly signal, in the docs for package http, that not all reusable connections necessarily become idle. I've noticed from some experimentation with the fabulous httptrace package that Go's http.Transport can efficiently reuse a greater number of connections per host than the specified MaxIdleConnsPerHost. If I run a group of concurrent goroutines all using the same underlying http.Transport to send requests continually to the same host, I see considerably more connections re-used across those goroutines than the MaxIdleConns. That is to say, many more of the connections reported in the GotConn hook were reused and not idle than reused and idle.
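
That reuse behaviour is easy to observe with the GotConn hook; for example (a small sketch, the target URL is arbitrary):

package main

import (
	"io"
	"io/ioutil"
	"log"
	"net/http"
	"net/http/httptrace"
)

func main() {
	trace := &httptrace.ClientTrace{
		GotConn: func(info httptrace.GotConnInfo) {
			// Reused: the connection came from the pool; WasIdle/IdleTime are
			// only meaningful when it actually sat in the idle list first.
			log.Printf("reused=%v wasIdle=%v idleTime=%v remote=%v",
				info.Reused, info.WasIdle, info.IdleTime, info.Conn.RemoteAddr())
		},
	}
	for i := 0; i < 3; i++ {
		req, err := http.NewRequest("GET", "https://example.com/", nil)
		if err != nil {
			log.Fatal(err)
		}
		req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))
		resp, err := http.DefaultTransport.RoundTrip(req)
		if err != nil {
			log.Fatal(err)
		}
		io.Copy(ioutil.Discard, resp.Body)
		resp.Body.Close()
	}
}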

The story in the source code is this block in http.Transport's tryPutIdleConn method. I believe the comment's reference to "socket late binding" applies here. If the transport finds a wantConn lined up for the given connectMethod (e.g. proxy-less HTTPS for foo.bar) on its way to try putting a connection to the idle pool, it'll shortcut said process, hand off the connection, and return.

The next block below these lines shows the problem with using the CloseIdleConnections method as a workaround. So long as the same connections are being requested for reuse faster than the transport can get them to the idle pool, only a fraction (if any) of the transport's current connections will ever get closed between the call to CloseIdleConnections and the next HTTP action to trigger a round trip on the transport. (Follow this code down the ladder of abstraction – you'll eventually get to this line where the same flag controlled by CloseIdleConnections is set back to false.)


@cnmcavoy cnmcavoy commented Oct 14, 2020

@rohitkeshwani07 3. sounds like it would fit my needs as well.

Back when I first encountered this issue, I wrote a Golang proof-of-concept to reproduce it, which I have open sourced. It runs a pure Go DNS server and makes requests on two interfaces to demonstrate the incorrect behavior. It is hopefully a simpler way to reproduce Golang's behavior than the steps and /etc/hosts mangling in the OP. The README.md has more details.


@krishnakowshik krishnakowshik commented Dec 7, 2020

@cnmcavoy @rohitkeshwani07 I am facing this issue as well. How do you set the HTTP max connection lifetime? I am setting IdleConnTimeout, and my transport config is:

Transport: &http.Transport{
	DialContext: (&net.Dialer{
		Timeout:   20 * time.Second,
		KeepAlive: 20 * time.Second,
	}).DialContext,
	IdleConnTimeout: 2 * time.Second,
}


@szuecs szuecs commented Dec 7, 2020

@krishnakowshik right now you have to close the idle connections yourself as I wrote in the issue.
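
For reference, that workaround in uncommented form (the 30-second interval is arbitrary):

go func(tr *http.Transport) {
	for range time.Tick(30 * time.Second) {
		tr.CloseIdleConnections()
	}
}(tr)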


@kingcr7 kingcr7 commented Jan 13, 2021

@krishnakowshik right now you have to close the idle connections yourself as I wrote in the issue.
I think this does not happen when keep-alive is disabled; it only happens when keep-alive is enabled.
Should I set up a job that calls CloseIdleConnections once per minute, or call CloseIdleConnections when an HTTP response returns an error?


@RobGraham RobGraham commented Feb 11, 2021

We recently upgraded from 1.15.5 to 1.15.7. We started noticing the same connection issues experienced PRIOR to the introduction of the fix described in this thread... We ultimately had to roll back, and since then we haven't experienced any problems.
Reviewing the release notes of 1.15.6 and 1.15.7, I saw no obvious net-related changes. Has anyone else experienced issues with these versions?
