net/http: Transport leaks net.Conns if connections never become idle #33849

Closed
bcmills opened this issue Aug 26, 2019 · 5 comments

@bcmills bcmills commented Aug 26, 2019

The following test case fails at go1.13rc1 because of a leak introduced in CL 184262. persistConn pointers enter the idleConnWait queue and remain reachable through that queue until a connection becomes idle; if no connection ever becomes idle, the queue is never cleared.

```go
// A countedConn is a net.Conn that decrements an atomic counter when finalized.
type countedConn struct {
	net.Conn
	atomicConnCount *int64
}

// A countingDialer dials connections and counts the number that remain reachable.
type countingDialer struct {
	dialer      net.Dialer
	atomicCount int64
}

func (d *countingDialer) DialContext(ctx context.Context, network, address string) (net.Conn, error) {
	conn, err := d.dialer.DialContext(ctx, network, address)
	if err != nil {
		return nil, err
	}

	counted := new(countedConn)
	counted.Conn = conn
	atomic.AddInt64(&d.atomicCount, 1)
	runtime.SetFinalizer(counted, d.decrement)
	return counted, nil
}

func (d *countingDialer) decrement(*countedConn) {
	atomic.AddInt64(&d.atomicCount, -1)
}

func (d *countingDialer) Count() int64 {
	return atomic.LoadInt64(&d.atomicCount)
}

func TestTransportPersistConnLeakNeverIdle(t *testing.T) {
	defer afterTest(t)

	ts := httptest.NewServer(HandlerFunc(func(w ResponseWriter, r *Request) {
		// Close every connection so that it cannot be kept alive.
		conn, _, err := w.(Hijacker).Hijack()
		if err != nil {
			t.Errorf("Hijack failed unexpectedly: %v", err)
			return
		}
		conn.Close()
	}))
	defer ts.Close()

	var d countingDialer
	c := ts.Client()
	c.Transport.(*Transport).DialContext = d.DialContext

	body := []byte("Hello")
	prevCount := d.Count()
	for i := 0; ; i++ {
		n := d.Count()
		if n < prevCount {
			break
		}
		if i >= 1<<12 {
			t.Fatalf("Count of allocated client net.Conns (%d) did not decrease after %d Do / GC iterations.", n, i)
		}
		prevCount = n

		req, err := NewRequest("POST", ts.URL, bytes.NewReader(body))
		if err != nil {
			t.Fatal(err)
		}
		_, err = c.Do(req)
		if err == nil {
			t.Fatal("expected broken connection")
		}

		runtime.GC()
	}
}
```

The test passes with that CL reverted, but since that CL fixed a significant deadlock (#32336), I intend to fix the leak (hopefully today) rather than reverting the CL.
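To illustrate the shape of the leak, here is a minimal, self-contained model. It is not the net/http code: `waiter`, `waitQueue`, `deliver`, and `simulate` are hypothetical names invented for this sketch. It assumes only what is described above, namely that entries are removed from the wait queue solely when a connection becomes idle.

```go
package main

import "fmt"

// waiter is a hypothetical stand-in for a queued connection request.
// In this model it pins a buffer that represents the retained connection state.
type waiter struct {
	payload *[1024]byte
	done    bool // true once the request no longer needs an idle connection
}

// waitQueue is a hypothetical FIFO that is drained only on delivery.
type waitQueue struct {
	items []*waiter
}

func (q *waitQueue) push(w *waiter) { q.items = append(q.items, w) }

// deliver models a connection becoming idle: it pops entries until it finds
// one that is still waiting. This is the only place entries are removed.
func (q *waitQueue) deliver() bool {
	for len(q.items) > 0 {
		w := q.items[0]
		q.items[0] = nil
		q.items = q.items[1:]
		if !w.done {
			w.done = true
			return true
		}
	}
	return false
}

// simulate issues the given number of requests, each of which finishes
// through some other path (for example a fresh dial that then breaks).
// It returns how many entries the queue still holds afterwards.
func simulate(requests int, everIdle bool) int {
	var q waitQueue
	for i := 0; i < requests; i++ {
		w := &waiter{payload: new([1024]byte)}
		q.push(w)
		w.done = true
		if everIdle {
			q.deliver()
		}
	}
	return len(q.items)
}

func main() {
	fmt.Println("connections become idle: ", simulate(1000, true))
	fmt.Println("connections never idle:  ", simulate(1000, false))
}
```

In this model, every finished request stays reachable through the queue when nothing ever becomes idle, so the retained count grows with the number of requests.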

CC @bradfitz

@bcmills bcmills added this to the Go1.13 milestone Aug 26, 2019
@bcmills bcmills self-assigned this Aug 26, 2019

@bcmills bcmills commented Aug 26, 2019

Didn't get this fixed today, but I'm close. (I'm worried that the test will either miss a case of the leak or be flaky, so I'm trying to make sure that I actually understand the paths involved.)

@gopherbot gopherbot commented Aug 27, 2019

Change https://golang.org/cl/191964 mentions this issue: net/http: fix wantConnQueue memory leaks in Transport

@bcmills bcmills modified the milestones: Go1.13, Go1.14 Aug 27, 2019

@bcmills bcmills commented Aug 27, 2019

@gopherbot, please backport to Go 1.13: this is a regression from 1.12.

@gopherbot gopherbot commented Aug 27, 2019

Backport issue(s) opened: #33877 (for 1.12), #33878 (for 1.13).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://golang.org/wiki/MinorReleases.

@gopherbot gopherbot closed this in 94bf9a8 Aug 27, 2019

@gopherbot gopherbot commented Aug 27, 2019

Change https://golang.org/cl/191967 mentions this issue: [release-branch.go1.13] net/http: fix wantConnQueue memory leaks in Transport

gopherbot pushed a commit that referenced this issue Aug 27, 2019
…ransport

I'm trying to keep the code changes minimal for backporting to Go 1.13,
so it is still possible for a handful of entries to leak,
but the leaks are now O(1) instead of O(N) in the steady state.

Longer-term, I think it would be a good idea to coalesce idleMu with
connsPerHostMu and clear entries out of both queues as soon as their
goroutines are done waiting.

Cherry-picked from CL 191964.

Updates #33849
Updates #33850
Fixes #33878

Change-Id: Ia66bc64671eb1014369f2d3a01debfc023b44281
Reviewed-on: https://go-review.googlesource.com/c/go/+/191964
Run-TryBot: Bryan C. Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
(cherry picked from commit 94bf9a8)
Reviewed-on: https://go-review.googlesource.com/c/go/+/191967
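
The "O(1) instead of O(N)" claim in the commit message above can be sketched with a small model. This is an illustration under assumptions, not the code from CL 191964: `pendingWait`, `pendingQueue`, `cleanFront`, and `steadyState` are hypothetical names here, and the sketch assumes the minimal fix amounts to discarding finished entries from the front of the queue before a new entry is added.

```go
package main

import "fmt"

// pendingWait is a hypothetical queued request.
type pendingWait struct {
	waiting bool // false once the request has been satisfied or abandoned
}

// pendingQueue is a hypothetical FIFO of pending requests.
type pendingQueue struct {
	items []*pendingWait
}

func (q *pendingQueue) pushBack(w *pendingWait) { q.items = append(q.items, w) }

// cleanFront drops entries at the head of the queue that are no longer
// waiting, stopping at the first entry that still is. It reports whether
// anything was removed.
func (q *pendingQueue) cleanFront() (cleaned bool) {
	for len(q.items) > 0 && !q.items[0].waiting {
		q.items[0] = nil // release the reference so it can be collected
		q.items = q.items[1:]
		cleaned = true
	}
	return cleaned
}

// steadyState issues requests one after another, cleaning the front of the
// queue before each push. Each request finishes without the queue ever
// delivering a connection to it. It returns the final queue length.
func steadyState(requests int) int {
	var q pendingQueue
	for i := 0; i < requests; i++ {
		q.cleanFront()
		w := &pendingWait{waiting: true}
		q.pushBack(w)
		w.waiting = false
	}
	return len(q.items)
}

func main() {
	fmt.Println("entries retained after 10 requests:  ", steadyState(10))
	fmt.Println("entries retained after 1000 requests:", steadyState(1000))
}
```

In this model the most recent entry can still linger until the next request arrives, which matches the "handful of entries" caveat, but the retained count no longer grows with the number of requests.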
tomocy added a commit to tomocy/go that referenced this issue Sep 1, 2019
I'm trying to keep the code changes minimal for backporting to Go 1.13,
so it is still possible for a handful of entries to leak,
but the leaks are now O(1) instead of O(N) in the steady state.

Longer-term, I think it would be a good idea to coalesce idleMu with
connsPerHostMu and clear entries out of both queues as soon as their
goroutines are done waiting.

Fixes golang#33849
Fixes golang#33850

Change-Id: Ia66bc64671eb1014369f2d3a01debfc023b44281
Reviewed-on: https://go-review.googlesource.com/c/go/+/191964
Run-TryBot: Bryan C. Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
t4n6a1ka added a commit to t4n6a1ka/go that referenced this issue Sep 5, 2019