Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/http: nil pointer dereference in closeConnIfStillIdle #16208

Closed
reusee opened this issue Jun 29, 2016 · 15 comments

Comments

Projects
None yet
9 participants
@reusee
Copy link

commented Jun 29, 2016

  1. What version of Go are you using (go version)?
    go version devel +b75b063 Tue Jun 28 04:49:33 2016 +0000 linux/amd64
  2. What operating system and processor architecture are you using (go env)?
    GOARCH="amd64"
    GOBIN=""
    GOEXE=""
    GOHOSTARCH="amd64"
    GOHOSTOS="linux"
    GOOS="linux"
    GOPATH="/home/reus"
    GORACE=""
    GOROOT="/home/reus/go"
    GOTOOLDIR="/home/reus/go/pkg/tool/linux_amd64"
    CC="gcc"
    GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build483039895=/tmp/go-build -gno-record-gcc-switches"
    CXX="g++"
    CGO_ENABLED="1"
  3. What did you see instead?
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x4de58f]

goroutine 54787 [running]:
panic(0x7a2f40, 0xc4200120c0)
    /home/reus/go/src/runtime/panic.go:500 +0x1a1
net/http.(*persistConn).closeConnIfStillIdle(0xc4249cfb00)
    /home/reus/go/src/net/http/transport.go:1255 +0x2f
net/http.(*persistConn).(net/http.closeConnIfStillIdle)-fm()
    /home/reus/go/src/net/http/transport.go:643 +0x2a
created by time.goFunc
    /home/reus/go/src/time/sleep.go:154 +0x44

@ALTree

This comment has been minimized.

Copy link
Member

commented Jun 29, 2016

You should include a recipe to reproduce the error in your report. Point 3 in the issue report list says

  1. What did you do?

If possible, provide a recipe for reproducing the error.

A complete runnable program is good.

A link on play.golang.org is best.

@bradfitz

This comment has been minimized.

Copy link
Member

commented Jun 29, 2016

In addition to code, as @ALTree mentioned, does your program run under the race detector?

@reusee

This comment has been minimized.

Copy link
Author

commented Jun 29, 2016

@bradfitz No, not with race detector.

I don't think it will be easy to reproduce by a short example. It's a large long-running program. I believe it's introduced by recent changes. I will try to do some git bisect to find out which commit.

@bradfitz

This comment has been minimized.

Copy link
Member

commented Jun 29, 2016

No, not with race detector.

Does that mean you haven't tried, or it fails?

I will try to do some git bisect to find out which commit.

Please do. I labeled this "WaitingForInfo" since there's nothing for us to help with until we have more to go on. If you get us something soon, there's a chance it could be fixed for Go 1.7, if there's a problem.

@reusee

This comment has been minimized.

Copy link
Author

commented Jun 30, 2016

@bradfitz I didn't run the program with race detector, I will give it a try.

@reusee

This comment has been minimized.

Copy link
Author

commented Jul 4, 2016

Close because the program ran for several days with no crash at all, with race detector enabled / disabled. No way to reproduce.

@reusee reusee closed this Jul 4, 2016

@vadmeste

This comment has been minimized.

Copy link

commented Aug 17, 2016

This bug occurred with me too (minio/minio#2456). However, it is not reproducible anymore. So, just wanted to encourage you to review the code if many other people experiment the same issue.

@bradfitz

This comment has been minimized.

Copy link
Member

commented Aug 17, 2016

@vadmeste, if you find a reproducible example, please file a new bug. You can reference this one. But we don't re-use old bugs.

@kidoman

This comment has been minimized.

Copy link

commented Aug 18, 2016

Just got this today after upgrading to Go 1.7. Try to see if it occurs multiple times and will file a new bug. As this occurs in a go routine outside our control, it ended up bringing down the entire service.

@bradfitz

This comment has been minimized.

Copy link
Member

commented Aug 18, 2016

@kidoman, if you find a reproducible example, please file a new bug. You can reference this one. But we don't re-use old bugs.

@golang golang locked and limited conversation to collaborators Aug 18, 2016

@golang golang unlocked this conversation Aug 18, 2016

@bradfitz bradfitz reopened this Aug 18, 2016

@bradfitz bradfitz changed the title net/http: invalid memory address or nil pointer dereference net/http: nil pointer dereference in closeConnIfStillIdle Aug 18, 2016

@bradfitz bradfitz modified the milestones: Go1.7.1, Go1.7Maybe Aug 18, 2016

@bradfitz

This comment has been minimized.

Copy link
Member

commented Aug 18, 2016

Okay, I have a fuzzy theory, so let's just use this bug, since it was never fixed anyway.

The crash is because persistConn.t is nil and we're crashing on the mutex Lock line here:

func (pc *persistConn) closeConnIfStillIdle() {
        t := pc.t
        t.idleMu.Lock()  // <--- BOOM, because t == nil

As far as I can tell, the only place wheret is nil (since it's never set to nil once created) is in the HTTP/2 (alt) path:

   return &persistConn{alt: next(cm.targetAddr, pconn.conn.(*tls.Conn))}, nil

So somehow that persistConn is getting into the tryPutIdleConn path. I suspect it's via t.putOrCloseIdleConn, in this path:

        handlePendingDial := func() {
                testHookPrePendingDial()
                go func() {
                        if v := <-dialc; v.err == nil {
                                t.putOrCloseIdleConn(v.pc)  // <--- here
                        }
                        testHookPostPendingDial()
                }()
        }

I don't have a repro yet, but it looks suspicious.

@mrjrieke

This comment has been minimized.

Copy link

commented Aug 19, 2016

Looks promising. Possibly race between setup and teardown?

@gopherbot

This comment has been minimized.

Copy link

commented Aug 20, 2016

CL https://golang.org/cl/27450 mentions this issue.

phuslu pushed a commit to phuslu/go that referenced this issue Aug 20, 2016

net/http: fix unwanted HTTP/2 conn Transport crash after IdleConnTimeout
Go 1.7 crashed after Transport.IdleConnTimeout if an HTTP/2 connection
was established but but its caller no longer wanted it. (Assuming the
connection cache was enabled, which it is by default)

Fixes golang#16208

Change-Id: I9628757f7669e344f416927c77f00ed3864839e3

phuslu pushed a commit to phuslu/go that referenced this issue Aug 21, 2016

net/http: fix unwanted HTTP/2 conn Transport crash after IdleConnTimeout
Go 1.7 crashed after Transport.IdleConnTimeout if an HTTP/2 connection
was established but but its caller no longer wanted it. (Assuming the
connection cache was enabled, which it is by default)

Fixes golang#16208

Change-Id: I9628757f7669e344f416927c77f00ed3864839e3

@gopherbot gopherbot closed this in 2369013 Aug 22, 2016

gopherbot pushed a commit that referenced this issue Sep 7, 2016

[release-branch.go1.7] net/http: fix unwanted HTTP/2 conn Transport c…
…rash after IdleConnTimeout

Go 1.7 crashed after Transport.IdleConnTimeout if an HTTP/2 connection
was established but but its caller no longer wanted it. (Assuming the
connection cache was enabled, which it is by default)

Fixes #16208

Change-Id: I9628757f7669e344f416927c77f00ed3864839e3
Reviewed-on: https://go-review.googlesource.com/27450
Reviewed-by: Andrew Gerrand <adg@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-on: https://go-review.googlesource.com/28637
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

oconnor663 added a commit to keybase/client that referenced this issue Oct 6, 2016

@sudersen

This comment has been minimized.

Copy link

commented Dec 3, 2016

What version of Go are you using (go version)?

go version go1.7.3 linux/amd64

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH=""
GORACE=""
GOROOT="/usr/local/go"
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build191499881=/tmp/go-build"
CXX="g++"
CGO_ENABLED="1"

What did you do?

This is the first time I'm seeing this crash on the same code running for months and It happened for me sporadically on high load. Following are the initial trace.

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x5241a0]

What did you expect to see?

We make an HTTP request to a service, process the response and return it as reference to the caller. The reference is never expected to be out of box since we have created an object and this failure did not occur by far for us in the service running for months.

Caller here is spawned using standard waitGroup method. Are there any other race conditions that could happen?

@bradfitz

This comment has been minimized.

Copy link
Member

commented Dec 3, 2016

@sudersen, you're commenting on a closed issue. Closed issues aren't tracked.

If you're experiencing a problem, please file a new code and include code. There's nothing in your comment to use to help you.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
You can’t perform that action at this time.