net: Uncatchable panic in net.Dial caused by dns failure #6232

Closed
gopherbot opened this issue Aug 23, 2013 · 15 comments

6 participants
@gopherbot

commented Aug 23, 2013

by levtchenko:

panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x20 pc=0x49b075]

goroutine 93699344 [running]:
net.cgoLookupIPCNAME(0xc2ab607980, 0xa, 0x0, 0x0, 0x0, ...)
        net/_obj/_cgo_gotypes.go:188 +0x2a5
net.cgoLookupIP(0xc2ab607980, 0xa, 0x0, 0x0, 0x0, ...)
        net/_obj/_cgo_gotypes.go:228 +0x67
net.cgoLookupHost(0xc2ab607980, 0xa, 0x0, 0x0, 0x0, ...)
        net/_obj/_cgo_gotypes.go:106 +0x79
net.lookupHost(0xc2ab607980, 0xa, 0x0, 0x0, 0x0, ...)
        /usr/lib/go/src/pkg/net/lookup_unix.go:56 +0x61
net.func·019()
        /usr/lib/go/src/pkg/net/lookup.go:42 +0x34
created by net.lookupHostDeadline
        /usr/lib/go/src/pkg/net/lookup.go:44 +0x22f

go version go1.1.2 linux/amd64

Linux 3.3.1-gentoo #22 SMP Mon Apr 8 00:22:25 MSK 2013 x86_64 Intel(R) Xeon(R) CPU
E3-1230 V2 @ 3.30GHz GenuineIntel GNU/Linux
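A note on why the title calls this panic uncatchable: the trace ends in "created by net.lookupHostDeadline", so the crash happens in a goroutine started inside the net package, and recover only takes effect in a deferred function of the goroutine that is panicking. The sketch below illustrates that mechanism with a hypothetical helper, runRecovered. It is not part of the net package or of the reporter's program, and callers of net.Dial cannot wrap the library's internal goroutine this way.

```go
package main

import "fmt"

// runRecovered (hypothetical helper) runs fn in a new goroutine and turns a
// panic raised inside that goroutine into an error. The deferred recover has
// to live in the same goroutine as the panic; a recover in the caller's
// goroutine would never see it.
func runRecovered(fn func()) error {
	errc := make(chan error, 1)
	go func() {
		defer func() {
			if r := recover(); r != nil {
				errc <- fmt.Errorf("recovered: %v", r)
				return
			}
			errc <- nil
		}()
		fn()
	}()
	return <-errc
}

func main() {
	var err error // nil interface value, standing in for a nil error
	got := runRecovered(func() {
		_ = err.Error() // method call on a nil interface panics at run time
	})
	fmt.Println(got)
}
```

Because the goroutine in the trace is created by the library, the only real remedies are a fix in the net package or avoiding the cgo lookup path.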
@robpike

Contributor

commented Aug 24, 2013

Comment 2:

Can you provide a complete example to reproduce the failure?

Labels changed: added priority-later, removed priority-triage.

Status changed to Accepted.

@gopherbot

Author

commented Aug 28, 2013

Comment 3 by tomheinan:

Just chiming in to note that I've just run into this as well. My app spins up several
thousand goroutines, each of which does a DNS lookup every ten seconds or so, and the
app is reliably panicking after a few minutes of uptime.
go version go1.1.2 linux/amd64
Linux 3.9.3-x86_64-linode33 #1 SMP Mon May 20 10:22:57 EDT 2013 x86_64 x86_64 x86_64
GNU/Linux
The error in question:
    goroutine 63910 [running]:
    net.cgoLookupIPCNAME(0xc2006a65a0, 0x12, 0x0, 0x0, 0x0, ...)
        net/_obj/_cgo_gotypes.go:188 +0x2a5
    net.cgoLookupIP(0xc2006a65a0, 0x12, 0x0, 0x0, 0x0, ...)
        net/_obj/_cgo_gotypes.go:228 +0x67
    net.cgoLookupHost(0xc2006a65a0, 0x12, 0x0, 0x0, 0x0, ...)
        net/_obj/_cgo_gotypes.go:106 +0x79
    net.lookupHost(0xc2006a65a0, 0x12, 0x0, 0x0, 0x0, ...)
        /usr/local/go/src/pkg/net/lookup_unix.go:56 +0x61
    net.func·019()
        /usr/local/go/src/pkg/net/lookup.go:42 +0x34
    created by net.lookupHostDeadline
        /usr/local/go/src/pkg/net/lookup.go:44 +0x22f
The actual code that's doing the DNS lookup:
    // parse the server into host/port
    host, port, err := net.SplitHostPort(server)
    if err != nil {
        // we weren't given a port; try to find one via dns
        _, addrs, srvErr := net.LookupSRV("minecraft", "udp", server)
        if srvErr != nil {
            _, addrs, srvErr = net.LookupSRV("minecraft", "tcp", server)
        }
    
        if srvErr != nil {
            host = server
            port = "25565"
        } else {
            addr := addrs[0]
            host = strings.TrimRight(addr.Target, ".")
            port = strconv.FormatInt(int64(addr.Port), 10)
        }
    }
    
    conn, err := net.DialTimeout("udp", net.JoinHostPort(host, port), 3 * time.Second)
    if err != nil {
        return nil, err
    }
    .. etc ..
I ran into this this evening while doing a server migration - I thought perhaps it was
due to using a newer version of go, so I downgraded back to 1.1, but it's still crashing
after a few minutes, so it doesn't look like that's the issue.
@mikioh

Contributor

commented Sep 6, 2013

Comment 4:

Could you please try a binary built with CGO_ENABLED=0, if possible?
It looks like the cgo lookup code used by Dial does something wrong.
@gopherbot

Author

commented Sep 6, 2013

Comment 5 by tomheinan:

I wiped out the /bin and /pkg directories and rebuilt the app with CGO_ENABLED=0, but it
still seems to be failing on the cgo lookup. Is there some other flag I need to set for
the compiler to obey the CGO_ENABLED flag?
In the meantime, I'll try cross-compiling locally and see if that makes any difference.
@davecheney

Contributor

commented Sep 6, 2013

Comment 6:

Please try to produce a code sample which reproduces the problem.
To build go from source
1. ensure you have removed every version of Go you may have on your system
2. hg clone -r release https://code.google.com/p/go 
3. export CGO_ENABLED=0 
4. cd go/src
5. ./make.bash
6. ensure go/bin is in your path.
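For readers on later Go releases (not the go1.1.2 toolchain discussed in this thread), the net package also allows a program to request the pure Go resolver at run time instead of rebuilding the toolchain. The following is a minimal sketch under that assumption; newPureGoResolver is a hypothetical helper name and "localhost" is only a placeholder host.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// newPureGoResolver (hypothetical helper) returns a resolver that asks the
// net package to prefer its built-in Go DNS client over the cgo/getaddrinfo
// path on platforms where both are available.
func newPureGoResolver() *net.Resolver {
	return &net.Resolver{PreferGo: true}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
	defer cancel()

	// "localhost" is a placeholder; substitute the host being resolved.
	addrs, err := newPureGoResolver().LookupHost(ctx, "localhost")
	fmt.Println(addrs, err)
}
```

This does not apply to the 2013 toolchain in the report, where building with CGO_ENABLED=0 as described above was the available option.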

Status changed to WaitingForReply.

@gopherbot

Author

commented Sep 6, 2013

Comment 7 by arnaud.lb:

I can reproduce this when the process exceeds the open files limit:
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x20 pc=0x4457c5]
goroutine 21 [running]:
net.cgoLookupIPCNAME(0x505db0, 0xb, 0x0, 0x0, 0x0, ...)
        net/_obj/_cgo_gotypes.go:188 +0x2a5
net.cgoLookupIP(0x505db0, 0xb, 0x0, 0x0, 0x0, ...)
        net/_obj/_cgo_gotypes.go:228 +0x67
net.cgoLookupHost(0x505db0, 0xb, 0x0, 0x0, 0x0, ...)
        net/_obj/_cgo_gotypes.go:106 +0x79
net.lookupHost(0x505db0, 0xb, 0x0, 0x0, 0x0, ...)
        /usr/local/go/src/pkg/net/lookup_unix.go:56 +0x61
net.lookupHostDeadline(0x505db0, 0xb, 0x0, 0x0, 0x0, ...)
        /usr/local/go/src/pkg/net/lookup.go:19 +0xd3
net.resolveInternetAddr(0x4ffb00, 0x2, 0x505db0, 0xb, 0x0, ...)
        /usr/local/go/src/pkg/net/ipsock.go:210 +0x405
net.ResolveIPAddr(0x4ffb00, 0x2, 0x505db0, 0xb, 0x0, ...)
        /usr/local/go/src/pkg/net/iprawsock.go:42 +0x14d
main.func·001()
The attached bug.go reproduces this.
Without CGO, it doesn't crash.

Attachments:

  1. bug.go (407 bytes)
@gopherbot

Author

commented Sep 6, 2013

Comment 8 by arnaud.lb:

Apparently getaddrinfo() can return EAI_SYSTEM while leaving errno set to 0.
So err is sometimes nil here even when gerrno is non-zero (in net/cgo_unix.go):
    gerrno, err := C.getaddrinfo(h, nil, &hints, &res)
If err is nil and gerrno is EAI_SYSTEM, there is a nil pointer dereference:
    if gerrno != 0 {
        var str string
        if gerrno == C.EAI_NONAME {
            str = noSuchHost
        } else if gerrno == C.EAI_SYSTEM {
            str = err.Error()
@gopherbot

Author

commented Sep 6, 2013

Comment 9 by levtchenko:

It looks like this is a bug in glibc: getaddrinfo is not thread safe.
http://sourceware.org/bugzilla/show_bug.cgi?id=13271
@ianlancetaylor

Contributor

commented Sep 6, 2013

Comment 10:

According to the bug, getaddrinfo is only not thread-safe if your program changes the
environment while concurrently calling getaddrinfo; does your program do that?
Even then I don't see how the reported results (gerrno == EAI_SYSTEM && errno == 0)
would occur.
@ianlancetaylor

Contributor

commented Sep 6, 2013

Comment 11:

arnaud.lb: can you reproduce those results with tip so that we get a good file/line for
the panic?
Also, what system are you running on?
@gopherbot

Author

commented Sep 6, 2013

Comment 12 by arnaud.lb:

I'm running this on a Debian box, libc6 package 2.17-92.
The previously attached bug.go doesn't reproduce on tip; I've attached another one
which does reproduce on tip.
Stacktrace:
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x0 pc=0x451270]

goroutine 59 [running]:
net.cgoLookupIPCNAME(0x51ee30, 0xb, 0x0, 0x0, 0x0, ...)
        /home/arnaud/dev/go-hg/src/pkg/net/cgo_unix.go:102 +0x380
net.cgoLookupIP(0x51ee30, 0xb, 0x0, 0x0, 0x0, ...)
        /home/arnaud/dev/go-hg/src/pkg/net/cgo_unix.go:138 +0x9c
net.lookupIP(0x51ee30, 0xb, 0x0, 0x0, 0x0, ...)
        /home/arnaud/dev/go-hg/src/pkg/net/lookup_unix.go:64 +0x90
net.func·019(0x0, 0x0, 0x0, 0x0)
        /home/arnaud/dev/go-hg/src/pkg/net/lookup.go:41 +0x5a
net.(*singleflight).Do(0x62aeb0, 0x51ee30, 0xb, 0x7f9d7ea3ed70, 0x0, ...)
        /home/arnaud/dev/go-hg/src/pkg/net/singleflight.go:45 +0x273
net.lookupIPMerge(0x51ee30, 0xb, 0x0, 0x0, 0x0, ...)
        /home/arnaud/dev/go-hg/src/pkg/net/lookup.go:42 +0x11a
net.lookupIPDeadline(0x51ee30, 0xb, 0x0, 0x0, 0x0, ...)
        /home/arnaud/dev/go-hg/src/pkg/net/lookup.go:57 +0x11e
net.resolveInternetAddr(0x5186a0, 0x2, 0x51ee30, 0xb, 0x0, ...)
        /home/arnaud/dev/go-hg/src/pkg/net/ipsock.go:285 +0x3ff
net.ResolveIPAddr(0x5186a0, 0x2, 0x51ee30, 0xb, 0x0, ...)
        /home/arnaud/dev/go-hg/src/pkg/net/iprawsock.go:49 +0x17b
main.func·001()
        /home/arnaud/dev/go/src/bug6232/bug.go:17 +0x6a
created by main.lookup
        /home/arnaud/dev/go/src/bug6232/bug.go:20 +0xba
The attached bug.c confirms that getaddrinfo() sometimes leaves errno set to 0 when it
returns an error.

Attachments:

  1. bug.go (439 bytes)
  2. bug.c (804 bytes)
@mikioh

Contributor

commented Sep 6, 2013

Comment 13:

Labels changed: added go1.2, removed priority-later.

Status changed to Accepted.

@rsc

Contributor

commented Sep 11, 2013

Comment 14:

https://golang.org/cl/13532045

Status changed to Started.

@rsc

Contributor

commented Sep 11, 2013

Comment 15:

This issue was closed by revision 382738a.

Status changed to Fixed.

@gopherbot

Author

commented Sep 11, 2013

Comment 16 by tomheinan:

Fix looks good on my end; I've been running it for about half an hour and it's working
normally. Thanks for your hard work, it's much appreciated!

@rsc rsc added this to the Go1.2 milestone Apr 14, 2015

@rsc rsc removed the go1.2 label Apr 14, 2015

@golang golang locked and limited conversation to collaborators Jun 25, 2016

This issue was closed.
