net: LookupHost("localhost") returning "no such host" error #3575

Closed
james4k opened this issue Apr 28, 2012 · 39 comments

@james4k
Contributor

commented Apr 28, 2012

The attached source file reproduces the issue by calling LookupHost in a loop
from a few hundred goroutines.

It seems to be related to performing a large number of (concurrent?) DNS requests, and
localhost is not the only host name it happens on. It just highlights the fact that
something is not quite right. In my case, it happened with a basic web crawler which
makes a lot of concurrent requests, and eventually it would fail to connect to a local
DB, or any other host.

This was tested on Mac OS X Lion, using the binary release of Go1.

Attachments:

  1. main.go (238 bytes)
@james4k

Contributor Author

commented Apr 28, 2012

Comment 1:

After searching around, it seems that getaddrinfo may not be thread-safe on OSX.
Although, I'm not sure how credible this is.
http://stackoverflow.com/questions/1212716/python-interpreter-blocks-multithreaded-dns-requests
@robpike

Contributor

commented Apr 28, 2012

Comment 2:

Labels changed: added priority-later, removed priority-triage.

Owner changed to @rsc.

Status changed to Accepted.

@robpike

Contributor

commented Apr 28, 2012

Comment 3:

Labels changed: added priority-soon, removed priority-later.

@james4k

Contributor Author

commented May 3, 2012

Comment 4:

After some more research, I believe this is essentially the same issue:
https://golang.org/issue/1724
@edsrzf

commented Jul 12, 2012

Comment 5:

I'm able to reproduce a similar problem on Linux. If I do lots of concurrent GET
requests to http://www.google.com, I start getting DNS errors.
See the attached program. When I run this program, about half of the 300 requests fail
with this error:
Get http://www.google.com: lookup www.google.com: no such host

Attachments:

  1. dnserror.go (292 bytes)
@edsrzf

commented Jul 12, 2012

Comment 6:

Or to cut HTTP out of the equation, here's a version that just does the DNS lookups.

Attachments:

  1. dnserror.go (286 bytes)
@bradfitz

Member

commented Jul 12, 2012

Comment 7:

dnserror.go from comment #6 runs fine for me on linux/amd64, running it dozens of times.
@davecheney

Contributor

commented Jul 12, 2012

Comment 8:

I think this may be related to file descriptor exhaustion in your process, which causes
DNS lookups to fail. For example, your version that uses http.Get never closes the
response body, leaking an fd.
Can you please try the following:
1. Make sure you close the response body.
2. Rebuild your tree with the env var CGO_ENABLED=0 set. This disables the integration
with your OS's native DNS resolver and uses the pure Go version instead.
@edsrzf

commented Jul 12, 2012

Comment 9:

Closing the HTTP response body makes no difference.
Rebuilding with CGO_ENABLED=0 doesn't help.
Interestingly, adding a lock around net.cgoLookupIPCNAME fixes the problem, though I'm
not suggesting it as a solution.
@gopherbot

commented Jul 13, 2012

Comment 10 by jason@squarepenguin.com:

I have the same problem (DNS lookups failing) running 1.0.2 on Debian 6 if I run more
than 7-8 concurrent requests.
@edsrzf

commented Jul 13, 2012

Comment 11:

A couple more observations:
- If I put a lock around only the getaddrinfo call in net.cgoLookupIPCNAME, things work
as expected. (Again, not suggesting this as a solution.)
- I wrote a C program that exhibits the same behavior, so this may not be a Go problem
after all. Source code is attached for the curious. (Compile with gcc -lpthread
dnserror.c.)
So now I'm left with the unfortunate possibilities of a glibc bug or some kind of
configuration error. I've come across a few other people with this problem also, so if
it's a configuration error it must be a relatively common one.

Attachments:

  1. dnserror.c (815 bytes)
@davecheney

Contributor

commented Jul 13, 2012

Comment 12:

Your C program compiles and runs fine on Ubuntu 12.04 amd64. I wonder if this problem
only affects 32-bit systems?
Could those commenting on this ticket please make sure they include details about their
host OS, arch and revision?
@edsrzf, are you able to test this on another host, or better yet on another network, to
rule out a problematic DNS resolver? I hear 8.8.8.8 is a good resolver.
@gopherbot

commented Jul 13, 2012

Comment 13 by jason@squarepenguin.com:

No problems either on debian 6 amd64 running dnserror.c.
@edsrzf

commented Jul 13, 2012

Comment 14:

The machine I've been using thus far is running Arch Linux amd64 with glibc 2.16. If I
change my name server to 8.8.8.8, it significantly reduces the number of errors I get
from dnserror.c: from 150-160 down to 8-24. It makes dnserror.go run flawlessly.
If I try on my laptop, which runs Ubuntu 11.10 amd64 with eglibc 2.13, I have no
problems with any program/name server combination. This is on the same network.
@rsc

Contributor

commented Sep 12, 2012

Comment 15:

This sounds very much like a possible threading bug in glibc 2.16. I am not sure there's
anything for us to do.

Labels changed: added priority-later, removed priority-soon.

@gopherbot

commented Dec 2, 2012

Comment 16 by Princessandsuperman:

Just wanted to throw in that I also have been running into this problem (linux mint 11
amd64 go 1.0.3).  And the mutex around net.cgoLookupIPCNAME is what I am using as a
temporary fix.
@rsc

Contributor

commented Jul 30, 2013

Comment 17:

Labels changed: added go1.2maybe.

@rsc

Contributor

commented Jul 30, 2013

Comment 18:

Labels changed: added feature.

@robpike

Contributor

commented Aug 16, 2013

Comment 19:

Not sure what to do. Deferring to Go1.3 so we can revisit it then and hope the
underlying problem is fixed.

Labels changed: added go1.3maybe, removed go1.2maybe.

@robpike

Contributor

commented Aug 20, 2013

Comment 20:

Labels changed: removed go1.3maybe.

@gopherbot

commented Oct 14, 2013

Comment 21 by shane@stealnetwork.com:

Verified on OSX 10.8.4. The exhaustion seems to be FD-related, involving kqueue. It shows
up with both the Go program and a simple Python program; lsof indicates about 250 open
KQUEUE file handles.

Attachments:

  1. dnserror.py (128 bytes)
@rsc

Contributor

commented Nov 27, 2013

Comment 22:

Labels changed: added go1.3maybe.

@rsc

Contributor

commented Nov 27, 2013

Comment 23:

Labels changed: removed feature.

@gopherbot

commented Dec 2, 2013

Comment 24 by robin@us2.nl:

I can also confirm that I have this problem on multiple environments (latest OSX, CentOS
6, CentOS 5 and Ubuntu 13.10). All 64 bit machines.
@rsc

Contributor

commented Dec 4, 2013

Comment 25:

Labels changed: added release-none, removed go1.3maybe.

@rsc

Contributor

commented Dec 4, 2013

Comment 26:

Labels changed: added repo-main.

@gopherbot

commented Dec 9, 2013

Comment 27 by willem.moors:

My go code kicks off 16 goroutines to concurrently run sql on postgresql databases that
are on hosts n01..n16 on my local network.
Once in a while (but not always) I get errors such as these:
  dbh.Prepare():dial tcp: lookup n02: no such host
  dbh.Prepare():dial tcp: lookup n05: no such host
Usually when this occurs only 2 to 3 hosts are not found, while the rest are found.
When I do another run of the program immediately after getting such errors, the
errors disappear.
Also if I replace the hostnames by ip-addresses in the database connection parameter
string, then the errors do not occur. 
System: debian linux running kernel 3.2.0-4-amd64
Go version go1.2 linux/amd64
@remyoudompheng

Contributor

commented Dec 9, 2013

Comment 28:

You may want to check issue #6336.
@gopherbot

commented Aug 18, 2014

Comment 29 by fiorix:

Still having this problem on Go 1.3.1.
@bradfitz

Member

commented Aug 18, 2014

Comment 30:

fiorix, what version of glibc do you have? It's likely you don't have their fix yet. I
think this was determined to not be a Go bug (see issue #6336). We might detect and work
around it, though.
Alternatively, you can update your glibc, or use the netgo build tag and use Go's
built-in DNS resolver instead of the system one.
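
In Go releases newer than the ones discussed in this thread, the built-in resolver can also be requested per resolver at run time through the `PreferGo` field of `net.Resolver`, without rebuilding anything. A minimal sketch, assuming such a release (the `newGoResolver` helper is a made-up name):

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// newGoResolver returns a resolver that prefers Go's built-in DNS client
// over the system (cgo) resolver on platforms where that is supported.
func newGoResolver() *net.Resolver {
	return &net.Resolver{PreferGo: true}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	addrs, err := newGoResolver().LookupHost(ctx, "localhost")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println(addrs)
}
```

Those releases also accept the GODEBUG=netdns=go environment setting to select the Go resolver for the whole process. Either way the effect should be comparable to the netgo build tag, though platform support varies.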
@gopherbot

commented Aug 27, 2014

Comment 31 by fiorix:

It's Ubuntu 14.04 with the latest libc 2.19. I've been following issue #6336 as well.
Tried the netgo build tag (go install -a -tags netgo std and netgo net iirc) but dns
resolution eventually stops working after a while. Perhaps because my program uses
sqlite thus cgo can't be disabled? Not sure if that has anything to do with it.
@gopherbot

commented Sep 12, 2014

Comment 32 by fiorix:

Works by doing "go install -a -tags netgo net my-pkg"
@gopherbot

commented Nov 24, 2014

Comment 33 by stpra123:

Sorry, I'm pretty new. How do I do a lock around net.cgoLookupIPCNAME?
@gopherbot

commented Nov 24, 2014

Comment 34 by stpra123:

Sorry for the double post but this one includes more details. How do I do a lock around
net.cgoLookupIPCNAME? 
When I try the workaround as suggested in #32 I get:
"go install runtime: open /usr/local/go/pkg/linux_amd64/runtime.a: permission denied".
If I sudo then it seems to work but then when I try to install other packages I get:
"go install container/list: open /usr/local/go/pkg/linux_amd64/container/list.a:
permission denied
go install unicode/utf8: open /usr/local/go/pkg/linux_amd64/unicode/utf8.a: permission
denied
go install crypto/subtle: open /usr/local/go/pkg/linux_amd64/crypto/subtle.a: permission
denied
go install hash: open /usr/local/go/pkg/linux_amd64/hash.a: permission denied
go install unicode: open /usr/local/go/pkg/linux_amd64/unicode.a: permission denied".
I then uninstalled then reinstalled Go. Am I doing something wrong here? Using Linux
Mint 17.
@ianlancetaylor

Contributor

commented Nov 24, 2014

Comment 35:

The question about permission errors with "go install runtime" doesn't have anything to
do with this issue.  Please take that question to golang-nuts.  (It should work as long
as you are consistent about the user running "go install" but please do not reply here.)
@mikioh

Contributor

commented Feb 4, 2015

I'm closing this issue for the following reasons.

  1. The glibc bug described in #6336 has been fixed in glibc-2.20 and later.
  2. In go1.4, the internal lookup function lookupIPDeadline has been fixed for the case of slow simultaneous query-response exchanges. It is used by both the CGO_ENABLED=1 and CGO_ENABLED=0 lookup functions.
  3. In go1.4, the built-in DNS stub resolver (which is enabled when CGO_ENABLED=0) has been improved for the case of simultaneous query-response exchanges.

In addition, go1.4 has a test case that can trigger simultaneous query-response exchanges.

cd $GOROOT/src/net
go test -v -run=TestLookupIPDeadline -dnsflood

Please try it for troubleshooting when you have a question about DNS behavior in the dial and lookup APIs.

@mikioh mikioh closed this Feb 4, 2015

jpillora added a commit to jpillora/whos-home that referenced this issue Jun 10, 2015

@pmoust

commented Jun 21, 2015

@mikioh I think this is still happening..

/usr/local/Cellar/go/1.4/libexec/src/net (master) $ go test -v -run=TestLookupIPDeadline -dnsflood
=== RUN TestLookupIPDeadline
--- PASS: TestLookupIPDeadline (3.58s)
    z_last_test.go:98: 59 succeeded, 9941 failed (9941 timeout, 9941 temporary, 0 other, 0 unknown)
PASS
ok      net 3.605s
@pmoust

commented Jun 21, 2015

Hm,
wrong output pasted..
Still, I get lookup errors when launching about 250 goroutines that involve net.LookupIP()

@mikioh

Contributor

commented Jun 21, 2015
