Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: net: prefer /etc/hosts over DNS when no /etc/nsswitch.conf is present #35305

Open
schnoddelbotz opened this issue Nov 1, 2019 · 11 comments

Comments

@schnoddelbotz
Copy link

@schnoddelbotz schnoddelbotz commented Nov 1, 2019

What version of Go are you using (go version)?

$ go version
go version go1.13.3 linux/amd64

Does this issue reproduce with the latest release?

Yes, it does.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/root/.cache/go-build"
GOENV="/root/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build004276885=/tmp/go-build -gno-record-gcc-switches"

What did you do?

  • Built a go containerized Go app using best practices as documented by Docker:
    • using a multi-stage build
    • with CGO_ENABLED=0
    • with Alpine as the final stage for the container built
  • After successful build, run the app ...
    • by starting a SSH tunnel on the host system
    • by starting the container with --network=host to let container connect to tunnel
    • by telling container/app to use localhost as target to connect to:
ssh -L 59187:redis.example.com:6379 joe@bastion.example.com
docker run --net=host --rm myGoApp:latest /app/redisDoSomething -host localhost -port 59187

What did you expect to see?

The Go app should consider the container's /etc/hosts file to resolve localhost to 127.0.0.1.

What did you see instead?

The Go app did not consider /etc/hosts inside the container; it queried DNS servers to resolve localhost. Actually, no /etc/hosts entry would ever be considered, all lookups are using DNS directly.

Some background / Summary

  • The issue was first reported in #22846 but unfortunately closed by the reporter, as he found a work-around. Due to premature issue closure, #22846 was never really considered / commented on by Go maintainers.
  • The underlying problem is that Alpine uses musl and thus has no /etc/nsswitch.conf.
    But Go, with CGO_ENABLED=0, does only consider /etc/hosts if it finds a /etc/nsswitch.conf. Alpine understandably rejects adding a /etc/nsswitch.conf to work around
    questionable Go behaviour (e.g. macOS also has no /etc/nsswitch.conf).
  • As a consequence, literally EVERYBODY introduced work-arounds due to Go's musl-antipathy

In essence, everybody following documented Docker best-practices of using multi-stage builds and disabling CGO is prone to run into this issue sooner or later. In full paranoia mode, it could even be seen as security issue, as resolving localhost via DNS could be a potential attack vector, e.g. on public hot spots?

By re-opening the issue, I'd kindly ask for review of the original issue and to reconsider whether Go's net package could be adjusted to use /etc/hosts if present, even if no /etc/nsswitch.conf is found.

The issue just consumes too much developer resources and introduces work-arounds in way too many places, IMHO. Finally, and just for laughs, we ran into this issue as some joker created a localhost entry in our LAN, pointing to his PC.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Nov 1, 2019

When there is no /etc/nsswitch.conf file, the default for glibc for the hosts line is dns [!UNAVAIL=return] files. Go is intended to be compatible with that.

If Go never looks at /etc/hosts, even if a DNS lookup fails, that sounds like a bug. But the code does look OK to me at first glance:

	// If /etc/nsswitch.conf doesn't exist or doesn't specify any
	// sources for "hosts", assume Go's DNS will work fine.
	if os.IsNotExist(nss.err) || (nss.err == nil && len(srcs) == 0) {
		if c.goos == "solaris" {
			// illumos defaults to "nis [NOTFOUND=return] files"
			return fallbackOrder
		}
		if c.goos == "linux" {
			// glibc says the default is "dns [!UNAVAIL=return] files"
			// https://www.gnu.org/software/libc/manual/html_node/Notes-on-NSS-Configuration-File.html.
			return hostLookupDNSFiles
		}
		return hostLookupFilesDNS
	}

Changing Go to look at /etc/hosts first, before doing a DNS lookup, would not be compatible with glibc.

@schnoddelbotz

This comment has been minimized.

Copy link
Author

@schnoddelbotz schnoddelbotz commented Nov 2, 2019

Thanks, @ianlancetaylor

When there is no /etc/nsswitch.conf file, the default for glibc for the hosts line is dns [!UNAVAIL=return] files. Go is intended to be compatible with that.
Changing Go to look at /etc/hosts first, before doing a DNS lookup, would not be compatible with glibc.

I believe you indirectly pointed out the root problem: A potentially wrong assumption about the glibc default -- effectively based on an outdated glibc manual, as far as I can tell:

The Go code in question points at some NSS glibc notes. BUT ... these notes were last updated in 1996 (!) by Ulrich Drepper:

Unsurprisingly, more than 20 years later, the NSS glibc notes seem outdated and seem to no longer match actual code. Comparing the commit from 1996 with current version, we have different behaviour nowadays:

  50 # define DEFAULT_DEFCONFIG "files"
 ...
 157   /* No configuration data is available, either because nsswitch.conf
 158      doesn't exist or because it doesn't have a line for this database.
 159
 160      DEFCONFIG specifies the default service list for this database,
 161      or null to use the most common default.  */
 162   if (*ni == NULL)
 163     {
 164       *ni = nss_parse_service_list (defconfig ?: DEFAULT_DEFCONFIG);

Yes, there's still a mention of "hosts", "dns [!UNAVAIL=return] files", but AFAICT by the sourrounding comment, that portion is /* Called by nscd and nscd alone. */.

I've always been able to override DNS with local hosts entries in the past, without tweaking system defaults. I think it's a common, broadly documented use-case (example, showing that even Windows users can rely on hosts as primary default source).
I was also unable to find the stated default anywhere else:

  • glibc documents hosts: files nisplus nis dns as example
  • glibc sources' nsswitch.conf show hosts: files dns as default/example
  • NetBSD manual page shows hosts: files dns as example
  • FreeBSD example: hosts: cache files dns
  • musl statically tries hosts before dns

Given the current de-facto work-around for Alpine-based containers, including Go itself, is...

RUN [ ! -e /etc/nsswitch.conf ] && echo 'hosts: files dns' > /etc/nsswitch.conf

I'd kindly ask to reconsider my request, based on above observations:
If there's no nsswitch.conf BUT a hosts file exists, try the hosts file before using DNS.

If someone has an account for glibc bugzilla, I'd appreciate if a bug could be opened there to further clarify/ensure that I'm not mistaken and this whole issue here just is simply based on their outdated docs ... and can safely be fixed in both places. Thanks!

@schnoddelbotz

This comment has been minimized.

Copy link
Author

@schnoddelbotz schnoddelbotz commented Nov 2, 2019

I filed an issue to clarify the apparent glibc documentation issue https://sourceware.org/bugzilla/show_bug.cgi?id=25156

If I'm not mistaken, glibc actually falls back to files only if nsswitch.conf is absent; PLEASE do not align Go with that behaviour 😉

Will report back here for sure once the glibc issue brings clarification.

@schnoddelbotz

This comment has been minimized.

Copy link
Author

@schnoddelbotz schnoddelbotz commented Nov 4, 2019

I had to take another look at glibc code, with the friendly help of djdelorie on glibc IRC. He pointed me at
https://sourceware.org/git/?p=glibc.git;a=blob;f=nss/hosts-lookup.c;hb=HEAD

Meaning: I was wrong, glibc really queries dns first in absence of nsswitch.conf 🤦‍♂ 😭
Really -- Sorry for the fuss.

But... It all still feels wrong?!
All Linux+BSD distros ship files dns as mentioned above, the glibc example does it, musl does it without etc. ... only glibc on a "broken" system without nsswitch.conf does it the other way round... leading to Go's observed "broken-by-default" behaviour on Alpine/musl. If Go insists on "compatibility" with glibc behaviour, the only remaining potential fix I see now is to change glibc :-)

I'll try to start that discussion over on libc-alpha now, as I was told the default change is at least discussable.

Would you mind keeping this bug open some more days until I have feedback? Thanks!

@FiloSottile FiloSottile changed the title net: Should prefer /etc/hosts over DNS (even) if no /etc/nsswitch.conf is present proposal: net: prefer /etc/hosts over DNS (even) if no /etc/nsswitch.conf is present Nov 5, 2019
@gopherbot gopherbot added this to the Proposal milestone Nov 5, 2019
@gopherbot gopherbot added the Proposal label Nov 5, 2019
@rsc rsc changed the title proposal: net: prefer /etc/hosts over DNS (even) if no /etc/nsswitch.conf is present proposal: net: prefer /etc/hosts over DNS when no /etc/nsswitch.conf is present Nov 6, 2019
@rsc

This comment has been minimized.

Copy link
Contributor

@rsc rsc commented Nov 6, 2019

Retitled for clarity. We'll wait for @schnoddelbotz to report back about glibc, but even in the absence of progress there, it seems plausible to diverge from glibc and prefer /etc/hosts over DNS. Probably glibc never runs on systems where /etc/nsswitch.conf is missing, whereas Go clearly does.

@erikwilson

This comment has been minimized.

Copy link

@erikwilson erikwilson commented Nov 8, 2019

Thanks for proposing this fix @schnoddelbotz!

What glibc and go are doing here feels very wrong, every sane /etc/nsswitch.conf configuration prefers files then dns. This is particularly a problem for networks that setup a search path as localhost can resolve to something completely wrong. For Alpine/musl systems this has been particularly painful, please fix asap! :)

@schnoddelbotz

This comment has been minimized.

Copy link
Author

@schnoddelbotz schnoddelbotz commented Nov 10, 2019

Thanks for retitling and reconsidering the issue as proposal, despite my previous misconception about glibc behaviour. There was only one response to the corresponding glibc-alpha thread in the last 6 days, but it was positive. Citing Florian Weimer's response:

> The proposed new default would be
> #define DEFAULT_CONFIG "files dns"
> ... effectively making the /etc/hosts file the primary source for
> hosts lookups, until overridden by a nsswitch.conf file.

I think this is a reasonable change to make.  I've wanted this for a
quite a while and was waiting for the new nsswitch.conf parser (which I
expected would have centralized the policy in one place).  But we can
make this change now.  It's just a bit tedious to find all the places
that need changing.

I'd happily create a PR here now, but somehow doubt it's really helpful/needed? Just let me know...
Thanks again!

@erikwilson

This comment has been minimized.

Copy link

@erikwilson erikwilson commented Nov 10, 2019

I think a PR is absolutely needed, https://golang.org/src/net/conf.go feels like a hot mess, it is making too many assumptions about the system. An upstream glibc change will have no effect on the go dns resolver, so go needs to change since it is making assumptions and emulating broken configs.

I feel like this is an opportunity for Golang to be opinionated and consistent about what is happening here. For musl I would still need to set the netcgo tag to force cgo resolution or maybe RES_OPTIONS to force hostLookupFilesDNS for things which are compiled without cgo or the netgo tag. It feels like there should be a better way, either through an environment variable or other means, to specify the result of hostLookupOrder.

@schnoddelbotz

This comment has been minimized.

Copy link
Author

@schnoddelbotz schnoddelbotz commented Nov 10, 2019

I admittedly only focused on fixing the hosts-over-dns issue for Alpine with CGO_ENABLED=0 builds, which felt like a low-hanging fruit: I could only have imagined a tiny PR that deletes lines 205-209 of conf.go and maybe some doc tweaks in net.go to mention that /etc/hosts is considered before using DNS if nsswitch.conf is absent.

It's perfectly possible that I also mis-read Go code here (besides glibc), but are you sure that the tweak above wouldn't be sufficient to force hostLookupFilesDNS for things which are compiled without cgo, as there's no nsswitch.conf, @erikwilson? And for musl with CGO, you'd always get musl's "static" hostLookupFilesDNS order mentioned above, wouldn't you? Personally, I like automagic behaviour, as long as it matches "working" OS default/configured behaviour i.e. not (today's) glibc without a nsswitch.conf ;)
Just my non-relevant 2¢, let's see what maintainers say... maybe worth a separate issue?

@erikwilson

This comment has been minimized.

Copy link

@erikwilson erikwilson commented Nov 10, 2019

Yah, maybe the magic behavior of when to use cgo dns should be a separate issue.

I doubt very few people, if anyone, will want DNS then file resolution. The problem with the current net conf logic is the assumption that certain libraries are used, even tho that may not be the case. It is trying to be too magical and failing badly. Might as well just always return hostLookupFilesDNS unless something overrides it and remove all of the c.goos logic. If you want system behavior then just use cgo dns.

@rsc

This comment has been minimized.

Copy link
Contributor

@rsc rsc commented Nov 13, 2019

The consensus here seems overwhelming, and maybe glibc will even make the change.
This seems like a likely accept.

Leaving open for a week for final comments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.