net: acquireThread might block for a long time #63978
Labels
NeedsInvestigation
This was previously reported to the Go security team.
When the cgo resolver is in use, we call the acquireThread function to limit the number of concurrent cgo calls and, with it, the number of threads that the runtime will ever create for cgo calls made by the net package. The limit is currently capped at 500.
go/src/net/cgo_unix.go
Lines 148 to 151 in 9d836d4
go/src/net/net.go
Lines 674 to 689 in 9d836d4
go/src/net/rlimit_unix.go
Lines 11 to 33 in 9d836d4
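For readers without the linked source at hand, here is a minimal sketch of this kind of limiter: a counting semaphore built on a buffered channel. The names `limiter`, `newLimiter`, `acquire`, `release`, and `inUse` are hypothetical and are not the net package's identifiers, and the capacity of 2 stands in for the 500 cap only to keep the demo small. The point to note is that `acquire` blocks with no way to be interrupted once all slots are taken.

```go
package main

import (
	"fmt"
	"sync"
)

// limiter is a counting semaphore built on a buffered channel.
// Hypothetical type for illustration; not the net package's code.
type limiter struct {
	slots chan struct{}
}

func newLimiter(n int) *limiter {
	return &limiter{slots: make(chan struct{}, n)}
}

// acquire blocks until a slot is free. It cannot be interrupted.
func (l *limiter) acquire() { l.slots <- struct{}{} }

// release frees a slot previously taken by acquire.
func (l *limiter) release() { <-l.slots }

// inUse reports how many slots are currently held.
func (l *limiter) inUse() int { return len(l.slots) }

func main() {
	l := newLimiter(2) // the issue cites a cap of 500; 2 keeps the demo small

	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		peak int
	)
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			l.acquire()
			defer l.release()

			// A blocking cgo call such as getaddrinfo would run here.
			mu.Lock()
			if n := l.inUse(); n > peak {
				peak = n
			}
			mu.Unlock()
		}()
	}
	wg.Wait()
	fmt.Println("peak holders:", peak, "held after:", l.inUse())
}
```

If every slot is held by a call that is itself stuck for many seconds, all further callers queue behind `acquire` for at least that long.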
This might be problematic for services that connect to user-provided hostnames.
The user-provided domain might be served by an "evil" nameserver that deliberately trips timeouts (for example, by responding with SERVFAIL after ~5s). I have not tested this deeply, but I assume it would be possible to make getaddrinfo block for 5-30s (assuming a mostly default configuration in resolv.conf).
1-3 (Resolvers Count) * 2 (default attempts count) * 5s (default timeout).
Obviously, with a MITM position it is even simpler to trip timeouts.
We don't use the cgo resolver much these days on Unix systems (except desktop Linux, because of systemd nsswitch modules that we don't support in the Go resolver). On Windows and Darwin the cgo resolver is the default.
I think this should at least be made clear in the docs; currently we only note that:
go/src/net/net.go
Lines 49 to 50 in 9d836d4
This should probably mention that we have an internal cgo threads limit.
Also, we probably should not force the use of the cgo resolver for ".local" subdomains, so that use of the cgo resolver cannot be forced by an external user (consider a service that connects to a user-provided hostname). The Go resolver should be able to resolve ".local" domains when no nss modules other than files and dns are in use.
go/src/net/conf.go
Lines 341 to 347 in 9d836d4
We should also support context cancellation in acquireThread, so that when the thread limit is reached we don't make cgo calls for a context that is already cancelled (i.e. don't queue up a bunch of cgo calls when the context is done).
I will send CLs for this.
CC @ianlancetaylor @golang/security @rolandshoemaker