net: unclear how to confirm synchronously that a net.TCPListener is closed #21833

Closed
slingamn opened this issue Sep 11, 2017 · 2 comments

@slingamn

commented Sep 11, 2017

What version of Go are you using (go version)?

1.9

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/shivaram/go"
GORACE=""
GOROOT="/usr/local/go"
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build866858012=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"

What did you do?

https://gist.github.com/slingamn/f1f2ef4a8d004e67a9b0f27b698a5b13

This is a simple TCP ping-pong server that uses channels for connection handoff. The wrinkle is that I'm trying to periodically close the listener from a separate goroutine, then open a new listener that rebinds to the same address. (The non-toy use case for this is reloading configuration files in a long-running server.) This is implemented by:

  1. Using Listener.Close() to interrupt the Listener.Accept() call on the other goroutine
  2. Using a channel to tell the other goroutine that it should call Listener.Close() itself and exit
  3. Using a channel to wait for the other goroutine to complete its call to Listener.Close()
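
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the gist: the names acceptLoop and closeWaitRebind are hypothetical, the loopback address with an ephemeral port is an assumption made so the sketch can run anywhere, and step 2 is folded into the error returned by Accept rather than signaled on its own channel.

```go
package main

import (
	"fmt"
	"net"
)

// acceptLoop serves connections until Accept fails, which is what happens
// once another goroutine closes the listener. It closes stopped on exit.
func acceptLoop(ln net.Listener, stopped chan<- struct{}) {
	defer close(stopped)
	for {
		conn, err := ln.Accept()
		if err != nil {
			// Step 2: close from this goroutine as well. The error from a
			// repeated Close is deliberately ignored.
			ln.Close()
			return
		}
		conn.Close() // a real server would hand the connection off here
	}
}

// closeWaitRebind binds a loopback port, stops the accept loop, waits for it
// to exit, and then binds the same address again.
func closeWaitRebind() (first, second string, err error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", "", err
	}
	first = ln.Addr().String()
	stopped := make(chan struct{})
	go acceptLoop(ln, stopped)

	ln.Close() // step 1: interrupt the blocked Accept
	<-stopped  // step 3: wait until the accept goroutine is done

	ln2, err := net.Listen("tcp", first)
	if err != nil {
		return first, "", err
	}
	defer ln2.Close()
	return first, ln2.Addr().String(), nil
}

func main() {
	first, second, err := closeWaitRebind()
	if err != nil {
		fmt.Println("listen error:", err)
		return
	}
	fmt.Println("first:", first, "rebound:", second)
}
```

Removing the `<-stopped` line corresponds to running the test case with --waitforstop=false.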

Observed behavior:

  1. As written, the code appears to work (it runs correctly, and no races are reported under stress testing with -race)
  2. When step 3 above (waiting for the other goroutine to call Listener.Close) is omitted (by running the test case with the command-line argument --waitforstop=false), the program crashes with: listen error: listen tcp :7777: bind: address already in use.

According to strace, the problem is that the close(2) call on the underlying socket has not executed by the time the goroutine executing main() tries to create the new listener; therefore the bind(2) call fails with EADDRINUSE. Based on my reading of the runtime's source code, this is because the underlying file descriptor is reference-counted, and close(2) is not called until the reference count drops to 0. (Since the runtime sets SO_REUSEADDR on the socket, the address can be rebound immediately after close(2).)

The problem I'm having is that based on my reading of the documentation, I'm unable to either prove that step 3 makes the code safe (i.e., that by the time <-wrapper.HasStopped finishes, the underlying socket has definitely been closed), or understand why the omission of step 3 makes the code unsafe (i.e., why the reference count does not immediately go to 0 after the main() goroutine calls Listener.Close()). Is there any way to get a synchronous guarantee that the underlying socket has been closed?
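
One way to picture the reference-counting behavior described above is the toy model below. It is only an illustration of that reading and is not taken from the runtime source: the type refFD and its methods are hypothetical, and it assumes that a goroutine blocked in Accept holds a reference that is dropped only once that goroutine wakes up and returns.

```go
package main

import (
	"fmt"
	"sync"
)

// refFD is a toy reference-counted descriptor (hypothetical, not runtime code).
type refFD struct {
	mu       sync.Mutex
	refs     int  // one reference belongs to the owner until Close
	closing  bool // set by Close; new operations are refused afterwards
	released bool // stands in for the close(2) call on the real socket
}

func newRefFD() *refFD { return &refFD{refs: 1} }

// Acquire takes a reference for an in-flight operation such as Accept.
func (f *refFD) Acquire() bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	if f.closing {
		return false
	}
	f.refs++
	return true
}

// Release drops a reference; the last one out releases the descriptor.
func (f *refFD) Release() {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.refs--
	if f.refs == 0 {
		f.released = true
	}
}

// Close marks the descriptor as closing and drops the owner's reference.
// It does not wait for other holders.
func (f *refFD) Close() {
	f.mu.Lock()
	f.closing = true
	f.mu.Unlock()
	f.Release()
}

// Released reports whether the underlying resource has been released.
func (f *refFD) Released() bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	return f.released
}

func main() {
	fd := newRefFD()
	fd.Acquire()               // a goroutine is blocked in Accept
	fd.Close()                 // main goroutine closes the listener
	fmt.Println(fd.Released()) // false: the blocked operation still holds a reference
	fd.Release()               // the accept goroutine returns
	fmt.Println(fd.Released()) // true
}
```

Under this model, Close returning on one goroutine says nothing about when the last reference is dropped on another, which would match the strace observation.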

Thanks very much for your time.

@davecheney

Contributor

commented Sep 11, 2017

				listener.Close()
				wrapper.HasStopped <- true

This creates a happens-before relationship: a value can only be received from HasStopped once the sender above has reached that step, that is, after listener.Close() has returned.
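
The guarantee can be seen in isolation with a plain variable standing in for the listener. This is a minimal sketch with hypothetical names (handoff, state, done); the write to state plays the role of listener.Close(), and the Go memory model orders the send before the completion of the corresponding receive.

```go
package main

import "fmt"

// handoff writes to a plain variable in one goroutine, then signals on a
// channel. The receiver is guaranteed to observe the write.
func handoff() int {
	var state int
	done := make(chan bool)
	go func() {
		state = 42   // stands in for listener.Close()
		done <- true // stands in for wrapper.HasStopped <- true
	}()
	<-done       // the receive completes only after the send
	return state // always observes 42, with no data race
}

func main() {
	fmt.Println(handoff()) // prints 42
}
```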

We don't use the issue tracker to ask questions, so I'm going to close this. Please see https://golang.org/wiki/Questions for good places to ask. Thanks.

@davecheney davecheney closed this Sep 11, 2017

@slingamn

Author

commented Sep 11, 2017

I'm sorry if the question was either misdirected or poorly phrased. I should clarify that I think this may indicate either a bug in the documentation or in the runtime. Specifically, TCPListener.Close is documented as "Close stops listening on the TCP address." The test case demonstrates that a single call to Close() (i.e., the one that occurs on the main() goroutine) is not guaranteed to release the address, in the sense of making it available for new bind(2) calls. It's not clear on the basis of the documentation whether this is the expected behavior, or whether there is any mechanism for ensuring that the address is released.

slingamn added a commit to slingamn/oragono that referenced this issue Sep 11, 2017

refactor listener update/destroy code
Don't close and reopen listeners: golang/go#21833

@golang golang locked and limited conversation to collaborators Sep 11, 2018
