Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os: File.Close() regression - Close hangs while reader present #23943

Closed
gdamore opened this issue Feb 20, 2018 · 12 comments
Closed

os: File.Close() regression - Close hangs while reader present #23943

gdamore opened this issue Feb 20, 2018 · 12 comments

Comments

@gdamore
Copy link

@gdamore gdamore commented Feb 20, 2018

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

1.10

Does this issue reproduce with the latest release?

Yes, and it is a regression.

What operating system and processor architecture are you using (go env)?

go version go1.10 darwin/amd64

What did you do?

tcell (github.com/gdamore/tcell) is broken by the latest go 1.10 (works fine on 1.9.x).

We have an input loop that spins reading stdin (to scan for keypresses.)

When the program wants to terminate, we call Close() on that, to clean up things. We expect the input loop to fail with io.EOF or nil and for Close() to terminate immediately.

Instead Close() now hangs, until the user presses a key, and the check for closed file then succeeds.

It would appear that Close() no longer is thread safe, and no longer just calls the underlying OS close function.

@bradfitz

This comment has been minimized.

Copy link
Contributor

@bradfitz bradfitz commented Feb 20, 2018

@gdamore

This comment has been minimized.

Copy link
Author

@gdamore gdamore commented Feb 20, 2018

In retrospect, it may be that the new Close() behavior is causing me grief because of a bug in the macOS tty driver, that does not wake sockets in poll() when the underlying descriptor is closed. In the past I've used termios timeouts to workaround this, but the real solution (better) usually requires a separate file descriptor (like a pipe() based one) that can be used to wake the poller, which must then poll on two separate descriptors.

Unfortunately, I don't know how to set up two descriptors and poll() on them from one goroutine in golang. So what we need I suppose is an "interruptible" kind of file.

Put another way, I think the new golang behavior is now exposing me to an underlying macOS behavior (one I consider defective), that I was previously shielded from. Unfortunately I can't think of any way to work around this using purely go constructs.

@ALTree ALTree changed the title File.Close regression - file.Close() hangs while reader present os: File.Close() regression - Close hangs while reader present Feb 20, 2018
@bradfitz

This comment has been minimized.

Copy link
Contributor

@bradfitz bradfitz commented Feb 20, 2018

@gdamore, do you have a repro? We're much more likely to be able to help if we have code to run to see the problem.

@gdamore

This comment has been minimized.

Copy link
Author

@gdamore gdamore commented Feb 20, 2018

Yes, sure. The easiest way to do this is:

% go get github.com/gdamore/tcell
% cd $GOPATH/src/github.com/gdamore/tcell
% go run _demo/boxes.go

Press ESC to exit the application. In order versions this would just abort the app. But now you have to press one more keystroke to do this.

This is on macOS. It may work fine on other systems (I haven't tested yet).

I'm pretty sure at this point that this is a well known (to me at least) defect in macOS, which I used to be shielded from, but which now I am not -- basically close of a tty based file descriptor does not wake a reader in select/poll. Its interesting to me that with the new runtime, this hangs Close() -- I guess there is some effort in the go runtime to clean up its own internal pollers, but I imagine that your runtime is now getting busted by the same bug.

In C, the way I workaround this is by using a separate notification descriptor (created with pipe()), and then my poller (in poll(), select(), or similar -- and I believe kqueue() also behaves the same here) polls both the tty input fd, and the "special" notification fd I've created. This lets me wake the reader, so that it can bail out of its loop.

In the tcell code, you'll see that in tscreen_bsd.go, it gets "stuck" trying to close the tty file at line 104. I.e. t.in.Close() hangs. The other salient fact here is that the input loop is running, blocked on t.in.Read().

I'll write a test program and post it here or on play shortly.

@gdamore

This comment has been minimized.

Copy link
Author

@gdamore gdamore commented Feb 20, 2018

Test program: https://play.golang.org/p/BXLQplc69dL

Sorry its kind of longer than I'd like, but the termios stuff to set the input into raw mode is a key aspect of this.

If this program behaved properly, and you don't type anything after starting it, it will exit after sleeping for one second. But instead, it just "hangs" until you press a key.

Furthermore, the Read() in the inputLoop actually returns a valid keystroke, so Close() isn't actually tearing down the underlying FD.

@gdamore

This comment has been minimized.

Copy link
Author

@gdamore gdamore commented Feb 20, 2018

Interestingly enough, syscall.Close() of the underlying FD doesn't wake up the reader either -- again, this is because of the macOS tty driver bug. Working around this is a major PITA because you have to set up a separate notification descriptor.

I'd be interested to know if FreeBSD suffers from the same broken behavior.

@gdamore

This comment has been minimized.

Copy link
Author

@gdamore gdamore commented Feb 20, 2018

In tcell I worked around this by just making file.Close() async (goroutine) on darwin. The attached program above still demonstrates the broken behavior though.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Feb 20, 2018

The Go runtime doesn't assume that calling close on a descriptor will wake up a read on that same descriptor. About the only way I can explain the results you are seeing is if calling read on a TTY blocks even though the descriptor is set to nonblocking mode.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Feb 20, 2018

The program passes on FreeBSD. I haven't been able to get gomote to give me a Darwin instance I can test

@bradfitz

This comment has been minimized.

Copy link
Contributor

@bradfitz bradfitz commented Feb 21, 2018

Macs are all messed up right now. #23859

@gdamore

This comment has been minimized.

Copy link
Author

@gdamore gdamore commented Feb 21, 2018

Its likely that the darwin tty driver doesn't even honor nonblocking mode reasonably. Most people use termios to set up timeouts, but that has a nasty effect of chewing up CPU, basically putting the console mode app in polling mode.

What I had witnessed in C land was that poll() or select() didn't wake up, when the fd was closed. Whether a non-blocking read of the tty also blocks or not is not known to me. But if you're using poll or select to wake up (the usual case for non-blocking mode I/O!) then the code will probably break.

With any other FOSS I could just fix the broken driver and submit the fix upstream. Alas, I have no knowledge of how to do that for Apple. The tty driver has been broken in this way for years.

@gopherbot

This comment has been minimized.

Copy link

@gopherbot gopherbot commented Mar 6, 2018

Change https://golang.org/cl/99015 mentions this issue: internal/poll: if poller init fails, assume blocking mode

@gopherbot gopherbot closed this in 558769a Mar 6, 2018
@golang golang locked and limited conversation to collaborators Mar 6, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
4 participants
You can’t perform that action at this time.