Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: os.File.Close does not interrupt os.File.ReadAt #20110

Closed
navytux opened this issue Apr 25, 2017 · 3 comments

Comments

Projects
None yet
4 participants
@navytux
Copy link
Contributor

commented Apr 25, 2017

Since CL https://golang.org/cl/41670 os.File.{ReadAt,WriteAt} do not use poller. However os.File.Close logic for interrupting in-flight IO currently works only when poller is used. Quoting #19586 (comment):

In the context of #7970 I've noticed that the actual syscall.Close happens in FD.destroy which itself is called from under last decref. This is ok when poller is used - there awaiting for IO goroutines are first deregistered by pollDesc.evict but in case of blocking IO without poller (ReadAt case) FD.Close will only mark fd mutex as closed (so issuing new read/write requests will not be possible) but e.g. pread request that is already in flight won't be interrupted.

In case where pread could be blocking potentially indefinitely long (networked filesystems, slow IO with retries, ...) that could lead to deadlocks when client code expects Close to interrupt IO, then waits for goroutine doing IO to complete.

So what is the plan here? (offhand I would say that at least for fds not using poller syscall.Close should be called by FD.Close itself with protection from #7970-style races)

Spawning this from #19586 not to forget this problem exists.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Apr 28, 2017

In my testing, if I open a pipe in normal blocking mode, read from it in one thread, and then close it from another thread, the read is not interrupted. It simply hangs indefinitely. So as far as I can tell the situation you describe was broken before and it's broken now. I don't see how to fix it.

The GNU/Linux man page for close says

It is probably unwise to close file descriptors while they  may  be  in
use by system calls in other threads in the same process.  Since a file
descriptor may be reused, there are some obscure race  conditions  that
may cause unintended side effects.

While I think we can probably handle the race condition, that also suggests that closing a file does not interrupt any existing I/O on that file.

@navytux

This comment has been minimized.

Copy link
Contributor Author

commented May 1, 2017

@ianlancetaylor thanks for feedback. What you say about traditional close semantic is also true for sockets. However the issue here was submitted with Go's net.Conn.Close semantic in mind where Close interrupts all currently blocking Read or Write. With recent generalization of runtime poller pipes should now be also handled in non-blocking / epoll mode so the question here really remains only about "regular" files - those that are rejected by epoll.

While it is tempting for me to say that like in #19586 (comment) IO on regular files should be alway really "non-blocking", and thus no cancellation is needed at all, I cannot ignore the fact of FUSE use case and when we already had deadlock where both server and client were tested as part of one process:

#19563 (comment)
https://groups.google.com/forum/#!msg/golang-nuts/11rdExWP6ac/UnP5P5m3VPEJ

With Close it can be a similar situation: Read request is made; a filesystem there really blocks somehow (NFS / FUSE / ...) but the syscall is handled via bocking way. If Close does not interrupt it - there is simply no way to cancel it.

I can imagine interrupting syscall from another P from Close via sending a signal. Offhand this should involve at least some synchronization via fdmu so that the signal is always processed before actual IO operation returns. I'm not sure such complications would be accepted though.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Jun 28, 2017

I don't think the deadlock mentioned above is possible here. The deadlock referenced was due to using RawSyscall rather than Syscall. That doesn't happen here.

I'm going to close this until we have a test case.

@golang golang locked and limited conversation to collaborators Jun 28, 2018

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
You can’t perform that action at this time.