os, runtime: Go 1.9 assumes epoll is non-blocking #21014

Closed
hanwen opened this issue Jul 15, 2017 · 9 comments

@hanwen
Contributor

commented Jul 15, 2017

See extensive discussion on hanwen/go-fuse#165

Symptoms:

  • The go-fuse end-to-end tests hang on Go 1.9.
  • Kernel stack traces show a syscall hung in fuse_file_poll, i.e. the Go runtime issues an epoll on files residing on the FUSE mount, and the kernel forwards this as a POLL call to the FUSE userspace process.
  • Some of these hangs occur when the FUSE process tries to respond ENOSYS to a POLL opcode.
  • The problem is always reproducible with GOMAXPROCS=1.

After discussion with Heschi and Austin, we came up with the following explanation:

The poller was implemented on the assumption that it would never block (and is wired especially into the runtime?). Hence, when it runs on a P, a blocking epoll call prevents other goroutines on the same P from running. If one of these is either the read syscall that receives the POLL opcode, or the write syscall that returns ENOSYS, then the POLL opcode never gets processed, and the runtime deadlocks itself.

This is only a problem if the same program both issues epoll calls and responds to them, in other words, tests for FUSE filesystems, so this is probably not a showstopper for the Go 1.9 release.

We could kludge around this by having go-fuse issue its own epoll call (as a blocking syscall); this would trigger a FUSE POLL opcode, which we could answer with ENOSYS, preventing further poll problems. (Does epoll work for directory file descriptors?)

Other approaches:

  • Change the Go runtime to assume that epoll is blocking. Then the syscalls responding to POLL could be moved onto other Ps. This may be too invasive a change to justify the use case, though?

  • Convince the FUSE development team to provide a capability flag for POLL. Then we could prevent POLL calls from happening altogether. This is much cleaner, but the Linux kernel version is outside of our control in many circumstances.

@ianlancetaylor ianlancetaylor changed the title Go 1.9 assumes epoll is non-blocking os, runtime: Go 1.9 assumes epoll is non-blocking Jul 15, 2017

@hanwen

Contributor Author

commented Jul 15, 2017

Heschi suggested a simpler approach, which is LockOSThread on the read loop for FUSE: this would avoid having epoll and FUSE interfering with each other.
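
For illustration only, a minimal sketch of what a read loop pinned with LockOSThread could look like. The names readLoop and handle are hypothetical, fd stands in for the /dev/fuse descriptor, and the demo in main uses a pipe rather than a real mount. This is not go-fuse's code, and it makes no claim that pinning the thread avoids the hang.

```go
package main

import (
	"fmt"
	"runtime"
	"syscall"
)

// readLoop pins the calling goroutine to its OS thread, then reads requests
// from fd until EOF or an error. handle is a hypothetical request handler.
func readLoop(fd int, handle func(req []byte)) error {
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()

	buf := make([]byte, 64*1024)
	for {
		n, err := syscall.Read(fd, buf)
		if err == syscall.EINTR {
			continue // interrupted by a signal, retry
		}
		if err != nil {
			return err
		}
		if n == 0 {
			return nil // EOF
		}
		handle(buf[:n])
	}
}

func main() {
	// Demo on a pipe instead of a real FUSE device.
	var p [2]int
	if err := syscall.Pipe(p[:]); err != nil {
		fmt.Println("pipe:", err)
		return
	}
	syscall.Write(p[1], []byte("request"))
	syscall.Close(p[1])

	err := readLoop(p[0], func(req []byte) {
		fmt.Printf("got %d bytes\n", len(req))
	})
	syscall.Close(p[0])
	fmt.Println("loop ended:", err)
}
```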

@ianlancetaylor

Contributor

commented Jul 15, 2017

I assume that the problem occurs in the call to the epoll_wait system call from the function netpoll in runtime/netpoll_epoll.go. That call ties up the g, m, and p for the duration of the call. I assume that it doesn't matter whether the call is blocking or not; either way, when the kernel tries to call back into the program to run the FUSE code, the FUSE code will stall waiting for a goroutine.

While it seems clearly more likely when GOMAXPROCS is 1, technically there is nothing stopping many simultaneous non-blocking calls to epoll_wait, and if we have enough of them at once the problem will occur at any GOMAXPROCS level.

If I understand the FUSE code correctly, which I probably don't, the FUSE code is running a goroutine that sits in syscall.Read waiting for FUSE requests to arrive from the kernel. So the problem is that we need to permit that goroutine to run even though GOMAXPROCS goroutines are sitting waiting for epoll_wait to return.

@ianlancetaylor

Contributor

commented Jul 15, 2017

If I understand the problem correctly, then I don't see how LockOSThread will help. LockOSThread doesn't permit you to escape from the restrictions on GOMAXPROCS. But, if it works, great.

@bradfitz bradfitz added this to the Go1.9Maybe milestone Jul 15, 2017

@hanwen

Contributor Author

commented Jul 15, 2017

Your analysis of the FUSE code is correct. Part of the time it sits in a Read waiting for the kernel; the rest of the time it tries to write back to the kernel.

I haven't tried the suggestion yet; going out for dinner now :-).

@ianlancetaylor

Contributor

commented Jul 15, 2017

I think this would work if the FUSE daemon used syscall.RawSyscall rather than syscall.Read and syscall.Write. But the cost would be that the FUSE daemon would always occupy a GOMAXPROCS slot, so either the user or the FUSE code would have to bump up GOMAXPROCS.
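
As a rough sketch of the RawSyscall idea (Linux assumed): syscall.RawSyscall does not tell the scheduler that the goroutine is entering a system call, so the goroutine keeps its P while it is blocked in read(2). The helper name rawRead is hypothetical, and the pipe in main only stands in for the FUSE device. Bumping GOMAXPROCS by one is one way to pay for the slot the reader occupies.

```go
package main

import (
	"fmt"
	"runtime"
	"syscall"
	"unsafe"
)

// rawRead issues read(2) through syscall.RawSyscall. Unlike syscall.Read,
// the runtime is not notified, so the calling goroutine holds on to its P
// for the whole duration of the call.
func rawRead(fd int, buf []byte) (int, error) {
	if len(buf) == 0 {
		return 0, nil
	}
	n, _, errno := syscall.RawSyscall(syscall.SYS_READ,
		uintptr(fd), uintptr(unsafe.Pointer(&buf[0])), uintptr(len(buf)))
	if errno != 0 {
		return 0, errno
	}
	return int(n), nil
}

func main() {
	// Reserve one extra P for the goroutine that will sit in rawRead.
	runtime.GOMAXPROCS(runtime.GOMAXPROCS(0) + 1)

	// Demo on a pipe instead of a real FUSE device.
	var p [2]int
	if err := syscall.Pipe(p[:]); err != nil {
		fmt.Println("pipe:", err)
		return
	}
	defer syscall.Close(p[0])
	syscall.Write(p[1], []byte("ping"))
	syscall.Close(p[1])

	buf := make([]byte, 16)
	n, err := rawRead(p[0], buf)
	fmt.Println("read", n, "bytes, err:", err)
}
```

A matching write helper would look the same with syscall.SYS_WRITE.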

I haven't yet been able to think of a workable fix in the Go runtime.

If there is a way for the os package to detect that a file is on a FUSE filesystem, then we could avoid using the poller for that file. But I don't know of a way to do that.

@hanwen

Contributor Author

commented Jul 15, 2017

Let me see if I can make RawSyscall work. Since this affects tests primarily, that's probably acceptable.

@hanwen

Contributor Author

commented Jul 15, 2017

FWIW, on Linux you can find out if a file is on FUSE by calling fstatfs on the fd and checking the f_type field of the result. It does sound like a lot of overhead, since most files aren't on FUSE, and some care is needed: fstatfs is also forwarded to FUSE, so you cannot call it from where you call epoll.
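
A small sketch of that check, assuming Linux. The constant is FUSE_SUPER_MAGIC from the kernel's linux/magic.h; the helper names isFuseType and onFUSE are made up for this example.

```go
package main

import (
	"fmt"
	"syscall"
)

// fuseSuperMagic is FUSE_SUPER_MAGIC from linux/magic.h.
const fuseSuperMagic = 0x65735546

// isFuseType reports whether a statfs f_type value identifies FUSE.
func isFuseType(fsType int64) bool {
	return fsType == fuseSuperMagic
}

// onFUSE calls fstatfs on fd and checks the filesystem type.
// On a FUSE mount this call is itself forwarded to the FUSE daemon.
func onFUSE(fd int) (bool, error) {
	var st syscall.Statfs_t
	if err := syscall.Fstatfs(fd, &st); err != nil {
		return false, err
	}
	// The width of Type differs between architectures, so widen it.
	return isFuseType(int64(st.Type)), nil
}

func main() {
	fd, err := syscall.Open("/", syscall.O_RDONLY, 0)
	if err != nil {
		fmt.Println("open:", err)
		return
	}
	defer syscall.Close(fd)

	fuse, err := onFUSE(fd)
	fmt.Println("on FUSE:", fuse, "err:", err)
}
```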

@ianlancetaylor

Contributor

commented Jul 15, 2017

We could call fstatfs from newFile in os/file_unix.go. There it would be a normal syscall, so there would be no difficulty in running the FUSE code.

@hanwen

Contributor Author

commented Jul 15, 2017

I fixed it by kludging in something that triggers the POLL opcode before the Go runtime has the chance to.
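
A rough, Linux-only sketch of this kind of kludge follows. It is not the actual go-fuse change, and triggerPoll, pollOnce and the path are hypothetical. The idea is to open a file on the mount with syscall.Open (so the os package's poller never sees it) and poll it once through an ordinary blocking syscall, so the daemon can answer the resulting POLL opcode with ENOSYS. ppoll(2) is used here because syscall.SYS_POLL is not defined on every Linux architecture.

```go
package main

import (
	"fmt"
	"syscall"
	"unsafe"
)

// pollFd mirrors struct pollfd from <poll.h>.
type pollFd struct {
	fd      int32
	events  int16
	revents int16
}

const pollIn = 0x1 // POLLIN

// pollOnce issues a zero-timeout ppoll(2) on fd via syscall.Syscall6, i.e.
// as a regular syscall that the scheduler knows about. It returns the
// number of descriptors reported ready.
func pollOnce(fd int) (int, error) {
	fds := [1]pollFd{{fd: int32(fd), events: pollIn}}
	var ts syscall.Timespec // zero timeout: do not wait for events
	n, _, errno := syscall.Syscall6(syscall.SYS_PPOLL,
		uintptr(unsafe.Pointer(&fds[0])), 1,
		uintptr(unsafe.Pointer(&ts)), 0, 0, 0)
	if errno != 0 {
		return 0, errno
	}
	return int(n), nil
}

// triggerPoll opens path without going through the os package and polls it
// once. On a FUSE mount this is meant to provoke the POLL opcode early.
func triggerPoll(path string) error {
	fd, err := syscall.Open(path, syscall.O_RDONLY, 0)
	if err != nil {
		return err
	}
	defer syscall.Close(fd)
	_, err = pollOnce(fd)
	return err
}

func main() {
	// "/" is only a placeholder; a real caller would pass a file that
	// lives on the freshly mounted FUSE filesystem.
	fmt.Println("triggerPoll:", triggerPoll("/"))
}
```

The call has to come from a goroutine other than the one serving FUSE requests, since the poll does not return until the daemon has replied.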

Other FUSE libraries, like the ones from @rsc and bazil.org, should likely implement something similar.
