Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os: FreeBSD system crashes when adding regular files to kqueue #19093

Open
ianlancetaylor opened this Issue Feb 14, 2017 · 28 comments

Comments

Projects
None yet
6 participants
@ianlancetaylor
Copy link
Contributor

ianlancetaylor commented Feb 14, 2017

While testing https://golang.org/cl/36800, it appears that the FreeBSD amd64 system crashes when adding regular disk files to kqueue. This is based on testing using gomote, which creates a VM running FreeBSD 10.1 (I think). I am going to disable using kqueue on FreeBSD for regular files; it will still be used for sockets as usual. I am opening this issue to note the problem in the hopes that it can be fixed by someone more familiar with FreeBSD.

@gopherbot

This comment has been minimized.

Copy link

gopherbot commented Feb 14, 2017

CL https://golang.org/cl/36800 mentions this issue.

@ianlancetaylor ianlancetaylor added this to the Go1.9Maybe milestone Feb 15, 2017

gopherbot pushed a commit that referenced this issue Feb 15, 2017

os: use poller for file I/O
This changes the os package to use the runtime poller for file I/O
where possible. When a system call blocks on a pollable descriptor,
the goroutine will be blocked on the poller but the thread will be
released to run other goroutines. When using a non-pollable
descriptor, the os package will continue to use thread-blocking system
calls as before.

For example, on GNU/Linux, the runtime poller uses epoll. epoll does
not support ordinary disk files, so they will continue to use blocking
I/O as before. The poller will be used for pipes.

Since this means that the poller is used for many more programs, this
modifies the runtime to only block waiting for the poller if there is
some goroutine that is waiting on the poller. Otherwise, there is no
point, as the poller will never make any goroutine ready. This
preserves the runtime's current simple deadlock detection.

This seems to crash FreeBSD systems, so it is disabled on FreeBSD.
This is issue 19093.

Using the poller on Windows requires opening the file with
FILE_FLAG_OVERLAPPED. We should only do that if we can remove that
flag if the program calls the Fd method. This is issue 19098.

Update #6817.
Update #7903.
Update #15021.
Update #18507.
Update #19093.
Update #19098.

Change-Id: Ia5197dcefa7c6fbcca97d19a6f8621b2abcbb1fe
Reviewed-on: https://go-review.googlesource.com/36800
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
@asomers

This comment has been minimized.

Copy link

asomers commented Mar 6, 2017

Do you have a stacktrace of the FreeBSD system?

@asomers

This comment has been minimized.

Copy link

asomers commented Apr 17, 2017

@ianlancetaylor Can you please post some more information about the crash? The stack trace, output of uname -a, and steps to reproduce would be helpful.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor Author

ianlancetaylor commented Apr 17, 2017

Sorry for not noticing your earlier message.

I do not have a stack trace of the system. I am accessing the system using https://godoc.org/golang.org/x/build/cmd/gomote, which runs FreeBSD in a VM, and I am deducing that the system has crashed because it stops responding for a while, and when it does start responding the system uptime has been reset to a very small number. I assume that what is happening is that the VM is being discarded and restarted. Perhaps @bradfitz can say something more about that.

This is how I reproduce it (I just verified that it still fails like this for me). First, remove these lines from src/os/file_unix.go:

	// Don't try to use kqueue with regular files on FreeBSD.
	// It crashes the system unpredictably while running all.bash.
	// Issue 19093.
	if runtime.GOOS == "freebsd" {
		pollable = false
	}

Then run

name=`gomote create freebsd-amd64-110`
gomote put14 $name
gomote push $name
gomote run $name go/src/make.bash

When I try this it gets to the point where it prints

##### Building packages and commands for freebsd/amd64.

which is the point where it starts running cmd/go built with the new polling code. At that point the run stops and gomote prints

2017/04/17 11:06:00 Buildlet https://farmer.golang.org:443 failed three heartbeats; final error: timeout waiting for headers

Further attempts to use gomote run $name fails with

Error running run: Error trying to execute /usr/bin/uptime: buildlet: HTTP status 502 Bad Gateway: 

Then after a while the gomote usually recovers, but running uptime shows that the system has rebooted.

With luck you can create the problem on a real FreeBSD system, or on a VM which you fully control, by simply editing src/os/file_unix.go as suggested and then running make.bash.

@bradfitz

This comment has been minimized.

Copy link
Member

bradfitz commented Apr 17, 2017

I suppose gomote & the build system could be modified to watch the VM's serial console & log that, but I don't have time for that at the moment.

For debugging this, somebody should just use a FreeBSD VM directly, not via gomote.

@asomers

This comment has been minimized.

Copy link

asomers commented Apr 17, 2017

I'll try it on a real FreeBSD system. But when you've done it using gomote, did you notice if there was anything in /var/crash after the reboot? /var/crash/minfree and /var/crash/bounds are normal, but any other file would be interesting.

@bradfitz

This comment has been minimized.

Copy link
Member

bradfitz commented Apr 17, 2017

The gomote proxy server that owns the VM destroys it immediately if it fails its heartbeats. The user (Ian, here) would have no way to check those things before the machine was nuked.

@asomers

This comment has been minimized.

Copy link

asomers commented Apr 17, 2017

I can't reproduce on FreeBSD head or stable/10 amd64. Can you please figure out the exact version that gomote is using, and any nondefault configurations?

@bradfitz

This comment has been minimized.

@asomers

This comment has been minimized.

Copy link

asomers commented Apr 19, 2017

I can reproduce it on 11.0-RELEASE. This turns out to be a duplicate of FreeBSD PR 24923, even though the stack traces look different. Here's the stacktrace:

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address   = 0x18
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80a80c99
stack pointer           = 0x28:0xfffffe085fb26690
frame pointer           = 0x28:0xfffffe085fb26870
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 954 (go_bootstrap)
trap number             = 12
panic: page fault
cpuid = 2
KDB: stack backtrace:
#0 0xffffffff80b24477 at kdb_backtrace+0x67
#1 0xffffffff80ad97e2 at vpanic+0x182
#2 0xffffffff80ad9653 at panic+0x43
#3 0xffffffff80fa1d51 at trap_fatal+0x351
#4 0xffffffff80fa1f43 at trap_pfault+0x1e3
#5 0xffffffff80fa14ec at trap+0x26c
#6 0xffffffff80f845a1 at calltrap+0x8
#7 0xffffffff80a802c5 at kern_kevent+0xb5
#8 0xffffffff80a8014c at sys_kevent+0x11c
#9 0xffffffff80fa26ae at amd64_syscall+0x4ce
#10 0xffffffff80f8488b at Xfast_syscall+0xfb

This bug was fixed in head by r310302 and later MFCed to stable/11 by r310578, but no update was ever issued for FreeBSD 11.0. I checked with secteam, and they don't plan to issue an update. So you should continue to use your current workaround until 11.1 is released, which will happen sometime in June.

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=214923
https://svnweb.freebsd.org/base?view=revision&revision=310302

@bradfitz

This comment has been minimized.

Copy link
Member

bradfitz commented Apr 19, 2017

@asomers, thanks for the investigation and update!

@bradfitz

This comment has been minimized.

Copy link
Member

bradfitz commented Nov 22, 2017

I suppose gomote & the build system could be modified to watch the VM's serial console & log that, but I don't have time for that at the moment.

Btw, the x/build/cmd/debugnewvm does the serial console thing, btw, but gomote doesn't yet. But sounds like it's not needed now, since the bug has been fixed.

@ianlancetaylor, do you want to conditionally enable this behavior based on FreeBSD version?

In the meantime I can update our builder image from 11.0 to 11.1. We still have our 9.3 builders running too, which would test the fallback path if you make it $FREE_VERSION >= 11.1.

@fuzxxl

This comment has been minimized.

Copy link

fuzxxl commented Dec 28, 2018

Now that FreeBSD 12 is out, it would be nice if files were finally added to the poller by default. @bradfitz @ianlancetayler, any news about enabling this conditionally dependending on FreeBSD version?

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor Author

ianlancetaylor commented Dec 28, 2018

@fuzxxl No news on that. Do you know what kernel version fixes the problem?

@fuzxxl

This comment has been minimized.

Copy link

fuzxxl commented Dec 28, 2018

@ianlancetaylor According to @bradfitz previous comment, r310302 fixed this issue. This was already part of FreeBSD-11.1 apparently.

I just tested this on my FreeBSD 12-RELEASE system and it did crash or fail with polling enabled for regular files.

@paulzhol

This comment has been minimized.

Copy link
Member

paulzhol commented Dec 29, 2018

I've commented out these lines:

go/src/os/file_unix.go

Lines 126 to 128 in 58a17b4

if runtime.GOOS == "freebsd" && kind == kindOpenFile {
pollable = false
}

And ran ./all.bash on FreeBSD 10.4 and 12.0, they both pass.

I've previously asked @bradfitz if we could drop the 10.3 builder for this. (#25289 (comment) - no, not before go 1.13 where we officially drop FreeBSD 10 support).

I do have an additional concern: kqueue supports EVFILT_READ on Vnodes (regular files) but not EVFILT_WRITE. We however always register for both events.

When running the following program to read a large file with truss, the write event is always reported. Is this the expected behavior?

package main

import (
        "io"
        "os"
        "sync"
)

func main() {
        f, err := os.Open("/boot/kernel/kernel")
        if err != nil {
                panic(err)
        }
        defer f.Close()

        var wg sync.WaitGroup
        wg.Add(1)
        go func() {
                defer wg.Done()
                for {
                        b := make([]byte, 1024)
                        _, err := f.Read(b)
                        if err != nil {
                                if err == io.EOF {
                                        break
                                }
                                panic(err)
                        }
                }
        }()
        wg.Wait()
}
open("/boot/kernel/kernel",O_RDONLY|O_CLOEXEC,00) = 3 (0x3)
kqueue()					 = 4 (0x4)
fcntl(4,F_SETFD,FD_CLOEXEC)			 = 0 (0x0)
compat11.kevent(4,{ 3,EVFILT_READ,EV_ADD|EV_CLEAR,0,0,0x80275bea0 3,EVFILT_WRITE,EV_ADD|EV_CLEAR,0,0,0x80275bea0 },2,0x0,0,0x0) = 0 (0x0)
fcntl(3,F_GETFL,)				 = 0 (0x0)
fcntl(3,F_SETFL,O_RDONLY|O_NONBLOCK)		 = 0 (0x0)

compat11.kevent(4,0x0,0,{ 3,EVFILT_READ,EV_CLEAR,0,0x1dd75d8,0x80275bea0 3,EVFILT_WRITE,EV_CLEAR,0,0,0x80275bea0 },64,{ 0.000000000 }) = 2 (0x2)
compat11.kevent(4,0x0,0,{ 3,EVFILT_WRITE,EV_CLEAR,0,0,0x80275bea0 3,EVFILT_READ,EV_CLEAR,0,0x1dc0dd8,0x80275bea0 },64,{ 0.000000000 }) = 2 (0x2)
compat11.kevent(4,0x0,0,{ 3,EVFILT_WRITE,EV_CLEAR,0,0,0x80275bea0 3,EVFILT_READ,EV_CLEAR,0,0x1d971d8,0x80275bea0 },64,{ 0.000000000 }) = 2 (0x2)
compat11.kevent(4,0x0,0,{ 3,EVFILT_WRITE,EV_CLEAR,0,0,0x80275bea0 3,EVFILT_READ,EV_CLEAR,0,0x1d7e9d8,0x80275bea0 },64,{ 0.000000000 }) = 2 (0x2)
compat11.kevent(4,0x0,0,{ 3,EVFILT_WRITE,EV_CLEAR,0,0,0x80275bea0 3,EVFILT_READ,EV_CLEAR,0,0x1d62dd8,0x80275bea0 },64,{ 0.000000000 }) = 2 (0x2)
compat11.kevent(4,0x0,0,{ 3,EVFILT_WRITE,EV_CLEAR,0,0,0x80275bea0 3,EVFILT_READ,EV_CLEAR,0,0x1d475d8,0x80275bea0 },64,{ 0.000000000 }) = 2 (0x2)
compat11.kevent(4,0x0,0,{ 3,EVFILT_WRITE,EV_CLEAR,0,0,0x80275bea0 3,EVFILT_READ,EV_CLEAR,0,0x1d2d9d8,0x80275bea0 },64,{ 0.000000000 }) = 2 (0x2)
...
@ianlancetaylor

This comment has been minimized.

Copy link
Contributor Author

ianlancetaylor commented Jan 4, 2019

Does kevent always return immediately with the EVFILT_WRITE and EVFILT_READ events? That would be problematic: it would mean that the Go program is effectively busy waiting.

@fuzxxl

This comment has been minimized.

Copy link

fuzxxl commented Jan 4, 2019

This might be possible. A FreeBSD programmer told me that files are always ready to read/write, so this mechanism might not work as-is. I think kqueue has to be combined with AIO using EVFILT_AIO for this to work, though I'm not sure how to set this up correctly. Should work the same way on macOS.

@asomers

This comment has been minimized.

Copy link

asomers commented Jan 4, 2019

That is correct. Plain files are always ready for reading and writing. AIO works with kqueue because the asynchronous part is the actual read or write; with sockets the asynchronous part is waiting before you start the read or write. But you're out of luck on OSX; EVFILT_AIO isn't supported there.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor Author

ianlancetaylor commented Jan 5, 2019

Thanks. It seems possible that we should continue avoiding the poller for regular files on FreeBSD, and should perhaps do the same on the other *BSDs.

@asomers

This comment has been minimized.

Copy link

asomers commented Jan 5, 2019

I could be wrong, but I thought that the same was true on Linux; poll(2) will always indicate that plain files are ready for reading and writing.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor Author

ianlancetaylor commented Jan 5, 2019

On GNU/Linux we use epoll; epoll doesn't permit adding normal files, so the current code works fine without going into a busy wait.

@paulzhol

This comment has been minimized.

Copy link
Member

paulzhol commented Jan 5, 2019

Does kevent always return immediately with the EVFILT_WRITE and EVFILT_READ events?

Yes, both events return immediately as far as I can tell. The EVFILT_READ event actually reports the offset from the current position to end of file. It can be seen decreasing in the truss output above.

In the sample program (with a short time.Sleep after the for loop), once EOF is reached only the EVFILT_WRITE keeps being reported as ready.

If I also drop the EVFILT_WRITE registration from the netpoller, then the fd is no longer reported as ready by kevent once it is read to the end.

Is this something we wish to pursue? The added benefit would be not requiring an OS thread when reading regular files.

diff --git a/src/runtime/netpoll_kqueue.go b/src/runtime/netpoll_kqueue.go
index fdaa1cd80d..88afaba44b 100644
--- a/src/runtime/netpoll_kqueue.go
+++ b/src/runtime/netpoll_kqueue.go
@@ -38,9 +38,9 @@ func netpollopen(fd uintptr, pd *pollDesc) int32 {
        ev[0].fflags = 0
        ev[0].data = 0
        ev[0].udata = (*byte)(unsafe.Pointer(pd))
-       ev[1] = ev[0]
-       ev[1].filter = _EVFILT_WRITE
-       n := kevent(kq, &ev[0], 2, nil, 0, nil)
+       //ev[1] = ev[0]
+       //ev[1].filter = _EVFILT_WRITE
+       n := kevent(kq, &ev[0], /*2*/ 1, nil, 0, nil)
        if n < 0 {
                return -n
        }
66946: compat11.kevent(4,0x0,0,{ 3,EVFILT_READ,EV_CLEAR,0,0x19d8,0x8027cbea8 },64,{ 0.000000000 }) = 1 (0x1)
66946: read(3,"\M-Y\M-S\M-d\M-w\M-+O\M-a\^?~"...,1024) = 1024 (0x400)
66946: read(3,"k\b2\M-9K\^\\M-9\M-1\M^]\^T\\"...,1024) = 1024 (0x400)
66946: read(3,"\M^H"\M^G\M-V\M-_\M--\^Qz\^D\^D"...,1024) = 1024 (0x400)
66946: read(3,"\^E\0\0\0\^A\0\0\0\b\0\0\0\0\0\0"...,1024) = 1024 (0x400)
66946: read(3,"\0\0\0\0\0\0\0\0\b\0\0\0\0\0\0\0"...,1024) = 1024 (0x400)
66946: read(3,"\^D\0\0\0\0\0\0\0\b\0\0\0\0\0\0"...,1024) = 1024 (0x400)
66946: read(3,"\0\0\0\0\0\0\0\0\^A\0\0\0\0\0\0"...,1024) = 472 (0x1d8)
66946: read(3,0xc00006eba8,1024)                 = 0 (0x0)
...
66946: compat11.kevent(4,0x0,0,{ },64,{ 0.000000000 }) = 0 (0x0)
@ianlancetaylor

This comment has been minimized.

Copy link
Contributor Author

ianlancetaylor commented Jan 5, 2019

We would only be able to avoid an OS thread in a case where the read or write returned EAGAIN, which presumably never happens on a regular file.

We can't simply drop the EVFILT_WRITE in general, of course, only for files. But it's not clear to me that we get a benefit from registering files. The big benefit of using the poller in the os package is using it for pipes, not files.

Perhaps we should call fstat to check for a regular file before adding it to the poller on systems for which kqueue behaves in this way.

@paulzhol

This comment has been minimized.

Copy link
Member

paulzhol commented Jan 5, 2019

We would only be able to avoid an OS thread in a case where the read or write returned EAGAIN

Thanks for explaining, you're referring to this, correct?

n, err := syscall.Read(fd.Sysfd, p)
if err != nil {
n = 0
if err == syscall.EAGAIN && fd.pd.pollable() {
if err = fd.pd.waitRead(fd.isFile); err == nil {
continue
}
}

The manpages for NetBSD, OpenBSD, DragonFly and OS X all indicate kevent supports Vnodes (regular files) with EVFILT_READ.
I'll do some tests in VMs, but assuming they don't fail when adding a regular file fd with both EVFILT_READ and EVFILT_WRITE filters set. I'll send a CL with an fstat gate.
(Should linux be tested explicitly instead of relying on epoll returning an error?)

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor Author

ianlancetaylor commented Jan 6, 2019

Adding a test means adding another system call, so we should only do it on systems where we can't avoid it in some other way.

@gopherbot

This comment has been minimized.

Copy link

gopherbot commented Jan 7, 2019

Change https://golang.org/cl/156379 mentions this issue: os: disable the use of netpoll on regular files on *BSDs.

@paulzhol

This comment has been minimized.

Copy link
Member

paulzhol commented Jan 7, 2019

I've tested only on OpenBSD 6.4 and NetBSD 8.0 using ktrace. On both systems the regular file is reported ready for reading and writing just once:

  7261 kevent_test 0.005291 CALL  open(0xc000014090,0x10000<O_RDONLY|O_CLOEXEC>)
  7261 kevent_test 0.005300 NAMI  "/bsd.mp"
  7261 kevent_test 0.005312 RET   open 3
  7261 kevent_test 0.005320 CALL  kqueue()
  7261 kevent_test 0.005329 RET   kqueue 4
  7261 kevent_test 0.005336 CALL  fcntl(4,F_SETFD,FD_CLOEXEC)
  7261 kevent_test 0.005343 RET   fcntl 0
  7261 kevent_test 0.005356 CALL  kevent(4,0xc000064d10,2,0,0,0)
  7261 kevent_test 0.005363 STRU  struct kevent [2] { ident=3, filter=EVFILT_READ, flags=0x21<EV_ADD|EV_CLEAR>, fflags=0<>, data=0, udata=0x2031f7ea0 } { ident=3, filter=EVFILT_WRITE, flags=0x21<EV_ADD|EV_CLEAR>, fflags=0<>, data=0, udata=0x2031f7ea0 }
  7261 kevent_test 0.005374 RET   kevent 0
  7261 kevent_test 0.005381 CALL  fcntl(3,F_GETFL)
  7261 kevent_test 0.005388 RET   fcntl 0
  7261 kevent_test 0.005395 CALL  fcntl(3,F_SETFL,0x4<O_NONBLOCK>)
  7261 kevent_test 0.005402 RET   fcntl 0
  7261 kevent_test 0.005434 CALL  read(3,0xc000065bb8,0x400)
  7261 kevent_test 0.005446 RET   read 1024/0x400
  7261 kevent_test 0.005454 CALL  read(3,0xc000065bb8,0x400)
  ...
  7261 kevent_test 0.018952 CALL  kevent(4,0,0,0xc00003d710,64,0xc00003d6f8)
  7261 kevent_test 0.018972 CALL  __thrsleep(0xc00002e6b8,CLOCK_MONOTONIC,0,0,0xc00002e6b8)
  7261 kevent_test 0.018990 CALL  read(3,0xc000065bb8,0x400)
  7261 kevent_test 0.019005 RET   __thrsleep 0
  7261 kevent_test 0.019027 STRU  struct timespec { 0 }
  7261 kevent_test 0.019037 STRU  struct kevent [2] { ident=3, filter=EVFILT_READ, flags=0x21<EV_ADD|EV_CLEAR>, fflags=0<>, data=14725134, udata=0x2031f7ea0 } { ident=3, filter=EVFILT_WRITE, flags=0x21<EV_ADD|EV_CLEAR>, fflags=0<>, data=0, udata=0x2031f7ea0 }
  7261 kevent_test 0.019069 CALL  __thrsleep(0xc00002ea38,CLOCK_MONOTONIC,0,0,0xc00002ea38)
  7261 kevent_test 0.019086 RET   kevent 2
...
  7261 kevent_test 0.038670 CALL  kevent(4,0,0,0xc00003d710,64,0xc00003d6f8)
  7261 kevent_test 0.038688 RET   read 1024/0x400
  7261 kevent_test 0.038697 STRU  struct timespec { 0 }
  7261 kevent_test 0.038707 CALL  read(3,0xc000065bb8,0x400)
  7261 kevent_test 0.038722 RET   kevent 0

with the CALL kevent / RET kevent 0 repeating

Attached is the full OpenBSD kdump as it contains struct values. The NetBSD behaviour seems the same minus the struct prints.
openbsd_ktrace.out.txt.gz

Update: Dragonfly 5.4.1 behaves the same (first kevent returns 2 filters, and then no more).

gopherbot pushed a commit that referenced this issue Jan 9, 2019

os: disable the use of netpoll on regular files on *BSDs.
The kqueue based netpoller always registers file descriptors with EVFILT_READ and EVFILT_WRITE.
However only EVFILT_READ notification is supported for regular files.
On FreeBSD a regular file is always reported as ready for writing, resulting in a busy wait.
On Darwin, Dragonfly, NetBSD and OpenBSD, a regular file is reported as ready for both reading and writing only once.

Updates #19093

Change-Id: If284341f60c6c2332fb5499637d4cfa7a4e26b7b
Reviewed-on: https://go-review.googlesource.com/c/156379
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.