os: recent FreeBSD update to sys/fusefs to allow kevents breaks go polling model #54100

Open
kwhite-uottawa opened this issue Jul 27, 2022 · 9 comments
Labels
NeedsInvestigation OS-FreeBSD
Milestone

Comments


@kwhite-uottawa kwhite-uottawa commented Jul 27, 2022

What version of Go are you using (go version)?

$ go version
go version go1.18.4 freebsd/amd64

Does this issue reproduce with the latest release?

Yes, but it depends on the FreeBSD release: the issue is present on FreeBSD 13.1 and later.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/root/.cache/go-build"
GOENV="/root/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS="-mod=vendor"
GOHOSTARCH="amd64"
GOHOSTOS="freebsd"
GOINSECURE=""
GOMODCACHE="/root/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="freebsd"
GOPATH="/root/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go118"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go118/pkg/tool/freebsd_amd64"
GOVCS=""
GOVERSION="go1.18.4"
GCCGO="gccgo"
GOAMD64="v1"
AR="ar"
CC="cc"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/usr/local/poudriere/ports/default/net/rclone/work/github.com/rclone/rclone@v1.59.0/go.mod"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build3160769938=/tmp/go-build -gno-record-gcc-switches"
GOROOT/bin/go version: go version go1.18.4 freebsd/amd64
GOROOT/bin/go tool compile -V: compile version go1.18.4
uname -v: FreeBSD 14.0-CURRENT #3 main-n256462-79e1500276a-dirty: Thu Jun 30 19:37:24 EDT 2022     root@e6220:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG
lldb --version: lldb version 14.0.5 (https://github.com/llvm/llvm-project.git revision llvmorg-14.0.5-0-gc12386ae247c)
  clang revision llvmorg-14.0.5-0-gc12386ae247c
  llvm revision llvmorg-14.0.5-0-gc12386ae247c

What did you do?

Tried to use rclone (https://github.com/rclone/rclone) to mount a remote filesystem. rclone uses FUSE behind the scenes.

# rclone mount ~ /mnt

What did you expect to see?

Nothing: the rclone mount command should complete without error.

What did you see instead?

# rclone mount ~ /mnt
2022/07/27 17:57:57 ERROR : /mnt: Unmounted rclone mount
2022/07/27 17:57:57 Fatal error: failed to umount FUSE fs: resource temporarily unavailable

Discussion

The issue is discussed in FreeBSD bug 258056 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=258056

The fact that the *BSDs do not use polling for regular files is covered in #19093.

Possible patch

The patch below adds "/dev/fuse" as another pollable = false exception.

--- src/os/file_unix.go.orig    2022-07-12 11:22:57.000000000 -0400
+++ src/os/file_unix.go 2022-07-27 08:35:28.234028000 -0400
@@ -165,6 +165,10 @@
                        if (runtime.GOOS == "darwin" || runtime.GOOS == "ios") && typ == syscall.S_IFIFO {
                                pollable = false
                        }
+
+                       if runtime.GOOS == "freebsd" && name == "/dev/fuse" {  // /dev/fuse always reports ready for writing
+                               pollable = false
+                       }
                }
        }
Contributor

@ianlancetaylor ianlancetaylor commented Jul 28, 2022

Looking for the magic name /dev/fuse doesn't seem like a good approach. What if someone does ln -s /dev/fuse myfuse and then tries to open myfuse? Presumably that will fail in the same way.

Is there some way to check for a type as we already do for a FIFO?
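As a rough illustration of what a type-based check looks like from user code, here is a minimal sketch that classifies a file by its mode bits rather than by its name. The names fileKind, kindOfPath and kindOfPipe are hypothetical and are not part of the os package. Because the check goes through stat, a symlink such as myfuse would be classified by its target, not by its name.

```go
package main

import (
	"fmt"
	"os"
)

// fileKind reports a coarse file type from the mode bits.
// Hypothetical helper for illustration; not part of the os package.
func fileKind(fi os.FileInfo) string {
	m := fi.Mode()
	switch {
	case m&os.ModeNamedPipe != 0:
		return "fifo"
	case m&os.ModeCharDevice != 0:
		return "chardev"
	case m&os.ModeSocket != 0:
		return "socket"
	case m.IsDir():
		return "dir"
	case m.IsRegular():
		return "regular"
	}
	return "other"
}

// kindOfPath classifies the file a path resolves to (symlinks are followed).
func kindOfPath(path string) (string, error) {
	fi, err := os.Stat(path)
	if err != nil {
		return "", err
	}
	return fileKind(fi), nil
}

// kindOfPipe classifies the read end of a freshly created pipe.
func kindOfPipe() (string, error) {
	r, w, err := os.Pipe()
	if err != nil {
		return "", err
	}
	defer r.Close()
	defer w.Close()
	fi, err := r.Stat()
	if err != nil {
		return "", err
	}
	return fileKind(fi), nil
}

func main() {
	for _, p := range []string{"/dev/null", "/dev/fuse", "."} {
		k, err := kindOfPath(p)
		fmt.Println(p, k, err)
	}
	k, err := kindOfPipe()
	fmt.Println("pipe", k, err)
}
```

Note that the truss output later in this thread shows /dev/fuse with mode crw-rw-rw-, i.e. a character device, so a type check alone would not single it out from other character devices.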

@ianlancetaylor ianlancetaylor changed the title recent FreeBSD update to sys/fusefs to allow kevents breaks go polling model os: recent FreeBSD update to sys/fusefs to allow kevents breaks go polling model Jul 28, 2022
@ianlancetaylor ianlancetaylor added OS-FreeBSD NeedsInvestigation labels Jul 28, 2022
@ianlancetaylor ianlancetaylor added this to the Go1.20 milestone Jul 28, 2022
Contributor

@ianlancetaylor ianlancetaylor commented Jul 28, 2022

CC @golang/freebsd

Member

@dmgk dmgk commented Jul 28, 2022

We could probably implement a function similar to devname(3), which is used by mount_fusefs(8) to check the device name. Hardcoding "kern.devname" MIB (which is [1, 2147483224] on 13.1-RELEASE) doesn't sound right so this approach would need runtime translation of the sysctl name to MIB. sysctlnametomib() is present in runtime but unexported. I'm not sure what would be the best course of action here.

Contributor

@ianlancetaylor ianlancetaylor commented Jul 28, 2022

For example, it would be fine to add sysctlnametomib to internal/syscall/unix and to call it from the os package.


@gopherbot gopherbot commented Jul 30, 2022

Change https://go.dev/cl/420235 mentions this issue: os: Disable polling for fuse devices on FreeBSD

Member

@paulzhol paulzhol commented Jul 30, 2022

I've ended up looking at this as well. I ran the bazil/fuse/examples/hellofs example under truss and got the same result:

# truss -SdHf -o /tmp/truss ./hellofs -fuse.debug /tmp/tst1
2022/07/30 14:30:17 resource temporarily unavailable

The relevant part is this:

 3678 100150: 0.032046504 open("/dev/fuse",O_RDWR|O_CLOEXEC,00) = 3 (0x3)

 3678 100150: 0.032667484 fstat(3,{ mode=crw-rw-rw- ,inode=102,size=0,blksize=4096 }) = 0 (0x0)
 3678 100150: 0.032856138 kqueue()               = 4 (0x4)
 3678 100150: 0.033037163 fcntl(4,F_SETFD,FD_CLOEXEC) = 0 (0x0)


 3678 100150: 0.033541314 compat11.kevent(4,{ 3,EVFILT_READ,EV_ADD|EV_CLEAR,0,0,0x827253f68 3,EVFILT_WRITE,EV_ADD|EV_CLEAR,0,0,0x827253f68 },2,0x0,0,0x0) = 0 (0x0)
 3678 100150: 0.033721862 fcntl(3,F_GETFL,)      = 2 (0x2)
 3678 100150: 0.033866169 fcntl(3,F_SETFL,O_RDWR|O_NONBLOCK) ERR#19 'Operation not supported by device'

Note how the file is registered with the netpoller, but the subsequent attempt to put it into non-blocking mode fails. This corresponds to the following code:

go/src/os/file_unix.go

Lines 171 to 184 in 9a2001a

    if err := f.pfd.Init("file", pollable); err != nil {
        // An error here indicates a failure to register
        // with the netpoll system. That can happen for
        // a file descriptor that is not supported by
        // epoll/kqueue; for example, disk files on
        // Linux systems. We assume that any real error
        // will show up in later I/O.
    } else if pollable {
        // We successfully registered with netpoll, so put
        // the file into nonblocking mode.
        if err := syscall.SetNonblock(fdi, true); err == nil {
            f.nonblock = true
        }
    }

We don't unregister a pollable file if syscall.SetNonblock() failed. I think that is the bug in the runtime we should be fixing.
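To make the suggested behavior concrete, here is a minimal model of it. This is only a sketch: poller, initFile and simulate are hypothetical stand-ins, not the runtime's netpoller or the code in either CL. It registers a descriptor and then rolls the registration back if switching to non-blocking mode fails.

```go
package main

import (
	"errors"
	"fmt"
)

// poller is a hypothetical stand-in for the runtime netpoller.
type poller struct {
	registered map[int]bool
}

func (p *poller) register(fd int) error {
	p.registered[fd] = true
	return nil
}

func (p *poller) unregister(fd int) {
	delete(p.registered, fd)
}

// initFile follows the shape of the quoted logic, with one change: if
// setNonblock fails, the descriptor is removed from the poller again so it
// is treated as an ordinary blocking file.
func initFile(p *poller, fd int, setNonblock func(int) error) (nonblock bool) {
	if err := p.register(fd); err != nil {
		return false // not pollable; later I/O will surface real errors
	}
	if err := setNonblock(fd); err != nil {
		p.unregister(fd) // roll back the registration
		return false
	}
	return true
}

// simulate runs initFile with a setNonblock that either succeeds or fails,
// and reports whether fd 3 is still registered and marked non-blocking.
func simulate(fail bool) (registered, nonblock bool) {
	p := &poller{registered: map[int]bool{}}
	set := func(int) error {
		if fail {
			return errors.New("operation not supported by device")
		}
		return nil
	}
	nonblock = initFile(p, 3, set)
	return p.registered[3], nonblock
}

func main() {
	fmt.Println(simulate(false))
	fmt.Println(simulate(true))
}
```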

Until this is addressed (or a kernel fix lands), here is a simple workaround for bazil/fuse#280:

diff --git a/mount_freebsd.go b/mount_freebsd.go
index 9106a18..52a1af0 100644
--- a/mount_freebsd.go
+++ b/mount_freebsd.go
@@ -56,10 +56,11 @@ func mount(dir string, conf *mountConfig) (*os.File, error) {
                }
        }

-       f, err := os.OpenFile("/dev/fuse", os.O_RDWR, 0o000)
+       fd, err := syscall.Open("/dev/fuse", os.O_RDWR|syscall.O_CLOEXEC, 0o000)
        if err != nil {
                return nil, err
        }
+       f := os.NewFile(uintptr(fd), "/dev/fuse")

        cmd := exec.Command(
                "/sbin/mount_fusefs",

By using os.NewFile() on a blocking file-descriptor, we end up with the "old" behavior.
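The same pattern can be tried without a FUSE device. The sketch below is an assumption-laden illustration: openBlocking and roundTrip are hypothetical helper names, and a temporary file stands in for /dev/fuse. It opens a path with a raw syscall.Open, so the descriptor starts out in blocking mode, and wraps it with os.NewFile.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// openBlocking opens path with a raw open(2) call and wraps the descriptor
// in an *os.File. O_NONBLOCK is deliberately not passed, so the descriptor
// stays in blocking mode. Hypothetical helper for illustration.
func openBlocking(path string) (*os.File, error) {
	fd, err := syscall.Open(path, syscall.O_RDWR|syscall.O_CLOEXEC, 0)
	if err != nil {
		return nil, err
	}
	return os.NewFile(uintptr(fd), path), nil
}

// roundTrip writes msg to a temporary file through openBlocking and reads
// it back. The temporary file stands in for the real device.
func roundTrip(msg string) (string, error) {
	tmp, err := os.CreateTemp("", "blocking-demo-*")
	if err != nil {
		return "", err
	}
	path := tmp.Name()
	tmp.Close()
	defer os.Remove(path)

	f, err := openBlocking(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	if _, err := f.WriteString(msg); err != nil {
		return "", err
	}
	buf := make([]byte, len(msg))
	if _, err := f.ReadAt(buf, 0); err != nil {
		return "", err
	}
	return string(buf), nil
}

func main() {
	got, err := roundTrip("hello")
	fmt.Println(got, err)
}
```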


@gopherbot gopherbot commented Jul 30, 2022

Change https://go.dev/cl/420334 mentions this issue: os: don't attempt to call syscall.SetNonblock() when kind == kindNonBlock

Member

@paulzhol paulzhol commented Jul 30, 2022

I've sent https://go.dev/cl/420334. With it, the bazil/fuse module can be made to work properly with /dev/fuse and the netpoller by going through os.NewFile + O_NONBLOCK, which makes fdi go through the kindNonBlock flow.

Once the kernel fuse device supports SETFL, we can go back to regular os.OpenFile use.

diff --git a/mount_freebsd.go b/mount_freebsd.go
index 9106a18..28d4ef1 100644
--- a/mount_freebsd.go
+++ b/mount_freebsd.go
@@ -8,6 +8,8 @@ import (
        "strings"
        "sync"
        "syscall"
+
+       "golang.org/x/sys/unix"
 )

 func handleMountFusefsStderr(errCh chan<- error) func(line string) (ignore bool) {
@@ -56,10 +58,11 @@ func mount(dir string, conf *mountConfig) (*os.File, error) {
                }
        }

-       f, err := os.OpenFile("/dev/fuse", os.O_RDWR, 0o000)
+       fd, err := unix.Open("/dev/fuse", unix.O_RDWR|unix.O_NONBLOCK|unix.O_CLOEXEC, 0o000)
        if err != nil {
                return nil, err
        }
+       f := os.NewFile(uintptr(fd), "/dev/fuse")

        cmd := exec.Command(
                "/sbin/mount_fusefs",

Member

@paulzhol paulzhol commented Jul 31, 2022

Update: I believe there's a kernel issue here with the fuse device. bazil/fuse is always using blocking IO:
https://github.com/bazil/fuse/blob/fb710f7dfd05053a3bc9516dd5a7a8f84ead8aab/fuse.go#L557-L560
https://github.com/bazil/fuse/blob/fb710f7dfd05053a3bc9516dd5a7a8f84ead8aab/fuse.go#L577-L579
(Direct use of syscall.Read/Write on an os.File.Fd())

However, the mere initial attempt to set the file descriptor into non-blocking mode, which reports an error:
fcntl(3,F_SETFL,O_RDWR|O_NONBLOCK) ERR#19 'Operation not supported by device'
does in fact switch the device into non-blocking mode, which lets it return EAGAIN (resource temporarily unavailable) on reads. bazil/fuse does not expect these errors, so the mount fails.

If we skip this initial attempt, for example by explicitly starting from a blocking file descriptor (#54100 (comment)), then the device doesn't enter non-blocking mode and never returns EAGAIN.
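For readers unfamiliar with this failure mode, the following sketch shows the general effect on an ordinary pipe rather than on /dev/fuse. It assumes nothing about the FreeBSD fuse driver; readEmptyNonblockingPipe and gotEAGAIN are hypothetical names. A raw syscall.Read on a non-blocking descriptor with no data available returns EAGAIN instead of waiting, which is the error a caller written for blocking I/O does not expect.

```go
package main

import (
	"fmt"
	"syscall"
)

// readEmptyNonblockingPipe creates a pipe, switches the read end to
// non-blocking mode, and attempts a raw read while no data is available.
// It returns the error from that read.
func readEmptyNonblockingPipe() error {
	var p [2]int
	if err := syscall.Pipe(p[:]); err != nil {
		return err
	}
	defer syscall.Close(p[0])
	defer syscall.Close(p[1])

	if err := syscall.SetNonblock(p[0], true); err != nil {
		return err
	}
	buf := make([]byte, 16)
	_, err := syscall.Read(p[0], buf)
	return err
}

// gotEAGAIN reports whether the raw read failed with EAGAIN.
func gotEAGAIN() bool {
	return readEmptyNonblockingPipe() == syscall.EAGAIN
}

func main() {
	fmt.Println("read error:", readEmptyNonblockingPipe())
	fmt.Println("is EAGAIN:", gotEAGAIN())
}
```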
