You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I'm hitting TestMemNodeFsWrite failure in tests. Even though this test lives in deprecated nodefs, it might be due to 2021 kernel regression which might be affecting whole go-fuse and other FUSE libraries. Please read below for details.
The test fails reliably and quickly when run in a loop:
GOMAXPROCS=1 go test -run TestMemNodeFsWrite -count 1000 -failfast
(neo) (z-dev) (g.env) kirr@deca:~/src/neo/src/github.com/hanwen/go-fuse/fuse/nodefs$ GOMAXPROCS=1 go test -run TestMemNodeFsWrite -count 1000 -failfast16:33:29.722233 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:29.773813 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:29.799427 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:29.848899 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:29.952907 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:30.002003 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:30.027920 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE--- FAIL: TestMemNodeFsWrite (0.00s) memnode_test.go:67: content mismatch: got "", want "hello"FAILexit status 1FAIL github.com/hanwen/go-fuse/fuse/nodefs 0.517s(neo) (z-dev) (g.env) kirr@deca:~/src/neo/src/github.com/hanwen/go-fuse/fuse/nodefs$ GOMAXPROCS=1 go test -run TestMemNodeFsWrite -count 1000 -failfast16:33:31.362206 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:31.362356 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:31.436443 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:31.565719 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE--- FAIL: TestMemNodeFsWrite (0.00s) memnode_test.go:67: content mismatch: got "", want "hello"FAILexit status 1FAIL github.com/hanwen/go-fuse/fuse/nodefs 0.523s(neo) (z-dev) (g.env) kirr@deca:~/src/neo/src/github.com/hanwen/go-fuse/fuse/nodefs$ GOMAXPROCS=1 go test -run TestMemNodeFsWrite -count 1000 -failfast16:33:32.804827 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:32.804902 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:32.979033 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:33.059869 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:33.159474 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:33.288119 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:33.366934 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:33.468874 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:33.519913 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:33.654345 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:33.679886 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:33.743035 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:33.819909 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:33.845533 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:33.941402 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:33.996450 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:34.022149 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:34.072047 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:34.114008 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:34.140888 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:34.140927 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:34.217649 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:34.225709 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:34.367711 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE16:33:34.572854 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASE--- FAIL: TestMemNodeFsWrite (0.00s) memnode_test.go:67: content mismatch: got "", want "hello"FAILexit status 1FAIL github.com/hanwen/go-fuse/fuse/nodefs 1.967s
Note: for this test to fail it experimentally turned out to be important for other fuse filesystems to be unmounted and /sys/fs/fuse/connections to be empty.
Here is debug log of a failure:
=== RUN TestMemNodeFsWrite
14:29:30.963701 callFusermount: executing ["/usr/bin/fusermount3" "/tmp/go-fuse-memnode_test230041599/mnt" "-o" "subtype=nodefs.memNode,max_read=131072"]
14:29:30.966811 rx 2: INIT n0 {7.37 Ra 131072 ASYNC_READ,POSIX_LOCKS,ATOMIC_O_TRUNC,EXPORT_SUPPORT,BIG_WRITES,DONT_MASK,SPLICE_WRITE,SPLICE_MOVE,SPLICE_READ,FLOCK_LOCKS,IOCTL_DIR,AUTO_INVAL_DATA,READDIRPLUS,READDIRPLUS_AUTO,ASYNC_DIO,WRITEBACK_CACHE,NO_OPEN_SUPPORT,PARALLEL_DIROPS,HANDLE_KILLPRIV,POSIX_ACL,ABORT_ERROR,MAX_PAGES,CACHE_SYMLINKS,NO_OPENDIR_SUPPORT,EXPLICIT_INVAL_DATA,0x70000000} "\x03\x00\x00\x00\x00\x00\x00\x00"... 48b
14:29:30.966881 tx 2: OK, {7.28 Ra 131072 ASYNC_READ,BIG_WRITES,AUTO_INVAL_DATA,READDIRPLUS,NO_OPEN_SUPPORT,PARALLEL_DIROPS,MAX_PAGES 0/0 Wr 131072 Tg 0 MaxPages 32}
14:29:30.967154 rx 4: ACCESS n1 {u=1000 g=1000 r}
14:29:30.967183 tx 4: 38=function not implemented
14:29:30.967210 rx 6: LOOKUP n1 [".go-fuse-epoll-hack"] 20b
14:29:30.967229 tx 6: OK, {n18446744073709551615 g0 tE=0s tA=0s {M0100644 SZ=0 L=1 0:0 B0*0 i0:18446744073709551615 A 0.000000 M 0.000000 C 0.000000}}
14:29:30.967275 rx 8: LOOKUP n1 [".Trash"] 7b
14:29:30.967293 tx 8: 2=no such file or directory, {n0 g0 tE=0s tA=0s {M00 SZ=0 L=0 0:0 B0*0 i0:0 A 0.000000 M 0.000000 C 0.000000}}
14:29:30.967318 rx 10: OPEN n18446744073709551615 {O_RDONLY,0x8000}
14:29:30.967335 tx 10: OK, {Fh 18446744073709551615 }
14:29:30.967354 rx 12: LOOKUP n1 [".Trash-1000"] 12b
14:29:30.967371 tx 12: 2=no such file or directory, {n0 g0 tE=0s tA=0s {M00 SZ=0 L=0 0:0 B0*0 i0:0 A 0.000000 M 0.000000 C 0.000000}}
14:29:30.967402 rx 14: POLL n18446744073709551615
14:29:30.967447 tx 14: 38=function not implemented
14:29:30.967468 rx 16: FLUSH n18446744073709551615 {Fh 18446744073709551615}
14:29:30.967473 tx 16: OK
14:29:30.967517 rx 18: RELEASE n18446744073709551615 {Fh 18446744073709551615 0x8000 L0}
14:29:30.967527 tx 18: OK
14:29:30.968109 rx 20: LOOKUP n1 ["test"] 5b
14:29:30.968191 tx 20: 2=no such file or directory, {n0 g0 tE=0s tA=0s {M00 SZ=0 L=0 0:0 B0*0 i0:0 A 0.000000 M 0.000000 C 0.000000}}
14:29:30.968341 rx 22: CREATE n1 {0100644 [WRONLY,CREAT,TRUNC,0x8000] (022)} ["test"] 5b
14:29:30.968539 tx 22: OK, {n3 g2 {M0100644 SZ=0 L=1 0:0 B0*0 i0:3 A 1686742170.968376 M 1686742170.968376 C 1686742170.968376} &{2 0 0}}
14:29:30.968690 rx 24: GETXATTR n3 {sz 0} ["security.capability"] 20b
14:29:30.968986 tx 24: 61=no data available
14:29:30.969040 rx 26: WRITE n3 {Fh 2 [0 +5) L 0 WRONLY,NONBLOCK,0x8000} "hello" 5b
14:29:30.969060 tx 26: OK
14:29:30.969120 rx 30: GETATTR n3 {Fh 0 }
14:29:30.969162 tx 30: OK, {tA=0.1s {M0100644 SZ=0 L=1 0:0 B0*0 i0:3 A 1686742170.968376 M 1686742170.968376 C 1686742170.968376}}
14:29:30.969142 rx 28: FLUSH n3 {Fh 2}
14:29:30.969362 tx 28: OK
14:29:30.969452 rx 34: OPEN n3 {O_RDONLY,0x8000}
14:29:30.969510 tx 34: OK, {Fh 3 }
14:29:30.969585 rx 36: FLUSH n3 {Fh 3}
14:29:30.969605 tx 36: OK
memnode_test.go:67: content mismatch: got "", want "hello"
14:29:30.969818 unmount: executing ["/usr/bin/fusermount3" "-u" "/tmp/go-fuse-memnode_test230041599/mnt"]
14:29:30.970508 rx 32: RELEASE n3 {Fh 2 WRONLY,NONBLOCK,0x8000 L0}
14:29:30.970527 tx 32: OK
14:29:30.970738 rx 38: RELEASE n3 {Fh 3 NONBLOCK,0x8000 L0}
14:29:30.970749 tx 38: OK
14:29:30.972261 received ENODEV (unmount request), thread exiting
14:29:30.972323 received ENODEV (unmount request), thread exiting
14:29:30.972336 received ENODEV (unmount request), thread exiting
--- FAIL: TestMemNodeFsWrite (0.01s)
This test(*) first writes hello into /test file and then wants to read it back. The file is ultimately served by loopback filesystem, which delegates getattr, write and read to underlying os fs.
The write happens in rx 26 WRITE n3 {Fh 2 [0 +5) L 0 WRONLY,NONBLOCK,0x8000} "hello" 5b. It goes ok. Then the kernel issues GETATTR request. This request - in this particular run - is probably issued because attributes happen to become timed out on the kernel side (attributes TTL in this test is 100ms). The getattr is served by loopbackfs by statting the underlying os file which still stats as with st_size=0. This seems to be correct behaviour since FLUSH was not yet called. Then the FLUSH is called - when file handle opened for writing is closed. Then the file is opened again for reading. This happens on rx 34: OPEN n3 {O_RDONLY,0x8000}. The kernel does not send new GETATTR request to refresh its view of file size, and since previous getattr returned st_size=0 the kernel serves all reads from its pagecache directly without querying the filesystem with any READ request. Which leads to failure since the file reads as empty instead of hello.
There is no documentation for FUSE protocol anywhere (semantic of it, not messages serialization), but the bug seems to be due to that the kernel does not invoke GETATTR after FLUSH.
This behaviour was introduced in 2021 in https://git.kernel.org/linus/fa5eee57e33e (Linux 5.16) which changed fuse_flush to invalidate only st_blocks, instead of all attributes, in particularly stopping to invalidate st_size.
Hello everyone.
I'm hitting
TestMemNodeFsWrite
failure in tests. Even though this test lives in deprecated nodefs, it might be due to 2021 kernel regression which might be affecting whole go-fuse and other FUSE libraries. Please read below for details.The test fails reliably and quickly when run in a loop:
GOMAXPROCS=1 go test -run TestMemNodeFsWrite -count 1000 -failfast
Note: for this test to fail it experimentally turned out to be important for other fuse filesystems to be unmounted and
/sys/fs/fuse/connections
to be empty.Here is debug log of a failure:
This test(*) first writes
hello
into/test
file and then wants to read it back. The file is ultimately served by loopback filesystem, which delegates getattr, write and read to underlying os fs.The write happens in
rx 26 WRITE n3 {Fh 2 [0 +5) L 0 WRONLY,NONBLOCK,0x8000} "hello" 5b
. It goes ok. Then the kernel issuesGETATTR
request. This request - in this particular run - is probably issued because attributes happen to become timed out on the kernel side (attributes TTL in this test is 100ms). The getattr is served by loopbackfs by statting the underlying os file which still stats as with st_size=0. This seems to be correct behaviour since FLUSH was not yet called. Then the FLUSH is called - when file handle opened for writing is closed. Then the file is opened again for reading. This happens onrx 34: OPEN n3 {O_RDONLY,0x8000}
. The kernel does not send new GETATTR request to refresh its view of file size, and since previous getattr returnedst_size=0
the kernel serves all reads from its pagecache directly without querying the filesystem with any READ request. Which leads to failure since the file reads as empty instead ofhello
.There is no documentation for FUSE protocol anywhere (semantic of it, not messages serialization), but the bug seems to be due to that the kernel does not invoke GETATTR after FLUSH.
This behaviour was introduced in 2021 in https://git.kernel.org/linus/fa5eee57e33e (Linux 5.16) which changed
fuse_flush
to invalidate onlyst_blocks
, instead of all attributes, in particularly stopping to invalidatest_size
.Kirill
/cc @szmi, @Nikratio, @amir73il
(*)
go-fuse/fuse/nodefs/memnode_test.go
Lines 55 to 69 in 255ab74
go-fuse:
v2.3.0-11-g255ab74
os:
Linux deca 6.1.0-9-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.27-1 (2023-05-08) x86_64 GNU/Linux
go:
go version go1.20.5 linux/amd64
The text was updated successfully, but these errors were encountered: