Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stale data (zeros) returned under specific circumstances #168

Closed
exFalso opened this issue May 23, 2023 · 1 comment
Closed

Stale data (zeros) returned under specific circumstances #168

exFalso opened this issue May 23, 2023 · 1 comment

Comments

@exFalso
Copy link

exFalso commented May 23, 2023

Sorry for the mysterious title. We've been implementing a fuse-rs based FS and implemented a fuzzer for simple filesystem operations like read/write/seek. We encountered an issue where our implementation was behaving very strangely, returning zeros or sometimes stale data under certain circumstances. The smallest example I reduced it to:

  1. New file X is open twice, once in read mode R and once in write mode W.
  2. R is read -> empty data. This step is important as it triggers a kernel read request of size 4096, and it seems as if the kernel is caching the returned data
  3. W is written with N bytes -> success.
  4. R is read again -> the read bytes have the correct length N, but they are all zeros. The filesystem is not contacted by the kernel at all. It seems as if the kernel has updated the file metadata during 3, but has not invalidated the cached read.

I have tried reproducing with the regular libfuse passthrough.c implementation, but the issue did not reproduce. I did notice a larger version skew between the libfuse vs fuse-rs ABI versions.

Then I tried modifying hello.rs to add the required features (write and setattr) with as minimal changes as possible, and the issue did reproduce. The code can be found here: https://github.com/zargony/fuse-rs/compare/master...decentriq:fuse-rs:aslemmer/zeros-read-bug?expand=1, note that it branches off the latest release tag, not master.

In particular, the reproducing test:

fuse-rs/examples/hello.rs

Lines 139 to 153 in 10df620

#[test]
fn stale_data_bug() {
use std::io::Read;
use std::io::Write;
let mut file_read = std::fs::File::open("/tmp/mnt/hello.txt").unwrap();
let mut file_write = std::fs::File::create("/tmp/mnt/hello.txt").unwrap();
file_write.write_all("Init".as_bytes()).unwrap();
let mut buffer1 = vec![];
file_read.read_to_end(&mut buffer1).unwrap();
println!("buffer1 {:?}", buffer1);
file_write.write_all("Hello World!".as_bytes()).unwrap();
let mut buffer2 = vec![];
file_read.read_to_end(&mut buffer2).unwrap();
println!("buffer2 {:?}", buffer2);
}

When running this test, the filesystem output:

$ RUST_LOG=debug cargo run --example hello /tmp/mnt
...
INFO:fuse::session: Mounting /tmp/mnt
DEBUG:fuse::request: INIT(2)   kernel: ABI 7.34, flags 0x33fffffb, max readahead 131072
DEBUG:fuse::request: INIT(2) response: ABI 7.8, flags 0x1, max readahead 131072, max write 16777216
DEBUG:fuse::request: LOOKUP(4) parent 0x0000000000000001, name "hello.txt"
DEBUG:fuse::request: OPEN(6) ino 0x0000000000000002, flags 0x8000
DEBUG:fuse::request: OPEN(8) ino 0x0000000000000002, flags 0x8001
DEBUG:fuse::request: GETXATTR(10) ino 0x0000000000000002, name "security.capability", size 0
DEBUG:fuse::request: SETATTR(12) ino 0x0000000000000002, valid 0x208
DEBUG:fuse::request: WRITE(14) ino 0x0000000000000002, fh 0, offset 0, size 4, flags 0x6
DEBUG:fuse::request: GETATTR(16) ino 0x0000000000000002
DEBUG:fuse::request: READ(18) ino 0x0000000000000002, fh 0, offset 0, size 4096
DEBUG:fuse::request: WRITE(20) ino 0x0000000000000002, fh 0, offset 4, size 12, flags 0x6
DEBUG:fuse::request: GETATTR(22) ino 0x0000000000000002
DEBUG:fuse::request: FLUSH(24) ino 0x0000000000000002, fh 0, lock owner 31099430268040709
DEBUG:fuse::request: RELEASE(26) ino 0x0000000000000002, fh 0, flags 0x8001, release flags 0x0, lock owner 0
DEBUG:fuse::request: RELEASE(28) ino 0x0000000000000002, fh 0, flags 0x8000, release flags 0x0, lock owner 0

The test output:

$ cargo test --example hello stale_data_bug -- --nocapture
...
running 1 test
buffer1 [73, 110, 105, 116]
buffer2 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
test stale_data_bug ... ok

(I've added an initial write of "Init" to demonstrate that the basic write functionality works)

As you can see, the first READ request comes through fine, however after the write of "Hello World!", no new read requests arrive, the kernel simply returns zeros.

Any ideas? Is this a kernel bug? Or perhaps an ABI skew issue? I'm running a 5.15.63 kernel

@exFalso
Copy link
Author

exFalso commented Jun 6, 2023

Turns out the issue is with open() returning the same flags that were passed to it, which inadvertently set the FOPEN_KEEP_CACHE flag, causing the kernel cache to not be invalidated properly.

@exFalso exFalso closed this as completed Jun 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant