Incorrect writes from fuse #70

Closed
aloknerurkar opened this issue Sep 17, 2022 · 12 comments

Comments

@aloknerurkar

I created a FUSE file system using the cgofuse library. The repo is here. The implementation is very similar to the MemFs reference implementation provided in this project; the only difference is that instead of storing files in memory, it pushes them to some storage.

There are some tests in the package. I am currently running them on an M1 Mac mini. What I observe is that when we do small writes to files on the FUSE mount, we get duplicate write ops.

For example, the test writes in 1024-byte chunks, and I see the following ops:

write off 4702208 len 1024
write off 4702208 len 2048
write off 4702208 len 3096
....
....
write off 4702208 len 10240

Then, at some point, I get a write op that writes zeroes to the first 64 KiB of the file:

write off 0 len 65536 [0 0 0 0 0 0 0 0 0 0]

This doesn't always happen; the test fails roughly 2 out of 5 runs. Things are also better when I run the test on my MacBook Pro (M1 Pro).

I have been debugging this in my code for a couple of weeks and have added more tests around the parts I had doubts about. I have concluded that this is coming from the FUSE end.

Is there any known issue around this?

@billziss-gh
Collaborator

billziss-gh commented Sep 17, 2022

Is the problem that data is written twice over the same range or is the problem that erroneous data are being written?

If the former, this is legal. The OS does not provide any guarantees about the order that writes will arrive at your file system.

If you want better control over the writes, you must use open/O_DIRECT on Linux and fcntl/F_NOCACHE on OSX.

@aloknerurkar
Author

The former is happening, although it is not the problem. I just wanted to make sure it is legal.

The problem is the write of zeroes, and it almost always covers offsets 0-64k. You should be able to reproduce it if you run the test with a count of 5 or 10.

@billziss-gh
Collaborator

Yes, this is legal. You should not expect any particular order for writes when doing cached I/O.

@aloknerurkar
Author

> Yes, this is legal. You should not expect any particular order for writes when doing cached I/O.

So this is not a problem. The write of zeroes to the first 64k bytes is the main problem. It's always the 0-64k offsets, as I mentioned. Any idea why this could happen?

@billziss-gh
Collaborator

> The write for 0s for the first 64k bytes is the main problem.

Is your test/application writing something other than zeroes in the range 0-64K and the OS sends you zeroes instead?

Or is your test/application writing at offset 4702208 onward, thus creating a hole from 0-4702208 (which conceptually contains zeroes)?
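A small sketch of the hole behavior mentioned above, using only the Go standard library on a regular temporary file (not a FUSE mount). The `holeIsZero` helper is hypothetical: it writes a payload at a given offset into an empty file and checks that everything before that offset reads back as zeroes.

```go
package main

import (
	"fmt"
	"os"
)

// holeIsZero writes payload at offset ofst into a fresh temporary file and
// reports whether every byte before ofst reads back as zero.
// Hypothetical helper for illustration only.
func holeIsZero(payload []byte, ofst int64) (bool, error) {
	f, err := os.CreateTemp("", "hole-*")
	if err != nil {
		return false, err
	}
	defer os.Remove(f.Name())
	defer f.Close()

	// Writing past the end of an empty file leaves a hole before ofst.
	if _, err := f.WriteAt(payload, ofst); err != nil {
		return false, err
	}

	hole := make([]byte, ofst)
	if _, err := f.ReadAt(hole, 0); err != nil {
		return false, err
	}
	for _, b := range hole {
		if b != 0 {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	ok, err := holeIsZero([]byte("data"), 4702208)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("hole reads as zeroes:", ok)
}
```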

@aloknerurkar
Author

aloknerurkar commented Sep 17, 2022

The write for 0-64k comes much later, after the initial writes have already happened. So, after the writes at offset 4702208, we receive this write, which overwrites the first 64k bytes with zeroes. The application/test is not sending it.

@aloknerurkar
Author

I ran the test while logging all the write ops I get from FUSE; the logs are attached. When I was testing the other day I was seeing 64k, but now I am seeing bigger values.

If you check the logs, the problem surfaces around L935, after the writes for the initial offsets have already completed. I see write ops that zero out parts of the file. These are definitely not coming from the test, which writes the file sequentially with random bytes.

Each log line reads as: write <file path> <offset> <file handle> <length of write buf> <first 10 bytes of buf>

log.txt

@aloknerurkar
Author

@billziss-gh So any more pointers on this? Were you able to check the logs?

@billziss-gh
Collaborator

If you are getting writes that overwrite legitimate data on macOS, my suggestion would be to follow up on the OSXFUSE repo. Cgofuse is a thin layer around different FUSE libraries and would not introduce writes of its own.

@asabya

asabya commented Sep 26, 2022

We are seeing something like this I suppose.

@aloknerurkar
Author

@asabya This is exactly what is happening. It seems that in between the writes we are getting a Getattr call that does not have a valid file handle populated, even though the file is open.

@billziss-gh Any idea why the filehandle is invalid in the Getattr call?

I think we can close this issue.

@billziss-gh
Collaborator

> Any idea why the filehandle is invalid in the Getattr call?

This is by design. Getattr may be called with or without a file handle. You can identify the no file handle case by checking whether fh == ^uint64(0).
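A minimal sketch of handling both cases, assuming a file system that tracks files by path and by open handle. The `fs` and `node` types and the `lookup` method are hypothetical and self-contained; they do not use the cgofuse API. Only the `^uint64(0)` sentinel comes from the comment above.

```go
package main

import "fmt"

// noFileHandle is the sentinel described above: all 64 bits set.
const noFileHandle = ^uint64(0)

// node is a hypothetical in-memory file record.
type node struct {
	size int64
}

// fs is a hypothetical file system that tracks nodes by path and open handle.
type fs struct {
	byPath   map[string]*node
	byHandle map[uint64]*node
}

// lookup resolves the node for a Getattr-style call. It uses the handle when
// one is supplied and known, and otherwise falls back to the path.
func (f *fs) lookup(path string, fh uint64) (*node, bool) {
	if fh != noFileHandle {
		if n, ok := f.byHandle[fh]; ok {
			return n, true
		}
	}
	n, ok := f.byPath[path]
	return n, ok
}

func main() {
	n := &node{size: 10240}
	f := &fs{
		byPath:   map[string]*node{"/a": n},
		byHandle: map[uint64]*node{3: n},
	}

	withHandle, _ := f.lookup("/a", 3)
	withoutHandle, _ := f.lookup("/a", noFileHandle)
	fmt.Println(withHandle.size, withoutHandle.size) // prints 10240 10240
}
```

Both maps point at the same node here, so a call without a handle still reports the size of the open file.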


3 participants