Incorrect size when doing preallocation #597

Open
EchterAgo opened this issue Oct 7, 2023 · 5 comments

Comments

@EchterAgo

I encountered an issue with OpenZFS on Windows that also seems to exist here. When doing a preallocation (SetFileInformationByHandle with FileAllocationInfo), the file always shows the preallocated size. On NTFS, ReFS and a Samba share, the file shows 0 bytes after preallocation, 0 bytes after the write, and the written size after the file is closed.

See openzfsonwindows/openzfs#281

There is also a Python test program for this issue; it just needs 2 files.

@maharmstone
Owner

maharmstone commented Oct 7, 2023

Python's not very useful here, as you have to dig deep to know what it's actually doing. It'd be better for there to be a C or C++ program using ntdll calls (not kernel32).

the file always shows the preallocated size

Do we know which size os.path.getsize returns? There are three on Windows: allocation size, end of file, and valid data length.

@EchterAgo
Author

Python seems to use GetFileInformationByHandle to get a BY_HANDLE_FILE_INFORMATION and uses its nFileSizeLow and nFileSizeHigh.
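For reference, the two fields are the 32-bit halves of a single 64-bit size. A minimal sketch of how they combine follows; the helper name `combine_file_size` is hypothetical, and this is an illustration, not Python's actual implementation.

```python
def combine_file_size(n_file_size_high: int, n_file_size_low: int) -> int:
    """Join the two 32-bit halves from BY_HANDLE_FILE_INFORMATION into one size."""
    # Mask each half to 32 bits in case a caller passes a signed value.
    high = n_file_size_high & 0xFFFFFFFF
    low = n_file_size_low & 0xFFFFFFFF
    return (high << 32) | low


print(combine_file_size(0, 512))  # prints 512
print(combine_file_size(1, 0))    # prints 4294967296
```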

A simple dir command will also show the incorrect size:

(.venv) H:\dev\openzfs>dir d:
 Volume in drive D has no label.
 Volume Serial Number is 609A-6821

 Directory of D:\

08/10/2023  06:57               512 testfile.bin  
               1 File(s)            512 bytes     
               0 Dir(s)  10,717,904,896 bytes free

It should show 117 here with the test program. Rclone is also a good tool for testing this, since it always preallocates files to a multiple of the volume's cluster size:

https://github.com/rclone/rclone/blob/bcb3289dad58efad617eacd2ee1103b51a6ee3cf/lib/file/preallocate_windows.go#L74

But when it writes less than that, it errors later, because it expects the size of the created file after close to be the size of the data actually written.
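To illustrate the rounding, here is a minimal sketch of rounding a requested size up to a multiple of the cluster size. The function name `round_up_to_cluster` and the cluster sizes in the examples are assumptions for illustration; this is not rclone's code.

```python
def round_up_to_cluster(size: int, cluster_size: int) -> int:
    """Round size up to the next multiple of cluster_size (both in bytes)."""
    if cluster_size <= 0:
        raise ValueError("cluster_size must be positive")
    if size <= 0:
        return 0
    # Integer arithmetic only: number of clusters needed, times the cluster size.
    return (1 + (size - 1) // cluster_size) * cluster_size


# With a hypothetical 512-byte cluster, 117 bytes of data is preallocated as 512.
print(round_up_to_cluster(117, 512))   # prints 512
print(round_up_to_cluster(117, 4096))  # prints 4096
```

A filesystem that reports the preallocated size as the file size would then show the rounded value rather than the 117 bytes actually written.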

@EchterAgo
Author

EchterAgo commented Oct 8, 2023

Is there a userspace API to get all 3 sizes? I'd like to get all 3 in my test and compare them to NTFS & ReFS.

Edit: seems like there is no userspace way to get the valid data length, see https://stackoverflow.com/questions/35572871/how-to-get-valid-data-length-of-a-file

@lundman

lundman commented Oct 10, 2023

As a side comment, I just wanted to mention the set_validdatalength() function:

https://github.com/maharmstone/btrfs/blob/master/src/fileinfo.c#L3754

It uses "<=" in
if (fvdli->ValidDataLength.QuadPart <= fcb->Header.ValidDataLength.QuadPart ...
and I could never get the call to succeed, even with your test.exe and ifstest.exe. It always fails.

Then I noticed:
https://github.com/microsoft/Windows-driver-samples/blob/main/filesys/fastfat/fileinfo.c#L4847
has a simple "<" instead. With that, the code will occasionally trigger the valid path with test.exe and ifstest.exe.
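The difference between the two operators only shows up when the requested valid data length equals the current one. The sketch below models just the quoted comparison, assuming it is part of a check that rejects the request; the rest of the condition is elided in the comment above and is not modeled. The function name `flagged_by_comparison` is hypothetical.

```python
def flagged_by_comparison(new_vdl: int, current_vdl: int, inclusive: bool) -> bool:
    """Return True if the comparison alone would flag the request.

    inclusive=True models "<=", inclusive=False models "<".
    """
    if inclusive:
        return new_vdl <= current_vdl
    return new_vdl < current_vdl


# Requesting the length the file already has: only "<=" flags it.
print(flagged_by_comparison(4096, 4096, inclusive=True))   # prints True
print(flagged_by_comparison(4096, 4096, inclusive=False))  # prints False
```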

@EchterAgo
Author

FYI, the test I added in openzfsonwindows/openzfs#284 also works on other filesystems; I tested it on NTFS, ReFS, FAT32, ZFS and SMB. It has no dependencies other than Python. You can call it with --path <path to btrfs dir> --no_pool so that it doesn't try to create a zpool and uses the path as a test directory instead.

3 participants