massive slowdown in buffered IO performance from fstat st_blksize incorrectly reported #3898

Closed
f948lan opened this issue Mar 6, 2019 · 9 comments

@f948lan

f948lan commented Mar 6, 2019

Your Windows build number:

  • 10.0.17763.316

What you're doing and what's happening:

  • Investigating the cause of image processing routines running much slower in WSL than in a VM on the same host.

    • Raw IO performance (e.g. a 4GB copy with dd) is much faster in WSL (1500MB/s vs 130MB/s in the VM) - (SSD backed)
    • Noticed from strace ~8x more read/write calls when running the same job in WSL vs the VM.
    • The full strace log reveals this is due to a 512-byte buffer being used instead of 4096.
    • As system calls are slower under WSL anyway, this has a big impact on some workloads.
    • The 512-byte buffer was the default assigned by glibc buffered IO (fopen, fread, fwrite, etc.).

    Investigating glibc, the buffer size is set by _IO_file_doallocate, specifically in this code:

#if defined _STATBUF_ST_BLKSIZE
      if (st.st_blksize > 0 && st.st_blksize < BUFSIZ)
	size = st.st_blksize;
#endif

where st.st_blksize is the result of an fstat system call, and is being set to 512.
The same can be seen much more easily by just running stat /
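
For reference, a minimal sketch (not from the original report; the path and compile command are illustrative) of checking the value glibc sees for a given file descriptor:

/* Sketch: print the st_blksize that glibc's buffered IO would pick up.
   Build with e.g.: gcc -o blksize blksize.c; the default path "/" is
   only an example. */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>

int main(int argc, char *argv[])
{
    const char *path = argc > 1 ? argv[1] : "/";
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    struct stat st;
    if (fstat(fd, &st) != 0) {
        perror("fstat");
        close(fd);
        return 1;
    }

    /* _IO_file_doallocate copies this into the stream buffer size when it
       is > 0 and < BUFSIZ, so 512 here means 512-byte stdio buffers. */
    printf("%s: st_blksize = %ld\n", path, (long) st.st_blksize);

    close(fd);
    return 0;
}

On an affected build this should print 512 for LxFs/DrvFs paths, matching what stat / shows.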

What's wrong / what should be happening instead:

The underlying filesystem is using 4096-byte blocks, as confirmed by PowerShell:

PS C:\>  Get-CimInstance -ClassName Win32_Volume | Select-Object Label, BlockSize | Format-Table -AutoSize
 
Label       BlockSize
-----       ---------
WINRE_DRV        4096
LRS_ESP          4096
Windows_OS       4096
                 4096
SYSTEM_DRV       4096

So fstat should report st_blksize as 4096.
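
The filesystem block size can also be queried from inside WSL for comparison; a hedged sketch using statvfs(3), where /mnt/c is just an example DrvFs mount point:

/* Sketch: query the filesystem block size from the Linux side for
   comparison with the PowerShell output.  /mnt/c is only an example. */
#include <stdio.h>
#include <sys/statvfs.h>

int main(void)
{
    struct statvfs vfs;
    if (statvfs("/mnt/c", &vfs) != 0) {
        perror("statvfs");
        return 1;
    }

    /* On an NTFS-backed DrvFs mount one would expect this to line up
       with the 4096 shown by Win32_Volume above. */
    printf("f_bsize = %lu, f_frsize = %lu\n",
           (unsigned long) vfs.f_bsize, (unsigned long) vfs.f_frsize);
    return 0;
}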

Effects
As noted above, glibc relies on this value to set the default buffer size for buffered IO, used by many applications. In some cases this can result in 8x more system calls for read and write operations, which carry a heavy penalty under WSL.
To prove the impact, I rebuilt glibc with the code block above commented out (which falls back to the default value BUFSIZ, 8192, for the buffer).
In one particular example, rasterising an image with ghostscript takes 7 minutes on default WSL, but drops to 3m30s with the patched glibc!
Given the ubiquitous use of libc buffered IO, it seems likely there are many things affected by this.
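
As an aside (not something the report itself proposes), an individual application can sidestep the small default by installing its own stdio buffer with setvbuf before the first read or write; a minimal sketch, with an arbitrary 64 KiB buffer:

/* Sketch: override the st_blksize-derived default with a larger buffer.
   setvbuf must be called after fopen and before any other operation on
   the stream; the file name and 64 KiB size are illustrative. */
#include <stdio.h>

int main(void)
{
    FILE *fp = fopen("output.dat", "wb");
    if (!fp) {
        perror("fopen");
        return 1;
    }

    static char buf[64 * 1024];
    if (setvbuf(fp, buf, _IOFBF, sizeof buf) != 0) {
        fprintf(stderr, "setvbuf failed\n");
        fclose(fp);
        return 1;
    }

    /* These fwrite calls are now coalesced into far fewer write(2)
       system calls than with a 512-byte default buffer. */
    for (int i = 0; i < 1024; i++)
        fwrite("0123456789abcdef", 1, 16, fp);

    fclose(fp);
    return 0;
}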

I've seen it widely noted that WSL currently suffers on IO performance, and slow system calls seem to be a key reason for that. Clearly this will be compounded where 8x more calls are generated than necessary.

@f948lan
Author

f948lan commented Mar 6, 2019

Just rebooted my Linux workstation to Windows (19H1 insider updates) and it does correctly report block size.
So either:
a) This was fixed in latest updates?
Or b) fstat is reporting something hardware related, rather than the fs blocksize?

I'd be interested what block size others see from stat /

@Gabrielcarvfer

Running 18845 (20H1), stat reports 512.

@benhillis
Member

@Gabrielcarvfer - You mention fstat is being called, but I don't see what the file descriptor is. LxFs? DrvFs? ProcFs?

@benhillis
Member

Looking at this a bit, it looks like we will fall back to 512 for LxFs and DrvFs if FileFsSectorSizeInformation is not supported by the device. @SvenGroot - Do you know why 512 was chosen instead of 4096?

@Gabrielcarvfer

Gabrielcarvfer commented Mar 6, 2019

@benhillis - @f948lan said that. I've asked you guys to take a look at it. :)

Checking both glibc 2.28 code and the strace, I can confirm that it is working properly on 18845, at least for normal files.

Writing to the terminal is more complicated, and the code pointed to by @f948lan is related to that. It was supposed to adjust the buffer size to the tty buffer size.

The fstat call is the following:
fstat(1, {st_dev=makedev(0, 3), st_ino=2251799813922347, st_mode=S_IFCHR|0660, st_nlink=1, st_uid=1000, st_gid=5, st_blksize=512, st_blocks=0, st_rdev=makedev(4, 1), ...}) = 0
ioctl(1, TCGETS, {c_iflags=0x6d02, c_oflags=0x5, c_cflags=0x4bf, c_lflags=0x8a3b, c_line=0, ...}) = 0

The weird part comes after that, but I'm pretty sure it is a glibc problem. The write calls are being split somewhere, respecting neither the st_blksize (512 bytes), the 4096-byte fs block size, nor the 8192-byte BUFSIZ constant.

Whatever size of string I try to write to the terminal (bigger than 512 bytes) gets split into a (size - 512)-byte write and another 512-byte write. As an example, for a 2048-byte string, I got the following:
write(1, "11"..., 1536) = 1536
write(1, "11"..., 512) = 512

Even worse, if I call a flush or append a std::endl, another 512-byte write is issued, which is pretty weird, as writes to std::cout were supposed to always be flushed and those could be ignored.

@f948lan
Author

f948lan commented Mar 6, 2019

@Gabrielcarvfer - You mention fstat is being called, but I don't see what the file descriptor is. LxFs? DrvFs? ProcFs?

Sorry, should have mentioned that.
On the 'problem' system 512 is reported on LxFs, DrvFs and tmpfs (which I think is basically one of the other two anyway at the moment?)

@f948lan
Author

f948lan commented Mar 6, 2019

Whatever size of string I try to write to the terminal (bigger than 512 bytes) gets split into a (size - 512)-byte write and another 512-byte write. As an example, for a 2048-byte string, I got the following:
write(1, "11"..., 1536) = 1536
write(1, "11"..., 512) = 512

Yes, I saw this too (but to a file). It seems to be a side effect of the libc implementation. I didn't explore it in detail, but it always happens if fwrite is called with more than bufsize bytes: it seems to fill the buffer, write that (512 bytes), then write the rest of the data as a second write call (except for the very first fwrite on a stream, which gets written as a single write).
But we can't blame WSL for that :)
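
A small sketch (not from the thread) that should make the splitting easy to observe under strace; the 2048-byte payload mirrors the example above:

/* Sketch: reproduce the split writes.  Run as:
     gcc repro.c && strace -e trace=write ./a.out
   With a 512-byte stdio buffer, a single 2048-byte fwrite shows up as
   more than one write(2) call instead of one.  The size is illustrative. */
#include <stdio.h>
#include <string.h>

int main(void)
{
    char data[2048];
    memset(data, '1', sizeof data);

    /* stdout is line-buffered on a tty and fully buffered when
       redirected, so the exact split can differ between the two. */
    fwrite(data, 1, sizeof data, stdout);
    fflush(stdout);
    return 0;
}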

@Gabrielcarvfer

Gabrielcarvfer commented Mar 7, 2019

Yes, I saw this too (but to a file). It seems to be a side effect of the libc implementation. I didn't explore it in detail, but it always happens if fwrite is called with more than bufsize bytes: it seems to fill the buffer, write that (512 bytes), then write the rest of the data as a second write call (except for the very first fwrite on a stream, which gets written as a single write).
But we can't blame WSL for that :)

Hold on. Look at the stack:
/lib/x86_64-linux-gnu/libc-2.28.so(__write+0x14) [0x109fd4]
/lib/x86_64-linux-gnu/libc-2.28.so(_IO_file_write+0x2d) [0x8b33d]
/lib/x86_64-linux-gnu/libc-2.28.so(_IO_do_write+0xb1) [0x8ca71]
/lib/x86_64-linux-gnu/libc-2.28.so(_IO_file_xsputn+0x125) [0x8ba15]
/lib/x86_64-linux-gnu/libc-2.28.so(fwrite+0xd8) [0x801d8]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.26(std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long)+0x153) [0x12c3b3]

_IO_file_xsputn points to _IO_new_file_xsputn in https://github.com/lattera/glibc/blob/master/libio/fileops.c.

Edit:
Just confirming: it is a glibc problem worsened by the small buffer size (check the code block starting with if (to_do + must_flush > 0)).

Contributor

This issue has been automatically closed since it has not had any activity for the past year. If you're still experiencing this issue please re-file this as a new issue or feature request.

Thank you!
