stat: block size differences #41

Closed
d4rken opened this issue Aug 11, 2016 · 4 comments

@d4rken

d4rken commented Aug 11, 2016

Toybox 0.7.1, Busybox 1.24.2
On a Nexus5@6.0.

Why is there a difference in blocksize?

root@hammerhead:/sdcard # busybox stat -c %B:%b:%o:%s twrp-3.0.0-0-hammerhead.img
512:28632:4096:14657536
root@hammerhead:/sdcard # toybox stat -c %B:%b:%o:%s twrp-3.0.0-0-hammerhead.img
4096:28632:4096:14657536

%B Bytes per block

busybox says 512Byte while toybox says 4096Byte.

28632 Blocks * 512Byte = 14659584 Byte
Which is a lot closer to the actual file size of 14657536 Byte (reported by both toybox&busybox).
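To make the comparison concrete, here is a minimal sketch of the arithmetic for both candidate block sizes. The helper names `bytes_for_blocks` and `slack_bytes` are made up for illustration and are not part of toybox or busybox.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes implied by a block count under an assumed unit size.
 * Hypothetical helper for illustration only. */
static uint64_t bytes_for_blocks(uint64_t blocks, uint64_t unit)
{
    assert(unit > 0);
    return blocks * unit;
}

/* How far the implied allocation overshoots the logical file size.
 * Negative if the allocation is smaller than the size (e.g. sparse files). */
static int64_t slack_bytes(uint64_t blocks, uint64_t unit, uint64_t size)
{
    return (int64_t)bytes_for_blocks(blocks, unit) - (int64_t)size;
}
```

With the numbers above, `bytes_for_blocks(28632, 512)` gives 14659584, only 2048 bytes more than the file size of 14657536, while `bytes_for_blocks(28632, 4096)` gives 117276672, roughly eight times the file size.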

Other commands from both binaries also show a block size of 4096 Byte though.

root@hammerhead:/sdcard # busybox stat twrp-3.0.0-0-hammerhead.img                                       <
  File: twrp-3.0.0-0-hammerhead.img
  Size: 14657536        Blocks: 28632      IO Block: 4096   regular file
Device: 11h/17d Inode: 218474      Links: 1
Access: (0660/-rw-rw----)  Uid: (    0/ UNKNOWN)   Gid: ( 1015/ UNKNOWN)
Access: 2016-03-18 14:16:00.000000000
Modify: 2016-03-18 14:16:03.000000000
Change: 2016-03-18 14:16:03.000000000
root@hammerhead:/sdcard # busybox stat -f .
  File: "."
    ID: 0        Namelen: 255     Type: UNKNOWN
Block size: 4096
Blocks: Total: 7015287    Free: 189096     Available: 189096
Inodes: Total: 1785856    Free: 1747803
root@hammerhead:/sdcard # toybox stat twrp-3.0.0-0-hammerhead.img
  File: `twrp-3.0.0-0-hammerhead.img'
  Size: 14657536         Blocks: 28632   IO Blocks: 4096        regular file
Device: 11h/17d  Inode: 218474   Links: 1
Access: (660/-rw-rw----Segmentation fault
root@hammerhead:/sdcard # toybox stat -f .                                                 <
  File: "."
    ID: 0000000000000000 Namelen: 255    Type: 65735546
Block Size: 4096    Fundamental block size: 4096
Blocks: Total: 7015287  Free: 189096    Available: 189096
Inodes: Total: 1785856  Free: 1747803
@d4rken
Author

d4rken commented Aug 11, 2016

This stackoverflow post indicates that the stat %B parameter for block size should be ignored as it may not necessarily reflect the actual file system block size value but usually some unrelated OS value.

This would explain the values of 4096, which lead to huge discrepancies between allocated and actual file size (it's not a sparse file). What would be interesting, though, is why busybox still gets the correct 512 block size value.

Is it planned to support something like --block-size for commands like stat,ls and du?

@d4rken
Author

d4rken commented Aug 11, 2016

After more research, I think I got it now, see here.
%b comes from st_blocks, which is always in units of 512 bytes.

So the allocated file size on the filesystem is always calculated as block-count * 512 byte, with the result being in increments of the actual file system block size, which is the %o format i.e. I/O Block size.

So why does busybox print the "correct" value of 512 for "Bytes per Block" which should usually just be ignored? Well it's hardcoded.

Toybox returns the same value of st_blksize for both IO Block size and Bytes per block.
Busybox returns a hardcoded 512 for Bytes per block and st_blksize for IO Block size.

@landley
What would you say to also hardcoding 512 for Bytes per block in toybox?
Returning the IO Block size for %B seems wrong.
Telling us the "Bytes per block" for the "Blocks allocated" count seems like the responsibility of stat and its %B format, and would actually fit the parameter's description.
Or Bytes per block could be cut from the code, since it serves no useful purpose... but that would break "spec" and could lead to compatibility issues. So hardcoding 512 seems like a good solution.

@landley
Owner

landley commented Aug 11, 2016

On 08/11/2016 03:49 AM, Matthias Urhahn wrote:

> Toybox 0.7.1, Busybox 1.24.2
> On a Nexus5@6.0.
>
> Why is there a difference in blocksize?
>
> root@hammerhead:/sdcard # busybox stat -c %B:%b:%o:%s twrp-3.0.0-0-hammerhead.img
> 512:28632:4096:14657536
> root@hammerhead:/sdcard # toybox stat -c %B:%b:%o:%s twrp-3.0.0-0-hammerhead.img
> 4096:28632:4096:14657536
>
> %B Bytes per block

In toybox, both %o and %B come from the stat() system call.

else if (type == 'B') out('u', stat->st_blksize);

} else if (type == 'o') out('u', stat->st_blksize);

I.E. we're telling you what the operating system told us (presumably
getting it from the filesystem driver).

A quick check shows busybox is printing a hardwired value for %B but the
stat value for %o:

    } else if (m == 'B') {
            strcat(pformat, "lu");
            printf(pformat, (unsigned long) 512); //ST_NBLOCKSIZE

    } else if (m == 'o') {
            strcat(pformat, "lu");
            printf(pformat, (unsigned long) statbuf->st_blksize);

I don't understand the point of printing a hardwired value for %B? FAT
block sizes vary from 512 bytes up to 65535k. ext2 can be 1k or 4k.

A few years ago hard drives went from 512 byte physical block to 4k,
which caused some problems because of longstanding assumptions:

https://lwn.net/Articles/322777/
https://lwn.net/Articles/377895/

And now the sectors are getting bigger:

https://lwn.net/Articles/582862/

Here are some articles about the damage conflicting block size assumptions
can do (data loss when an interrupted write changes stuff you didn't
think was being rewritten) and ways around it:

https://lwn.net/Articles/349970/
https://lwn.net/Articles/665299/
https://lwn.net/Articles/353411/

You shouldn't have to care about 90% of that (the OS handles it all for
you), but from my perspective having two ways to query block size would
be really nice if the OS gave me a way to determine the block size of
the physical media, as distinct from the block size of the filesystem.
Unfortunately, although it seems like %B would be filesystem block
size and %o would be physical media block size, Linux doesn't give me a
way to query physical media block size that I've noticed. (I might be
able to beat it out of the mtd layer for some types of flash?)

> busybox says 512Byte while toybox says 4096Byte.

Busybox is returning a single hardwired answer.

That said, it looks like the Ubuntu version is also doing that. If %B
should always say "512" regardless of context, I can make it do that and
change the help text to say something like:

%B prints "512"

> 28632 Blocks * 512Byte = 14659584 Byte
> Which is a lot closer to the actual file size of 14657536 Byte (reported
> by both toybox&busybox).

More stat fields: %b is st_blocks and %s is st_size.

The stat(2) man page says that st_blocks is "number of 512B blocks
allocated" so %B is units for %b (and yes, it's a hardwired value). So
the help text should be something like:

%B units for %b (always 512)

> Other commands from both binaries also show a block size of 4096 Byte
> though.

The "man 1 stat" page says:

   %B     the size in bytes of each block reported by %b

The "man 2 stat" page says:

   blkcnt_t  st_blocks;  /* number of 512B blocks allocated */

So yes, you've found an inconsistency and I should fix it. %B should
output a hardwired "512", busybox is correct here.
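A minimal sketch of what such a format dispatch could look like, with %B hardwired to 512 and %o still taken from `st_blksize`. This is an illustration only, not the actual toybox patch or busybox code; `format_stat_field` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: renders one stat format character into buf.
 * Returns the snprintf() result, or -1 for an unsupported character. */
static int format_stat_field(char type, const struct stat *st,
                             char *buf, size_t len)
{
    assert(st != NULL && buf != NULL);

    switch (type) {
    case 'B': /* units for %b: always 512, per stat(2) on Linux */
        return snprintf(buf, len, "%u", 512u);
    case 'b': /* number of 512-byte blocks allocated */
        return snprintf(buf, len, "%llu", (unsigned long long)st->st_blocks);
    case 'o': /* preferred I/O block size reported by the OS */
        return snprintf(buf, len, "%lu", (unsigned long)st->st_blksize);
    case 's': /* logical file size in bytes */
        return snprintf(buf, len, "%lld", (long long)st->st_size);
    default:
        return -1;
    }
}
```

With this shape, `%B * %b` gives the allocated size in bytes regardless of the filesystem's own block size, which is what the stat(1) description of %B implies.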

Thanks. Good catch,

Rob

@d4rken
Author

d4rken commented Aug 11, 2016

Fixed with 4460e9f 👍 Thanks!
