Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-41092: Optionally request actual filesize via 'os.path.getsize' #21088

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

stephenfin
Copy link

@stephenfin stephenfin commented Jun 23, 2020

Provide the ability to get the actual, on-disk size of a file using os.path.getsize. This is helpful when dealing with things like sparse files.

https://bugs.python.org/issue41092

Provide the ability to get the actual, on-disk size of a file using
'os.path.getsize'. This is helpful when dealing with things like sparse
files.

Signed-off-by: Stephen Finucane <stephen@that.guru>
@the-knights-who-say-ni
Copy link

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept this contribution by verifying everyone involved has signed the PSF contributor agreement (CLA).

CLA Missing

Our records indicate the following people have not signed the CLA:

@stephenfin

For legal reasons we need all the people listed to sign the CLA before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

If you have recently signed the CLA, please wait at least one business day
before our records are updated.

You can check yourself to see if the CLA has been received.

Thanks again for the contribution, we look forward to reviewing it!


.. versionchanged:: 3.6
Accepts a :term:`path-like object`.

.. versionchanged:: 3.10
Add the optional *apparent* parameter.
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not wedded to this parameter name. Open to better suggestions

if apparent:
return os.stat(filename).st_size
else:
return os.stat(filename).st_blocks * 512
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this magic constant coming from? Are you assuming that all file systems and operating systems have 512 bytes block size? POSIX standard does not specify the block size, see https://pubs.opengroup.org/onlinepubs/009695399/basedefs/sys/stat.h.html

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm taking it from fstat(2):

The st_blocks field indicates the number of blocks allocated to the file, 512-byte units. (This may be smaller than st_size/512 when the file has holes.)

Our own docs state much the same thing.

st_blocks
Number of 512-byte blocks allocated for file. This may be smaller than st_size/512 when the file has holes.

Note that we should not use st_blksize because it's the preferred blocksize and presumably not relevant here. From fstat(2):

The st_blksize field gives the "preferred" blocksize for efficient file system I/O. (Writing to a file in smaller chunks may cause an inefficient read-modify-rewrite.)

Would a (private) constant be beneficial? Or simply a comment, perhaps?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The man page https://linux.die.net/man/2/fstat reflects the situation on Linux. Python supports many more platforms than just Linux or BSD. I guess our own documentation is also based on the Linux or BSD documentation, too.

The Open Group docs cover the generic POSIX standard. It mentions that some platforms define DEV_BSIZE in <sys/param.h>. I also found some code that defines DEV_BSIZE from BSIZE. I recommend to expose the constant from posixmodule.c and only support the feature when the system has a usable DEV_BSIZE or BSIZE.

@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@csabella
Copy link
Contributor

@stephenfin, please make the changes requested. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants