Comment on implementation details #85

Closed
wisp3rwind opened this issue Jul 10, 2021 · 1 comment

Comments

@wisp3rwind

Hi @rui314,

I just stumbled upon this project via a Reddit comment and skimmed through the README. The following excerpt from the Details section

> At least on Linux, it looks like the filesystem's performance to allocate new blocks to a new file is the limiting factor when creating a new large file and filling its contents using mmap. If you already have a large file in the buffer cache, writing to it is much faster than creating a new fresh file and writing to it. [...]

reminded me of something I had read before at https://stackoverflow.com/a/34579110/3451198:

> Secondly, never ever extend an output file during a write, you force an extent allocation and possibly a metadata flush. Instead fallocate the file's maximum extent to some suitably large value, and keep an internal atomic counter of the end of file. That should reduce the problem to just extent allocation which for ext4 is batched and lazy - more importantly you won't be forcing a metadata flush.

Assuming that actually applies to this project and you haven't considered it already, calling fallocate(2) in large chunks might be an alternative, given that common filesystems (ext4, btrfs) support it.
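To make the suggestion concrete, here is a minimal sketch of the "preallocate in large chunks and track the end of file yourself" idea from the quoted answer. It is not taken from this project. The class name `PreallocatedFile`, the 64 MiB chunk size, and the `demo` helper are assumptions made for illustration, and a lock stands in for the atomic counter because Python has no built-in atomic integer. Note that `os.posix_fallocate` extends the visible file size, so the unused tail is trimmed on close.

```python
import os
import tempfile
import threading

CHUNK = 64 * 1024 * 1024  # hypothetical preallocation granularity (64 MiB)


class PreallocatedFile:
    """Append-only writer that reserves disk space in large chunks."""

    def __init__(self, path, chunk=CHUNK):
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o644)
        self.chunk = chunk
        self.reserved = 0  # bytes reserved on disk so far
        self.end = 0       # logical end of file, tracked in user space
        self.lock = threading.Lock()

    def append(self, data):
        # Claim a byte range under the lock and grow the reservation
        # only when the claimed range runs past what is already reserved.
        with self.lock:
            offset = self.end
            self.end += len(data)
            if self.end > self.reserved:
                # Round the new reservation up to a whole number of chunks.
                new_reserved = -(-self.end // self.chunk) * self.chunk
                os.posix_fallocate(
                    self.fd, self.reserved, new_reserved - self.reserved
                )
                self.reserved = new_reserved
        # pwrite takes an explicit offset, so it can run outside the lock.
        view = memoryview(data)
        written = 0
        while written < len(view):
            written += os.pwrite(self.fd, view[written:], offset + written)
        return offset

    def close(self):
        # Trim the unused reserved tail so the file has its logical size.
        os.ftruncate(self.fd, self.end)
        os.close(self.fd)


def demo(chunk=1 << 20):
    """Write two records and return the final file size in bytes."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "out.bin")
        out = PreallocatedFile(path, chunk=chunk)
        out.append(b"hello")
        out.append(b"world")
        out.close()
        return os.path.getsize(path)
```

With this layout, individual writes never extend the file; only the occasional chunk reservation does. Whether that helps for a given workload would still need to be measured.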

@rui314
Owner

rui314 commented Jul 10, 2021

I actually tried using fallocate(2) and didn't observe a notable difference. My filesystem is ext4. I don't know why, but it looks like what matters is whether the file contents are in the buffer cache or not.
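For anyone who wants to check this on their own machine, the comparison could be sketched roughly as follows. This is not the benchmark used here, just an illustrative harness: `fill_with_mmap` and `compare` are hypothetical names, the 8 MiB default size is arbitrary, and the timings cover only the page-cache writes, not writeback to disk.

```python
import mmap
import os
import tempfile
import time


def fill_with_mmap(path, payload, fresh):
    """Fill `path` with `payload` through a shared mapping; return seconds."""
    if fresh and os.path.exists(path):
        # Force a brand-new file so its blocks must be allocated again.
        os.unlink(path)
    start = time.perf_counter()
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Size the file first; a mapping cannot extend past end of file.
        os.ftruncate(fd, len(payload))
        with mmap.mmap(fd, len(payload)) as mapping:
            mapping[:] = payload
    finally:
        os.close(fd)
    return time.perf_counter() - start


def compare(size=8 * 1024 * 1024):
    """Time a fresh-file fill against overwriting the same, existing file."""
    payload = os.urandom(size)
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "output.bin")
        fresh_seconds = fill_with_mmap(path, payload, fresh=True)
        # The file now exists and was just written, so it is likely cached.
        reuse_seconds = fill_with_mmap(path, payload, fresh=False)
    return fresh_seconds, reuse_seconds
```

Calling `compare()` returns the two durations as a tuple. Results will vary with the filesystem, the file size, and how much memory is free for the page cache.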
