Avoid duplicate cluster writing #264
Currently, clusters, which represent 99% of the whole size of a ZIM file, are written twice to the file system:
If we could do everything in one pass, this would:
We might be able to do so by writing ZIM files on the fly. That would mean keeping the header at the beginning of the file, but writing the dirents and indexes at the end. We should be able to do so without modifying the ZIM spec (probably by pre-allocating enough file system space for the variable part of the header: mimetypes, etc.).
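To make the idea concrete, here is a minimal sketch of a single-pass writer. It assumes a toy layout, not the real ZIM header: the `TOY1` magic, the 64-byte reserved region, and the function names are all hypothetical. It reserves space for the header, streams each cluster to the file exactly once, appends an offset index at the end, and then seeks back to fill in the header.

```python
import io
import struct

# Hypothetical toy layout, NOT the real ZIM header:
# 4-byte magic, cluster count (uint32), offset of the index (uint64).
HEADER = struct.Struct("<4sIQ")
RESERVED = 64  # assumed size reserved up front for the header and its variable part


def write_single_pass(out, clusters):
    """Write each cluster once, append the offset index, then patch the header."""
    out.write(b"\0" * RESERVED)       # placeholder, filled in at the end
    offsets = []
    for data in clusters:             # each cluster hits the file exactly once
        offsets.append(out.tell())
        out.write(data)
    index_pos = out.tell()
    for off in offsets:               # the index is written after the clusters
        out.write(struct.pack("<Q", off))
    out.seek(0)                       # go back and write the real header
    out.write(HEADER.pack(b"TOY1", len(offsets), index_pos))
    out.seek(0, io.SEEK_END)
    return offsets, index_pos


def read_cluster_offsets(inp):
    """Read the header, then the offset index it points to."""
    inp.seek(0)
    _magic, count, index_pos = HEADER.unpack(inp.read(HEADER.size))
    inp.seek(index_pos)
    return [struct.unpack("<Q", inp.read(8))[0] for _ in range(count)]
```

The only part that is rewritten is the small reserved region at the start; the cluster data itself is never copied. The open question from the proposal remains how large that region must be when the mimetype list is not known in advance.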
Yes, this PR is still open; I closed it by mistake.
https://manpages.ubuntu.com/manpages/disco/en/man2/fallocate.2.html could help here.
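As a rough illustration of the pre-allocation step, the sketch below reserves space at the start of a file. It uses Python's `os.posix_fallocate`, which wraps `posix_fallocate(3)` rather than the Linux-specific `fallocate(2)` linked above. The `reserve` helper and the 4096-byte size are assumptions made for the example. On platforms or file systems where the call is unavailable or fails, it falls back to `os.ftruncate`, which sets the size but may leave the region sparse.

```python
import os
import tempfile


def reserve(fd, size):
    """Reserve `size` bytes from offset 0; fall back to truncate if needed."""
    try:
        os.posix_fallocate(fd, 0, size)   # Unix only; asks the fs to allocate blocks
    except (AttributeError, OSError):
        os.ftruncate(fd, size)            # fallback: sets the size, may be sparse


with tempfile.TemporaryFile() as f:
    reserve(f.fileno(), 4096)             # hypothetical size for the header region
    size_after = os.fstat(f.fileno()).st_size
```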