Drop `Seek` or input size knowledge requirements #7
Thanks for putting in all this work, @AlexTMjugador. I'm all for dropping the
Not that it's a problem but note that the changes are already breaking because you drop |
Thank you for the review! 😄 I will try to resolve your questions.
Yes, it would. The only thing preventing us from compressing blocks as they are read without any further ado is knowing whether that block is the last one in the stream or not, as we can't seek back to write the proper last block flag after the fact.
This is a very interesting idea! It requires handling that extra byte with care, so implementing it is not dead simple, but I've done so in the latest commit I've pushed. I think this technique provides a far more elegant solution to the problem, without any meaningful additional memory usage (technically, it uses an extra byte to store what I've called the "sentinel byte", but this is offset by the simpler implementation, with less code and fewer local variables).
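To make the sentinel byte idea concrete, here is a minimal sketch of how reading one byte past each block tells us whether that block is the last one, without seeking. It is not the code from this PR: `BLOCK_SIZE`, `read_full`, and `for_each_block` are hypothetical names chosen for illustration, and the tiny block size only keeps the example easy to follow.

```rust
use std::io::{self, Read};

/// Hypothetical block size for this sketch; a real compressor would use a much larger one.
const BLOCK_SIZE: usize = 4;

/// Reads from `input` until `buf` is full or EOF is reached; returns the bytes read.
fn read_full<R: Read>(input: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match input.read(&mut buf[filled..]) {
            Ok(0) => break, // EOF
            Ok(n) => filled += n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(filled)
}

/// Splits `input` into blocks of at most `BLOCK_SIZE` bytes and calls
/// `emit(block, is_last)` for each one, using only `Read` (no `Seek`).
fn for_each_block<R: Read, F: FnMut(&[u8], bool)>(mut input: R, mut emit: F) -> io::Result<()> {
    // One extra slot at the end holds the sentinel byte.
    let mut buf = vec![0u8; BLOCK_SIZE + 1];
    // 1 if a sentinel byte from the previous round sits at buf[0], else 0.
    let mut carried = 0;
    loop {
        let filled = carried + read_full(&mut input, &mut buf[carried..])?;
        if filled <= BLOCK_SIZE {
            // The sentinel slot stayed empty, so EOF was reached: this is the last block.
            emit(&buf[..filled], true);
            return Ok(());
        }
        // The sentinel slot was filled, so at least one more byte follows this block.
        emit(&buf[..BLOCK_SIZE], false);
        // The sentinel byte becomes the first byte of the next block.
        buf[0] = buf[BLOCK_SIZE];
        carried = 1;
    }
}
```

In a compressor, `emit` would be the place where a block gets compressed and written with its last-block flag already known. Note that an empty input still produces one (empty) final block in this sketch.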
Requiring input byte sources to implement the `Seek` trait is onerous for end-users, as most programs that generate DEFLATE streams do not impose seekability requirements. In the Unix world, it's common to pipe the output of a program as an input for a compression program, which is a non-seekable data source. In addition, achieving input seekability is often impractical in network scenarios due to buffering resource requirements and other reasons.

By being smarter about how we use a sliding window, we can drop seekability and exact input size knowledge requirements from the API exposed by this crate, making it readily applicable for even more usage scenarios, without affecting compression. A high-level overview of the new technique is given through code comments.

A downside I can see of this change is that it requires a sliding window `ZOPFLI_MASTER_BLOCK_SIZE` bytes (≈ 1 MiB) bigger than before, due to the need to temporarily store an additional uncompressed master block in memory. However, I think that the better API design makes this trade-off worth it: the additional memory usage is insignificant for the kind of computers that are most likely to run a compression algorithm as demanding as Zopfli anyway.

This is also a breaking change, as the `compress_seekable` function was removed from the public API. As a minor note, the sliding window for the `deflate` function is no longer allocated on the stack, which avoids running out of stack memory in practical scenarios I've encountered: allocating `ZOPFLI_MASTER_BLOCK_SIZE` bytes on the stack is a lot, and the cost of calling the memory allocator pales in comparison to actually compressing data with Zopfli.
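As a rough illustration of the stack versus heap point, a window buffer can be allocated on the heap in one line. This is only a sketch: `MASTER_BLOCK_SIZE` is an assumed stand-in value for the roughly 1 MiB constant mentioned above, and `allocate_window` is a hypothetical helper, not a function from this crate.

```rust
/// Assumed stand-in for the crate's master block size (the text above only says about 1 MiB).
const MASTER_BLOCK_SIZE: usize = 1_000_000;

/// Allocates a zero-filled window of `len` bytes on the heap.
/// A local `[0u8; N]` array of this size would live in the stack frame instead,
/// which can overflow threads that run with a small stack.
fn allocate_window(len: usize) -> Box<[u8]> {
    vec![0u8; len].into_boxed_slice()
}
```

The boxed slice costs a single allocator call per compression run, which is negligible next to the compression work itself.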
The idea for the new compression algorithm was suggested by @mqudsi. I've replaced the `several_master_blocks.bin` test file with a more realistic one extracted from the well-known Calgary dataset, as all-zeros data may not be enough to expose some implementation problems.
Other test files separate the words in their names with dashes instead of underscores. Let's use dashes in the new test files too.
b8c756a to 47d2ae7
I've successfully tested the changes both manually and with |
Thanks for your work on this!