
Drop Seek or input size knowledge requirements #7

Merged
AlexTMjugador merged 3 commits into main from refactor/drop-seek-requirement on Aug 3, 2022

Conversation

AlexTMjugador
Member

Requiring input byte sources to implement the Seek trait is onerous for end users, as most programs that generate DEFLATE streams do not impose seekability requirements. In the Unix world, it's common to pipe the output of one program as the input of a compression program, which makes for a non-seekable data source. In addition, achieving input seekability is often impractical in network scenarios due to buffering resource requirements and other reasons.

By being smarter about how we use a sliding window, we can drop the seekability and exact input size knowledge requirements from the API exposed by this crate, making it readily applicable to even more usage scenarios without affecting compression. A high-level overview of the new technique is given through code comments.

A downside I can see of this change is that it requires a sliding window ZOPFLI_MASTER_BLOCK_SIZE bytes (≈ 1 MiB) bigger than before due to the need to temporarily store an additional uncompressed master block in memory. However, I think that the better API design makes this trade-off worth it: the additional memory usage is insignificant for the kind of computers that are most likely to run a compression algorithm as demanding as Zopfli anyway.

This is also a breaking change, as the compress_seekable function was removed from the public API. As a minor note, the sliding window for the deflate function is no longer allocated on the stack, which avoids running out of stack memory in practical scenarios I've encountered: allocating ZOPFLI_MASTER_BLOCK_SIZE bytes on the stack is a lot, and the cost of calling the memory allocator pales in comparison to actually compressing data with Zopfli.
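For illustration, here is a minimal sketch of the stack-versus-heap point; the constant values come from upstream Zopfli's defaults, and the exact buffer layout is an assumption for this example rather than the crate's actual code:

```rust
// Constants as defined by upstream Zopfli, repeated here only for the sketch.
const ZOPFLI_MASTER_BLOCK_SIZE: usize = 1_000_000;
const ZOPFLI_WINDOW_SIZE: usize = 32_768;

// Before: a stack array of roughly 1 MiB, which can overflow the stack.
// let mut window = [0u8; ZOPFLI_WINDOW_SIZE + ZOPFLI_MASTER_BLOCK_SIZE];

// After: a heap allocation. The single allocator call is negligible next to
// the cost of actually compressing the data with Zopfli.
fn allocate_window() -> Vec<u8> {
    vec![0u8; ZOPFLI_WINDOW_SIZE + ZOPFLI_MASTER_BLOCK_SIZE]
}
```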

@AlexTMjugador AlexTMjugador added the enhancement New feature or request label Jul 24, 2022
@AlexTMjugador AlexTMjugador added this to the v0.7.0 milestone Jul 24, 2022
@mqudsi
Collaborator

mqudsi commented Jul 25, 2022

Thanks for putting in all this work, @AlexTMjugador. I'm all for dropping the Seek or file size oracle requirement here, but I do have a couple of questions about the approach you took, if you don't mind.

  • If we magically knew whether or not a block was the last one at the time of reading it, that would make the code a bit simpler, no? We can easily implement a fake "peek" by reading one byte more than we want the very first time; then, so long as we read a full block each time thereafter, we know it's not the end of the input stream (because we have at least one byte more than what we're going to handle).
  • If we do keep your current approach, I think we can at least avoid the 2x memory read/write penalty by using a ring buffer instead of moving the previous buffer to the window location, and the data we just read to the previous-block location, on each iteration. If you don't want a ring buffer because of all the index juggling, two separate ZOPFLI_WINDOW_SIZE + ZOPFLI_MASTER_BLOCK_SIZE buffers (previous and current) that you swap each time increase the memory requirement by ZOPFLI_WINDOW_SIZE but avoid moving the freshly read bytes to the previous-block index every time; see the sketch after this list. (Zopfli is already insanely slow; anything we can do to avoid adding to that would be nice.)
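To make the buffer-swapping variant concrete, here is a rough, hypothetical Rust sketch; the buffer layout, constants, and function name are assumptions for illustration, and short reads plus last-block detection are elided:

```rust
use std::io::Read;
use std::mem;

const ZOPFLI_MASTER_BLOCK_SIZE: usize = 1_000_000;
const ZOPFLI_WINDOW_SIZE: usize = 32_768;

/// Two (window + master block)-sized buffers that are swapped on every
/// iteration: the freshly read data never has to be copied back to the
/// "previous block" position; only a window-sized prefix is copied.
fn process_master_blocks<R: Read>(mut input: R) -> std::io::Result<()> {
    let mut previous = vec![0u8; ZOPFLI_WINDOW_SIZE + ZOPFLI_MASTER_BLOCK_SIZE];
    let mut current = vec![0u8; ZOPFLI_WINDOW_SIZE + ZOPFLI_MASTER_BLOCK_SIZE];

    loop {
        // Read the next master block after the window-sized prefix.
        // (Short reads and last-block handling are elided in this sketch.)
        let read = input.read(&mut current[ZOPFLI_WINDOW_SIZE..])?;
        if read == 0 {
            break;
        }

        // Seed the window prefix of `current` from the tail of `previous`,
        // so back-references can still reach into the previous master block.
        let window_start = previous.len() - ZOPFLI_WINDOW_SIZE;
        current[..ZOPFLI_WINDOW_SIZE].copy_from_slice(&previous[window_start..]);

        // ... compress current[..ZOPFLI_WINDOW_SIZE + read] here ...

        // Swap the buffers instead of moving ~1 MiB of data around.
        mem::swap(&mut previous, &mut current);
    }

    Ok(())
}
```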

Not that it's a problem, but note that the changes are already breaking because you drop insize from the public compress() function; dropping compress_seekable is actually a non-issue, since we could just keep it until a breaking release (although there would be no benefit to using it anymore).

@AlexTMjugador
Member Author

AlexTMjugador commented Jul 25, 2022

Thank you for the review! 😄 I will try to address your questions.

If we magically knew whether or not a block was the last one at the time of reading it, that would make the code a bit simpler, no?

Yes, it would. The only thing preventing us from compressing blocks as they are read without any further ado is knowing whether that block is the last one in the stream or not, as we can't seek back to write the proper last block flag after the fact.

Because we can easily implement a fake "peek" by reading one byte more than we want the very first time, then so long as we read in a full block each time thereafter we can know that it's not the end of the input stream (because we have at least one byte more than what we're going to handle).

This is a very interesting idea! It requires handling that extra byte with care, so implementing it is not dead simple, but I've done so in the latest commit I've pushed. I think this technique provides a far more elegant solution to the problem, without any additional memory usage (technically, it uses an extra byte to store what I've called the "sentinel byte", but this is offset by the simpler implementation, with less code and fewer local variables).
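For reference, a simplified sketch of how such a sentinel-byte peek can look; the helper name, signature, and constant are made up for this example and the actual code in the commit may differ:

```rust
use std::io::{self, Read};

// Master block size as in upstream Zopfli, repeated here only for the sketch.
const MASTER_BLOCK_SIZE: usize = 1_000_000;

/// Fills `block` with up to a master block of data, starting with the
/// sentinel byte left over from the previous call (if any), and then tries to
/// read one extra byte: success means the block just read is not the last one.
fn read_block<R: Read>(
    input: &mut R,
    pending_sentinel: Option<u8>,
    block: &mut Vec<u8>,
) -> io::Result<Option<u8>> {
    block.clear();
    if let Some(byte) = pending_sentinel {
        block.push(byte); // The byte peeked while handling the previous block.
    }

    // A single `read` may return less than requested, so keep reading until
    // the block is full or the stream ends.
    let mut chunk = [0u8; 8192];
    while block.len() < MASTER_BLOCK_SIZE {
        let wanted = (MASTER_BLOCK_SIZE - block.len()).min(chunk.len());
        let read = input.read(&mut chunk[..wanted])?;
        if read == 0 {
            return Ok(None); // EOF before filling the block: it is the last one.
        }
        block.extend_from_slice(&chunk[..read]);
    }

    // The block is full; peek one byte further to learn whether data follows.
    let mut sentinel = [0u8; 1];
    Ok(match input.read(&mut sentinel)? {
        0 => None,              // Nothing after a full block: still the last one.
        _ => Some(sentinel[0]), // More data: keep the sentinel for the next call.
    })
}
```

The Option in the return value doubles as the last-block flag: None tells the caller to set the final-block bit when writing the DEFLATE block, while Some carries the peeked byte into the next call.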

The idea for the new compression algorithm was suggested by @mqudsi.

I've replaced the several_master_blocks.bin test file with a more realistic one extracted from the well-known Calgary corpus, as all-zeros data may not be enough to expose some implementation problems. The other test files separate words in their names with dashes instead of underscores, so the new test files follow that convention too.
@AlexTMjugador AlexTMjugador force-pushed the refactor/drop-seek-requirement branch from b8c756a to 47d2ae7 on July 25, 2022 13:17
@AlexTMjugador
Member Author

I've successfully tested the changes both manually and with cargo test, and I can't see anything else that could be improved right now, so I'm merging this. If anyone has something to discuss about this PR, I'd be glad to hear it 😄

@AlexTMjugador AlexTMjugador merged commit 318bf34 into main Aug 3, 2022
@AlexTMjugador AlexTMjugador deleted the refactor/drop-seek-requirement branch August 3, 2022 10:47
@mqudsi
Collaborator

mqudsi commented Aug 3, 2022

Thanks for your work on this!

@AlexTMjugador AlexTMjugador mentioned this pull request Aug 14, 2022