LARGE_WINDOW 32kb windows #9
Sync_flush does do what you expect and allows matches over block boundaries, as does no_flush in stateful compression. Changing the default to large_window may not have a big impact depending on the data set unless you also change the default hufftables at compile time or use isal_create_hufftables() at runtime. We are working on a number of improvements in compression that we will push soon that improve both the compression ratio and speed. This includes making the 32K window the default.
@gbtucker thanks. Changing the default Huffman tables to ones tailored to the data did increase the compression ratio for 32kB windows from 2.49x to 3.97x. But then, the same tables on 8kB windows gave 3.66x. This leads me to believe that larger windows are not being used properly in this code.
It depends a lot on your data of course. What data set are you using?
I created a demo script. It concatenates two random files of size ${BLOCK} several times and then tries to compress them using
As you see, the ISA-L in 32K window mode performs much worse than
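The window-size effect being tested here can be reproduced with plain zlib (this is an analogy for ISA-L's LARGE_WINDOW, not the ISA-L API itself; the block size of 16000 mirrors the case discussed below): a random block repeated at a distance larger than 8 kB can only be matched when the window is 32 kB.

```python
import os
import zlib

# A 16000-byte random block repeated twice: the repeat can be matched with a
# 32 kB window (wbits=15) but not with an 8 kB window (wbits=13), because the
# match distance (16000) exceeds 8192 bytes.
block = os.urandom(16000)
data = block * 2

def compressed_size(wbits):
    c = zlib.compressobj(6, zlib.DEFLATED, wbits)
    return len(c.compress(data) + c.flush())

small_window = compressed_size(13)  # 8 kB window: repeat is out of reach
large_window = compressed_size(15)  # 32 kB window: second copy is one long match
assert large_window < small_window
```

With the 8 kB window the output is roughly the size of the (incompressible) input; with the 32 kB window the second copy collapses into a handful of match tokens.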
I looked into this case and it's not that large_window or sync_flush isn't working. The dataset is highly compressible with lots of large matches. That the 16000 case compresses shows that large_window must be finding matches of distance > 8k even with sync_flush on. After an initial set of literals there are mostly matches of the max size, 258. At the end of an input segment, when sync_flush is on, the input must be drained, resulting in a partial match of less than 258. This results in a few more literals before max matches resume. Since the compression ratio is so high on this dataset, a few literals make a big difference. The other contributor is that when sync flush is on, a new header must be added at the start of each input block. This contributes over 100 bytes every 8k.
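Both overheads described above (the partial match at each drain point plus the extra block header per flush) are visible in plain zlib as well; a minimal sketch, with data and block size chosen for illustration:

```python
import zlib

# Highly compressible data, sync-flushed every 8 kB versus compressed in one
# shot. Each sync flush breaks a match at the segment boundary and appends an
# empty stored block plus a fresh deflate block header, so the flushed stream
# is measurably larger even though the content is identical.
data = b"abcdefgh" * 131072  # 1 MiB of a trivially compressible pattern

one_shot = len(zlib.compress(data, 1))

c = zlib.compressobj(1)
out = b""
for i in range(0, len(data), 8192):
    out += c.compress(data[i:i + 8192])
    out += c.flush(zlib.Z_SYNC_FLUSH)  # drain output at every 8 kB boundary
out += c.flush()
flushed = len(out)

assert zlib.decompress(out) == data
assert flushed > one_shot  # per-flush overhead accumulates
```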
What I see is that ISA-L does significantly (2 times) worse on my streaming data examples. I've prepared apples-to-apples comparison code: https://github.com/vlm/isa-l/commit/f77f252d3bb95659fbe7b97e82794e8eb418ba15 The scripts and the test file are in the above patch. The outcome:
Note that gzip-1 and zlib's deflate in a line-per-line mode yield about 4.1 .. 4.8x compression, whereas isa-l yields about 2.1x compression under the same conditions. Although ISA-L is very fast (thank you!), the 2.1x compression is not very useful, and it is indicative of some problem that will pop up for all customers trying to use the library in streaming contexts. The default
It's true I didn't expect a common usage to be sync_flush so often on such small buffers. Is sync_flush really necessary on every line? Zlib probably puts a type 1, constant static block, for each line. Without sync_flush on small buffers we do much better.
Would it be worth delaying the sync_flush in your usage?
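The guess above, that zlib emits a type 1 (static Huffman) block per line, can be checked with zlib's Z_FIXED strategy, which forces static blocks (the strategy only affects the Huffman coding choice, not matching; the sample lines below are my own invention):

```python
import zlib

# For tiny sync-flushed blocks, zlib's default strategy already picks the
# cheaper block type per flush, which for short lines is the static (type 1)
# block. Forcing Z_FIXED therefore changes almost nothing here.
Z_FIXED = getattr(zlib, "Z_FIXED", 4)  # Z_FIXED is 4 in zlib.h
lines = [b"GET /index.html HTTP/1.1 200 %d\n" % i for i in range(2000)]

def stream(strategy):
    c = zlib.compressobj(1, zlib.DEFLATED, 15, 8, strategy)
    out = b""
    for ln in lines:
        out += c.compress(ln) + c.flush(zlib.Z_SYNC_FLUSH)
    return out + c.flush()

default = stream(zlib.Z_DEFAULT_STRATEGY)
fixed = stream(Z_FIXED)
assert zlib.decompress(fixed) == b"".join(lines)
# Default picks the cheaper of static/dynamic per block, so it never loses:
assert len(default) <= len(fixed)
```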
The use case is streaming compression, such as WebSocket's RFC 7692. It requires a sync flush and mandates removing the terminating bytes. So, as much as we'd like to avoid spontaneous flushing, this flushing has semantic significance and can't be avoided for this class of applications.
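For context, the RFC 7692 permessage-deflate framing works like this: each message is compressed with a sync flush and the trailing 0x00 0x00 0xff 0xff of the flush marker is stripped before sending, with the receiver appending it back. A minimal sketch in plain zlib (raw deflate; the helper names are my own):

```python
import zlib

# RFC 7692 requires raw deflate; a sync flush always ends with the empty
# stored block 00 00 ff ff, which the sender removes and the receiver restores.
comp = zlib.compressobj(wbits=-15)    # raw deflate, 32 kB window
decomp = zlib.decompressobj(wbits=-15)

def send(msg: bytes) -> bytes:
    out = comp.compress(msg) + comp.flush(zlib.Z_SYNC_FLUSH)
    assert out.endswith(b"\x00\x00\xff\xff")
    return out[:-4]

def recv(frame: bytes) -> bytes:
    return decomp.decompress(frame + b"\x00\x00\xff\xff")

assert recv(send(b"hello ")) == b"hello "
assert recv(send(b"hello again")) == b"hello again"  # window persists across messages
```

Because the compressor and decompressor keep their state across messages, the history window spans message boundaries, which is exactly why the flush cannot simply be skipped.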
OK, perhaps there are some features we can add to help with the compression ratio in this case. |
Any news on that, or plans to address it somehow?
Yes, we have pulled in the feature to do static (btype 01) headers. This was the real issue with the compression ratio for very small inputs with sync flush on.
Before the latest changes (i.e., copied from #9 (comment)):
After the latest changes:
From what I see, no tangible difference has been made on this type of load (278k->213k ISA-L vs 142k zlib). ISA-L still gives 50% less efficient compression than Z_BEST_SPEED zlib. All the sources are in https://github.com/vlm/isa-l/commit/12aa33a1f6cce53a0f7e6ac3b5744056a94cb722, just
When you have such small blocks and are sync flushing each one, set static hufftables:
isal_deflate_set_hufftables(stream, NULL, IGZIP_HUFFTABLE_STATIC);
With
So, it is largely solved! Thank you very much!
However, there's another anomaly. I tried to build custom hufftables based on a 456k file. Here's the result with DEFAULT hufftables, STATIC hufftables, and then a pair of custom ones. I would expect the custom hufftables to be markedly more efficient (in STATIC or DEFAULT modes) than the default ones shipped with igzip. However, they aren't:
As you see, the custom hufftable in the DEFAULT mode works best, whereas in STATIC mode the file outputs are exactly the same (81271 bytes!), irrespective of whether the default (shipped) table is used or the generated one. With another sample file:
And another:
In the last examples I am not finding any advantage of using generated custom static tables over the default static ones. Is this expected?
That's great. I hope you see the point of isal_deflate_set_hufftables(): it gives more control to the programmer, especially when you know something about your data that you would rather the compressor not guess at. As such, setting IGZIP_HUFFTABLE_STATIC will force the use of the btype 01 hufftables and header as specified by the standard. Changing the default hufftables doesn't affect the format of the standard btype 01 static table, so having the same compression ratio is expected. The benefit is that the 01 header is 100 bytes shorter.
Then you can have the best of both worlds.
Thank you! |
Hello!
I am configuring the project with autogen.sh && ./configure CFLAGS=-DLARGE_WINDOW to obtain 32kB deflate windows instead of the default 8kB. I am setting .flush = SYNC_FLUSH in the compression context after calling isal_deflate_init(). The compression rate is virtually unchanged between 8k and 32k windows. The speed doesn't change at all. This prompts me to think that LARGE_WINDOW is not implemented properly, or that the flag is not visible all the way through to the assembly code. However, I modified options.asm manually to switch the LARGE_WINDOW option on, and recompiled, to no avail. Here's my table on the large text data file I am using:
It is important to consider that I am using a .flush = SYNC_FLUSH setting on blocks of about 5kB each at a time. Had it worked like Z_FULL_FLUSH (as in, discard the history window), it could lead to the observed behavior.
Would you please clarify whether this absence of a compression rate change is something I should expect after enabling LARGE_WINDOW.
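The Z_FULL_FLUSH hypothesis raised in the issue can be checked directly in plain zlib (a zlib analogy, not the ISA-L API; the 5000-byte block size matches the ~5kB blocks mentioned above): a sync flush keeps the history window, while a full flush discards it.

```python
import os
import zlib

# Two copies of a random 5000-byte block with a flush in between. If the
# flush discarded the window (Z_FULL_FLUSH), the second copy could never be
# matched; with Z_SYNC_FLUSH the window survives and the repeat compresses.
block = os.urandom(5000)

def two_copies(flush_mode):
    c = zlib.compressobj(6)
    out = c.compress(block) + c.flush(flush_mode)
    out += c.compress(block) + c.flush()
    return len(out)

sync = two_copies(zlib.Z_SYNC_FLUSH)
full = two_copies(zlib.Z_FULL_FLUSH)
assert sync < full  # history survives a sync flush, so the repeat is matched
```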