Switch default Zip compression level from 6 to 4 #1125
Conversation
Looks like ImfStandardAttributes.h is missing #include "ImfIntAttribute.h", which leads to the clang link error. |
This is a very exciting change! The prospect of doubling write speed with a few lines of code change, no user intervention, and barely any detectable change in compression ratio is crazy awesome. |
This does look promising. Could I suggest running these performance comparison tests on the standard OpenEXR test images as well as your larger test set, so there's a more self-contained, reproducible test? Also, a somewhat pedantic point that might need clarification in comments: the presence of a zipCompressionLevel attribute in a file won't guarantee the file was compressed with that level of compression, or even that it was written with ZIP compression. A ZIP file might be read in and written out again as PIZ preserving all metadata, so the zipCompressionLevel attribute is preserved but ignored. Equally, an older version of the library might read in a ZIP file and write it back out again: the old library won't interpret zipCompressionLevel and will use the old default value, so there will be an unexpected change in file size. |
Thanks for that graph: that's still a sizeable chunk of performance gain. Yes, good point about older versions of the library ignoring the attribute and falling back to the old default. |
Good point, haven't thought of that! Something like f4036be1c then perhaps? |
Do we really want the compression level as persistent attributes? As part of ImfHeader, sure, but the compression level doesn't seem right to have in the attribute list. I have identical feelings about both dwaCompressionLevel and zipCompressionLevel - both seem like something we should remove from the attribute list, and instead have ImfHeader carry an ephemeral list of options to control the compression (we can re-use the attribute type system for genericity). |
> Do we really want the compression level as persistent attributes? As part of ImfHeader, sure, but the compression level doesn't seem right to have in the attribute list.

For historical purposes, dwaCompressionLevel seemed like the least-bad option at the time, as there wasn't really a mechanism to signal details to the compressor otherwise. Having something non-persistent seems like a good idea. |
It does feel a bit wrong to me too, but here I was just following what DWA compression did. Some sort of "non-persistent options" thing would feel better indeed, but that's perhaps a larger-scope task/effort/decision. It still feels to me that changing the zlib level from the current 6 to 4 (even without any attributes to control it) would be worth it. Compression gets 2x faster, at a tiny loss of ratio. |
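The core tradeoff is easy to sanity-check outside OpenEXR with plain zlib. A minimal sketch on synthetic data (the payload and any resulting numbers are illustrative only, not the PR's actual benchmark):

```python
import time
import zlib


def bench(data: bytes, level: int) -> tuple[float, float]:
    """Compress `data` at the given zlib level; return (ratio, MB/s)."""
    t0 = time.perf_counter()
    comp = zlib.compress(data, level)
    dt = time.perf_counter() - t0
    return len(data) / len(comp), len(data) / dt / 1e6


# Somewhat-compressible synthetic payload (real EXR scanlines will differ).
payload = bytes((i * 7 + i // 256) % 64 for i in range(1 << 22))

for level in (6, 4):
    ratio, speed = bench(payload, level)
    print(f"level {level}: ratio {ratio:.3f}, {speed:.0f} MB/s")
```

How much each level gains or loses depends heavily on the data; the PR's measurements on real EXR files are the numbers that matter.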
We discussed this at length at the last TSC meeting (which you are very welcome to join; they are open Zoom calls). We're very excited about these optimizations and wish to implement them, and 100% agree that doubling speed with just a tiny change in compression ratio is definitely the right thing to do. We thought that just changing the level internally with no runtime control was sort of the lazy way out, and we preferred to make it externally controllable so that:
(a) we can more easily test across a range of images to verify that we're replicating your results before changing the default for everybody;
(b) others can try all of the compression levels without altering or recompiling the source, as different users and use cases may have different preferences about the right compression-vs-speed trade-off;
(c) we feared that there might be users who require bit-exact files, not just bit-exact pixels (for testing/validation replicability?), so an option to keep the old behavior may ease the transition for them, even if they agree that 4 should be the default moving forward. |
yes, yes, my comments are only about the mechanism, not that we should avoid the change :) I will try to prototype an api to add to ImfHeader and exr_context that we can comment on and once happy we can merge the efforts. Thanks again for bringing this up! |
@aras-p Do you recall whether your test results posted here and on your blog were using multithreading or not? I don't see any reference to the thread count. I'm also not 100% sure whether you were doing the full I/O, or some trick like writing to /dev/null to minimize I/O overhead in the timing.

I tried an analogous change in OpenImageIO, but for TIFF files, and got good results, though not as good. I tested with 100 images randomly selected from the Animal Logic Lab USD scene (almost all 4k^2, a mix of 1- and 3-channel images), converted from half to uint16 TIFF images. Going from zip level 6 to level 4 resulted in 3% bigger files and around 40% faster write times.

It's not apples-to-apples versus your tests: uint16 vs half, TIFF vs OpenEXR (different overheads), and the texture files I used may have different compression characteristics than the set of images you used. I was also benchmarking with a real image manipulation program, not merely a test harness, so there was probably other overhead mixed in. Oh, and you said your files were all 4-channel RGBA (and the thumbnails you showed looked like maybe A=1.0 everywhere), versus mine that were a mix of 1- and 3-channel images, so I don't know if that may also have been an important difference.

But another thing that occurred to me is threading -- I was using 32 threads when I tested, because I mostly care about performance on our typical production workstations. Writing files consists of both compression (fairly well parallelized by threads) and the actual I/O (serialized). So I was thinking that if your tests were single-threaded, or had a low thread count, the improvement in compression time would appear larger to you because it would represent a larger portion of your total runtime.

Switching the default and ~3% larger disk space still seems like a more than fair tradeoff for a 40% improvement in write speed, but I was curious to poke a bit more at why I was seeing less than the 2x you reported. |
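The threading argument above can be sketched with a toy Amdahl-style model (all numbers below are hypothetical; only the shape of the effect matters): compression time shrinks with thread count, the serialized I/O does not, so halving compression cost helps less at high thread counts.

```python
def write_time(compress_s: float, io_s: float, threads: int) -> float:
    """Toy model: compression parallelizes across threads, I/O stays serial."""
    return compress_s / threads + io_s

# Hypothetical workload: 10s of compression at level 6, 5s at level 4, 2s of I/O.
for threads in (1, 16, 32):
    speedup = write_time(10.0, 2.0, threads) / write_time(5.0, 2.0, threads)
    print(f"{threads:2d} threads: level-4 write is {speedup:.2f}x faster")
```

With these made-up numbers the single-threaded speedup is about 1.7x but only about 1.07x at 32 threads, which is consistent with seeing a smaller end-to-end gain on a heavily threaded setup.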
@aras-p After looking at your previous blog post and comparing the charts, I think I figured out that you are reporting results with 16 threads? |
It would be nice to have similar controls for dwaa/dwab lossless channels which use zip compression. |
It would help if I read the code before commenting. Ignore me, this is already done in the request. |
@lgritz yes, on a 2019 MacBookPro with 16 hardware threads. My guess for why my numbers are slightly different: fewer threads in my case, so CPU time is more important; and Macs tend to have really fast SSDs, so IO time is less important. |
In my test of 18 various exr files (total raw data size 1057MB):
- zip 6: 2.452x ratio, 206MB/s compression
- zip 4: 2.421x ratio, 437MB/s compression

So a tiny drop of compression ratio, but compression is more than 2x faster. This makes writing zip faster than writing uncompressed (386MB/s). Decompression speed unaffected.

Signed-off-by: Aras Pranckevicius <aras@unity3d.com>
Not 100% sure how we should document all this...
Thanks! |
Kinda related to #1002, but this one does not add any new libraries or compressors. After #1149 added controllable zip compression levels, this changes the default zip level from 6 to 4.
In my test of 18 various exr files (total raw data size 1057MB):
- zip 6: 2.452x ratio, 206MB/s compression
- zip 4: 2.421x ratio, 437MB/s compression

So a tiny drop of compression ratio, but compression is more than 2x faster. This makes writing zip faster than writing uncompressed (386MB/s). Decompression speed unaffected.
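For concreteness, the reported numbers (zip 6: 2.452x ratio at 206MB/s; zip 4: 2.421x ratio at 437MB/s, over 1057MB of raw data) translate into the following size/speed tradeoff:

```python
raw_mb = 1057.0                  # total raw data size from the test set
ratio6, ratio4 = 2.452, 2.421    # compression ratios at zlib levels 6 and 4
speed6, speed4 = 206.0, 437.0    # compression throughput in MB/s

size6 = raw_mb / ratio6          # compressed size at level 6
size4 = raw_mb / ratio4          # compressed size at level 4

print(f"level 6: {size6:.1f} MB, level 4: {size4:.1f} MB")
print(f"files grow by {100 * (size4 / size6 - 1):.2f}%")
print(f"compression speeds up {speed4 / speed6:.2f}x")
```

In other words, files grow by roughly 1.3% while compression runs about 2.1x faster.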
A longer writeup of all this, with data graphs: https://aras-p.info/blog/2021/08/05/EXR-Zip-compression-levels/