# Chunk Block Size vs Number of Threads
lrzip-next now employs the dictionary coding that LZMA2 uses. This reduces the set of possible dictionary sizes but also streamlines it. The code snippet below is obsolete; dictionary sizes are now chosen from the sequence:

- 2^n, 3*2^(n-1)
- 2^(n+1), 3*2^n
- 2^(n+2), 3*2^(n+1)
lrzip-next will attempt to fit the largest possible block size based on compression level. If the memory overhead required for block size * number of threads is too large, the block size will be reduced stepwise through the sequence until it fits:

- If threads * 2^26 (64MB) is too large, threads * 3*2^24 (48MB) will be tried.
- If threads * 3*2^24 is too large, threads * 2^25 (32MB) will be tried.
- and so on.
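The step-down can be sketched as follows (a simplified model with illustrative names, not lrzip-next's actual variables). A power of two 2^n steps down to 3*2^(n-2), and 3*2^(n-1) steps down to 2^n, which is exactly the 64MB → 48MB → 32MB progression above:

```c
#include <stdint.h>

/* Step a block size down through the 2^n / 3*2^(n-1) sequence until
 * threads * size fits within the memory limit. Illustrative sketch. */
static uint64_t fit_block_size(uint64_t size, int threads, uint64_t limit)
{
    while ((uint64_t)threads * size > limit && size > 4096) {
        if ((size & (size - 1)) == 0)   /* 2^n     -> 3*2^(n-2) */
            size = (size >> 2) * 3;
        else                            /* 3*2^(n-1) -> 2^n    */
            size = (size / 3) << 1;
    }
    return size;
}
```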
This allows lrzip-next to store the dictionary size in a single byte. See Update on LZMA Compression Memory Usage.
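This single-byte scheme mirrors the LZMA2 dictionary-size property byte described in the xz file-format specification, where a byte `b` decodes as `(2 | (b & 1)) << (b / 2 + 11)`. A decoding sketch (illustrative; lrzip-next's own storage may differ in detail):

```c
#include <stdint.h>

/* Decode an LZMA2-style dictionary-size property byte.
 * b = 0 -> 2^12, b = 1 -> 3*2^11, b = 2 -> 2^13, ...
 * following the encoding described in the xz format spec. */
static uint64_t dict_size_from_byte(unsigned b)
{
    return (uint64_t)(2 | (b & 1)) << (b / 2 + 11);
}
```

Each increment of the byte selects the next entry in the 2^n / 3*2^(n-1) sequence, so the full range of dictionary sizes fits in one byte.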
One of the first things I changed in lrzip-next was how block sizes were computed in the `open_stream_out` function. The methodology (still employed in lrzip) favored a high number of threads, even if it meant reducing individual block sizes. A smaller block size defeats the purpose of Long Range Zip, since a small block has nothing long about it. Depending on the amount of system RAM, the old method produced some very odd results. Consider...
**Current lrzip**

| Compression Level | # Threads | Chunk Size | # Blocks | Compression Ratio | LZMA Dictionary Size |
|---|---|---|---|---|---|
| 5 | 9 | 128,700,872 | 8 | 6.568 | 1,048,576 |
| 6 | 9 | 128,073,425 | 8 | 6.649 | 4,194,304 |
| 7 | 9 | 55,721,681 | 16 | 7.770 | 16,777,216 |
| 8 | 8 | 10,485,760 | 85 | 7.610 | 33,554,432 |
**lrzip-next**

| Compression Level | # Threads | Block Size | # Blocks | Compression Ratio | LZMA Dictionary Size |
|---|---|---|---|---|---|
| 5 | 9 | 128,700,872 | 8 | 7.816 | 16,777,216 |
| 6 | 5 | 231,661,568 | 5 | 7.850 | 22,020,096 |
| 7 | 5 | 231,661,568 | 4 | 7.995 | 22,020,096 |
| 8 | 5 | 231,661,568 | 4 | 8.001 | 22,020,096 |
When I discovered the 85 blocks, it forced a revisit of the methodology. The new approach first reduces the number of threads, then reduces the LZMA dictionary size, and tests the result against the memory limit. If it fits, break out; if not, reduce the dictionary by 10% and try again. This method favors larger block sizes.

The end result is a slightly smaller dictionary than the LZMA spec would call for, and fewer threads than current lrzip uses. The benefit is still a larger dictionary and a much larger block size to pass to the backend compressor. The goal was to keep the dictionary size >= 16MB and the thread count >= 4.

The same methodology was applied to ZPAQ and its block sizes as well. This makes ZPAQ compression much more efficient.
In pseudo-code:

```
MAX_THREADS = current_number_of_threads
MIN_THREADS = min(4, current_number_of_threads)

while Overhead * Num_Threads > limit_usable_ram
do
    Dictionary_Size = next lower 2^n or 3*2^(n-1)
    Recompute_Overhead()
    for threads = MAX_THREADS; threads > MIN_THREADS; threads--
    do
        if Overhead * threads <= limit_usable_ram; then break
    done
    # repeat unless minimum dictionary size and thread count are reached
done
```

If still not resolved, keep reducing threads until only 1 remains.
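One reading of the loop above can be sketched in C. The overhead model and all names here are illustrative assumptions, not lrzip-next's actual computation; the point is the ordering: shrink the dictionary toward the 16MB floor first, then shed threads toward the floor of 4, and only drop below 4 threads as a last resort:

```c
#include <stdint.h>

#define MIN_DICT (16ULL << 20)   /* goal: keep dictionary >= 16MB */

/* Simplified overhead model: per-thread cost proportional to the
 * dictionary size. The real computation is more involved. */
static uint64_t overhead(uint64_t dict_size) { return dict_size * 11 / 2; }

/* Next smaller entry in the 2^n / 3*2^(n-1) sequence. */
static uint64_t dict_step_down(uint64_t d)
{
    return ((d & (d - 1)) == 0) ? (d >> 2) * 3 : (d / 3) << 1;
}

/* Reduce dictionary size, then thread count, until total overhead
 * fits within the RAM limit. Favors larger blocks over more threads. */
static void fit(uint64_t *dict, int *threads, uint64_t limit_usable_ram)
{
    int min_threads = *threads < 4 ? *threads : 4;

    while (overhead(*dict) * (uint64_t)*threads > limit_usable_ram) {
        if (*dict > MIN_DICT) {          /* shrink dictionary first */
            *dict = dict_step_down(*dict);
            continue;
        }
        for (; *threads > min_threads; (*threads)--)  /* then shed threads */
            if (overhead(*dict) * (uint64_t)*threads <= limit_usable_ram)
                break;
        break;                           /* minimums reached */
    }
    /* If still not resolved, keep reducing threads until 1. */
    while (*threads > 1 &&
           overhead(*dict) * (uint64_t)*threads > limit_usable_ram)
        (*threads)--;
}
```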