Avoid 0xFF00 markers in Huffman data #27

kornelski · 2014-03-19T01:44:02Z

Each 0xFF byte in Huffman data needs special coding (a stuffed 0x00 byte after it) that adds another byte of overhead. If this overhead could be reliably minimized, then the approximation of the size of scan data described in #15 would be more effective.
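To make the overhead concrete, here is a minimal sketch of the stuffing rule. The function names `stuff` and `stuffing_overhead` are made up for illustration and are not taken from any encoder.

```python
def stuff(scan: bytes) -> bytes:
    """Insert a 0x00 after every 0xFF so the byte is not read as a marker."""
    out = bytearray()
    for b in scan:
        out.append(b)
        if b == 0xFF:
            out.append(0x00)
    return bytes(out)


def stuffing_overhead(scan: bytes) -> int:
    """Extra bytes added by stuffing: one per 0xFF in the unstuffed data."""
    return scan.count(0xFF)
```

The overhead is simply the number of 0xFF bytes in the packed bit stream, so that count is the quantity to minimize.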

Here's an algorithm off the top of my head:

1. Find which code happens to overlap the 0xFF bytes most frequently.
2. Swap that code with another code of the same length. If there isn't a code to swap with, maybe modify or regenerate the Huffman tree (e.g. flip all bits in all codes?).
3. Go to 1 as long as the result keeps getting smaller.
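A rough sketch of the swap loop, under some assumptions: the stream is modelled as a list of `(symbol, extra_bits)` tokens, codes are bit strings, and instead of first locating the code that overlaps 0xFF most often, this variant simply tries every same-length swap and keeps one that lowers the count. All names here (`pack`, `count_ff`, `greedy_swap`) are hypothetical.

```python
from itertools import combinations


def pack(tokens, codes):
    """Concatenate code + raw extra bits per token and pack into bytes."""
    bits = "".join(codes[sym] + extra for sym, extra in tokens)
    bits += "1" * (-len(bits) % 8)  # pad the final byte with 1 bits
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def count_ff(tokens, codes):
    """Number of 0xFF bytes, i.e. the stuffing overhead in bytes."""
    return pack(tokens, codes).count(0xFF)


def greedy_swap(tokens, codes):
    """Swap same-length codes while the 0xFF count keeps getting smaller."""
    codes = dict(codes)
    best = count_ff(tokens, codes)
    while True:
        best_pair = None
        for a, b in combinations(list(codes), 2):
            if len(codes[a]) != len(codes[b]):
                continue  # only same-length swaps keep the total bit count
            codes[a], codes[b] = codes[b], codes[a]
            n = count_ff(tokens, codes)
            codes[a], codes[b] = codes[b], codes[a]  # undo the trial swap
            if n < best:
                best, best_pair = n, (a, b)
        if best_pair is None:
            return codes, best
        a, b = best_pair
        codes[a], codes[b] = codes[b], codes[a]
```

Re-packing the whole stream for every trial swap is slow. A real implementation would presumably want an incremental estimate, but this shows the shape of the loop and that the total number of bits never changes.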

And/Or:

1. For all symbols in the stream, make a histogram of pairs (code[symbol[n-1]], code[symbol[n]]).
2. Find which pairs have 8+ consecutive 1s.
3. Starting from the most frequent pair, swap codes with other codes of the same length to break the streaks.
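The first two steps could look roughly like this. It is only a sketch: `risky_pairs` is a made-up name, codes are assumed to be bit strings, and any raw bits stored between two codes are ignored. A run of 8 ones is also only a candidate for an 0xFF byte, since the run still has to line up with a byte boundary.

```python
from collections import Counter


def risky_pairs(symbols, codes, run=8):
    """Adjacent symbol pairs whose joined codes hold `run`+ consecutive 1s,
    most frequent first."""
    pairs = Counter(zip(symbols, symbols[1:]))
    ones = "1" * run
    return [(pair, n) for pair, n in pairs.most_common()
            if ones in codes[pair[0]] + codes[pair[1]]]
```

The returned list gives the order in which to try same-length swaps for the third step.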

Likely it can be improved.


frkay · 2014-03-19T22:19:18Z

I have to check how codes having the same Huffman code length are stored in the file. I seem to recall that it's by increasing value and unrelated to their actual frequency (apart from having the same Huffman code length, which comes from the fact that they have similar frequencies).
E.g. codes of length 5, 4 entries: 7 9 10 12
But if 12 appears a bit more frequently than 10 and unfortunately has a code ending with a lot of ones, like 00111, it could be a smart move to change the recording order to 7 9 12 10. This time 12 would get the less "dangerous" code 00110 and 10 would get 00111.
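A small sketch of how the stored order decides the codes, using the usual canonical assignment (codes are handed out in increasing order within a length, then shifted left for the next length). The function name `assign_codes` and the two length-4 symbols 1 and 2 are assumptions, added only so that the length-5 codes start at 00100 as in the example above. It also assumes a decoder accepts any symbol order within a length.

```python
def assign_codes(symbols_by_length):
    """Map each symbol to its code, given the stored symbol order per length."""
    codes, code = {}, 0
    for length in range(1, 17):  # JPEG code lengths run from 1 to 16 bits
        for sym in symbols_by_length.get(length, []):
            codes[sym] = format(code, f"0{length}b")
            code += 1
        code <<= 1  # the next length starts from the current code with a 0 appended
    return codes


by_value = assign_codes({4: [1, 2], 5: [7, 9, 10, 12]})
reordered = assign_codes({4: [1, 2], 5: [7, 9, 12, 10]})
```

With the value-sorted order, 12 ends up with 00111. With the reordered list, 12 gets 00110 and 10 gets 00111, while the code lengths and the total size of the table stay the same.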

I'm pretty sure the histogram of pairs will not work since there is often raw binary data in-between two Huffman codes.

bdaehlie modified the milestone: v2.0 May 7, 2014

bdaehlie added the enhancement label Jun 26, 2014

jrmsmith mentioned this issue Jul 28, 2015

Jpeg custom Huffman tables #184

Closed

jiangdongguo mentioned this issue Dec 2, 2021

OOM on Android #411

Closed
