Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

miniz_oxide doesn't validate HLIT #130

Closed
fintelia opened this issue Nov 25, 2022 · 6 comments
Closed

miniz_oxide doesn't validate HLIT #130

fintelia opened this issue Nov 25, 2022 · 6 comments

Comments

@fintelia
Copy link

The deflate spec requires that dynamic blocks contain at most 286 length/literal codes, despite the encoding allowing values up to 288 to be represented:

We can now define the format of the block:

5 Bits: HLIT, # of Literal/Length codes - 257 (257 - 286)
5 Bits: HDIST, # of Distance codes - 1 (1 - 32)
4 Bits: HCLEN, # of Code Length codes - 4 (4 - 19)

Unfortunately, this crate doesn't validate the requirement when decoding, so blocks with 287 or 288 length/literal codes are also accepted.

Example file (Make sure to extract the zip file and then feed bad.bin into miniz_oxide. GitHub doesn't allow uploading .bin files directly.)

@oyvindln
Copy link
Collaborator

What do zlib and C miniz when given this data? People often expect corrupted/non-conforming stuff to work regardless so if it e.g decompresses in zlib we might want to let it work as well.

@fintelia
Copy link
Author

I'm not sure about C miniz, but zlib rejects this case and even clarified in an issue that the handling is intentional.

@oyvindln
Copy link
Collaborator

Okay, then we should also reject it.

@oyvindln
Copy link
Collaborator

oyvindln commented Dec 7, 2022

When adding some code for this it looks like the test file has HLIT 2 and HDIST 23 (i.e 259 litlen codes 24 dist codes) so if it's supposed to fail it might be for something else. Does it decode fine with zlib? (miniz_oxide decodes it and seems to be raw deflate not zlib wrapped)

The code seems to already reject inputs with 32 distance codes (tested the bogus_data test in inflate/core.rs) though by checking some total number rather than HDIST directly (it was ported from miniz so some parts are still a tad cryptic). Would need to craft something with 287 or 288 litlen codes to see if that would be rejected too.

@fintelia
Copy link
Author

fintelia commented Dec 8, 2022

I probably should have clarified that the bad.bin file is a zlib wrapped deflate stream. The first bytes of the file are 78 01 FD E0 01, of which the first two are a zlib header. Of the next couple:

   FD       E0       01
-------- -------- --------
11111101 11100000 00000001
       1                   BFINAL=Final Block
     10                    BTYPE=Dynamic Huffman Codes
11111                      HLIT=31+257=288
            00000          HDIST=0+1=1
         111             1 HCLEN=15+4=19

If it is helpful, this is the code I used to generate the header.

@oyvindln
Copy link
Collaborator

oyvindln commented Dec 8, 2022

Ah it's probably correct, checked the source now and I was being a doofus and forgetting to change the filename in the test case I made. Will look into it with the right file later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants