Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows Explorer cannot unpack some files generated with libdeflate (reward offered) #323

Closed
chrisgch opened this issue Sep 7, 2023 · 9 comments · Fixed by #324
Closed
Labels

Comments

@chrisgch
Copy link

chrisgch commented Sep 7, 2023

First, let me explain what I'm doing: I'm using libdeflate to compress files on Windows, wrap them with my own ZIP headers and store them as ZIP files. Libdeflate is amazingly fast compared with other deflate engines.

Recently a user sent me a ZIP archive created with libdeflate which the Windows 7/10/11 Explorer cannot unpack. It fails with error 0x80004005 (unknown error). The file is named FreeImage.dll version 3.17 from a sourceforge project. Strangely, all other unpackers I tried can unpack the file just fine, including the original pkzip 2.04g, 7zip, Winzip, and WinRAR.

Therefore the libdeflate compression isn't faulty, but incompatible with Windows 7/10/11 Explorer.

I have managed to extract the smallest possible block with which the error occurs (10kBytes). Interestingly, the compressed data only uses literals, no offsets, because apparently no duplicate data is found. Also the file is the same for compression method 1 to 9. The problem must therefore be located in the Huffman encoder.

Attached are 4 archives:
Explorer_Error_libdeflate.zip: This file was created with libdeflate and cannot be unpacked with Explorer
Explorer_OK_7zip.zip: This file was created with 7zip and can be unpacked with Explorer
Explorer_OK.zip: This file was created by my program with an different deflate library and can be unpacked with Explorer
Freeimage.zip: The entire dll packed with libdeflate which cannot be unpacked with Explorer

Btw, I checked the difference of the files in the libdeflate decoder:
libdeflate:
num_litlen_syms=257
num_offset_syms=1
num_explicit_precode_lens=10

regular deflate:
num_litlen_syms=257
num_offset_syms=2
num_explicit_precode_lens=18

You will probably say that it's Microsoft's fault and I should contact them about the bug, but it's impossible to contact anyone in charge at Microsoft about such a bug. And it's annoying for my users when they create ZIP files which then can't be used by Windows Explorer users. Also Windows 7 is out of support and will never get a fix for this.

Therefore I'm willing to offer you a $2000 reward if you can make libdeflate compatible with Windows Explorer.

Explorer_Error_libdeflate.zip
Explorer_OK_7zip.zip
Explorer_OK.zip
FreeImage.zip

@ebiggers
Copy link
Owner

ebiggers commented Sep 8, 2023

It sounds like Windows Explorer is violating the DEFLATE specification (RFC 1951), which permits a DEFLATE block to contain one distance (offset) codeword:

A code length of 0 indicates that the corresponding symbol in the literal/length or distance alphabet will not occur in the block, and should not participate in the Huffman code construction algorithm given earlier. If only one distance code is used, it is encoded using one bit, not zero bits; in this case there is a single code length of one, with one unused code. One distance code of zero bits means that there are no distance codes used at all (the data is all literals).

This may be a case where practice does not entirely match the specification, so I will consider changing it... It would reduce compression ratio slightly, but it sounds like other DEFLATE implementations already take this penalty.

Even if I "fix" this, please file a bug report with Microsoft too. They could fix their violation of the DEFLATE specification.

I don't accept compensation for work on libdeflate.

@chrisgch
Copy link
Author

chrisgch commented Sep 8, 2023

Your analysis seems to be 100% on point! When I change the code

	if (unlikely(num_used_syms == 0)) {
		return;
	}

to (taken from the following block if (unlikely(num_used_syms == 1)) {

	if (unlikely(num_used_syms == 0)) {
		/*
		 * Code is empty.  sort_symbols() already set all lengths to 0,
		 * so there is nothing more to do.
		 */
		unsigned nonzero_idx = 1;
		codewords[0] = 0;
		lens[0] = 1;
		codewords[nonzero_idx] = 1;
		lens[nonzero_idx] = 1;
		return;
	}

then the Explorer is able to unpack the archive just fine! I hope that this is the change you had in mind. The 6 dlls out of 3718 from system32 which Explorer failed to unpack can now all

I have reported the issue via Windows feedback hub now. Unfortunately it cannot be viewed via Web, only via Feedback Hub app. Here is the link for anyone interested:
https://aka.ms/AAmj513

@tansy
Copy link
Contributor

tansy commented Sep 8, 2023

Windows 7/10/11 Explorer cannot unpack (it). Strangely, all other unpackers I tried can unpack the file just fine, including the original pkzip 2.04g, 7zip, Winzip, and WinRAR.

Maybe you should ask Microsoft what they did to it?

Neither unzip, 7zip, nor zipinfo have seen any problem with this file.

ebiggers added a commit that referenced this issue Sep 10, 2023
Fix compatibility with some DEFLATE decompressors that don't correctly
handle some edge cases in the DEFLATE specification.

Fixes #323
ebiggers added a commit that referenced this issue Sep 10, 2023
Fix compatibility with some DEFLATE decompressors that don't correctly
handle some edge cases in the DEFLATE specification.

Fixes #323
ebiggers added a commit that referenced this issue Sep 10, 2023
Fix compatibility with some DEFLATE decompressors that don't correctly
handle some edge cases in the DEFLATE specification.

Fixes #323
ebiggers added a commit that referenced this issue Sep 10, 2023
Fix compatibility with some DEFLATE decompressors that don't correctly
handle some edge cases in the DEFLATE specification.

Fixes #323
ebiggers added a commit that referenced this issue Sep 10, 2023
Fix compatibility with some DEFLATE decompressors that don't correctly
handle some edge cases in the DEFLATE specification.

Fixes #323
@ebiggers ebiggers added the bug label Sep 10, 2023
ebiggers added a commit that referenced this issue Sep 10, 2023
Fix compatibility with some DEFLATE decompressors that don't correctly
handle some edge cases in the DEFLATE specification.

Fixes #323
@ebiggers
Copy link
Owner

Fixed by #324.

I decided it's worth changing this for several reasons:

  • The edge case of incomplete Huffman codes is inherently somewhat ambiguous, and the decompressor in very old versions of zlib mishandles it too. So it's not just Windows Explorer. Perhaps Windows Explorer incorporates a very old version of zlib and that is why the issue exists there.
  • The other DEFLATE compressors I checked avoid this edge case too.
  • Avoiding incomplete codes only makes a difference in rare cases, and it only uses a few more bits.

I do still consider it to be a bug for decompressors to not handle this edge case in the specification.

@ace-dent
Copy link

ace-dent commented Sep 17, 2023

@ebiggers - is there any way to disable, for instance if we are using with png?
I believe pngout provides a parameter for this (source):

/mincodes# : ensure Huffman tables have a minimum number of codes: {0(default),1,2}. May add a few extra bytes to the file size, but provides compatibility with buggy PNG decoders; applications using Zlib 1.2.1 require 1, while some mobile phones require 2.

... or is this something else?

@ebiggers
Copy link
Owner

I don't plan to add a parameter like that. More parameters == more complexity, and users will not know when to use it. zlib, 7-Zip, zopfli do not provide such a parameter either. The difference in compression ratio is very small. Also it usually only affects short inputs, but libdeflate v1.19 usually does better on short inputs than v1.18 due to other improvements.

@ace-dent
Copy link

@ebiggers - thanks for the reply. It is good to avoid unnecessary parameters +1.
I'm suggesting that png images do not need these minimum codes, (since the 2000s), so perhaps those streams could be treated as before?

@ebiggers
Copy link
Owner

Even if no PNG decoder has this issue (I wouldn't be so sure about that), libdeflate can't do something specifically for PNG without adding a parameter that tells libdeflate whether the DEFLATE stream is being created for a PNG or not.

@white-gthb
Copy link

white-gthb commented Sep 23, 2023

I have reported the issue via Windows feedback hub now. Unfortunately it cannot be viewed via Web, only via Feedback Hub app. Here is the link for anyone interested:
https://aka.ms/AAmj513

I get Your account doesn't have access to this feedback
I this feedback still publicly accessible in Microsoft Feedback-hub?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants