Skip to content

Conversation

wapmorgan
Copy link
Contributor

@wapmorgan wapmorgan commented Jun 17, 2017

Original thread: https://bugs.php.net/bug.php?id=73944

Description:

I'm writing an unpacker of MSZIP blocks that use deflate compression.
I faced a problem and created a question on stackoverflow (http://stackoverflow.com/questions/41653663/is-there-something-special-in-windows-mszip-format-or-cfdata-in-cab-files), where maintainer of zlib library adviced to use dictionary option to pass raw uncompressed data of previous block to decompress next block of data. But in fact that this option doesn't work correctly in php.

Test script:

$in = inflate_init(ZLIB_ENCODING_RAW, array('dictionary' => str_repeat("\00", 32768)));
$a = inflate_add($in, file_get_contents('https://github.com/wapmorgan/CabArchive/releases/download/0.0.1-dev/1block'));
echo '1 block: '.strlen($a).PHP_EOL;

$in = inflate_init(ZLIB_ENCODING_RAW, array('dictionary' => $a));
$b = inflate_add($in, file_get_contents('https://github.com/wapmorgan/CabArchive/releases/download/0.0.1-dev/2block'));
echo '2 block: '.($b === false ? 'failed' : strlen($b)).PHP_EOL;

Expected result:

1 block: 32768
2 block: 32768

Actual result:

1 block: 32768
Warning: inflate_add(): data error in ...
2 block: failed

What I've done

  1. Removed check of dictionary parameter in inflate_add if it's a string. As told Mark Adler

a deflate dictionary is just a bunch of bytes

  1. Set of dictionary in inflate_init if compression method is raw.
    If used raw method, Z_NEED_DICT will not be returned so php will not use dictionary at all.

Result after this patch, applied to master

1 block: 32768
2 block: 32768

@sgolemon
Copy link
Member

Could you clarify what the 1block and 2block files are? To properly accept this patch we should include a unit test demonstrating what's wrong. While you did include a test with expected and actual output, the data files are... fairly opaque. Can you describe how they are created?

@wapmorgan
Copy link
Contributor Author

wapmorgan commented Jun 19, 2017

These files are MSZip. They can be found in any .cab file using MSZip compression.
Example of file attached.
webrec.cab.zip
More information about MSZip format in my question on stackoverflow.

@nikic
Copy link
Member

nikic commented Jun 23, 2017

Can you please add your test-case from the PR description as a phpt test?

@bwoebi You might want to review this.

@bwoebi
Copy link
Member

bwoebi commented Jun 23, 2017

That looks pretty much fine, I can merge that with a phpt test attached.

@wapmorgan
Copy link
Contributor Author

Can I include these raw files 1block and 2block in test or tests should not contain binary files?

@bwoebi
Copy link
Member

bwoebi commented Jun 24, 2017

It's totally fine to add fixture files as needed.

@wapmorgan
Copy link
Contributor Author

Where I should put fixtures and what names it should have?

@wapmorgan
Copy link
Contributor Author

Ok, I've added test to ext/zlib/tests as bug73944.phpt, bug73944_fixture1 and bug73944_fixture2.

@nikic
Copy link
Member

nikic commented Jun 25, 2017

Looks like this got merged into 7.0+ as cd1869b yesterday, so closing here.

@nikic nikic closed this Jun 25, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants