Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

deflateSync produces binary that inflateSync fails to decompress #8886

Open
MarByteBeep opened this issue Feb 13, 2024 · 5 comments
Open

deflateSync produces binary that inflateSync fails to decompress #8886

MarByteBeep opened this issue Feb 13, 2024 · 5 comments
Labels
bug Something isn't working bun.js Something to do with a Bun-specific API

Comments

@MarByteBeep
Copy link

What version of Bun is running?

1.0.26

What platform is your computer?

Microsoft Windows NT 10.0.22631.0 x64

What steps can reproduce the bug?

const buffer = new Uint8Array([0x12, 0x01, 0x3, 0x5, 0x5]);

{
    const compressed = Bun.deflateSync(buffer, { level: 9, windowBits: -15 });

    console.log(compressed);

    const decompressed = Bun.inflateSync(compressed);

    // decompressed isn't a Uint8array but an error: invalid stored block lengths
    console.log(decompressed);
}

{
    const compressed = Bun.deflateSync(buffer, { level: 9, windowBits: -15 });

    console.log(compressed);

    const decompressed = Bun.gunzipSync(compressed);

    // decompressed works fine
    console.log(decompressed);
}

What is the expected behavior?

A binary compressed with deflateSync should always be decompressible with inflateSync. When using the { level: 9, windowBits: -15 } settings, it is not.

However it IS decompressible with gunzipSync.

So we end up with the weird situation in which you have to use gunzipSync to decompress and deflateSync to compress.

What do you see instead?

decompressed is error: invalid stored block lengths

Additional information

As per discord discussion, might be related to #8529

Also I ran into: #6401 as well, the docs seem to be incorrect.

@MarByteBeep MarByteBeep added the bug Something isn't working label Feb 13, 2024
@Electroid Electroid added the bun.js Something to do with a Bun-specific API label Feb 13, 2024
@argosphil
Copy link
Contributor

You're passing a windowBits parameter value of -15.

From the zlib documentation at https://www.zlib.net/manual.html:

The windowBits parameter is the base two logarithm of the window size (the size of the history buffer). It should be in the range 8..15 for this version of the library. Larger values of this parameter result in better compression at the expense of memory usage. The default value is 15 if deflateInit is used instead.

Internally, a negative value of windowBits is used to switch between gzip and zlib compression modes. We should catch negative values in opts.windowBits and throw an error, but that's about all we can do.

@MarByteBeep
Copy link
Author

MarByteBeep commented Feb 14, 2024

That then creates a new problem: my original binary is compressed and starts with a78 DA header, which is a perfectly valid and common zlib header. I need to be able to recompress said data and end up with that header again. The only setting that did that was windowBits -15. How am I able to recreate the original if you simply prevent that value from being passed?

@MarByteBeep
Copy link
Author

Also, any idea why inflateSync fails to decompress a zlib binary with said header, while gunzipSync works? That feels a bit counterintuitive to me 😊

@argosphil
Copy link
Contributor

Internally, a negative value of windowBits is used to switch between gzip and zlib compression modes. We should catch negative values in opts.windowBits and throw an error, but that's about all we can do.

I was entirely and completely wrong. My apologies.

The actual problem is that inflateSync calls zlib like this:

        var reader = zlib.ZlibReaderArrayList.initWithOptions(compressed, &list, allocator, .{
            .windowBits = -15,
        }) catch |err| {

and you'd like to call it with .windowBits = 15. So this almost looks like my favorite kind of bugfix, those that involve removing or changing a single character.

But it's not: the existing code is right for the "raw deflate" format that zlib produces. Per the guidance of the zlib manual, we should default to expecting a zlib header, and assume it is absent only when explicitly told so. But right now, there's no option for that. We could simply check whether there's a valid zlib header, but that would still have many false positives.

So, technically, everything is as it should be, except for the function names. inflateSync should be inflateRawDeflatedDataSync, and used very rarely. Inflating zlib data and decompressing gzip files can safely share a function, since in those very rare cases where you want to accept only one of those formats, it's an acceptable burden to just check for the gzip magic number yourself.

Of course there's no way to guess that without improved API documentation.

Again, I'm sorry I was wrong. In summary, there are three compressed formats only two of which are known to be disjoint. Both (may) overlap with the third.

@paperdave
Copy link
Member

ive found that this affects bun.report's ability to decompress some panic messages, though the node:zlib polyfill does not have this issue (but is slower).

in addition to this issue, since Bun 1.1.21, more payloads dont seem to be decompressable:

expect(Bun.gunzipSync(Buffer.from('eNrLzCvLz05NUUguSizOcKoMSMzLTNbQVMhIzEvJSS0CAK/LCxc', 'base64url')).toString('utf8')).toBe('invoked crashByPanic() handler');

above:

  • Bun.gunzipSync fails with incorrect header check
  • Bun.inflateSync fails with invalid stored block lengths
  • node zlib gunzipSync fails with invalid header check
  • node zlib inflateSync works

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working bun.js Something to do with a Bun-specific API
Projects
None yet
Development

No branches or pull requests

4 participants