Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug in deflate with Z_FULL_FLUSH and a specific output buffer size #149

Closed
vinniefalco opened this issue Jun 10, 2016 · 4 comments
Closed

Comments

@vinniefalco
Copy link

vinniefalco commented Jun 10, 2016

When deflate is called with a buffer that is exactly large enough to hold the output, avail_out comes back as zero. As per the ZLib documentation:

If deflate returns Z_OK and with zero avail_out, it must be called again after making room in the output buffer because there might be more output pending.

Note that in this case, the sync trailer has already been emitted (0x00, 0x00, 0xff, 0xff). On the subsequent call to deflate, a second sync trailer is emitted. This results in an unnecessary 4 bytes being added to the output.

This example program illustrates the problem. The result of compress should be the same no matter the value of availOut. But it isn't, the first string is 24 bytes in size and the second string is 19 bytes in size even though both inflate to the same result.

#include <iostream>
#include <zlib.h>

std::string
compress(std::string const& in, std::size_t availOut)
{
    z_stream zs;
    zs.zalloc = Z_NULL;
    zs.zfree = Z_NULL;
    zs.opaque = Z_NULL;
    zs.avail_in = 0;
    zs.next_in = Z_NULL;
    deflateInit2(&zs,
        Z_DEFAULT_COMPRESSION,
        Z_DEFLATED, -15,
        4, // memory level 1-9
        Z_DEFAULT_STRATEGY
    );

    std::string out;
    zs.avail_in = in.size();
    zs.next_in = (Bytef*)in.data();
    do
    {
        out.resize(out.size() + availOut);
        zs.avail_out = availOut;
        zs.next_out = (Bytef*)&out[zs.total_out];
        deflate(&zs, Z_FULL_FLUSH);
    }
    while(zs.avail_out == 0);
    out.resize(zs.total_out);
    deflateEnd(&zs);
    return out;
}

int main()
{
    auto s1 = compress("Hello, world!", 19);
    auto s2 = compress("Hello, world!", 30);

    if(s1 != s2)
        std::cerr << "Strings are different!";
}
@madler
Copy link
Owner

madler commented Jun 10, 2016

This is not a bug because zlib does not promise anywhere in the documentation that the output is invariant to changes in the amount of available output provided to the deflate() calls. In fact, there is an explicit mention in the documentation in zlib.h that avail_out returning zero can result in repeated flush markers.

Despite that, I will look at how to remedy the situation and improve the predictability of flush results. The fault lies more in the definition of the interface than in the code.

@madler madler closed this as completed Jun 10, 2016
@vinniefalco
Copy link
Author

@madler I agree its a problem with the interface and not the code. However, consider this section of draft-ietf-hybi-permessage-compression-28:

   An endpoint uses the following algorithm to compress a message.

   1.  Compress all the octets of the payload of the message using
       DEFLATE.

   2.  If the resulting data does not end with an empty DEFLATE block
       with no compression (the "BTYPE" bits are set to 00), append an
       empty DEFLATE block with no compression to the tail end.

   3.  Remove 4 octets (that are 0x00 0x00 0xff 0xff) from the tail end.
       After this step, the last octet of the compressed data contains
       (possibly part of) the DEFLATE header bits with the "BTYPE" bits
       set to 00.

https://tools.ietf.org/html/draft-ietf-hybi-permessage-compression-28#section-7.2.1

Turns out it is quite difficult to know when to remove the last flush marker without causing deflate to emit a spurious flush marker.

Another way to look at this is that a compression algorithm should produce a small as output as possible. If callers have no way to avoid spurious flush markers (I have not been able to figure out how to detect that a flush marker is spurious), I would consider that a bug.

@madler
Copy link
Owner

madler commented Jun 10, 2016

Actually it's quite easy to do if you separate the compression from the flushing. Just call deflate() with Z_BLOCK until avail_out is not zero. At that point there will be at most a few bits left to write. Then call deflate() with Z_FULL_FLUSH and no more input and at least six bytes of available output.

@vinniefalco
Copy link
Author

Ah!!!! Thanks a zillion! I was wondering about that but sometimes its really tough to decipher the description of the flush parameters in the documentation.

hzhuang1 pushed a commit to Linaro/warpdrive-zlib that referenced this issue Jul 31, 2019
* Fix build problems about NEON on AArch64

NEON is enabled by default on armv8-a platforms, and so NEON related
objects should be included when the platform is armv8-a. Errors about
adler32_neon will occur when you run ./configure on armv8-a platforms
without --neon option, because zlib-ng uses --neon option to include
NEON related objects regardless of Arm architecture.
You will have similar issue when you build the project with cmake.

This patch fixes the problem by including NEON related objects when
the platform is armv8-a(including aarch64).

Use adler32_neon only when zlib-ng is configured with --neon (or
-DWITH_NEON=ON if using cmake), or else use the default adler32
no matter what Arm architecture is.

Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants