Feature request: allow decompress to decompress stream data without content size in the header #150

Open
CaselIT opened this issue Apr 26, 2021 · 1 comment

CaselIT commented Apr 26, 2021

Edit: this behaviour is documented, but I think supporting it would be a useful addition. Maybe by passing max_output_size=-1?

The example below raises ZstdError: could not determine content size in frame header

import io
import zstandard as zstd
data = b'some data'
input_buffer = io.BytesIO(data)
with zstd.ZstdCompressor().stream_reader(input_buffer) as r:
    compressed = r.read()
decompressed = zstd.decompress(compressed)
assert decompressed == data

Passing max_output_size to the decompress call works as expected:

...
decompressed = zstd.decompress(compressed, max_output_size=len(data))
assert decompressed == data

Using ZstdDecompressor.stream_reader also works:

...
with zstd.ZstdDecompressor().stream_reader(io.BytesIO(compressed)) as r:
    decompressed = r.read()
assert decompressed == data

I think decompress should handle this case as well, since it's not always possible to know how the data was compressed.

The package version I use is zstandard 0.15.2 on Windows 10 (but I don't think it's OS-dependent).

CaselIT commented Apr 26, 2021

OK, after reading the documentation in more detail, this is the expected behaviour:

If the frame header of the compressed data does not contain the content
size ``max_output_size`` must be specified or ``ZstdError`` will be
raised. An allocation of size ``max_output_size`` will be performed and an
attempt will be made to perform decompression into that buffer. If the
buffer is too small or cannot be allocated, ``ZstdError`` will be
raised. The buffer will be resized if it is too large.

So this is a feature request, since I think it would be a useful addition.

CaselIT changed the title from "decompress fails to decompress data compressed with stream_reader if max_output_size is not provided" to "Feature request: allow decompress to decompress stream data without content size in the header" on Apr 26, 2021