Deflat Compressor Length #69
Comments
You'd need to guess some uncompressed size, allocate a buffer of that size, then try decompressing into it. If it fails with LIBDEFLATE_INSUFFICIENT_SPACE, enlarge the buffer and try again. And so on. It's not really a good solution. It's much better to just store the uncompressed size along with the compressed data. It only takes a few bytes to store the uncompressed size -- or less if you store an approximate size only. Note that your code snippet only uses compression level 9, while libdeflate goes up to level 12. You probably could save much more than a few bytes by using a higher compression level. |
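For illustration only, the guess-and-retry loop described above might look like the following C++ sketch. `Status`, `TryDecompress`, and `decompress_unknown_size` are hypothetical names, not part of libdeflate. The actual decompression call is hidden behind a callback so the block builds without the library; with libdeflate, the callback would presumably wrap `libdeflate_deflate_decompress` and map `LIBDEFLATE_INSUFFICIENT_SPACE` to `Status::InsufficientSpace`.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical status codes standing in for the decompressor's result enum.
enum class Status { Ok, InsufficientSpace, BadData };

// Hypothetical callback: tries to decompress into out[0..cap) and, on success,
// stores the number of bytes actually produced in *produced.
using TryDecompress =
    std::function<Status(std::uint8_t* out, std::size_t cap, std::size_t* produced)>;

// Retries with a doubling buffer until the data fits, the data is rejected,
// or the buffer would exceed max_size. Returns true and fills `result` on success.
bool decompress_unknown_size(const TryDecompress& attempt,
                             std::size_t initial_guess,
                             std::size_t max_size,
                             std::vector<std::uint8_t>& result) {
    std::size_t cap = initial_guess > 0 ? initial_guess : 1;
    if (cap > max_size) cap = max_size;
    for (;;) {
        std::vector<std::uint8_t> buffer(cap);
        std::size_t produced = 0;
        const Status status = attempt(buffer.data(), buffer.size(), &produced);
        if (status == Status::Ok) {
            buffer.resize(produced);  // trim to the real uncompressed size
            result.swap(buffer);
            return true;
        }
        // Give up on corrupt input or once the size limit has been tried.
        if (status != Status::InsufficientSpace || cap >= max_size) return false;
        // Double the capacity, clamping to max_size and avoiding overflow.
        cap = (cap > max_size / 2) ? max_size : cap * 2;
    }
}
```

Every failed attempt decompresses the data again from the start, which is why this is described above as not really a good solution. The `max_size` limit is an added assumption that keeps corrupt or hostile input from driving unbounded allocation.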
Hello @ebiggers, thanks for your fast reply. I noticed there are buffer-size and window-size constants in .NET: internal const int DefaultBufferSize = 8192;
private const int WindowSizeUpperBound = 47; What do you think about them? How does it handle an unknown size?
I want to avoid splitting the data as much as I can. If it's the last resort I'll do it, but I'm dying of curiosity about how .NET manages deflate compression/decompression. I also tried a very small test input (24 bytes). This is the DeflateStream output: Regards, |
Your C# code snippet streams the data to a MemoryStream, which implements an automatically-resizing array. So it could end up copying the data many times as it incrementally reallocates the array, and end up with a buffer up to 2x larger than is required. You can improve the performance of both the C# version and the libdeflate version, and also make it much easier to use libdeflate, by storing the uncompressed size along with the compressed data.
Check libdeflate.h for how to use the API. |
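As a rough sketch of how small such a size header can be, the C++ below encodes the uncompressed size as a variable-length integer (LEB128-style, 7 bits per byte). The helper names `put_varint` and `get_varint` are hypothetical and not part of libdeflate. This is one possible encoding, not one prescribed in this thread.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Appends `value` as a varint: 7 payload bits per byte, with the high bit
// set on every byte except the last one.
void put_varint(std::vector<std::uint8_t>& out, std::uint64_t value) {
    while (value >= 0x80) {
        out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(value));
}

// Reads a varint from data[0..size). Returns the number of header bytes
// consumed, or 0 if the input is truncated or longer than 10 bytes.
std::size_t get_varint(const std::uint8_t* data, std::size_t size,
                       std::uint64_t* value) {
    std::uint64_t result = 0;
    for (std::size_t i = 0; i < size && i < 10; ++i) {
        result |= static_cast<std::uint64_t>(data[i] & 0x7F) << (7 * i);
        if ((data[i] & 0x80) == 0) {
            *value = result;
            return i + 1;
        }
    }
    return 0;
}
```

With this encoding, sizes up to 127 bytes (such as the 24-byte test input mentioned earlier) cost one header byte, sizes up to 16,383 cost two, and sizes up to 2,097,151 cost three. The decompressor reads the header, allocates exactly that many bytes, and decompresses the remaining input in a single call.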
@Bit00009 internally, any deflate decoder relies on the BFINAL bit in the stream, so it generally knows when it should stop. There is also a kind of natural chunking in deflate, so a decoder can be adapted to work with input/output buffers, which may essentially be represented under Technically, at least for decompression, a "Stream"-like facade is not as simple as it may look. Just imagine that you have a file with 80,000 small compressed blocks (say, 500 bytes to 1.5 KB each) which are written to the stream one by one. Each instance of I'm generally trying to show that blindly reproducing a generic |
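For reference, the BFINAL flag mentioned above is the first bit of every deflate block header (RFC 1951), followed by the two-bit BTYPE field. The sketch below, with the hypothetical names `BlockHeader` and `read_first_block_header`, decodes only the header of the first block. Later block headers are generally not byte-aligned, so finding them requires actually decoding the preceding blocks.

```cpp
#include <cstddef>
#include <cstdint>

struct BlockHeader {
    bool is_final;        // BFINAL: true if this is the last block in the stream
    unsigned block_type;  // BTYPE: 0 = stored, 1 = fixed Huffman,
                          //        2 = dynamic Huffman, 3 = reserved (invalid)
};

// Decodes the 3-bit header of the FIRST block of a raw deflate stream.
// Deflate packs bits starting at the least significant bit of each byte,
// so BFINAL is bit 0 and BTYPE is bits 1-2 of the first byte.
bool read_first_block_header(const std::uint8_t* data, std::size_t size,
                             BlockHeader* header) {
    if (size == 0) return false;  // nothing to read
    header->is_final = (data[0] & 0x01) != 0;
    header->block_type = (data[0] >> 1) & 0x03;
    return true;
}
```

For example, the two-byte raw deflate stream `03 00` (an empty input) decodes as a final block using fixed Huffman codes. Knowing that a block is final tells the decoder where to stop, but not how many output bytes it will produce before that point, which is why the output size still has to be guessed or stored.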
I guess there could be a function that simulates the decompression without actually writing any output. It might be useful for checking if a given block of data contains a valid deflate stream, and could report the number of input bytes consumed and output bytes produced, which would be useful here. This might be outside the scope of libdeflate though. |
Sure, but that would encourage people to do the wrong thing (decompress the data twice) instead of the right thing (store the uncompressed size along with the compressed data). And I'd like to keep the API simple. So I think it's better to not add it, and instead just encourage people to do the right thing. |
When the deflated stream is like several MB, partial read results in |
Hi there dear @ebiggers
First, thank you for this amazing library. I just moved from zlib to your library and I'm having some difficulties.
I want to do a very simple deflate compression on a buffer, and I can't figure out how to deal with the length.
I know there are two ways to deal with the length in gzip and lz4. I have worked with them before, and I was adding the original data size as a small header in my compressed data, but in this case every single byte is important to me.
In C# there's a built-in compressor known as DeflateStream: we just pass a byte array and get a byte array back. The resulting data doesn't contain anything related to the original data size.
So I tried to do the same using libdeflate, and here's my current code:
As you can see, I always need originalDataSize to make compression/decompression work. I don't know whether I'm doing something wrong, but I need it to work the way C# DeflateStream works: I need to pass a new vector or byte array without knowing the original size and do the decompression. I want a pure deflate result from compression, without adding any metadata or extra information. I would be very grateful if you could tell me how this is possible in C++ with your library.
Regards,
Ultran