-
Notifications
You must be signed in to change notification settings - Fork 13.5k
Add gzip. Add deflate, replacing flate. Upgrade miniz.cpp to 1.16. #10360
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Add streaming compression API for GZip and deflate. Add Reader/Writer support for GZip.
Refactor code, clean up comments.
the data from the source to the destination streams automatically. It is | ||
callee-driven with an internal loop running until all the data have been processed. | ||
It's more efficient with less buffer copying. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you also provide a code example or two of how to use these? They don't have to be large (they should actually be small), but just to get an idea of what's necessary to use the module.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the following examples. See if they are ok.
Thanks for the review. I'll address the feedback shortly. |
Add changes for the feedback. Please review again. Thanks. |
// | ||
// The Original Code is: gzip.rs | ||
// The Initial Developer of the Original Code is: William Wong (williamw520@gmail.com) | ||
// Portions created by William Wong are Copyright (C) 2013 William Wong, All Rights Reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @brson, just want to make sure that this looks good to you.
I'd recommend running |
|
||
/// Low level compress method to compress input data to DEFLATE compliant compressed data. | ||
/// You really need to know what you are doing to call this directly. It's fragile with edge cases. | ||
/// It has multiple modes of operation depending on the parameters. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments like this on functions make me think that this shouldn't be a public API at all. I'm still a little confused why this module can't be exposed as a Reader
and a Writer
implementation. Shouldn't that sort of interface be just as efficient as the methods exposed here? It also seems like it'd be much simpler to read over.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the low level workhorse method for the DEFLATE algorithms, but it's difficult to use and fragile with edge cases. The compress_stream and compress_read are user-friendlier versions of the API taking case of most of the problems. I left the low level API pub so that in case some people don't like the high level API and want to build their own.
Reader and Writer are not appropriate for a low level module like deflate. decompress_stream_rw() is one such example. See at the end how it discards the extra data. It might be ok for some cases to discard data but it shouldn't be the universal case. Providing the buffer-based API is absolutely essential.
The DEFLATE compressed data are usually embedded in another data stream. It's rarely standalone. E.g. gzip file has its own down data, then embeds the deflated data, and ends with its own data again. Same thing with zip file. The buffer-based API let the caller have greater control in dealing with the data boundaries when transition between non-deflate data and deflate data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't want to hinder usage of any of the underlying deflate utilities or anything, but I don't think that this method should still exist. We should have utilities to control what's read from a reader or written to a writer for things like fine-grained buffer management.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I have explained in above, Reader/Writer in the deflate module is not possible. Buffer-based API is the right approach.
As to whether a low level API like compress_buf() should be exposed, like I said before, the deflate module is not a general purpose library for day to day compression need. It's a building block for other higher level compression library writers. I believe we should give them all the powers they need to do their job, with sufficient warning.
…emove error returned from API.
…0 in gzip file_name.
Any update on the status of this pull request? |
We have chosen for I/O to be heavily based around Others may feel differently, but I am hesitant to land this while it continues to drive focus away from the |
I appreciate the update and I respect that decision. GZip has Reader/Writer support. Just deflate doesn't due to correctness. The reason I was adamant about not supporting Reader/Writer for deflate because I have been bitten by the subtle bug of deflate consuming extra data, which Reader won't be able to handle. The bug only shows up for some data but not the others. Correctness is number one priority and I don't feel like providing an API that could cause bug. I will withdraw the pull request for merging the changes. I'll file a separate pull request to merge in the new version of Miniz.cpp, which is just a upgrade and has low impact. |
Cancel the pull request since no agreement can be reached about the issue on the reader/writer API for the deflate component. Since gzip depends on deflate, it's withdrawn as well. The gzip and deflate will be parts of a compression library maintained separately on https://github.com/williamw520/rustyzip |
…ro_wrapper, r=llogiq Add the `transmute_int_to_non_zero` lint Fixes rust-lang#10288 This adds a new complexity lint `transmute_int_to_non_zero` which checks for transmutes to any of the `NonZero*` types, and suggests their `new_unchecked` method instead. r? `@llogiq` changelog: New lint: [`transmute_int_to_non_zero`]
No description provided.