Skip to content

Add gzip. Add deflate, replacing flate. Upgrade miniz.cpp to 1.16. #10360

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 11 commits into from

Conversation

williamw520
Copy link

No description provided.

the data from the source to the destination streams automatically. It is
callee-driven with an internal loop running until all the data have been processed.
It's more efficient with less buffer copying.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also provide a code example or two of how to use these? They don't have to be large (they should actually be small), but just to get an idea of what's necessary to use the module.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the following examples. See if they are ok.

@williamw520
Copy link
Author

Thanks for the review. I'll address the feedback shortly.

@williamw520
Copy link
Author

Add changes for the feedback. Please review again. Thanks.

//
// The Original Code is: gzip.rs
// The Initial Developer of the Original Code is: William Wong (williamw520@gmail.com)
// Portions created by William Wong are Copyright (C) 2013 William Wong, All Rights Reserved.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @brson, just want to make sure that this looks good to you.

@alexcrichton
Copy link
Member

I'd recommend running make tidy as well. We have a maximum column limit on our files, and I think there are many lines which are over the limit.


/// Low level compress method to compress input data to DEFLATE compliant compressed data.
/// You really need to know what you are doing to call this directly. It's fragile with edge cases.
/// It has multiple modes of operation depending on the parameters.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments like this on functions make me think that this shouldn't be a public API at all. I'm still a little confused why this module can't be exposed as a Reader and a Writer implementation. Shouldn't that sort of interface be just as efficient as the methods exposed here? It also seems like it'd be much simpler to read over.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the low level workhorse method for the DEFLATE algorithms, but it's difficult to use and fragile with edge cases. The compress_stream and compress_read are user-friendlier versions of the API taking case of most of the problems. I left the low level API pub so that in case some people don't like the high level API and want to build their own.

Reader and Writer are not appropriate for a low level module like deflate. decompress_stream_rw() is one such example. See at the end how it discards the extra data. It might be ok for some cases to discard data but it shouldn't be the universal case. Providing the buffer-based API is absolutely essential.

The DEFLATE compressed data are usually embedded in another data stream. It's rarely standalone. E.g. gzip file has its own down data, then embeds the deflated data, and ends with its own data again. Same thing with zip file. The buffer-based API let the caller have greater control in dealing with the data boundaries when transition between non-deflate data and deflate data.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to hinder usage of any of the underlying deflate utilities or anything, but I don't think that this method should still exist. We should have utilities to control what's read from a reader or written to a writer for things like fine-grained buffer management.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I have explained in above, Reader/Writer in the deflate module is not possible. Buffer-based API is the right approach.

As to whether a low level API like compress_buf() should be exposed, like I said before, the deflate module is not a general purpose library for day to day compression need. It's a building block for other higher level compression library writers. I believe we should give them all the powers they need to do their job, with sufficient warning.

@williamw520
Copy link
Author

Any update on the status of this pull request?

@alexcrichton
Copy link
Member

We have chosen for I/O to be heavily based around Readers and Writers, and I still feel like this has a large amount of extra functionality which is not necessary for supporting that abstraction. I still believe that each of flate and the gzip module should provide a Reader and a Writer and that's about it.

Others may feel differently, but I am hesitant to land this while it continues to drive focus away from the Reader/Writer abstraction.

@williamw520
Copy link
Author

I appreciate the update and I respect that decision. GZip has Reader/Writer support. Just deflate doesn't due to correctness.

The reason I was adamant about not supporting Reader/Writer for deflate because I have been bitten by the subtle bug of deflate consuming extra data, which Reader won't be able to handle. The bug only shows up for some data but not the others. Correctness is number one priority and I don't feel like providing an API that could cause bug.

I will withdraw the pull request for merging the changes. I'll file a separate pull request to merge in the new version of Miniz.cpp, which is just a upgrade and has low impact.

@williamw520
Copy link
Author

Cancel the pull request since no agreement can be reached about the issue on the reader/writer API for the deflate component. Since gzip depends on deflate, it's withdrawn as well.

The gzip and deflate will be parts of a compression library maintained separately on https://github.com/williamw520/rustyzip

@williamw520 williamw520 closed this Dec 6, 2013
Jarcho pushed a commit to Jarcho/rust that referenced this pull request Feb 26, 2023
…ro_wrapper, r=llogiq

Add the `transmute_int_to_non_zero` lint

Fixes rust-lang#10288

This adds a new complexity lint `transmute_int_to_non_zero` which checks for transmutes to any of the `NonZero*` types, and suggests their `new_unchecked` method instead.

r? `@llogiq`

changelog: New lint: [`transmute_int_to_non_zero`]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants