Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

4890732: GZIPOutputStream doesn't support optional GZIP fields #3072

Open
wants to merge 16 commits into
base: master
Choose a base branch
from

Conversation

@linzang
Copy link
Contributor

@linzang linzang commented Mar 18, 2021


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Integration blocker

 ⚠️ The change requires a CSR request to be approved.

Issue

  • JDK-4890732: GZIPOutputStream doesn't support optional GZIP fields

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/3072/head:pull/3072
$ git checkout pull/3072

Update a local copy of the PR:
$ git checkout pull/3072
$ git pull https://git.openjdk.java.net/jdk pull/3072/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 3072

View PR using the GUI difftool:
$ git pr show -t 3072

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/3072.diff

@bridgekeeper
Copy link

@bridgekeeper bridgekeeper bot commented Mar 18, 2021

👋 Welcome back lzang! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

@openjdk openjdk bot commented Mar 18, 2021

@linzang The following label will be automatically applied to this pull request:

  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the core-libs label Mar 18, 2021
@linzang
Copy link
Contributor Author

@linzang linzang commented Mar 18, 2021

Dear All,
This PR introduce new constructor of GZIPOutputStream to support adding extra field info in gzip file header, as decribed in RFC 1952 gzip spec (https://tools.ietf.org/html/rfc1952).
BTW, does a CSR required for this PR?

Thanks!
Lin

@linzang
Copy link
Contributor Author

@linzang linzang commented Mar 18, 2021

/issue 4890732

@openjdk openjdk bot added the rfr label Mar 18, 2021
@openjdk openjdk bot changed the title 4890732: support optional GZIP fields in GZIPOutputStream 4890732: GZIPOutputStream doesn't support, in fact thwarts, use of optional GZIP fields Mar 18, 2021
@openjdk
Copy link

@openjdk openjdk bot commented Mar 18, 2021

@linzang This issue is referenced in the PR title - it will now be updated.

@mlbridge
Copy link

@mlbridge mlbridge bot commented Mar 18, 2021

@AlanBateman
Copy link
Contributor

@AlanBateman AlanBateman commented Mar 18, 2021

/csr

@openjdk openjdk bot added the csr label Mar 18, 2021
@openjdk
Copy link

@openjdk openjdk bot commented Mar 18, 2021

@AlanBateman has indicated that a compatibility and specification (CSR) request is needed for this pull request.
@linzang please create a CSR request and add link to it in JDK-4890732. This pull request cannot be integrated until the CSR request is approved.

@linzang
Copy link
Contributor Author

@linzang linzang commented Mar 18, 2021

/csr 8263793

@openjdk
Copy link

@openjdk openjdk bot commented Mar 18, 2021

@linzang usage: /csr [needed|unneeded], requires that the issue the pull request refers to links to an approved CSR request.

@linzang
Copy link
Contributor Author

@linzang linzang commented Mar 18, 2021

/csr needed

@openjdk
Copy link

@openjdk openjdk bot commented Mar 18, 2021

@linzang an approved CSR request is already required for this pull request.

@mlbridge
Copy link

@mlbridge mlbridge bot commented Mar 18, 2021

Mailing list message from Lance Andersen on core-libs-dev:

Hi Lin,

Thank you for your contribution.

I have no looked at this in any detail. A few general comments:

* This will require a CSR
* Please update your PR to include test coverage demonstrating that the data can be written and then read back

Best,
Lance

On Mar 18, 2021, at 6:57 AM, Lin Zang <lzang at openjdk.java.net<mailto:lzang at openjdk.java.net>> wrote:

4890732: GZIPOutputStream doesn't support, in fact thwarts, use of optional GZIP fields

-------------

Commit messages:
- remove trailing spaces
- 4890732: support optional GZIP fields in GZIPOutputStream

Changes: https://git.openjdk.java.net/jdk/pull/3072/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3072&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-4890732
Stats: 184 lines in 1 file changed: 184 ins; 0 del; 0 mod
Patch: https://git.openjdk.java.net/jdk/pull/3072.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/3072/head:pull/3072

PR: https://git.openjdk.java.net/jdk/pull/3072

[cid:E1C4E2F0-ECD0-4C9D-ADB4-B16CA7BCB7FC at home]

Lance Andersen| Principal Member of Technical Staff | +1.781.442.2037
Oracle Java Engineering
1 Network Drive
Burlington, MA 01803
Lance.Andersen at oracle.com<mailto:Lance.Andersen at oracle.com>

@linzang
Copy link
Contributor Author

@linzang linzang commented Mar 19, 2021

The CSR has been submitted at https://bugs.openjdk.java.net/browse/JDK-8263793

Copy link
Contributor

@LanceAndersen LanceAndersen left a comment

Hi Lin,

Again, thank you for taking on this 19+ year feature request.

I have not done a deep dive on the CSR, but wanted to get a few comments back to you on a quick scan of the last PR update.

Personally, I would like to see more testing for a change such as this given the age of the code:

  • Please do not modify the existing test, you can either create a new test(s) or add tests to the existing test class
  • We should capture Gzip files with these headers set from other tools and store the Gzip in an array within the test which can then be written to disk so the tests can validate interoperability. Please see some of the other Zip tests for an example
  • We should have tests that include some , but not all of the additional fields as I believe they are all optional according to the RFC.
  • Please include some negative tests

I have also include some additional comments within the code

* | XLEN |...XLEN bytes of "extra field"...|
* +---+---+=================================+
*/
int xlen = extraFieldBytes.length;

This comment has been minimized.

@LanceAndersen

LanceAndersen Mar 22, 2021
Contributor

On a quick look at the RFC, I noticed the following:


2.3.1.1. Extra field

     If the FLG.FEXTRA bit is set, an "extra field" is present in
     the header, with total length XLEN bytes.  It consists of a
     series of subfields, each of the form:

        +---+---+---+---+==================================+
        |SI1|SI2|  LEN  |... LEN bytes of subfield data ...|
        +---+---+---+---+==================================+

     SI1 and SI2 provide a subfield ID, typically two ASCII letters
     with some mnemonic value.  Jean-Loup Gailly
     <gzip@prep.ai.mit.edu> is maintaining a registry of subfield
     IDs; please send him any subfield ID you wish to use.  Subfield
     IDs with SI2 = 0 are reserved for future use.  The following
     IDs are currently defined:

This does not seem to be accounted for or is it?

This comment has been minimized.

@linzang

linzang Mar 26, 2021
Author Contributor

Yes, This should be considered, I will add logic in the code.
However, I did some search and found that there is only one subfield defined, which is mentioned in RFC:
SI1 SI2 Data


0x41 ('A') 0x70 ('P') Apollo file type information

or SI2 = 0 is reserved.

It seems this subfield definition is no longe maintained. (FYI, https://stackoverflow.com/questions/65188890/what-gzip-extra-field-subfields-exist)

So is it ok if make it only accept these two cases for FLG.FEXTRA and reject others?

@@ -1,5 +1,5 @@
/*
* Copyright (c) 2020, Oracle and/or its affiliates. All rights reserved.
* Copyright (c) 2021, Oracle and/or its affiliates. All rights reserved.
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.

This comment has been minimized.

@LanceAndersen

LanceAndersen Mar 22, 2021
Contributor

The copyright would be 2020, 2021,

* |...original file name, zero-terminated...|
* +=========================================+
*/
out.write(filename);

This comment has been minimized.

@LanceAndersen

LanceAndersen Mar 22, 2021
Contributor

The RFC indicates:

         If FNAME is set, an original file name is present,
        terminated by a zero byte.  The name must consist of ISO
        8859-1 (LATIN-1) characters; on operating systems using
        EBCDIC or any other character set for file names, the name
        must be translated to the ISO LATIN-1 character set.  This
        is the original name of the file being compressed, with any
        directory components removed, and, if the file being
        compressed is on a file system with case insensitive names,
        forced to lower case. There is no original file name if the
        data was compressed from a source other than a named file;
        for example, if the source was stdin on a Unix system, there
        is no file name.

It looks like there is no validation being done so I am not sure what the expectation is.

This comment has been minimized.

@linzang

linzang Mar 26, 2021
Author Contributor

on operating systems using
EBCDIC or any other character set for file names, the name
must be translated to the ISO LATIN-1 character set. This
is the original name of the file being compressed, with any
directory components removed, and, if the file being
compressed is on a file system with case insensitive names,
forced to lower case. There is no original file name if the
data was compressed from a source other than a named file;
for example, if the source was stdin on a Unix system, there
is no file name.

IMHO, this looks much like the rule that user should take care of, I think here we only need to accept a String and decode it with "ISO-8859-1" to byte array then write to the header. Is this reasonable?

* +===================================+
*/
out.write(fileComment);
out.write(0);

This comment has been minimized.

@LanceAndersen

LanceAndersen Mar 22, 2021
Contributor

The RFC states:

        If FCOMMENT is set, a zero-terminated file comment is
        present.  This comment is not interpreted; it is only
        intended for human consumption.  The comment must consist of
        ISO 8859-1 (LATIN-1) characters.  Line breaks should be
        denoted by a single line feed character (10 decimal).

So should the characters be validates as well as line breaks?

This comment has been minimized.

@linzang

linzang Mar 26, 2021
Author Contributor

Same as file name , I think it is the users responsibility to pass the right encoded String as an argument. if constructor is designed to accept a String here, I think it can decode to byte array with ISO 8859-1, and then write to the header.

Moreover,I can't figure out a way to testing the encoding information of the String, do you have any clue about validate it?

* @param out the output stream
* @param generateHeaderCRC
* if {@code true} the header will include the CRC16 of the header.
* @param extraFieldBytes the byte array of extra filed,

This comment has been minimized.

@LanceAndersen

LanceAndersen Mar 22, 2021
Contributor

filed -> field

* if {@code true} the header will include the CRC16 of the header.
* @param extraFieldBytes the byte array of extra filed,
* the generated header would calculate the byte[] size
* and fill it before the byte[] in header.

This comment has been minimized.

@LanceAndersen

LanceAndersen Mar 22, 2021
Contributor

This is not clear what you are trying to articulate here

*
* @param generateHeaderCRC
* if {@code true} the header will include the CRC16 of the header.
* @param extraFieldBytes the byte array of extra filed,

This comment has been minimized.

@LanceAndersen

LanceAndersen Mar 22, 2021
Contributor

filed -> field

int xlen = extraFieldBytes.length;
if (xlen > 0xffff) {
throw new ZipException("extra field size out of range");
}

This comment has been minimized.

@LanceAndersen

LanceAndersen Mar 22, 2021
Contributor

Where is the ZipException documented that is being thrown?

@linzang
Copy link
Contributor Author

@linzang linzang commented Mar 24, 2021

Hi Lance,
Thanks a lot for your review. I will update the PR ASAP.
May I ask your help to also review the CSR? Thanks!

BRs,
Lin

@LanceAndersen
Copy link
Contributor

@LanceAndersen LanceAndersen commented Mar 24, 2021

@linzang
Copy link
Contributor Author

@linzang linzang commented Mar 26, 2021

Dear Lance,

Sorry that I reply so late, I got stuck at some work recently.

  • We should have tests that include some , but not all of the additional fields as I believe they are all optional according to the RFC.

I see Joe's comments in the CSR about the encode of the byte array of String-alike field such file name, I checked that the RFC has description it is "ISO8859-1". So do you think it is ok to change the argument type to String and add the encoding decription in the documented comments?

And Joe also suggested to make all optional header field as a class (I propose to name it "gzipHeaderMetaRecord" ). And then the constructor could accept it and process related flag and fields. Moreover it seems this class cloud also be used in GzipInputStream,Do you think it is ok to also change it here? Thanks!

BRs,
Lin

@AlanBateman
Copy link
Contributor

@AlanBateman AlanBateman commented Mar 27, 2021

GZIPInputStream/GZIPOutputStream are JDK 1.1 era classes that aren't well specified and I think we'll have to do some improvements as part of extending the support. For example, the GZIPOutputStream constructors (existing and proposed) do not specify that they write the member header.

As regards the new constructors then I think new parameters will need to be clearly specified and/or linked to their description in the RFC. The types of some of the parameters may need to be re-examined, maybe the file name and comment should be provided as Strings (that will help with the discussion about the encoding to 8859_1). I think the javadoc will need to make it clear on what flags/values are allowed and the behavior when other values are used. It might be that the class will need enums and other classes to provide a better API.

@mlbridge
Copy link

@mlbridge mlbridge bot commented Mar 28, 2021

Mailing list message from Lance Andersen on core-libs-dev:

On Mar 27, 2021, at 12:44 PM, Alan Bateman <alanb at openjdk.java.net<mailto:alanb at openjdk.java.net>> wrote:

Dear Lance?

Sorry that I reply so late, I got stuck at some work recently.

* We should have tests that include some , but not all of the additional fields as I believe they are all optional according to the RFC.

I see Joe's comments in the CSR about the encode of the byte array of String-alike field such file name, I checked that the RFC has description it is "ISO8859-1". So do you think it is ok to change the argument type to String and add the encoding decription in the documented comments?

And Joe also suggested to make all optional header field as a class (I propose to name it "gzipHeaderMetaRecord" ). And then the constructor could accept it and process related flag and fields. Moreover it seems this class cloud also be used in GzipInputStream?Do you think it is ok to also change it here? Thanks!

BRs,
Lin

GZIPInputStream/GZIPOutputStream are JDK 1.1 era classes that aren't well specified and I think we'll have to do some improvements as part of extending the support. For example, the GZIPOutputStream constructors (existing and proposed) do not specify that they write the member header.

That is a reasonable suggestion to indicate what is and is not done.

As regards the new constructors then I think new parameters will need to be clearly specified and/or linked to their description in the RFC. The types of some of the parameters may need to be re-examined, maybe the file name and comment should be provided as Strings (that will help with the discussion about the encoding to 8859_1).

Agree and that will also align it with ZipEntry(String), Zipentry::setComment

I think the javadoc will need to make it clear on what flags/values are allowed and the behavior when other values are used.

The Gzip spec is kind of vague in this area for example with the SI1 and SI2 for the Extra Field only one value set is specified but I can imagine additional values being somewhere.

It might be that the class will need enums and other classes to provide a better API.

I am not sure how much if at all these extra header fields are used. Certainly if the API now sets them, we should probably provide a way to access them as well.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3072

[cid:E1C4E2F0-ECD0-4C9D-ADB4-B16CA7BCB7FC at home]

Lance Andersen| Principal Member of Technical Staff | +1.781.442.2037
Oracle Java Engineering
1 Network Drive
Burlington, MA 01803
Lance.Andersen at oracle.com<mailto:Lance.Andersen at oracle.com>

@linzang linzang changed the title 4890732: GZIPOutputStream doesn't support, in fact thwarts, use of optional GZIP fields 4890732: GZIPOutputStream doesn't support optional GZIP fields Mar 30, 2021
@openjdk openjdk bot added the rfr label Apr 15, 2021
@linzang
Copy link
Contributor Author

@linzang linzang commented Apr 26, 2021

Dear @AlanBateman @LanceAndersen,
May I ask your help to review whether the usage of Record and Builder pattern is reasonable in the PR? Thanks

BRs,
Lin

@mlbridge
Copy link

@mlbridge mlbridge bot commented Apr 26, 2021

Mailing list message from Lance Andersen on core-libs-dev:

Hi Lin,

Sorry for not replying earlier, I thought I had.

I believe we should still flush out the API proposal on the CoreLibs alias before continuing to move forward with updates to the PR (as was suggested by both Alan and I)

For example, the updates to the PR does not include any proposed changes to GZIPInputStream and this should be something we should come to an agreement on as it can possibly impact the direction. I am not sure we need to add multiple constructors to GZIPOutputStream as part of the proposed change.

It would also be useful to know where is the actual pain point, that is, is there a tool or API not having these fields settable for that is causing an issue? I ask so that we can make sure that we are taking that into consideration.

Please note, that I am not trying to discourage your contribution or work to date, I just want to make sure we get agreement on the way forward as it not only impact the PR, but the CSR which will be needed as well.

Best
Lance

On Apr 26, 2021, at 7:40 AM, Lin Zang <lzang at openjdk.java.net<mailto:lzang at openjdk.java.net>> wrote:

On Thu, 8 Apr 2021 08:54:06 GMT, Alan Bateman <alanb at openjdk.org<mailto:alanb at openjdk.org>> wrote:

Dear All,
May I ask your help to review this change? Thanks!

BRs,
Lin

Dear All,
May I ask your help to review this change? Thanks!

@LanceAndersen Do you have cycles to help Lin? This proposal will require discussion, they may be case for the header to be a record for example. My personal view is that the PR should be set aside until there is at least at least some agreement on the API.

Dear @AlanBateman @LanceAndersen,
May I ask your help to review whether the usage of Record and Builder pattern is reasonable in the PR? Thanks

BRs,
Lin

-------------

PR: https://git.openjdk.java.net/jdk/pull/3072

[cid:E1C4E2F0-ECD0-4C9D-ADB4-B16CA7BCB7FC at home]

Lance Andersen| Principal Member of Technical Staff | +1.781.442.2037
Oracle Java Engineering
1 Network Drive
Burlington, MA 01803
Lance.Andersen at oracle.com<mailto:Lance.Andersen at oracle.com>

1 similar comment
@mlbridge
Copy link

@mlbridge mlbridge bot commented Apr 26, 2021

Mailing list message from Lance Andersen on core-libs-dev:

Hi Lin,

Sorry for not replying earlier, I thought I had.

I believe we should still flush out the API proposal on the CoreLibs alias before continuing to move forward with updates to the PR (as was suggested by both Alan and I)

For example, the updates to the PR does not include any proposed changes to GZIPInputStream and this should be something we should come to an agreement on as it can possibly impact the direction. I am not sure we need to add multiple constructors to GZIPOutputStream as part of the proposed change.

It would also be useful to know where is the actual pain point, that is, is there a tool or API not having these fields settable for that is causing an issue? I ask so that we can make sure that we are taking that into consideration.

Please note, that I am not trying to discourage your contribution or work to date, I just want to make sure we get agreement on the way forward as it not only impact the PR, but the CSR which will be needed as well.

Best
Lance

On Apr 26, 2021, at 7:40 AM, Lin Zang <lzang at openjdk.java.net<mailto:lzang at openjdk.java.net>> wrote:

On Thu, 8 Apr 2021 08:54:06 GMT, Alan Bateman <alanb at openjdk.org<mailto:alanb at openjdk.org>> wrote:

Dear All,
May I ask your help to review this change? Thanks!

BRs,
Lin

Dear All,
May I ask your help to review this change? Thanks!

@LanceAndersen Do you have cycles to help Lin? This proposal will require discussion, they may be case for the header to be a record for example. My personal view is that the PR should be set aside until there is at least at least some agreement on the API.

Dear @AlanBateman @LanceAndersen,
May I ask your help to review whether the usage of Record and Builder pattern is reasonable in the PR? Thanks

BRs,
Lin

-------------

PR: https://git.openjdk.java.net/jdk/pull/3072

[cid:E1C4E2F0-ECD0-4C9D-ADB4-B16CA7BCB7FC at home]

Lance Andersen| Principal Member of Technical Staff | +1.781.442.2037
Oracle Java Engineering
1 Network Drive
Burlington, MA 01803
Lance.Andersen at oracle.com<mailto:Lance.Andersen at oracle.com>

@linzang
Copy link
Contributor Author

@linzang linzang commented Apr 30, 2021

Dear Lance,
Sorry for late response.

the updates to the PR does not include any proposed changes to GZIPInputStream and this should be something we should come to an agreement on as it can possibly impact the direction. I am not sure we need to add multiple constructors to GZIPOutputStream as part of the proposed change.

With the latest change, the gzip header data could be composed by GZIPHeaderBuilder, But it seems ithere still need a way to write the header data when GZipOutputStream is created.Therefore IMHO at least one more constructor is required which could accept the header data and then write them.

For GZIPInputStream, as there is GZIPHeaderBuilder now, I think there is no need to add any constructor for it. We can use the builder to generate gzip header data when parsing the Gzip header. For example, re-write the readHeader() to generate gzip header data, and provide api that user could access the header data.

It would also be useful to know where is the actual pain point, that is, is there a tool or API not having these fields settable for that is causing an issue? I ask so that we can make sure that we are taking that into consideration.

Frankly, I only experienced the use of extra gzip header field when work on the gziped heap dump file generated by jmap -dump with gz option. It uses the extra field to provide some meta info that maybe used for decompressing. I didn't see the usage of these header info in other place. So I believe that not having these fields settable is OK at present (at least).

The intention that I try to provide this PR is that I think, since the gzip file spec has described the header field, maybe in some scenario user may need a way to access/set it in Java.

Thanks.
Lin

@linzang
Copy link
Contributor Author

@linzang linzang commented May 24, 2021

Dear Lance(@LanceAndersen),

I have updated the CSR at https://bugs.openjdk.java.net/browse/JDK-8263793, which considered changes both for GZIPOutputStream and GZIPInputStream. Would you like to help check whether it looks reasonable?

I also did some investigation recently, and it shows there are rare cases that the optional fields are used in gzip file, and gunzip ingores them when uncompressing.

So I believe the meaning of the PR is just to add the ability of setting/getting gzip file header info that defined in gzip specification. As the use case is rare at present, I am not sure whether it is acceptable to have this PR.

Thanks,
Lin

@openjdk
Copy link

@openjdk openjdk bot commented May 24, 2021

@linzang this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout gzip-field
git fetch https://git.openjdk.java.net/jdk master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push
@openjdk openjdk bot removed the merge-conflict label May 24, 2021
@bridgekeeper
Copy link

@bridgekeeper bridgekeeper bot commented Jun 21, 2021

@linzang This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@bridgekeeper
Copy link

@bridgekeeper bridgekeeper bot commented Jul 19, 2021

@linzang This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the /open pull request command.

@bridgekeeper bridgekeeper bot closed this Jul 19, 2021
@linzang
Copy link
Contributor Author

@linzang linzang commented Jul 26, 2021

/open

@openjdk openjdk bot reopened this Jul 26, 2021
@openjdk
Copy link

@openjdk openjdk bot commented Jul 26, 2021

@linzang @HostUserDetails{id=56812395, username='linzang', fullName='null'} this pull request is now open

@linzang
Copy link
Contributor Author

@linzang linzang commented Jul 26, 2021

Dear All,
I have updated the CSR with the latest change. https://bugs.openjdk.java.net/browse/JDK-8263793 , may I ask your help to review and give comments?
Thanks!

Copy link
Contributor

@LanceAndersen LanceAndersen left a comment

Thank you for reviving the discussion.

I have not gone through the latest update in detail but there are some changes that are needed. Before moving forward with the CSR, I would like to give time for additional feedback on naming and design.

I am not sure the builder names withXXX are the preferred naming pattern.

*
* @author Lin Zang
* @since 18
*

This comment has been minimized.

@LanceAndersen

LanceAndersen Jul 26, 2021
Contributor

The overview section needs more detail and examples of the usage

Please remove the author tag as we have moved away from this tag (you can see who was involved via the file history

}
return this;
}

This comment has been minimized.

@LanceAndersen

LanceAndersen Jul 26, 2021
Contributor

Is this really needed as a public method ?

This comment has been minimized.

@linzang

linzang Aug 2, 2021
Author Contributor

From the spec RFC-1952(https://www.ietf.org/rfc/rfc1952.txt), the FHCRC is an optional field which user can decide whether to use it or not. So here I make it public.

May be we can enable it by default? Let's discuss it and make the decision :)

* @since 18
*/
public GZIPHeaderBuilder withFileComment(String fileComment) {
if (fileComment == null || fileComment.length() == 0) {

This comment has been minimized.

@LanceAndersen

LanceAndersen Jul 26, 2021
Contributor

What happens if the String contains characters outside of ISO_8859_1?

This comment has been minimized.

@linzang

linzang Aug 2, 2021
Author Contributor

good point! I am not sure whether there should be a check for the characters encoding, I didn't find any implementation that does the check (e.g. the gzip.c implementation) . But I prefer the idea of checking here, as the specification has defined the encoding.

@AlanBateman
Copy link
Contributor

@AlanBateman AlanBateman commented Jul 27, 2021

I have not gone through the latest update in detail but there are some changes that are needed. Before moving forward with the CSR, I would like to give time for additional feedback on naming and design.

This proposal will need a few iterations to get to the right API. There are several issues with the proposed GZIPHeaderBuilder, also GZIPHeaderData is mutable (having byte[] as elements in a record is a hazard). I will try to make time in the coming weeks to help.

@linzang
Copy link
Contributor Author

@linzang linzang commented Aug 2, 2021

Dear @AlanBateman and @LanceAndersen,
Sorry for late response, and really appreciated for your help on looking at this PR. I will update the pr based on Lance's comments first. Thanks!

Lin

@openjdk openjdk bot removed the rfr label Aug 2, 2021
@openjdk openjdk bot added the rfr label Aug 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
3 participants