-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8303920: Avoid calling out to python in DataDescriptorSignatureMissing test #12959
Conversation
…ta descriptor signature.
👋 Welcome back eirbjo! A progress list of the required criteria for merging this PR into |
Webrevs
|
* optional signature. | ||
* - ZipInputStream cannot handle the missing signature | ||
* | ||
* descriptor signatures. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello Eirik, I think the summary should no longer have references to python. Since this test was introduced for https://bugs.openjdk.org/browse/JDK-8056934, perhaps we can just change the @summary
of this test definition to say:
@summary Verify the ability to read zip files whose local header data descriptor is missing the optional signature
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I updated the summary according to your suggestion. I left it there to keep the history/context of the test, but I guess that's a bit detailed and is already captured in the bug.
@@ -1,5 +1,5 @@ | |||
/* | |||
* Copyright 2012 Google, Inc. All Rights Reserved. | |||
* Copyright 2012, 2023 Google, Inc. All Rights Reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's my understanding that unless you are doing this change on behalf of Google, you shouldn't be changing that line. Instead, one has to add another line below that, of the form:
Copyright (c) 2023, Oracle and/or its affiliates. All rights reserved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no affiliation to Google. I reverted the change to their copyright line and added an Oracle one.
There is very little code left from Martin here (the Zip64 note in the header, the class name and some whitespace), but let's keep it anyhow.
It's good that this test is being revived and no longer relies on an external tool/program to generate a zip which was triggering the issue in https://bugs.openjdk.org/browse/JDK-8056934. The changes look good to me. In order to verify that without the fix in https://bugs.openjdk.org/browse/JDK-8056934, this test fails (i.e. reproduces the issue), I reverted the fix that was done in that issue ( diff --git a/test/jdk/java/util/zip/DataDescriptorSignatureMissing.java b/test/jdk/java/util/zip/DataDescriptorSignatureMissing.java
index 5efdc59de63..636cecb4851 100644
--- a/test/jdk/java/util/zip/DataDescriptorSignatureMissing.java
+++ b/test/jdk/java/util/zip/DataDescriptorSignatureMissing.java
@@ -39,6 +39,7 @@ import java.nio.charset.StandardCharsets;
import java.util.zip.*;
import static org.testng.Assert.assertEquals;
+import static org.testng.Assert.assertNotNull;
public class DataDescriptorSignatureMissing {
@@ -55,10 +56,12 @@ public class DataDescriptorSignatureMissing {
try (ZipInputStream in = new ZipInputStream(
new ByteArrayInputStream(zip))) {
ZipEntry first = in.getNextEntry();
+ assertNotNull(first, "Zip file is unexpectedly missing first entry");
assertEquals(first.getName(), "first");
assertEquals(in.readAllBytes(), "first".getBytes(StandardCharsets.UTF_8));
ZipEntry second = in.getNextEntry();
+ assertNotNull(second, "Zip file is unexpectedly missing second entry");
assertEquals(second.getName(), "second");
assertEquals(in.readAllBytes(), "second".getBytes(StandardCharsets.UTF_8));
}
so that instead of running into a NullPointerException, the test will (rightly) reproduce and report the missing second entry? |
/contributor add @jaikiran |
@eirbjo Syntax:
User names can only be used for users in the census associated with this repository. For other contributors you need to supply the full name and email address. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This now looks good to me. Thank you for doing these changes. I'll run this test on our CI just to be sure there isn't any obvious issues.
Before integrating, please wait for another review from Lance or others who have more knowledge of this area.
@eirbjo This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 274 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@jaikiran, @LanceAndersen, @irisclark) but any other Committer may sponsor as well. ➡️ To flag this PR as ready for integration with the above commit message, type |
/contributor add jpai |
@eirbjo |
Thanks for taking time to do this thorough review and especially for running the regressing case. Much appreciated! I'll wait for a second review before integrating. |
… fields need to be update to account for the four missing signature bytes.
Since we strip 4 bytes from the first entry's data descriptor, we need to account for this by reducing the second CEN header's LOC offset by 4. Similarly, the END header's CEN offset also needs adjustment. |
@eirbjo This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration! |
@eirbjo This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the |
@eirbjo This pull request is now open |
Reopening this PR. Before being closed for inactivity, this PR was reviewed by @jaikiran, who requested that another reviewer with experience in this area also take a look at it before integration. I think it would be good to have this regression test run automatic again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for trying to move this forward. Please see my comments below.
Another question, is the zip that is generated by this test readable by other zip tools such as info-zip, Apache Common-compress, winzip?
@@ -1,5 +1,6 @@ | |||
/* | |||
* Copyright 2012 Google, Inc. All Rights Reserved. | |||
* Copyright (c) 2023, Oracle and/or its affiliates. All rights reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure the copyright can be updated this way @irisclark, could you provide guidance
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This way of updating the copyright was suggested by @jaikiran in the March 10th comment above. Would be nice to get this clarified, yes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is actually very little left of Martin's code after my rewrite, besides whitespace, some curly braces and a couple of imports. Hardly defensible IP, but then I Am Not a Lawyer :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I exchanged messages with Iris and she is comfortable with the updates to the copyright
* No way to adapt the technique in this test to get a ZIP64 zip file | ||
* without data descriptors was found. | ||
* | ||
* @ignore 8303920 This test has brittle dependencies on an external working python. | ||
* @run testng DataDescriptorSignatureMissing | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we convert this please to use junit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Converted to junit as suggested.
} | ||
/** | ||
* Produce a ZIP file where the first entry has a signature-less data descriptor | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be useful to show the what the internal zip representation of the LOC and CEN looks like to make it clear what a signature-less data descriptor is meant to be for future maintainers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a comment including some structural examples. (I personally feel it maybe ended up a bit excessive)
…background info on (signature-less) data descriptors, for the benefit of future maintainers.
|
Thinking some more about this, I would like to see us keep the Zip generated by python, store it in a byte array (or equivalent) as it also validate that we can still process the zip given this was the original test and the zip is being generated by a 3rd party tool. This would be an additional test along with your proposed enhancements |
Lance, I was finally able to reproduce the original issue using Python 3.4.4. (Which was a challenge to install given its archaic dependencies!) I was able to verfy that the missing signature is the ONLY difference between the input and output files. (Except updated LOC and CEN offsets accounting for the missing bytes). Additionally, I independently removed the signature files from the input file, this produced an output file binary identical to Python's. Given that the one and only difference introduced by the Python script is covered by the test in this PR, I'm not sure I see any additional value in adding a test with the binary test vector produced by Python. I think it will just increase our maintenance costs, without adding any real value or coverage. If you see this differently, that's of course ok. Just let me know and I'll create the test with the encoded binary ZIP (which I have easily available now). Waiting for your guidance, thanks :-) |
If you verified that we are a complete match, I am good with that. Thank you for validating |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are in a good place with the test clean up. Thanks for your efforts here
I can kick of a test run internally next week or perhaps Sunday
@@ -1,5 +1,6 @@ | |||
/* | |||
* Copyright 2012 Google, Inc. All Rights Reserved. | |||
* Copyright (c) 2023, Oracle and/or its affiliates. All rights reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I exchanged messages with Iris and she is comfortable with the updates to the copyright
Thanks for your reviews, Lance and Iris! FWIW, the test ran fine on Github Actions, including on
https://github.com/eirbjo/jdk/actions/runs/6696663322/job/18196803138 |
Disregard this comment, since it confuses this PR with #12991 :-) |
Have an internal mach5 run going and will let you know when it completes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Latest Mach 5 run looks fine
/integrate |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello Eirik, thank you for this change - it looks really good.
Lance and Iris too have approved the PR and Lance has noted that the mach5 run came back fine. I'll go ahead and sponsor this now. /sponsor |
Going to push as commit 07eaea8.
Your commit was automatically rebased without conflicts. |
Please review this PR which brings the DataDescriptorSignatureMissing test back to life.
This test currently calls out to Python to create a test vector ZIP with a Data Descriptor without the recommended but optional signature. The Python dependency has turned out to be very brittle, so the test is currently marked with
@ignore
The PR replaces Python callouts with directly creating the test vector ZIP in the test itself. We can then remove the
@ignore
tag and run this useful test automatically.Progress
Issue
Reviewers
Contributors
<jpai@openjdk.org>
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/12959/head:pull/12959
$ git checkout pull/12959
Update a local copy of the PR:
$ git checkout pull/12959
$ git pull https://git.openjdk.org/jdk.git pull/12959/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 12959
View PR using the GUI difftool:
$ git pr show -t 12959
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/12959.diff
Webrev
Link to Webrev Comment