
8313765: Invalid CEN header (invalid zip64 extra data field size) #15273

Closed

Conversation

LanceAndersen
Contributor

@LanceAndersen LanceAndersen commented Aug 14, 2023

This PR updates the extra field validation added as part of JDK-8302483 to deal with issues seen with 3rd-party tools/libraries where a ZipException may be encountered when opening select APK, ZIP, or JAR files. Please refer to the links provided at the end of the description for more information:

ZipException: Invalid CEN header (invalid zip64 extra data field size)

  1. Extra field includes padding:
----------------#1--------------------
[Central Directory Header]
  0x3374: Signature    : 0x02014b50
  0x3378: Created Zip Spec :    0xa [1.0]
  0x3379: Created OS   :    0x0 [MS-DOS]
  0x337a: VerMadeby    :    0xa [0, 1.0]
  0x337b: VerExtract   :    0xa [1.0]
  0x337c: Flag      :   0x800
  0x337e: Method     :    0x0 [STORED]
  0x3380: Last Mod Time  : 0x385ca437 [Thu Feb 28 20:33:46 EST 2008]
  0x3384: CRC       : 0x694c6952
  0x3388: Compressed Size :   0x624
  0x338c: Uncompressed Size:   0x624
  0x3390: Name Length   :   0x1b
  0x3392: Extra Length  :    0x7
		[tag=0xcafe, sz=0, data= ]
				->[tag=cafe, size=0]
  0x3394: Comment Length :    0x0
  0x3396: Disk Start   :    0x0
  0x3398: Attrs      :    0x0
  0x339a: AttrsEx     :    0x0
  0x339e: Loc Header Offset:    0x0
  0x33a2: File Name    : res/drawable/size_48x48.jpg

The extra field tag 0xcafe has its data size set to 0, while the extra length is 7. It is expected that the tag's data size can be used to traverse the extra fields.
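That traversal can be sketched as follows (a hypothetical, self-contained walker, not the JDK implementation): it reads tag/size pairs and tolerates trailing bytes shorter than a four-byte block header, which covers the seven-byte field above (a 4-byte header with sz=0 followed by 3 bytes of padding).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class ExtraFieldWalker {
    /**
     * Walks the tag/size pairs in a ZIP extra field and returns the number
     * of complete blocks found. Trailing bytes shorter than a 4-byte block
     * header (e.g. alignment padding) are tolerated and skipped.
     */
    static int countBlocks(byte[] extra) {
        ByteBuffer buf = ByteBuffer.wrap(extra).order(ByteOrder.LITTLE_ENDIAN);
        int blocks = 0;
        while (buf.remaining() >= 4) {            // need room for tag + size
            int tag = buf.getShort() & 0xffff;
            int sz  = buf.getShort() & 0xffff;
            if (sz > buf.remaining()) {
                // the block claims more data than the extra field holds
                // (the BND case below); a validator would reject this
                throw new IllegalArgumentException(
                    "block size " + sz + " exceeds remaining " + buf.remaining());
            }
            buf.position(buf.position() + sz);    // skip the block's data
            blocks++;
        }
        return blocks;
    }

    public static void main(String[] args) {
        // The padded case above: extra length 7, tag=0xcafe (little-endian
        // bytes fe ca), sz=0, then 3 bytes of padding
        byte[] padded = { (byte) 0xfe, (byte) 0xca, 0, 0, 0, 0, 0 };
        System.out.println(countBlocks(padded)); // prints 1; padding is ignored
    }
}
```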

  2. The BND tool added problematic data to the extra field:
----------------#359--------------------
[Central Directory Header]
   0x600b4: Signature        : 0x02014b50
   0x600b8: Created Zip Spec :       0x14 [2.0]
   0x600b9: Created OS       :        0x0 [MS-DOS]
   0x600ba: VerMadeby        :       0x14 [0, 2.0]
   0x600bb: VerExtract       :       0x14 [2.0]
   0x600bc: Flag             :      0x808
   0x600be: Method           :        0x8 [DEFLATED]
   0x600c0: Last Mod Time    : 0x2e418983 [Sat Feb 01 17:12:06 EST 2003]
   0x600c4: CRC              : 0xd8f689cb
   0x600c8: Compressed Size  :      0x23e
   0x600cc: Uncompressed Size:      0x392
   0x600d0: Name Length      :       0x20
   0x600d2: Extra Length     :        0x8
		[tag=0xbfef, sz=61373, data=        
  0x600d4: Comment Length   :        0x0
   0x600d6: Disk Start       :        0x0
   0x600d8: Attrs            :        0x0
   0x600da: AttrsEx          :        0x0
   0x600de: Loc Header Offset:    0x4f2fe
   0x600e2: File Name        : net/n3/nanoxml/CDATAReader.class

In the above example, the extra length is 0x8 but the tag's data size is 61373, which exceeds the extra length.

zip -T would also report an error:

zip -T foo.jar
net/n3/nanoxml/CDATAReader.class bad extra-field entry:
EF block length (61373 bytes) exceeds remaining EF data (4 bytes)
test of foo.jar FAILED

  3. Some releases of Ant and commons-compress create CEN Zip64 extra headers with a size of 0 when Zip64 mode is required:
----------------#63--------------------
[Central Directory Header]
  0x2fded9: Signature        : 0x02014b50
  0x2fdedd: Created Zip Spec :       0x2d [4.5]
  0x2fdede: Created OS       :        0x3 [UNIX]
  0x2fdedf: VerMadeby        :      0x32d [3, 4.5]
  0x2fdee0: VerExtract       :       0x2d [4.5]
  0x2fdee1: Flag             :      0x800
  0x2fdee3: Method           :        0x8 [DEFLATED]
  0x2fdee5: Last Mod Time    : 0x43516617 [Thu Oct 17 12:48:46 EDT 2013]
  0x2fdee9: CRC              :        0x0
  0x2fdeed: Compressed Size  :        0x2
  0x2fdef1: Uncompressed Size:        0x0
  0x2fdef5: Name Length      :       0x22
  0x2fdef7: Extra Length     :        0x4
       [tag=0x0001, sz=0, data= ]
         ->ZIP64: 
  0x2fdef9: Comment Length   :        0x0
  0x2fdefb: Disk Start       :        0x0
  0x2fdefd: Attrs            :        0x0
  0x2fdeff: AttrsEx          : 0x81a40000
  0x2fdf03: Loc Header Offset:     0x1440
  0x2fdf07: File Name        : .xdk_version_12.1.0.2.0_production

[Local File Header]
    0x1440: Signature   :   0x04034b50
    0x1444: Version     :         0x2d    [4.5]
    0x1446: Flag        :        0x800
    0x1448: Method      :          0x8    [DEFLATED]
    0x144a: LastMTime   :   0x43516617    [Thu Oct 17 12:48:46 EDT 2013]
    0x144e: CRC         :          0x0
    0x1452: CSize       :   0xffffffff
    0x1456: Size        :   0xffffffff
    0x145a: Name Length :         0x22    [.xdk_version_12.1.0.2.0_production]
    0x145c: ExtraLength :         0x14
       [tag=0x0001, sz=16, data= 00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00 ]
           ->ZIP64: size *0x0 csize *0x2 *0x2d04034b500003 
    0x145e: File Name  : [.xdk_version_12.1.0.2.0_production]

Notice that the CEN extra length for this tag differs from the one in the LOC.

When validating the Zip64 extra fields, we were not expecting the data size to be 0.
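The rule being applied for this case can be sketched like this (a hypothetical helper with made-up names, not the actual patch): a zero-length Zip64 block is tolerated only when none of the CEN fields actually deferred their value to it, i.e. none carry the 0xFFFFFFFF marker.

```java
public class Zip64ZeroBlockCheck {
    // CEN fields set to this marker defer their real value to the Zip64 block
    static final long ZIP64_MAGIC = 0xFFFFFFFFL;

    /**
     * A zero-length Zip64 extra block is acceptable only when none of the
     * compressed size, uncompressed size, or LOC offset fields carry the
     * 0xFFFFFFFF marker; otherwise a real value is missing and the header
     * is invalid.
     */
    static boolean zeroSizeBlockOk(long csize, long size, long locoff) {
        return csize != ZIP64_MAGIC && size != ZIP64_MAGIC && locoff != ZIP64_MAGIC;
    }

    public static void main(String[] args) {
        // The Ant case above: csize=0x2, size=0x0, locoff=0x1440, sz=0 -> accepted
        System.out.println(zeroSizeBlockOk(0x2, 0x0, 0x1440));         // true
        // A CEN record that actually needs a Zip64 value -> must be rejected
        System.out.println(zeroSizeBlockOk(ZIP64_MAGIC, 0x0, 0x1440)); // false
    }
}
```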

Mach5 tiers 1-6 and the relevant JCK tests continue to pass with the above changes.

The following 3rd party tools have (or have pending) fixes to address the issues highlighted above:


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8313765: Invalid CEN header (invalid zip64 extra data field size) (Bug - P2)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/15273/head:pull/15273
$ git checkout pull/15273

Update a local copy of the PR:
$ git checkout pull/15273
$ git pull https://git.openjdk.org/jdk.git pull/15273/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 15273

View PR using the GUI difftool:
$ git pr show -t 15273

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/15273.diff

Webrev

Link to Webrev Comment

@LanceAndersen LanceAndersen changed the title Fix for JDK-8313765 8313765: Invalid CEN header (invalid zip64 extra data field size) Aug 14, 2023
@bridgekeeper

bridgekeeper bot commented Aug 14, 2023

👋 Welcome back lancea! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Aug 14, 2023

@LanceAndersen The following labels will be automatically applied to this pull request:

  • core-libs
  • nio

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added nio nio-dev@openjdk.org core-libs core-libs-dev@openjdk.org labels Aug 14, 2023
@LanceAndersen LanceAndersen marked this pull request as ready for review August 14, 2023 15:52
@openjdk openjdk bot added the rfr Pull request is ready for review label Aug 14, 2023
@mlbridge

mlbridge bot commented Aug 14, 2023

@shipilev
Member

Please merge from master to get clean GHA runs.

@AlanBateman
Copy link
Contributor

It's unfortunate that there are tools and plugins in the ecosystem that have these issues. I think you've got the right balance here, meaning tolerating a zip64 extra block with a block size of 0 and rejecting corrupted extra blocks added by older versions of the BND plugin.

Member

@simonis simonis left a comment


Hi Lance,
In general it looks good, but I have some suggestions that I think could slightly improve the patch.

@@ -1342,14 +1361,15 @@ private static boolean isZip64ExtBlockSizeValid(int blockSize) {
/*
* As the fields must appear in order, the block size indicates which
* fields to expect:
* 0 - May be written out by Ant and Apache Commons Compress Library
Member

I don't like that isZip64ExtBlockSizeValid() still accepts 0 as valid input. I think we should fully handle the zero case in checkZip64ExtraFieldValues() (also see my comments there).

Contributor Author

Hi Volker,

I understand your point, and I had done that previously, but I decided I did not like the flow of the code that way, which is why I moved the check. I prefer to leave it as is.

Member

I don't think this is a question of "taste" because isZip64ExtBlockSizeValid() suggests that the method will check for valid sizes and to my understanding 0 is not a valid input. This method might also be called from other places in the future which do not handle the zero case appropriately.

In any case, I'm ready to accept this as a case of "Disagree and Commit" :) but in that case please update at least the comment below to something like "..Note we do not need to check blockSize is >= 8 as we know its length is at least 8 by now" because "..from the call to isZip64ExtBlockSizeValid()" isn't true any more.

Contributor

I think I agree with Volker that it would be better if isZip64ExtBlockSizeValid continued to return false for block size 0.

Contributor Author

OK, I have made the suggested change that you both prefer.

Thank you for your input

Contributor

I'm also happy to see isZip64ExtBlockSizeValid rejecting 0. This logic could be useful when implementing support for valid Zip64 fields for small entries in ZipInputStream, like #12524 attempted to do. (The PR was closed by the bots in the end).

I guess this method could be moved to ZipUtils if JDK-8303866 is ever implemented.

@@ -1307,6 +1317,15 @@ private void checkZip64ExtraFieldValues(int off, int blockSize, long csize,
if (!isZip64ExtBlockSizeValid(blockSize)) {
zerror("Invalid CEN header (invalid zip64 extra data field size)");
}
// if ZIP64_EXTID blocksize == 0, validate csize and size
Member

If you put this block in front of the call to isZip64ExtBlockSizeValid() we don't have to handle the blockSize == 0 case in isZip64ExtBlockSizeValid().

This will also make the following comment true again:

            // Note we do not need to check blockSize is >= 8 as
            // we know its length is at least 8 from the call to
            // isZip64ExtBlockSizeValid()

}
switch (tag) {
case EXTID_ZIP64 :
// Check to see if we have a valid block size
if (!isZip64ExtBlockSizeValid(sz)) {
throw new ZipException("Invalid CEN header (invalid zip64 extra data field size)");
}
// if ZIP64_EXTID blocksize == 0, validate csize, size and
Member

Same here. Just put this block before the call to isZip64ExtBlockSizeValid(), then you don't have to handle the sz == 0 case there.

* 8 - uncompressed size
* 16 - uncompressed size, compressed size
* 24 - uncompressed size, compressed size, LOC Header offset
* 28 - uncompressed size, compressed size, LOC Header offset,
* and Disk start number
*/
return switch(blockSize) {
case 8, 16, 24, 28 -> true;
case 0, 8, 16, 24, 28 -> true;
Member

Don't need to handle the zero case here if you rearrange the code in readExtra() as suggested above.

@mrserb
Member

mrserb commented Aug 14, 2023

It's unfortunate that there are tools and plugins in the ecosystem that have these issues. I think you've got the right balance here, meaning tolerating a zip64 extra block with a block size of 0 and rejecting corrupted extra blocks added by older versions of the BND plugin.

I think I already asked this question, but it disappeared in the latest PR: why does our code assume that the extended block is limited to certain sizes, like 8, 16, 24, 28? There are no such limitations in the zip specification:
https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT

     4.5.3 -Zip64 Extended Information Extra Field (0x0001):

      The following is the layout of the zip64 extended 
      information "extra" block. If one of the size or
      offset fields in the Local or Central directory
      record is too small to hold the required data,
      a Zip64 extended information record is created.
      The order of the fields in the zip64 extended 
      information record is fixed, but the fields MUST
      only appear if the corresponding Local or Central
      directory record field is set to 0xFFFF or 0xFFFFFFFF.

      Note: all fields stored in Intel low-byte/high-byte order.

        Value      Size       Description
        -----      ----       -----------
(ZIP64) 0x0001     2 bytes    Tag for this "extra" block type
        Size       2 bytes    Size of this "extra" block
        Original 
        Size       8 bytes    Original uncompressed file size
        Compressed
        Size       8 bytes    Size of compressed data
        Relative Header
        Offset     8 bytes    Offset of local header record
        Disk Start
        Number     4 bytes    Number of the disk on which
                              this file starts 

      This entry in the Local header MUST include BOTH original
      and compressed file size fields. If encrypting the 

It probably comes from the Wiki page: https://en.wikipedia.org/wiki/ZIP_(file_format) but it is not a spec.

Note that the spec also says an extended block should be created at least in this case:

     " size or
      offset fields in the Local or Central directory
      record is too small to hold the required data,
      a Zip64 extended information record is created."

It does not say that the block cannot be empty or have any other size if all fields in the body of the zip file are correct/valid.

For example, take a look at the code in ZipEntry where we accept any size of that block and just check that it has the required data in it.

throw new ZipException("Invalid CEN header (invalid zip64 extra data field size)");
}
break;
}
if (size == ZIP64_MINVAL) {
Member

Note that we always increase "pos" only in case of "_MINVAL". If the values of size and csize are correct/valid in the "body" of the zip file and only locoff is negative then we should skip two fields in the extra block and read the third one. Otherwise, we may read some random values and throw an exception.

Contributor Author

I am not sure I completely understand your question.

How ZipFS::readExtra navigates these fields has not changed.

If you have a tool that creates a zip/jar that demonstrates an issue that might need further examination, please provide a test case and the tool that created the zip/jar in question, and open a new bug.

Member

JDK-8302483 changed this code to throw an exception, which is why I am looking into it.
You can compare the code in this file with the same code in ZipFile in the checkZip64ExtraFieldValues method, or the code in ZipEntry#setExtra0, where we do not increase the "off" but instead check for "off + 8" or "off + 16". So if we need to read only the third field we should read "pos + 16", but with the current implementation we will read it at "pos + 0", since pos was not bumped by the code for the two other fields.
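The offset arithmetic being described can be illustrated with a small sketch (a hypothetical helper, not JDK code): the position of the LOC-offset field inside a Zip64 extra block depends on which of the two preceding fields are actually present.

```java
public class Zip64FieldOffset {
    static final long ZIP64_MAGIC = 0xFFFFFFFFL;

    /**
     * Offset of the LOC-header-offset field within a Zip64 extra block's
     * data. The uncompressed-size and compressed-size fields each occupy
     * 8 bytes, but only when their CEN counterpart carries the 0xFFFFFFFF
     * marker; skipping a fixed 16 bytes would be wrong when they are absent.
     */
    static int locOffFieldOffset(long size, long csize) {
        int off = 0;
        if (size == ZIP64_MAGIC)  off += 8;  // "Original Size" field present
        if (csize == ZIP64_MAGIC) off += 8;  // "Compressed Size" field present
        return off;
    }

    public static void main(String[] args) {
        System.out.println(locOffFieldOffset(ZIP64_MAGIC, ZIP64_MAGIC)); // 16: both present
        System.out.println(locOffFieldOffset(5, 3));                     // 0: only locoff deferred
    }
}
```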

fmt.format("%n };%n");
return sb.toString();
}
}
Member

No newline at end of the file.

@LanceAndersen
Contributor Author

It's unfortunate that there are tools and plugins in the ecosystem that have these issues. I think you've got the right balance here, meaning tolerating a zip64 extra block with a block size of 0 and rejecting corrupted extra blocks added by older versions of the BND plugin.

I think I already asked this question, but it disappeared in the latest PR: why does our code assume that the extended block is limited to certain sizes, like 8, 16, 24, 28? There are no such limitations in the zip specification: https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT

     4.5.3 -Zip64 Extended Information Extra Field (0x0001):

      The following is the layout of the zip64 extended 
      information "extra" block. If one of the size or
      offset fields in the Local or Central directory
      record is too small to hold the required data,
      a Zip64 extended information record is created.
      The order of the fields in the zip64 extended 
      information record is fixed, but the fields MUST
      only appear if the corresponding Local or Central
      directory record field is set to 0xFFFF or 0xFFFFFFFF.

      Note: all fields stored in Intel low-byte/high-byte order.

        Value      Size       Description
        -----      ----       -----------
(ZIP64) 0x0001     2 bytes    Tag for this "extra" block type
        Size       2 bytes    Size of this "extra" block
        Original 
        Size       8 bytes    Original uncompressed file size
        Compressed
        Size       8 bytes    Size of compressed data
        Relative Header
        Offset     8 bytes    Offset of local header record
        Disk Start
        Number     4 bytes    Number of the disk on which
                              this file starts 

      This entry in the Local header MUST include BOTH original
      and compressed file size fields. If encrypting the 

It probably comes from the Wiki page: https://en.wikipedia.org/wiki/ZIP_(file_format) but it is not a spec.

Note the spec also says that an extended block should be created at least in this case

     " size or
      offset fields in the Local or Central directory
      record is too small to hold the required data,
      a Zip64 extended information record is created."

It does not say that the block cannot be empty or have any other size if all fields in the body of the zip file are correct/valid.

I am not understanding your point. There is a specific order for the Zip64 fields based on which fields have the magic value. The spec also does not suggest that an empty Zip64 extra field can be written to the CEN when there is a Zip64 field with data written to the LOC.

If you have a zip which demonstrates an issue not addressed, please provide a test case along with the tool that created the zip, and it can be looked at.

@mrserb
Member

mrserb commented Aug 14, 2023

I am not understanding your point. There is a specific order for the Zip64 fields based on which fields have the magic value. The spec also does not suggest that an empty Zip64 extra field can be written to the CEN when there is a Zip64 field with data written to the LOC.

Yes, there is a specific order of fields that should be stored in the extended block if some of the data in the "body" is negative. But as you pointed out, in this case an empty block, or a block bigger than necessary to store the size/csize/locoff, is not prohibited by the spec. For example, take a look at the code in ZipEntry where we accept any size of that block and just check that it has the required data in it.

If you disagree, then point to the part of the spec which prohibits such sizes.


This is how it is implemented by "unzip":
https://github.com/madler/zlib/blob/04f42ceca40f73e2978b50e93806c2a18c1281fc/contrib/minizip/unzip.c#L1035C68-L1035C76 ; the dataSize is accepted as-is.

@simonis
Member

simonis commented Aug 14, 2023

There's one final thing I want to mention. Your current test cases (i.e. the corrupted zip files in ReadNonStandardExtraHeadersTest) only reproduce the problem with ZipFile but not with ZipFileSystem. That's because of slightly different logic in ZipFileSystem$Entry::readExtra(), due to which the error will only be triggered if the ZIP64_EXTID extra block is followed by another extra block. So we need a zip file with a 0-length ZIP64_EXTID extra block followed by another extra block in order to trigger the issue.

I've therefore attached a zip file (ZeroLengthZIP64EXTID.zip) which was created by Ant 1.10.7 from a directory containing a single, zero-length file and the following build.xml file:

<project name="ZipFiles" default="zip">
    <target name="zip">
        <zip destfile="/tmp/output.zip" zip64Mode="always" createUnicodeExtraFields="always">
            <fileset dir="/tmp/testfiles" includes="**/*" />
        </zip>
    </target>
</project>
  • zip64Mode="always" is required in order to produce a ZIP64_EXTID Extra Field with zero length (which triggers the problem).
  • createUnicodeExtraFields="always" is required in order to produce a second Extra Field after ZIP64_EXTID.

I think it would be good if you could add that as a test case as well.
ZeroLengthZIP64EXTID.zip

@LanceAndersen
Contributor Author

I am not understanding your point. There is a specific order for the Zip64 fields based on which fields have the magic value. The spec also does not suggest that an empty Zip64 extra field can be written to the CEN when there is a Zip64 field with data written to the LOC.

Yes, there is a specific order of fields that should be stored in the extended block if some of the data in the "body" is negative. But as you pointed out, in this case an empty block, or a block bigger than necessary to store the size/csize/locoff, is not prohibited by the spec. For example, take a look at the code in ZipEntry where we accept any size of that block and just check that it has the required data in it.
If you disagree, then point to the part of the spec which prohibits such sizes.

This is how it is implemented by the "unzip" https://github.com/madler/zlib/blob/04f42ceca40f73e2978b50e93806c2a18c1281fc/contrib/minizip/unzip.c#L1035C68-L1035C76 , the dataSize is accepted as is.

4.6.2 Third-party Extra Fields MUST include a Header ID using
the format defined in the section of this document
titled Extensible Data Fields (section 4.5).

The Data Size field indicates the size of the following
data block. Programs can use this value to skip to the
next header block, passing over any data blocks that are
not of interest.

zip -T would also report errors with a BND-modified jar:

zip -T bad.jar

net/n3/nanoxml/CDATAReader.class bad extra-field entry:
EF block length (61373 bytes) exceeds remaining EF data (4 bytes)
test of bad.jar FAILED

zip error: Zip file invalid, could not spawn unzip, or wrong unzip (original files unmodified)

zipdetails would also fail with the above jar

@mrserb
Member

mrserb commented Aug 14, 2023

net/n3/nanoxml/CDATAReader.class bad extra-field entry:
EF block length (61373 bytes) exceeds remaining EF data (4 bytes)
test of bad.jar FAILED
zip error: Zip file invalid, could not spawn unzip, or wrong unzip (original files unmodified)

zipdetails would also fail with the above jar

It seems that the error "EF block length (30837 bytes) exceeds remaining EF data" is caused by the fact that the size was too big for the actual zip file, which I think is a different issue; but you can try to unzip that file and you will get a result without errors. The unzip implementation is linked above.

// Create the Zip file to read
Files.write(VALID_APK, VALID_APK_FILE);
Files.write(VALID_APACHE_COMPRESS_JAR, COMMONS_COMPRESS_JAR);
Files.write(VALID_ANT_JAR, ANT_ZIP64_UNICODE_EXTRA_JAR);
Files.write(VALID_ANT_JAR, ANT_ZIP64_UNICODE_EXTRA_ZIP);
Member

This should probably read VALID_ANT_ZIP instead of VALID_ANT_JAR.

}

/**
* Zip and Jars files to validate we can open
*/
private static Stream<Path> zipFilesToTest() {
return Stream.of(VALID_APK, VALID_APACHE_COMPRESS_JAR);
return Stream.of(VALID_APK, VALID_APACHE_COMPRESS_JAR, VALID_ANT_JAR);
Member

And here you probably want to add VALID_ANT_ZIP in addition to VALID_ANT_JAR.

Contributor Author

Yep, already caught that typo, forgot to save before I committed :-)

@mrserb
Member

mrserb commented Aug 15, 2023

TEST.zip

Try this example: zip -T passes, unzip works fine, but OpenJDK rejects it.

Member

@simonis simonis left a comment

Thanks for doing the additional changes. This looks good to me now.

@openjdk

openjdk bot commented Aug 15, 2023

@LanceAndersen This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8313765: Invalid CEN header (invalid zip64 extra data field size)

Reviewed-by: simonis, alanb, coffeys

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 33 new commits pushed to the master branch:

  • b80001d: 8314209: Wrong @since tag for RandomGenerator::equiDoubles
  • ef6db5c: 8314211: Add NativeLibraryUnload event
  • 49ddb19: 8313760: [REDO] Enhance AES performance
  • d46f0fb: 8313720: C2 SuperWord: wrong result with -XX:+UseVectorCmov -XX:+UseCMoveUnconditionally
  • 38687f1: 8314262: GHA: Cut down cross-compilation sysroots deeper
  • a602624: 8314020: Print instruction blocks in byte units
  • 0b12480: 8314233: C2: assert(assertion_predicate_has_loop_opaque_node(iff)) failed: unexpected
  • e1fdef5: 8314324: "8311557: [JVMCI] deadlock with JVMTI thread suspension" causes various failures
  • 2bd2fae: 4346610: Adding JSeparator to JToolBar "pushes" buttons added after separator to edge
  • 6a15860: 8314163: os::print_hex_dump prints incorrectly for big endian platforms and unit sizes larger than 1
  • ... and 23 more: https://git.openjdk.org/jdk/compare/4b2703ad39f8160264eb30c797824cc93a6b56e2...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Aug 15, 2023
@simonis
Member

simonis commented Aug 15, 2023

Other than that, there are no limitations on the size of the extended block; it could be 0, 20, 100, etc. But it should contain correct data if necessary and should not be larger than the surrounding "chunk".

This seems to be a very "free" interpretation of the specification to me. According to my understanding, the valid sizes of 8, 16, 24 or 28 as described in the Wikipedia article are a direct consequence of the specification, which only allows for a fixed set of entries in the ZIP64 extra field. Even the zero-length case is questionable, because a ZIP64 extra field should only be created if required; however, we have to handle it here for backward compatibility reasons.
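The argument above can be sketched as a size check (a sketch based on the snippets quoted in this thread; the zero case is handled separately before this check, as the review settled on): since the field order is fixed, only cumulative prefixes of the field sizes are legal block sizes.

```java
public class Zip64BlockSizeCheck {
    /**
     * Because the Zip64 fields must appear in a fixed order, the only legal
     * CEN block sizes are the cumulative prefixes:
     *   8  - uncompressed size
     *   16 - uncompressed size, compressed size
     *   24 - uncompressed size, compressed size, LOC header offset
     *   28 - all of the above plus the disk start number
     * Zero-length blocks (the Ant/commons-compress case) are tolerated
     * elsewhere, before this check is reached.
     */
    static boolean isZip64ExtBlockSizeValid(int blockSize) {
        return switch (blockSize) {
            case 8, 16, 24, 28 -> true;
            default -> false;
        };
    }

    public static void main(String[] args) {
        System.out.println(isZip64ExtBlockSizeValid(16)); // true
        System.out.println(isZip64ExtBlockSizeValid(12)); // false: would split a field
    }
}
```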

// size, and locoff to make sure the fields != ZIP64_MAGICVAL
if (sz == 0) {
if ( csize == ZIP64_MINVAL || size == ZIP64_MINVAL ||
locoff == ZIP64_MINVAL) {
Contributor

Minor nit but you can drop the space in "( csize)" and put the third condition on L3099 to make it easier to read.

For the comment, it looks like it is missing a comma after "== 0". Either that or change it to start with "Some older version of Apache Ant and Apache Commons ...".

Contributor Author

Addressed in the latest update. Thank you!

@mrserb
Member

mrserb commented Aug 15, 2023

This seems to be a very "free" interpretation of the specification to me. According to my understanding, the valid sizes of 8, 16, 24 or 28 as described in the Wikipedia article are a direct consequence of the specification.

I have provided a test.zip file above which passes the zip integrity test via "zip -T" and can be unzipped without errors, but is rejected by OpenJDK. That zip was created based on the actual specification, and not on the wiki.

@simonis
Member

simonis commented Aug 15, 2023

I have provided a test.zip file above which passes the zip integrity test via "zip -T" and can be unzipped without errors, but is rejected by OpenJDK. That zip was created based on the actual specification, and not on the wiki.

Did you create that zip file manually, or was it created by a tool, and if by a tool, then which one? I think we must differentiate here between functional compatibility with a tool like "zip", compatibility with the specification, and compatibility with existing zip files and zip files created by common tools.

The latter is important and required in order to avoid regressions (and I think that's exactly what we're fixing with this PR). Compatibility with the specification is great as long as it doesn't collide with the previous point. Behavioral compatibility with a tool like "zip" is the least important in this list, and I think as long as the file in question is not an artifact commonly created by popular tools, it is fine to behave differently for edge cases.

@mrserb
Member

mrserb commented Aug 15, 2023

Did you create that zip file manually, or was it created by a tool, and if by a tool, then which one? I think we must differentiate here between functional compatibility with a tool like "zip", compatibility with a specification, and compatibility with existing zip files and zip files created by common tools.

That was created manually and then repacked by zip.

The latter is important and required in order to avoid regressions (and I think that's exactly what we're fixing with this PR). Compatibility with a specification is great as long as it doesn't collide with the previous point. Behavioral compatibility with a tool like "zip" is the least important in this list, and I think that as long as the file in question is not an artifact commonly created by popular tools, it is fine to behave differently for edge cases.

That file is accepted by zip, by the latest JDK 8u382, and by the JDK 20 GA, but rejected by 20.0.2. That is a regression in the latest updates of JDK 11+, which we are trying to solve here.

@simonis
Member

simonis commented Aug 15, 2023

That was created manually and then repacked by zip.

That file is accepted by zip, by the latest JDK 8u382, and by the JDK 20 GA, but rejected by 20.0.2. That is a regression in the latest updates of JDK 11+, which we are trying to solve here.

In my opinion we should resolve the regression for existing zip files and zip files which are commonly created by popular tools.

As far as I understand, you can manually create "artificial" zip files which can be processed by the zip tool and previous versions of the JDK but not by new ones. As long as these kinds of files aren't automatically generated by common tools, I don't see that as a real regression. I'm not even sure we should fix that at all, because hardly anybody manually creates such zip files, except maybe attackers who intend to break the JDK.

I recommend that we instead fix the real problem as quickly as possible and open a new issue for potential additional improvements if you think that's necessary.

@mrserb
Member

mrserb commented Aug 16, 2023

As far as I understand, you can manually create "artificial" zip files which can be processed by the zip tool and previous versions of the JDK but not by new ones.

It can be processed by the new/latest version of JDK8.

As long as these kinds of files aren't automatically generated by common tools, I don't see that as a real regression.

It is clearly a regression. All of these new checks should be shown to be based on some statement in the specification; otherwise, such checks should be changed or deleted. As of now, the strict size check is based neither on the spec nor on the behavior of the zip command.

@mrserb
Member

mrserb commented Aug 16, 2023

My overall point is that it would be unfortunate if users were able to open some files on Linux/macOS/Windows using the default programs but unable to do so using Java.

Contributor

@AlanBateman AlanBateman left a comment

Latest changes look okay.

@AlanBateman
Contributor

AlanBateman commented Aug 16, 2023

That file is accepted by zip, by the latest JDK 8u382, and by the JDK 20 GA, but rejected by 20.0.2. That is a regression in the latest updates of JDK 11+, which we are trying to solve here.

@mrserb Have you tested your ZIP file with -Djdk.util.zip.disableZip64ExtraFieldValidation=true? That's the system property to disable the additional checking and is the "get out of jail card" for anyone running into issues. As always with changes like this, or other changes that tighten up checking, there is a risk that it will break something, hence the system property to give existing deployments a workaround to continue. In this case, the original change exposed an issue with a number of Apache projects (see the linked bugs in their issue trackers) and a bad bug in the BND tool that was fixed a few years ago. The system property is the temporary workaround until the deployment has versions of the libraries produced with updated versions of these tools, or a JDK update that tolerates a 0 block size.

I think the main lesson in all this is that the JDK doesn't have enough "interop" tests in this area. There are dozens of tools and plugins that generate their own ZIP or JAR files. The addition of the ZIP64 extensions a number of years ago ushered in a lot of interop issues due to different interpretations of the spec. The changes in this PR expand the tests a bit, but I think follow-on work will be required to massively expand the number of sample ZIP and JAR files that the JDK is tested with.
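As a baseline for the kind of interop testing described above, the simplest case is a self-contained round trip through the JDK's own java.util.zip: write a ZIP with ZipOutputStream, then reopen it with ZipFile, whose constructor is where the CEN validation (and any ZipException) would surface. This is only a sketch of the idea, not the actual tests added in this PR:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class ZipRoundTrip {
    // Writes a one-entry ZIP to a temp file and reopens it, returning
    // the entry count (or -1 on I/O failure). A ZIP whose CEN fails the
    // stricter extra-field validation would instead throw ZipException
    // from the ZipFile constructor, unless the JVM was started with
    // -Djdk.util.zip.disableZip64ExtraFieldValidation=true (the
    // temporary workaround mentioned earlier in this thread).
    static int roundTrip() {
        try {
            Path p = Files.createTempFile("sample", ".zip");
            try {
                try (ZipOutputStream zos =
                         new ZipOutputStream(Files.newOutputStream(p))) {
                    zos.putNextEntry(new ZipEntry("hello.txt"));
                    zos.write("hello".getBytes());
                    zos.closeEntry();
                }
                try (ZipFile zf = new ZipFile(p.toFile())) {
                    return zf.size();
                }
            } finally {
                Files.deleteIfExists(p);
            }
        } catch (IOException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println("entries=" + roundTrip());
    }
}
```

Interop tests of the sort proposed above would extend this pattern to ZIP and JAR samples produced by third-party tools rather than by the JDK itself.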

@coffeys
Contributor

coffeys commented Aug 16, 2023

Nice work, Lance. Thanks also for the comprehensive write-up.

@mrserb
Member

mrserb commented Aug 16, 2023

@mrserb Have you tested your ZIP file with -Djdk.util.zip.disableZip64ExtraFieldValidation=true? That's the system property to disable the additional checking and is the "get out of jail card" for anyone running into issues. As always with changes like this, or other changes that tighten up checking, there is a risk that it will break something, hence the system property to give existing deployments a workaround to continue. In this case, the original change exposed an issue with a number of Apache projects (see the linked bugs in their issue trackers) and a bad bug in the BND tool that was fixed a few years ago. The system property is the temporary workaround until the deployment has versions of the libraries produced with updated versions of these tools, or a JDK update that tolerates a 0 block size.

I disagree for a few reasons: using that property will completely disable the patch from the CPU fix, and it will become possible to accept some malicious zip files which may trigger unfortunate behavior. That is not what we would like to recommend. Validation of the negative values is much more important.

  • The bug fixed by BND was clearly a bug: some "random" value was used as the size of the component, unrelated to the size of the chunk or the size of the zip file.
  • The bug we discussed here relates to the size of a block that is properly set. For some reason additional validation was added for it, and it is still not clear where that validation came from: there is no such thing in the spec, nor in the behavior of common tools such as zip/unzip, Windows Explorer, or the macOS Archive Utility, and the file passed the integrity test. So why are these checks enforced so strictly?

@AlanBateman
Contributor

I disagree for a few reasons: using that property will completely disable the patch from the CPU fix, and it will become possible to accept some malicious zip files which may trigger unfortunate behavior. That is not what we would like to recommend. Validation of the negative values is much more important.

Changes that introduce new checks or dial up validation are often risky. The JDK has a long history of introducing such changes with a system property or some other means to temporarily disable the stricter checking, at least when the spec allows it. You may disagree with this long-standing practice, but it is a necessary evil to give a temporary workaround to environments that might need a bit of time to fix something after a JDK upgrade. There is of course risk in that, but I don't think we can get into that discussion here.

As I think has already been said, we can't engage with you in this PR on the reasons why additional checking was added in a security update.

@LanceAndersen
Contributor Author

/integrate

@openjdk

openjdk bot commented Aug 16, 2023

Going to push as commit 13f6450.
Since your change was applied there have been 35 commits pushed to the master branch:

  • 24e896d: 8310275: Bug in assignment operator of ReservedMemoryRegion
  • 1925508: 8314144: gc/g1/ihop/TestIHOPStatic.java fails due to extra concurrent mark with -Xcomp
  • b80001d: 8314209: Wrong @since tag for RandomGenerator::equiDoubles
  • ef6db5c: 8314211: Add NativeLibraryUnload event
  • 49ddb19: 8313760: [REDO] Enhance AES performance
  • d46f0fb: 8313720: C2 SuperWord: wrong result with -XX:+UseVectorCmov -XX:+UseCMoveUnconditionally
  • 38687f1: 8314262: GHA: Cut down cross-compilation sysroots deeper
  • a602624: 8314020: Print instruction blocks in byte units
  • 0b12480: 8314233: C2: assert(assertion_predicate_has_loop_opaque_node(iff)) failed: unexpected
  • e1fdef5: 8314324: "8311557: [JVMCI] deadlock with JVMTI thread suspension" causes various failures
  • ... and 25 more: https://git.openjdk.org/jdk/compare/4b2703ad39f8160264eb30c797824cc93a6b56e2...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Aug 16, 2023
@openjdk openjdk bot closed this Aug 16, 2023
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Aug 16, 2023
@openjdk

openjdk bot commented Aug 16, 2023

@LanceAndersen Pushed as commit 13f6450.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@mrserb
Member

mrserb commented Aug 16, 2023

As I think has already been said, we can't engage with you in this PR on the reasons why additional checking was added in a security update.

I think you are assuming that this check for exact sizes (8/16/24 bytes) is related to the change fixed by the security update; I am pretty sure that assumption is wrong.

Labels
core-libs core-libs-dev@openjdk.org integrated Pull request has been integrated nio nio-dev@openjdk.org
7 participants