Extraction of 5GB (ZIP64) archive created with Go failed #423

Closed
philip-firstorder opened this issue Jun 26, 2019 · 52 comments

@philip-firstorder

commented Jun 26, 2019

To reproduce, zip a file larger than 4GB together with another, smaller file. In my case the resulting archive is 5GB.

ZIP Method: Storage
Archiver: https://golang.org/src/archive/zip/writer.go

Error when unarchiving:
[Image: Screenshot 2019-06-26 at 16 21 15]

I managed to unarchive it from the terminal using the unzip command, and also with The Unarchiver (Mac) and 7zip (Windows), so the archive is not corrupted.

Here are the headers I printed from terminal.

$ zipinfo -v F3.zip

Archive:  F3.zip
There is no zipfile comment.

End-of-central-directory record:
-------------------------------

  Zip archive file size:                5928918633 (0000000161641E69h)
  Actual end-cent-dir record offset:    5928918532 (0000000161641E04h)
  Expected end-cent-dir record offset:  5928918532 (0000000161641E04h)
  (based on the length of the central directory and its expected offset)

  This zipfile constitutes the sole disk of a single-part archive; its
  central directory contains 2 entries.
  The central directory is 270 (000000000000010Eh) bytes long,
  and its (expected) offset in bytes from the beginning of the zipfile
  is 5928918262 (0000000161641CF6h).


Central directory entry #1:
---------------------------

  Archive 5GB.zip

  offset of local header from start of archive:   0
                                                  (0000000000000000h) bytes
  file system or operating system of origin:      MS-DOS, OS/2 or NT FAT
  version of encoding software:                   2.0
  minimum file system compatibility required:     MS-DOS, OS/2 or NT FAT
  minimum software version required to extract:   4.5
  compression method:                             none (stored)
  file security status:                           not encrypted
  extended local header:                          yes
  file last modified on (DOS date/time):          2019 Jun 19 19:34:28
  file last modified on (UT extra field modtime): 2019 Jun 19 19:34:28 local
  file last modified on (UT extra field modtime): 2019 Jun 19 17:34:28 UTC
  32-bit CRC value (hex):                         06691a94
  compressed size:                                5720868503 bytes
  uncompressed size:                              5720868503 bytes
  length of filename:                             15 characters
  length of extra field:                          37 bytes
  length of file comment:                         40 characters
  disk number on which file begins:               disk 1
  apparent file type:                             binary
  non-MSDOS external file attributes:             000000 hex
  MS-DOS file attributes (00 hex):                none

  The central-directory extra field contains:
  - A subfield with ID 0x5455 (universal time) and 5 data bytes.
    The local extra field has UTC/GMT modification time.
  - A subfield with ID 0x0001 (PKWARE 64-bit sizes) and 24 data bytes:
    97 86 fd 54 01 00 00 00 97 86 fd 54 01 00 00 00 00 00 00 00 00 00 00 00.

------------------------- file comment begins ----------------------------
5d0105dcabf05d83dd8014bb/Archive 5GB.zip
-------------------------- file comment ends -----------------------------

Central directory entry #2:
---------------------------

  There are an extra -4 bytes preceding this file.

  DJI 0044.mov

  offset of local header from start of archive:   5720868581
                                                  (0000000154FD86E5h) bytes
  file system or operating system of origin:      MS-DOS, OS/2 or NT FAT
  version of encoding software:                   2.0
  minimum file system compatibility required:     MS-DOS, OS/2 or NT FAT
  minimum software version required to extract:   2.0
  compression method:                             none (stored)
  file security status:                           not encrypted
  extended local header:                          yes
  file last modified on (DOS date/time):          2019 Jun 24 17:04:38
  file last modified on (UT extra field modtime): 2019 Jun 24 17:04:39 local
  file last modified on (UT extra field modtime): 2019 Jun 24 15:04:39 UTC
  32-bit CRC value (hex):                         68b23c95
  compressed size:                                208049614 bytes
  uncompressed size:                              208049614 bytes
  length of filename:                             12 characters
  length of extra field:                          37 bytes
  length of file comment:                         37 characters
  disk number on which file begins:               disk 1
  apparent file type:                             binary
  non-MSDOS external file attributes:             000000 hex
  MS-DOS file attributes (00 hex):                none

  The central-directory extra field contains:
  - A subfield with ID 0x5455 (universal time) and 5 data bytes.
    The local extra field has UTC/GMT modification time.
  - A subfield with ID 0x0001 (PKWARE 64-bit sizes) and 24 data bytes:
    ce 95 66 0c 00 00 00 00 ce 95 66 0c 00 00 00 00 e5 86 fd 54 01 00 00 00.

------------------------- file comment begins ----------------------------
5d013d45409327f33177b621/DJI 0044.mov
-------------------------- file comment ends -----------------------------
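For anyone reading the dump by hand: the 24 data bytes of the 0x0001 subfield can be decoded as three little-endian 64-bit integers. Below is a minimal sketch in Go, assuming the 24-byte layout shown above (uncompressed size, compressed size, local header offset, in the order the ZIP specification lists them). The names `zip64Fields` and `decodeZip64` are made up for this illustration and are not part of any library.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"log"
)

// zip64Fields holds the three values carried by a 24-byte ZIP64 extended
// information subfield (header ID 0x0001). Hypothetical helper type.
type zip64Fields struct {
	UncompressedSize  uint64
	CompressedSize    uint64
	LocalHeaderOffset uint64
}

// decodeZip64 reads the subfield data as three little-endian uint64 values.
// Assumption: all three fields are present (24 bytes), as in the dump above.
// Shorter subfields, where some fields are omitted, are rejected here.
func decodeZip64(data []byte) (zip64Fields, error) {
	if len(data) != 24 {
		return zip64Fields{}, errors.New("expected exactly 24 data bytes")
	}
	return zip64Fields{
		UncompressedSize:  binary.LittleEndian.Uint64(data[0:8]),
		CompressedSize:    binary.LittleEndian.Uint64(data[8:16]),
		LocalHeaderOffset: binary.LittleEndian.Uint64(data[16:24]),
	}, nil
}

func main() {
	// Data bytes copied from central directory entry #1 above.
	entry1 := []byte{
		0x97, 0x86, 0xfd, 0x54, 0x01, 0x00, 0x00, 0x00,
		0x97, 0x86, 0xfd, 0x54, 0x01, 0x00, 0x00, 0x00,
		0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
	}
	f, err := decodeZip64(entry1)
	if err != nil {
		log.Fatal(err)
	}
	// Prints 5720868503 5720868503 0, matching the sizes and offset zipinfo reports.
	fmt.Println(f.UncompressedSize, f.CompressedSize, f.LocalHeaderOffset)
}
```

The same decoding applied to entry #2 gives 208049614 for both sizes and 5720868581 for the offset, so the subfield contents agree with what zipinfo prints. A general parser would also need to handle shorter subfields, since the specification only requires a field to be present when its 32-bit counterpart is set to 0xFFFFFFFF.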

@philip-firstorder changed the title from "ZIP64 archive not opened (5GB)" to "Extraction of 5GB (ZIP64) archive failed" on Jun 26, 2019

@aonez

Owner

commented Jun 27, 2019

First time using Go. Can you provide the script used for the compression?

@aonez self-assigned this on Jun 27, 2019

@aonez

Owner

commented Jun 27, 2019

@philip-firstorder I just need an example that sets the Store method.

@aonez

Owner

commented Jun 27, 2019

Maybe I'm just missing how to call the bundled zip package in Go?

@aonez added this to the Look at milestone on Jun 27, 2019

@aonez added the zip label on Jun 27, 2019

@aonez

Owner

commented Jun 27, 2019

I'll take a look at #422 once I can create Store packages

@philip-firstorder

Author

commented Jun 27, 2019

Just use this: https://golangcode.com/create-zip-files-in-go/
Then change the method from zip.Deflate to zip.Store:
header.Method = zip.Store
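
For reference, here is a rough, self-contained sketch of what writing with the Store method looks like using Go's bundled archive/zip package. It is not the linked tutorial's code: it writes small in-memory payloads instead of files on disk, and `buildStoredArchive` and `entryMethods` are hypothetical helper names. With payloads this small it will not produce ZIP64 records; reproducing the issue still needs a file larger than 4GB.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"log"
)

// buildStoredArchive writes each payload into an in-memory ZIP archive
// using the Store method (no compression) and returns the archive bytes.
func buildStoredArchive(files map[string][]byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for name, data := range files {
		header := &zip.FileHeader{Name: name}
		header.Method = zip.Store // instead of zip.Deflate
		w, err := zw.CreateHeader(header)
		if err != nil {
			return nil, err
		}
		if _, err := w.Write(data); err != nil {
			return nil, err
		}
	}
	// Close writes the central directory; the archive is incomplete without it.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// entryMethods reads the archive back and reports each entry's method ID.
func entryMethods(archive []byte) (map[string]uint16, error) {
	zr, err := zip.NewReader(bytes.NewReader(archive), int64(len(archive)))
	if err != nil {
		return nil, err
	}
	methods := make(map[string]uint16)
	for _, f := range zr.File {
		methods[f.Name] = f.Method
	}
	return methods, nil
}

func main() {
	archive, err := buildStoredArchive(map[string][]byte{
		"hello.txt": []byte("hello, zip"),
	})
	if err != nil {
		log.Fatal(err)
	}
	methods, err := entryMethods(archive)
	if err != nil {
		log.Fatal(err)
	}
	// true when the entry was written without compression.
	fmt.Println(methods["hello.txt"] == zip.Store)
}
```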

@aonez

Owner

commented Jun 27, 2019

Did a quick test and it results in a corrupted file. I'm gonna do some more tests.

@philip-firstorder

Author

commented Jun 27, 2019

If you can unarchive it from the terminal using the unzip command, or with The Unarchiver (Mac) or 7zip (Windows), then the archive is not really corrupted.

@aonez

Owner

commented Jun 27, 2019

My test consists of a big MKV file (~5GB) and another file of arbitrary size, similar to the test you reported.

Both 7-Zip and The Unarchiver warn that there is some corruption. After the warning is dismissed, The Unarchiver extracts the big file correctly and ignores the small one, while 7-Zip extracts both. unzip works and reports no warning or error.

Keka in its default configuration shows an error message, and the resulting files are the same as with The Unarchiver, since it uses unar. If I set it to use p7zip instead (defaults write com.aone.keka UnzipWithUNAR false), it also shows an error, but the result is the same as in 7-Zip: all files are extracted.

So even if the file can be read and extracted, I think either the script or the Go library is creating bad headers.

@aonez changed the title from "Extraction of 5GB (ZIP64) archive failed" to "Extraction of 5GB (ZIP64) archive created with Go failed" on Jun 27, 2019

@philip-firstorder

Author

commented Jun 27, 2019

Thank you for your tests. It's very good that you could reproduce the error.

If the Go library creates bad headers, then we need to find out which ones. Can you compare its headers with those of a valid archive to see the differences?
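
One way to do that comparison without reading zipinfo output side by side is to print the central directory fields with Go's archive/zip reader and diff the lines. This is only a sketch under some assumptions: `headerSummary` and `sampleArchive` are hypothetical helpers, the sample archive is a tiny in-memory stand-in for the real files, and the reader only exposes what the central directory says, not the local headers or data descriptors.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"log"
)

// sampleArchive builds a tiny in-memory archive so the sketch runs without
// files on disk. It stands in for a real archive such as the 5GB one.
func sampleArchive() ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	w, err := zw.CreateHeader(&zip.FileHeader{Name: "a.txt", Method: zip.Store})
	if err != nil {
		return nil, err
	}
	if _, err := w.Write([]byte("hello")); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// headerSummary returns one line per central directory entry, listing the
// fields most likely to differ between archivers.
func headerSummary(archive []byte) ([]string, error) {
	zr, err := zip.NewReader(bytes.NewReader(archive), int64(len(archive)))
	if err != nil {
		return nil, err
	}
	var lines []string
	for _, f := range zr.File {
		lines = append(lines, fmt.Sprintf(
			"%s method=%d flags=%#x creator=%d reader=%d size=%d csize=%d extra=%x",
			f.Name, f.Method, f.Flags, f.CreatorVersion, f.ReaderVersion,
			f.UncompressedSize64, f.CompressedSize64, f.Extra))
	}
	return lines, nil
}

func main() {
	archive, err := sampleArchive()
	if err != nil {
		log.Fatal(err)
	}
	lines, err := headerSummary(archive)
	if err != nil {
		log.Fatal(err)
	}
	for _, line := range lines {
		fmt.Println(line)
	}
}
```

For archives on disk, zip.OpenReader with the file path gives the same File slice, so the loop body can be reused as-is and the output of two archives compared with diff.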

@philip-firstorder

Author

commented Jul 1, 2019

I did more tests, archiving and unarchiving an Archive 5GB.zip file together with a Small image.png. Here are the unarchiving results:

Go.zip: archived with Go

Archive Utility (10.11): Error 2 - No such file or directory => Nothing extracted
BetterZip (4.2.4): Both files successfully extracted in the temp folder, but at the end I get the error "The archive is either damaged or in an unsupported format. The transcript window may contain further details" => After pressing OK on the error dialog, the temp folder is deleted
Keka unar (1.1.17): Error code 1 using "kekaunar" Unknown error => Archive 5GB.zip is successfully extracted, but not Small image.png
Keka p7zip (1.1.17): Error code 2 using "p7zip" Fatal error => Both files successfully extracted in the Go folder
Keka unar (1.1.18): Both files successfully extracted in the Go folder
Keka p7zip (1.1.18): Error code 2 using "p7zip" Fatal error => Both files successfully extracted in the Go folder
Stuffit Expander (16.0.6): No errors => Both files successfully extracted
The Unarchiver (4.1.0): There was a problem while reading the contents of "Go.zip": Data is corrupted => clicked Continue => Archive 5GB.zip is extracted, but not Small image.png
Terminal UnZip (6.00) $ unzip Go.zip: No errors => Both files successfully extracted

Keka.zip: archived with Keka (1.1.17)

Archive Utility (10.11): Error 2 - No such file or directory => Nothing extracted
BetterZip (4.2.4): Both files successfully extracted
Keka unar (1.1.17): No errors => Both files successfully extracted
Keka p7zip (1.1.17): No errors => Both files successfully extracted in the Keka folder
Stuffit Expander (16.0.6): Expansion failed. The structure of the archive is damaged => Nothing extracted
The Unarchiver (4.1.0): There was a problem while reading the contents of "Keka.zip": Data is corrupted => clicked Continue => Archive 5GB.zip is extracted, but not Small image.png
Terminal UnZip (6.00) $ unzip Keka.zip: No errors => Both files successfully extracted

Archive Utility.zip: archived with Finder (macOS 10.14.5)

Archive Utility (10.11): No errors => Both files successfully extracted
BetterZip (4.2.4): The archive is either damaged or in an unsupported format. => Archive 5GB has size on disk only 1.43GB, no file can be extracted
Keka unar (1.1.17): Error code 1 using "kekaunar". Unknown error => Archive 5GB has only 1.42GB extracted, Small image.png is fully extracted in the Archive Utility folder
Keka p7zip (1.1.17): Error code 2 using "p7zip" Fatal error => Nothing extracted
Stuffit Expander (16.0.6): Expansion failed. The structure of the archive is damaged => Nothing extracted
The Unarchiver (4.1.0): Could not extract "Archive 5GB.zip" from the "Archive Utility.zip" archive: The archive file is incomplete => clicked Continue => Archive 5GB has only 1.42GB extracted, Small image.png is fully extracted in the Archive Utility folder
Terminal UnZip (6.00) $ unzip "Archive Utility.zip": See errors below => Archive 5GB has only 1.42GB extracted, Small image.png is fully extracted, empty __MACOSX folder extracted next to the other files in the root

warning [Archive Utility.zip]:  4294967296 extra bytes at beginning or within zipfile
  (attempting to process anyway)
file #1:  bad zipfile offset (local header sig):  4294967296
  (attempting to re-compensate)
  inflating: Archive 5GB.zip         
  error:  invalid compressed data to inflate
file #2:  bad zipfile offset (local header sig):  1417650120
  (attempting to re-compensate)
   creating: __MACOSX/
  inflating: __MACOSX/._Archive 5GB.zip  
  inflating: Small image.png         
  inflating: __MACOSX/._Small image.png  
@philip-firstorder

Author

commented Jul 1, 2019

And here are the directory records for all the files:

$ zipinfo -v Go.zip

Archive:  Go.zip
There is no zipfile comment.

End-of-central-directory record:
-------------------------------

  Zip archive file size:                5721140330 (000000015501AC6Ah)
  Actual end-cent-dir record offset:    5721140229 (000000015501AC05h)
  Expected end-cent-dir record offset:  5721140229 (000000015501AC05h)
  (based on the length of the central directory and its expected offset)

  This zipfile constitutes the sole disk of a single-part archive; its
  central directory contains 2 entries.
  The central directory is 196 (00000000000000C4h) bytes long,
  and its (expected) offset in bytes from the beginning of the zipfile
  is 5721140033 (000000015501AB41h).


Central directory entry #1:
---------------------------

  Archive 5GB.zip

  offset of local header from start of archive:   0
                                                  (0000000000000000h) bytes
  file system or operating system of origin:      MS-DOS, OS/2 or NT FAT
  version of encoding software:                   2.0
  minimum file system compatibility required:     MS-DOS, OS/2 or NT FAT
  minimum software version required to extract:   4.5
  compression method:                             none (stored)
  file security status:                           not encrypted
  extended local header:                          yes
  file last modified on (DOS date/time):          2019 Jun 19 19:34:28
  file last modified on (UT extra field modtime): 2019 Jun 19 19:34:28 local
  file last modified on (UT extra field modtime): 2019 Jun 19 17:34:28 UTC
  32-bit CRC value (hex):                         06691a94
  compressed size:                                5720868503 bytes
  uncompressed size:                              5720868503 bytes
  length of filename:                             15 characters
  length of extra field:                          37 bytes
  length of file comment:                         0 characters
  disk number on which file begins:               disk 1
  apparent file type:                             binary
  non-MSDOS external file attributes:             000000 hex
  MS-DOS file attributes (00 hex):                none

  The central-directory extra field contains:
  - A subfield with ID 0x5455 (universal time) and 5 data bytes.
    The local extra field has UTC/GMT modification time.
  - A subfield with ID 0x0001 (PKWARE 64-bit sizes) and 24 data bytes:
    97 86 fd 54 01 00 00 00 97 86 fd 54 01 00 00 00 00 00 00 00 00 00 00 00.

  There is no file comment.

Central directory entry #2:
---------------------------

  There are an extra -4 bytes preceding this file.

  Small image.png

  offset of local header from start of archive:   5720868581
                                                  (0000000154FD86E5h) bytes
  file system or operating system of origin:      MS-DOS, OS/2 or NT FAT
  version of encoding software:                   2.0
  minimum file system compatibility required:     MS-DOS, OS/2 or NT FAT
  minimum software version required to extract:   2.0
  compression method:                             none (stored)
  file security status:                           not encrypted
  extended local header:                          yes
  file last modified on (DOS date/time):          2019 Jul 1 16:08:48
  file last modified on (UT extra field modtime): 2019 Jul 1 16:08:49 local
  file last modified on (UT extra field modtime): 2019 Jul 1 14:08:49 UTC
  32-bit CRC value (hex):                         2ad431fd
  compressed size:                                271382 bytes
  uncompressed size:                              271382 bytes
  length of filename:                             15 characters
  length of extra field:                          37 bytes
  length of file comment:                         0 characters
  disk number on which file begins:               disk 1
  apparent file type:                             binary
  non-MSDOS external file attributes:             000000 hex
  MS-DOS file attributes (00 hex):                none

  The central-directory extra field contains:
  - A subfield with ID 0x5455 (universal time) and 5 data bytes.
    The local extra field has UTC/GMT modification time.
  - A subfield with ID 0x0001 (PKWARE 64-bit sizes) and 24 data bytes:
    16 24 04 00 00 00 00 00 16 24 04 00 00 00 00 00 e5 86 fd 54 01 00 00 00.

  There is no file comment.

$ zipinfo -v Keka.zip

Archive:  Keka.zip
There is no zipfile comment.

End-of-central-directory record:
-------------------------------

  Zip archive file size:                5721140319 (000000015501AC5Fh)
  Actual end-cent-dir record offset:    5721140221 (000000015501ABFDh)
  Expected end-cent-dir record offset:  5721140221 (000000015501ABFDh)
  (based on the length of the central directory and its expected offset)

  This zipfile constitutes the sole disk of a single-part archive; its
  central directory contains 2 entries.
  The central directory is 226 (00000000000000E2h) bytes long,
  and its (expected) offset in bytes from the beginning of the zipfile
  is 5721139995 (000000015501AB1Bh).


Central directory entry #1:
---------------------------

  Archive 5GB.zip

  offset of local header from start of archive:   0
                                                  (0000000000000000h) bytes
  file system or operating system of origin:      Unix
  version of encoding software:                   6.3
  minimum file system compatibility required:     Unix
  minimum software version required to extract:   4.5
  compression method:                             none (stored)
  file security status:                           not encrypted
  extended local header:                          no
  file last modified on (DOS date/time):          2019 Jun 19 19:34:28
  32-bit CRC value (hex):                         06691a94
  compressed size:                                5720868503 bytes
  uncompressed size:                              5720868503 bytes
  length of filename:                             15 characters
  length of extra field:                          56 bytes
  length of file comment:                         0 characters
  disk number on which file begins:               disk 1
  apparent file type:                             binary
  Unix file attributes (100755 octal):            -rwxr-xr-x
  MS-DOS file attributes (20 hex):                arc 

  The central-directory extra field contains:
  - A subfield with ID 0x0001 (PKWARE 64-bit sizes) and 16 data bytes:
    97 86 fd 54 01 00 00 00 97 86 fd 54 01 00 00 00.
  - A subfield with ID 0x000a (PKWARE Win32) and 32 data bytes.  The first
    20 are:   00 00 00 00 01 00 18 00 00 aa b8 3e c5 26 d5 01 80 31 62 05.

  There is a local extra field with ID 0x5855 (old Info-ZIP Unix/OS2/NT) and
  8 data bytes (GMT modification/access times only).

  There is no file comment.

Central directory entry #2:
---------------------------

  There are an extra -44 bytes preceding this file.

  Small image.png

  offset of local header from start of archive:   5720868568
                                                  (0000000154FD86D8h) bytes
  file system or operating system of origin:      Unix
  version of encoding software:                   6.3
  minimum file system compatibility required:     Unix
  minimum software version required to extract:   4.5
  compression method:                             none (stored)
  file security status:                           not encrypted
  extended local header:                          no
  file last modified on (DOS date/time):          2019 Jul 1 16:09:00
  32-bit CRC value (hex):                         2ad431fd
  compressed size:                                271382 bytes
  uncompressed size:                              271382 bytes
  length of filename:                             15 characters
  length of extra field:                          48 bytes
  length of file comment:                         0 characters
  disk number on which file begins:               disk 1
  apparent file type:                             binary
  Unix file attributes (100644 octal):            -rw-r--r--
  MS-DOS file attributes (20 hex):                arc 

  The central-directory extra field contains:
  - A subfield with ID 0x0001 (PKWARE 64-bit sizes) and 8 data bytes:
    d8 86 fd 54 01 00 00 00.
  - A subfield with ID 0x000a (PKWARE Win32) and 32 data bytes.  The first
    20 are:   00 00 00 00 01 00 18 00 00 26 9e 87 16 30 d5 01 00 da 62 8c.

  There is a local extra field with ID 0x5855 (old Info-ZIP Unix/OS2/NT) and
  8 data bytes (GMT modification/access times only).

  There is no file comment.

$ zipinfo -v "Archive Utility.zip"

Archive:  Archive Utility.zip
There is no zipfile comment.

End-of-central-directory record:
-------------------------------

  Zip archive file size:                5712833727 (000000015482ECBFh)
  Actual end-cent-dir record offset:    5712833705 (000000015482ECA9h)
  Expected end-cent-dir record offset:  1417866409 (000000005482ECA9h)
  (based on the length of the central directory and its expected offset)

  This zipfile constitutes the sole disk of a single-part archive; its
  central directory contains 5 entries.
  The central directory is 381 (000000000000017Dh) bytes long,
  and its (expected) offset in bytes from the beginning of the zipfile
  is 1417866028 (000000005482EB2Ch).

warning [Archive Utility.zip]:  4294967296 extra bytes at beginning or within zipfile
  (attempting to process anyway)

Central directory entry #1:
---------------------------

  Archive 5GB.zip

  offset of local header from start of archive:   0
                                                  (0000000000000000h) bytes
  file system or operating system of origin:      Unix
  version of encoding software:                   2.1
  minimum file system compatibility required:     MS-DOS, OS/2 or NT FAT
  minimum software version required to extract:   2.0
  compression method:                             deflated
  compression sub-type (deflation):               normal
  file security status:                           not encrypted
  extended local header:                          yes
  file last modified on (DOS date/time):          2019 Jun 19 19:34:28
  file last modified on (UT extra field modtime): 2019 Jun 19 19:34:28 local
  file last modified on (UT extra field modtime): 2019 Jun 19 17:34:28 UTC
  32-bit CRC value (hex):                         06691a94
  compressed size:                                1417650043 bytes
  uncompressed size:                              1425901207 bytes
  length of filename:                             15 characters
  length of extra field:                          12 bytes
  length of file comment:                         0 characters
  disk number on which file begins:               disk 1
  apparent file type:                             binary
  Unix file attributes (100755 octal):            -rwxr-xr-x
  MS-DOS file attributes (00 hex):                none

  The central-directory extra field contains:
  - A subfield with ID 0x5855 (old Info-ZIP Unix/OS2/NT) and 8 data bytes:
    a2 17 1a 5d 24 72 0a 5d.

  There is no file comment.

Central directory entry #2:
---------------------------

  There are an extra 16 bytes preceding this file.

  __MACOSX/

  offset of local header from start of archive:   1417650120
                                                  (00000000547F9FC8h) bytes
  file system or operating system of origin:      Unix
  version of encoding software:                   2.1
  minimum file system compatibility required:     MS-DOS, OS/2 or NT FAT
  minimum software version required to extract:   1.0
  compression method:                             none (stored)
  file security status:                           not encrypted
  extended local header:                          no
  file last modified on (DOS date/time):          2019 Jul 1 16:28:22
  file last modified on (UT extra field modtime): 2019 Jul 1 16:28:21 local
  file last modified on (UT extra field modtime): 2019 Jul 1 14:28:21 UTC
  32-bit CRC value (hex):                         00000000
  compressed size:                                0 bytes
  uncompressed size:                              0 bytes
  length of filename:                             9 characters
  length of extra field:                          12 bytes
  length of file comment:                         0 characters
  disk number on which file begins:               disk 1
  apparent file type:                             binary
  Unix file attributes (040775 octal):            drwxrwxr-x
  MS-DOS file attributes (00 hex):                none

  The central-directory extra field contains:
  - A subfield with ID 0x5855 (old Info-ZIP Unix/OS2/NT) and 8 data bytes:
    85 18 1a 5d 85 18 1a 5d.

  There is no file comment.

Central directory entry #3:
---------------------------

  __MACOSX/._Archive 5GB.zip

  offset of local header from start of archive:   1417650175
                                                  (00000000547F9FFFh) bytes
  file system or operating system of origin:      Unix
  version of encoding software:                   2.1
  minimum file system compatibility required:     MS-DOS, OS/2 or NT FAT
  minimum software version required to extract:   2.0
  compression method:                             deflated
  compression sub-type (deflation):               normal
  file security status:                           not encrypted
  extended local header:                          yes
  file last modified on (DOS date/time):          2019 Jun 19 19:34:28
  file last modified on (UT extra field modtime): 2019 Jun 19 19:34:28 local
  file last modified on (UT extra field modtime): 2019 Jun 19 17:34:28 UTC
  32-bit CRC value (hex):                         ef6dc38c
  compressed size:                                137 bytes
  uncompressed size:                              230 bytes
  length of filename:                             26 characters
  length of extra field:                          12 bytes
  length of file comment:                         0 characters
  disk number on which file begins:               disk 1
  apparent file type:                             binary
  Unix file attributes (100644 octal):            -rw-r--r--
  MS-DOS file attributes (00 hex):                none

  The central-directory extra field contains:
  - A subfield with ID 0x5855 (old Info-ZIP Unix/OS2/NT) and 8 data bytes:
    a2 17 1a 5d 24 72 0a 5d.

  There is no file comment.

Central directory entry #4:
---------------------------

  There are an extra 16 bytes preceding this file.

  Small image.png

  offset of local header from start of archive:   1417650400
                                                  (00000000547FA0E0h) bytes
  file system or operating system of origin:      Unix
  version of encoding software:                   2.1
  minimum file system compatibility required:     MS-DOS, OS/2 or NT FAT
  minimum software version required to extract:   2.0
  compression method:                             deflated
  compression sub-type (deflation):               normal
  file security status:                           not encrypted
  extended local header:                          yes
  file last modified on (DOS date/time):          2019 Jul 1 16:09:00
  file last modified on (UT extra field modtime): 2019 Jul 1 16:09:00 local
  file last modified on (UT extra field modtime): 2019 Jul 1 14:09:00 UTC
  32-bit CRC value (hex):                         2ad431fd
  compressed size:                                214941 bytes
  uncompressed size:                              271382 bytes
  length of filename:                             15 characters
  length of extra field:                          12 bytes
  length of file comment:                         0 characters
  disk number on which file begins:               disk 1
  apparent file type:                             binary
  Unix file attributes (100644 octal):            -rw-r--r--
  MS-DOS file attributes (00 hex):                none

  The central-directory extra field contains:
  - A subfield with ID 0x5855 (old Info-ZIP Unix/OS2/NT) and 8 data bytes:
    e9 17 1a 5d fc 13 1a 5d.

  There is no file comment.

Central directory entry #5:
---------------------------

  There are an extra 16 bytes preceding this file.

  __MACOSX/._Small image.png

  offset of local header from start of archive:   1417865418
                                                  (000000005482E8CAh) bytes
  file system or operating system of origin:      Unix
  version of encoding software:                   2.1
  minimum file system compatibility required:     MS-DOS, OS/2 or NT FAT
  minimum software version required to extract:   2.0
  compression method:                             deflated
  compression sub-type (deflation):               normal
  file security status:                           not encrypted
  extended local header:                          yes
  file last modified on (DOS date/time):          2019 Jul 1 16:09:00
  file last modified on (UT extra field modtime): 2019 Jul 1 16:09:00 local
  file last modified on (UT extra field modtime): 2019 Jul 1 14:09:00 UTC
  32-bit CRC value (hex):                         ca1cd7ca
  compressed size:                                522 bytes
  uncompressed size:                              1153 bytes
  length of filename:                             26 characters
  length of extra field:                          12 bytes
  length of file comment:                         0 characters
  disk number on which file begins:               disk 1
  apparent file type:                             binary
  Unix file attributes (100644 octal):            -rw-r--r--
  MS-DOS file attributes (00 hex):                none

  The central-directory extra field contains:
  - A subfield with ID 0x5855 (old Info-ZIP Unix/OS2/NT) and 8 data bytes:
    e9 17 1a 5d fc 13 1a 5d.

  There is no file comment.
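
A note on the numbers in the Archive Utility.zip dump: they are consistent with 64-bit values having been stored in 32-bit fields. That reading is an assumption based only on the arithmetic, not on anything known about how Finder writes the file. The small Go sketch below (with a hypothetical `truncate32` helper) reproduces the figures from the dump.

```go
package main

import "fmt"

// truncate32 returns what is left of a 64-bit value once it has been
// squeezed into a 32-bit field (the low 32 bits).
func truncate32(v uint64) uint64 {
	return v & 0xFFFFFFFF
}

func main() {
	actualEOCD := uint64(5712833705) // "Actual end-cent-dir record offset"
	realSize := uint64(5720868503)   // size of Archive 5GB.zip per the other dumps

	fmt.Println(truncate32(actualEOCD))              // 1417866409, the "expected" offset
	fmt.Println(actualEOCD - truncate32(actualEOCD)) // 4294967296, the "extra bytes" in the warning
	fmt.Println(truncate32(realSize))                // 1425901207, the reported uncompressed size
}
```

The difference of exactly 2^32 also matches the roughly 1.42GB that several extractors produce for Archive 5GB.zip.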
@aonez

Owner

commented Jul 1, 2019

Just to be clear, let me know if these assumptions are correct:

  • Go.zip: compressed using Go
  • Keka.zip: compressed using Keka (v1.1.17)
  • Archive Utility.zip: compressed using Finder (macOS 10.14)
@philip-firstorder

Author

commented Jul 1, 2019

Yes. To be precise: macOS 10.14.5 (18F132). It's very interesting how Finder compresses the big files, seems totally not standard.

Also I noticed that Keka doesn't set the "UT extra field modtime"

@aonez

Owner

commented Jul 1, 2019

It's very interesting how Finder compresses the big files, seems totally not standard

64-bit ZIPs have been badly built by Finder at least since Mac OS X 10.9; I already investigated that one in #18. macOS 10.15 has enhancements to the Archive Utility. I have not checked it yet, but most probably they finally fixed that one.

@philip-firstorder

Author

commented Jul 1, 2019

Let's hope they did, everyone is angry at them on the forums :)

Anyways zip64 will be more and more used, so we need to solve these problems for all standard vendors as it will only get worse with time.

For unarchiving I also updated the Keka results using both unar and p7zip. For archiving I used the default settings.

@aonez

Owner

commented Jul 1, 2019

Also I noticed that Keka doesn't set the "UT extra field modtime"

It does store it, just in another way:

  There is a local extra field with ID 0x5855 (old Info-ZIP Unix/OS2/NT) and
  8 data bytes (GMT modification/access times only).

This should be enhanced.

Let's hope they did, everyone is angry at them on the forums :)

Just did a quick test in the beta, sadly I think it still does it wrong.

@aonez

Owner

commented Jul 1, 2019

The Unarchiver (4.1.0): There was a problem while reading the contents of "Keka.zip": Data is corrupted => clicked Continue => Archive 5GB.zip is extracted, but not Small image.png

I can't reproduce this one, works for me.

Archive Utility (10.11): Error 2 - No such file or directory => Nothing extracted

Also the error from Archive Utility looks like this in my tests:

Screen Shot 2019-07-01 at 18 31 16

@philip-firstorder

Author

commented Jul 2, 2019

Can you try with these 5.72 GB archives that I used for my tests?
(These links will expire on 16.07.2019)

They all contain these same 2 files:

@aonez

Owner

commented Jul 2, 2019

With your test files I get the same results as you. Thanks for providing them!

@philip-firstorder

Author

commented Jul 2, 2019

Nice! Did you also try to archive the 2 files on your side with Keka and then unarchive with all the vendors to see if you get the same messages?

This way we know if the different behaviour comes from the archiving stage or the unarchiving stage.

@aonez

Owner

commented Jul 2, 2019

Did you also try to archive the 2 files on your side with Keka and then unarchive with all the vendors to see if you get the same messages

Yes did that and got the same results.

@philip-firstorder

Author

commented Jul 2, 2019

Very good, now you need to identify what field causes the problem. This is a difficult bug, but if we can identify it then we can open tickets also for the other vendors to make sure they don't create bad archives that our clients cannot open.

@aonez

Owner

commented Jul 2, 2019

I don't get why the latest The Unarchiver 4.1.0 fails. The Unarchiver 3.11.1 and unar work.

@philip-firstorder

Author

commented Jul 2, 2019

Do you have a link for the 4.1.0 sources so I can check the differences?

@aonez

Owner

commented Jul 2, 2019

I think it's no longer open source. You have the XADMaster sources, but that one works.

@aonez

Owner

commented Jul 3, 2019

Sorry, I was mistaken. There are in fact 8 bytes, not 4, so forget that message. I was looking at the writer.go code yesterday, but everything seemed OK...

@philip-firstorder

Author

commented Jul 3, 2019

I also did some tests today with 2 blank files:

$ mkfile -n 5000000000 ~/Desktop/5GB_big_empty_file
$ mkfile -n 5000 ~/Desktop/5KB_small_empty_file

Archived them with Go and Keka and inspected them with Hex Fiend:

Go.zip

Screenshot 2019-07-03 at 18 26 59

The selected bytes are the data descriptor of the previous file, which Keka doesn't have.

Screenshot 2019-07-03 at 18 30 09

This was a bug that appended an empty JSON {} followed by a newline character. I corrected it, so the 3 bytes should not appear anymore.
Screenshot 2019-07-03 at 18 29 15

$ zipinfo -v go.zip
Archive:  go.zip
There is no zipfile comment.

End-of-central-directory record:
-------------------------------

  Zip archive file size:                5000005461 (000000012A060755h)
  Actual end-cent-dir record offset:    5000005360 (000000012A0606F0h)
  Expected end-cent-dir record offset:  5000005360 (000000012A0606F0h)
  (based on the length of the central directory and its expected offset)

  This zipfile constitutes the sole disk of a single-part archive; its
  central directory contains 2 entries.
  The central directory is 204 (00000000000000CCh) bytes long,
  and its (expected) offset in bytes from the beginning of the zipfile
  is 5000005156 (000000012A060624h).


Central directory entry #1:
---------------------------

  5GB_big_empty_file

  offset of local header from start of archive:   0
                                                  (0000000000000000h) bytes
  file system or operating system of origin:      MS-DOS, OS/2 or NT FAT
  version of encoding software:                   2.0
  minimum file system compatibility required:     MS-DOS, OS/2 or NT FAT
  minimum software version required to extract:   4.5
  compression method:                             none (stored)
  file security status:                           not encrypted
  extended local header:                          yes
  file last modified on (DOS date/time):          2019 Jul 3 13:30:00
  file last modified on (UT extra field modtime): 2019 Jul 3 13:30:00 local
  file last modified on (UT extra field modtime): 2019 Jul 3 11:30:00 UTC
  32-bit CRC value (hex):                         5c316f50
  compressed size:                                5000000000 bytes
  uncompressed size:                              5000000000 bytes
  length of filename:                             18 characters
  length of extra field:                          37 bytes
  length of file comment:                         0 characters
  disk number on which file begins:               disk 1
  apparent file type:                             binary
  non-MSDOS external file attributes:             000000 hex
  MS-DOS file attributes (00 hex):                none

  The central-directory extra field contains:
  - A subfield with ID 0x5455 (universal time) and 5 data bytes.
    The local extra field has UTC/GMT modification time.
  - A subfield with ID 0x0001 (PKWARE 64-bit sizes) and 24 data bytes:
    00 f2 05 2a 01 00 00 00 00 f2 05 2a 01 00 00 00 00 00 00 00 00 00 00 00.

  There is no file comment.

Central directory entry #2:
---------------------------

  There are an extra -4 bytes preceding this file.

  5KB_small_empty_file

  offset of local header from start of archive:   5000000081
                                                  (000000012A05F251h) bytes
  file system or operating system of origin:      MS-DOS, OS/2 or NT FAT
  version of encoding software:                   2.0
  minimum file system compatibility required:     MS-DOS, OS/2 or NT FAT
  minimum software version required to extract:   2.0
  compression method:                             none (stored)
  file security status:                           not encrypted
  extended local header:                          yes
  file last modified on (DOS date/time):          2019 Jul 3 13:30:10
  file last modified on (UT extra field modtime): 2019 Jul 3 13:30:10 local
  file last modified on (UT extra field modtime): 2019 Jul 3 11:30:10 UTC
  32-bit CRC value (hex):                         d8e50ea8
  compressed size:                                5000 bytes
  uncompressed size:                              5000 bytes
  length of filename:                             20 characters
  length of extra field:                          37 bytes
  length of file comment:                         0 characters
  disk number on which file begins:               disk 1
  apparent file type:                             binary
  non-MSDOS external file attributes:             000000 hex
  MS-DOS file attributes (00 hex):                none

  The central-directory extra field contains:
  - A subfield with ID 0x5455 (universal time) and 5 data bytes.
    The local extra field has UTC/GMT modification time.
  - A subfield with ID 0x0001 (PKWARE 64-bit sizes) and 24 data bytes:
    88 13 00 00 00 00 00 00 88 13 00 00 00 00 00 00 51 f2 05 2a 01 00 00 00.

  There is no file comment
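
For reference, an archive shaped like go.zip above could be produced with Go's archive/zip roughly as follows. This is a minimal sketch, not the code used in this issue: the helper names (writeStored, buildArchive, streamedEntries) are hypothetical, and it works on in-memory payloads rather than 5 GB files. It relies on archive/zip writing entries created through CreateHeader in streaming mode, which is what zipinfo reports as "extended local header: yes".

```go
package main

import (
	"archive/zip"
	"bytes"
	"io"
)

// writeStored streams r into zw as one uncompressed ("stored") entry.
// The size is not known up front, so archive/zip sets bit 3 of the
// general purpose flags and writes a data descriptor after the payload.
func writeStored(zw *zip.Writer, name string, r io.Reader) error {
	w, err := zw.CreateHeader(&zip.FileHeader{Name: name, Method: zip.Store})
	if err != nil {
		return err
	}
	_, err = io.Copy(w, r)
	return err
}

// buildArchive packs the named in-memory payloads into a ZIP and returns it.
// names and payloads are expected to have the same length.
func buildArchive(names []string, payloads [][]byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for i, name := range names {
		if err := writeStored(zw, name, bytes.NewReader(payloads[i])); err != nil {
			return nil, err
		}
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// streamedEntries reopens the archive and reports, per entry name, whether
// bit 3 (data descriptor present) is set in the central directory.
func streamedEntries(archive []byte) (map[string]bool, error) {
	zr, err := zip.NewReader(bytes.NewReader(archive), int64(len(archive)))
	if err != nil {
		return nil, err
	}
	out := make(map[string]bool)
	for _, f := range zr.File {
		out[f.Name] = f.Flags&0x8 != 0
	}
	return out, nil
}
```

To reproduce the ZIP64 case, the same writeStored helper can be fed an os.File opened on a file larger than 4 GB, with the zip.Writer wrapping an output file instead of a buffer.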

Keka.zip

Screenshot 2019-07-03 at 18 10 50

Screenshot 2019-07-03 at 18 11 21

Screenshot 2019-07-03 at 18 11 34

$ zipinfo -v keka.zip
Archive:  keka.zip
There is no zipfile comment.

End-of-central-directory record:
-------------------------------

  Zip archive file size:                5000005450 (000000012A06074Ah)
  Actual end-cent-dir record offset:    5000005352 (000000012A0606E8h)
  Expected end-cent-dir record offset:  5000005352 (000000012A0606E8h)
  (based on the length of the central directory and its expected offset)

  This zipfile constitutes the sole disk of a single-part archive; its
  central directory contains 2 entries.
  The central directory is 234 (00000000000000EAh) bytes long,
  and its (expected) offset in bytes from the beginning of the zipfile
  is 5000005118 (000000012A0605FEh).


Central directory entry #1:
---------------------------

  5GB_big_empty_file

  offset of local header from start of archive:   0
                                                  (0000000000000000h) bytes
  file system or operating system of origin:      Unix
  version of encoding software:                   6.3
  minimum file system compatibility required:     Unix
  minimum software version required to extract:   4.5
  compression method:                             none (stored)
  file security status:                           not encrypted
  extended local header:                          no
  file last modified on (DOS date/time):          2019 Jul 3 12:52:10
  32-bit CRC value (hex):                         5c316f50
  compressed size:                                5000000000 bytes
  uncompressed size:                              5000000000 bytes
  length of filename:                             18 characters
  length of extra field:                          56 bytes
  length of file comment:                         0 characters
  disk number on which file begins:               disk 1
  apparent file type:                             binary
  Unix file attributes (100600 octal):            -rw-------
  MS-DOS file attributes (20 hex):                arc 

  The central-directory extra field contains:
  - A subfield with ID 0x0001 (PKWARE 64-bit sizes) and 16 data bytes:
    00 f2 05 2a 01 00 00 00 00 f2 05 2a 01 00 00 00.
  - A subfield with ID 0x000a (PKWARE Win32) and 32 data bytes.  The first
    20 are:   00 00 00 00 01 00 18 00 00 a9 22 5d 8d 31 d5 01 80 20 ec bb.

  There is a local extra field with ID 0x5855 (old Info-ZIP Unix/OS2/NT) and
  8 data bytes (GMT modification/access times only).

  There is no file comment.

Central directory entry #2:
---------------------------

  There are an extra -44 bytes preceding this file.

  5KB_small_empty_file

  offset of local header from start of archive:   5000000068
                                                  (000000012A05F244h) bytes
  file system or operating system of origin:      Unix
  version of encoding software:                   6.3
  minimum file system compatibility required:     Unix
  minimum software version required to extract:   4.5
  compression method:                             none (stored)
  file security status:                           not encrypted
  extended local header:                          no
  file last modified on (DOS date/time):          2019 Jul 3 12:55:04
  32-bit CRC value (hex):                         d8e50ea8
  compressed size:                                5000 bytes
  uncompressed size:                              5000 bytes
  length of filename:                             20 characters
  length of extra field:                          48 bytes
  length of file comment:                         0 characters
  disk number on which file begins:               disk 1
  apparent file type:                             binary
  Unix file attributes (100600 octal):            -rw-------
  MS-DOS file attributes (20 hex):                arc 

  The central-directory extra field contains:
  - A subfield with ID 0x0001 (PKWARE 64-bit sizes) and 8 data bytes:
    44 f2 05 2a 01 00 00 00.
  - A subfield with ID 0x000a (PKWARE Win32) and 32 data bytes.  The first
    20 are:   00 00 00 00 01 00 18 00 80 5d 40 c4 8d 31 d5 01 80 eb 0b 8b.

  There is a local extra field with ID 0x5855 (old Info-ZIP Unix/OS2/NT) and
  8 data bytes (GMT modification/access times only).

  There is no file comment.
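
The hex bytes zipinfo prints for the 0x0001 subfields are little-endian 64-bit values, so they can be checked against the sizes and offsets reported above. A small sketch for doing that follows; decodeLE64 is a hypothetical helper name, and it assumes the dump contains only whole 8-byte values (it would not handle a trailing 4-byte disk number).

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"fmt"
	"strings"
)

// decodeLE64 parses a zipinfo-style hex dump such as "00 f2 05 2a 01 00 00 00."
// and returns the consecutive little-endian uint64 values it contains.
func decodeLE64(dump string) ([]uint64, error) {
	cleaned := strings.TrimSuffix(strings.TrimSpace(dump), ".")
	cleaned = strings.ReplaceAll(cleaned, " ", "")
	raw, err := hex.DecodeString(cleaned)
	if err != nil {
		return nil, err
	}
	if len(raw)%8 != 0 {
		return nil, fmt.Errorf("dump is %d bytes, not a multiple of 8", len(raw))
	}
	vals := make([]uint64, 0, len(raw)/8)
	for i := 0; i < len(raw); i += 8 {
		vals = append(vals, binary.LittleEndian.Uint64(raw[i:i+8]))
	}
	return vals, nil
}
```

Fed the 24 bytes of Go's entry #2, this yields 5000, 5000 and 5000000081 (uncompressed size, compressed size, local header offset). Fed the 8 bytes of Keka's entry #2, it yields only the offset, 5000000068.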

@aonez aonez modified the milestones: Look at, 1.1.18 Jul 5, 2019

@aonez

Owner

commented Jul 5, 2019

I got it now. Go does not fill the disk number in the ZIP64 extra data:

			// append a zip64 extra block to Extra
			var buf [28]byte // 2x uint16 + 3x uint64
			eb := writeBuf(buf[:])
			eb.uint16(zip64ExtraId)
			eb.uint16(24) // size = 3x uint64
			eb.uint64(h.UncompressedSize64)
			eb.uint64(h.CompressedSize64)
			eb.uint64(h.offset)
			h.Extra = append(h.Extra, buf[:]...)

XADMaster's unar expects all fields to be present, so it fails while parsing that data. Filling in the disk number creates a file that unar (The Unarchiver and any app that uses XADMaster) parses properly without any error. I did not check the p7zip code, but it still warns (while extracting all data properly) even when that field is filled in. This is an example of the code edited to fill in the disk number:

			// append a zip64 extra block to Extra
			var buf [32]byte // 2x uint16 + 3x uint64 + 1x uint32
			eb := writeBuf(buf[:])
			eb.uint16(zip64ExtraId)
			eb.uint16(28) // size = 3x uint64 + 1x uint32
			eb.uint64(h.UncompressedSize64)
			eb.uint64(h.CompressedSize64)
			eb.uint64(h.offset)
			eb.uint32(0) // Number of disk
			h.Extra = append(h.Extra, buf[:]...)

So here Go (the original code, the first snippet) does it perfectly, since those fields only need to be present if they are actually used. Looking at the rest of the code makes me think Go only creates a single volume, not multi-volume archives. So the disk field should not be present in the extra block:

   4.5.3 -Zip64 Extended Information Extra Field (0x0001):

      The following is the layout of the zip64 extended 
      information "extra" block. If one of the size or
      offset fields in the Local or Central directory
      record is too small to hold the required data,
      a Zip64 extended information record is created.
      The order of the fields in the zip64 extended 
      information record is fixed, but the fields MUST
      only appear if the corresponding Local or Central
      directory record field is set to 0xFFFF or 0xFFFFFFFF.

      Note: all fields stored in Intel low-byte/high-byte order.

        Value      Size       Description
        -----      ----       -----------
(ZIP64) 0x0001     2 bytes    Tag for this "extra" block type
        Size       2 bytes    Size of this "extra" block
        Original 
        Size       8 bytes    Original uncompressed file size
        Compressed
        Size       8 bytes    Size of compressed data
        Relative Header
        Offset     8 bytes    Offset of local header record
        Disk Start
        Number     4 bytes    Number of the disk on which
                              this file starts 

Already fixed that one in the parser for 1.1.18. Thanks a lot for all the detailed information @philip-firstorder!
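
A parser that follows the quoted rule, reading a 64-bit field only when its counterpart in the directory record is saturated and tolerating leftover bytes, might look like the sketch below. This is not Keka's or XADMaster's actual code: the type and function names are hypothetical, and the input is assumed to be the extra block payload without its 4-byte tag/size header.

```go
package main

import (
	"encoding/binary"
	"errors"
)

// zip64Values holds what was recovered from a ZIP64 (0x0001) extra block.
type zip64Values struct {
	UncompressedSize uint64
	CompressedSize   uint64
	Offset           uint64
	DiskStart        uint32
	Trailing         int // bytes left over after the expected fields
}

var errShortZip64 = errors.New("zip64 extra block shorter than the header requires")

// parseZip64Extra starts from the 32-bit/16-bit values of the directory
// record and replaces only the saturated ones with values from the payload.
// Unexpected trailing bytes are counted, not rejected.
func parseZip64Extra(payload []byte, usize32, csize32, offset32 uint32, disk16 uint16) (zip64Values, error) {
	v := zip64Values{
		UncompressedSize: uint64(usize32),
		CompressedSize:   uint64(csize32),
		Offset:           uint64(offset32),
		DiskStart:        uint32(disk16),
	}
	// The field order is fixed by the specification.
	fields := []struct {
		saturated bool
		dst       *uint64
	}{
		{usize32 == 0xFFFFFFFF, &v.UncompressedSize},
		{csize32 == 0xFFFFFFFF, &v.CompressedSize},
		{offset32 == 0xFFFFFFFF, &v.Offset},
	}
	for _, f := range fields {
		if !f.saturated {
			continue
		}
		if len(payload) < 8 {
			return v, errShortZip64
		}
		*f.dst = binary.LittleEndian.Uint64(payload)
		payload = payload[8:]
	}
	if disk16 == 0xFFFF {
		if len(payload) < 4 {
			return v, errShortZip64
		}
		v.DiskStart = binary.LittleEndian.Uint32(payload)
		payload = payload[4:]
	}
	v.Trailing = len(payload)
	return v, nil
}
```

With this approach a 24-byte block without a disk number and a 28-byte block with one both parse cleanly, as long as the directory record says which fields to expect.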

@aonez

Owner

commented Jul 5, 2019

This build should fix this issue: Keka-r3332

@philip-firstorder

Author

commented Jul 5, 2019

			// append a zip64 extra block to Extra
			var buf [32]byte // 2x uint16 + 3x uint64 + 1x uint32
			eb := writeBuf(buf[:])
			eb.uint16(zip64ExtraId)
			eb.uint16(28) // size = 3x uint64 + 1x uint32
			eb.uint64(h.UncompressedSize64)
			eb.uint64(h.CompressedSize64)
			eb.uint32(0) // Number of disk
			h.Extra = append(h.Extra, buf[:]...)

I didn't find this code anywhere, can you post a link to it?

@aonez

Owner

commented Jul 5, 2019

Go does not fill the disk number

You have it there 😊

@aonez

Owner

commented Jul 5, 2019

Ahhh! The modified one is not posted anywhere. I've modified in my local writer.go just to test.

But here Go does it perfectly

With that I meant with their code (the first one). I'm editing the comment.

@philip-firstorder

Author

commented Jul 5, 2019

This build should fix this issue: Keka-r3332

Tested this and all files are correctly extracted in the Go folder. However, when using p7zip (defaults write com.aone.keka UnzipWithUNAR false) I still get this error message at the end:
Screenshot 2019-07-05 at 12 15 52

But it works, both methods unzip all contents, which I even checked in Hex Fiend for integrity.

@philip-firstorder

Author

commented Jul 5, 2019

Ahhh! The modified one is not posted anywhere. I've modified in my local writer.go just to test.

But here Go does it perfectly

With that I meant with their code (the first one). I'm editing the comment.

But in that case you forgot to add this line:

eb.uint64(h.offset)
@aonez

Owner

commented Jul 5, 2019

when using p7zip defaults write com.aone.keka UnzipWithUNAR false

I did not look into the p7zip code, so no fix there.

you forgot to add this line

Right! Added that one.

@philip-firstorder

Author

commented Jul 5, 2019

Can you create an archive with your modified go writer and then test it with p7zip?

Because then you can open a ticket with Go to fix it, because indeed they missed the extra flag.

@aonez

Owner

commented Jul 5, 2019

Can you create an archive with your modified go writer and then test it with p7zip?

I did, and it warns anyway.

they missed the extra flag

The point is that those fields are not mandatory. They only need to appear if the corresponding fields in the local/central directory record are set to 0xFFFF or 0xFFFFFFFF. Go does not create multi-disk archives, so it does not fill in that field.

Since I did not check the p7zip code, I'm not sure what triggers the warning. That check should be done first.

@philip-firstorder

Author

commented Jul 5, 2019

I found this code in go reader, which is linked to this issue:

// Should have consumed the whole header.
// But popular zip & JAR creation tools are broken and
// may pad extra zeros at the end, so accept those
// too. See golang.org/issue/8186.
for _, v := range b {
	if v != 0 {
		return ErrFormat
	}
}

I think the Go writer shouldn't add this extra line of yours:

eb.uint32(0) // Number of disk

because they didn't set the corresponding disk number field to 0xFFFF just a few lines below:

b = b[4:] // skip disk number start and internal file attr (2x uint16)

However, I now think their assumption that popular zip & JAR tools are broken is wrong, since those trailing zeros may be the extra disk field.

@philip-firstorder

Author

commented Jul 5, 2019

Also a note regarding reading from LOCAL HEADERS from Streamed archives:

The compressed and uncompressed lengths cannot be known beforehand, so they are set to 0. This information is written both in the data descriptor that follows each file's data and in the central directory.

$ zipdetails -v go.zip

000000000 000000004 50 4B 03 04 LOCAL HEADER #1       04034B50
000000004 000000001 14          Extract Zip Spec      14 '2.0' <= Can't know if it's zip64
000000005 000000001 00          Extract OS            00 'MS-DOS'
000000006 000000002 08 00       General Purpose Flag  0008
                                [Bit  3]              1 'Streamed'
000000008 000000002 00 00       Compression Method    0000 'Stored'
00000000A 000000004 C0 6B E3 4E Last Mod Time         4EE36BC0 'Wed Jul  3 13:30:00 2019'
00000000E 000000004 00 00 00 00 CRC                   00000000 <= See here
000000012 000000004 00 00 00 00 Compressed Length     00000000 <= See here
000000016 000000004 00 00 00 00 Uncompressed Length   00000000  <= See here
00000001A 000000002 12 00       Filename Length       0012
00000001C 000000002 09 00       Extra Length          0009
00000001E 000000012 35 47 42 5F Filename              '5GB_big_empty_file'
                    62 69 67 5F
                    65 6D 70 74
                    79 5F 66 69
                    6C 65
000000030 000000002 55 54       Extra ID #0001        5455 'UT: Extended Timestamp'
000000032 000000002 05 00         Length              0005
000000034 000000001 01            Flags               '01 mod'
000000035 000000004 B8 91 1C 5D   Mod Time            5D1C91B8 'Wed Jul  3 13:30:00 2019'
000000039 02A05F200 ...         PAYLOAD

Unexpecded END at offset 2A05F239, value 00000000
Done

In the archives generated with keka, the file sizes are also set in the Local headers, because the archive is not streamed.

$ zipdetails -v keka.zip

000000000 000000004 50 4B 03 04 LOCAL HEADER #1       04034B50
000000004 000000001 2D          Extract Zip Spec      2D '4.5'
000000005 000000001 03          Extract OS            03 'Unix'
000000006 000000002 00 00       General Purpose Flag  0000
000000008 000000002 00 00       Compression Method    0000 'Stored'
00000000A 000000004 85 66 E3 4E Last Mod Time         4EE36685 'Wed Jul  3 12:52:10 2019'
00000000E 000000004 50 6F 31 5C CRC                   5C316F50
000000012 000000004 FF FF FF FF Compressed Length     FFFFFFFF  <= See here
000000016 000000004 FF FF FF FF Uncompressed Length   FFFFFFFF  <= See here
00000001A 000000002 12 00       Filename Length       0012
00000001C 000000002 14 00       Extra Length          0014
00000001E 000000012 35 47 42 5F Filename              '5GB_big_empty_file'
                    62 69 67 5F
                    65 6D 70 74
                    79 5F 66 69
                    6C 65
000000030 000000002 01 00       Extra ID #0001        0001 'ZIP64'
000000032 000000002 10 00         Length              0010
000000034 000000008 00 F2 05 2A   Uncompressed Size   000000012A05F200  <= See here
                    01 00 00 00
00000003C 000000008 00 F2 05 2A   Compressed Size     000000012A05F200  <= See here
                    01 00 00 00
000000044 000001388 ...         PAYLOAD


Unexpecded END at offset 000013CC, value 00000000
Done

The unarchivers should ignore the local headers anyway, but it was worth mentioning.
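
The difference between the two local headers above can also be checked programmatically. The following is a minimal sketch, with hypothetical type and function names, that decodes the fixed 30-byte part of a local file header and reports whether the entry is streamed (bit 3 set) or defers its sizes to a ZIP64 extra field (both 32-bit sizes saturated).

```go
package main

import (
	"encoding/binary"
	"errors"
)

// localHeader is the fixed 30-byte part of a ZIP local file header,
// minus the DOS time/date fields, which are skipped here.
type localHeader struct {
	ExtractSpec      byte // e.g. 0x14 = 2.0, 0x2D = 4.5
	ExtractOS        byte
	Flags            uint16
	Method           uint16
	CRC32            uint32
	CompressedSize   uint32
	UncompressedSize uint32
	NameLen          uint16
	ExtraLen         uint16
}

// Streamed reports whether bit 3 is set, meaning CRC and sizes are
// stored in a data descriptor after the payload.
func (h localHeader) Streamed() bool { return h.Flags&0x0008 != 0 }

// SizesInZip64 reports whether both 32-bit sizes are saturated, meaning
// the real values live in the ZIP64 extra field.
func (h localHeader) SizesInZip64() bool {
	return h.CompressedSize == 0xFFFFFFFF && h.UncompressedSize == 0xFFFFFFFF
}

// parseLocalHeader decodes the first 30 bytes of b as a local file header.
func parseLocalHeader(b []byte) (localHeader, error) {
	if len(b) < 30 {
		return localHeader{}, errors.New("local header needs at least 30 bytes")
	}
	le := binary.LittleEndian
	if le.Uint32(b[0:4]) != 0x04034b50 {
		return localHeader{}, errors.New("missing local header signature PK\\x03\\x04")
	}
	return localHeader{
		ExtractSpec: b[4],
		ExtractOS:   b[5],
		Flags:       le.Uint16(b[6:8]),
		Method:      le.Uint16(b[8:10]),
		// b[10:14] holds the DOS time and date.
		CRC32:            le.Uint32(b[14:18]),
		CompressedSize:   le.Uint32(b[18:22]),
		UncompressedSize: le.Uint32(b[22:26]),
		NameLen:          le.Uint16(b[26:28]),
		ExtraLen:         le.Uint16(b[28:30]),
	}, nil
}
```

Applied to the first 30 bytes of go.zip this reports a streamed entry with zeroed sizes; applied to keka.zip it reports a non-streamed entry whose sizes are deferred to ZIP64.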

@philip-firstorder

Author

commented Jul 11, 2019

I wrote an email to Robert Rezabek from BetterZip (p7zip) which also is giving errors.

If p7zip gets fixed then Keka (p7zip) and many other unarchivers will eventually get fixed.

@aonez

Owner

commented Jul 12, 2019

I wrote an email to Igor Pavlow from BetterZip

Igor is the developer of 7-Zip, not BetterZip.

@macitbetter

commented Jul 12, 2019

In p7zip 16.02 check out CInArchive::ReadExtra(...) in the file CPP/7zip/Archive/Zip/ZipIn.cpp. Looks right to me, all the checks for 0xFFFFFFFF are there.

@philip-firstorder

Author

commented Jul 12, 2019

I wrote an email to Igor Pavlow from BetterZip

Igor is the developer of 7-Zip, not BetterZip.

Terribly sorry about the confusion, I edited my message.

@philip-firstorder

Author

commented Jul 12, 2019

In p7zip 16.02 check out CInArchive::ReadExtra(...) in the file CPP/7zip/Archive/Zip/ZipIn.cpp. Looks right to me, all the checks for 0xFFFFFFFF are there.

I don't have an environment to debug, but I wonder if it's this line: CPP/7zip/Archive/Zip/ZipIn.cpp#L674

  if (remain != 0)
  {
    ExtraMinorError = true;
    // 7-Zip before 9.31 created incorrect WsAES Extra in folder's local headers.
    // so we don't return false, but just set warning flag
    // return false;
  }

In 7zip 19.00 the same function was modified and looks like this: https://github.com/kornelski/7z/blob/cb75c2b5bf0d347114d59ff7ba9b51d435c01e40/CPP/7zip/Archive/Zip/ZipIn.cpp#L982

@philip-firstorder

Author

commented Jul 12, 2019

Igor from 7-Zip discussed this problem here. But I still get a warning message with 7-Zip v19.00, so I wrote to him in the thread asking him to test the Go.zip.zip archive.

  • Headers Error There are some data after the end of the payload (solved on my side)

  • Extra_ERROR Zip64_ERROR UT Descriptor
    BUG by Go when they set extra headers even if their corresponding fields are NOT 0xFFFFFFFF.

For example, when compressed and uncompressed are 0xFFFFFFFF and offset is 0, they add all 3 fields = 24 bytes, instead of just the 2 maxed-out ones = 16 bytes.

This is why for entry #1: 5GB_big_empty_file with offset 0 Go wrongly adds 24 bytes:

A subfield with ID 0x0001 (PKWARE 64-bit sizes) and 24 data bytes (compressed, uncompressed and offset):
    00 f2 05 2a 01 00 00 00 00 f2 05 2a 01 00 00 00 00 00 00 00 00 00 00 00.

Instead the Keka (p7zip) archive has 16 bytes (compressed and uncompressed):

A subfield with ID 0x0001 (PKWARE 64-bit sizes) and 16 data bytes:
    00 f2 05 2a 01 00 00 00 00 f2 05 2a 01 00 00 00.

Also for entry #2: 5KB_small_empty_file Go adds all 24 bytes (compressed, uncompressed and offset) instead of ONLY 8 bytes (offset). Because for this second small file it is ONLY the offset value that is maxed out.

I will open a pull request to Go to fix it.
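
The behaviour described above, emitting only the fields that overflow, could be sketched like this. It is an illustration under stated assumptions, not the actual patch proposed to Go: buildZip64Extra is a hypothetical helper, it targets central directory entries only, and it leaves out the disk number since Go writes single-volume archives.

```go
package main

import "encoding/binary"

const (
	zip64ExtraID = 0x0001     // tag of the ZIP64 extended information block
	uint32Max    = 0xFFFFFFFF // sentinel stored in the 32-bit header field
)

// buildZip64Extra returns a complete 0x0001 extra block (tag, size, payload)
// containing only the values that do not fit in 32 bits, in the fixed order
// uncompressed size, compressed size, offset. It returns nil when nothing
// overflows. The caller is expected to store 0xFFFFFFFF in the matching
// 32-bit fields of the central directory record.
func buildZip64Extra(uncompressed, compressed, offset uint64) []byte {
	var payload []byte
	for _, v := range []uint64{uncompressed, compressed, offset} {
		if v < uint32Max {
			continue // fits in the regular header field
		}
		var tmp [8]byte
		binary.LittleEndian.PutUint64(tmp[:], v)
		payload = append(payload, tmp[:]...)
	}
	if len(payload) == 0 {
		return nil
	}
	block := make([]byte, 4, 4+len(payload))
	binary.LittleEndian.PutUint16(block[0:2], zip64ExtraID)
	binary.LittleEndian.PutUint16(block[2:4], uint16(len(payload)))
	return append(block, payload...)
}
```

For entry #1 (two 5000000000-byte sizes, offset 0) this gives a 16-byte payload, and for entry #2 (5000-byte sizes, offset 5000000068) an 8-byte payload, matching the Keka (p7zip) archive. Note that the local header follows a different rule: the specification requires both size fields to be present there.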
