Problems to unzip file / support zip64 locator format #104

Open
beckmi opened this issue Oct 7, 2020 · 5 comments

beckmi commented Oct 7, 2020

I have a zip archive containing a single compressed file. unzip can extract it, but zziplib cannot, because the dirent of this file looks as follows:

(gdb) p dirent
$4 = {z_magic = "PK\001\002", z_encoder = {version = "-", ostype = ""}, z_extract = {version = "-", ostype = ""}, z_flags = "\b", z_compr = "\b", z_dostime = {time = "\353S", date = "GQ"}, z_crc32 = "|#\301\215", z_csize = "\377\377\377\377",
z_usize = "\377\377\377\377", z_namlen = "\034", z_extras = "\034", z_comment = "\000", z_diskstart = "\000", z_filetype = "\000", z_filemode = "\000\000\000", z_offset = "\377\377\377\377"}

The contents can be listed, but the compressed file cannot be opened, because z_offset is 0xFFFFFFFF (-1).
The file in question is attached. It comes from the SFTP site of ENTSO-E; all their compressed files have the same problem.
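
For readers following along: the "\377\377\377\377" strings in the dump are the four bytes 0xFF 0xFF 0xFF 0xFF, which decode to 0xFFFFFFFF as an unsigned little-endian value (or -1 if read as signed). A minimal sketch of how such a placeholder could be detected is shown here; `le32` and `is_zip64_placeholder` are hypothetical helper names for illustration, not part of the zziplib API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decode a 4-byte little-endian field such as z_offset or z_csize. */
static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* True when a 4-byte field holds 0xFFFFFFFF, i.e. the real value is
 * expected elsewhere rather than in the field itself. */
static bool is_zip64_placeholder(const unsigned char *field)
{
    return le32(field) == 0xFFFFFFFFu;
}
```

In the dump above, z_csize, z_usize and z_offset would all test true, while z_crc32 would not.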

Author

beckmi commented Oct 7, 2020

2020_10_OutagesPUReasons.zip

The missing attachment.

Owner

gdraheim commented Jan 4, 2021

It seems that PKWARE has changed its published standards.

4.3.9.2 When compressing files, compressed and uncompressed sizes
SHOULD be stored in ZIP64 format (as 8 byte values) when a
file's size exceeds 0xFFFFFFFF. However ZIP64 format MAY be
used regardless of the size of a file. When extracting, if
the zip64 extended information extra field is present for
the file the compressed and uncompressed sizes will be 8
byte values.

4.4.16 relative offset of local header: (4 bytes)
This is the offset from the start of the first disk on
which this file appears, to where the local header SHOULD
be found. If an archive is in ZIP64 format and the value
in this field is 0xFFFFFFFF, the size will be in the
corresponding 8 byte zip64 extended information extra field.

4.5.3 -Zip64 Extended Information Extra Field (0x0001):
The following is the layout of the zip64 extended
information "extra" block. If one of the size or
offset fields in the Local or Central directory
record is too small to hold the required data,
a Zip64 extended information record is created.
The order of the fields in the zip64 extended
information record is fixed, but the fields MUST
only appear if the corresponding Local or Central
directory record field is set to 0xFFFF or 0xFFFFFFFF.

Note: all fields stored in Intel low-byte/high-byte order.

    Value                    Size      Description
    0x0001                   2 bytes   Tag for this "extra" block type (ZIP64)
    Size                     2 bytes   Size of this "extra" block
    Original Size            8 bytes   Original uncompressed file size
    Compressed Size          8 bytes   Size of compressed data
    Relative Header Offset   8 bytes   Offset of local header record
    Disk Start Number        4 bytes   Number of the disk on which this file starts
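
To make the quoted rule concrete, here is a rough sketch of how a reader might resolve the 0xFFFFFFFF placeholders from the extra field area of a central directory entry. It follows the quoted section 4.5.3: fields are taken in fixed order, and only for those values that hold the placeholder. The names `struct entry64` and `apply_zip64_extra` are made up for this example and are not zziplib API; the disk start number (16-bit placeholder 0xFFFF) is left out to keep it short.

```c
#include <stddef.h>
#include <stdint.h>

#define ZIP64_EXTRA_TAG 0x0001u

/* Little-endian readers for the 2-byte and 8-byte fields. */
static uint16_t rd16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint64_t rd64(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Sizes and offset of one directory entry, widened to 64 bits.
 * The caller fills these from the 4-byte fields first. */
struct entry64 {
    uint64_t usize;   /* uncompressed size */
    uint64_t csize;   /* compressed size */
    uint64_t offset;  /* offset of local header */
};

/* Walk the extra blocks, find tag 0x0001 and replace every member
 * that still holds 0xFFFFFFFF. Returns 0 on success, -1 if a needed
 * value is missing or a block is truncated. */
static int apply_zip64_extra(const unsigned char *extra, size_t len,
                             struct entry64 *e)
{
    size_t pos = 0;
    while (len - pos >= 4) {
        uint16_t tag = rd16(extra + pos);
        size_t size = rd16(extra + pos + 2);
        pos += 4;
        if (size > len - pos)
            return -1;                    /* block runs past the area */
        if (tag == ZIP64_EXTRA_TAG) {
            const unsigned char *p = extra + pos;
            size_t left = size;
            uint64_t *fields[3] = { &e->usize, &e->csize, &e->offset };
            for (int i = 0; i < 3; i++) {
                if (*fields[i] != 0xFFFFFFFFu)
                    continue;             /* value fit in 32 bits */
                if (left < 8)
                    return -1;            /* promised value is absent */
                *fields[i] = rd64(p);
                p += 8;
                left -= 8;
            }
            return 0;
        }
        pos += size;                      /* skip unrelated block */
    }
    /* No ZIP64 block found: only fine if nothing was a placeholder. */
    if (e->usize == 0xFFFFFFFFu || e->csize == 0xFFFFFFFFu ||
        e->offset == 0xFFFFFFFFu)
        return -1;
    return 0;
}
```

As a sanity check against the dump earlier in this thread: z_extras is "\034", which is 28 bytes, and that matches a 4-byte block header plus three 8-byte values (uncompressed size, compressed size, header offset).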

So far zziplib can read a ZIP64 central directory, but it does not read a ZIP64 extras block.

The real bug here is that the file you provided does NOT contain a ZIP64 central directory (magic PK\6\6) but only a normal ZIP central directory (magic PK\5\6), so the use of a ZIP64 extras block is at least unintended, since the use of 0xFFFF as an extension marker was defined for the ZIP64 file format.
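
For reference, the two trailer records can be told apart by their first four bytes. In APPNOTE the end of central directory record has signature 0x06054b50 (bytes "PK\5\6" on disk) and the ZIP64 end of central directory record has 0x06064b50 ("PK\6\6"). A small sketch, with `enum zip_record` and `classify_record` as hypothetical names that do not come from zziplib:

```c
#include <string.h>

enum zip_record {
    ZIP_REC_UNKNOWN,
    ZIP_REC_EOCD,     /* normal end of central directory, "PK\5\6" */
    ZIP_REC_EOCD64    /* ZIP64 end of central directory,  "PK\6\6" */
};

/* Classify a record by its 4-byte signature; p must point at
 * least 4 readable bytes. */
static enum zip_record classify_record(const unsigned char *p)
{
    if (memcmp(p, "PK\005\006", 4) == 0)
        return ZIP_REC_EOCD;
    if (memcmp(p, "PK\006\006", 4) == 0)
        return ZIP_REC_EOCD64;
    return ZIP_REC_UNKNOWN;
}
```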

It could be implemented however.

Owner

gdraheim commented Jan 4, 2021

After a bit more debugging I can see that the ZIP64 trailer is not used; instead there is a ZIP64 locator (PK\6\7). The PKWARE appnote documentation says that it was introduced in version 6.2 in 2004/2005. Bewildering as it may seem, zziplib is older (going back to the 1990s).

I checked whether I can implement the functionality, but quite a bit of logic would need to change, so this is a real feature request rather than just a bug fix. I am sorry, but this will not happen anytime soon.
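
For anyone who wants to pick this up: as far as APPNOTE describes it, the locator is a fixed 20-byte record holding the signature, the number of the disk with the ZIP64 end of central directory record, the 8-byte offset of that record, and the total number of disks. A minimal parsing sketch follows; `struct zip64_locator` and `parse_zip64_locator` are illustrative names only and say nothing about how zziplib would structure this.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ZIP64_LOCATOR_SIZE 20

struct zip64_locator {
    uint32_t eocd64_disk;    /* disk holding the ZIP64 end record */
    uint64_t eocd64_offset;  /* offset of the ZIP64 end record */
    uint32_t total_disks;    /* total number of disks */
};

/* Read an nbytes-wide little-endian value (nbytes <= 8). */
static uint64_t rd_le(const unsigned char *p, int nbytes)
{
    uint64_t v = 0;
    for (int i = nbytes - 1; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Returns 0 and fills *out if buf starts with a complete locator,
 * -1 if the buffer is too short or the signature does not match. */
static int parse_zip64_locator(const unsigned char *buf, size_t len,
                               struct zip64_locator *out)
{
    if (len < ZIP64_LOCATOR_SIZE || memcmp(buf, "PK\006\007", 4) != 0)
        return -1;
    out->eocd64_disk   = (uint32_t)rd_le(buf + 4, 4);
    out->eocd64_offset = rd_le(buf + 8, 8);
    out->total_disks   = (uint32_t)rd_le(buf + 16, 4);
    return 0;
}
```

A reader would typically look for this record immediately before the normal end of central directory record and then seek to `eocd64_offset`.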

gdraheim added a commit that referenced this issue Jan 4, 2021
Author

beckmi commented Jan 4, 2021

Nevertheless, thanks for the work done so far. Is an open spec available for download?

Found it. For others looking for it: https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT

Owner

gdraheim commented Jan 4, 2021

Yes, that's it. When PKWARE created the first zip.exe, they shipped the package with an APPNOTE.TXT file that described (parts of) the file format. The name has stuck and now refers to the later standardisation proposal.

Here's the official archive: https://support.pkware.com/home/pkzip/developer-tools/appnote/application-note-archives

@gdraheim gdraheim changed the title Problems to unzip file Problems to unzip file / support zip64 locator format Jan 4, 2021