Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Small fix for missing Zip64 EOCDR / EOCDL if cumulative size > 4Gib and all entrys are < 4Gib #106

Merged
merged 4 commits into from
Sep 11, 2023

Conversation

St4NNi
Copy link
Contributor

@St4NNi St4NNi commented Aug 26, 2023

This is a very small PR that fixes #105 .

Explanation:

If the cumulative file size of all zip entrys exceeds the 4 Gib maximum for standard Zip a Zip64 EOCDR & EOCDL is needed.

This was not set automatically if no single entry exceeded the limit, the PR fixes this by checking the force_no_zip64 flag and updating the is_zip64 boolean if its allowed, otherwise the LargeFile error is raised:

if cd_offset > NON_ZIP64_MAX_SIZE as u64 {
    if self.force_no_zip64 {
        return Err(crate::error::ZipError::Zip64Needed(crate::error::Zip64ErrorCase::LargeFile));
    } else {
        self.is_zip64 = true;
    }
}

Afterwards the correct EOCDR & EOCDL for Zip64 will be added.

Additional Problem

#105 revealed an additional problem with Zip64 handling:

Zip64 is needed in one of the following cases:

  • (A) One entry is above 4 GiB either compressed or uncompressed
  • (B) The offset that points to the start of an entry is beyond 4Gib in the file (this is specified in the central directory file header)

The previous implementation checked only if A was the case and completely ignored B. The specification says that if the compressed and uncompressed sizes are below 4 Gib the regular fields in the file headers can (and should) be used [ref].

The actual problem was as expected the lh_offset in the central directory file header. This needed to be upper bounded to NON_ZIP64_MAX_SIZE but additionally a Zip64 extended extra field with relative_header_offset was needed (to specify the u64 offset for the actual beginning of the entry).

This is only required in the Central directory and does not need to be added in the local file header that comes in front of the raw data, "valid" fields from the regular format should be skipped completely (in order) in the Zip64 extra fields. Since both compressed and uncompressed sizes are valid no Zip64 extra field is necessary in the local file header. The extra field is only required in the CD file header.

Fix

6118ad7 / 3e080b7 fix this issue by creating a Zip64ExtendedInformationExtraFieldBuilder that keeps track of the added options and creates a valid ZIP64 extra entry via its build method.

If only case B occurs, the extra information is purely added to the central directory file header. If A is the case (with or without B) the local file header is extended by a Zip64 extra entry. The builder also keeps track of the actual expected data_size and removes the need for the hard-coded 16 bytes (because a pure relative_header_offset is only 8 bytes and relative offset + sizes is 24 bytes (8 + 16) in addition to the 16 bytes for (uncompressed + compressed) sizes only.

Caveat: This is more or less just a PoC additional tests especially for the entry_stream are needed to verify that Zip64 handling now behaves correctly.

@Majored
Copy link
Owner

Majored commented Sep 11, 2023

Apologies for the delay here, and thank you for the amount of work put into debugging this/presenting a fix.

Merging. 👍

@Majored Majored merged commit 2adce3f into Majored:main Sep 11, 2023
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incorrect Zip64 Header Implementation
2 participants