Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

archive/zip: zip64 extra headers problems #33116

Open
philip-firstorder opened this issue Jul 15, 2019 · 14 comments · May be fixed by #33118
Open

archive/zip: zip64 extra headers problems #33116

philip-firstorder opened this issue Jul 15, 2019 · 14 comments · May be fixed by #33118
Labels

Comments

@philip-firstorder
Copy link

@philip-firstorder philip-firstorder commented Jul 15, 2019

What version of Go are you using (go version)?

$ go version
go1.12.1 darwin/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/philip/Library/Caches/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/philip/go"
GOPROXY=""
GORACE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/nd/dfzcj9w53w35kbvytmjwczg80000gn/T/go-build036595790=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

Created 2 blank files:

$ mkfile -n 5000000000 ~/Desktop/5GB_big_empty_file
$ mkfile -n 5000 ~/Desktop/5KB_small_empty_file

Archived them with Go/archive/zip in the EXACT ORDER as above using:

fileHeader := &zip.FileHeader{
	Method: zip.Store, // no compression, very important
}

Then inspected the result in terminal with zipinfo -v and the contents with Hex Fiend:.

Then I tested the archive against all popular unarchivers and spend the last 2 weeks double-checking with their respective developers to confirm the problem:

What did you expect to see?

Document: https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT

4.5.3

The order of the fields in the zip64 extended
information record is fixed, but the fields MUST
only appear if the corresponding Local or Central
directory record field is set to 0xFFFF or 0xFFFFFFFF.

Not documented here is what to do when the size is EXACTLY 0xFFFFFFFF. But since the main purpose of archiving files is to make them as small as possible, then it makes more sense to NOT add a useless zip64 with a 0 value taking 8 extra data bytes for nothing.

Thus the Central-directory extra fields should contain only 16 and 8 data bytes, see correct 7zip code, for ONLY the fields > 0xFFFFFFFF:

Entry 1. Only 16 bytes for compressed and uncompressed sizes

Central directory entry #1:
---------------------------
  5GB_big_empty_file

  offset of local header from start of archive:   0
                                                  (0000000000000000h) bytes
...
  compressed size:                                5000000000 bytes
  uncompressed size:                              5000000000 bytes
...
  The central-directory extra field contains:
  - A subfield with ID 0x0001 (PKWARE 64-bit sizes) and 16 data bytes:
    00 f2 05 2a 01 00 00 00 00 f2 05 2a 01 00 00 00.

Entry 2. Only 8 bytes for the offset of local header

Central directory entry #2:
---------------------------
  5KB_small_empty_file

  offset of local header from start of archive:   5000000068
                                                  (000000012A05F244h) bytes
...
  compressed size:                                5000 bytes
  uncompressed size:                              5000 bytes
...
  The central-directory extra field contains:
  - A subfield with ID 0x0001 (PKWARE 64-bit sizes) and 8 data bytes:
    44 f2 05 2a 01 00 00 00.

What did you see instead?

Go wrongly adds 24 data bytes in the extra fields for both files:

Central directory entry #1:
---------------------------
  5GB_big_empty_file

  offset of local header from start of archive:   0 <= this should NOT be added
                                                  (0000000000000000h) bytes
...
  compressed size:                                5000000000 bytes
  uncompressed size:                              5000000000 bytes
...
  The central-directory extra field contains:
  - A subfield with ID 0x0001 (PKWARE 64-bit sizes) and 24 data bytes:
    00 f2 05 2a 01 00 00 00 00 f2 05 2a 01 00 00 00 00 00 00 00 00 00 00 00.

Central directory entry #2:
---------------------------
  5KB_small_empty_file

  offset of local header from start of archive:   5000000081
                                                  (000000012A05F251h) bytes
 ...
  compressed size:                                5000 bytes <= this should NOT be added
  uncompressed size:                              5000 bytes <= this should NOT be added
 ...
  The central-directory extra field contains:
  - A subfield with ID 0x0001 (PKWARE 64-bit sizes) and 24 data bytes:
    88 13 00 00 00 00 00 00 88 13 00 00 00 00 00 00 51 f2 05 2a 01 00 00 00.

This causes errors or warning messages in some unarchivers.

Source code explanation

src/archive/zip/writer.go#L101

if h.isZip64() || h.offset >= uint32max { // <= This check should be strictly > uint32max
	// the file needs a zip64 header. store maxint in both
	// 32 bit size fields (and offset later) to signal that the
	// zip64 extra header should be used. // <= These 3 fields should NOT be all set together
	b.uint32(uint32max) // compressed size  // <= set only if h.CompressedSize64 > uint32max
	b.uint32(uint32max) // uncompressed size  // <= set only if h.UncompressedSize64 > uint32max

	// append a zip64 extra block to Extra
	var buf [28]byte // 2x uint16 + 3x uint64 // <= These 3x uint64 fields should NOT be all set together
	eb := writeBuf(buf[:])
	eb.uint16(zip64ExtraID)
	eb.uint16(24) // size = 3x uint64 // <= This could be either 8, 16 or 24
	eb.uint64(h.UncompressedSize64) // <= set only if h.UncompressedSize64 > uint32max
	eb.uint64(h.CompressedSize64) // <= set only if h.CompressedSize64 > uint32max
	eb.uint64(h.offset) // <= set only if h.offset > uint32max
	h.Extra = append(h.Extra, buf[:]...)
} else {
	b.uint32(h.CompressedSize) // <= not needed if checked above
	b.uint32(h.UncompressedSize) // <= not needed if checked above
}

Proposed Solution

Replacing the above code with this should solve the problems.

if h.CompressedSize64 > uint32max {
	b.uint32(uint32max)
} else {
	b.uint32(uint32(h.CompressedSize))
}
if h.UncompressedSize64 > uint32max {
	b.uint32(uint32max)
} else {
	b.uint32(uint32(h.UncompressedSize))
}

// append a zip64 extra block to Extra
if h.CompressedSize64 > uint32max || h.UncompressedSize > uint32max || h.offset > uint32max {

	zip64ExtraSize := uint16(0)
	if h.CompressedSize64 > uint32max {
		zip64ExtraSize += 8
	}
	if h.UncompressedSize64 > uint32max {
		zip64ExtraSize += 8
	}
	if h.offset > uint32max {
		zip64ExtraSize += 8
	}

	var buf []byte
	buf = make([]byte, 4+zip64ExtraSize) // 2x uint16 + zip64ExtraSize
	eb := writeBuf(buf[:])
	eb.uint16(zip64ExtraID)
	eb.uint16(zip64ExtraSize)
	if h.CompressedSize64 > uint32max {
		eb.uint64(h.UncompressedSize64)
	}
	if h.UncompressedSize64 > uint32max {
		eb.uint64(h.CompressedSize64)
	}
	if h.offset > uint32max {
		eb.uint64(h.offset)
	}
	h.Extra = append(h.Extra, buf[:]...)
}
philip-firstorder added a commit to firstorder-gmbh/go that referenced this issue Jul 15, 2019
The central-directory extra fields can contain either 8, 16 or 24 data bytes for ONLY the fields > uint32max (0xFFFFFFFF).

They should not be all set when either CompressedSize, UncompressedSize or offset  > uint32max

Fixes golang#33116
@gopherbot
Copy link

@gopherbot gopherbot commented Jul 15, 2019

Change https://golang.org/cl/186178 mentions this issue: archive/zip: fix zip64 extra headers problems

philip-firstorder added a commit to firstorder-gmbh/go that referenced this issue Jul 15, 2019
Should be h.UncompressedSize64 > uint32max

Fixes golang#33116
@odeke-em odeke-em changed the title zip64 extra headers problems archive/zip: zip64 extra headers problems Jul 15, 2019
@odeke-em
Copy link
Member

@odeke-em odeke-em commented Jul 15, 2019

Thank you for this report @philip-firstorder as well as for the CL, and welcome to the Go project!

I shall kindly page @dsnet to take a look.

@philip-firstorder
Copy link
Author

@philip-firstorder philip-firstorder commented Jul 29, 2019

@odeke-em Thank you for forwarding this. I will be using my fork in production till this gets released.

@philip-firstorder
Copy link
Author

@philip-firstorder philip-firstorder commented Nov 10, 2019

Hi @odeke-em, I quoted the official documentation in the description of this issue.

I also think I understand why the confusion with
4.3.9.2 which says:

However ZIP64 format MAY be used regardless of the size of a file.

This refers to the local file headers and data descriptors, which are usually ignored by legacy unarchivers in favor of the central directory records at the end of the zip file.

4.3.6 Overall .ZIP file format:

  [local file header 1]
  [encryption header 1]
  [file data 1]
  [data descriptor 1]
  . 
  .
  .
  [local file header n]
  [encryption header n]
  [file data n]
  [data descriptor n]
  [archive decryption header] 
  [archive extra data record] 
  [central directory header 1]
  .
  .
  .
  [central directory header n]
  [zip64 end of central directory record]
  [zip64 end of central directory locator] 
  [end of central directory record]

The bug I discovered was in the central directory code, which should follow the 4.5.3 chapter in the documentation, not 4.3.9.2.

Does that make sense?

@odeke-em
Copy link
Member

@odeke-em odeke-em commented Nov 10, 2019

@odeke-em
Copy link
Member

@odeke-em odeke-em commented Mar 3, 2020

How’s it going @philip-firstorder? Go1.15 development is currently in session. I left some feedback on your change. Please take a look, thanks!

@philip-firstorder
Copy link
Author

@philip-firstorder philip-firstorder commented Mar 3, 2020

Hello @odeke-em. Unfortunately my knowledge in unit-testing in Go is limited. Can you please refer to the original programmer of the zip testing code?
Since I implemented the fix in my code I never had an issue again, so I can pretty mich guarantee that it will work.
Also, Keka, The Unarchiver and some other vendors all released fix updates for this issue.

@odeke-em
Copy link
Member

@odeke-em odeke-em commented Mar 3, 2020

@philip-firstorder
Copy link
Author

@philip-firstorder philip-firstorder commented Mar 3, 2020

Alright, I will look over that and get back at you.

@odeke-em
Copy link
Member

@odeke-em odeke-em commented Mar 3, 2020

For easy reference, here is the content that I suggested https://go-review.googlesource.com/c/go/+/186178/2#message-edf24149016442dc4300e59c92570de4d8c049d9 with the test and the commit message update,
and then down below I found for you relevant parts of the zip spec to quote in your commit message https://go-review.googlesource.com/c/go/+/186178/2#message-28c353e3634d77f238f682169c7f0dc9007f5442

I wanted to say again, thank you for filing this issue, for investigating it and for owning it and sending a change fixing. You’ve got this and I look forward to interacting with you even more!

@odeke-em odeke-em added NeedsFix and removed NeedsInvestigation labels Apr 9, 2020
@odeke-em
Copy link
Member

@odeke-em odeke-em commented Apr 9, 2020

@philip-firstorder kindly pinging you as the Go development tree closes in 3 weeks,
and it has been more than a month since my last ping :)

@philip-firstorder
Copy link
Author

@philip-firstorder philip-firstorder commented Apr 9, 2020

@philip-firstorder kindly pinging you as the Go development tree closes in 3 weeks,
and it has been more than a month since my last ping :)

Hi, sorry for the delay but can some programmer at Google do the testing? My knowledge and time hinder me to further this issue.

@odeke-em
Copy link
Member

@odeke-em odeke-em commented Apr 9, 2020

@philip-firstorder
Copy link
Author

@philip-firstorder philip-firstorder commented Apr 9, 2020

You could go right ahead and start the CL from scratch with the changes you want to make, no problem for me. Sorry I cannot be of more help here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

3 participants
You can’t perform that action at this time.