Description
What version of Go are you using (go version)?
$ go version go1.12.1 darwin/amd64
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (go env)?
$ go env
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/philip/Library/Caches/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/philip/go"
GOPROXY=""
GORACE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/nd/dfzcj9w53w35kbvytmjwczg80000gn/T/go-build036595790=/tmp/go-build -gno-record-gcc-switches -fno-common"
What did you do?
Created 2 blank files:
$ mkfile -n 5000000000 ~/Desktop/5GB_big_empty_file
$ mkfile -n 5000 ~/Desktop/5KB_small_empty_file
Archived them with Go's archive/zip in the EXACT ORDER shown above, using:
fileHeader := &zip.FileHeader{
	Method: zip.Store, // no compression, very important
}
Then inspected the result in the terminal with zipinfo -v and the contents with Hex Fiend.
Then I tested the archive against all popular unarchivers and spent the last 2 weeks double-checking with their respective developers to confirm the problem:
- Keka (checked with aonez #423)
- The Unarchiver (checked with PaulTaykalo #100)
- BetterZip (checked with Robert Rezabek)
- 7Zip (checked with Igor Pavlov #f7ca)
What did you expect to see?
Document: https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT
4.5.3
The order of the fields in the zip64 extended
information record is fixed, but the fields MUST
only appear if the corresponding Local or Central
directory record field is set to 0xFFFF or 0xFFFFFFFF.
Not documented here is what to do when the size is EXACTLY 0xFFFFFFFF. But since the main purpose of archiving files is to make them as small as possible, it makes more sense NOT to add a useless zip64 field with a 0 value, which takes 8 extra data bytes for nothing.
Thus the central-directory extra fields should contain only 16 and 8 data bytes respectively (see the correct 7zip code), covering ONLY the fields > 0xFFFFFFFF:
Entry 1. Only 16 bytes for compressed and uncompressed sizes
Central directory entry #1:
---------------------------
5GB_big_empty_file
offset of local header from start of archive: 0
(0000000000000000h) bytes
...
compressed size: 5000000000 bytes
uncompressed size: 5000000000 bytes
...
The central-directory extra field contains:
- A subfield with ID 0x0001 (PKWARE 64-bit sizes) and 16 data bytes:
00 f2 05 2a 01 00 00 00 00 f2 05 2a 01 00 00 00.
Entry 2. Only 8 bytes for the offset of local header
Central directory entry #2:
---------------------------
5KB_small_empty_file
offset of local header from start of archive: 5000000068
(000000012A05F244h) bytes
...
compressed size: 5000 bytes
uncompressed size: 5000 bytes
...
The central-directory extra field contains:
- A subfield with ID 0x0001 (PKWARE 64-bit sizes) and 8 data bytes:
44 f2 05 2a 01 00 00 00.
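The expected layout above can be sketched as a small standalone function. This is only an illustration of my reading of 4.5.3, not code from archive/zip: buildZip64Extra is a hypothetical helper, the two constants are redefined locally, and the strict "> 0xFFFFFFFF" threshold is the assumption argued for in this report.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	uint32max    = 0xFFFFFFFF // largest value that fits in a 32-bit header field
	zip64ExtraID = 0x0001     // PKWARE zip64 extended information record
)

// buildZip64Extra (hypothetical helper) returns the zip64 extra subfield for a
// central directory entry, including ONLY the values above uint32max, in the
// fixed order: uncompressed size, compressed size, local header offset.
// It returns nil when no value needs 64 bits.
func buildZip64Extra(uncompressed, compressed, offset uint64) []byte {
	var data []byte
	for _, v := range []uint64{uncompressed, compressed, offset} {
		if v > uint32max {
			var field [8]byte
			binary.LittleEndian.PutUint64(field[:], v)
			data = append(data, field[:]...)
		}
	}
	if len(data) == 0 {
		return nil
	}
	extra := make([]byte, 4, 4+len(data)) // 2x uint16 header, then the data
	binary.LittleEndian.PutUint16(extra[0:2], zip64ExtraID)
	binary.LittleEndian.PutUint16(extra[2:4], uint16(len(data)))
	return append(extra, data...)
}

func main() {
	entry1 := buildZip64Extra(5000000000, 5000000000, 0)
	entry2 := buildZip64Extra(5000, 5000, 5000000068)
	fmt.Printf("entry 1: %d data bytes: % x\n", len(entry1)-4, entry1[4:])
	fmt.Printf("entry 2: %d data bytes: % x\n", len(entry2)-4, entry2[4:])
}
```

With the sizes and offset from the two entries above, this yields 16 data bytes for entry 1 and 8 data bytes for entry 2, matching the zipinfo output.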
What did you see instead?
Go wrongly adds 24 data bytes in the extra fields for both files:
Central directory entry #1:
---------------------------
5GB_big_empty_file
offset of local header from start of archive: 0 <= this should NOT be added
(0000000000000000h) bytes
...
compressed size: 5000000000 bytes
uncompressed size: 5000000000 bytes
...
The central-directory extra field contains:
- A subfield with ID 0x0001 (PKWARE 64-bit sizes) and 24 data bytes:
00 f2 05 2a 01 00 00 00 00 f2 05 2a 01 00 00 00 00 00 00 00 00 00 00 00.
Central directory entry #2:
---------------------------
5KB_small_empty_file
offset of local header from start of archive: 5000000081
(000000012A05F251h) bytes
...
compressed size: 5000 bytes <= this should NOT be added
uncompressed size: 5000 bytes <= this should NOT be added
...
The central-directory extra field contains:
- A subfield with ID 0x0001 (PKWARE 64-bit sizes) and 24 data bytes:
88 13 00 00 00 00 00 00 88 13 00 00 00 00 00 00 51 f2 05 2a 01 00 00 00.
This causes errors or warning messages in some unarchivers.
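To check an archive without zipinfo, the extra field can be walked by hand, since it is a sequence of (ID, size, data) subfields. The following is a minimal sketch: zip64DataLen is a hypothetical helper, and the sample bytes are the 24 data bytes of entry #2 above, prefixed with the 4-byte subfield header (ID 0x0001, size 24) that I am assuming precedes them.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// zip64DataLen (hypothetical helper) walks the (ID, size, data) subfields of a
// zip extra field and returns the data length of the first zip64 subfield
// (ID 0x0001). It returns -1 if there is none or the field is truncated.
func zip64DataLen(extra []byte) int {
	for len(extra) >= 4 {
		id := binary.LittleEndian.Uint16(extra[0:2])
		size := int(binary.LittleEndian.Uint16(extra[2:4]))
		extra = extra[4:]
		if size > len(extra) {
			return -1 // declared size runs past the end of the field
		}
		if id == 0x0001 {
			return size
		}
		extra = extra[size:] // skip this subfield and continue
	}
	return -1
}

func main() {
	// Subfield header (ID 0x0001, size 24) followed by the data bytes
	// reported for 5KB_small_empty_file.
	sample := []byte{
		0x01, 0x00, 0x18, 0x00,
		0x88, 0x13, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
		0x88, 0x13, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
		0x51, 0xf2, 0x05, 0x2a, 0x01, 0x00, 0x00, 0x00,
	}
	fmt.Println(zip64DataLen(sample)) // prints 24
}
```

A result of 24 for an entry whose sizes both fit in 32 bits is the symptom described in this report; with the expected layout it would be 8.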
Source code explanation
src/archive/zip/writer.go#L101
if h.isZip64() || h.offset >= uint32max { // <= This check should be strictly > uint32max
	// the file needs a zip64 header. store maxint in both
	// 32 bit size fields (and offset later) to signal that the
	// zip64 extra header should be used. // <= These 3 fields should NOT be all set together
	b.uint32(uint32max) // compressed size // <= set only if h.CompressedSize64 > uint32max
	b.uint32(uint32max) // uncompressed size // <= set only if h.UncompressedSize64 > uint32max
	// append a zip64 extra block to Extra
	var buf [28]byte // 2x uint16 + 3x uint64 // <= These 3x uint64 fields should NOT be all set together
	eb := writeBuf(buf[:])
	eb.uint16(zip64ExtraID)
	eb.uint16(24) // size = 3x uint64 // <= This could be either 8, 16 or 24
	eb.uint64(h.UncompressedSize64) // <= set only if h.UncompressedSize64 > uint32max
	eb.uint64(h.CompressedSize64) // <= set only if h.CompressedSize64 > uint32max
	eb.uint64(h.offset) // <= set only if h.offset > uint32max
	h.Extra = append(h.Extra, buf[:]...)
} else {
	b.uint32(h.CompressedSize) // <= not needed if checked above
	b.uint32(h.UncompressedSize) // <= not needed if checked above
}
Proposed Solution
Replacing the above code with this should solve the problems.
if h.CompressedSize64 > uint32max {
	b.uint32(uint32max) // real value goes in the zip64 extra block
} else {
	b.uint32(uint32(h.CompressedSize64)) // fits in 32 bits, so the conversion is lossless
}
if h.UncompressedSize64 > uint32max {
	b.uint32(uint32max) // real value goes in the zip64 extra block
} else {
	b.uint32(uint32(h.UncompressedSize64)) // fits in 32 bits, so the conversion is lossless
}
// append a zip64 extra block to Extra
if h.CompressedSize64 > uint32max || h.UncompressedSize64 > uint32max || h.offset > uint32max {
	zip64ExtraSize := uint16(0)
	if h.UncompressedSize64 > uint32max {
		zip64ExtraSize += 8
	}
	if h.CompressedSize64 > uint32max {
		zip64ExtraSize += 8
	}
	if h.offset > uint32max {
		zip64ExtraSize += 8
	}
	buf := make([]byte, 4+zip64ExtraSize) // 2x uint16 + zip64ExtraSize
	eb := writeBuf(buf)
	eb.uint16(zip64ExtraID)
	eb.uint16(zip64ExtraSize)
	// fixed field order from APPNOTE 4.5.3: uncompressed size, compressed size, offset
	if h.UncompressedSize64 > uint32max {
		eb.uint64(h.UncompressedSize64)
	}
	if h.CompressedSize64 > uint32max {
		eb.uint64(h.CompressedSize64)
	}
	if h.offset > uint32max {
		eb.uint64(h.offset)
	}
	h.Extra = append(h.Extra, buf...)
}