archive/zip: Create() does not overwrite duplicate filenames, leading to unnecessary bloating of the resulting ZIP file #66810

@wjkoh


Go version

go version go1.22.2 darwin/arm64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='arm64'
GOBIN=''
GOCACHE='/Users/wjkoh/Library/Caches/go-build'
GOENV='/Users/wjkoh/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMODCACHE='/Users/wjkoh/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='darwin'
GOPATH='/Users/wjkoh/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/opt/homebrew/Cellar/go/1.22.2/libexec'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/opt/homebrew/Cellar/go/1.22.2/libexec/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='go1.22.2'
GCCGO='gccgo'
AR='ar'
CC='cc'
CXX='c++'
CGO_ENABLED='1'
GOMOD='/dev/null'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/y1/jsmqrm1s6y55qh9s2rmw39lh0000gn/T/go-build548572318=/tmp/go-build -gno-record-gcc-switches -fno-common'

What did you do?

I called Create() with the same filename multiple times. https://go.dev/play/p/hu1QDLK7BnT
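For readers who cannot open the playground link, the following is a minimal sketch of the kind of program that triggers the behavior. It is not the playground code itself: the helper names `buildArchive` and `countEntries`, the file name `hello.txt`, and the payload are assumptions made for illustration.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
)

// buildArchive (hypothetical helper) writes the same payload n times
// under a single name and returns the finished archive bytes.
func buildArchive(n int) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for i := 0; i < n; i++ {
		// Create is called with an identical name on every iteration.
		w, err := zw.Create("hello.txt")
		if err != nil {
			return nil, err
		}
		if _, err := w.Write([]byte("hello, world\n")); err != nil {
			return nil, err
		}
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// countEntries (hypothetical helper) reports how many entries the
// archive's central directory lists.
func countEntries(data []byte) (int, error) {
	zr, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return 0, err
	}
	return len(zr.File), nil
}

func main() {
	for _, n := range []int{1, 2, 3} {
		data, err := buildArchive(n)
		if err != nil {
			panic(err)
		}
		entries, err := countEntries(data)
		if err != nil {
			panic(err)
		}
		fmt.Printf("calls=%d entries=%d bytes=%d\n", n, entries, len(data))
	}
}
```

Each extra call should add one more entry and more bytes to the archive, even though the name never changes.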

What did you see happen?

When Create() is called with the same filename multiple times, each call appends a new entry, so the resulting ZIP file grows with every call. This becomes a problem when extracting the archive with common tools such as the default archiver on macOS or the unzip command: they cannot handle duplicate filenames and write only a single file even when the archive contains several entries with that name. This is both confusing and inefficient.
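Until the behavior is changed or documented, one way to catch the problem before handing an archive to an extraction tool is to scan it for repeated names. The sketch below assumes the archive is already in memory; `duplicateNames` and `sampleArchive` are hypothetical helpers, not part of archive/zip.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
)

// duplicateNames (hypothetical helper) returns every entry name that
// occurs more than once in the archive, mapped to its occurrence count.
func duplicateNames(data []byte) (map[string]int, error) {
	zr, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return nil, err
	}
	counts := make(map[string]int)
	for _, f := range zr.File {
		counts[f.Name]++
	}
	// Keep only the names that are actually duplicated.
	for name, n := range counts {
		if n < 2 {
			delete(counts, name)
		}
	}
	return counts, nil
}

// sampleArchive (hypothetical helper) builds an archive containing one
// empty entry per given name, in order, without any de-duplication.
func sampleArchive(names ...string) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for _, name := range names {
		if _, err := zw.Create(name); err != nil {
			return nil, err
		}
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	data, err := sampleArchive("a.txt", "b.txt", "a.txt")
	if err != nil {
		panic(err)
	}
	dups, err := duplicateNames(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(dups)
}
```

The check compares names as exact strings, so it would not treat differently spelled paths that resolve to the same file as duplicates.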

What did you expect to see?

I see two potential solutions to this issue. First, Create() could reject repeated calls with the same filename. Alternatively, it could overwrite the previously added file of the same name with the new content, rather than simply appending another entry. However, I believe overwriting may cause unnecessary overhead. In that case, adding a caution to the documentation of Create() would be beneficial for users.

Metadata

Assignees

No one assigned

Labels

Documentation, FrozenDueToAge, NeedsFix, help wanted