Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

archive/tar: write NUL in USTAR header in overflow situations #24599

Closed
StefanScherer opened this issue Mar 29, 2018 · 10 comments
Closed

archive/tar: write NUL in USTAR header in overflow situations #24599

StefanScherer opened this issue Mar 29, 2018 · 10 comments
Milestone

Comments

@StefanScherer
Copy link

What version of Go are you using (go version)?

1.10

Does this issue reproduce with the latest release?

yes, also happens with 1.10.1

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOCACHE="/root/.cache/go-build"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/go"
GORACE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build238205497=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Users start to complaint that Packer 1.2.1 (hashicorp/packer#5990) creates Vagrant box files that cannot be extracted on macOS. These Vagrant boxes are technically tar files with large VM files in it. Packer 1.2.1 is compiled with Golang 1.10, earlier versions were compiled with Golang 1.9. The code to create the tar file hasn't changed in Packer.

I've extracted the problem to a smaller example code and compiled it with different Golang versions.
I have created a gist https://gist.github.com/StefanScherer/9dfeb9fbf6bc5dc35090d698c2703bbc with following files:

  • createtar.go - sample code to create a tar out.tar from an input directory in.
  • Dockerfile - I have compiled the Golang binaries for macOS in a Linux container to easily switch Golang versions
  • test.sh - a test script to be called on macOS to build the macOS binaries, the test input file and check the tar files created
  • .dockerignore - to speed up rebuilds
  • output-of-test-sh.txt - the complete output of the test.sh script

What did you expect to see?

I expect that the tar files produced can be extracted with the built-in tar (bsdtar 2.8.3 - libarchive 2.8.3) in macOS 10.13.3.

What did you see instead?

The binary compiled with Golang 1.9 creates a tarfile with a 10 GByte file in it that can be listed and extracted correctly on macOS.

Create tar with darwin binary compiled with Golang 1.9
2018/03/29 19:14:19 Create tar from dir: in => out.tar
2018/03/29 19:14:19 Tar add: 'in/10Gigfile' to 'out.tar'
-rw-------  0 501    20 10737418240 Mar 29 18:55 10Gigfile

The binary compiled with Golang 1.10 creates a tarfile with a 10 GByte file in it that can be listed as an empty file. The tarfile itself is 10 GByte. Trying to extract the tar file leads to an error.

Create tar with darwin binary compiled with Golang 1.10
2018/03/29 19:14:46 Create tar from dir: in => out.tar
2018/03/29 19:14:46 Tar add: 'in/10Gigfile' to 'out.tar'
-rw-------  0 stefan staff       0 Mar 29 18:55 10Gigfile
@ianlancetaylor
Copy link
Contributor

CC @dsnet

@ianlancetaylor ianlancetaylor added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Mar 29, 2018
@ianlancetaylor ianlancetaylor added this to the Go1.11 milestone Mar 29, 2018
@dsnet
Copy link
Member

dsnet commented Mar 29, 2018

This is difficult for me to reproduce as I don't have the docker environment setup. Can you upload the compressed problematic tar file? If that's still too large, the first 1024 bytes will do.

@dsnet
Copy link
Member

dsnet commented Mar 29, 2018

I reproduced the problem by just creating a 10GiB file, didn't need to do all the docker stuff. Investigating.

@StefanScherer
Copy link
Author

Added some xxd dumps of the first 4096 byte and also a diff to the gist.

@dsnet
Copy link
Member

dsnet commented Mar 29, 2018

This is a bug in libarchive.

What changed from Go1.9 to Go1.10?

In Go1.9, the archive/tar package was a mess where the exact format used (USTAR, PAX, GNU) was poorly determined, and did not always generate a valid tar file. That is, in certain situations, it would generate an invalid hybrid file using both features from PAX and GNU (even though most tar tools would still happily parse the archive). In Go1.10, this was fixed such that the Writer would always consistently use USTAR when possible, then PAX, and then GNU.

Thus, for the 10GiB file you are writing, Go1.9 happened to choose the GNU format, which libarchive understood. However, Go1.10 chose to use the PAX format (since USTAR can't represent files larger than 8GiB).

What's wrong with libarchive?

The PAX specification says:

size: The size of the file in octets, expressed as a decimal number using digits from the ISO/IEC 646:1991 standard. This record shall override the size field in the following header block(s).

Thus, the presence of a size PAX record should override whatever value is in the subsequent USTAR header. However, libarchive v2.8.3 has a bug where it uses the stale USTAR header and does not properly use the PAX record.

This problem was filed against libarchive in libarchive/libarchive#880 and subsequently fixed in v3.3.2 and on.

What can you do?

Go1.10 gives the user finer control over the output format. You can see tar.Header.Format to tar.FormatGNU to work around the libarchive bug.

What can Go do?

For the cases where a PAX record overwrites a USTAR field, we could encode the NUL byte in the field instead of the '0' character. The NUL byte is a better indication that USTAR field cannot represent the value. This shouldn't be necessary, but could help users with old versions of bsdtar.

@dsnet dsnet removed the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Mar 29, 2018
@dsnet dsnet changed the title archive/tar: Creating incompatible tar files, macOS tar shows 0 byte length archive/tar: write NUL in USTAR header in overflow situations Mar 29, 2018
@StefanScherer
Copy link
Author

Thanks @dsnet for the fast review and excellent tip for a work around.

Indeed, adding the following line

		header.Format = tar.FormatGNU

helps.

@dsnet
Copy link
Member

dsnet commented Mar 29, 2018

@StefanScherer, great to hear.

It turns out that writing NULs in the size field doesn't help bsdtar, since the condition they have never evaluates to true. As there's nothing we can do here, I'm closing this.

@dsnet dsnet closed this as completed Mar 29, 2018
@mwhooker
Copy link

mwhooker commented Mar 29, 2018

Thanks for the investigating, @dsnet. Is it correct to say that future versions of go will use the patched version of libarchive, and create valid 8gb PAX archives without manually setting the header format?

@dsnet
Copy link
Member

dsnet commented Mar 29, 2018

Is it correct to say that future versions of go will use the patched version of libarchive, and create valid 8gb PAX archives without manually setting the header?

I'm not sure what you mean. libarchive is distributed entirely separately from the Go toolchain. Did you mean archive/tar? If so, then yes, archive/tar will continue to use the PAX format as the default over GNU.

@mwhooker
Copy link

ah, I misunderstood, for some reason I was thinking golang used the libarchive implementation. Thanks!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants