Build id seems to prevent stdlib from being reproducible #1391

Closed
jayconrod opened this issue Mar 19, 2018 · 16 comments

@jayconrod
Contributor

Original comment by @steeve #1357 (comment)

@steeve
Contributor

steeve commented Mar 19, 2018

Relevant source code seems to live at:
https://github.com/golang/go/blob/master/src/cmd/go/internal/work/buildid.go#L23-L31

One quick fix would be to overwrite the build ID in the object files, much like what we already do with .a archives.

@jayconrod
Contributor Author

If the build id is a hash of the inputs, it should be reproducible. I suspect that it's hashing absolute paths that vary across different runs. If that's the case, we can try to build it at a deterministic location.

Modifying the .a files may be difficult to support. They don't have a stable format across Go releases.

@steeve
Contributor

steeve commented Apr 2, 2018

@steeve
Contributor

steeve commented Apr 2, 2018

And golang/go#22491

@steeve
Contributor

steeve commented Apr 2, 2018

Reading buildid.go, it seems the build ID is divided into several parts separated by a /. This shows up when diffing the same object from two builds, for instance:

37,38c37,38
< 00000240  49 69 78 62 6e 54 46 35  61 53 30 4e 61 59 50 45  |IixbnTF5aS0NaYPE|
< 00000250  49 53 78 73 2f 73 71 38  38 4e 51 53 47 58 6a 6b  |ISxs/sq88NQSGXjk|
---
> 00000240  58 59 7a 63 77 6e 63 4e  4f 4b 4e 59 7a 35 57 53  |XYzcwncNOKNYz5WS|
> 00000250  44 70 43 44 2f 73 71 38  38 4e 51 53 47 58 6a 6b  |DpCD/sq88NQSGXjk|

We can see that the second part is the same and only the first part changes. According to https://github.com/golang/go/blob/master/src/cmd/go/internal/work/buildid.go#L23-L31, that first part is the actionID. That is good news, because it means the contentID is the same.
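To make the split concrete, here is a minimal sketch of pulling the two halves out of a build ID, assuming the layout described above (action ID before the first /, content ID after the last /). The two ID strings in main are made-up placeholders, not the values from the hex dump, which only shows a fragment of each ID.

```go
package main

import (
	"fmt"
	"strings"
)

// actionID returns the part of a build ID before the first "/".
// If there is no "/", the whole string is returned.
func actionID(buildID string) string {
	if i := strings.Index(buildID, "/"); i >= 0 {
		return buildID[:i]
	}
	return buildID
}

// contentID returns the part of a build ID after the last "/".
// If there is no "/", the whole string is returned.
func contentID(buildID string) string {
	return buildID[strings.LastIndex(buildID, "/")+1:]
}

func main() {
	// Hypothetical build IDs from two runs (placeholders, not real IDs).
	first := "actionFromRun1/sharedContent"
	second := "actionFromRun2/sharedContent"

	fmt.Println("action IDs equal: ", actionID(first) == actionID(second))
	fmt.Println("content IDs equal:", contentID(first) == contentID(second))
}
```

If only the action IDs differ, the compiled output is the same and the difference comes from the inputs that were hashed, which matches what the diff suggests.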

@steeve
Contributor

steeve commented Apr 2, 2018

@steeve
Contributor

steeve commented Apr 2, 2018

If we could print the value of the input buffer at https://github.com/golang/go/blob/ad0ebc3994fc7f74434d922b80401e680162c7d1/src/cmd/go/internal/work/exec.go#L295 we could see what changes, I guess.

@steeve
Contributor

steeve commented Apr 2, 2018

It seems to be using gcflags and asmflags to produce the hash. Since we are using trimpath, my guess is that the sandbox paths may end up in the binary.
https://github.com/golang/go/blob/ad0ebc3994fc7f74434d922b80401e680162c7d1/src/cmd/go/internal/work/exec.go#L219-L223
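As a rough illustration of why a per-run sandbox path inside a flag would change the hash, here is a simplified model. It is only a sketch: flagHash is a hypothetical stand-in that hashes the flag strings with SHA-256, not the go command's actual action ID computation, and the sandbox paths are invented.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"strings"
)

// flagHash is a simplified, hypothetical stand-in for an action ID:
// it hashes the tool flags and returns the first 8 bytes in hex.
func flagHash(flags []string) string {
	sum := sha256.Sum256([]byte(strings.Join(flags, "\x00")))
	return fmt.Sprintf("%x", sum[:8])
}

func main() {
	// Invented sandbox paths that differ between two runs.
	run1 := []string{"-trimpath", "/tmp/sandbox-1234/execroot"}
	run2 := []string{"-trimpath", "/tmp/sandbox-5678/execroot"}

	// A path that does not depend on the sandbox location.
	stable := []string{"-trimpath", "."}

	fmt.Println("sandbox paths hash equal:", flagHash(run1) == flagHash(run2))
	fmt.Println("stable path hash equal:  ", flagHash(stable) == flagHash(stable))
}
```

Under this model, any flag value that embeds the sandbox directory makes the hash vary from run to run, even if the compiled output is byte-for-byte the same.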

@steeve
Contributor

steeve commented Apr 2, 2018

It's using the cgo flags as well: https://github.com/golang/go/blob/ad0ebc3994fc7f74434d922b80401e680162c7d1/src/cmd/go/internal/work/exec.go#L198-L209

That pretty much rules out setting absolute paths in the CGO flags.

@steeve
Contributor

steeve commented Apr 2, 2018

Perhaps we can leverage -toolexec to intercept the calls and make the paths absolute there.

	-toolexec 'cmd args'
		a program to use to invoke toolchain programs like vet and asm.
		For example, instead of running asm, the go command will run
		'cmd args /path/to/asm <arguments for asm>'.
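A wrapper along these lines might look like the sketch below. It assumes the go command is given flags containing a fixed placeholder (the hypothetical string __SANDBOX__), and the wrapper swaps in the real working directory just before running the tool. The placeholder name and the rewriteArgs helper are invented for illustration, and whether this actually keeps the build ID stable has not been verified.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// placeholder is a hypothetical marker passed in flags instead of the
// real sandbox path, so the flags look the same on every run.
const placeholder = "__SANDBOX__"

// rewriteArgs returns a copy of args with every occurrence of
// placeholder replaced by root.
func rewriteArgs(args []string, root string) []string {
	out := make([]string, len(args))
	for i, a := range args {
		out[i] = strings.ReplaceAll(a, placeholder, root)
	}
	return out
}

func main() {
	// With -toolexec, the go command runs: wrapper /path/to/tool <tool args>.
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: wrapper /path/to/tool [args...]")
		return
	}
	root, err := os.Getwd()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Run the real tool with absolute paths substituted in.
	cmd := exec.Command(os.Args[1], rewriteArgs(os.Args[2:], root)...)
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

The idea is that the go command would only ever see the placeholder, while the compiler, assembler, and cgo would receive usable absolute paths.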

@steeve
Contributor

steeve commented Apr 2, 2018

Here is the list of packages whose build IDs don't match after two builds on Linux:

crypto/aes.a
crypto/elliptic.a
crypto/internal/cipherhw.a
crypto/md5.a
crypto/rc4.a
crypto/sha1.a
crypto/sha256.a
crypto/sha512.a
hash/crc32.a
internal/cpu.a
math.a
math/big.a
net.a
os/signal.a
os/signal/internal/pty.a
os/user.a
plugin.a
reflect.a
runtime.a
runtime/cgo.a
runtime/debug.a
runtime/internal/atomic.a
strings.a
sync/atomic.a
syscall.a
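A list like this can be produced by comparing the build IDs recorded for each archive in the two builds. The sketch below assumes the IDs have already been collected into maps keyed by archive name (for example by running go tool buildid on each .a file). The mismatched helper and the sample data are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// mismatched returns the sorted names from first whose build ID is
// missing from second or differs from the one recorded in second.
func mismatched(first, second map[string]string) []string {
	var out []string
	for name, id := range first {
		if other, ok := second[name]; !ok || other != id {
			out = append(out, name)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	// Invented build IDs for two runs; only the action ID part differs
	// for crypto/sha1.a.
	run1 := map[string]string{
		"crypto/sha1.a": "action1/contentA",
		"fmt.a":         "action2/contentB",
	}
	run2 := map[string]string{
		"crypto/sha1.a": "action9/contentA",
		"fmt.a":         "action2/contentB",
	}
	for _, name := range mismatched(run1, run2) {
		fmt.Println(name)
	}
}
```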

@steeve
Contributor

steeve commented Apr 2, 2018

The common attribute between those packages is that they are using assembly or cgo, for instance:
https://github.com/golang/go/tree/master/src/crypto/aes
https://github.com/golang/go/tree/master/src/hash/crc32
https://github.com/golang/go/tree/master/src/os/user

@steeve
Contributor

steeve commented Apr 2, 2018

@steeve
Contributor

steeve commented Apr 2, 2018

After removing the asmflags from the stdlib builder, the sha1.a package now has the same build ID, and is thus identical between two runs. Not sure how risky that is, though.

@steeve
Contributor

steeve commented Apr 2, 2018

The weird thing is that the gcflags don't seem to be affected by this, though.

@steeve
Contributor

steeve commented Jun 8, 2019

I think we can close this
