cmd/go, cmd/cgo: repeatable builds on Solaris #13247

Open
neild opened this Issue Nov 14, 2015 · 20 comments

@neild
Contributor

neild commented Nov 14, 2015

Binaries built with "go build" that use cgo, or that include packages that use cgo, contain references to a temporary directory. As a result, repeated builds of the same binary produce different output.

Simple reproduction:

// foo.go
package main

// #include <math.h>
// #cgo LDFLAGS: -lm
import "C"

import "fmt"

func main() {
        fmt.Println(C.sqrt(4))
}

$ go build foo.go && md5sum foo
d4cc4febe540953e8115417476adc4a4  foo
$ go build foo.go && md5sum foo
28f5f670a48e6f72a2f31405d9fbf2cc  foo
$ strings foo | grep go-build
/tmp/go-build847878379/command-line-arguments/_obj/_cgo_export.c
/tmp/go-build847878379/command-line-arguments/_obj/foo.cgo2.c
/tmp/go-build988195549/runtime/cgo/_obj/_cgo_export.c
/tmp/go-build988195549/runtime/cgo/_obj/cgo.cgo2.c

Some build systems require reproducible results: The same inputs should produce precisely the same outputs. The above behavior violates that requirement.

The problem appears to be that the gcc command invoked by "go build" includes the absolute path of the source file in $WORKDIR, which gcc then bakes into the resulting object file.

One fix might be to execute gcc from within $WORKDIR. There is, however, a comment in cmd/go/build.go indicating that the current behavior is intentional: "We always pass absolute paths of source files so that the error messages will include the full path to a file in need of attention."

Another possibility might be to use -fdebug-prefix-map to elide $WORKDIR from the debugging information written by gcc. I don't know if this can be generalized to other compilers.

@neild

Contributor

neild commented Nov 14, 2015

Change (*builder).ccompile to execute gcc from within $WORKDIR: Works, but something is still inserting $WORKDIR into the object file. I haven't yet figured out what.

$ strings bin | grep go-build
/tmp/go-build502746782/cgotest/_obj
/tmp/go-build159213004/runtime/cgo/_obj

Change (*builder).ccompilerCmd to include -fdebug-prefix-map: Works, but the compiler command line appears to be included in the object file somewhere:

$ strings bin | grep go-build
GNU C 4.8.4 -m64 -mtune=generic -march=x86-64 -g -O2 -fPIC -fno-working-directory -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build629323208=WORK -fstack-protector
GNU C 4.8.4 -m64 -mtune=generic -march=x86-64 -g -O2 -fPIC -fno-working-directory -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build738763988=WORK -fstack-protector

Also, -fdebug-prefix-map doesn't appear to be supported by clang, so that's probably not a useful approach to continue pursuing.

@ianlancetaylor ianlancetaylor added this to the Go1.6 milestone Nov 14, 2015

@ALTree

Member

ALTree commented Nov 14, 2015

@mdempsky

Member

mdempsky commented Nov 25, 2015

We could post-process the compiled .o file's DWARF debug strings to replace all occurrences of /tmp/go-build12345 with /tmp/go-buildXXXXX.
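A minimal sketch of that kind of rewrite, assuming the work directory always matches /tmp/go-build followed by digits, as in the paths shown earlier in this thread. It works on raw bytes rather than parsing DWARF, and the name scrubWorkDir is hypothetical. The digits are replaced one for one so that the string length, and therefore every offset in the file, stays the same.

```go
package main

import (
	"fmt"
	"regexp"
)

// workDirRE matches a go build work directory as it appears in the
// strings quoted in this issue (an assumption, not a guaranteed format).
var workDirRE = regexp.MustCompile(`/tmp/go-build[0-9]+`)

// scrubWorkDir returns a copy of data in which the numeric suffix of every
// work directory path is overwritten with the same number of 'X' bytes.
func scrubWorkDir(data []byte) []byte {
	prefixLen := len("/tmp/go-build")
	return workDirRE.ReplaceAllFunc(data, func(m []byte) []byte {
		out := append([]byte(nil), m...)
		for i := prefixLen; i < len(out); i++ {
			out[i] = 'X'
		}
		return out
	})
}

func main() {
	in := []byte("/tmp/go-build847878379/command-line-arguments/_obj/foo.cgo2.c")
	fmt.Printf("%s\n", scrubWorkDir(in))
}
```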

@mdempsky

Member

mdempsky commented Nov 26, 2015

Hm, post-processing the DWARF debug strings doesn't seem to work reliably either, as the ordering of the debug strings is also influenced by the actual build directory name:

$ go build -o foo1 foo.go
$ go build -o foo2 foo.go
$ diff -u <(llvm-dwarfdump-3.5 foo1) <(llvm-dwarfdump-3.5 foo2)
[...]
@@ -134762,9 +134762,9 @@
 0x0000009e: "GNU C 4.8.4 -m64 -mtune=generic -march=x86-64 -g -O2 -fPIC -fmessage-length=0 -fstack-protector"
 0x000000fe: "short int"
 0x00000108: "complex double"
-0x00000117: "_cgo_e6732a3e38c3_Cfunc_sqrt"
-0x00000134: "_cgo_topofstack"
-0x00000144: "/tmp/go-buildXXXXXXXXX/command-line-arguments/_obj/foo.cgo2.c"
+0x00000117: "/tmp/go-buildXXXXXXXXX/command-line-arguments/_obj/foo.cgo2.c"
+0x00000155: "_cgo_e6732a3e38c3_Cfunc_sqrt"
+0x00000172: "_cgo_topofstack"
 0x00000182: "stktop"
 0x00000189: "/tmp/go-buildXXXXXXXXX/runtime/cgo/_obj/_cgo_export.c"
 0x000001bf: "/tmp/go-buildXXXXXXXXX/runtime/cgo/_obj/cgo.cgo2.c"
@mdempsky

Member

mdempsky commented Nov 26, 2015

BTW, the temporary directory generated by cmd/link for external linking is also being embedded into the output files (as a FILE symbol in the ELF .symtab):

$ strings foo1 foo2 | grep go-link
/tmp/go-link-676379844/go.o
/tmp/go-link-961034572/go.o
@rsc

Contributor

rsc commented Dec 17, 2015

I looked briefly into this but it's like gcc is actively working against us. I love the fact that the -fdebug-prefix-map command-line argument gets embedded into the binary when the whole point of that flag is to specify something to keep out.

https://go-review.googlesource.com/17943 has my work in progress. Until gcc or clang wants to cooperate I don't see much point in heroics. I do see that clang seems to have added support for -fdebug-prefix-map very recently, not that we can rely on it being available.

@rsc rsc modified the milestones: Unplanned, Go1.6 Dec 17, 2015

@neild

Contributor

neild commented Feb 9, 2016

The magic additional incantation appears to be -gno-record-gcc-switches to keep the flags out of the binary.

@rsc

Contributor

rsc commented Feb 9, 2016

@rsc rsc modified the milestones: Go1.7Early, Unplanned Feb 9, 2016

@gopherbot

gopherbot commented Feb 18, 2016

CL https://golang.org/cl/19363 mentions this issue.

@gopherbot gopherbot closed this in 5bbb98d Feb 18, 2016

@neild

Contributor

neild commented Feb 18, 2016

Fixed with a caveat: The system compiler needs to support -fdebug-prefix-map for builds to be reproducible. That's gcc 4.3 and clang 3.8 (not released yet).
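
As a rough illustration of that caveat, the sketch below encodes the two version thresholds quoted above (gcc 4.3, clang 3.8) as a simple check. The function name supportsDebugPrefixMap is hypothetical, and this is not how cmd/go decides; probing the compiler by actually passing it the flag would be more robust than comparing version numbers.

```go
package main

import "fmt"

// supportsDebugPrefixMap reports whether the given compiler version is at
// least the minimum quoted in this thread for -fdebug-prefix-map support.
// Compilers other than "gcc" and "clang" are conservatively reported as
// unsupported.
func supportsDebugPrefixMap(compiler string, major, minor int) bool {
	var minMajor, minMinor int
	switch compiler {
	case "gcc":
		minMajor, minMinor = 4, 3
	case "clang":
		minMajor, minMinor = 3, 8
	default:
		return false
	}
	return major > minMajor || (major == minMajor && minor >= minMinor)
}

func main() {
	fmt.Println("gcc 4.8:", supportsDebugPrefixMap("gcc", 4, 8))
	fmt.Println("clang 3.5:", supportsDebugPrefixMap("clang", 3, 5))
}
```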

@neild

Contributor

neild commented Feb 24, 2016

Reopening, because Solaris builds remain inconsistent.

@neild neild reopened this Feb 24, 2016

gopherbot pushed a commit that referenced this issue Feb 24, 2016

cmd/go: skip consistent cgo build test on Solaris.
See #13247.

Change-Id: I06636157028d98430eb29277c822270592907856
Reviewed-on: https://go-review.googlesource.com/19910
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
@binarycrusader

Contributor

binarycrusader commented Apr 12, 2016

Is this still a problem after my fix for #14957?

I can't reproduce the issue locally with my fix applied:
$ go build foo.go && md5sum foo; rm -f foo
2d826543fb363898a060dce93121cb46 foo
$ go build foo.go && md5sum foo; rm -f foo
2d826543fb363898a060dce93121cb46 foo

Historically, I would note that the Solaris linker would insert the working directory if no STT_FILE was found, but that's no longer true in current Solaris releases.

So I think the problem fixed in #14957 is what kept the changes made for this bug from working on Solaris.

@bradfitz

Member

bradfitz commented May 5, 2016

Sounds like this is fixed then.

@bradfitz bradfitz closed this May 5, 2016

@ALTree

Member

ALTree commented May 5, 2016

I believe #9206 too can be closed.

@gopherbot

gopherbot commented Jun 3, 2016

CL https://golang.org/cl/23741 mentions this issue.

gopherbot pushed a commit that referenced this issue Jun 3, 2016

Mikio Hara
cmd/go: re-enable TestCgoConsistentResults on solaris
Updates #13247.

Change-Id: If5e4c9f4db05f58608b0eeed1a2312a04015b207
Reviewed-on: https://go-review.googlesource.com/23741
Run-TryBot: Mikio Hara <mikioh.mikioh@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@mikioh

Contributor

mikioh commented Jun 3, 2016

No, this issue is not fixed on Solaris. See https://build.golang.org/log/efb56d2bc3049a131016966cff929c44137968d8

@mikioh mikioh reopened this Jun 3, 2016

@mikioh mikioh added this to the Go1.7Maybe milestone Jun 3, 2016

@mikioh mikioh removed this from the Go1.7Early milestone Jun 3, 2016

@rasky

Member

rasky commented Jun 5, 2016

I can also confirm that #9206 cannot be closed; I can still reproduce the issue.

@ianlancetaylor

Contributor

ianlancetaylor commented Jun 9, 2016

Postponing until 1.8.

@ianlancetaylor ianlancetaylor modified the milestones: Go1.8, Go1.7Maybe Jun 9, 2016

@gopherbot

gopherbot commented Jun 24, 2016

CL https://golang.org/cl/24460 mentions this issue.

gopherbot pushed a commit that referenced this issue Jun 24, 2016

cmd/pprof: ignore symbols with address 0 and size 0
Handling a symbol with address 0 and size 0, such as an ELF STT_FILE
symbols, was causing us to disassemble the entire program.  We started
adding STT_FILE symbols to help fix issue #13247.

Fixes #16154.

Change-Id: I174b9614e66ddc3d65801f7c1af7650f291ac2af
Reviewed-on: https://go-review.googlesource.com/24460
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>

@quentinmit quentinmit added the NeedsFix label Oct 6, 2016

@rsc rsc changed the title from cmd/go: cgo-using binaries do not build reproducibly to cmd/go, cmd/cgo: repeatable builds on Solaris Oct 21, 2016

@rsc rsc modified the milestones: Go1.9, Go1.8 Oct 21, 2016

@mikioh mikioh added the OS-Solaris label Dec 21, 2016

@rsc

Contributor

rsc commented Jun 22, 2017

Confirmed this is still a problem on Solaris:

$ VM=$(gomote create solaris-amd64-smartosbuildlet)
$ gomote run $VM go/src/make.bash
...
$ gomote run $VM /bin/bash -c '
	/tmp/workdir/go/bin/go build -ldflags=-linkmode=external -o /tmp/gofmt1 cmd/gofmt; 
	/tmp/workdir/go/bin/go build -ldflags=-linkmode=external -o /tmp/gofmt2 cmd/gofmt; 
	od -t x1 /tmp/gofmt1 >/tmp/gofmt1.o
	od -t x1 /tmp/gofmt2 >/tmp/gofmt2.o
	diff /tmp/gofmt[12].o
'
4906c4906
< 0231220 2d 62 75 69 6c 64 30 36 37 31 39 35 33 33 35 2f
---
> 0231220 2d 62 75 69 6c 64 39 32 31 38 35 38 32 39 32 2f
140441c140441
< 10457460 f8 fd ff 6f 00 00 00 00 cb f3 00 00 00 00 00 00
---
> 10457460 f8 fd ff 6f 00 00 00 00 d2 f3 00 00 00 00 00 00
154056c154056
< 11336120 2d 62 75 69 6c 64 30 36 37 31 39 35 33 33 35 2f
---
> 11336120 2d 62 75 69 6c 64 39 32 31 38 35 38 32 39 32 2f
Error running run: exit status 1
$

The first and third diffs are the work tempdir leaking into the binary. I'm not sure what the second diff is. I have no solution. If someone knows how to tell gcc on Solaris not to record the absolute object paths used during a link, please let us know.
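
To make the first and third diffs easier to read, here is a small sketch that decodes the hex byte columns of the od output above back into text (the leading offset column is left out). The helper name decodeOdBytes is hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// decodeOdBytes converts space-separated two-digit hex bytes, as printed by
// "od -t x1" without the offset column, into the string they encode.
func decodeOdBytes(line string) (string, error) {
	fields := strings.Fields(line)
	buf := make([]byte, 0, len(fields))
	for _, f := range fields {
		b, err := strconv.ParseUint(f, 16, 8)
		if err != nil {
			return "", err
		}
		buf = append(buf, byte(b))
	}
	return string(buf), nil
}

func main() {
	lines := []string{
		"2d 62 75 69 6c 64 30 36 37 31 39 35 33 33 35 2f",
		"2d 62 75 69 6c 64 39 32 31 38 35 38 32 39 32 2f",
	}
	for _, l := range lines {
		s, err := decodeOdBytes(l)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%q\n", s)
	}
	// Prints:
	// "-build067195335/"
	// "-build921858292/"
}
```

The two decoded fragments are the tails of two different go-build work directory names, which matches the explanation that the tempdir is what differs between the binaries.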

@rsc rsc modified the milestones: Go1.10, Go1.9 Jun 22, 2017

@rsc rsc modified the milestones: Go1.10, Go1.11 Nov 22, 2017

@gopherbot gopherbot modified the milestones: Go1.11, Unplanned May 23, 2018
