New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/go: support reproducible builds regardless of build path? #16860

Open
infinity0 opened this Issue Aug 24, 2016 · 23 comments

Comments

Projects
None yet
@infinity0

infinity0 commented Aug 24, 2016

It would be good if go was able to generate bit-for-bit identical binaries, even if the build environment changes in unimportant ways. (For example, so users can reproduce binaries without requiring root.)

Up until recently, we have been able to reproduce go binaries whilst modifying many parts of the build environment, except that we kept the build-path constant. However recently, we started to also vary the build-path. (Why "recently", is irrelevant to this report, but I can go into it if you ask.)

Anyway, now we can see that .gopclntab embeds the build path into the resulting binary. This patch will get rid of this environment-specific information, but I know it's not perfect:

--- src/cmd/link/internal/ld/pcln.go
+++ src/cmd/link/internal/ld/pcln.go
@@ -41,6 +41,8 @@

 // iteration over encoded pcdata tables.

+var cwd, _ = os.Getwd()
+
 func getvarint(pp *[]byte) uint32 {
    v := uint32(0)
    p := *pp
@@ -152,7 +154,8 @@
            f.Value = int64(ctxt.Nhistfile)
            f.Type = obj.SFILEPATH
            f.Next = ctxt.Filesyms
-           f.Name = expandGoroot(f.Name)
+           //f.Name = expandGoroot(f.Name)
+           f.Name, _ = filepath.Rel(cwd, f.Name)
            ctxt.Filesyms = f
        }
    }

In particular, I'm not sure how this will interfere with readers of this information. I know that src/debug/pclntab.go carries an API for this, but I'm not sure what sorts of contracts you have published for that, that people expect.

Also, the part where I comment out expandGoroot is not strictly necessary for reproducibility, but would allow, e.g. a user to try to reproduce a binary from his own go toolchain - which is some extra assurance that the copies behave the same way.

The call to Getwd during static initialisation is also a bit dirty; I could delay this for later but that may or may not require locks, or pass it in from an parent/ancestor caller but that would require adding extra arguments to some functions.

However if you give me some guidelines on how to make this patch acceptable, I'll be happy to do this work and submit a PR.

@ianlancetaylor

This comment has been minimized.

Contributor

ianlancetaylor commented Aug 24, 2016

When you say that you are varying the build path, what do you mean precisely?

This seems like a restatement of #9206.

@bradfitz

This comment has been minimized.

Member

bradfitz commented Aug 24, 2016

Yeah, let's merge conversation in #9206.

@infinity0, instructions for sending a change are at https://golang.org/doc/contribute.html#Code_review which looks like a wall of text, but it's not many actual steps. Be sure to check errors, not have commented-out code, and include tests if possible.

@bradfitz

This comment has been minimized.

Member

bradfitz commented Aug 24, 2016

Reopening, since this is slightly different from #9206.

@bradfitz bradfitz reopened this Aug 24, 2016

@bradfitz bradfitz changed the title from Bit-for-bit deterministic / reproducible builds to cmd/go: support reproducible builds regardless of build path? Aug 24, 2016

@ianlancetaylor

This comment has been minimized.

Contributor

ianlancetaylor commented Aug 24, 2016

Thanks for the explanation (over on #9206). It's not clear to me that this should be a goal. I agree that reproducible builds are essential. However, it's not clear to me that reproducible builds when the sources are in different directories are essential.

For example, exactly the same thing happens when using GCC with the -g option. The source directory is included in the debug info, in the DW_AT_comp_dir attribute.

Not including the source directory will make debugging more difficult.

@infinity0

This comment has been minimized.

infinity0 commented Aug 24, 2016

For GCC, we are setting -fdebug-prefix-map=$SRC_ROOT=. dynamically for most builds, which removes it from DW_AT_comp_dir. Recently we landed a patch in GCC to make it not embed debug-prefix-map into DW_AT_producer, so this now works to achieve buildpath-independent reproducibility.

Yes, we're being quite strict with ourselves in trying to make things independent of the build path, but we think it's a good goal that would allow more people in practice to perform build verification.

Ideally, we'd want as many people to rebuild the same hash as possible, so that the rest of the world (who we assume don't want to do these rebuilds) can see that 20 people built the same hash, rather than 5 people built 4 different hashes. It is indeed unclear at the moment what the "best tradeoff" is - we're just scratching the surface of this topic ourselves - but this particular issue seemed fairly easy to me to fix.

(edit: more accurate to say SRC_ROOT instead of PWD; we set it once at the start of the build when they happen to be equal)

@infinity0

This comment has been minimized.

infinity0 commented Aug 24, 2016

Yes, "making debugging more difficult" was also my concern. FWIW, our experiments with GCC did not make things harder to debug (this was by chance, we had to try it out to see it). I can appreciate that Go is different, but if you can explain the details I could also try to think of solutions.

@jmikedupont2

This comment has been minimized.

jmikedupont2 commented Oct 23, 2016

I think this is important. Debugging is also very important. From what I understand this is the issue of two users with different goroots set. Renaming of a module/forking it to a new repo name would not be supported. So, is the question how to share debug information? How to relate debug information created in one root to another root? If you are in the same root as the debug info was created it would be nice if we could strip out all the stuff outside the root.

@whyrusleeping

This comment has been minimized.

whyrusleeping commented Mar 31, 2017

Couldnt the paths stored in the binary be somehow prefixed with $GOPATH so that when i'm debugging a binary it uses code in my gopath correctly?

@rhysh

This comment has been minimized.

Contributor

rhysh commented Apr 1, 2017

The compiler has a -trimpath flag, which might be close to what you need. When it's set to e.g. $GOPATH/src, the paths included in the binary's debug info will be relative paths with that prefix (and a /) removed. This is handy for use with go tool pprof, which will search for the relevant code in your GOPATH. Other tools might have similar behaviors when the debug info contains relative paths.

Since the -trimpath flag is on the compiler (and assembler), it works by modifying the contents of the .a files (stored in $GOPATH/pkg).

The $GOROOT_FINAL environment variable (currently) controls the path prefix used for debug info of packages within $GOROOT. With these two features combined, a command like env GOROOT_FINAL=/usr/local/go go install -a -gcflags="-trimpath=$GOPATH/src" . (untested) should normalize or remove all file paths in the debug info, assuming that $GOPATH refers to a single directory.

There are at least two other ways that directory names end up in binaries, even with cgo disabled. First, DW_AT_comp_dir, which when I last tested on OSX was set to the current working directory used for the linker command. This may be normalizable by starting with cd / and using fully-qualified paths for the build. Second, the runtime.GOROOT function is backed by a file that's generated by ./src/make.bash. That file contains the value of $GOROOT when make.bash was executed (when the Go toolchain was built). If bit-for-bit reproducible builds are more important to you than the complexity of managing custom $GOROOT settings, that might help too.

These settings combined can get very close to bit-for-bit reproducible builds. The Go toolchain's build id also needs to be constrained, maybe via -ldflags=-buildid=xxx, but I haven't figured out how to make its value be consistent across machines automatically.

To be clear, I don't recommend using these knobs ... unless their value is worth their complexity cost for your application, and you're prepared to own the pieces when it stops working or changes behavior on some future release.

@stapelberg

This comment has been minimized.

Contributor

stapelberg commented Jul 29, 2017

Thanks for the detailed reply, @rhysh.

We started specifying the -trimpath flag with version 1.22 of our build helper dh-golang (see https://anonscm.debian.org/cgit/pkg-go/packages/dh-golang.git/log/?h=debian/1.22), which just entered Debian unstable. In my tests, the flag worked as expected, but let’s see how it goes… :)

I also prepared a patch which makes cmd/link honor the BUILD_PATH_PREFIX_MAP environment variable (see https://reproducible-builds.org/specs/build-path-prefix-map/ for the spec), eliminating build paths from the DW_AT_comp_dir field and DWARF file tables. I uploaded it to our reproducible builds infrastructure and successfully verified it works on a number of packages.

I’ll send it for review once the merge window opens after the Go 1.9 release. This will at least give us some code to further inform the discussion.

@gopherbot

This comment has been minimized.

gopherbot commented Oct 26, 2017

Change https://golang.org/cl/73291 mentions this issue: cmd/link: apply BUILD_PATH_PREFIX_MAP

@rsc

This comment has been minimized.

Contributor

rsc commented Nov 6, 2017

I think we should look at this for Go 1.11. Updated milestone.

@stapelberg

This comment has been minimized.

Contributor

stapelberg commented Nov 9, 2017

@rsc Thanks for clarifying over on #22491 and here.

Just to be clear, when you say “we should look at this”, are you saying the Go team will have a look, or would you rely on external contributions?

@rsc

This comment has been minimized.

Contributor

rsc commented Nov 9, 2017

I mean I think the Go team should try to make this happen. If external contributors do the work, that's fine. But if not then the Go team should dedicate time to it. If you're interested, that's great. Please discuss the design here before sending a CL. Thanks.

@stapelberg

This comment has been minimized.

Contributor

stapelberg commented Nov 9, 2017

I think it’d be better if the Go team spent some time on this, I’m already quite over-subscribed. Just wanted to make sure this doesn’t fall through the cracks. Thanks!

steeve added a commit to znly/rules_go that referenced this issue Apr 17, 2018

Ensure cpp_flags adds fdebug-prefix-map to make DWARF reproducible
See golang/go#16860 (comment)
for more info.

Signed-off-by: Steeve Morin <steeve.morin@gmail.com>

steeve added a commit to znly/rules_go that referenced this issue Apr 17, 2018

Ensure cpp_flags adds fdebug-prefix-map to make DWARF reproducible
See golang/go#16860 (comment)
for more info.

Signed-off-by: Steeve Morin <steeve.morin@gmail.com>

jayconrod added a commit to bazelbuild/rules_go that referenced this issue Apr 18, 2018

Make the stdlib reproducible by silencing the build id (#1440)
* Make the stdlib reproducible by silencing the build id

When go install'ing the stdlib, leverage -toolexec to remove the
-buildid flag from compile invocations.

Signed-off-by: Steeve Morin <steeve.morin@gmail.com>

* Ensure cpp_flags adds fdebug-prefix-map to make DWARF reproducible

See golang/go#16860 (comment)
for more info.

Signed-off-by: Steeve Morin <steeve.morin@gmail.com>
@rsc

This comment has been minimized.

Contributor

rsc commented Aug 10, 2018

Didn't get to this for Go 1.11. For Go 1.12 then. It's important.

@bradfitz

This comment has been minimized.

Member

bradfitz commented Oct 3, 2018

/cc @hugelgupf who was just asking about this. To see what we do inside Google, see http://go/gobuildpwdhack which sets the PWD environment to /proc/self/cwd/../google3 (which works on Linux, but isn't a portable solution)

@whyrusleeping

This comment has been minimized.

whyrusleeping commented Oct 3, 2018

@bradfitz I assume that link is only clickable by googlers?

@bradfitz

This comment has been minimized.

Member

bradfitz commented Oct 3, 2018

Yup. That's why I also described it.

@fd0

This comment has been minimized.

fd0 commented Oct 4, 2018

There may be a bug in Go 1.11: when building with Modules and using -trimpath, a new, temporary path to _gomod_.go is embedded into the binary, I've reported this as #28008

@stapelberg This may also be a problem in Debian sooner or later...

hnakamur added a commit to hnakamur/golang-deb that referenced this issue Nov 3, 2018

make builds reproducible by honoring BUILD_PATH_PREFIX_MAP
Upstream has rejected the patch in this form and promised to implement an
alternative they are happy with instead. That hasn't happened yet though.
Bug: golang/go#22491, golang/go#16860
Forwarded: https://golang.org/cl/73291 (rejected upstream though)
Last-Update: 2018-02-08

Gbp-Pq: Name 0001-reproducible-BUILD_PATH_PREFIX_MAP.patch
@bcmills

This comment has been minimized.

Member

bcmills commented Nov 15, 2018

As noted in #24976 (comment), this may require a change to the trimpath flag definition in cmd/compile.

@rsc

This comment has been minimized.

Contributor

rsc commented Nov 20, 2018

Go 1.12 got away from us due to the late start and other unanticipated time theft. I'll try to work on this during the freeze to be ready for early-in-cycle for Go 1.13.

@rsc rsc modified the milestones: Go1.12, Go1.13 Nov 20, 2018

@wmark

This comment has been minimized.

wmark commented Dec 6, 2018

I've submitted a patch for this some months ago: https://go-review.googlesource.com/c/go/+/65233/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment