Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/go: build ID depends on the full path of Go files passed on the command line #48557

Open
abarisani opened this issue Sep 22, 2021 · 20 comments
Open
Assignees
Labels
Milestone

Comments

@abarisani
Copy link

@abarisani abarisani commented Sep 22, 2021

When failing to create reproducible builds using go1.17.1, despite our best effort in ensuring that compiler and compilation flags are identical as well as GOROOT, we identified that the Build ID is dependent on the output path.

What version of Go are you using (go version)?

$ go version 1.17.1

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output

$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/lcars/.cache/go-build"
GOENV="/home/lcars/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/lcars/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/lcars/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/lib/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.17.1"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/mnt/git/public/crucible/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build1855656758=/tmp/go-build -gno-record-gcc-switches"

What did you do?

$  cd /tmp/foo
$  CGO_ENABLED=0 go build -trimpath caam_tool.go && sha256sum caam_tool.go && sha256sum caam_tool
5ca6fadd863de23e3b6282d81d0af53ea328e0cdd60308299f0c6264e7de9236  caam_tool.go
8ce70082d6a861b9897a3d8b8ed6f98b18d71b764d8ca30109d016834d6e2486  caam_tool
$ cd /tmp/bar
$  CGO_ENABLED=0 go build -trimpath caam_tool.go && sha256sum caam_tool.go && sha256sum caam_tool
5ca6fadd863de23e3b6282d81d0af53ea328e0cdd60308299f0c6264e7de9236  caam_tool.go
47725ce3a6096352c9ec116a72bb423ccdb1652477bc871f3b5bb6b5076cdbf4  caam_tool

What did you expect to see?

We do not expect the build ID to be affected by the output path when building identical code under the same compiler version and flags.

What did you see instead?

The output binaries are identical with the sole exception of the build ID:

251,254c251,254
< 00000fa0: 5300 0000 0400 0000 476f 0000 4d56 7235  S.......Go..MVr5
< 00000fb0: 657a 497a 4f5a 334c 7959 5170 7748 4351  ezIzOZ3LyYQpwHCQ
< 00000fc0: 2f6d 3563 5a49 6265 2d72 5249 5938 5865  /m5cZIbe-rRIY8Xe
< 00000fd0: 4249 3846 452f 7a68 4e4d 646e 4b38 5045  BI8FE/zhNMdnK8PE
---
> 00000fa0: 5300 0000 0400 0000 476f 0000 337a 445a  S.......Go..3zDZ
> 00000fb0: 6a67 6a46 6e5a 5f62 4353 764e 4b58 704b  jgjFnZ_bCSvNKXpK
> 00000fc0: 2f49 5f59 6645 6470 4f34 796d 4937 776f  /I_YfEdpO4ymI7wo
> 00000fd0: 4e4b 6248 662f 7a68 4e4d 646e 4b38 5045  NKbHf/zhNMdnK8PE
@ALTree
Copy link
Member

@ALTree ALTree commented Sep 22, 2021

Previously: #33772.

You can use -ldflags=-buildid= (sets buildid to the empty string) as a temporary workaround, in the meantime.

@dr2chase
Copy link
Contributor

@dr2chase dr2chase commented Sep 22, 2021

Are you doing this with a go.mod file in the directory?
And can you show the output of go tool buildid caam_tool, both ways?

@bcmills is this one for you?

@abarisani
Copy link
Author

@abarisani abarisani commented Sep 23, 2021

This happens regardless of go.mod or not, this specific example is not using any external packages (see caam_tool.go).

The buildid is the same shown in my report (the binary diff), anyway here it is:

$ cd /tmp/foo && go tool buildid caam_tool
3zDZjgjFnZ_bCSvNKXpK/I_YfEdpO4ymI7woNKbHf/zhNMdnK8PE2EMdNQVWjC/4co-SLWfCN7tlCcwHIGV

$ cd /tmp/bar &&  go tool buildid caam_tool
MVr5ezIzOZ3LyYQpwHCQ/m5cZIbe-rRIY8XeBI8FE/zhNMdnK8PE2EMdNQVWjC/4co-SLWfCN7tlCcwHIGV

@seankhliao
Copy link
Contributor

@seankhliao seankhliao commented Sep 23, 2021

I think it;s the way the arguments are passed (list of files), building a directory/package within a module results in consistent build ids

@abarisani
Copy link
Author

@abarisani abarisani commented Sep 23, 2021

The presence of a go.mod or not does not change the outcome.

If you are talking about modules installed with go install does do not enforce trimpath, so they would not get the same build ID (across different machines) for other reasons.

@seankhliao
Copy link
Contributor

@seankhliao seankhliao commented Sep 23, 2021

no, but go build -trimpath . (which is only possible when you have a go.mod) will result in the same build IDs

@abarisani
Copy link
Author

@abarisani abarisani commented Sep 23, 2021

With a go.mod'ed project I get different build IDs depending on the location:

$ cd /tmp/foo
$ git clone https://github.com/f-secure-foundry/crucible && cd crucible
$ make CGO_ENABLED=0 BUILD=test crucible
go build -v \
  -trimpath -ldflags "-s -w -X 'main.Revision=ac44e57' -X 'main.Build=test'" \
  cmd/crucible/crucible.go
command-line-arguments
compiled crucible ac44e57 (test)
$ sha256sum crucible 
64b97114614fa3315742fd61f3a0ae2ab5f65bb91550e1232650440d2b267981  crucible

$ cd /tmp/bar
$ git clone https://github.com/f-secure-foundry/crucible && cd crucible
$ make CGO_ENABLED=0 BUILD=test crucible
go build -v \
  -trimpath -ldflags "-s -w -X 'main.Revision=ac44e57' -X 'main.Build=test'" \
  cmd/crucible/crucible.go
command-line-arguments
compiled crucible ac44e57 (test)
$ sha256sum crucible 
b51b76f53279df695722ce54bbf079243414215b045174955c9f02584dab4e4a  crucible

However the earlier caam_tool.go example using go build -trimpath . after go mod init does lead to the same build.

The rules are not clear to me however, what's the difference between the two cases. In my mind, they should behave identically.

@FiloSottile FiloSottile changed the title Build ID depends on the output path. cmd/go: build ID depends on the output path Sep 23, 2021
@FiloSottile
Copy link
Contributor

@FiloSottile FiloSottile commented Sep 23, 2021

We did some debugging with @abarisani and we realized it's not the output path that ends up hashed in the Build ID, but the full path of the input files, because go build cmd/crucible/crucible.go uses explicit Go files on the command line (the synthetic command-line-arguments package).

With -trimpath, the paths to these files should be made relative to the main module (assuming the files are within it). In fact, it looks like they are relative as embedded in the binary, because otherwise we'd see other differences, so it's probably a BuildID-specific issue.

(I wish we were more aggressive in discouraging the use of Go files on the command line. A lot of things, from build tags to multi-file packages if a file is left out, subtly breaks.)

@FiloSottile FiloSottile changed the title cmd/go: build ID depends on the output path cmd/go: build ID depends on the full path of Go files passed on the command line Sep 23, 2021
@FiloSottile
Copy link
Contributor

@FiloSottile FiloSottile commented Sep 23, 2021

It would make this debugging much easier if go tool buildid had a -v flag that listed all that gets hashed. Should I open an issue for it?

@jayconrod
Copy link
Contributor

@jayconrod jayconrod commented Sep 23, 2021

Setting GODEBUG=gocachehash=1 when building a package should cause the go command to print everything that goes into cache keys. go tool buildid won't have that information. go help cache says a bit more about this.

@jayconrod
Copy link
Contributor

@jayconrod jayconrod commented Sep 23, 2021

I tried this in a quick example. Here's the cache key I'm seeing:

HASH[build command-line-arguments]
HASH[build command-line-arguments]: "devel go1.18-8ff94bbecb Wed Sep 22 13:48:21 2021 -0700"
HASH[build command-line-arguments]: "compile\n"
HASH[build command-line-arguments]: "goos linux goarch amd64\n"
HASH[build command-line-arguments]: "import \"command-line-arguments\"\n"
HASH[build command-line-arguments]: "omitdebug false standard false local true prefix \"_/home/jay/Code/test/a\"\n"
HASH[build command-line-arguments]: "trimpath\n"
HASH[build command-line-arguments]: "modinfo \"path\\tcommand-line-arguments\\n\"\n"
HASH[build command-line-arguments]: "compile tdLvclvHAV-OCkdMX9DV [] []\n"
HASH[build command-line-arguments]: "GOAMD64=v1\n"
HASH /home/jay/Code/test/a/main.go: 55a60bb97151b2b4b680462447ce60ec34511b14fa10d77440c97b9777101566
HASH[build command-line-arguments]: "file main.go VaYLuXFRsrS2gEYkR85g\n"
HASH[build command-line-arguments]: "import runtime -vV7n2SyuxUHQNgBVBri\n"
HASH[build command-line-arguments]: 7b6cdecf56c1e3618c9976560d07fa292928f32ba7b821fa31d29891f8114128

The bold string above is an absolute path to the package's local prefix, which is used to resolve relative imports like:

import "../foo"

Those are only allowed in packages specified as a list of files on the command line. I agree with @FiloSottile: I wish these weren't a thing.

I can't think of a good reason for the local prefix to be part of the cache key, and it doesn't really seem like it should be a distinct field from the package directory anyway. By the time the go command computes the cache key, it should already have resolved imports, and the output ids from those imports will be part of the cache key.

We do pass the local prefix to the compiler in gc.go. Not sure why the compiler needs it or if it affects the output.

@bcmills
Copy link
Member

@bcmills bcmills commented Sep 23, 2021

We do pass the local prefix to the compiler in gc.go. Not sure why the compiler needs it or if it affects the output.

Let's try not doing that and see what breaks! 😁

@dr2chase
Copy link
Contributor

@dr2chase dr2chase commented Sep 23, 2021

The flag is consumed here:
https://cs.opensource.google/go/go/+/refs/tags/go1.17.1:src/cmd/compile/internal/noder/import.go;l=160
Any absolute paths are certainly visible to the compiler; w/o them, debugging gets a little harder.

@bcmills
Copy link
Member

@bcmills bcmills commented Sep 23, 2021

Yeah. Perhaps we can start by just not passing that flag when -trimpath is set.

@jayconrod
Copy link
Contributor

@jayconrod jayconrod commented Sep 23, 2021

@dr2chase Does the -D flag have any influence on the output, or is it only used in error messages?

I'd expect that cmd/go could pass in an importcfg file that just maps import strings to .a file paths. Ideally the compiler wouldn't need -D at all when -importcfg is used.

@dr2chase
Copy link
Contributor

@dr2chase dr2chase commented Sep 23, 2021

I am not sure. There's a lot going on. The fact that debugging seems to be able to find the source file (in the default case, at least) suggests that at least some absolute paths are embedded in the debugging information. E.g.:

Breakpoint 1 (enabled) set at 0x1054c8a for main.main() /Users/drchase/work/wtf/wtf.go:8

(this in a an executable that I copied into /tmp/bar, also debugged in that directory)

@dr2chase
Copy link
Contributor

@dr2chase dr2chase commented Sep 23, 2021

But with -go build -trimpath ., it declares that the file is in wtf/wtf.go

@dr2chase
Copy link
Contributor

@dr2chase dr2chase commented Sep 23, 2021

The leading wtf is the directory prefix, not the module name. I am not sure why, if we put a directory there, wtf is any better than ..

Debugging in the directory where I built it, delve is kinda puzzled:

cd $wtf
# drchase-macbookpro1 ~/work/wtf 21-09-23 17:47:52
dlv exec ./wtf
Type 'help' for list of commands.
(dlv) b main.main
Breakpoint 1 (enabled) set at 0x1054c8a for main.main() wtf/wtf.go:8
(dlv) c
> main.main() wtf/wtf.go:8 (hits goroutine(1):1 total:1) (PC: 0x1054c8a)
Warning: debugging optimized function
(dlv) list
> main.main() wtf/wtf.go:8 (hits goroutine(1):1 total:1) (PC: 0x1054c8a)
Warning: debugging optimized function
Command failed: open wtf/wtf.go: not a directory

@gopherbot
Copy link

@gopherbot gopherbot commented Sep 28, 2021

Change https://golang.org/cl/352810 mentions this issue: cmd/go: do not pass a local prefix to the compiler in module mode

@bcmills
Copy link
Member

@bcmills bcmills commented Sep 28, 2021

CL 352810 should fix this in module mode.

A fix in GOPATH mode would be much more invasive and doesn't seem worth doing: as far as I can tell this isn't a regression from previous Go releases, and triggering the problematic behavior would require the combination of GOPATH mode and local imports — each of which should be rare and becoming rarer.

@bcmills bcmills self-assigned this Sep 28, 2021
@bcmills bcmills added this to the Go1.18 milestone Sep 28, 2021
gopherbot pushed a commit that referenced this issue Oct 6, 2021
In GOPATH mode, source files may import other packages using relative
(“local”) paths. In module mode, relative imports are never allowed:
import paths must always be fully specified.

When local imports are allowed, we pass a local-import prefix to the
compiler using the '-D' flag. That could theoretically change the
compiler's output, so it must be included in the cache key even when
-trimpath is set. (TODO: when -trimpath is set, the local-import
prefix ought to be trimmed anyway, so it still shouldn't matter.)

However, when local imports are disallowed, we should not pass the
local-import prefix and it should not affect cmd/go's cache key or the
final build ID of any artifact.

For #48557

Change-Id: I2d627d67d13e5da2cac6d411cd4e2d87e510876c
Reviewed-on: https://go-review.googlesource.com/c/go/+/352810
Trust: Bryan C. Mills <bcmills@google.com>
Run-TryBot: Bryan C. Mills <bcmills@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Jay Conrod <jayconrod@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
8 participants