cmd/compile: builds not reproducible on single-cpu vs multiple-cpu machines #38068

@mjl-

What version of Go are you using (go version)?

$ go version
go1.14.1

Also tested with all versions from go1.13 and up, same problem.

Does this issue reproduce with the latest release?

Latest release, yes. Haven't tried tip.

What operating system and processor architecture are you using (go env)?

I ran into this when building the same code on many different machines: freebsd12 freebsd/386, debian10-stable linux/386, ubuntu18lts linux/amd64, debian-unstable linux/amd64, macos-latest darwin/amd64.

What did you do?

I built the same code on all the machines, with options that should result in the exact same binary, but got different results on some of the machines. I tracked it down to the machines with a single CPU producing a different binary. Building with GO19CONCURRENTCOMPILATION=0 on the multi-CPU machines resulted in binaries identical to the ones produced by the single-CPU machines.

Here is an example to run on a multi-CPU machine (I used linux/amd64, but it should not make a difference):

git clone https://github.com/golang/tools
cd tools
git checkout a49f79bcc2246daebe8647db377475ffc1523d7b  # latest at time of writing, no special meaning
cd cmd/present

# build
GO111MODULE=on CGO_ENABLED=0 GOOS=linux GOARCH=amd64 $HOME/sdk/go1.14.1/bin/go build -trimpath -a -ldflags=-buildid=

sha256sum present
# e14b34025c0e9bc8256f05a09d245449cfd382c2014806bd1d676e7b0794a89f  present

cp present present.morecpu  # keep for diff, later on

# build, now with GO19CONCURRENTCOMPILATION=0
GO19CONCURRENTCOMPILATION=0 GO111MODULE=on CGO_ENABLED=0 GOOS=linux GOARCH=amd64 $HOME/sdk/go1.14.1/bin/go build -trimpath -a -ldflags=-buildid=

sha256sum present
# 44e21497225e8d88500b008ec961d64907ca496104a18802aaee817397c4fb11  present
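The manual sha256sum comparison above could also be scripted. The following is a minimal sketch in Go, not part of the original report: the helper names digest and sameBuild are made up for illustration, and the program simply takes the two binaries (for example present.morecpu and present) as arguments.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"os"
)

// digest returns the hex-encoded SHA-256 of data, the same value
// that sha256sum prints for a file with these contents.
func digest(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// sameBuild (hypothetical helper) reports whether two build outputs
// have identical SHA-256 digests.
func sameBuild(a, b []byte) bool {
	return digest(a) == digest(b)
}

func main() {
	if len(os.Args) != 3 {
		fmt.Fprintln(os.Stderr, "usage: cmpbuild FILE1 FILE2")
		os.Exit(2)
	}
	a, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	b, err := os.ReadFile(os.Args[2])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Printf("%s  %s\n", digest(a), os.Args[1])
	fmt.Printf("%s  %s\n", digest(b), os.Args[2])
	if !sameBuild(a, b) {
		fmt.Println("builds differ")
		os.Exit(1)
	}
	fmt.Println("builds are identical")
}
```

Run as, for instance, "go run cmpbuild.go present.morecpu present"; a non-zero exit status signals a mismatch, which makes it usable in a CI check.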

What did you expect to see?

The exact same binary.

What did you see instead?

Different binaries.

Details

I found this problem by building on a single-CPU VM, where NumCPU is 1. That probably prevents concurrency during compiles. GO19CONCURRENTCOMPILATION=0 disables concurrent compiles. The compiler and linker use runtime.NumCPU() in a few places, so perhaps GO19CONCURRENTCOMPILATION=0 does not address the root cause and only hides the symptoms.
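To check which situation a given build machine is in, a small diagnostic like the one below may help. It is only a sketch: likelySerial is a hypothetical helper that encodes the guess made in this report (one CPU, or GO19CONCURRENTCOMPILATION=0, means no concurrent compilation), not the actual logic used by the go tool.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
)

// likelySerial (hypothetical helper) encodes the assumption from this
// report: compilation is probably not concurrent when the machine has a
// single CPU or when GO19CONCURRENTCOMPILATION is set to "0".
func likelySerial(numCPU int, env string) bool {
	return numCPU == 1 || env == "0"
}

func main() {
	env := os.Getenv("GO19CONCURRENTCOMPILATION")
	fmt.Println("runtime.NumCPU():", runtime.NumCPU())
	// GOMAXPROCS(0) only queries the current setting, it does not change it.
	fmt.Println("runtime.GOMAXPROCS(0):", runtime.GOMAXPROCS(0))
	fmt.Printf("GO19CONCURRENTCOMPILATION=%q\n", env)
	fmt.Println("likely serial compile:", likelySerial(runtime.NumCPU(), env))
}
```

Running this on each build machine before comparing hashes shows quickly whether a differing binary came from a machine in the "serial" group.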

FYI, so far I have always gotten the same binary on machines with multiple CPUs, whether 2 or more, such as 8. But perhaps that's just because a high level of parallelism isn't reached.

The problem does not manifest for very simple programs (e.g. goimports in the same git checkout). Perhaps because there isn't enough to parallelize.

When building with -ldflags=-w, which omits the DWARF debug information, the two build commands produce the same output again. I looked into that because of the diff below. I've seen earlier "reproducible build commands" that included -ldflags="-s -w". I don't expect those to be required to get reproducible builds.

$ diff <(objdump -x present.morecpu) <(objdump -x present)
2,3c2,3
< present.morecpu:     file format elf64-x86-64
< present.morecpu
---
> present:     file format elf64-x86-64
> present
60c60
<  18 .zdebug_loc   00338d92  0000000001049464  0000000001049464  00c19464  2**0
---
>  18 .zdebug_loc   00338d92  0000000001049457  0000000001049457  00c19457  2**0
62c62
<  19 .zdebug_ranges 0010d750  00000000010e2de1  00000000010e2de1  00cb2de1  2**0
---
>  19 .zdebug_ranges 0010d750  00000000010e2dd6  00000000010e2dd6  00cb2dd6  2**0
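The same section-level comparison could be done without objdump, using the standard debug/elf package. This is a rough sketch under the assumption that both files are ELF binaries; sectionInfo, readSections and diffSections are illustrative names, not anything from the Go toolchain.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// sectionInfo holds the section header fields compared here.
type sectionInfo struct {
	Name   string
	Size   uint64
	Addr   uint64
	Offset uint64
}

// readSections loads the section headers of an ELF file.
func readSections(path string) ([]sectionInfo, error) {
	f, err := elf.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	var out []sectionInfo
	for _, s := range f.Sections {
		out = append(out, sectionInfo{Name: s.Name, Size: s.Size, Addr: s.Addr, Offset: s.Offset})
	}
	return out, nil
}

// diffSections returns the names of sections in a that are missing
// from b or whose size, address, or file offset differs in b.
func diffSections(a, b []sectionInfo) []string {
	byName := make(map[string]sectionInfo, len(b))
	for _, s := range b {
		byName[s.Name] = s
	}
	var changed []string
	for _, s := range a {
		if other, ok := byName[s.Name]; !ok || other != s {
			changed = append(changed, s.Name)
		}
	}
	return changed
}

func main() {
	if len(os.Args) != 3 {
		fmt.Fprintln(os.Stderr, "usage: elfdiff FILE1 FILE2")
		os.Exit(2)
	}
	a, err := readSections(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	b, err := readSections(os.Args[2])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	for _, name := range diffSections(a, b) {
		fmt.Println("section differs:", name)
	}
}
```

This only compares section headers, like the objdump -x diff above; it does not show which bytes inside a section changed.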

Labels: FrozenDueToAge, NeedsFix, okay-after-beta1, release-blocker