cmd/go: support reproducible buildid when building with -trimpath #34186

@longsleep

When building with the new -trimpath flag in Go 1.13, the resulting binaries are almost reproducible. The only difference is the build ID, which is also written into the resulting build artifacts.

#16860 implemented the -trimpath flag, and it strips the paths successfully. However, the build ID apparently still takes the source path into account in its action ID parts. The cmd/go source describes the build ID layout as follows:

// The "one-element cache" purpose is a bit more complex for installed
// binaries. For a binary, like cmd/gofmt, there are two steps: compile
// cmd/gofmt/*.go into main.a, and then link main.a into the gofmt binary.
// We do not install gofmt's main.a, only the gofmt binary. Being able to
// decide that the gofmt binary is up-to-date means computing the action ID
// for the final link of the gofmt binary and comparing it against the
// already-installed gofmt binary. But computing the action ID for the link
// means knowing the content ID of main.a, which we did not keep.
// To sidestep this problem, each binary actually stores an expanded build ID:
//
//    actionID(binary)/actionID(main.a)/contentID(main.a)/contentID(binary)

As a result, building the same source from different paths produces different, non-reproducible binaries.

For example, CGO_ENABLED=0 go build -trimpath -o bin/tool ./cmd/tool yields

a: t6sFhx64vDfJLhVcRcjW/J56i1RIbPcgOquS7FGtO/aDD2rPWxLW9E5uImuM8n/V-uwd08cQ2E04lQdOuy0

for source in folder a, and

b: EnJtrdNkod77vx69dugv/lCsJ2qPEOefIbrg_fZ5U/aDD2rPWxLW9E5uImuM8n/V-uwd08cQ2E04lQdOuy0

for source in folder b.
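
To make the difference easier to see, here is a small sketch that splits the two build IDs above into the four fields named in the quoted source comment and reports which fields differ. The helper names (split_build_id, differing_fields) are made up for illustration and are not part of any Go tooling.

```python
# Build IDs reported above for the two source folders.
BUILD_ID_A = "t6sFhx64vDfJLhVcRcjW/J56i1RIbPcgOquS7FGtO/aDD2rPWxLW9E5uImuM8n/V-uwd08cQ2E04lQdOuy0"
BUILD_ID_B = "EnJtrdNkod77vx69dugv/lCsJ2qPEOefIbrg_fZ5U/aDD2rPWxLW9E5uImuM8n/V-uwd08cQ2E04lQdOuy0"

# Field names follow the layout in the quoted source comment.
FIELDS = (
    "actionID(binary)",
    "actionID(main.a)",
    "contentID(main.a)",
    "contentID(binary)",
)


def split_build_id(build_id):
    """Map each layout field to its component of an expanded build ID."""
    parts = build_id.split("/")
    if len(parts) != len(FIELDS):
        raise ValueError("expected %d components, got %d" % (len(FIELDS), len(parts)))
    return dict(zip(FIELDS, parts))


def differing_fields(id_a, id_b):
    """Return the names of the fields whose values differ between two build IDs."""
    a, b = split_build_id(id_a), split_build_id(id_b)
    return [name for name in FIELDS if a[name] != b[name]]


if __name__ == "__main__":
    # prints ['actionID(binary)', 'actionID(main.a)']
    print(differing_fields(BUILD_ID_A, BUILD_ID_B))
```

Only the two action ID fields differ; both content ID fields are identical across the two folders.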

The build ID is the only difference between the resulting binaries: the two action ID components change, while both content ID components are identical. Is that intentional? For reproducible builds, it would be nice if the build ID were the same regardless of the folder in which the source is built.

For the time being, I use a small Python script to overwrite the action ID parts in the resulting binary, like this:

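    import subprocess

    go = 'go'        # path to the go tool
    fn = 'bin/tool'  # binary produced by the build above

    # Build ID layout: actionID(binary)/actionID(main.a)/contentID(main.a)/contentID(binary)
    buildid = subprocess.check_output([go, 'tool', 'buildid', fn]).strip()
    # The first two components are the path-dependent action IDs.
    actionid = b'/'.join(buildid.split(b'/', 2)[:2])
    with open(fn, 'r+b') as f:
        data = f.read()
        idx = data.find(actionid)
        if idx == -1:
            raise ValueError('actionid not found in file')
        # Overwrite in place with a same-length placeholder: zeros, then '/0'.
        f.seek(idx)
        f.write(b'0'*(len(actionid)-2))
        f.write(b'/0')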

This might not be very reliable, though. Can this be improved?
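
One way the workaround could be made a little more defensive is sketched below. It works on bytes in memory and refuses to patch unless the full build ID occurs exactly once, instead of overwriting the first match of the action ID prefix. The function name zero_action_ids is hypothetical, and the sketch assumes the full four-part build ID is stored as one contiguous string in the binary, which I have not verified for every platform.

```python
def zero_action_ids(data, build_id):
    """Return a copy of data with the two action ID components zeroed out.

    data is the binary's content and build_id is the full four-part build ID,
    both as bytes. The placeholder has the same length as the text it
    replaces, so no offsets in the binary shift.
    """
    parts = build_id.split(b"/")
    if len(parts) != 4:
        raise ValueError("expected a four-part build ID")
    # Assumption: the whole build ID appears contiguously in the binary.
    if data.count(build_id) != 1:
        raise ValueError("build ID must occur exactly once in the data")
    action_ids = b"/".join(parts[:2])
    # Same placeholder shape as the script above: zeros, then '/0'.
    placeholder = b"0" * (len(action_ids) - 2) + b"/0"
    idx = data.find(build_id)
    return data[:idx] + placeholder + data[idx + len(action_ids):]


if __name__ == "__main__":
    # Synthetic stand-in for a binary, not a real Go executable.
    build_id = b"aaaa/bbbb/cccc/dddd"
    blob = b"\x00HEAD" + build_id + b"TAIL\x00"
    print(zero_action_ids(blob, build_id))
```

To use it on a real file, read the binary into memory, pass in the output of go tool buildid, and write the result back. This still patches the binary after the fact, so it does not replace proper support in cmd/go.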
